Kernel Traffic #299 For 6 Mar 2005

By Zack Brown

Mailing List Stats For This Week

We looked at 1452 posts in 9MB.

There were 522 different contributors. 210 posted more than once. The average length of each message was 97 lines.

The top posters of the week were:

63 posts in 295KB by Ingo Molnar
49 posts in 241KB by joq@io.com
37 posts in 231KB by Con Kolivas
32 posts in 150KB by Vojtech Pavlik
28 posts in 158KB by Stelian Pop

The top subjects of the week were:

108 posts in 508KB for "[RFC] Linux Kernel Subversion Howto"
75 posts in 384KB for "[patch, 2.6.11-rc2] sched: RLIMIT_RT_CPU_RATIO feature"
49 posts in 207KB for "Touchpad problems with 2.6.11-rc2"
44 posts in 264KB for "[PATCH]sched: Isochronous class v2 for unprivileged soft rt"
41 posts in 164KB for "[RFC] Reliable video POSTing on resume"

These stats generated by mboxstats version 2.2

1. Intel Software RAID Driver (iswraid) Going Into 2.4

28 Jan 2005 - 11 Feb 2005 (23 posts) Archive Link: "[ANNOUNCE] "iswraid" (ICHxR ataraid sub-driver) for 2.4.29"

Topics: Device Mapper, Disk Arrays: RAID, Disks: IDE, Disks: SCSI, Serial ATA

People: Martins Krikis, Arjan van de Ven, Jeff Garzik, Marcelo Tosatti, Christoph Hellwig, Bartlomiej Zolnierkiewicz

Martins Krikis said:

Version 0.1.5 of the Intel Software RAID driver (iswraid) is now available for the 2.4 series kernels at http://prdownloads.sourceforge.net/iswraid/2.4.29-iswraid.patch.gz?download

It is an ataraid "subdriver" but uses the SCSI subsystem to find the RAID member disks. It depends on the libata library, particularly on either the ata_piix or the ahci driver, that enable the Serial ATA capabilities in ICH5/ICH6/ICH7 chipsets. More information is available at the project's home page at http://iswraid.sourceforge.net/.

Driver documentation is included in Documentation/iswraid.txt, which is part of the patch. The license is GPL.

The changes WRT version 0.1.4.3 are the following:

Please consider this driver for inclusion in the 2.4 kernel tree.

Jeff Garzik liked the patch, but Arjan van de Ven said, "personally I consider it a new feature, and I don't consider new features like this appropriate for a 2.4 deep maintenance stream." Bartlomiej Zolnierkiewicz sided with Arjan, and Jeff replied:

It sorta sucks for users with that hardware. The typical complaint comes from trying to share data between Windows and Linux, where "just use md" isn't a solution.

Without device mapper (another new feature) to enable dmraid, these users are just sorta S.O.L.

I consider it not a new feature, but a missing feature, since otherwise user data cannot be accessed in the RAID setups.

Christoph Hellwig said those people should upgrade to 2.6, and Arjan pointed out that Jeff's objections were true of all new hardware. The discussion went back and forth with no resolution, but at one point Marcelo Tosatti said:

I personally dislike and discourage the addition of ANY new drivers to v2.4 at this point, and I sincerely appreciate every argument against iswraid, but I have no problems with it because it looks like a valid special case since it allows users to access their ICH5/6 RAID partitions, as Jeff mentions.

Moreover the driver is going to die with v2.4 anyway, it's not like any future compatibility problem is being introduced.

So I understand the argument against having it in the tree: the elegant way of doing it is to use dmraid.

But I don't buy it as an argument against merging it in a dying v2.4.x tree whose purpose is to serve existing users.

You are mistaken in arguing that "oh, since this driver can be merged, it's likely that any v2.6 HW support/driver will be accepted in v2.4".

So, it's up to Jeff, and he seems to be OK with it.

2. HOWTO For Subversion Access To Kernel Sources

2 Feb 2005 - 11 Feb 2005 (108 posts) Archive Link: "[RFC] Linux Kernel Subversion Howto"

Topics: Version Control

People: Stelian Pop, Larry McVoy, Zack Brown, Ben Collins

Stelian Pop said:

I've played lately a bit with Subversion and used it for managing the kernel sources, using Larry McVoy's bk2cvs bridge and Ben Collins' bkcvs2svn conversion script.

Since there is little information on the web on how to properly set up a SVN repository and use it for tracking the latest kernel tree, I wrote a small howto (modeled after the bk kernel howto) in case it can be useful for other people too.

Feel free to comment on it (but let's not start a new BK flamewar or SVN bashing session please). If there is enough interest I'll submit a patch to include this in the kernel Documentation/ directory.

I've put it also on my web page along with the necessary scripts: http://popies.net/svn-kernel/

This was very well received, and Larry McVoy responded promptly to questions and requests regarding BitKeeper. Nevertheless, the discussion quickly degenerated into bickering between kernel folks, who wanted more information than BitMover was prepared to give about its patch-handling features, and Larry, who wanted to impede the creation of competing software and protect what he feels is his intellectual property.

(ed. [Zack Brown] I have to hand it to Larry. As the years go on and no free equivalent rises to replace BitKeeper, his claims become more and more justified. In the old days, open source proponents believed that no proprietary system could keep up with a horde of developers working on a free equivalent. Larry has apparently disproved this, much to the chagrin of many kernel developers and free software lovers everywhere.)

3. Preempt Real-Time For ARM

5 Feb 2005 - 10 Feb 2005 (9 posts) Archive Link: "Preempt Real-time for ARM"

Topics: Real-Time, SMP

People: Daniel Walker, Thomas Gleixner, Russell King, Ingo Molnar

Daniel Walker said:

This is a release of Preempt Real-time for ARM. It includes everything up to CONFIG_PREEMPT_RT, and all of the latency tracing except interrupts off timing. The timing also excludes syscalls. This patch includes only a port to OMAP boards. However, it should be straightforward to get it working on other boards.

The biggest point of discussion relates to the interrupts in threads implementation. It is largely identical to what is implemented in the generic irq handling. However, ARM does not implement generic irq handling, and will not support it in the near future. I am not in support of two different threaded interrupt implementations.

I recently made a proposal to separate the threaded interrupt handling from the generic irq handling, but I'm open to other ideas.

Thomas Gleixner said that on ARM, "We have done the conversion to the generic irq handling and it works fine on a couple of machines. I'm just waiting until the new SMP bits are there before I have another go and clean up the missing SMP bits." Ingo Molnar was very pleased to see this development, but Russell King had his doubts. He said:

Well, I remain unconvinced about the generic irq handling.

Back in 2.4 times, ARM used to use the x86 way to handle IRQs, and it caused lots of dropped IRQs for CF cards and the like, particularly in mixed level/edge triggered interrupt environments (where a mixture of level and edge based outputs are connected to edge triggered inputs.)

The ARM IRQ code got completely rewritten during 2.5 with a clean design, generated from the requirements of the machines.

This caused major changes throughout all the machine support files, and I'm _NOT_, repeat _NOT_ going to consider going back to some half baked approach which doesn't really fit the needs of the ARM architecture, just because "oh, it's generic." If it doesn't work reliably, I'm not interested.

This is especially so when it impacts so many machines in ways specific to each machine, and there's no way to get them tested in one go.

If this is to be done, doing it in the middle of a stable kernel series is NOT the time or place to do it. I have recently had people complaining about the "stability" of 2.6, particularly in relation to changes made by other people affecting drivers.

Consider these questions in relation to the generic IRQ code:

  1. Does it know the difference between handling level, edge-based and "simple" IRQs? ("simple" IRQs are those which are cascaded, but don't have their own individual interrupt mask controls.)
  2. Does the generic autoprobe code know which IRQs can be autoprobed and which can't? (cascade interrupts are just one example of interrupts which must not be autoprobed. There may be other reasons you wish to avoid probing other interrupts on a particular machine, which the machine support code knows about.)
  3. Does the generic IRQ code know which IRQs can be claimed and which can't? (IRQs 0 to NR_IRQS aren't always claimable, even when they appear to be available - iow, desc->handler != &no_irq_type.)
  4. Does it allow per "hw type" retriggering of interrupts, even if the hardware itself is not capable of such an action? (and running these interrupts at the next hardware interrupt?)
  5. Does it allow control of interrupt wakeup sources?
  6. Does it allow architectures to define their own irq_desc_t so that all the data for a particular IRQ is localised and contained within one data structure?

What you'll find is that the ARM interrupt structure is designed to efficiently meet the requirements of our wide range of hardware interrupt controllers, with chained interrupt controllers, with as low latency as possible.

In essence, I'm opposed to completely rewriting the ARM interrupt handling at this stage.

Thomas agreed in principle with the idea that reliability was more important than just making code generic for its own sake. And he agreed that the testing issue was significant. But he expressed his hope that Russell would be open to the conversion, especially since he expected other architectures to benefit from a conversion as well. Russell was unmoved, and after a couple more posts he abandoned the thread, saying, "I've said why per-IRQ locks are incorrect for the non-RT cases on ARM, but unfortunately just repeating the reasons why it's wrong isn't getting me anywhere either. So shrug, all I can do is explain why it's wrong, and if people choose not to listen there's nothing more I can do."
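Russell's first question, about telling level, edge-based and "simple" IRQs apart, is easier to follow with a small model. The sketch below is a user-space illustration only, not the ARM or generic kernel code: the names (irq_desc_sketch, flow_level, flow_edge, flow_simple, the chip_* helpers) are all hypothetical, and the handlers are deliberately simplified. It only shows the idea of a per-IRQ descriptor whose flow handler is chosen according to the interrupt type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
static char trace[64];                  /* records the order of operations */

static void log_op(const char *op)
{
    assert(strlen(trace) + strlen(op) + 2 <= sizeof(trace));
    strcat(trace, op);
    strcat(trace, " ");
}

/* Stand-ins for interrupt controller operations and the driver's handler. */
static void chip_ack(void)    { log_op("ack"); }
static void chip_mask(void)   { log_op("mask"); }
static void chip_unmask(void) { log_op("unmask"); }
static void run_action(void)  { log_op("action"); }

/* Level: the line stays asserted until the device is serviced, so mask first. */
static void flow_level(void)
{
    chip_mask();
    chip_ack();
    run_action();
    chip_unmask();
}

/* Edge: ack before servicing, so an edge arriving meanwhile can be latched. */
static void flow_edge(void)
{
    chip_ack();
    run_action();
}

/* "Simple": cascaded source with no mask control of its own; just run it. */
static void flow_simple(void)
{
    run_action();
}

/* Per-IRQ descriptor: machine support code picks the flow handler. */
struct irq_desc_sketch {
    void (*handle)(void);
};

static void dispatch(const struct irq_desc_sketch *d)
{
    d->handle();
}
```

In this model a machine's support code would fill in one descriptor per interrupt line, so the dispatch path never has to guess the trigger type at interrupt time.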

4. Kernel Size Reduction; Linus' Main System No Longer x86

6 Feb 2005 - 11 Feb 2005 (17 posts) Archive Link: "out-of-line x86 "put_user()" implementation"

People: Linus Torvalds, Ingo Molnar, Andrew Morton, Pavel Machek

Linus Torvalds said:

I was looking at some of the code we generate, and happened to notice that we have this strange situation where the x86 "get_user()" macros generate out-of-line code to do all the address verification etc, but the "put_user()" ones do not, and do everything inline.

I also noticed that (probably as a result of this), our "put_user()" on old i386 machines does not do the full magic manual page-following. Which means that copy-on-write doesn't necessarily work right due to the broken paging hw on the original 386 core.

I didn't fix the second part, but at least making things out-of-line makes it possible. And making "put_user()" be out-of-line seemed quite doable.

I no longer use x86 as my main machine, so this patch is totally untested. I've compiled it to see that things look somewhat sane, but that doesn't mean much. If I forgot some register or screwed something else up, this will result in a totally nonworking kernel, but I thought that maybe somebody else would be interested in looking at whether this (a) works, (b) might even shrink the kernel and (c) might make us able to DTRT wrt the page table following crud (old i386 cores may be hard to find these days, so maybe people don't care).

Ingo Molnar confirmed that Linus' patch "boots fine and shrinks the image size quite noticeably". Linus replied, "Goodie. Here's a slightly more recent version", and added, "I'm not going to put this into 2.6.11, since I worry about compiler interactions, but the more people who test it anyway, the better." Pavel Machek suggested including the patch in Andrew Morton's -mm tree. At one point in the thread, Linus asked Andrew if this would be OK, and Andrew replied, "I'll take patches from anyone ;)". And Linus said, "You'll never live it down. Once you get a name for being easy, you'll always be known as Andrew "patch-ho" Morton."
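The difference between an inline and an out-of-line "put_user()" can be pictured with a user-space analogy. The sketch below is not Linus' patch and does not use any kernel interface: user_mem, put_byte_checked, put_byte_inline, put_byte_outofline and EFAULT_SKETCH are all hypothetical names made up for illustration. The inline variant expands the address check and the store at every call site, while the out-of-line variant leaves only a function call there and keeps a single copy of the check, which is where the size reduction would come from.

```c
#include <stddef.h>

#define EFAULT_SKETCH 14                /* stand-in for the kernel's EFAULT */

static unsigned char user_mem[64];      /* pretend "user space" */

/* Out-of-line flavour: one shared copy of the check and the store. */
int put_byte_checked(size_t addr, unsigned char val);

int put_byte_checked(size_t addr, unsigned char val)
{
    if (addr >= sizeof(user_mem))
        return -EFAULT_SKETCH;
    user_mem[addr] = val;
    return 0;
}

/* Each use of this macro compiles to a plain call to the helper above. */
#define put_byte_outofline(addr, val) put_byte_checked((addr), (val))

/* Inline flavour: the check and the store are repeated at every use. */
#define put_byte_inline(addr, val)                          \
    ((size_t)(addr) >= sizeof(user_mem)                     \
         ? -EFAULT_SKETCH                                   \
         : (user_mem[(size_t)(addr)] = (val), 0))
```

Both variants return 0 on success and a negative error code for an out-of-range address; only the amount of code generated per call site differs.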

5. Elo Serial Touchscreen Driver; Generic Touchscreen Support

8 Feb 2005 - 10 Feb 2005 (23 posts) Archive Link: "[RFC/RFT] [patch] Elo serial touchscreen driver"

Topics: Touchscreen

People: Vojtech Pavlik, Paulo Marques, Dmitry Torokhov

Vojtech Pavlik said:

I've written a driver for probably the most common touchscreen type - the serial Elo touchscreen.

The driver should handle all generations of serial Elos, as it handles Elo 10-byte, 6-byte, 4-byte and 3-byte protocols.

Dmitry Torokhov liked the patch, though he admitted he had no hardware to test it. Paulo Marques was very enthusiastic about touch-screen support, explaining, "I work for a company that develops software for restaurants, and we have a Linux port of our main application running in actual restaurants with a custom made Linux distribution for about 2 years now. We had to support a number of touchscreens, and we do it in the application itself, reading the serial port and processing the data. If this could go into the kernel, then our application needed only to read the input device, and handle events, no matter what touch screen was there. That would be a great improvement :)" Vojtech was thrilled to see such interest. Several posts down the line, Paulo Marques suggested:

I sometimes feel that we should have a "generic" touch screen driver from looking at the code for the different brands.

Almost all touch screen data goes something like this:

If this information could be passed as a module parameter, new touchscreens could be supported without any kernel modification.

We could parse a definition "string", like this:

"SIZE:10,SYNC:0:8:85,SYNC:8:8:54,X:24:8:1,X:32:8:256,Y:40:8:1,Y:48:8:256,T:16:2:1"

This string defines the touch driver for elotouch, 10 bytes packet (I didn't include the pressure reading, for simplification).

I currently have 6 different "drivers" that would all fit into this model. The same goes for all 3 elotouch protocols that you implemented.

Does this sound like a good idea?

But Vojtech said no, the amount of code saved by this approach would not justify the obfuscation it would produce. Nonetheless, a lively discussion ensued, in which folks discussed various technical issues involved in touchscreen support.
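To give a feel for what Paulo's definition string would involve, here is a rough user-space sketch of a parser and decoder for it. The field semantics are an assumption inferred from the example string, since the post as quoted does not spell them out: each entry is taken to be NAME:bit_offset:bit_width:value, where the value is the expected content for SYNC entries and a multiplier for X, Y and T entries; numbers are read as decimal; and bit 0 is taken to be the least significant bit of the first byte. All identifiers (ts_field, ts_format, ts_event, ts_parse, ts_bits, ts_decode) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 16

/* One "NAME:offset:width:value" entry (assumed layout, see lead-in). */
struct ts_field {
    char name[8];
    unsigned bit_off;   /* offset from the start of the packet, in bits */
    unsigned bit_len;   /* field width in bits */
    unsigned value;     /* SYNC: expected value; X/Y/T: multiplier */
};

struct ts_format {
    unsigned size;      /* packet length in bytes, from "SIZE:n" */
    unsigned nfields;
    struct ts_field f[MAX_FIELDS];
};

struct ts_event {
    unsigned x, y, touch;
};

/* Parse the definition string; returns 0 on success, -1 if malformed. */
static int ts_parse(const char *def, struct ts_format *fmt)
{
    const char *p = def;

    memset(fmt, 0, sizeof(*fmt));
    while (*p) {
        char tok[32];
        size_t n = strcspn(p, ",");
        struct ts_field *f;

        if (n == 0 || n >= sizeof(tok))
            return -1;
        memcpy(tok, p, n);
        tok[n] = '\0';
        p += n;
        if (*p == ',')
            p++;

        if (sscanf(tok, "SIZE:%u", &fmt->size) == 1)
            continue;
        if (fmt->nfields == MAX_FIELDS)
            return -1;
        f = &fmt->f[fmt->nfields];
        if (sscanf(tok, "%7[A-Z]:%u:%u:%u",
                   f->name, &f->bit_off, &f->bit_len, &f->value) != 4)
            return -1;
        fmt->nfields++;
    }
    return fmt->size ? 0 : -1;
}

/* Extract bit_len bits starting at bit_off (bit 0 = LSB of byte 0). */
static unsigned ts_bits(const unsigned char *pkt, unsigned bit_off,
                        unsigned bit_len)
{
    unsigned v = 0, i;

    assert(bit_len <= 32);
    for (i = 0; i < bit_len; i++) {
        unsigned b = bit_off + i;

        if (pkt[b / 8] & (1u << (b % 8)))
            v |= 1u << i;
    }
    return v;
}

/* Decode one packet; returns -1 if a SYNC field does not match. */
static int ts_decode(const struct ts_format *fmt, const unsigned char *pkt,
                     struct ts_event *ev)
{
    unsigned i;

    memset(ev, 0, sizeof(*ev));
    for (i = 0; i < fmt->nfields; i++) {
        const struct ts_field *f = &fmt->f[i];
        unsigned v;

        if (f->bit_len > 32 || f->bit_off + f->bit_len > fmt->size * 8)
            return -1;
        v = ts_bits(pkt, f->bit_off, f->bit_len);
        if (strcmp(f->name, "SYNC") == 0) {
            if (v != f->value)
                return -1;
        } else if (strcmp(f->name, "X") == 0) {
            ev->x += v * f->value;
        } else if (strcmp(f->name, "Y") == 0) {
            ev->y += v * f->value;
        } else if (strcmp(f->name, "T") == 0) {
            ev->touch += v * f->value;
        }
    }
    return 0;
}
```

Under these assumptions, the two X entries in the example string combine a low byte (multiplier 1) and a high byte (multiplier 256) into one coordinate. Even this small sketch needs a fair amount of validation code, which gives some idea of the trade-off Vojtech was weighing.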

6. RelayFS Updated

9 Feb 2005 - 10 Feb 2005 (7 posts) Archive Link: "[PATCH] relayfs redux, part 4"

Topics: Assembly, SMP

People: Tom Zanussi

Tom Zanussi said:

Here's the latest relayfs patch, incorporating the previous round of suggestions. Thanks to everyone who sent comments. Here's a list of the major changes:

Also, there was some question as to whether or not the memcpy in relay_write() was being inlined properly - I looked at the generated assembly code, and it seems to be, but I'll be taking a closer look later.

This is what the API now looks like:

API functions:

rchan *relay_open(base_filename, parent, subbuf_size, n_subbufs, flags, callbacks);
void relay_close(chan);
dentry *relayfs_create_dir(name, parent);
int relayfs_remove_dir(dentry);
void relay_reset(chan);

void relay_write(chan, data, length);
void __relay_write(chan, data, length);
void *relay_reserve(chan, length);

void relay_subbufs_consumed(chan, subbufs_consumed, cpu);
void relay_commit(buf, subbuf_idx, count);

callbacks

int subbuf_start(buf, subbuf, prev_subbuf_idx);
int deliver(buffer, subbuf, subbuf_idx);
void buf_mapped(buf, filp);
void buf_unmapped(buf, filp);
void buf_full(buf);

As before, I've tested this code on a single proc machine using a hacked version of the kprobes network packet tracing module, which can be found here:

http://prdownloads.sourceforge.net/dprobes/plog.tar.gz?download

If people are more or less happy with the current version, I'll do some SMP testing and write some Documentation.

There seemed to be general support for Tom's work, with some technical criticism, though no real discussion ensued.

7. Linux 2.4.30-pre1 Released

10 Feb 2005 (1 post) Archive Link: "Linux 2.4.30-pre1"

Topics: Serial ATA

People: Marcelo Tosatti

Marcelo Tosatti announced Linux 2.4.30-pre1, saying:

Here goes v2.4.30-pre1.

It contains, amongst others, a SATA update, a series of networking bug fixes, and v2.6 hardening backports.

8. Linux 2.6.11-rc4 Released; Status Of SIS5595 Driver In 2.6

12 Feb 2005 - 14 Feb 2005 (9 posts) Archive Link: "Linux 2.6.11-rc4"

Topics: I2C, Kernel Release Announcement, Version Control

People: Linus Torvalds, Enrico Bartky, Jean Delvare

Linus Torvalds announced Linux 2.6.11-rc4, saying:

this is hopefully the last -rc kernel before the real 2.6.11, so please give it a whirl, and complain loudly about anything broken.

As can be seen from the shortlog, most of the changes are pretty trivial. I think the biggest change is the radeon updates, and some of the NLS codepage things caused big diffs even if the changes themselves are pretty trivial (oh, and moving the ia64 "shubio.h" file accounts for about seven thousand lines of diffs, but no real changes ;)

In short: some driver updates, some arm/uml/sparc updates, and various random (mostly) one-liners all over. The most noticeable of the one-liners is hopefully that raid5/6 should work again.

Enrico Bartky asked, "It is possible to include the SIS5595 chip driver to the final release?" But Jean Delvare replied, "No, sorry. It's not even in -mm yet (in fact it's even not in Greg's bk-i2c tree yet). It needs to spend some time (and get some testing) in -mm before it can go to Linus. You are still welcome to get the patch (http://lkml.org/lkml/diff/2005/2/6/192/1) and apply it manually to your tree if you want support right now."

9. CPU Scheduler Documentation For Linux 2.6.8.1

13 Feb 2005 (3 posts) Archive Link: "Linux 2.6.8.1 CPU Scheduler Documentation"

People: Josh Aas, Nick Piggin, Willy Tarreau

Josh Aas said:

I have written an introduction to the Linux 2.6.8.1 CPU scheduler implementation. It should help people to understand what is going on in the scheduler code faster than they would be able to by just reading through the code. The paper can be downloaded in PDF or LyX form from here:

http://josh.trancesoftware.com/linux/

This paper will never be "done," as I'd like to keep improving it over time, and updating it to newer versions of the kernel as time allows. If you have comments, suggestions, or corrections you'd like to make, please email me. Technical corrections in particular would be appreciated. Hopefully this can be as accurate and helpful as possible, and will inspire more people to look into the Linux scheduler.

My employer, SGI, did not ask me to write this paper - it was done as part of a school project last semester. While SGI owns the copyright to the paper, they have allowed me to release it under the GNU FDL.

Willy Tarreau was very interested in this documentation, and also offered suggestions on how to produce a better PDF file for it. Nick Piggin also liked seeing this work done.

10. In-Kernel Genetic Library Version 0.2 Released

15 Feb 2005 (5 posts) Archive Link: "[ANNOUNCE 0/4] Genetic-lib version 0.2"

People: Jake Moilanen, Peter Williams

Jake Moilanen, continuing from Issue #294, Section #4 (6 Jan 2005: In-Kernel Genetic Algorithm Library), said:

Here is the next release of the genetic library based against 2.6.10 kernel.

There were numerous changes from the first release, but the major change in this version is the introduction of phenotypes. A phenotype is a set of genes that affect an observable property. In genetic-library terms, it is a set of genes that will affect a particular fitness measurement. Each phenotype will have a set of children that contain genes that affect a fitness measure.

Now multiple fitness routines can be run for each genetic library user. Then depending on the results of a particular fitness measure, the specific genes that directly affect that fitness measure can be modified. This introduces a finer granularity that was missing in the first release of the genetic-library.

I would like to thank Peter Williams for reworking the Zaphod Scheduler and helping design the phenotypes.

One of the other features introduced is shifting the number of mutations depending on how well a phenotype is performing. If the current generation outperformed the previous generation, then the rate of mutation will go down. Conversely, if the current generation performed worse than the previous generation, the mutation rate will go up. This mutation rate shift will do two things. When generations are improving, it will reduce the number of unnecessary mutations and hone in on the optimal tunables. When a workload drastically changes, the fitness should go way down, and the mutation rate will increase in order to test a greater space of better values quicker. This should decrease the time it takes to adjust to a new workload. There is a limit at 45% of the genes being mutated every generation in order to prevent the mutation rate spiralling out of control.

SpecJBB and UnixBench are still yielding a 1-3% performance improvement, however (though it's subjective) the interactiveness has had noticeable improvements.

I have not broken the Anticipatory IO Scheduler down to a fine granularity in phenotypes yet. Any assistance would be greatly appreciated.

Currently I am hosting this project off of:

http://kernel.jakem.net
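The mutation rate shift Jake describes can be sketched in a few lines. This is an illustration of the rule as stated in the announcement, not code from the genetic library: the function name next_mutation_rate, the 5-point step and the 0% floor are assumptions made for the example, and a larger fitness value is assumed to mean a better generation. Only the 45% ceiling comes from the post.

```c
#include <assert.h>

#define MUTATION_RATE_MAX  45   /* percent; the cap stated in the announcement */
#define MUTATION_RATE_MIN   0   /* assumed floor, not from the announcement */
#define MUTATION_RATE_STEP  5   /* assumed step size, not from the announcement */

/*
 * Return the mutation rate (percent of genes mutated) for the next
 * generation. Higher fitness is assumed to mean better performance.
 */
static int next_mutation_rate(int rate, long prev_fitness, long cur_fitness)
{
    assert(rate >= MUTATION_RATE_MIN && rate <= MUTATION_RATE_MAX);

    if (cur_fitness > prev_fitness)
        rate -= MUTATION_RATE_STEP;     /* improving: mutate less */
    else if (cur_fitness < prev_fitness)
        rate += MUTATION_RATE_STEP;     /* regressing: explore more */

    /* Clamp so the rate cannot spiral out of control in either direction. */
    if (rate > MUTATION_RATE_MAX)
        rate = MUTATION_RATE_MAX;
    if (rate < MUTATION_RATE_MIN)
        rate = MUTATION_RATE_MIN;
    return rate;
}
```

With these assumed constants, a phenotype at a 20% rate would drop to 15% after an improving generation and rise to 25% after a worse one, and repeated regressions would stop climbing at the 45% limit.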

11. New -hf ("Hot Fix") Branch Of The 2.4 Tree

15 Feb 2005 (1 post) Archive Link: "[ANNOUNCE] kernel 2.4 hotfixes : 2.4.29-hf2"

Topics: Version Control

People: Willy Tarreau, Adrian Bunk

Willy Tarreau said:

after a short discussion with Marcelo, we quickly agreed that a hotfix tree would be a good thing for kernel 2.4, since a few months can separate two stable releases. I offered to help in this area because I already have to pick random patches from the BK changesets anyway, so the only additional work will be to pack them in a more presentable way than what I can do just for me. Marcelo offered to help me by telling me when he thinks that a particular patch needs to be merged or excluded.

Overall, the patches included are classified in 6 categories:

As much as possible, I will avoid changes in drivers because we all know that in this area, a fix for one user breaks another one.

Anyway, all of those patches will be extracted from -BK. I'm not building a parallel tree (I already have another one for that :-)). The main goal is that the diff between this kernel and the next release should be smaller than the diff between previous and next release.

Marcelo and I agreed on the "-hf" suffix (for "Hot Fix"). Two patches are systematically provided, one with the version in the Makefile, and one without. The goal is to ease both people using vanilla kernels in production (who need to check the real version) and people who want a virgin kernel base to apply external patches while limiting the number of rejects.

A tarball is also included with all the individual patches. A makefile relies on the "CONTENTS" file itself to rebuild the patch against the kernel of your choice. It is very easy to add/remove patches in the CONTENTS file, so I hope it will be convenient enough for people who don't want to include minor or documentation fixes for example. It is documented anyway.

I've started with 2.4.29 and released 2.4.29-hf2 a few days ago. In fact, I was waiting for the site to migrate from home to my company (EXOSEC) to benefit from more outgoing bandwidth, before sending this announce. Despite this, I will try to stay up to date within the shortest time, and release a new hotfix as soon as either something weird gets fixed, or Marcelo asks me to do so. In any case, I'll do my best to send an announce here for new releases.

The patches and tarballs are hosted here :

http://linux.exosec.net/kernel/2.4-hf/

I'm open to comments, advice, criticism, etc, but flames such as "you lose your time" or "2.4 is dead" will feed /dev/null. One possible improvement I have already identified would be to publish the work in progress so that people could get fixes in real time, but if they really need that, they should download from BK instead.

If this branch gets enough demand, I will maintain several previous versions in parallel (eg: go back to 2.4.27).

In the mean time, please find here the CONTENTS file which details what patches have been included and basically what they do.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.