Kernel Traffic #293 For 9 Jan 2005

By Zack Brown


Now that Google has launched its new Groups2 feature, the script I used to determine the proper archive URL for each thread has broken. If anyone feels like scripting up a replacement, I would be very grateful.

Note that searching on message-id doesn't work, because Google uses an NNTP gateway that replaces the true message-id with something else. It would be perfect if the script produced urls looking like <>, which is obtained by clicking on the 'show options' part of an email in Groups2, and then selecting the 'Individual Message' option.

Mailing List Stats For This Week

We looked at 1179 posts in 6749K.

There were 366 different contributors. 185 posted more than once. 109 posted last week too.

1. Forward Porting Some Big-RAM VM Fixes From 2.4 To 2.6

24 Dec 2004 - 3 Jan 2005 (5 posts) Archive Link: "VM fixes [2/4]"

Topics: Big Memory Support, Forward Port, Virtual Memory

People: Andrea Arcangeli, Nick Piggin, Marcelo Tosatti, Andrew Morton

Andrea Arcangeli said, "This is the forward port to 2.6 of the lowmem_reserved algorithm I invented in 2.4.1*, merged in 2.4.2x already and needed to fix workloads like google (especially without swap) on x86 with >1G of ram, but it's needed in all sort of workloads with lots of ram on x86, it's also needed on x86-64 for dma allocations. This brings 2.6 in sync with latest 2.4.2x." Nick Piggin took a look at this, and felt that it really simplified the code. But he asked, "should it be on by default? I don't think we ever reached an agreement. I'd say yes, after a run in -mm because it does potentially fix corner cases where lower zones get filled with unfreeable memory which could have been satisfied with higher zones." Andrea replied, "I definitely agree it should be on by default, I already had an hang report that was solved by more recent kernels and that probably can only be explained by lowmem_reserve since there aren't other mm changes in 2.6.5 based trees."

In his original post, Nick also asked if Andrea could port his patches to Andrew Morton's -mm tree; Andrea said, "I already had to port to 2.6.5 too, and that's enough for now unless I first get a positive ack that it will be merged (if I hadn't more interesting things to develop, I would be happily porting it)." Marcelo Tosatti replied, "I believe it can be accepted easily if you change the variable names from protection to lowmem_reserve. Is there a need for that or its just your taste? :)" Andrea replied:

The naming is in sync with 2.4, I called that feature lowmem_reserve when I wrote it. Protection doesn't actually mean anything. Memory protection, mprotect, what?

The object of the feature is to reserve lower memory in function of the classzone allocation, and in function of the zone we're allocating from. So lowmem_reserve sounds a much better name. And it wasn't me to change it, it was the 2.6 kernel calling it differently in the first place. Note that at first 2.6 was doing stuff very differently from 2.4 too (and it wasn't working right infact). Now it's in perfect sync with the 2.4 algorightm I wrote originally and so I thought it would be much cleaner to call it the same way as 2.4, which is more self explanatory too.
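
As a rough illustration of the idea Andrea describes (holding back pages in lower zones depending on which zone an allocation was originally aimed at), here is a simplified model in Python. It is not the kernel's code. The zone sizes, watermarks, the ratio values, and the rule "reserve = pages in the higher zones divided by a per-zone ratio" are all assumptions made for the sketch.

```python
# Simplified model of a lowmem_reserve-style check; NOT the kernel's code.
# Zone order (lowest to highest) and every number below are illustrative.
ZONES = ["DMA", "Normal", "HighMem"]


def compute_lowmem_reserve(present_pages, ratio):
    """Return reserve[zone][classzone]: pages of `zone` held back from
    allocations whose preferred zone is `classzone`.

    Assumed rule: sum the pages of the zones above `zone`, up to and
    including `classzone`, and divide by ratio[zone].
    """
    n = len(present_pages)
    reserve = [[0] * n for _ in range(n)]
    for classzone in range(n):
        higher_pages = 0
        for zone in range(classzone - 1, -1, -1):
            higher_pages += present_pages[zone + 1]
            reserve[zone][classzone] = higher_pages // ratio[zone]
    return reserve


def zone_can_satisfy(free_pages, watermark, reserve, zone, classzone):
    """A zone may serve the request only if its free pages exceed its own
    watermark plus the reserve kept against this class of allocation."""
    return free_pages[zone] > watermark[zone] + reserve[zone][classzone]


present = [4096, 225280, 819200]   # hypothetical zone sizes, in pages
ratio = [256, 32]                  # assumed divisors for DMA and Normal
reserve = compute_lowmem_reserve(present, ratio)
```

In this model, an allocation aimed at HighMem that falls back to the Normal zone must leave 25600 pages of Normal untouched, while an allocation aimed at Normal itself faces no extra reserve there. That is the sense in which the reserve is a function of both the classzone and the zone being allocated from.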

2. Linux 2.6.10 Released; Some Problems With Software Suspend

24 Dec 2004 - 30 Dec 2004 (25 posts) Archive Link: "Ho ho ho - Linux v2.6.10"

Topics: Digital Video Broadcasting, FS: CIFS, Kernel Release Announcement, Software Suspend, USB

People: Linus Torvalds, Wichert Akkerman, Pavel Machek, Rafael J. Wysocki

Linus Torvalds announced Linux 2.6.10, saying:

Ok, with a lot of people taking an xmas break, here's something to play with over the holidays (not to mention an excuse for me to get into the Glögg for real ;)

Mostly a lot of small fixes since 2.6.10-rc3, with the biggest thing being probably the CIFS update and the switch-over to the new DVB frontend driver world order. Some MMC and USB work too, and ARM updates as usual.

In the course of discussion, Wichert Akkerman reported, "2.6.10 broke resume for me: when I resume it immediately tries to suspend the machine again but gets stuck after suspending USB." Rafael J. Wysocki also had trouble resuming after a suspend under 2.6.10, but only once in a while. Pavel Machek took a stab at this, but it turned out Rafael was using AMD64, while Pavel was patching the i386 code. Debugging efforts stalled completely at that point, and the discussion petered out.

3. Big Speed And Reliability Improvements For Software Suspend

25 Dec 2004 - 31 Dec 2004 (4 posts) Archive Link: "swsusp: Kill O(n^2) algorithm in swsusp"

Topics: Big O Notation, Software Suspend

People: Pavel Machek, Rafael J. Wysocki

Pavel Machek said:

Some machines are spending minutes of CPU time during suspend in stupid O(n^2) algorithm. This patch replaces it with O(n) algorithm, making swsusp usable to some people.

I'd like people to test this. It should probably spend few weeks in -mm tree to get some beating. OTOH SUSE has variant of this patch in its kernel.

Someone reported tremendous improvements with this patch, saying that their system would suspend in about 5 seconds, as opposed to a minute or more without Pavel's patch. Eduard Bloch also found it very stable and reliable, even after many, many uses. And Rafael J. Wysocki added, "Confirmed. I've been running it for quite some time with 2.6.10 on an AMD64 and it works great."
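
The thread summary does not describe the algorithm itself, so the following is only a generic illustration of the class of speedup involved, not the swsusp code: replacing a full scan per item with a lookup structure built once. The function names and the notion of "candidate" and "reserved" page numbers are hypothetical.

```python
def collisions_quadratic(candidates, reserved):
    """For every candidate page, scan the whole reserved list.
    Cost grows as len(candidates) * len(reserved)."""
    reserved_list = list(reserved)
    return [p for p in candidates if any(p == r for r in reserved_list)]


def collisions_linear(candidates, reserved):
    """Build a set once, then do one constant-time lookup per candidate.
    Cost grows as len(candidates) + len(reserved)."""
    reserved_set = set(reserved)
    return [p for p in candidates if p in reserved_set]
```

Both functions return the same result; only the amount of work differs, which is why a change of this kind can turn minutes of CPU time into seconds on machines with many pages.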

4. Linux 2.6.10-ac1 Released; Some Samba Improvements And Problems

26 Dec 2004 - 31 Dec 2004 (27 posts) Archive Link: "Linux 2.6.10-ac1"

Topics: Disks: SCSI, Forward Port, Kernel Release Announcement, Samba, USB

People: Alan Cox, Gene Heskett, Arjan van de Ven

Alan Cox announced Linux 2.6.10-ac1, saying:

Linux 2.6.10-ac1 is a merge of the stuff that has not yet been accepted upstream along with a couple of small extra changes that are needed because of changes in 2.6.10 base. In addition the generic IRQ work in 2.6.10 means that the forward port of the irqpoll code now covers a lot more platforms.

While this has had a lot less testing than 2.6.9-ac16 it does contain much better core USB and SCSI code so may in some cases be worth an early move.

Arjan van de Ven is now building RPMS of the kernel and those can be found in the RPM subdirectory and should be yum-able. Expect the RPMS to lag the diff a little as the RPM builds and tests do take time.

Gene Heskett, who had been having problems getting Samba to work under 2.6.10, found that these problems vanished with 2.6.10-ac1, and that he could mount and unmount Samba shares in small fractions of a second, better than he'd ever seen before. He did run into a lot of warning messages, but according to Alan these may have just been old errors exposed by the improved code. They didn't seem to interfere with Samba usage.

5. Support For CSB6 RAID

27 Dec 2004 - 30 Dec 2004 (2 posts) Archive Link: "PATCH: 2.6.10 - Add support for CSB6 RAID"

Topics: Disk Arrays: RAID

People: Alan Cox, Linus Torvalds, Bartlomiej Zolnierkiewicz

Alan Cox said, "The serverworks chips include a raid variant that the 2.6 driver didn't support. This" [patch] "enables support for this and removes a pile of #if and other pointless obfuscations. This removes the need to use various vendor binary only drivers for CSB6 RAID." Bartlomiej Zolnierkiewicz liked the patch and accepted it for his set of submissions heading for Linus Torvalds.

6. Enhanced Linux Progress Patch Updated; New Maintainer

30 Dec 2004 (1 post) Archive Link: "[ANNOUNCE] Enhanced Linux Progress Patch v1.0-2.6.10"

People: Matthias Kunze

Matthias Kunze said, "I just wanted to announce that I've updated the Enhanced Linux Progress Patch to work with linux 2.6.10. As it doesn't seem to be maintained anymore i've put up a tiny homepage at where everything can be downloaded."

7. Linux 2.6.10-ac2 Released; PWC Driver Reintroduced

30 Dec 2004 - 2 Jan 2005 (4 posts) Archive Link: "Linux 2.6.10-ac2"

Topics: Kernel Release Announcement, Philips Webcam Driver

People: Alan Cox, Arjan van de Ven, Luc Saillard

Alan Cox announced Linux 2.6.10-ac2, saying, "Arjan van de Ven is now building RPMS of the kernel and those can be found in the RPM subdirectory and should be yum-able. Expect the RPMS to lag the diff a little as the RPM builds and tests do take time." Christian Hesse noticed that the PWC driver, newly restored to 2.6.10-ac2 by Luc Saillard, could only be built as a module. Christian posted a patch to allow it also to be built directly into the kernel.

8. util-linux 2.12q-pre1 Released; New Maintainer Found

1 Jan 2005 (1 post) Archive Link: "[OT] util-linux 2.12q-pre1"


People: Adrian Bunk, Andries Brouwer

Adrian Bunk announced util-linux version 2.12q-pre1, and confirmed Andries Brouwer's statements that Adrian was now the maintainer of that project. In fact, part of Adrian's patch was an update to the MAINTAINERS file.

Andries first put util-linux up for adoption in the thread covered in Issue #278, Section #6 (19 Sep 2004: New Maintainers Sought For kbd, man, man-pages, And util-linux).

9. Getting Started With Kernel Hacking

1 Jan 2005 - 3 Jan 2005 (9 posts) Archive Link: "How to start"

People: Jim Nelson, Pedro Venda, Jonathan Corbet, Alessandro Rubini

Someone asked how best to get started with kernel development, and Christoph Anton Mitterer (another newcomer) pointed him to Kernelnewbies. Jim Nelson said, "Hit - has a good selection of dead-tree and online references. The kernel-janitors project - - is a good starting point; that's where a lot of kernel hackers get their start." Pedro Venda also remarked:

this has been very recently asked on the list. some of the suggested answers were:


(last two will have new editions soon covering 2.6 kernels)

Deepak Kotian had read and liked 'Linux Kernel Development', but wondered when the 3rd edition of 'Linux Device Drivers' would be out; and Jonathan Corbet replied, "LDD3 (by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman) will, it is hoped, be ready to be the star of the show at LinuxWorld in Boston, next month. The online release will, as usual, take some time to prepare; I can't predict just when that will be."

10. Root Exploit. Or Not.

2 Jan 2005 - 4 Jan 2005 (16 posts) Archive Link: "[PATCH] disallow modular capabilities"

People: Christoph Hellwig, Lee Revell, Linus Torvalds, Chris Wright

Christoph Hellwig reported:

There's been a bugtraq report about a root exploit with modular capabilities LSM support out for more than a week.

This patch fixes it the hard way by disallowing to build the code modular. In fact I think allowing modular security policies is a really, really bad idea because loading it after boot loses far too much state. Would you take a patch killing the exports in security/ ?

Lee Revell replied, "And I posted this to LKML almost a week ago, and a real fix was posted in response." Linus Torvalds replied, "Well, I realize that it has been on bugtraq, but does that make it a real concern? I'll make the tristate a boolean, but has anybody half-way sane ever _done_ what is described by the bugtraq posting? IOW, it looks pretty much like a made-up example, also known as a "don't do that then" kind of buglet ;)" Christoph agreed that this particular case probably wasn't much to worry about, but he said, "I think we'll see more serious issues with other modular security modules. The security modules aren't really as isolated as all the driver modules we have as they're deeply interwinded with the process/file/etc state." And Chris Wright replied, "It's only a problem when you care about the state of things that have run before the module is loaded. This ranges between no problem and major problem on a case by case basis. For example, really makes sense to have SELinux only compiled in. For this one, we can just track capabilities bits in default dummy stub code, it's painless and allows keeping capabilities modular for those who use it that way."

11. Clarifying Subscriber-Only Mailing Lists In The MAINTAINERS File

3 Jan 2005 (1 post) Archive Link: "[patch] maintainers: mark linux-arm-kernel as subscription only"


People: Domen Puncer

Domen Puncer posted a patch against the MAINTAINERS file, to mark the linux-arm-kernel mailing list as requiring subscription to post.

12. Getting/Setting FAT Filesystem Attribute Bits

3 Jan 2005 - 5 Jan 2005 (20 posts) Archive Link: "[PATCH] get/set FAT filesystem attribute bits"

Topics: FS: FAT, FS: NTFS, Ioctls, Microsoft

People: H. Peter Anvin, Nicholas Miell

H. Peter Anvin said, "This patch adds a set of ioctls to get and set the FAT filesystem native attribute bits, including the unused bits (6 and 7.)" Nicholas Miell suggested, "Instead of adding another ioctl, wouldn't an xattr be more appropriate? For instance, system.fatattrs containing a text representation of the attribute bits." H. Peter replied, "This really worries me, because it's not clear to me that Microsoft isn't going to add NTFS-style xattrs to FAT in the future. There is a very specific reason why they might want to do that: since they want to keep NTFS secret and proprietary, FAT is the published interchange format that other devices can use to exchange data with MS operating systems. If we then have overloaded the xattr mechanism, that would be very ugly." Nicholas replied:

That's why I put fatattrs in the system namespace, which is wholly owned by the Linux kernel. Any theoretical FAT-with-xattrs variant would put those xattrs in the user namespace.

On another note, NTFS-style xattrs (aka named streams) are unrelated to Linux xattrs. A named stream is a separate file with a funny name, while a Linux xattr is a named extension to struct stat.

This made sense to H. Peter, and they dove into some technical details together.

13. More On FAT Attributes

3 Jan 2005 - 4 Jan 2005 (37 posts) Archive Link: "FAT, NTFS, CIFS and DOS attributes"

Topics: Extended Attributes, FS: CIFS, FS: FAT, FS: NTFS, FS: ext2, Ioctls

People: H. Peter Anvin, Anton Altaparmakov, Nicholas Miell

H. Peter Anvin said:

I recently posted to LKML a patch to get or set DOS attribute flags for fatfs. That patch used ioctl(). It was suggested that a better way would be using xattrs, although the xattr mechanism seems clumsy to me, and has namespace issues.

I also think it would be good to have a unified interface for FAT, NTFS and CIFS for these attributes.

I noticed that CIFS has a placeholder "user.DosAttrib" in cifs/xattr.c, although it doesn't seem to be implemented.


a) is xattr the right thing? It seems to be a fairly complex and ill-thought-out mechanism all along, especially the whole namespace business (what is a system attribute to one filesystem is a user attribute to another, for example.)

b) if xattr is the right thing, shouldn't this be in the system namespace rather than the user namespace?

c) What should the representation be? Binary byte? String containing a subset of "rhsvda67" (barf)?
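
To make the two representations in item (c) concrete, here is a small Python sketch that converts between a binary attribute byte and a string drawn from "rhsvda67". It assumes the letters follow the standard FAT attribute bit order (read-only, hidden, system, volume label, directory, archive, then the two unused bits 6 and 7); the function names are hypothetical and are not taken from any patch in the thread.

```python
# Letter for each bit position, lowest bit first (assumed mapping):
# r=read-only, h=hidden, s=system, v=volume label, d=directory,
# a=archive, 6 and 7 = the two otherwise unused bits.
FAT_ATTR_LETTERS = "rhsvda67"


def attrs_to_string(attrs):
    """Render an attribute byte as a subset of 'rhsvda67'."""
    if not 0 <= attrs <= 0xFF:
        raise ValueError("attribute byte out of range")
    return "".join(
        letter
        for bit, letter in enumerate(FAT_ATTR_LETTERS)
        if attrs & (1 << bit)
    )


def string_to_attrs(text):
    """Parse a subset of 'rhsvda67' back into an attribute byte."""
    attrs = 0
    for letter in text:
        bit = FAT_ATTR_LETTERS.find(letter)
        if bit < 0:
            raise ValueError(f"unknown attribute letter: {letter!r}")
        attrs |= 1 << bit
    return attrs
```

For example, a read-only file with the archive bit set has the byte 0x21, which this sketch renders as the string "ra".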

Anton Altaparmakov said the string subset in item (c) was definitely not the way to go. He said:

In NTFS, the "dos attribute flags" are part of the system information attribute which is an entity in its own right, totally separate from extended attributes (and named streams for that matter). So if I were to be thinking in an NTFS-only world I would be inclined to use an ioctl() to access/modify them (i.e. not b either). So if you implement an ioctl() for vfat I will probably be able to provide the same in NTFS with almost zero effort (we already have the code to read and write the attribute flags in the kernel ntfs driver, we just do not provide an interface for it).

But please note that it would be best if you could use 32-bits for the flags. At the very least 16-bits though as on NTFS there are currently in use 16-bits in the standard information but the field is u32 sized on disk (little endian) and two of the higher bits are in use in the file name attribute as well and I would not be surprised if more bits get used in future NTFS releases.

Nicholas Miell thought there was nothing wrong with the string subset idea; while on the other hand he felt an ioctl would definitely be the wrong way to go. He said:

Remember, the point of this exercise is to expose these attributes in such a way that tools don't have to have any special knowledge to correctly preserve them.

If I want to be able to copy files from one NTFS volume to another (preserving all their NTFS attributes), I don't want to have to teach cp to run a Linux-specific and NTFS-specific ioctl on each file on the source and destination for it to work, it should be able to see the xattrs and just do the right thing.

The fact that the NTFS "dos attribute flags" are seperate from real extended attributes isn't a problem, either. Real extended attributes can be exported in the user namespace, just like ext2/3 does. (Or are the real extended attributes something other than inert blobs of data -- does Windows care about their contents at all, or does it just store them for users who do?)
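
Nicholas's point about generic tools can be sketched with Python's standard xattr calls (os.listxattr, os.getxattr and os.setxattr, which are Linux-only). The copy_xattrs helper below is hypothetical: it copies whatever attributes it can see without knowing what any of them mean, which is the behaviour he wants from cp. Whether a FAT or NTFS driver would actually expose its flags this way was exactly the open question in the thread.

```python
import os


def copy_xattrs(src, dst):
    """Copy every extended attribute visible on src to dst.

    Returns (copied, skipped) lists of attribute names. Attributes that
    cannot be read or written (for example, for lack of privilege or
    filesystem support) are skipped rather than treated as fatal.
    """
    copied, skipped = [], []
    try:
        names = os.listxattr(src)
    except OSError:
        # Filesystem without xattr support: nothing to copy.
        return copied, skipped
    for name in names:
        try:
            os.setxattr(dst, name, os.getxattr(src, name))
            copied.append(name)
        except OSError:
            skipped.append(name)
    return copied, skipped
```

Nothing in the helper is specific to one filesystem or one attribute name, in contrast to an ioctl-based interface, where the tool would need to know the request number and the layout of the flags.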

Regarding the string subset issue, Anton argued that strings could get really ugly, and could be a problem for internationalization; while binary was shorter and better-defined. He said if a string was desired, a translation library would be the way to go, keeping binary in the back-end.

Regarding ioctl vs. xattr, Anton and Nicholas continued to disagree, without reaching any resolution. Nicholas said he didn't really care about the string vs. binary issue, saying he'd only suggested the string idea for the sake of readability of /proc files and other places.

Other folks duked it out elsewhere as well, with about as much agreement.

14. Problems Viewing Files In /proc

4 Jan 2005 - 5 Jan 2005 (8 posts) Archive Link: "[PATCH] request_irq: avoid slash in proc directory entries"

People: Olaf Hering, Andrew Morton

Olaf Hering said:

A few users of request_irq pass a string with '/'. As a result, ls -l /proc/irq/*/* will fail to list these entries.

 drivers/input/serio/maceps2.c     |    2 +-
 drivers/macintosh/via-pmu.c       |    2 +-
 drivers/net/wan/hostess_sv11.c    |    2 +-
 include/asm-sh/mpc1211/keyboard.h |    2 +-
 include/asm-sh64/keyboard.h       |    2 +-
 sound/isa/opl3sa2.c               |    2 +-

Andrew Morton replied:

hrm, interesting. So how do these entries appear in /proc? Do they actually have slashes in them?

I get the feeling that something somewhere should be detecting this and should be propagating an error back.

Olaf agreed that a quick sanity check would help, and clarified, "ls /proc/irq/*/* works, but ls -l does not because you have to stat() the entry. I havent looked in detail, just poked around in /proc."

Nathan Lynch tried hacking up the sanity check, but the thread petered out inconclusively.
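
The kind of sanity check discussed in this thread can be modeled in a few lines. The Python sketch below is not Nathan's patch or any kernel code; the function names and the choice of '-' as a replacement character are assumptions for illustration. It shows the two options on the table: report an error back to the caller, or rewrite the name so it can serve as a single directory entry.

```python
def check_proc_entry_name(name):
    """Reject names that cannot be used as a single directory entry."""
    if not name:
        raise ValueError("empty name")
    if "/" in name:
        raise ValueError(f"name contains '/': {name!r}")
    return name


def sanitize_proc_entry_name(name, replacement="-"):
    """Alternative policy: replace each '/' instead of failing."""
    return check_proc_entry_name(name.replace("/", replacement))
```

Rejecting the name matches Andrew's suggestion of propagating an error back, while Olaf's patch took the other route of changing the strings at the call sites.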

15. mkdump Updated

5 Jan 2005 (1 post) Archive Link: "mkdump updated"

People: Itsuro Oda

Itsuro Oda said:

We released the Beta-3 version of mkdump end of last year.

We checked the code from crash occur to the mini kernel start carefully and eliminate the possibility of the deadlock/hang condition. (We hope :-))

Please check it.







Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.