Kernel Traffic #199 For 6 Jan 2003

By Zack Brown



The past two issues and much of this one were written far from home, and I'd like to thank Lisa Goldstein of the FSF for letting me use one of their computers over the past couple weeks. Thanks!

Mailing List Stats For This Week

We looked at 1103 posts in 5308K.

There were 331 different contributors. 173 posted more than once. 121 posted last week too.


1. System Call Handling; Feature Freeze; Code Freeze; BitKeeper Flames

9 Dec 2002 - 30 Dec 2002 (306 posts) Archive Link: "Intel P6 vs P7 system call performance"

Topics: Bug Tracking, Code Freeze, Disks: IDE, Feature Freeze, Ottawa Linux Symposium, Patents, Spam, Version Control

People: Linus Torvalds, Horst von Brand, Mark Mielke, Alan Cox, Andrew Morton, Dave Jones, Jeff Garzik, Larry McVoy, John Bradford, Sean Neakums, Ulrich Drepper

There was a long discussion about adapting Linux to Intel's new system call instructions, sysenter and sysexit. According to a variety of folks involved in the discussion, the real key was handling these features without sacrificing too much efficiency. A number of solutions were proposed (including some by Linus Torvalds), but all seemed to be compromises of one sort or another. Ulrich Drepper also posted in this thread, saying he'd created a glibc that used the new syscall code. He got it to work, but ended up having to use some ugly hacks. A bunch of folks continued the discussion, and Linus suggested that it would be important to get this right from the start, as there would be problems changing it later. The main problem (aside from efficiency) seemed to boil down to supporting programs that might run on multiple kernel versions. And just as it seemed Linus and Ulrich (and others) were getting close to a solution, Linus pointed out that the number of syscall arguments was also an issue; so any solution had to ensure support for six syscall arguments, on top of everything else. The discussion went on with advancements and new problems, and at one point Linus said:

I'm pushing what looks like the "final" version of sysenter/sysexit for now. There may be bugs left, but all the known issues are resolved:

This is in addition to the six-argument issues and the glibc address query issues that were resolved yesterday.
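Two of the issues Linus mentions, six-argument system calls and the address query that lets glibc find the kernel-provided entry point, can be poked at from user space. The sketch below is only an illustration for a modern Linux system with glibc, not the 2.5-era code from this thread: it assumes getauxval() and AT_SYSINFO_EHDR, both of which postdate the discussion, and the helper names vdso_looks_like_elf and six_arg_mmap_roundtrip are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Address query: ask the auxiliary vector where the kernel mapped its
 * helper page. Returns 1 if a page was advertised and it starts with the
 * ELF magic, 0 otherwise (for example when no such page is provided). */
int vdso_looks_like_elf(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    if (base == 0)
        return 0;
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0;
}

/* Six arguments: mmap is a classic six-argument system call. Issue it
 * through the generic syscall() wrapper, touch the page, and unmap it.
 * Returns 0 on success, -1 on failure. */
int six_arg_mmap_roundtrip(void)
{
#ifdef SYS_mmap2
    long nr = SYS_mmap2;   /* ports with mmap2: offset counted in pages */
#else
    long nr = SYS_mmap;    /* otherwise plain mmap: offset in bytes */
#endif
    long len = 4096;
    /* Every argument is widened to long so all six slots are well defined. */
    long ret = syscall(nr, 0L, len,
                       (long)(PROT_READ | PROT_WRITE),
                       (long)(MAP_PRIVATE | MAP_ANONYMOUS),
                       -1L, 0L);
    char *page;

    if (ret == -1)
        return -1;
    page = (char *)ret;
    page[0] = 'x';
    assert(page[0] == 'x');   /* the mapping is readable and writable */
    return munmap(page, (size_t)len) == 0 ? 0 : -1;
}
```

Calling six_arg_mmap_roundtrip() from a small main() should return 0 on a typical Linux system; vdso_looks_like_elf() returns 0 wherever the kernel does not advertise such a page.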

The technical discussion went on and on, but at one point, Horst von Brand asked, "What happened to "feature freeze"?" Sean Neakums pointed out that this was technically not a new feature, since folks were only talking about optimizing the system call interface, which had existed for a long time. But Horst replied, "This "optimizing" is very much userspace-visible, and a radical change in an interface this fundamental counts as a new feature in my book." Mark Mielke remarked, "Since operating systems like WIN32 are at least published to take advantage of SYSENTER, it may not be in Linux's interest to purposefully use a slower interface until 2.8 (how long will that be until people can use?). The last thing I want to read about in a technical journal is how WIN32 has lower system call overhead than Linux on IA-32 architectures. That might just be selfish of me for the Linux community... :-)"

But Alan Cox agreed with Horst, saying he'd been wondering what happened to the feature freeze as well. He said:

2.5.49 was usable for devel work, no kernel since has been. Its stopped IDE getting touched until January.

Linus. you are doing the slow slide into a second round of development work again, just like mid 2.3, just like 1.3.60, ...

At one point Linus said:

it's a fair question.

I've been wondering how to formalize patch acceptance at code freeze, but it might be a good idea to start talking about some way to maybe put brakes on patches earlier, ie some kind of "required approval process".

I think the system call thing is very localized and thus not a big issue, but in general we do need to have something in place.

I just don't know what that "something" should be. Any ideas? I thought about the code freeze require buy-in from three of four people (me, Alan, Dave and Andrew come to mind) for a patch to go in, but that's probably too draconian for now. Or is it (maybe start with "needs approval by two" and switch it to three when going into code freeze)?

Andrew Morton replied:

It does sound a little bureacratic for this point in development.

The first thing we need is a set of widely-understood guidelines. Such as:


Once everyone understands this framework then it becomes easy to decide what to drop, what not.

So right now, sysenter is "in". Later, even "speedups" falls off the list and sysenter would at that stage be "out".

Can it be that simple?

Elsewhere, Dave Jones said:

You'd likely need an odd number of folks in this cabal^Winner circle though, or would you just do it and be damned if you got an equal number of 'aye's and 'nay's ? 8-)

Other than that, it reminds me of the way the gcc folks work, with a number of people reviewing patches before acceptance [not that this doesn't happen on l-k already], and at least 1 approval from someone prepared to approve submissions.

The approval process does seem to be quite a lot of work though. I think it was rth last year at OLS who told me that at that time he'd been doing more approving of other peoples stuff than coding himself.

Linus replied:

Quite frankly, I wouldn't expect a lot of dissent.

I suspect a group approach has very little inherent disagreement, and to me the main result of having an "approval process" is to really just slow things down and make people think about the submitting. The actual approval itself is secondary (it _looks_ like a primary objective, but in real life it's just the _existence_ of rules that make more of a difference).

He added:

I heartily disagree with the approval process for development, just because it gets so much in the way and just annoys people. But for stabilization, that's exactly what you want. So I think gcc is using the approval process much too much, but apparently it works for them.

And I think it could work for the kernel too, especially the stable releases and for the process of getting there. I just don't really know how to set it up well.

Jeff Garzik also said, "gcc's approval process looks a lot like the Linux approval process. Dave's description of rth's work sounds a lot like the Linus Role in Linux... with the exception I guess that there are multiple peer Linii in gcc, and they read every patch <runs for cover> More seriously, gcc appears to be "post the patch to gcc-patches, hope someone applies it" which is a lot more like Linux than some think :)" Alan also replied to Linus, saying:

A start might be

  1. Ack large patches you don't want with "Not for 2.6" instead of ignoring them. I'm bored of seeing the 18th resend of this and that wildly bogus patch.

    Then people know the status

  2. Apply patches only after they have been approved by the maintainer of that code area.

    Where it is core code run it past Andrew, Al and other people with extremely good taste.

  3. Anything which changes core stuff and needs new tools, setup etc please just say NO to for now. Modules was a mistake (hindsight I grant is a great thing), but its done. We don't want any more
  4. Violate 1-3 when appropriate as always, but preferably not to often and after consulting the good taste department 8)

This led into an interesting mini-flamewar, when Larry McVoy suggested:

Make it async. So anyone can review stuff and record their feelings in a centralized place. We have a spare machine set up,, that could be used as a dumping grounds for patches and reviews if is too locked down.

If you force the review process into a "push" model where patches are sent to someone, then you are stuck waiting for them to review it and it may or may not happen. Do the reviews in a centralized place where everyone can see them and add their own comments.

Alan replied only, "We've got one - its called linux-kernel." Larry replied:

Huh? That's like saying "we don't need a bug database, we have a mailing list". That's patently wrong and so is your statement. If you want reviews you need some place to store them. A mailing list isn't storage.

You'll do it however you want of course, but you are being stupid about it. Why is that?

Alan said, "We've got a bug database (bugzilla), we've got a system for seeing what opinion appears to be -kernel-list." Larry replied, "And exactly how is your statement different than "we have a system for seeing what bugs appear to be -kernel-list"?" John Bradford pegged the discussion for what it was, and said with a smile, "This forthcoming BK-related flamewar falls in to category 1, I.E. is not a 2.6 feature :-)" But Larry just came back with, "I don't understand why BK is part of the conversation. It has nothing to do with it. If every time I post to this list the assumption is that it's "time to beat larry up about BK" then it's time for me to get off the list. I can understand it when we're discussing BK; other than that, it's pretty friggin lame. If that's what was behind your posts, Alan, there is an easy procmail fix for that." Alan replied, "It wasnt me who brought up bitkeeper." To which Larry replied privately to Alan, "PLONK. Into kernel-spam you go. I've had it with ax grinders." Alan quoted this email publicly, and said:

Oh dear me. Larry McVoy has flipped

I'm now being added to his spam list for *not* mentioning bitkeeper

Poor Larry, I hope has a nice christmas break, he clearly needs it

2. Linux 2.5.53 Released

23 Dec 2002 - 30 Dec 2002 (11 posts) Subject: "Linux v2.5.53"

Topics: Device Mapper, Disks: SCSI, Power Management: ACPI, Sound: ALSA, USB

People: Linus Torvalds

Linus Torvalds announced 2.5.53 and said:

A happy christmas to you all, and in case I'm too busy putting batteries in the kids presents the rest of the year, here's a 2.5.53 for you.

It's got stuff all over - SCSI updates, ACPI, ia64, sparc, USB, net, device mapper, AGP, ALSA, you name it. Meanwhile I worked mostly on the sysenter support, we'll have to wait for glibc releases to test that out more.

Oh, and merges with Andrew and Dave.

3. User-Mode Linux 2.5.53-1 Released

27 Dec 2002 (1 post) Archive Link: "uml-patch-2.5.53-1"

Topics: User-Mode Linux

People: Jeff Dike

Jeff Dike announced:

This patch updates UML to 2.5.53. As far as UML itself is concerned, this is identical to all recent 2.5 UML releases, except that I tossed in a small fix for a race involving multiple xterms popping up at once.

I'm in the process of merging my recent 2.4 changes into my 2.5 tree, but I figured I'd get this patch out first.

The 2.5.53 UML patch is available at

For the other UML mirrors and other downloads, see

Other links of interest:

The UML project home page :
The UML Community site :

4. User-Mode Linux Security And Performance Enhancements

28 Dec 2002 (10 posts) Archive Link: "[PATCH] Allow UML kernel to run in a separate host address space"

Topics: User-Mode Linux, Version Control

People: Jeff Dike, Linus Torvalds, Jeremy Fitzhardinge

Jeff Dike offered:

Please pull either or

This allows the UML kernel to run in a different address space from its processes. The benefits include better security and much improved performance. This is a large patch, but

it's all under arch/um and include/asm-um
a lot of it is code movement

This is described fairly completely in

Linus Torvalds replied:

Pulled, but that /proc/mm crap has to go (it wasn't in this patch, or I would have rejected it).

What are the semantics the host code wants/needs, and how can we implement a sane generic mechanism that doesn't involve opening magic files?

Having co-processes isn't wrong in itself, I just want the support to be clean and generic, instead of a huge hack.

Jeff replied that he knew Linus didn't like the /proc/mm part of the patch, which was why he'd left it out this time. He said, "I realize that it's a lousy interface - I'm putting it out there because I don't really have any better ideas and I'm hoping other people do. The next iteration of that patch will turn /proc/mm into /dev/mm, but that's not really a great improvement. It just improves things around the edges a little." He also answered Linus' question about the semantics:

  1. Multiple address spaces per process
  2. Ability to make a child switch between address spaces
  3. Ability to manipulate a child's address space (i.e. mmap, munmap, mprotect on an address space which is not current->mm)

Jeremy Fitzhardinge remarked, "I suspect Valgrind could use this too at some point. There hasn't been much discussion about it yet, but I think Valgrind may well move towards a more complete virtualization in a later round of development, and isolating the virtual virtual address space from the Valgrind's real virtual address space would be very useful. (Jeff suggested the idea of merging Valgrind and UML at some level, which does raise some interesting possibilities.)" And Jeff said:

Yes, valgrind already has a pseudo-scheduler, a psuedo-threads library, it delivers signals by hand, and it wants to run its client in a separate thread so it can get out of the business of being an LD_PRELOAD shared library.

This is all stuff that UML has, that UML does right (/me crosses fingers), and that is usable by Valgrind (and anything else that's interested) with some repackaging of UML as a library.

Replacing Valgrind's signal delivery with UML's is a no-brainer. Replacing its scheduler and threads library would involve it creating UML processes by calling UML's do_fork(). Valgrind would need to provide the low-level switch_to, I think. There are probably other things that Valgrind would need to provide, but I see no reason this wouldn't work.

Linus wasn't 100% happy with Jeff's description however, and the two of them went over some of the technical details. Linus also had some comments on Jeff's suggestion that they add a new file descriptor argument to mmap() and related system calls. Linus replied:

I do believe that fd's are a natural way to handle it, since it needs _some_ kind of handle, and the only generic handles the kernel has is a file descriptor. We could create a new kind of handle, but it would be likely to be just more complexity.

HOWEVER, the part I worry about is creating tons of new system calls that just duplicate existing ones by adding a "fd" argument. That part I really don't much like. Because if this were to really be a generic feature, it really wants pretty much _all_ system calls supported, ie things like

        fd = open(<mm,ptr>ags, ...);

        retval = read(<tr>

to allow the user to not just mmap but generally "take the guise of" any other mm for the duration of the system call.

Which really means that I _think_ the right approach would be to literally have a "indirect-system-call-using-this-mm" system call, which does something like

        asmlinkage sys_mm_indirect(int fd, struct syscall_descriptor_block *user_args)
                struct mm_struct *old_mm;
                struct syscall_descriptor_block args;

                if (memcpy_from_user(&args, user_args, sizeof(args)))
                        return -EFAULT;

                mm = get_fd_mm(fd);
                old_mm = current->mm;
                current->mm = mm;


                current->mm = old_mm;

which allows _any_ system call to be made for that mm.
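The swap, call, restore pattern in Linus's pseudocode can be tried out in ordinary user space. What follows is a toy model only, under the assumption that a plain pointer can stand in for current->mm; the types toy_mm and toy_call_desc and the functions toy_dispatch and toy_mm_indirect are hypothetical names made up for this sketch, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct mm_struct and the
 * syscall_descriptor_block in the quoted pseudocode. */
struct toy_mm { int id; long brk; };
enum toy_call { TOY_GET_ID, TOY_GROW };
struct toy_call_desc { enum toy_call nr; long arg; };

/* Plays the role of current->mm. */
static struct toy_mm *current_mm;

/* An ordinary "system call": it only ever looks at current_mm. */
static long toy_dispatch(const struct toy_call_desc *d)
{
    switch (d->nr) {
    case TOY_GET_ID:
        return current_mm->id;
    case TOY_GROW:
        current_mm->brk += d->arg;
        return current_mm->brk;
    }
    return -1;
}

/* The indirect call: borrow the target context, run the call, restore. */
long toy_mm_indirect(struct toy_mm *target, const struct toy_call_desc *d)
{
    struct toy_mm *old_mm = current_mm;
    long ret;

    if (target == NULL || d == NULL)
        return -1;
    current_mm = target;
    ret = toy_dispatch(d);
    current_mm = old_mm;
    return ret;
}

/* Small self-check; returns 0 if the swap-and-restore behaves. */
int toy_mm_indirect_demo(void)
{
    struct toy_mm parent = { 1, 0 }, child = { 2, 0 };
    struct toy_call_desc grow = { TOY_GROW, 4096 };
    struct toy_call_desc get_id = { TOY_GET_ID, 0 };

    current_mm = &parent;
    assert(toy_mm_indirect(&child, &grow) == 4096); /* acts on the child */
    assert(parent.brk == 0);                        /* parent untouched */
    assert(current_mm == &parent);                  /* context restored */
    assert(toy_dispatch(&get_id) == 1);             /* direct call sees parent */
    return 0;
}
```

The point of the design is that toy_dispatch never needs a second variant taking an extra handle: every existing call works on another context simply because the wrapper changes what "current" means for its duration.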

Jeff said, "Hmmm, I wasn't planning on going that far, but this certainly works for UML"

5. Possible Violation Of GPL By TimeSys

29 Dec 2002 (2 posts) Archive Link: "TimeSys violating GPL?"

People: Martijn Sipkema, Rik van Riel

Martijn Sipkema asked, "Is TimeSys violating the GPL by extending Linux with new features (high resolution clocks and timers, protection against priority inversion) by adding a proprietary loadable kernel module?" Rik van Riel replied:

If their module is a derivative of GPL code, then yes.

If the total work consisting of GPL code and their proprietary module is a derivative of GPL code, then probably.

There were no other replies.

6. Status Of 2.5 Alpha Port

29 Dec 2002 - 30 Dec 2002 (15 posts) Archive Link: "Alpha port still maintained in 2.5"

People: Sam Ravnborg, Hannes Reinecke, Ivan Kokshaysky, Richard Henderson

Markus Pfeiffer noticed that the 2.5.53 kernel was obviously broken for Alpha. Before banging his head against all the problems, he wanted to know if anyone else had left any brow marks he could follow. Sam Ravnborg replied:

Richard Henderson is working with tgafb on alpha. I'm looking into the architecture specific Makefiles in cooperation with Richard.

I recall alpha patches from others as well, but do not recall anything about module support.

Hannes Reinecke also gave a link to some fixes, and added, "to answer the original question: _Actively_ being maintained is a bit of an euphemism, 'occasionally being patched' is probably more accurate. Richard Henderson and Ivan Kokshaysky are the main men behind the port. I try to give the port the occasional bug-fix."

7. Adding sysenter Support To glibc

29 Dec 2002 - 30 Dec 2002 (4 posts) Archive Link: "glibc binaries w/ sysenter support"

Topics: Bug Tracking, Version Control

People: Ulrich Drepper, Linus Torvalds

Ulrich Drepper announced:

After quite some fiddling we finally have some glibc binaries with sysenter support. The problems were not in the sysenter code but in coordinating everything in so that it works on old kernels (without TLS support).

Anyway, the result can be downloaded from

These RPMs are drop-in replacements for the ones in the last Red Hat beta, released about a week ago. They haven't been tested in any other environment. They also use NPTL as the default libpthread. As is the case with every beta release code, do *not* install them on production machines. We see no problems with the new code but your mileage may vary. If you see problems ideally file them in Red Hat's bugzilla (remember these are RH-specific binaries). Alternatively send reports to ( . If you suspect the problem is related to the kernel side you know where to post.

Linus Torvalds replied:


Having a full system like this showed a few special cases in sysenter handling, where some system calls really depend on the old "iret" return path.

Notably, "sys_iopl()" requires the iret path because that's the only way to restore the full eflags, and "execve()" requires the iret return path because it needs to start up the new process with fixed values in %edx/%ebx, and the stack has a new layout and no longer contains the required sysexit fixup code.

I've pushed the fix for both of these issues to the kernel -bk trees.

Without the fix, a system with sysenter support would not boot up cleanly with these libraries due to the execve() issues, and X wouldn't start because of the iopl() problem.

With this in place, I've not seen any strange behaviour.
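The rule Linus describes, that most system calls can return through the fast path while a few must fall back to iret, can be pictured as a simple lookup. This is a hedged illustration only: the enum names and toy_pick_return_path are hypothetical, and the sketch does not claim to mirror how the kernel actually records or makes this decision.

```c
#include <assert.h>

/* Hypothetical names; these are not the kernel's identifiers. */
enum toy_syscall { TOY_SYS_READ, TOY_SYS_WRITE, TOY_SYS_IOPL, TOY_SYS_EXECVE };
enum toy_return_path { TOY_RET_SYSEXIT, TOY_RET_IRET };

/* Pick the return path for a call that entered via sysenter. */
enum toy_return_path toy_pick_return_path(enum toy_syscall nr)
{
    switch (nr) {
    case TOY_SYS_IOPL:    /* only iret restores the full eflags */
    case TOY_SYS_EXECVE:  /* new image: fixed register values, new stack layout */
        return TOY_RET_IRET;
    default:
        return TOY_RET_SYSEXIT;
    }
}

/* Small self-check; returns 0 when the two special cases are honoured. */
int toy_return_path_demo(void)
{
    assert(toy_pick_return_path(TOY_SYS_READ) == TOY_RET_SYSEXIT);
    assert(toy_pick_return_path(TOY_SYS_IOPL) == TOY_RET_IRET);
    assert(toy_pick_return_path(TOY_SYS_EXECVE) == TOY_RET_IRET);
    return 0;
}
```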

8. Unclaimed Bugs In Bugzilla

29 Dec 2002 - 30 Dec 2002 (9 posts) Archive Link: "Current unclaimed 2.5 bugs on"

Topics: Bug Tracking

People: Martin J. Bligh

Martin J. Bligh reported, "We have a growing number of unclaimed bugs on These have defaulted back to Khoa or myself, as the category does not have an owner ... if anyone is interested in working on these, that'd be a great help. Either just append comments to the bugs, or contact me and I can reassign them to you if you're going to work on them real soon (let me know your account name). Some may be fixed already, and just need confirmation." A bunch of folks claimed various bugs, and it turned out many straggling bugs already had fixes in one tree or another.

9. Support For The Promise 20376 RAID Controller

30 Dec 2002 - 31 Dec 2002 (5 posts) Archive Link: "Promise 20376 support"

Topics: Disk Arrays: RAID, Disks: IDE, PCI, Serial ATA

People: Marcel J.E. Mol, Alan Cox

Marcel J.E. Mol asked, "I've got this Asus A7V8X motherboard that contains a promise 20376 sata-ide (raid) controller. In the latest kernel sources (2.4 and 2.5) I don't see any mention of this chip yet. Also a google search does not reveal much about linux support. Is there already any work in progress for it?" Alan Cox replied, "No work, no documentation. If its just a SATA bridge with an existing ATA controller then you may find you can just add the PCI identifiers and pretend its a 20276. If it has other new and wonderous features you may be completely screwed."

10. Possible Replacement For Bugzilla

30 Dec 2002 - 31 Dec 2002 (4 posts) Archive Link: "New kernel bug database on-line"

Topics: Bug Tracking

People: John Bradford, Greg KH

John Bradford announced:

A couple of weeks ago, I started a thread about writing a bug database dedicated to Linux kernel development.

My theory is that by making it Linux kernel development specific, it can save more time, and make bug tracking easier than a generic bug database.

Anyway, version 1.0 is now on-line:

For the time being, you'll have to E-Mail me a request for a user account, (which you need to do anything with it), but I've also put some screenshots on-line here:

Basically, it's designed around two main principles:

There is also a command line interface, which will eventually be accessible via E-Mail, but for the time being it is only accessible via the web. The command line interface currently allows you to list the bugs, get details about them, and add comments.

Any comments on this new bug database would be very much appreciated!

Regarding emailing requests for user accounts, Greg KH replied, "Automated account creation should be your first new feature you add to this program, almost no one will use this if you make them do that." John replied, "The server runs about 100 other websites, and I had visions of it getting posted to Slashdot... Besides, I finished the code at about 11 PM yesterday, I want to have a look at it again this morning to check for security holes before everybody r00ts the box :-)."

Elsewhere and a few hours later, under the Subject: Guest access to new bug database, John announced:

OK, I've fixed some bugs and added a guest account to my new bug database:

Username: guest
Password: guest

You can't submit bug reports or add comments using the guest account, (it will appear to let you, but then just not add the data), but you can search, and try out the command line interface.

If you want a real account, drop me an E-Mail.

11. Futex Documentation

30 Dec 2002 (1 post) Archive Link: "[DOCUMENTATION] Futex manpages"

People: Bert Hubert

Bert Hubert announced:

After consulting with Rusty, I'm happy to present an initial cut at manpages for Futexes in Linux 2.5.40 and onwards.

Please find DocBook, HTML and troff on and on

Andries, please consider these for adoption.

12. Fast Access To The Process List

31 Dec 2002 (1 post) Subject: "[ANNOUNCE] fast access to process list"

Topics: Big O Notation

People: Alex Tomas

Alex Tomas announced:

I'd like to present 2nd version of fastps.

Changes at kernel side:

Changes in userspace tool fps:

Patches against 2.4.20/2.5.53 and userspace tool may be found at

13. Linux Hacker Running For President Of USA

31 Dec 2002 (4 posts) Archive Link: "The only way around Microsoft"

Topics: Microsoft

People: Rick Hohensee, Dr. David Alan Gilbert

Rick Hohensee announced:

This is to announce my availability for President of the United States of America. I am a 46 year old US citizen. My platform is extreme openness. As President, I will install high-quality motion-picture cameras everywhere I sleep, with some distribution mechanism for the interested adult public to audit the produce of such cameras. Rest assured, there is currently no reason to suspect that such cameras would detect anything notable. As President, I will make every effort to release most government-held secrets. Under me, the Executive Branch will convert all personal IBM PC-type computers to unix-like or other open-source operating systems. Et cetera. Extreme openness. I will accept the nomination of either major US political party. If I don't get a major nomination or huge independant support, I support the Democratic Party candidate. Until subsumed by a major established party, consider me the leader of the "Responsible Party".

Send contributions to

Rick Hohensee
3234 Powder Mill Rd.
Adelphi, Maryland

Please identify yourself accurately with any contributions. Anonymous or inadequately accounted contributions will be noted on the Internet and donated to a D.C. service for the homeless, or will be distributed to the homeless by me personally. Contributions and contributors, if any, will be listed on the Internet. If you are contributing for or as an organization, detail the contributors to the organization to at least a 1% resolution. As far as I know, I may have already hereby violated some campaign finance law. If so, please advise. Looking at a campaign one step at a time, if you contribute by check, please note that I don't currently have a checking account and it requires about $100-- to start one.

Class is accountable. Party responsibly.

Dr. David Alan Gilbert replied, "From what I hear it isn't what presidents do while they are asleep which needs watching." To which Rick said, "Something tells me I won't be getting much sleep as President." David also asked, "Where do you stand on the Free Fish For Penguins issue?" and Rick replied, "Penguin #1 gets his caviar from Microsoft. I'll probably make the State Department use Plan 9."

A bit later, Rick added:

Oh by the way, I need hackers for my cabinet. I have the US Code on my FTP site,

It took a week to pull it out of the retarded privatized interface the GPO has it behind. Talk about bloat. It makes Common Lisp look like Chuck Moore's latest Forth chip. I also need somebody to liberate the CFR, the Code of Federal Regulations.

I need some people that haven't forgotten how to use a computer to do basic cleanups like convert "intelligence" in the CIA sense to "strinfo", shut down the war on herbs, eliminate using the tax laws as a form of subsidy, et cetera. The better hackers will get cabinet positions. Otherwise I'll promote from the existing staff.








Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.