Kernel Traffic #149 For 7 Jan 2002

By Zack Brown

Introduction

I'd like to take a moment to mention Bookfinder (http://www.bookfinder.com/), a very nice free book-finding service. It queries many different booksellers on the net and returns the combined results. You can specify whether you want new or used books, particular price ranges, and a lot of other criteria. I use it for all my book searches. They also have a couple of mailing lists (http://www.bookfinder.com/interact/lists/) for discussing the service. They're really committed to making the site kick ass.

To test it out, try searching for books on Linux (http://www.bookfinder.com/search/?title=linux&st=xl&ac=qr).

Mailing List Stats For This Week

We looked at 1382 posts in 5590K.

There were 366 different contributors. 187 posted more than once. 137 posted last week too.

The top posters of the week were:

1. The Scheduler; Development Philosophy; IRC

15 Dec 2001 - 27 Dec 2001 (174 posts) Archive Link: "Re: Just a second ..."

Topics: Development Philosophy, FS: ext3, Ottawa Linux Symposium, Process Scheduling, SMP, Scheduler, Spam, Virtual Memory

People: Linus Torvalds, Davide Libenzi, Benjamin LaHaise, H. Peter Anvin, Alexander Viro, Rik van Riel, Ingo Molnar, Alan Cox

Davide Libenzi asked Linus Torvalds why he had abstained from any discussion of the future of the scheduler code. Linus replied:

I just don't find it very interesting. The scheduler is about 100 lines out of however-many-million (3.8 at least count), and doesn't even impact most normal performace very much.

We'll clearly do per-CPU runqueues or something some day. And that worries me not one whit, compared to thigns like VM and block device layer ;)

I know a lot of people think schedulers are important, and the operating system theory about them is overflowing - it's one of those things that people can argue about forever, yet is conceptually simple enough that people aren't afraid of it. I just personally never found it to be a major issue.

Let's face it - the current scheduler has the same old basic structure that it did almost 10 years ago, and yes, it's not optimal, but there really aren't that many real-world loads where people really care. I'm sorry, but it's true.

And you have to realize that there are not very many things that have aged as well as the scheduler. Which is just another proof that scheduling is easy.

We've rewritten the VM several times in the last ten years, and I expect it will be changed several more times in the next few years. Withing five years we'll almost certainly have to make the current three-level page tables be four levels etc.

In comparison to those kinds of issues, I suspect that making the scheduler use per-CPU queues together with some inter-CPU load balancing logic is probably _trivial_. Patches already exist, and I don't feel that people can screw up the few hundred lines too badly.
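To make the idea concrete, here is a minimal Python sketch of per-CPU run queues with a naive load-balancing pass. It is a toy model, not kernel code: the class PerCpuRunqueues, its method names, and the policy of moving one task at a time from the busiest queue to the idlest are all assumptions made for illustration, and none of them come from the patches Linus mentions.

```python
from collections import deque


class PerCpuRunqueues:
    """Toy model: one run queue per CPU plus a simple balancing pass."""

    def __init__(self, ncpus):
        self.queues = [deque() for _ in range(ncpus)]

    def enqueue(self, cpu, task):
        self.queues[cpu].append(task)

    def pick_next(self, cpu):
        # Each CPU only looks at its own queue, so picking a task
        # does not need to touch any other CPU's data.
        queue = self.queues[cpu]
        return queue.popleft() if queue else None

    def balance(self):
        # Move tasks from the busiest queue to the idlest one until
        # their lengths differ by at most one. Returns the move count.
        moved = 0
        while True:
            busiest = max(self.queues, key=len)
            idlest = min(self.queues, key=len)
            if len(busiest) - len(idlest) <= 1:
                return moved
            idlest.append(busiest.pop())
            moved += 1
```

In this model the only cross-CPU work happens in balance(), which is where the design questions Davide raises below (which task to move, and when) would actually live.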

Davide replied that he saw no reason why the scheduler couldn't be addressed in the 2.5 timeframe, saying, "It's no more important of anything else, it's just one of the remaining scalability/design issues. No, it's not more important than VM but there're enough people working on VM. And the hope is to get the scheduler right with an ETA of less than 10 years." To Linus' statement that achieving the perfect scheduler was not so important, Davide said, "Moving to 4, 8, 16 CPUs the run queue load, that would be thought insane for UP systems, starts to matter. Just to leave out cache line effects. Just to leave out the way the current scheduler moves tasks around CPUs. Linus, it's not only about performance benchmarks with 2451 processes jumping on the run queue, that i could not care less about, it's just a sum of sucky "things" that make an issue. You can look at it like a cosmetic/design patch more than a strict performance patch if you like." Finally, Davide took exception to all the times Linus said that various things were easy or trivial. He rebutted, "I would not call selecting the right task to run in an SMP system trivial. The difference between selecting the right task to run and selecting the right page to swap is that if you screw up with the task the system impact is lower. But, if you screw up, your design will suck in both cases. Anyway, given that 1) real men do VM ( i thought they didn't eat quiche ) and easy-coders do scheduling 2) the schdeuler is easy/trivial and you do not seem interested in working on it 3) whoever is doing the scheduler cannot screw up things, why don't you give the responsibility for example to Alan or Ingo so that a discussion ( obviously easy ) about the future of the schdeuler can be started w/out hurting real men doing VM ? 
I'm talking about, you know, that kind of discussions where people bring solutions, code and numbers, they talk about the good and bad of certain approaches and they finally come up ( after some sane fight ) with a much or less widely approved solution. The scheduler, besides the real men crap, is one of the basic components of an OS, and having a public debate, i'm not saying every month and neither every year, but at least once every four years ( this is the last i remember ) could be a nice thing. And no, if you do not give to someone that you trust the "power" to redesign the scheduler, no schdeuler discussions will start simply because people don't like the result of a debate to be dumped to /dev/null."

Linus replied that 2.5 might be a reasonable time-frame, although he added, "There are issues that are about a million times more important." He explained, "4 cpu's are "high end" today. We can probably point to tens of thousands of UP machines for each 4-way out there. The ratio gets even worse for 8, and 16 CPU's is basically a rounding error. You have to prioritize. Scheduling overhead is way down the list." Davide replied that there was no need to serialize every task, and Linus said, "Well, you explicitly _asked_ me why I had been silent on the issue. I told you." Elsewhere, he added:

Fight it out. Don't involve me, because I don't think it's even a challenging thing. I wrote what is _still_ largely the algorithm in 1991, and it's damn near the only piece of code from back then that even _has_ some similarity to the original code still. All the "recompute count when everybody has gone down to zero" was there pretty much from day 1.

Which makes me say: "oh, a quick hack from 1991 works on most machines in 2001, so how hard a problem can it be?"

Fight it out. People asked whether I was interested, and I said "no". Take a clue: do benchmarks on all the competing patches, and try to create the best one, and present it to me as a done deal.
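For readers who have not seen it, the "recompute count when everybody has gone down to zero" idea can be sketched in a few lines of Python. This is a simplified model and not the kernel's code: the Task class and the function names are hypothetical, and the recomputation rule used here (halve whatever is left, then add the task's priority) is an assumption about the general shape of the algorithm.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority   # ticks granted at each recomputation
        self.counter = priority    # ticks left in the current round
        self.runnable = True


def recompute(tasks):
    # Applied to every task, runnable or not: a task that slept through
    # the round keeps half of its unused ticks as a bonus.
    for t in tasks:
        t.counter = (t.counter >> 1) + t.priority


def schedule(tasks):
    """Return the runnable task with the most ticks left, or None."""
    runnable = [t for t in tasks if t.runnable]
    if not runnable:
        return None
    if all(t.counter == 0 for t in runnable):
        recompute(tasks)
    return max(runnable, key=lambda t: t.counter)


def tick(task):
    # Called once per timer tick for the task that is currently running.
    if task.counter > 0:
        task.counter -= 1
```

A caller would alternate schedule() and tick(): tasks run until their counters reach zero, and only when every runnable task is at zero does the whole set get a fresh allotment.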

Back in the main subthread, a few posts down the line, Benjamin LaHaise asked, "Well, what about those of us who need syscall numbers assigned for which you are the only official assigned number registry?" And Linus replied, "I've told you a number of times that I'd like to see the preliminary implementation publicly discussed and some uses outside of private companies that I have no insight into.." H. Peter Anvin mentioned, "There was a group at IBM who presented on an alternate SMP scheduler at this year's OLS; it generated quite a bit of good discussion." But Benjamin felt that Linus' requirements posed a serious chicken-and-egg problem. Linus replied, "Why? I'd rather have people playing around with new system calls and _test_ them, and then have to recompile their apps if the system calls move later, than introduce new system calls that haven't gotten any public testing at all.." Benjamin replied that he had posted code, and that people were testing it out, but that he couldn't force them to post their comments on the list. To this, Linus said:

Well, if people aren't interested, then it doesn't _ever_ go in.

Remember: we do not add features just because we can.

Quite frankly, I don't think you've told that many people. I haven't seen any discussion about the aio stuff on linux-kernel, which may be because you posted several announcements and nobody cared, or it may be that you've only mentioned it fleetingly and people didn't notice.

Take a look at how long it took for ext3 to be "standard" - I put them in my tree when I started getting real feedback that it was used and people liked using it. I simply do not like applying patches "just to get users". Not even reservations - because I reserve the right to _never_ apply something if critical review ends up saying that "that doesn't make sense".

Quite frankly, the fact that it is being tested out at places like Oracle etc is secondary - those people will use anything. That's proven by history. That doesn't mean that _I_ accept anything.

Now, the fact that I like the interfaces is actually secondary - it does make me much more likely to include it even in a half-baked thing, but it does NOT mean that I trust my own taste so much that I'd do it "under the covers" with little open discussion, use and modification.

Where _is_ the discussion on linux-kernel?

Where are the negative comments from Al? (Al _always_ has negative comments and suggestions for improvements, don't try to say that he also liked it unconditionally ;)

Alexander Viro replied:

Heh.

Aside of a _big_ problem with exposing async API to userland (for a lot of reasons, including usual quality of async code in general and event-drivel one in particular) there is more specific one - Ben's long-promised full-async writepage() and friends. I'll believe it when I see it and so far it didn't appear.

So for the time being I'm staying the fsck out of that - I don't like it, but I'm sick and tired of this sort of religious wars.

When Linus asked where the linux-kernel list discussion was, Rik van Riel replied with a smirk, "Which mailing lists do you want to be subscribed to ? ;)" To which Linus said:

I'm not subscribed to any, thank you very much. I read them through a news gateway, which gives me access to the common ones.

And if the discussion wasn't on the common ones, then it wasn't an open discussion.

And no, I don't think IRC counts either, sorry.

Rik replied, "Whether you think it counts or not, IRC is where most stuff is happening nowadays." To which Ingo Molnar said:

most of the useful traffic on lkml cannot be expressed well on IRC. While IRC might be useful as an additional form of communication channel, email lists IMO should still be the main driving force of Linux kernel development, else we'll only concentrate on those minute ideas that can be expressed in 1-2 lines on irc and which are simple enough to be understood until the next message comes. Also, the lack of reliable archiving of IRC traffic prevents newcomers of reproducing the thought process afterwards. While IRC might result in the seasoned kernel developer doing the next super-patch quickly, it will in the end effect only isolate and alienate newcomers and will only result in an aging, personality-driven elitist old-boys network and a dying OS.

Regarding the use of IRC as the main development medium for the Linux kernel - the fast pace of IRC often prevents deeper thoughts - while this is definitely the point for many people who use IRC, it cannot result in a much better kernel. [that having said, i'm using irc on a daily basis as well so this is not irc-bashing, but i rarely use it for development purposes.]

It's true that reading off-topic emails on lkml isnt a wise use of developer powers either, but this has to be taken into account just like spam - it's the price of having an open forum.

and honestly, much of the complaints about lkml's quality are exagerated. What you dont take into account is the fact that while 3 or 5 years ago you found perhaps every email on lkml exciting and challenging, today you are an experienced kernel hacker and find perhaps 90% of the traffic 'boring'. I've just done a test - and perhaps i picked the wrong set of emails - but the majority of lkml traffic is pretty legitimate, and i would have found most of them 'interesting and exciting' just 5 years ago. Today i know what they mean and might find them less challenging to understand - but that is one of the bad side-effects of experience. Today there are more people on lkml, more bugs get reported, and more patches are discussed - so keeping up with lkml traffic is harder. Perhaps it might make sense to separate linux-kernel into two lists: linux-kernel-bugs and linux-kernel-devel (without moderation), but otherwise the current form and quality of discussions (knock on wood) is pretty OK i think.

also, more formal emails match actual source code format better than the informal IRC traffic. So by being kindof forced to structure information into a larger set of ASCII text, it will also be the first step towards good kernel code.

(on IRC one might be the super-hacker with a well-known nick, entering and exiting channels, being talked to by newbies. It might boost one's ego. But it should not cloud one's judgement.)

Alan Cox also replied to Linus, saying, "If the discussion was on the l/k list then most kernel developers arent going to read it because tey dont have time to wade through all the crap that doesnt matter to them." He added, "IRC is where most stuff, especially cross vendor stuff is initially discussed nowdays, along with kernelnewbies where most of the intro stuff is - but thats disussed rather than formally proposed and studied."

The discussion then veered off, meandering through a number of different topics, never lingering long enough to draw any conclusions.

2. Some Discussion Of Development Philosophy

17 Dec 2001 - 2 Jan 2002 (139 posts) Archive Link: "The direction linux is taking"

Topics: Bug Tracking, Development Philosophy, Samba, Spam, Version Control

People: Eyal Sohya, Ed Borasky, David Weinehall, Dana Lacoste, Alan Cox, Rik van Riel, Russell King, Linus Torvalds, Troy Benjegerdes, Daniel Phillips, Andrew Tridgell, Jeff Garzik, Richard Gooch

Eyal Sohya stepped forward and said:

I've watched this List and have some questions to ask which i would appreciate are answered. Some might not have definite answers and we might be divided on them.

  1. Are we satisfied with the source code control system ?
  2. Is there enough planning for documentation ? As another poster mentioned, there are new API and we dont know about them.
  3. There is no central bug tracking database. At least people should know the status of the bugs they have found with some releases.
  4. Aggressive nature of this mailing list itself may be a turn off to many who would like to contribute.

Ed Borasky suggested, "I personally favor a full SEI CMM (http://www.sei.cmu.edu/cmm/cmm.html) level 2 or even level 3 process. Whether there are open source tools to facilitate that process is another story." But David Weinehall objected:

With SEI CMM level 3 for the kernel, complete testing and documentation, we'd be able to release a new kernel every 5 months, with new drivers 2 years after release of the device, and support for new platforms 2-3 years after their availability, as opposed to 1-2 years before (IA-64, for instance...)

We'd also kill off all the advantages that the bazaar-style development style actually has, while gaining nothing in particular, except for a slow machinery of paper-work. No thanks.

I don't complain when people do proper documentation and testing of their work; rather the opposite, but it needs to be done on a volunteer basis, not being forced by some standard. Do you really think Linus would be able to take all the extra work of software engineering? Think again. Do you honestly believe he'd accept doing so in a million years? Fat chance.

Grand software engineering based on PSP/CMM/whatever is fine when you have a clear goal in mind; a plan stating what to do, detailing everything meticously. Not so for something that changes directions on pure whim from one week to the next, with the only goal being improvement, expansion and (sometimes) simplification.

Eyal said that such a rigorous system was not necessary, but that "A system of checks and development so that things that used to work dont get broken is hardly too much to expect." There was no reply to this, but elsewhere, Dana Lacoste argued that, "Alan (2.2) and Marcelo (2.4) and Linus (2.5) are doing a good job with source control. The fact the 'source control' is a person and not a piece of software is irrelevant." Alan Cox replied, "Not really. We do a passable job. Stuff gets dropped, lost, deferred and forgotten, applied when it conflicts with other work - much of this stuff that software wouldnt actually improve on over a person."

Further along in the thread, Dana took Alan's statement to mean that if a better solution could be found, it would be adopted. He suggested discussing it further. Rik van Riel replied:

Sounds like an idea, except that up to now I haven't seen any suitable solution for this.

The biggest problem right now seems to be that of patches being dropped, which is a direct result of the kernel maintainers not having infinite time.

A system to solve this problem would have to make it easier for the kernel maintainers to remember patches, while at the same time saving them time. I guess it would have something like the following ingredients:

  1. remember the patches and their descriptions
  2. have the possibility for other people (subsystem maintainers?) to de-queue or update pending patches
  3. check at each pre-release if the patches still apply, notify the submitter if the patch no longer applies
  4. make an easy "one-click" solution for the maintainers to apply the patch and add a line to the changelog ;) (all patches apply without rejects, patches which don't apply have already been bounced back to the maintainer by #3)
  5. after a new pre-patch, send the kernel maintainer a quick overview of pending patches
  6. patches can get different priorities assigned, so the kernel maintainers can spend their time with the highest-priority patches first
  7. .. ?

All in all, if such a system is ever going to exist, it needs to _reduce_ the amount of work the kernel maintainers need to do, otherwise it'll never get used.
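Several of Rik's ingredients (remembering patches, de-queueing, re-checking at each pre-release, priorities, and the overview mail) can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the names PendingPatch and PatchQueue are hypothetical, and the still_applies callback stands in for whatever would really test a patch against the new tree.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingPatch:
    priority: int                              # lower number = more urgent
    title: str = field(compare=False)
    submitter: str = field(compare=False)
    description: str = field(compare=False)


class PatchQueue:
    def __init__(self):
        self.pending = []

    def submit(self, patch):
        # Ingredient 1: remember the patch and its description.
        self.pending.append(patch)

    def dequeue(self, title):
        # Ingredient 2: let someone withdraw a pending patch.
        self.pending = [p for p in self.pending if p.title != title]

    def recheck(self, still_applies):
        # Ingredient 3: at each pre-release, bounce patches that no
        # longer apply and report whom to notify.
        kept, bounced = [], []
        for p in self.pending:
            (kept if still_applies(p) else bounced).append(p)
        self.pending = kept
        return [(p.submitter, p.title) for p in bounced]

    def overview(self):
        # Ingredients 5 and 6: a short summary, most urgent first.
        return [f"[{p.priority}] {p.title} ({p.submitter})"
                for p in sorted(self.pending)]
```

The "one-click" apply step (ingredient 4) is deliberately left out, since it depends entirely on how a given maintainer keeps their tree.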

Alan said that Andrew Tridgell had written jitterbug (http://samba.anu.edu.au/jitterbug/) years ago, which solved all those problems, but that Linus Torvalds refused to use it. Rik spat out, "I don't care about Linus, he drops so many bugfixes his kernel have done nothing but suck rocks since the 2.1 era. This system could be useful for people who _are_ maintainers, however." Alan pointed him to the jitterbug page, and added after a post or two, "I have a system I am happy with, I save stuff that looks worth applying into a TO_APPLY directory then merge it in logical chunks."

At one point, Russell King (ARM port maintainer) said:

Speaking as someone who _does_ use a system for tracking patches, I believe that patch management systems are a right pain in the arse.

If the quality of patches aren't good, then it throws you into a problem. You have to provide people with a reason why you discarded their patch, which provides people with the perfect opportunity to immediately start bugging you about exactly how to make it better. If you get lots of such patches, eventually you've got a mailbox of people wanting to know how to make their patches better.

I envy Alan, Linus, and Marcelo for having the ability to silently drop patches and wait for resends. I personally don't believe a patch tracking system makes life any easier. Yes, it means you can't loose patches, but it means you can't accidentally loose them on purpose. This, imho, makes life very much harder.

But Alan replied (regarding the ability to silently drop patches), "I go to great lengths to try and avoid that. People often get very short replies but I try to make sure if the patch isnt queued to apply they get a reply. Sometimes I have to sit on them for a week until I understand why I don't like them. The things I happily drop are people arguing about why I dropped their patch."

Rik said to Russell, "I'm not going to resend more than twice. If after that a critical bugfix isn't applied, I'll put it in our kernel RPM and the rest of the world has tough luck." And Linus replied to this:

Which, btw, explains why I don't consider you a kernel maintainer, Rik, and I don't tend to apply any patches at all from you. It's just not worth my time to worry about people who aren't willing to sustain their patches.

When Al Viro sends me a patch that I apply, and later sends me a fix to it that I miss for whatever reason, I can feel comfortable in the knowledge that he _will_ follow up, not just whine. This makes me very willing to apply his patches in the first place.

Replace "Al Viro" with Jeff Garzik, David Miller, Alan Cox, etc etc. See my point?

This is not about technology. This is about sustainable development. The most important part to that is the developers themselves - I refuse to put myself in a situation where _I_ need to scale, because that would be stupid - people simply do not scale. So I require others to do more of the work. Think distributed development.

Note that things like CVS do not help the fundamental problem at all. They allow automatic acceptance of patches, and positively _encourage_ people to "dump" their patches on other people, and not act as real maintainers.

We've seen this several times in Linux - David, for example, used to maintain his CVS tree, and he ended up being rather frustrated about having to then maintain it all and clean up the bad parts because I didn't want to apply them (and he didn't really want me to) and he couldn't make people clean up themselves because "once it was in, it was in".

I know that source control advocates say that using source control makes it easy to revert bad stuff, but that's simply not TRUE. It's _not_ easy to revert bad stuff. The only way to handle bad stuff is to make people _responsible_ for their own sh*t, and have them maintain it themselves.

And you refuse to do that, and then you complain when others do not want to maintain your code for you.

Rik replied, "OK, I'll setup something to automatically send you patches as long as they're not applied, don't get any reaction and still apply cleanly." But Linus said, "Did you read the part about "maintainership" at all? I ignore automatic emails, the same way I ignore spam. Automating patch-sending is _not_ maintainership." Rik replied, "Of course the patch will be updated when needed, but I still have a few 6-month old patches lying around that still work as expected and don't need any change." And Linus said:

Sure. Automatic re-mailing can be part of the maintainership, if the testing of the validity of the patch is also automated (ie add a automated note that says that it has been verified).

It's just that I actually _have_ had people who just put "mail torvalds < crap" in their cron routines. It quickly caused them to become part of my spam-filter, and thus _nothing_ ever showed up from them, whether automated or not..

Rik said he'd add intelligence to his scripts, to prevent patches from being resent when Linus was too busy or not interested, and to queue rejected patches for manual inspection. Richard Gooch suggested making this automation system public, so others could use it too. Linus liked the way the discussion was heading, and added, "We actually talked inside Transmeta about doing a lot of this automation centralized (and OSDL took up some of that idea), but yes, from a resource usage sanity standpoint this is something that _trivially_ can be done at the sending side, and thus scales out perfectly (while trying to do it at the receiving end requires some _mondo_ hardware that definitely doesn't scale, especially for the "compiles cleanly" part)." Troy Benjegerdes offered:

here's an idea:

Maintainers for a specific area of interest/kernel tree/whatever can run a 'canned' set of scripts on a web server which act as a controller for a patchbot, and a set of 'build machines' that actually do the compiles.

(i.e., davej, andrea, riel, etc would have their own webserver which acts as a central location for data collection, as well as a place for users to download stuff from)

Actually compiling gets done either by users that want to use that kernel, or in the case of a vendor, an internal build farm. The users have another 'canned' script that downloads the kernel, patches it, and builds it with a user-supplied or server-supplied config file. The script uploads the results of the build so maintainers can see what happened, and the web server provides some mechanism for users to say what did and did not work.

Once the webserver gets some data back, the patchbot can figure out whether a particular patch was a 'success' or not, and decide whether to send it, dequeue it, or whatever.

We should probably also add the ability for end-users to submit their own patches to a maintainer, or provide a way for end-users to setup the webserver system so they can do the same thing the maintainers are doing.

The most important part here is that this system has to be less work for maintainers than it is responding to hundreds of emails and checking if a patch made it in all the time. (I think this should be relatively easy). It's got to be easy to set up, both for maintainers and users.

I've got some reasonably nice python scripts that currently act as the 'build system' part of this, and some somewhat ugly scripts that run on a webserver. A brief description is available here.

http://altus.drgw.net/description.html

I'll volunteer these scripts as well as whatever amount of time I can spare from 'real' work ;)
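The decision step Troy describes, where the patchbot looks at the build results users have uploaded and chooses what to do with a patch, might look something like the following Python sketch. It is not taken from Troy's scripts: the function name decide, the three outcome strings, and the min_reports threshold are all assumptions made for illustration.

```python
def decide(reports, min_reports=3):
    """Choose what the patchbot does with one patch.

    reports is a list of booleans, one per uploaded build result,
    True when that user's build succeeded.
    """
    if len(reports) < min_reports:
        return "wait"       # not enough data back from users yet
    if all(reports):
        return "send"       # every reported build succeeded
    return "dequeue"        # at least one build failed
```

A real system would presumably weigh more than compile success (the web server is also meant to collect what "did and did not work"), but the shape of the decision is the same.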

Rik was thrilled by this, and set up a mailing list to discuss it. He said:

patchbot@nl.linux.org

You can subscribe by mailing to listar@nl.linux.org with "subscribe patchbot" in the message.

Once I've gotten around to it, http://patchbot.nl.linux.org/ should contain some content, too. (or once somebody else has gotten around to it)

Elsewhere, under the Subject: [PATCH] rlimit_nproc (http://www.uwsg.indiana.edu/hypermail/linux/kernel/0112.3/0519.html), Rik posted a patch, saying that although not yet automated, it "would be a typical candidate ... are you happy with the way the description and patch are combined ?" Linus replied:

Looks fine, except for the fact that nowhere did it say which kernel version the patch was generated against. Which is often a rather important clue ;)

Now if you automate this, I would suggest adding a section in between the explanation and the patch: the "diffstat" output of the patch. It doesn't matter much for this example, because obviously the patch is small enough that just scrolling down shows what's up, but..

I would also suggest that whatever activates the patch asks for a subject-line that is more than 12 characters long ;)

Also worthwhile for automation is an md5sum or similar (for verifying that the mail made it though the mail system unscathed). A pgp signature would be even better, of course - especially useful as I suspect it would be good to also cc the things to some patch-list, and having a clear identity on the sender is always a good idea in these things.
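Linus' checklist (state the kernel version, put a diffstat between the explanation and the patch, insist on a subject longer than 12 characters, and include a checksum) is easy to picture as code. The Python sketch below is one possible reading of it, not Rik's actual script: the function names, the "Against:" header, and the mail layout are all hypothetical, and the diffstat here is a simplified per-file count rather than the output of the real diffstat tool. PGP signing is left out.

```python
import hashlib


def diffstat(patch_text):
    """Count added and removed lines per file in a unified diff.

    Simplified: a hunk line whose own content begins with "-- " or
    "++ " would be mistaken for a file header.
    """
    stats = {}
    current = None
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            current = line[4:].split("\t")[0]
            stats[current] = [0, 0]
        elif line.startswith("--- "):
            continue
        elif current and line.startswith("+"):
            stats[current][0] += 1
        elif current and line.startswith("-"):
            stats[current][1] += 1
    return stats


def build_mail(subject, kernel_version, explanation, patch_text):
    if len(subject) <= 12:
        raise ValueError("subject line should be longer than 12 characters")
    stat_lines = [f" {name} | +{added} -{removed}"
                  for name, (added, removed) in diffstat(patch_text).items()]
    checksum = hashlib.md5(patch_text.encode("utf-8")).hexdigest()
    return "\n".join([
        f"Subject: [PATCH] {subject}",
        "",
        f"Against: {kernel_version}",
        "",
        explanation,
        "",
        *stat_lines,
        "",
        f"md5sum of patch: {checksum}",
        "",
        patch_text,
    ])
```

The receiver can recompute the md5 over the patch portion of the mail and compare it with the stated value to see whether the mail system mangled anything on the way.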

Someone suggested a mailing list just for patches, and Daniel Phillips said, "Exactly what I was thinking of: 'linux-patches@kernel.org'. The idea is, instead of putting [PATCH] on your subject line and cc'ing it to Linus, you mail it to linux-patches with a cc to lkml if you like (depending on size of patch, how interesting, etc). In any event, linux-patches will forward a copy to Linus."

3. Status Of ramfs

25 Dec 2001 - 27 Dec 2001 (9 posts) Archive Link: "2.5.2-pre2 forces ramfs on"

Topics: FS: ramfs, FS: rootfs, Modules

People: Linus Torvalds, Alan Cox, Keith Owens

Keith Owens asked why ramfs would always compile into the kernel, instead of being optional; Linus Torvalds replied, "Because it's small, and if it wasn't there, we'd have to have the small "rootfs" anyway (which basically duplicated ramfs functionality)." Alan Cox took the opportunity to ask, "Can ramfs=N longer term actually come back to be "use __init for the RAM fs functions". That would seem to address any space issues even the most embedded fanatic has." Linus replied:

Hmm.. That might work, but at the same time I suspect that the most fanatic embedded users are actually the ones that may benefit most from ramfs in the first place. That was certainly why it came to be..

We'll see. We'll end up using ramfs for the initial init bootup (ie the "tar.gz->ramfs" stage of bootup), so making it __init may not be practical for other reasons. We'd have to unload it not after the __init stage, but after the first root filesystem is unused (which may be later, depending on what people put in the filesystem).

4. Selecting Patches For 2.4

26 Dec 2001 - 27 Dec 2001 (4 posts) Archive Link: "2.4.17: Dell Laptop extra buttons patch (fwd)"

People: Alan Ford, Andreas Dilger, Alan Cox, Marcelo Tosatti, Andries Brouwer

Marcelo Tosatti posted a patch he'd received via private email from Alan Ford. Alan had explained, "It adds keycodes for the four shortcut buttons that are provided on Dell Inspiron laptops." Andreas Dilger replied, "I don't have a Dell laptop, but AFAIK in the past patches like this have been rejected by Andries Brouwer (I think) because it is possible to do this from user space with key mapping tools like setkeycodes, xkeycaps, or xmodmap." Marcelo replied that Alan Cox had also explained this to him.

5. Linus Responds To Some Criticism

27 Dec 2001 - 28 Dec 2001 (4 posts) Archive Link: "Re: your mail"

Topics: Development Philosophy, Development Strategy, Disks: IDE, Disks: SCSI, FS, Ioctls, Kernel Build System

People: Andre Hedrick, Linus Torvalds

In response to an earlier thread in which Linus Torvalds had said he wouldn't take kbuild into the kernel until the block IO layer had gotten more into shape, Andre Hedrick now said, "please pass your crack pipe arounds so the rest of us idiots can see your vision or lack of ..." Linus replied:

Heh. I think I must have passed it on to you long ago, and you never gave it back, you sneaky bastard ;)

The vision, btw, is to get the request layer in good enough shape that we can dispense with the mid-layer approaches of SCSI/IDE, and block devices turn into _just_ device drivers.

For example, ide-scsi is heading for that big scrap-yard in the sky: it's not the SCSI layer that handles special ioctl requests any more, because the upper layers are going to be flexible enough that you can just pass the requests down the regular pipe.

(Right now you can see this in block_ioctl.c - while only a few of the ioctl's have been converted, you get the idea. I'm actually surprised that nobody seems to have commented on that part).

The final end result of this (I sincerely hope) is that we can get rid of some of the couplings that we've had in the block layer. ide-scsi is just the most obvious strange coupling - things like "sg.c" in general are rather horrible. There's very little _SCSI_ in sg.c - it's really about sending commands down to the block devices.

The reason I want to get rid of the couplings is that they end up being big anchors holding down development: you can create a clean driver that isn't dependent on the SCSI layer overheads (and people do, for things like DAC etc), but when you do that you lose _all_ of the support infrastructure, not just the bloat. Which is sad.

(And which is why things like ide-scsi exist - IDE didn't really want to be a SCSI driver, but people _did_ want to be able to use some of the generic support routines that the SCSI layer offers. You couldn't just cherry-pick the parts you wanted).

The other part of the bio rewrite has been to get rid of another coupling: the coupling between "struct buffer_head" (which is used for a limited kind of memory management by a number of filesystems) and the act of actually just doing IO.

I used to think that we could just relegate "struct buffer_head" to _be_ the IO entity, but it turns out to be much easier to just split off the IO part, which is why you now have a separate "bio" structure for the block IO part, and the buffer_head stuff uses that to get the work done.

Andre, I know that you're worried about the low-level drivers, but:

And note that the "Jens" and "communication" part is important. If you have patches, please talk to Jens, tell him what the issues are, and I know I can communicate with him.

Andre did not reply.

6. SiS7012 Audio Driver

27 Dec 2001 (2 posts) Archive Link: "[patch] SiS7012 audio driver"

Topics: Modules, Sound: OSS, Sound: SiS7012, Sound: i810

People: Thomas Gschwind, Alan Cox, Doug Ledford

Thomas Gschwind posted a patch and announced:

I have added support for the SiS7012 audio driver to the i810 audio driver. Playback works perfectly fine for me on an ECS K7S5A mainboard. In some situations, recording causes my system to crash but I am still working on it.

BTW, does anybody have a datasheet? Currently, this driver is based on the fact that OSS uses the same driver for the i810 and SiS7012 chipsets and some experimentations.

Alan Cox replied that he'd asked for datasheets but had not received any. He added, "Doug Ledford <dledford@redhat.com> is working on this driver and has much updated the i810 support and fixed bugs. Send him a copy. Also btw the nvidia chipset also seems to use an i810 clone."

7. UML Poised To Go Into 2.5

27 Dec 2001 - 28 Dec 2001 (5 posts) Archive Link: "UML has been sent to Linus"

Topics: History, User-Mode Linux

People: Daniel Phillips, Jeff Dike, Linus Torvalds

Jeff Dike said he'd sent a patch for user-mode Linux to Linus Torvalds, and gave a link to it (http://prdownloads.sourceforge.net/user-mode-linux/uml-patch-2.5.1-1.bz2). Daniel Phillips said:

This is good news. I want to add something here that's a little less lame than 'me too'...

Besides being an essential development tool I use every day, I believe there is great potential for UML as a 'perfect jail'. There are interesting applications we'll start to see when UML is more widely available, such as simulation of clusters, or 'Linux Bubbles' under Windows.

I think you've done a great job maintaining UML out-of-tree for more than a year, with very little assistance, and I hope you won't have to shoulder that extra burden much longer.

Jeff agreed that the possibilities were vast, and thanked Daniel for the praise. He added, "UML is approaching three years old (I started hacking in Feb 1998; the first public sign of it was the following June)." He finished with, "I'm currently banging on bugs and residual missing functionality. When I think that's all done, that will be what I call UML V1.0 and I will send it to Marcelo. At that point, the out-of-tree phase of UML will be over."

8. Alan Continues 2.2 Maintenance

29 Dec 2001 (2 posts) Archive Link: "Linux 2.2.21pre1"

Topics: Disks: IDE, FS: NFS, FS: procfs, USB, Virtual Memory

People: Alan Cox, Matthias Andree, Manfred Spraul, Trond Myklebust, Andrea Arcangeli, Tom Rini, Ralf Baechle, Gerard Roudier

Alan Cox put out a new 2.2 pre-release, 2.2.21pre1, and gave the changelog:

Matthias Andree asked, "Hum, what's the status of the IDE stuff? Last time I looked, the Hedrick IDE patches were for 2.2.19, and I find those rather necessary to use e. g. PDC20265 or UDMA on VIA chip sets. I could understand if there were not being merged into 2.2.x because it's stable and not playground release, but who is maintaining them now? Or did I miss anything about 2.2.20?" There was no reply.

9. New 2.4 Fork

31 Dec 2001 - 2 Jan 2002 (8 posts) Archive Link: "New tree started ;)"

Topics: Disks: IDE, FS: NFS, Real-Time, SMP, Scheduler, Software Suspend, Version Control

People: Michael Cohen, Roberto Nibali, H. Peter Anvin, Trond Myklebust, Craig I. Hagan, Andre Hedrick, Rik van Riel, Robert Love

Michael Cohen announced:

After hanging out for a while on openprojects.net, I've decided to create a new 2.4 tree. I feel that there's need for a rapidly developing "-ac alike" tree, and so, here we go. Feel free to test it. I've attached patch-2.4.17-mjc1.bz2. New versions can be found at http://iamnotanimatedtoexplode.com/patches/mjc. Currently the patch includes:

Reverse Mapping patch #9 (Rik van Riel)
Preemptible Kernel Patch (Robert Love)
Lock-Break Patch (Robert Love)
CPU affinity /proc entry (Robert Love)
Netdev-random (Robert Love)
Software Suspend (Gabor Kuti?)
Real Time Scheduler for Linux (?)
IDE updates (Taskfile IO and others) (Andre Hedrick)

Ideally I'd like to have this maintained (possibly using bk) by those at #kernelnewbies.

Linus once said something about having more trees being a good thing. I'll try to keep this as close to the 2.4.x line as possible, though. :)

Craig I. Hagan suggested checking out Trond Myklebust's patches at http://www.fys.uio.no/~trondmy/src/2.4.17/linux-2.4.17-NFS_ALL.dif. Michael was excited, and rushed off to take a look at them. Roberto Nibali also said, "Thanks for doing this. Could you tell me which criteria you have to choose patches to go in? I see grsecurity in the to be merged queue which is not likely to ever go into 2.4.x. It's more likely to be converted to the LSM framework (parts of it if still needed) and then integrated into 2.5.x."

Elsewhere, H. Peter Anvin offered to host the new tree on kernel.org, and Michael gratefully accepted. H. Peter said, "Please send a GPG key, desired user name and a brief description of what you plan to do to ftpadmin@kernel.org. It might take a week or so to get done."

10. Comparing 2.4 With 2.2

2 Jan 2002 (7 posts) Archive Link: "Linux 2.4.17 vs 2.2.19 vs rml new VM"

Topics: Big Memory Support, Microkernels, Virtual Memory

People: Brian Litzinger, Alan Cox, Rik van Riel, Andrew Morton

Brian Litzinger gave his report on 2.4.17:

I'd like to say that as of 2.4.17 w/preempt patch, the linux kernel seems again to perform as well as 2.2.19 for interactive use and reliability, at least in my use.

2.4.17 still croaks running some of the giant memory applications that I run successfully on 2.2.19. (Machines with 2GB of RAM running 3GB+ apps.)

I tried rmap-10 new VM and under my typical load my desktop machine froze repeatedly. Seemed the memory pool was going down the drain before the freeze. Meaning apps were failing and getting stuck in various odd states.

No doubt, preempt and rmap-10 are incompatible, but I'm not going to give up the preempt patch any time soon.

All in all 2.4.17 w/preempt is very satisfactory.

Alan Cox diagnosed:

I suspect its rmap-10 not the pre-empt patch. If you have the time/inclination then testing just that load with rmap10a (the fixed rmap10) would be interesting just to know which bit is the buggy half.

Similarly the low latency patch which on the whole seems to give better results than the preempt patches is much less likely to cause problems as it doesn't really change the system semantics in the same kind of way.

Rik van Riel went into more depth, with:

There's a stupid logic inversion bug in rmap-10, which is fixed in rmap-10a. Andrew Morton tracked it down about an hour after I released rmap-10.

Basically in wakeup_kswapd() user processes go to sleep if the pressure on the VM is _really_ high *and* the user process has all the same GFP options set as kswapd itself, so the process can sleep on kswapd.
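(Editor's illustration, not part of Rik's post and not the rmap code: a minimal C sketch of the two-part condition he describes. The flag bits, the mask, and the function names are hypothetical stand-ins rather than the kernel's real GFP values or the real wakeup_kswapd() logic.)

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration; NOT the kernel's real GFP values. */
#define SKETCH_WAIT 0x1u
#define SKETCH_IO   0x2u
#define SKETCH_FS   0x4u

/* Assumed stand-in for "the GFP options kswapd itself has set". */
#define SKETCH_KSWAPD_MASK (SKETCH_WAIT | SKETCH_IO | SKETCH_FS)

/*
 * Rule as described in the quote: a user process may sleep on kswapd only
 * when VM pressure is really high AND its allocation mask carries every
 * option that kswapd itself has.
 */
static bool may_sleep_on_kswapd(bool pressure_really_high, unsigned int gfp_mask)
{
    bool has_all_kswapd_options =
        (gfp_mask & SKETCH_KSWAPD_MASK) == SKETCH_KSWAPD_MASK;

    return pressure_really_high && has_all_kswapd_options;
}

/* Small self-check that doubles as a usage example. */
static void sketch_self_check(void)
{
    assert(may_sleep_on_kswapd(true, SKETCH_KSWAPD_MASK));   /* both tests hold */
    assert(!may_sleep_on_kswapd(true, SKETCH_WAIT));         /* options missing */
    assert(!may_sleep_on_kswapd(false, SKETCH_KSWAPD_MASK)); /* pressure not high */
}
```

Negating either test, or writing || where && is meant, changes which processes go to sleep. The post does not say which form the rmap-10 inversion took, so the sketch shows only the intended rule.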

He said he'd put out rmap-11 soon.

Elsewhere, Alan remarked, "The measurements I've seen put lowlatency ahead of pre-empt in quality of results. Since low latency fixes some of the locked latencies it might be interesting for someone with time to benchmark."

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.