Table Of Contents
1. 25 Jun 2000 - 29 Jun 2000 (6 posts) New Plans For the Virtual Memory Subsystem
2. 6 Jul 2000 - 15 Jul 2000 (22 posts) Status Of NTFS Support
3. 6 Jul 2000 - 12 Jul 2000 (40 posts) Per-User Resource Limits Planned For 2.5
4. 10 Jul 2000 - 12 Jul 2000 (6 posts) Status Of Intel 82802 RNG Support
5. 11 Jul 2000 - 13 Jul 2000 (6 posts) Some Discussion Of The SMP Booting Code
6. 12 Jul 2000 - 13 Jul 2000 (10 posts) Smaller Timeslices And New Algorithm For 2.4
7. 13 Jul 2000 - 16 Jul 2000 (82 posts) Improving The Kernel Release Schedule
8. 13 Jul 2000 - 14 Jul 2000 (13 posts) One Way To Hunt For Bugs
9. 14 Jul 2000 - 17 Jul 2000 (8 posts) 'gcc-2.91.66' Recommended For Kernel Compilation
10. 15 Jul 2000 (2 posts) Band-Aids On Virtual Memory While New Design Coalesces
11. 16 Jul 2000 - 17 Jul 2000 (4 posts) Joe Pranevich's Latest Summary Of Kernel Changes For 2.4
Thanks go to Richard Dawe, who reported a broken link in the Cousin News (../news.html) page. Thanks!
Thanks also go to all the people who wrote to me about important threads on 'linux-kernel'. Special thanks go to Dylan Griffiths, who really went above and beyond to get me some cool information. Thanks also go to Arjan van de Ven and the other folks on #kernelnewbies for pointing me to the VM threads on 'linux-mm'.
For folks wanting to let me know about important threads, the most important thing to include is the 'Subject:' line! If you want to add your take on the thread as well, that would be excellent, and more than I'd hope for.
Mailing List Stats For This Week
We looked at 1229 posts in 5339K.
There were 460 different contributors. 192 posted more than once. 132 posted last week too.
New Plans For the Virtual Memory Subsystem
25 Jun 2000 - 29 Jun 2000 (6 posts) Archive Link: "2.4 / 2.5 VM plans"
Topics: Big Memory Support, Clustering, FS: XFS, Virtual Memory
People: Rik van Riel, Stephen C. Tweedie, Juan J. Quintela
In the 'linux-mm' mailing list, Rik van Riel proposed:
since I've heard some rumours of you folks having come up with nice VM ideas at USENIX and since I've been working on various VM things (and experimental 2.5 things) for the last months, maybe it's a good idea to see which of your ideas have already been put into code and to see which ideas fit together or are mutually exclusive. :)
To start the discussion, here's my flameba^Wlist of ideas:
- re-introduce page aging, my small and simple experiments seem to indicate that page aging takes *less* cpu time than copying pages to/from highmem all the time (let alone making your applications wait for disk because we replaced the wrong page last time)
- fix the latency problems of applications calling shrink_mmap and flushing infinite amounts of pages (mostly fixed)
- separate page replacement (page aging) and page flushing, currently we'll happily free a referenced clean page just because the unreferenced pages haven't been flushed to disk yet ... this is very bad since the unreferenced pages often turn out to be things like executable code
we could achieve this by augmenting the current MM subsystem with an inactive and scavenge list, in the process splitting shrink_mmap() into three more readable functions ... I have this mostly done
- fix balance_dirty() to include inactive pages and have kflushd help kswapd by proactively flushing some of the inactive pages _before_ we run into trouble
- implement some form of write throttling for VMAs so it'll be impossible for big mmap()s, etc, to completely fill memory with dirty pages
Stephen C. Tweedie replied to each of these items, remarking, "Right. :-) The following includes a lot of the stuff that Ben and I bashed out at Usenix. I don't count this as new feature stuff --- most of what follows is just identifying places where the current VM is plain broken!"
First, he agreed with the idea of reintroducing page aging. To item 2, Stephen felt it should be possible to get really aggressive about which cached pages to flush. He figured the cache didn't change all that much between times it was checked, so it shouldn't be necessary to walk all the lists of least-recently-used pages each time. To item 3, he was vehemently in agreement with separate page replacement and page flushing.
However, to item 4, he disagreed. He felt Rik was trying to yoke two different animals, and it wouldn't work.
To item 5 he was back in agreement, though he added that this was orthogonal to the other problems, and he felt that some of the solutions already discussed would also go a long way toward minimizing these problems as well. Finally, he introduced:
Other things to consider:
- The page aging loops need to have early break-out when the number of free pages suddenly increases (exit, munmap, whatever);
- The page stealer shouldn't block just because kswapd is blocked on synchronous swapping (this comes for free if we have separate page flushing)
- shrink_dentry should probably skip inodes which have still got pages attached, as otherwise we get a lot of unnecessary cache flushes
- We MUST quantify the current VM pressure as a way of controlling page aging. That way aging can be proactive under load, but we don't necessarily have to evict pages from memory too early (we can age pages without flushing them).
- RSS accounting needs to be audited. Right now, the per-mm rss isn't an atomic type, and it doesn't seem to be consistently protected by the page table locks.
A few other ideas Ben and I threw about are much more long-term.
- We think it should be possible to share page tables for large shared mmaps (think of libc and big sysv shm segments).
- We can do reverse pte maps pretty cheaply by the following:
- Reverse maps for shared mmaps are easy enough by following the per-inode vma list
- The pte for unshared anon pages can be encoded in the page struct easily.
- Shared anon pages are the tricky ones; but it's simple to maintain a hash list of all such ptes, and there aren't many in a typical system. Fork() is, of course, the one place where lots of these occur, but we can minimise the number of shared anon pages over fork by implementing COW on page tables (that way, we share the page tables but NOT the pages!)
- Think about having a list of all page tables in memory. With that, we can do aging in the VM without *EVER* having to walk through vmas at all: we can walk through the ptes in the system performing atomic bitops on the ptes and age counts without caring about the higher level layers until a given page's age reaches zero. Only at that point do we care about invoking the swapper for that page's vma.
Food for thought. 3) in particular seems to open up a whole new set of possibilities, but it's definitely something for an experimental post-2.4 branch. :-)
There was no reply, but Juan J. Quintela replied to Rik's initial list. At the end of the list, he added a sixth item for 2.4, "Integrate the shm code in the page cache, to avoid having Yet another Cache to balance." He also listed a new set of items for 2.5:
- Make a ->flush method in the address_space operations, Rik mentioned it in some previous mail, it should return the number of pages that it has flushed. That would make shrink_mmap code (or its successor) more readable, as we don't have to add new code each time that we add a new type of page to the page cache.
- This one is related with the FS, not MM specific, but FS people want to be able to allocate MultiPage buffers (see pagebuf from XFS) and people want similar functionality for other things. Perhaps we need to find some solution/how to do that in a clean way. For instance, if the FS told us that it wants a buffer of 4 pages, it is quite obvious how to do write clustering for a page in that buffer, we can use that information.
- We need also to implement write clustering for fs/page cache/swap. Just now we have _no_ limit on the amount of IO that we start, which means that if we have all the memory full of dirty pages, we can have a _big_ stall while we wait for all the pages to be written to disk, and yes, that happens with the current code.
Status Of NTFS Support
6 Jul 2000 - 15 Jul 2000 (22 posts) Archive Link: "Want to help with NTFS"
Topics: FS: NTFS, Microsoft
People: Jeff V. Merkey, Steve Dodd, Anton Altaparmakov, Alan Cox
Timothy D. Webster volunteered to help any existing NTFS effort, and Jeff V. Merkey replied, "There are several folks working on it. Anton (I cannot spell his last name)" [it is Altaparmakov --Ed] "and Steve Dodd are currently doing most of the work. When NWFS is completed and finally checked in, I was planning to clean it up and correct the on-disk structures and put in the real NTFS journal. The person to ask would be Alan Cox."
Steve Dodd summed up his status with the project, saying, "I'll have to confess here: I haven't managed to do much at all in over a year :-( The last thing I was looking at in the current driver was the directory handling, and after a while all the NTFS_PUTU32(foo+0x<magic number>, ...) stuff really fried my brain. I did start some new code a while back which actually used structs to represent the on-disk structures ;-) and was supposed to play nicely with the page cache. But I never got very far with it."
Anton Altaparmakov also said he was too busy to do much coding on it at the moment, but invited Timothy to hack on the numerous 'fixme's in the source. To Jeff's plans to clean up the code himself, Anton replied, "That would be great. More so since you seem to be the only person who actually has the ms ntfs specs and hence the only person who actually knows what the on-disk structures are... - The information I have at least is gathered from various books and sources and is not quite complete or may very well be wrong in places." Timothy rose to the challenge and asked Anton for some pointers to any NTFS docs that might be available, and Anton listed:
The following books contain some information on NTFS but this is not their primary focus:
Windows NT/2000 Native API Reference
by Gary Nebbett
Macmillan Technical Publishing USA
Inside Windows NT Second Edition
by David A. Solomon
The macmillan book is quite a handy reference if you are doing NT/2000 programming. The microsoft book is not too deeply technically involved but it is a good overview of how NTFS works.
Per-User Resource Limits Planned For 2.5
6 Jul 2000 - 12 Jul 2000 (40 posts) Archive Link: "Kernel 2.2.14 OOM killer strikes."
Topics: OOM Killer, Virtual Memory
People: Mike A. Harris, Claudio Martins, Marcelo Tosatti, Olaf Titz, Warren Young, Derek Martin, Rik van Riel
Mike A. Harris reported his first Out Of Memory event in a long, long time. He'd used Midnight Commander to view a really big file in hexadecimal format, and 'mc' had apparently decided to load the entire 143M file into RAM. This triggered Linux's 'OOM Killer' algorithm, which attempts to intelligently guess which processes to kill under those circumstances. In this case, it killed netscape and the window manager, leaving 'mc' still running. This trashed his work and forced a reboot. He stated:
The kernel OOM killing issue is never going to be solved properly because it would have to be sentient to do so. So no amount of argument/debate will yield the omnipotent killing algorithm, and as such it will always suffer from doing the wrong thing, and pissing people off.
Therefore, since some people WANT OOM killing to be done, and others such as myself do NOT want it to be done, could someone in the know of doing so, please make it a compile time or run time tunable option? I'd like to tell my kernel "If an OOM condition occurs, under absolutely *NO* circumstances are you to EVER kill a running process".
I am *NOT* asking for this to be the default option, nor am I asking that everyone else use it. I *FULLY* understand the need to have a system like we have right now FOR SOME SYSTEMS, however my system does not need it, and I suspect many other desktop systems do not either. I'd rather have the application that is hogging memory DIE than everything on my system by some "smart" algorithm.
Claudio Martins posted an exploit:
It's trivial for any user to bring the system to an unusable state:
...and boom! In 15 seconds there's no more inetd, logins, http, etc :(
But Marcelo Tosatti explained:
Per-user resource limits will avoid monkeys from doing that stuff.
Unfortunately only 2.6 kernel will have this feature, but distribution vendors will probably use a backported version for 2.4.
Linux per-user resource limits are called "user beancounters", and you can find a development version at http://www.asplinux.com.sg/install/ubpatch.html. Currently there is only kernel-level code.
A userlevel PAM module is needed to make it usable for real systems.
SGI's CSA (http://oss.sgi.com/projects/csa, no code available yet) is a similar, but more complete per-user resource accounting scheme which will be ported to Linux in the future.
Elsewhere, Marcelo suggested Mike try other 'OOM Killer' algorithms, and suggested searching for "oom" on Rik van Riel's kernel patches (http://www.surriel.com/patches/) page.
Elsewhere, Olaf Titz suggested:
I wonder what happened to the AIX approach: define a new signal (they called it SIGDANGER AFAIK), ignored by default, and send this signal to _all_ processes a few seconds before starting to SIGKILL processes.
This moves the policy completely to user space (i.e. The Right Thing). You can either have a daemon listening to this signal and deciding by configuration what and how to kill, or you can implement a handler for graceful exit in suitable applications, or both. Make the "few seconds" tunable by sysctl and all is perfect. You can even implement the following: when a process explicitly ignores the special signal (and has an appropriate capability?), cause it to never be killed. That would be for X servers and session managers.
Claudio Martins objected that this would cause the programs of one user to be killed when another user used up all RAM. He felt Olaf's solution merely disguised the problem, and reiterated the need for per-user resource accounting.
Derek Martin suggested just killing the process that caused the OOM. This turned out to be more than a simple proposition, and Warren Young explained:
The first time your program asks for a little bit of memory, malloc() goes and asks the kernel for a bunch of memory: tens of K at least, maybe more. This is because it wants to minimize the number of memory blocks, the number of syscalls, and the amount of memory fragmentation.
Now when malloc() asks for that memory, the kernel doesn't mark it as in-use, and it doesn't assign it a place in real storage. Instead, it maps it into the VM tables as unwritable memory. When someone writes to the memory (first storage into the new memory), the kernel traps; it realizes that the memory is now _really_ in use, so it marks it as such. This optimization lets my piggy program ask for a megabyte of memory, only use a few bytes of it, and not waste a bunch of real RAM. (I do waste some VM, but that doesn't matter unless your system has real memory for all or nearly all of the addressable memory: 4 GB on ia32.)
Now imagine a running system: each program has a bunch of memory overcommitted. Right before the OOM event, RAM and swap are full with in-use pages. Now one program tries to access one of its allocated but unused pages. The kernel traps, tries to map the page into memory, and fails.
Now what process should the kernel kill? The naive answer is the one that caused the trap. But what if you've got a process with a memory leak that happens to eat all the memory, then just before it grabs the last bit of memory, the kernel schedules another process which touches one of its overcommitted pages. What if that unfortunate process is, say, syslogd or something critical like that?
System fall down go boom.
The right solution is resource accounting, which is coming to a kernel near you sometime next year.
Status Of Intel 82802 RNG Support
10 Jul 2000 - 12 Jul 2000 (6 posts) Archive Link: "Intel 82802 RNG"
Topics: Random Number Generation
People: Robert M. Love, Pavel Machek, Jeff Garzik
Robert M. Love grepped through the sources for support of the Intel 82802 Random Number Generator. He couldn't find a 'CONFIG_INTEL_RNG' config option, and volunteered to write and maintain the driver himself. A couple of people pointed out that Jeff Garzik had already written this driver (Robert took a look and remarked, "looking over the source, its everything i dreamed of." ); but Pavel Machek added that Jeff apparently didn't own the relevant hardware, and might be willing to pass maintainership over to Robert.
Some Discussion Of The SMP Booting Code
11 Jul 2000 - 13 Jul 2000 (6 posts) Archive Link: "why is trampoline.S code copied for each cpu?"
People: James Bottomley, Philipp Rumpf, Tigran Aivazian
Tigran Aivazian had some trouble understanding the 'trampoline.S' (http://lxr.linux.no/source/arch/i386/kernel/trampoline.S?v=2.4.0-test2) code. In particular, 'arch/i386/kernel/smpboot.c:do_boot_cpu()' used 'setup_trampoline()' to copy the 'trampoline.S' code once for each CPU. It seemed to him that all CPUs could be pointed to the same copy, which would save space, instead of making identical copies for each CPU. After examining the file, he found that the reason for this (which he could see but not understand) was that each instance wrote a magic number "a5a5a5a5" into its code. He asked if anyone could explain why this was.
James Bottomley replied that actually all CPUs did use the same memory locations, which were just refreshed each time. So there was no wasted RAM. Regarding the magic number, James explained:
If you look lower down in do_boot_cpu(), this is used as a diagnosis of a boot failure (the check at phys_to_virt(8192)). If we don't find the signature, we know the CPU failed to start, if we do find it, we know it started but got stuck somewhere after the switch to protected mode.
In the latter case, the CPU could happily have trashed the trampoline code before disappearing off into hyperspace, so it makes sense to set it up anew each time.
Philipp Rumpf objected to this behavior, saying that if this ever happened in practice, it would mean that a random CPU was executing random code. But James clarified, "It doesn't happen in the normal course of events. For those of us who play with SMP HAL's, having this type of information can be invaluable. Also, getting the machine to boot so you can try to find out what happened to the errant CPU is useful. Usually it is just stuck somewhere like the message implies." At this point Tigran felt his question had been answered, and thanked James and Philipp. EOT.
Smaller Timeslices And New Algorithm For 2.4
12 Jul 2000 - 13 Jul 2000 (10 posts) Archive Link: "Report: Big Improvement in -test3"
People: Linus Torvalds, Andrew Morton, Richard Gooch
Joel Sloan noticed much smoother and faster interactive performance under load in 2.4.0test3. Linus Torvalds replied:
This was actually almost certainly due to a _really_ simple improvement.
As of test4-pre4, the default time-slice for a normal process is just 50ms, while it used to be 200ms.
200ms is way too long a timeslice when working with interactive things: it's easily noticeable. 50ms should be much better.
Linus seems to have been referring to a kernel that came out after the one Joel had praised, so this may not be the whole story. That didn't come out in the thread though...
Richard Gooch replied to Linus, asking if the change was possible because the number of ticks was no longer derived from the priority level. Linus confirmed, and explained:
If people wondered why the "->priority" -> "->nice" change was done, now you know. "->priority" used to be a tick-based nice level, and it just wasn't able to handle UNIX semantics when the resolution of ticks dropped to just a few ticks.
Simple vulcan mind-trick. Switch them around, and instead of calculating "nice" from the number of ticks, we calculate ticks from the virtual nice value, making the problem go away and allowing for a shorter timeslice without having to do major surgery.
Richard Gooch asked how many ticks e.g. 'nice 10' and 'nice 11' got, and Andrew Morton listed:
if (HZ < 200):
nice -20: 11 ticks
nice -19: 10 ticks
nice -18: 10 ticks
nice -17: 10 ticks
nice -16: 10 ticks
nice -15: 9 ticks
nice -14: 9 ticks
nice -13: 9 ticks
nice -12: 9 ticks
nice -11: 8 ticks
nice -10: 8 ticks
nice -9: 8 ticks
nice -8: 8 ticks
nice -7: 7 ticks
nice -6: 7 ticks
nice -5: 7 ticks
nice -4: 7 ticks
nice -3: 6 ticks
nice -2: 6 ticks
nice -1: 6 ticks
nice 0: 6 ticks
nice 1: 5 ticks
nice 2: 5 ticks
nice 3: 5 ticks
nice 4: 5 ticks
nice 5: 4 ticks
nice 6: 4 ticks
nice 7: 4 ticks
nice 8: 4 ticks
nice 9: 3 ticks
nice 10: 3 ticks
nice 11: 3 ticks
nice 12: 3 ticks
nice 13: 2 ticks
nice 14: 2 ticks
nice 15: 2 ticks
nice 16: 2 ticks
nice 17: 1 ticks
nice 18: 1 ticks
nice 19: 1 ticks
Linus also replied to Richard:
Same number of ticks. The nice 10 one gets scheduled more eagerly, though (ie the "nice" level does more than just determine the number of ticks: it is also used to determine relative priorities if two processes have the same number of ticks to run).
In 2.5.x we'll probably make the timer run at a higher rate, making this issue go away, but for 2.4.x this was the expedient way to maintain UNIX semantics and get good interactive behaviour.
Improving The Kernel Release Schedule
13 Jul 2000 - 16 Jul 2000 (82 posts) Archive Link: "[Announce] BKL shifting into drivers and filesystems - beware"
Topics: BSD, FS: ReiserFS, Microkernels, Virtual Memory
People: Alan Cox, Richard Gooch, Theodore Y. Ts'o, Linus Torvalds, Andrea Arcangeli, Hans Reiser, Alexander Viro, Rik van Riel
This one got started when Alexander Viro announced serious changes to the Virtual Filesystem that would require code changes to third-party drivers. There was a lot of resistance to this, and Alan Cox said, "This continual VFS redesign creep is going too far now. We'll never get a 2.4 if we keep moving all the locking around." Elsewhere, he added: "This kind of lock shifting is a major upheaval. It invalidates any device driver testing done in the past weeks when we have been slowly moving towards more stability. I'm just hoping Linus refuses the changes"
Hans Reiser also pointed out that it was precisely things like Alexander's VFS redesign, arriving so late in the game, that made him wonder why ReiserFS would have to wait until 2.5. Elsewhere, there was also some discussion of the Virtual Memory situation, which Rik van Riel had still not stabilized (Rik and Andrea Arcangeli also had another small flame skirmish in their ongoing war over classzone).
At one point Linus addressed the VFS changes, saying he was not totally opposed to the redesign, but he wanted to avoid breaking third-party drivers. He approved Alexander's patch, but Richard Gooch objected, "But at some point you need to cut a new kernel version and live with it," and added that the changes would "cost us time and stability. Where do you draw the line?" Theodore Y. Ts'o replied:
I have to agree with Richard here. If we do this, it'll set back 2.4 by at least a month (and I may be conservative here). There will *always* be more places where we can do more/better fine-grained locking. Where indeed do we draw the line, and do we really want to be doing this while we're at 2.4.0-test*?!?
If we were still accepting stuff like this, we shouldn't have moved the version number to 2.3.99 or 2.4.0testn.
Linus explained that he didn't want to require driver writers to sprinkle '#ifdef's in their code, testing for kernel versions. He explained:
We'll end up living with 2.4.x for two years or more, judging by past performance, and we'll have especially driver writers that concentrate on the 2.4.x stuff for a long time.
Having driver interface inconsistencies like that is nasty, and the ones we _know_ that we'll have should be minimized.
This is another reason why the open/read/write/close series is particularly important to get done first: it's the stuff every driver tends to do. Things like "fasync" are less critical because even if it is used a lot by drivers, it tends to use the helper routines (so drivers often do not need to worry over-much about locking).
This should have been the last locking issue, though. We scale too well for words.
Richard suggested shortening the development cycle to 6 or 9 months, and Linus replied:
This was what I wanted for the 2.2 -> 2.4 change too, and a much shorter development cycle would be wonderful. It obviously was not to be: 2.4 is already about 18 months out. We're not as bad as the 2.0 -> 2.2 cycle was, but it's painful.
It seems to be a lot harder than it should be to keep short development releases. I certainly haven't found the magic combination yet. 2.4.x is definitely all I hoped for (the big goal for me was to make sure that the mindcraft-like web-performance scalability problems would be gone, and we definitely fixed that), but it took much too long.
What would help would be more modular releases: smaller pieces of the kernel getting released independently, allowing for smaller and shorter release cycles. In Linux at least we don't have the "whole world" release issue that most other OS people have (including the free ones: I think the BSD "world" approach is horrible partly because of the release issues), but even just the kernel is so big that it would be nice to be able to see it as multiple independent projects.
At the same time it's obviously not true that there are independent projects, and especially drivers (which _sound_ independent) are very likely to be impacted quite a lot by infrastructure changes. There are no really clear lines to split development up by, and the cures are worse than the disease ("microkernels", I hear somebody shout. But that approach would just make it impossible to do the kinds of improvements we _have_ done and probably will continue to do).
Later, he went on:
a central kernel repository is so nice: most of the time when something changes, we can fix everything in one go, and people don't have to be all that aware of the changes. It's not always true: some of the VFS changes (namely the page cache write-through etc) were _so_ intrusive that it was hard to make the fix-ups available, and as a result a number of filesystems were left in a broken state.
And quite often the "grep for places to change" approach misses a few (this, btw, is why I've grown to love the new syntax for structure initializers: it's a h*ll of a lot easier to do a "grep 'release:' *.c" than it is to try to figure out where the different "release" entries are initialized).
But the fact that we need a big kernel repository right now does not necessarily mean that we'll need one forever. With good enough interfaces that people can truly feel happy about, it would be possible to split stuff up one day. That is, after all, how the system call interfaces work, and is what allows us to split the kernel from everything else.
Of course, usually what is "good enough" one day ends up being "really bad" the next when some clever bastard came up with the really good way of doing something..
Elsewhere, in the same vein he added:
In another five or ten years, we may be at the point where the fundamental interfaces _really_ don't change, and that we've handled all the scalability issues and that we have no need to add new interfaces. At THAT point we can just say "ok, drivers are truly independent".
It's not true today.
Basically, the thing that allows 2.4.x to scale as well as it does (and it does really well: look at the current SpecWeb99 world record numbers, and compare it to the also-ran second place), is exactly because we had all the source in one place, and we _could_ make fundamental changes. Claiming anything else is silly - if we had broken-up device drivers, we'd have been up shit creek without a paddle. End of story.
This is the thing that people don't understand. In theory it is wonderful to have modularization. It's the best thing on earth. But if that modularization means that you can't fix the module interfaces, then you're going to remain broken for all time.
This is why I rather fix module interfaces early and often. Make sure that we (a) have good interfaces that matches what the different parts of the kernel want to have and (b) make people used to the fact that a driver or a filesystem is not a static thing, and keep them aware of the fact that it depends on the kernel underneath it.
We're certainly getting closer to a good interface in many areas. The current VFS interfaces, for example, are pretty good - although many of the less important ones still depend on the kernel lock etc. But we're _not_ at the stage yet where we could just say "ok, a driver is a driver, and we don't need to worry about it".
Later, he added, "Don't get me wrong. To some degree I would _love_ to not have a large kernel archive. It's big, and it makes releases harder. No question about that. But the monolithic approach definitely has a lot of advantages."
One Way To Hunt For Bugs
13 Jul 2000 - 14 Jul 2000 (13 posts) Archive Link: "Result of compiling with `-W'"
People: Andrew Morton
Andrew Morton posted a patch and reported, "Building the kernel with `gcc -W' generates about nine megs of warnings. It also catches a _lot_ of bugs, some quite serious. The attached patch fixes about fifty of them. Ten or so others have been sent to maintainers." Some folks criticized various aspects of the patch, and Andrew posted a corrected one.
'gcc-2.91.66' Recommended For Kernel Compilation
14 Jul 2000 - 17 Jul 2000 (8 posts) Archive Link: "gcc-188.8.131.52 warnings [PATCH]"
People: Richard Gooch, Randy Dunlap, Alan Cox, Linus Torvalds
In the course of discussion, Linus Torvalds mentioned that he used 'gcc-2.91.66', and that 'gcc-2.7.2' was "compiler non-grata". Richard Gooch replied, "Aha! Now I know what compiler you're using. Is there general agreement in the cabal that gcc-2.91.66 is Good[tm]?" Alan Cox and Randy Dunlap confirmed this, and the thread ended.
Band-Aids On Virtual Memory While New Design Coalesces
15 Jul 2000 (2 posts) Archive Link: "[patch] 2.4.0-test4 filemap.c"
Topics: Virtual Memory
People: Rik van Riel, Arjan van de Ven
In the 'linux-mm' mailing list, Rik van Riel posted a brief patch and summed up his general attitude to the existing VM structure, saying, "this stuff is untested and since I don't really care about tweaks to the old VM you shouldn't bother me about it..." Arjan van de Ven reported good success with this and another patch, and remarked, "I hope these patches make it into the kernel, at least until the new VM is ready..."
Elsewhere under the Subject: [PATCH] test5-1 vm fix (http://mail.nl.linux.org/linux-mm/2000-07/msg00119.html) , in the course of discussion, Rik said, "There's nothing wrong with the current VM that wasn't fixed in one of my patches the last 8 weeks. (except for the fundamental design flaws, which I will fix in the *next* N+1 weeks)."
Joe Pranevich's Latest Summary Of Kernel Changes For 2.4
16 Jul 2000 - 17 Jul 2000 (4 posts) Archive Link: "Linux 2.4 Changes - Wonderful World of Linux 2.4 Final Draft"
Topics: Disks: SCSI, FS: devfs
People: H. Peter Anvin, Douglas Gilbert, Joe Pranevich
Joe Pranevich posted the latest draft of The Wonderful World Of Linux 2.4 (http://kernelnotes.org/lnxlists/linux-kernel/lk_0007_03/msg00127.html) , and asked for feedback. H. Peter Anvin replied with a correction to Joe's comments about 'devfs'. In the document, Joe had said that the traditional '/dev' naming scheme wouldn't work if one had, say, more than 26 hard drives, because that would use up all the letters of the alphabet for '/dev/hda' through '/dev/hdz'. H. Peter replied that the 27th drive would simply be called '/dev/hdaa' in the traditional naming scheme. He also remarked bitterly, "The bottom line is that devfs takes things that belong in user space, forces them into kernel space, and then expects user space to clean up the resulting mess." Douglas Gilbert replied with a pointer to a doc on device naming in the SCSI subsystem (http://www.torque.net/scsi/linux_scsi_24/) .
Joe told me recently he's still looking for feedback (and he probably will be for awhile). Check out his summary and contact him with your comments at firstname.lastname@example.org (mailto:email@example.com) .
We Hope You Enjoy Kernel Traffic
Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.