Kernel Traffic #3 For 28 Jan 1999

By Zack Brown

Table Of Contents

* Introduction
* Mailing List Stats For This Week
* Threads Covered

1. 20 Jan 1999 - 26 Jan 1999 (39 posts) 'goto' In The Sources
2. 21 Jan 1999 - 23 Jan 1999 (7 posts) 'tar' Slowdown And File Ownership
3. 19 Jan 1999 - 25 Jan 1999 (22 posts) ioctl Documentation; Legacy Features
4. 20 Jan 1999 - 21 Jan 1999 (2 posts) Roast
5. 21 Jan 1999 - 25 Jan 1999 (14 posts) FUD From WindowsNT Magazine
6. 21 Jan 1999 (2 posts) Compiling The Examples From 'Linux Device Drivers'
7. 21 Jan 1999 (4 posts) Filesystem Mirroring In The Kernel
8. 22 Jan 1999 - 26 Jan 1999 (13 posts) Legacy Compiler Workaround
9. 21 Jan 1999 - 27 Jan 1999 (53 posts) Big Memory Machines
10. 24 Jan 1999 - 26 Jan 1999 (4 posts) Latebreaking Bug And Fix
11. 20 Jan 1999 (2 posts) 2.2 Press Release Seen Early On Slashdot
12. 26 Jan 1999 - 27 Jan 1999 (6 posts) Patch Name Confusion Between pre9 And 2.2.0
13. 26 Jan 1999 (4 posts) Hacking Shifts To 2.2.x
14. 26 Jan 1999 - 27 Jan 1999 (27 posts) 2.2.0 Exploit To Crash The System
15. 26 Jan 1999 - 27 Jan 1999 (22 posts) Plans For The Stable Series

Introduction

We have a new home! Mark Constable has generously given quite a nice site at http://www.kt.opensrc.org to Kernel Traffic. Now all we have to do is figure out how to use all the nifty features! We'd also like to thank the other folks who offered their services in exchange for nothing. Thank you all very, very much.

We also have a new banner, created by Daniel van Gerpen (http://homepages.emsnet.de/~gerpen/)!

Oh, yeah, and we took the main body of Kernel Traffic out of its table element so it'll load faster. Some folks were complaining.

By the way, next week may not see an issue of Kernel Traffic. Our main author is being summoned to the land of no email. We'll try to get one out before we go, but it may not be possible. Sorry.

This was a crazy week.
The release of 2.2 caused an explosion of debugging, while prior to that there was a quiet period, though it's hard to think of flame wars and other debates as 'quiet'.

Mailing List Stats For This Week

We looked at 1726 posts in 6660K. There were 636 different contributors. 280 posted more than once. 257 posted last week too.

The top posters of the week were:

* 59 posts in 170K by Alan Cox
* 40 posts in 191K by Andrea Arcangeli
* 36 posts in 127K by "Stephen C. Tweedie"
* 26 posts in 91K by Arvind Sankar
* 26 posts in 83K by Guest section DW

1. 'goto' In The Sources

20 Jan 1999 - 26 Jan 1999 (39 posts) Archive Link: "Structure vs purism ?"

Topics: Assembly, Coding Style

People: Dave Jones, John Alvord, Linus Torvalds

Another big debate about the overall structure of the kernel source. Dave Jones posted a lengthy criticism of the use of 'goto' in the kernel source, adding that he had come up with a patch that removed many of them. He summed up his argument with, "The idea of goto's being a 'bad thing' is possibly the first thing that gets taught on pretty much every programming course I've ever been on. Some might argue, that this ideology is just for purists, but compilers have become a lot more advanced. The use of constructs such as goto are outdated crutches used by people too lazy to write a more structured solution."

Many people responded to this, pointing out various constructs that depend on gotos to produce sleek assembly.

A few people pointed out other problems with goto. John Alvord said, "I always thought the biggest problem was caused by backward gotos. That was what caused the number of possible paths to explode combinatorially."

On the whole, though, pretty much everyone had strong arguments in support of gotos in the kernel, and Linus Torvalds doesn't look like he's going to take them out.

2.
'tar' Slowdown And File Ownership

21 Jan 1999 - 23 Jan 1999 (7 posts) Archive Link: "Tar (but not cp) is incredible slow on certain dirs; request for comments/solution ideas/clues."

Topics: History, Networking, Source Distribution

People: Linus Torvalds

Michael Weller noticed some odd behavior. When creating a tarred backup of his system, tar worked fine until it got to the linux source directory, at which point it started taking several seconds per file.

A bunch of people pointed out that this behavior happens if the system doesn't recognize the owner of the files in question. In particular, the linux sources probably had Linus Torvalds' user id, which had presumably not been assigned to anyone on Michael's system. One solution was to do a 'chown' on the files in question, to assign them to a known user. Another was to create a new user account with that id.

The reason for the slowdown is that when tar sees that it doesn't recognize the user id of a file, it queries the NIS, which takes several seconds if it doesn't find a user. Only after the NIS query is answered does tar proceed to the next file. The NIS (Network Information Service) is just a protocol for making that kind of informational query across a network.

Interesting tidbit: according to the Free Online Dictionary of Computing (http://wombat.doc.ic.ac.uk/foldoc/index.html), the NIS protocol is actually owned by Sun Microsystems. Originally the protocol was called the Yellow Pages, but that turned out to be a trademark of British Telecommunications, so they had to change it. The old name has a legacy in the protocol itself, though: all its commands and functions still start with 'yp'.

3.
ioctl Documentation; Legacy Features

19 Jan 1999 - 25 Jan 1999 (22 posts) Archive Link: "Adding checkpointing API to Linux kernel"

Topics: Backward Compatibility, Debugging, Documentation, Ioctls, Replacing Linux

People: Alan Cox, Simon Kenyon, Aaron Denney, Michael Elizabeth Chastain, Linus Torvalds

A wide-ranging thread that featured Michael Elizabeth Chastain's revelation last week of a powerful debugging tool (ftp://ftp.shout.net/pub/users/mec/misc) he wrote back in 1995 and released under the GPL but which went completely unnoticed. That anonymity is definitely a thing of the past. Responding to Michael's list of features, Alan Cox said, "I'd practically kill for that stuff," and Simon Kenyon added "i too would have been using it all day, every day had i known that it exists."

This week, the thread seems to have migrated to ioctls and the possibility of documenting them. 'ioctl' is short for 'input/output control' and the ioctl() function is a catch-all for giving non-read/write commands to physical devices. Such a command might include telling a device to change its state in some way. Anyone who writes a device driver can add their own functionality to ioctl(). With the speed of linux development, this means (according to Michael in the description of his debugging tool) that new ioctl calls appear on a weekly basis and are almost impossible to track, let alone document. Moreover, there is no standardized way to implement ioctls. The calling parameters vary from invocation to invocation.

ioctls appear to be one of those UNIX holdovers that one or another post-linux OS will get rid of. Aaron Denney even suggested, "Instead of ioctls I would actually prefer something akin to to the plan 9 model where each device is actually a directory with two files: a data file and a control file. Instead of opening the device and calling ioctls you would instead open the control channel and read() and write() messages."
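To make the 'catch-all' nature of ioctl() a bit more concrete, here is a minimal sketch in C. It assumes a Linux system and uses the FIONREAD request on a pipe, picked only because it needs no special hardware. The helper names pending_bytes() and demo_pipe_query() are made up for this illustration and do not come from the thread.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel how many unread bytes are queued
 * on fd. Returns the byte count, or -1 if the ioctl fails. */
int pending_bytes(int fd)
{
    int count = 0;

    /* The meaning and type of the third argument depend entirely on the
     * request code. For FIONREAD it is a pointer to int. */
    if (ioctl(fd, FIONREAD, &count) == -1)
        return -1;
    return count;
}

/* Hypothetical demo: write len bytes of msg into a fresh pipe, then use
 * the ioctl above to see how much is waiting on the read end. */
int demo_pipe_query(const char *msg, size_t len)
{
    int fds[2];
    int waiting;

    assert(msg != NULL);
    if (pipe(fds) == -1)
        return -1;

    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    waiting = pending_bytes(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return waiting;
}
```

Nothing in the ioctl() prototype itself says that FIONREAD wants an int pointer. That knowledge lives only in the documentation for each request code, which is roughly the tracking problem Michael describes.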
As Michael put it while discussing his program, "It's a dandelion problem -- you pull on it, and you find a whole root system underneath it." He added "The problem is: 'how do you document an entity where hundreds of unrelated people check in code everywhere, and your desire for documentation exceeds the willingness of those hundreds of people to write it?'" Michael also suggested that Linus Torvalds require all ioctl patches to come with documentation or be rejected from the kernel. Robert Kiesling responded with the suggestion that the information may already exist in the code, and only needs to be put into a readable form.

Getting back to the question of debilitating holdovers: there was a bit of a flame war this week in a different thread, regarding the idea that linux may one day be replaced. It's a natural subject. The idea is that the need for backward compatibility ties OS's to design flaws, and that only a fresh start can exorcise those bugs. A thorough discussion of these issues might lead to the creation of a new and better operating system, especially now that software could be developed and ported at lightning speed via the bazaar development model. The linux-kernel mailing list might be the best place to find people familiar with linux's limitations and the reasons behind them, but (evidently) tact should be used in discussing them.

4. Roast

20 Jan 1999 - 21 Jan 1999 (2 posts) Archive Link: "Libs drop from memory when?"

Topics: New Developers

People: David D.W. Downey

A brief but chilling thread. David D.W. Downey asked a question, then replied the same day to his own post after apparently getting private flames from numerous people on linux-kernel (as well as two sincere replies to his question). The entire thread consisted of just those two posts. No one else responded on the list. His first post is friendly and respectful. His second is, well, troubled.

David before: "Evening all.
My question is, what is the kernel's time limit for dropping unused libs from memory?"

David after: "to some, those without their level of knowledge, who are incidently trying to learn, are idiots and should be burnt alive at the stake."

5. FUD From WindowsNT Magazine

21 Jan 1999 - 25 Jan 1999 (14 posts) Archive Link: "Linux Kernel constraints!"

Topics: FS, Microkernels, Microsoft, Real-Time, SMP

People: Yogesh Bansal, Justin Bradford, Matthew Kirkwood, Vojtech Pavlik, Zygo Blaxell, Stephen C. Tweedie

Yogesh Bansal asked about a WindowsNT magazine article that rejected linux as an "enterprise solution", accusing linux of the following, in Yogesh's words:

1. kernel is not preemptive. ie even a higher priority user thread cant cause another thread to be swapped if the other thread is presently running in privileged/kernel context.
2. kernel is not reentrant. ie.only one thread in kernel context at a time.
3. kernel is not multi processing in the sense that on multiprocessor systems it will run on only one cpu at a time.

Justin Bradford directed one of the harsher replies, not to the author of the original post, but to the magazine itself: "Windows NT magazine also ran an article stating all Linux programs had to distribute their source. In the same article, they mention Oracle, Informix, and Corel porting. So where do I download Oracle 8 source?!? The Win NT writers either aren't bright enough to understand the LGPL, or they're deliberately spreading misinformation. Either way, I can't imagine any group of people less qualified to comment on the Linux kernel."

Other reactions addressed the limitations directly. Matthew Kirkwood said that all three assertions really meant the same thing, and thus were all true for 2.0, though not for 2.1 (and now 2.2). But he admitted that even for the latest kernels, "the filesystem and network stuff is still pretty single-threaded."
Vojtech Pavlik said that only the first of the magazine's objections carried any weight into kernel 2.1, and did admit that a process running in privileged/kernel context could not be interrupted. He added that this "only would be a problem if the kernel call would take very long time," and Zygo Blaxell added, in another part of the thread, "And if it is [a problem], use RT-Linux, which does have a pre-emptive kernel. Sort of. It's actually a hard real-time microkernel that runs the entire Linux kernel as a low-priority task. It's a hybrid solution but good enough for things that plain Linux isn't."

Finally, Stephen C. Tweedie came down on even the magazine's first objection: "This is a feature, not a problem. A fully preemptive kernel is necessary for true realtime, NOT for a server OS. Excessive preemption requries extra locking and it craps up the use of the CPU caches, resulting in overall poorer throughput for a server OS."

6. Compiling The Examples From 'Linux Device Drivers'

21 Jan 1999 (2 posts) Archive Link: "Kernel modules in 2.1/2.2.."

People: Matthew Kirkwood

This is one of those quick question/answer threads that pop up often in linux-kernel. Someone has a problem, and someone else has the answer.

Nicholas M. Kirsch bought O'Reilly's "Linux Device Drivers" and found that the simple examples in the book weren't compiling with the latest kernels. Matthew Kirkwood suggested, "If you can, rebuild the kernel without module versioning and try again. Alternatively, adding -DMODVERSIONS after -DMODULE might do the trick."

End of thread.

7. Filesystem Mirroring In The Kernel

21 Jan 1999 (4 posts) Archive Link: "omirr"

Topics: BSD: FreeBSD, FS, Real-Time

People: Matthew Kirkwood

A brief but informative exchange. Javier Kohan posted asking why omirr (online fs mirroring) isn't in the kernel, and if it would be in the future.
According to Matthew Kirkwood, with omirr "two filesystems could be kept in sync with one being the canonical version and the other (possibly remote, or slow) as a backup or similar." He added, "It went into the kernel at about 2.1.3x and out again at about 5x, where it became obvious that it was causing too many complications and was better left to userspace."

Numa then pointed out, "Something similar to omirr already exists, it has been implemeted in Irix, and is underway for FreeBSD. The userland tool is called 'webd' (http://science.nas.nasa.gov/Groups/WWW/subpages/topology.html), and although the name implies web, it could be used extensively in other arenas, such as a realtime tripwire."

A tripwire seems to be something that tells if a filesystem has been changed--useful for detecting attacks.

8. Legacy Compiler Workaround

22 Jan 1999 - 26 Jan 1999 (13 posts) Archive Link: "-fno-strength-reduce"

Topics: History, Optimization

People: Arvind Sankar, Steven N. Hirsch, Henrik Olsen

Arvind Sankar asked, "Why is this option given to the compiler?" -fstrength-reduce is an optimization option used to do strength reduction, i.e. to calculate loop variables more efficiently. By using -fno-strength-reduce, the compiler is explicitly told not to do that optimization.

The answer is interesting. Apparently it's a legacy from way back. Steven N. Hirsch said, "AFAIK, it is a workaround for a gcc-2.7 bug discovered by John Davis. I believe it was fixed quite some time ago in gcc-2.7.[2-3], and almost certainly does not exist in egcs." The gcc bug had to do with the fact that gcc's -O2 option (which just tells the compiler to do a lot of different speed optimizations) implicitly included -fstrength-reduce, which (it was discovered) didn't work so well. The -fno-strength-reduce was added to the linux makefile to stop -O2 from trying to do that broken optimization.
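As a rough illustration of what strength reduction does, the sketch below writes the same loop twice in C: once with a multiplication on every iteration, and once in the form an optimizer might rewrite it to, with a running addition instead. The function names are hypothetical, and this is hand-written source rather than actual gcc output.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward form: one multiplication per iteration. */
void fill_multiply(long *out, size_t n, long stride)
{
    size_t i;

    assert(out != NULL || n == 0);
    for (i = 0; i < n; i++)
        out[i] = (long)i * stride;
}

/* Strength-reduced form: the multiplication is replaced by a running
 * sum, so each iteration costs only an addition. */
void fill_reduced(long *out, size_t n, long stride)
{
    size_t i;
    long value = 0;

    assert(out != NULL || n == 0);
    for (i = 0; i < n; i++) {
        out[i] = value;
        /* Skip the increment after the last element so the running
         * sum never goes past the values the first form computes. */
        if (i + 1 < n)
            value += stride;
    }
}

/* Returns 1 if both forms produce identical arrays for n <= 64 elements,
 * 0 otherwise (including when n is too large for the scratch buffers). */
int forms_agree(size_t n, long stride)
{
    long a[64], b[64];
    size_t i;

    if (n > 64)
        return 0;
    fill_multiply(a, n, stride);
    fill_reduced(b, n, stride);
    for (i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

The two forms are supposed to be interchangeable, which is why a compiler bug in this transformation is so nasty: the source looks correct, and only the generated code is wrong.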
Henrik Olsen added, "It was fixed with GCC 2.7.2.3, but it's still in [the makefile] because only one person actually bothered to make a speed comparison and post the result to the [linux-kernel] list, this comparison showed that any gain from dropping the option is too small to be measurable, end ends up in the noice. As a result, Linus didn't think removing it was worth the bother."

9. Big Memory Machines

21 Jan 1999 - 27 Jan 1999 (53 posts) Archive Link: "Re: MM deadlock [was: Re: arca-vm-8...]"

Topics: Big Memory Support, Development Philosophy, Microkernels

People: Linus Torvalds, Alan Cox

This long and wide-ranging thread had at least one interesting episode this week. Linus Torvalds and Alan Cox both took a vacation for a few days, then came back and had an argument over memory. Alan wanted linux to have better handling of large areas of ram, while Linus put his foot down, saying:

    You need more that 32 bits of address space to handle that kind of memory. This is not something I'm going to discuss further. If people want to use more than 2GB of memory, they have exactly two options with Linux:

    + get a machine with reasonable address spaces. Right now that's either alpha or sparc64, in the not too distant future it will be merced.
    + use the extra memory as a ram-disk (possibly memory-mappable, but even that I consider unlikely)

    This is not negotiable.

Linus often takes this tone about an issue, and each time it is very interesting. On the one hand, one might object that it's not "nice" to simply cut off the discussion. On the other hand, there seems to be something in the technical nature of the issues under dispute, and in the actual positions Linus takes regarding them, that the other top developers feel they can respect. It goes beyond the fact that Linus is Linus, it goes beyond the fact that he started the project.
Although those things definitely support him, I believe the bottom of it is, as various developers have said at various times, some decision has to be made about such-and-such an issue, and they believe Linus' decisions are reasonable, whether they agree with those decisions or not. I think it's that recognizable (to the developers, if not to us mortals) reasonableness that gives legitimacy to what might otherwise be criticized as autocratic.

10. Latebreaking Bug And Fix

24 Jan 1999 - 26 Jan 1999 (4 posts) Archive Link: "pre9/final-Bug, introduced in pre8"

Topics: FS, Release Scheduling

People: Kurt Huwig, Linus Torvalds, Tim Waugh

A bug was found in pre9 that might have delayed the release of 2.2. According to Kurt Huwig, the bug was introduced in pre8, and manifests itself in the following code (ed. KT thanks Ted Clark for reminding us to escape '<' and '>' in the following snippet):

    #include <fcntl.h>

    void main( int argc, char *argv[] )
    {
        open( argv[ 1 ], O_WRONLY|O_CREAT|O_TRUNC, 0666 );
    }

Linus Torvalds replied, "this seems to be due to pre9 removing some rather bogus code that happened to hide another problem in open_namei()." In the same post he offered a patch, and Tim Waugh (and later Kurt Huwig) followed up with a success report.

open_namei() is defined in /usr/src/linux/fs/namei.c (http://lxr.linux.no/source/fs/namei.c?v=2.2.15#L660). According to the comments, it's "the namei for open". namei() is found right above it in the same file, and according to the comments, is simply used to get the inode of a given name. So, as we might have known from the source snippet above, the bug has to do with opening files.

11. 2.2 Press Release Seen Early On Slashdot

20 Jan 1999 (2 posts) Archive Link: "Draft 6 of Press Release seen on Slashdot"

Topics: Release Scheduling

People: Linus Torvalds

Some folks were distressed that the press release for 2.2 that was being drafted on the list had been made public without being blessed by Linus Torvalds.
It will be interesting to see how this affects future press releases that develop on linux-kernel. Will they be submitted to Linus for final touch-up, several releases before the final one? Will no press release be drafted because of this problem? Time will tell.

12. Patch Name Confusion Between pre9 And 2.2.0

26 Jan 1999 - 27 Jan 1999 (6 posts) Archive Link: "Re: 2.2.0 patch (where) ?"

Topics: Release Scheduling

People: Linus Torvalds

Apparently Linus Torvalds called the pre9 patch and the 2.2 patch both "2.2.0-final", which caused a small bit of confusion amid the general flurry of bug-fixing.

13. Hacking Shifts To 2.2.x

26 Jan 1999 (4 posts) Archive Link: "Linyux 2.2.0ac1"

Topics: Release Scheduling

People: Alan Cox

Alan Cox released one of his famous ac patches on Tuesday. One of the interesting things about moving from 2.odd-number to 2.even-number is that a lot more people suddenly start using the kernel and finding bugs. So the 2.2 series, at least initially, is likely to seem buggier than the 2.2pre series, just because more (maybe a lot more) people are using it.

One interesting thing came out of this thread: apparently some folks on linux-kernel didn't know 2.2 had been released. They were waiting for an announcement that never came.

14. 2.2.0 Exploit To Crash The System

26 Jan 1999 - 27 Jan 1999 (27 posts) Archive Link: "Re: 2.2.0 SECURITY"

People: Alan Cox, Linus Torvalds, Tigran Aivazian, Ingo Molnar, Chris Ricker

This is the thread in which Dan Burcaw revealed the infamous 'ldd core' bug. Just run 'ldd core' on any core file and boom! System crash. Apparently ac1 also crashes in the same way, to which Alan replied, "If anyone has a crystal ball available that works let me know, otherwise its tricky to fix undiscovered bugs 8)"

Tigran Aivazian tried to hack the problem down a bit, and claimed that the crash can only be induced immediately after the core file is created, and there must be some swap in use.
But Chris Ricker came back with an example of the crash using an old core file and no swap.

Some folks got the idea that telling the system not to produce core files would be a good workaround. Some others pointed out that this might hide the problem superficially, but would not protect the system.

While this was going on, Ingo Molnar posted a patch that fixed the problem. Linus Torvalds was confused as to how the bug crept in, because he had originally written the key line correctly, "but I must have broke it for some really stupid reason."

15. Plans For The Stable Series

26 Jan 1999 - 27 Jan 1999 (22 posts) Archive Link: "Big Fix for 2.2.1"

Topics: Development Strategy

People: Linus Torvalds

Linus Torvalds made his plans for the 2.2 series crystal clear in this thread. At some point in the thread Serguei Koubouchine asked why a particular error message that should have been taken out of the source before 2.2 had not been taken out. Linus replied, "Maybe because nobody complained even though I released 9 pre-releases? Too late now." And elsewhere in the thread he said, "Changing code for 2.2.x is not an option." So not only will no new features be added, but no changes will be made at all that do not fix particular bugs.

We Hope You Enjoy Kernel Traffic

Kernel Traffic is hosted by the generous folks at Tux.Org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.