Kernel Traffic #137 For 15 Oct 2001

By Zack Brown

linux-kernel FAQ (http://www.tux.org/lkml/) | subscribe to linux-kernel (http://www.tux.org/lkml/#s3-1) | linux-kernel Archives (http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html) | kernelnotes.org (http://www.kernelnotes.org/) | LxR Kernel Source Browser (http://lxr.linux.no/) | All Kernels (http://www.memalpha.cx/Linux/Kernel/) | Kernel Ports (http://perso.wanadoo.es/xose/linux/linux_ports.html) | Kernel Docs (http://jungla.dit.upm.es/~jmseyas/linux/kernel/hackers-docs.html) | Gary's Encyclopedia: Linux Kernel (http://members.aa.net/~swear/pedia/kernel.html) | #kernelnewbies (http://kernelnewbies.org/)

Table Of Contents

* Standard Format
* Text Format
* XML Source
* Polish Translation
* Introduction
* Mailing List Stats For This Week
* Threads Covered

1. 3 Oct 2001 - 10 Oct 2001 (49 posts) Journalled Filesystem Recommendations
2. 3 Oct 2001 - 8 Oct 2001 (28 posts) More Discussion Of VM Politics
3. 8 Oct 2001 (10 posts) Differences Between Linus' And Alan's 2.4 Trees
4. 8 Oct 2001 (1 post) 2.0.40-pre2 Released

Introduction

I'd like to thank the folks who sent me encouraging emails in response to last week's introduction to KT. I'd also like to thank the two folks who sent emails criticizing me for my statements. They were thoughtful and sincere, and I appreciated them. Thanks to all.

Mailing List Stats For This Week

We looked at 1381 posts in 5970K.

There were 427 different contributors. 198 posted more than once. 183 posted last week too.

The top posters of the week were:

* 76 posts in 187K by Alan Cox
* 37 posts in 142K by jamal
* 31 posts in 125K by Thomas Hood
* 29 posts in 171K by Ingo Molnar
* 28 posts in 144K by Andrea Arcangeli
* Full Stats

1. Journalled Filesystem Recommendations
3 Oct 2001 - 10 Oct 2001 (49 posts) Archive Link: "[POT] Which journalised filesystem uses Linus Torvalds ?"
Topics: Disk Arrays: RAID, Disks: IDE, FS: ReiserFS, FS: ext2, FS: ext3
People: Dave Jones, Alan Cox, Dave Cinege, Andre Dahlqvist, Rik van Riel

Sebastien Cabaniols asked which journalling filesystem would be best for production machines. Rik van Riel recommended ext3, which he'd used for over a year without trouble. Dave Jones replied that he also had excellent experience with ext3, except for the time he tried it on the IBM disk on his Vaio. In that case, "Lots of asserts were triggered, and on reboot it couldn't find the journal, the superblock, or the backup superblocks. I spent a few hours trying to get data back, and eventually gave up and reformatted as ext2." Andre Dahlqvist asked which IBM disk had this problem, and Alan Cox said:

Its not specifically IBM, there are two sets of things to watch out for

* Cache flush as a nop/unimplemented. This is legal in all but the most recent ATA specification. The spec has been tightened so that problem will go in time
* Some IBM laptop drives appeared to fail to write back the cache on machine shutdown/suspend etc. The exact rights/wrongs/details on that one haven't been pinned down because the folks concerned swapped a couple of drives for different ones, saw the problem vanish and being a large organisation had the supplier replace the other fifty odd.

Billy Harvey also reported no problems with ext3, and Alan added, "I have no recorded case of an ext3 crash that someone showed was even likely to have been disk caching stuff."

Elsewhere, Dave Cinege recommended ReiserFS, which he used for everything, "including a 13 drive Fiber Channel SAN with 3 hosts and multiple levels of Software RAID between them." He'd never been excited by what he'd read about ext3, and had never tried it.

2. More Discussion Of VM Politics
3 Oct 2001 - 8 Oct 2001 (28 posts) Archive Link: "bug?
in using generic read/write functions to read/write block devices in 2.4.11-pre2"
Topics: Virtual Memory
People: Rob Landley, Alan Cox, Rik van Riel

In the course of discussion, Rob Landley asked:

Out of morbid curiosity, when 2.5 does finally fork off (a purely academic question, I know), which VM will it use? I'm guessing Alan will still inherit the "stable" codebase, but the -ac and -linus trees are breaking new ground on divergence here. Which tree becomes 2.4 once Alan inherits it? (Is this part of what's holding up 2.5?)

Are we waiting for andrea's shiny new VM to get into Alan's tree first? I think Alan said something about somewhere freezing over, but don't quite recall. Is someone else (Andrea?) likely to become 2.4 maintainer?

Alan Cox replied, "For the moment I plan to maintain the 2.4.*-ac tree. I don't know what will happen about 2.4 longer term - that is a Linus question. Looking at historical VM history I don't think we will eliminate enough "2.4.10+ oops on my box" and "on this load the VM sucks" cases from 2.4.10 to fairly review Andrea's VM until Linus has done another 5 or 6 releases and the VM has been tuned, bugs removed and other oops cases proven not to be vm triggered."

In Rob's original post, he also asked, "Oh, and what's the deal with "classzones"? Linus told Andrea classzones were a dumb idea, and we'd regret it when we tried to inflict NUMA architecture on 2.5, but then went with Andrea's VM anyway, which I thought was based on classzones... Was that ever resolved? What the problem avoided? What IS a classzone, anyway? I'd be happy to RTFM, if anybody could tell me where TF the M is hiding..." Rik van Riel replied:

Classzones used to be a superset of the memory zones, so if you have memory zones A, B and C you'd have classzone Ac consisting of memory zone A, classzone Bc = {A + B} and Cc = {A + B + C}.

This gives obvious problems for NUMA, suppose you have 4 nodes with zones 1A, 1B, 1C, 2A, 2B, 2C, 3A, 3B, 3C, 4A, 4B and 4C.
Putting together classzones for these isn't quite obvious and memory balancing will be complex ;)

Of course, nobody knows the exact definitions of classzones in the new 2.4 VM since it's completely undocumented; lets hope Andrea will document his code or we'll see a repeat of the development chaos we had with the 2.2 VM...

At one point Alan said along similar lines, "The classzone code seems to deal in combinations of memory zones, not in specific zones. It lacks docs and the comments seem at best bogus and from the old code so I may be wrong. So its relative weightings for each combination of memory we might want to consider for each case."

3. Differences Between Linus' And Alan's 2.4 Trees
8 Oct 2001 (10 posts) Archive Link: "linux-2.4.10-acX"
Topics: Compression, FS: InterMezzo, FS: ext3, Raw IO, USB, User-Mode Linux, Virtual Memory
People: Alan Cox, Linus Torvalds, Louis Garcia, Robert Love

Louis Garcia asked how much of Alan Cox's 2.4 branch had been merged with Linus Torvalds', and what any remaining differences were between them. Alan replied:

There are measurable differences between the two trees. Notably

* Linus uses the Andrea VM in 2.4.10
  -ac uses the Riel VM in 2.4.10-ac

The -ac tree also has the following major additions

* Platform support for x86_64, usermode linux , etc
* 32bit uid safe quota
* Ext3 file system
* PnPBIOS support
* Various PPro and Pentium workarounds
* Simple boot flag
* Faster x86 syscall path
* PPPoATM
* Elevator flow control
* DRM 4.0 and 4.1 support not just 4.1 (ie XFree 4.0.x works)
* CMS file system
* Intermezzo file system
* isofs compression

and drivers for

* IB700
* IBM Mwave
* Lots more MTD devices
* SA1100 PCMCIA
* Various USB toys

and then lots of bug fixes

Much of that will go on to Linus. Some he has refused (faster syscall path, elevator flow control, ..). It takes time to feed stuff on and often I want to test it in -ac first.
Because so much changed in 2.4.10/11pre it's now getting very hard to merge a lot of the fixes like the truncate standards compliance stuff so they may not make Linus tree until 2.5

Louis asked if the raw IO and block IO patches had been merged into Alan's tree from 2.4.10, and Alan replied, "No. There were certain bits of 2.5^H4.10 that I took one look at and threw out for the moment as unsafe for a stable tree - the page cache block device and O_DIRECT stuff included. 2.4.11pre seems to back some of that out too."

Elsewhere, Robert Love asked what Linus' complaint had been about the faster syscall path, and Alan replied, "He insisted it wouldnt make it any faster. Of course rdtsc and profiling counters of locked cycles show otherwise.." But Linus replied to this:

No, I insist that it doesn't make things _noticeably_ faster (a segment load is something like 12 cycles on a PII), and doing it complicates the return path unnecessarily for the default case.

I seriously doubt you've (or anybody else) measured it with rdtsc or profiling: what you call the "fast path" is never taken on regular system calls, only on nested calls where we return to the kernel. How many of those have you ever seen?

In short, has _anybody_ EVER seen any actual improvement from this ugly "optimization"?

There was no reply.

4. 2.0.40-pre2 Released
8 Oct 2001 (1 post) Archive Link: "[ANNOUNCE] kernel v2.0.40-pre2"
Topics: CREDITS File, FS: ext2, MAINTAINERS File
People: David Weinehall, Philipp Rumpf, Jari Ruusu

David Weinehall announced:

First of all, I'd like to thank Seiichi Nakashima for reporting a few of these errors, and Jari Ruusu for reporting the problem with the new version-name and modules. If this still doesn't work, I'll remove the KERNELRELEASE-stuff completely.

This release is dedicated to all the innocent people of Afghanistan that inevitably, and sadly, will suffer in the hunt for Usama Bin Ladin.
2.0.40pre2

* Make pci2000 compile (Joseph Martin)
* Use KERNELRELEASE in module installpath as well (me)
* Removed unused variable in ext2/super.c (me)
* Fixed warning in ext2/dir.c (me)
* Fix a blunder of my own in arch/kernel/i386/traps.c (me)
* Fix typo in sched.c (Tim Sutherland)
* Fix bug in mkdep.c (Tim Sutherland)
* Fix bug in autoirq.c (Michael Deutschmann)
* Add allocation debugging code (Michael Deutschmann)
* Fix bugs in the math-emu code (Bill Metzenthen, Michael Deutschmann)

2.0.40pre1

* Fixed the ordering of watchdog initialising, to make sure hardware watchdogs takes precedence over the softdog driver (Philipp Rumpf)
* Fix the CREDITS-entry for Kai Petzke (Kai Petzke)
* Updated the MAINTAINERS-file a little (me)
* Fix "dumpable"-race (Solar Designer)
* Fix theoretical exploit in printk (Solar Designer)
* Backported checkconfig.pl, checkhelp.pl and checkincludes.pl from v2.4 (me)
* Backported support for tags and TAGS (me)
* Added an extra-version entry to the version#, to keep track of the prepatches etc. (me)
* Fix all occurences of #endif BLABLA type; don't forget that it should be /* BLABLA */ !!! (me)

There was no reply.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.