Kernel Traffic #316 For 20 Jun 2005

By Zack Brown

    One thing I have to say is that I've learnt a lot more shell tricks. Now I'll just have to unlearn them, so that I won't have nightmares. --Linus

Table Of Contents

* Mailing List Stats For This Week
* Threads Covered

  1. 7 Jun 2005 - 11 Jun 2005 (34 posts) Linux 2.6.12-rc6-mm1 Released
  2. 7 Jun 2005 - 10 Jun 2005 (13 posts) Migrating To 4K Kernel Stacks
  3. 7 Jun 2005 - 14 Jun 2005 (103 posts) Summary Of Real-Time Issues
  4. 10 Jun 2005 - 13 Jun 2005 (51 posts) PREEMPT_RT Versus Adeos: A First Problematic Comparison
  5. 10 Jun 2005 - 16 Jun 2005 (42 posts) DevFS Removed From Linux
  6. 11 Jun 2005 (2 posts) Linux 2.6.11.12 Released
  7. 12 Jun 2005 - 13 Jun 2005 (2 posts) Linux 2.4 Will Not Support GCC 4
  8. 14 Jun 2005 - 16 Jun 2005 (10 posts) Linux Docs Viewable On Only Non-Linux Browsers

Mailing List Stats For This Week

We looked at 1706 posts in 10MB. There were 614 different contributors. 243 posted more than once. The average length of each message was 95 lines.

The top posters of the week were:

* 41 posts in 369KB by gregkh@suse.de
* 36 posts in 110KB by david s. miller
* 35 posts in 227KB by ingo molnar
* 34 posts in 183KB by karim yaghmour
* 30 posts in 162KB by denis vlasenko

The top subjects of the week were:

* 102 posts in 549KB for "attempted summary of "rt patch acceptance" thread"
* 85 posts in 492KB for "[patch] local_irq_disable removal"
* 57 posts in 219KB for "ipw2100: firmware problem"
* 51 posts in 276KB for "preempt_rt vs adeos: the numbers, part 1"
* 50 posts in 299KB for "2.6.12-rc6-mm1"

These stats generated by mboxstats version 2.8

1. Linux 2.6.12-rc6-mm1 Released

7 Jun 2005 - 11 Jun 2005 (34 posts)
Archive Link: "2.6.12-rc6-mm1"
Topics: Kernel Release Announcement
People: Andrew Morton, Adrian Bunk, Matt Porter

Andrew Morton announced Linux 2.6.12-rc6-mm1, saying:

    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

    * Added v9fs
    * Various random fixes
    * Probably a similar number of breakages

Adrian Bunk remarked, "That we do now have both drivers/rio/ and drivers/char/rio/ and that they are for completely different things is confusing. What about drivers/rapidio/ ?" Matt Porter replied, "Fine with me. I'll roll it into my next update."

2. Migrating To 4K Kernel Stacks

7 Jun 2005 - 10 Jun 2005 (13 posts)
Archive Link: "RFC: i386: kill !4KSTACKS"
Topics: FS: ReiserFS
People: Adrian Bunk, Alexander Nyberg

Adrian Bunk said:

    4Kb kernel stacks are the future on i386, and it seems the problems it initially caused are now sorted out.

    I'd like to:

    * get a patch into the next -mm that unconditionally enables 4KSTACKS
    * if there won't be new reports of breakages, send a patch to completely remove !4KSTACKS for 2.6.13 or 2.6.14

    The only drawback is that REISER4_FS does still depend on !4KSTACKS. I told Hans back in March that this has to be changed. Is there any ETA until that all issues with 4Kb kernel stacks in Reiser4 will be resolved?

    If not people using Reiser4 might have to decide whether to switch the filesystem or the architecture...

Alexander Nyberg pointed out that certain combinations of configuration options would break 4KSTACKS, "so you can't just remove it leaving users with no choice. This was not even difficult to trigger a while ago and I haven't seen any stack reduction patches in these areas." There was some dispute over whether this was still an issue, with no resolution on the list.
Elsewhere, Vladimir Saveliev said he was hard at work on resolving any lingering ReiserFS issues with 4KSTACKS, and estimated he would complete his patches very quickly.

3. Summary Of Real-Time Issues

7 Jun 2005 - 14 Jun 2005 (103 posts)
Archive Link: "Attempted summary of "RT patch acceptance" thread"
Topics: Real-Time
People: Paul E. McKenney

Paul E. McKenney said, "Midway through the recent "RT patch acceptance" thread, someone mentioned that it might be good to summarize the various approaches. The following is an attempt to do just this, with an eye to providing a reasonable framework for future discussion." He included a very long document (http://groups-beta.google.com/group/linux.kernel/msg/5ce63e8782bdd07a?hl=en), which got a lot of praise from all quarters, especially for his ability to maintain a balanced, unbiased approach. A lot of people had suggestions for how to improve the document, but the conversation degenerated into further flames and accusations. The original post is quite worth reading though, for anyone interested in navigating the maze.

4. PREEMPT_RT Versus Adeos: A First Problematic Comparison

10 Jun 2005 - 13 Jun 2005 (51 posts)
Archive Link: "PREEMPT_RT vs ADEOS: the numbers, part 1"
Topics: Microkernels: Adeos, Real-Time, SMP
People: Kristian Benoit, Nick Piggin, Ingo Molnar, Karim Yaghmour

Kristian Benoit said:

    For the past few weeks, we've been conducting comparison tests between PREEMPT_RT and the Adeos nanokernel. As was clear from previous discussion, we've been open to be proven wrong regarding endorsement of either method. Hence, this comparison was done in order to better understand the impact of each method vis-a-vis vanilla Linux.

    At this time, we are publishing the summary results of the various test runs we've conducted. There is, of course, a lot of background information that needs to be provided (.configs, scripts, drivers, etc.) We will be making those available sometime early next week. In the mean time, the following is meant as food-for-thought.

    In our tests, we've used a set up with 3 machines. The two main systems were Dell PowerEdge SC420 machines with a P4 2.8 (UP not SMP configured) with FC3. One had 256 MB RAM and was the guinea pig (i.e. the machine controlling the mechanical saw a.k.a. TARGET) The other, having 512 MB RAM, was used to collect data regarding the guinea pig's responsiveness (a.k.a LOGGER.) The third machine, an Apple PowerBook 5,6 G4 / 1GB, was used for a dual purpose. First, it controlled both the target and the logger via ssh, and was also used to ping flood the target. This 3rd system is known as the HOST.

    Data was generated on all three systems:

    TARGET: LMbench data
    LOGGER: Interrupt response time
    HOST: LMbench total running time

    Both the host and the logger had a constant kernel configuration. The logger was running an adeos-enabled kernel in order to trigger and deterministically measure the responsiveness of the target. The host was running a plain gentoo-based kernel. The target and the logger were rigged via their parallel ports so that an output from the logger would trigger an interrupt on the target who's response would itself trigger an interrupt on the logger.

    In the various test runs, we've attempted to collect two sets of data. One regarding LMbench's total running time for a given set up and the other regarding the system's interrupt response time. Where appropriate, both tests were conducted simultaneously. Otherwise, they were conducted in isolation. The following tables should be self-explanatory.

    For LMbench test runs, 3 passes were conducted and an average running time was collected. Certainly, 3 passes is not as much as we'd like, but for the immediate purposes, it provides a sufficiently corroborated data set for analysis (as can be seen in the following tables.)

    For the interrupt response time measurement, the logger generated between 500,000 to 650,000 interrupts and measured the target's response time. The logger was not subject to any load whatsoever, except that imposed by the logging driver (running in a prioritary Adeos domain, and hence being truly hard-rt, a.k.a. "ruby" hard) and that of the user-space daemon committing the data to storage.

    It could be argued that the use of Adeos imposes a penalty to the measured response time. However, this penalty is imposed on all data sets, and verification of its impact can be inferred by analyzing the adeos-to-adeos set up provided below.

    With no further ado, here are the results we've obtained. As we said above, we will be making all related scripts, patches, and drivers available, so that others may conduct their own tests.

    Note that in the tests we've conducted, we've tried, in as much possible, to use similar kernels. However, we were unable to find a recent Adeos and a recent PREEMPT_RT patch which would both cleanly apply to a same recent kernel. Hence, we've compared vanilla 2.6.12-rc2 with an adeos-patched one and a 2.6.12-rc4 with a PREEMPT_RT-patched one. As can be seen below, the runs of the vanilla rc2 and rc4 yield very similar numbers, and can therefore reasonably be considered equivalent.
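(An aside that is not part of Kristian's post: the "%" rows in the LMbench table that follows appear to be the plain relative change of each patched kernel's running time against its vanilla baseline. The Python sketch below recomputes them under that assumption. The running times are copied from the table; the function and variable names are illustrative only and do not come from the thread or from LMbench.)

```python
# Running times in seconds, copied from the LMbench table in this thread.
# Column order: plain, IRQ test, ping flood, IRQ & ping, IRQ & hd.
VANILLA_RC2 = [174, 175, 189, 193, 217]
ADEOS = [180, 180, 185, 183, 211]
VANILLA_RC4 = [176, 177, 189, 191, 218]
PREEMPT_RT = [184, 187, 206, 201, 225]


def overhead_percent(baseline, patched):
    """Relative change of each patched time against its baseline, in percent.

    Assumes the table's "%" rows are 100 * (patched - baseline) / baseline,
    rounded to one decimal place.
    """
    return [round(100.0 * (p - b) / b, 1) for b, p in zip(baseline, patched)]


if __name__ == "__main__":
    print("Adeos vs rc2:     ", overhead_percent(VANILLA_RC2, ADEOS))
    print("PREEMPT_RT vs rc4:", overhead_percent(VANILLA_RC4, PREEMPT_RT))
```

Under that assumption the recomputed figures agree with the published "%" rows to one decimal place. A positive value means the patched kernel took longer than vanilla, a negative value means it finished sooner.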
    LMbench running times:

    +--------------------+-------+----------+------------+------------+----------+
    | kernel             | plain | IRQ test | ping flood | IRQ & ping | IRQ & hd |
    |--------------------+-------+----------+------------+------------+----------|
    | Vanilla-2.6.12-rc2 | 174 s | 175 s    | 189 s      | 193 s      | 217 s    |
    |--------------------+-------+----------+------------+------------+----------|
    | with Adeos-r10c3   | 180 s | 180 s    | 185 s      | 183 s      | 211 s    |
    |--------------------+-------+----------+------------+------------+----------|
    | %                  | +3.4  | +2.9     | -2.1       | -5.2       | -2.8     |
    |--------------------+-------+----------+------------+------------+----------|
    | Vanilla-2.6.12-rc4 | 176 s | 177 s    | 189 s      | 191 s      | 218 s    |
    |--------------------+-------+----------+------------+------------+----------|
    | with RT-V0.7.47-08 | 184 s | 187 s    | 206 s      | 201 s      | 225 s    |
    |--------------------+-------+----------+------------+------------+----------|
    | %                  | +4.5  | +5.6     | +9.0       | +5.2       | +3.2     |
    +--------------------+-------+----------+------------+------------+----------+

    Legend:

    plain      = Nothing special
    IRQ test   = on logger: triggering target every 1ms
    ping flood = on host: "sudo ping -f $TARGET_IP_ADDR"
    IRQ & ping = combination of the previous two
    IRQ & hd   = IRQ test with the following being done on the target:
                 "while [ true ]
                  do dd if=/dev/zero of=/tmp/dummy count=512
                  done"

    In the following, interrupts are triggered by the logger at every 1ms. It would be interesting to redo such tests with shorter trigger times. However, we wanted to keep the logger as "off-the-shelf" as possible.

    Interrupt response times (all in micro-seconds):

    +--------------------+------------+------+-------+------+--------+
    | Kernel             | sys load   | Aver | Max   | Min  | StdDev |
    |--------------------+------------+------+-------+------+--------|
    | Vanilla-2.6.12-rc2 | None       | 15.5 | 64.8  | 15.2 | 1.0    |
    |                    | Ping       | 15.7 | 63.4  | 15.2 | 1.2    |
    |                    | lm. + ping | 16.0 | 72.2  | 15.2 | 1.4    |
    |                    | lmbench    | 15.8 | 65.6  | 15.2 | 1.1    |
    |                    | lm. + hd   | 15.8 | 179.9 | 15.2 | 1.3    |
    |--------------------+------------+------+-------+------+--------|
    | with Adeos-r10c3   | None       | 13.4 | 53.3  | 13.2 | 0.2    |
    |                    | Ping       | 13.8 | 53.3  | 13.3 | 0.6    |
    |                    | lm. + ping | 13.9 | 21.8  | 13.2 | 0.7    |
    |                    | lmbench    | 13.9 | 21.3  | 13.3 | 0.6    |
    |                    | lm. + hd   | 13.9 | 53.2  | 13.2 | 0.5    |
    |--------------------+------------+------+-------+------+--------|
    | Vanilla-2.6.12-rc4 | None       | 15.2 | 64.2  | 15.2 | 0.5    |
    |                    | Ping       | 15.6 | 63.0  | 15.2 | 0.9    |
    |                    | lm. + ping | 16.0 | 170.5 | 16.0 | 1.4    |
    |                    | lmbench    | 15.8 | 184.1 | 15.2 | 1.2    |
    |                    | lm. + hd   | 15.8 | 67.1  | 15.0 | 1.1    |
    |--------------------+------------+------+-------+------+--------|
    | with RT-V0.7.47-08 | None       | 15.5 | 73.8  | 15.2 | 1.2    |
    |                    | Ping       | 17.1 | 79.8  | 15.2 | 2.3    |
    |                    | lm. + ping | 17.7 | 77.2  | 15.2 | 3.1    |
    |                    | lmbench    | 17.1 | 80.0  | 15.3 | 2.3    |
    |                    | lm. + hd   | 17.0 | 80.0  | 15.3 | 1.8    |
    +--------------------+------------+------+-------+------+--------+

    Legend:

    None       = nothing special
    ping       = on host: "sudo ping -f $TARGET_IP_ADDR"
    lm. + ping = previous test and "make rerun" in lmbench-2.0.4/src/ on target
    lmbench    = "make rerun" in lmbench-2.0.4/src/ on target
    lm. + hd   = previous test with the following being done on the target:
                 "while [ true ]
                  do dd if=/dev/zero of=/tmp/dummy count=512
                  done"

    Note: Adeos-r10c3 is a "combo" patch including both Adeos and PREEMPT_RT, though PREEMPT_RT is disabled.

    The above data has been provided as-is without any analysis for now. We will provide such analysis when publishing the complete data sets and related software. In the mean time, we hope such results will help further reflection.

Nick Piggin replied, "This is wonderful data, thanks very much for putting in the work. I hope this thread and future threads on this topic can be steered more towards technical facts and numbers, as that is the only way to make sane choices."

Elsewhere, Ingo Molnar asked Kristian, "could you send me the .config you used for the PREEMPT_RT tests? Also, you used -47-08, which was well prior the current round of performance improvements, so you might want to re-run with something like -48-06 or better." (he later added, "make that -48-10 or better.")

Nick added, "The other thing that would be really interesting is to test latencies of various other kernel functionalities in the RT kernel (eg. message passing, maybe pipe or localhost read/write, signals, fork/clone/exit, mmap/munmap, faulting in shared memory, or whatever else is important to the RT crowd)."

Elsewhere, Karim Yaghmour said:

    Much to our dislike, we only noticed that we forgot to disable the debug options after posting the results :/ So, in all fairness, we will be redoing the tests on PREEMPT_RT early next week.

    In the plethora of things we wanted to try, it also seems that the "dd" test wasn't exactly as it was supposed to be. There should've been a "bs=1M" in there; as it currently is, the dd command doesn't really put any real load. We'll add this one to our repeats.

    I notice there are already suggestions regarding additional types of tests, and that's good. We'll try to take as many of these as possible. This is relatively simple given the scripts Kristian has put together. Nevertheless, it must be understood that we don't have infinite resources. So in sharing the framework we've developed, we hope others will be motivated to conduct their own tests.

Ingo replied that he'd suspected something like debug options being enabled in those tests. He said, "the PREEMPT_RT latency numbers were so out of whack with anything i've measured on similar boxes. The debugging features on PREEMPT_RT are powerful but have a high overhead." He asked again (actually for the third time) for the .config file used in the tests, saying, "That's the easiest way i can tell you which options to watch out for." Karim posted the .config file, and Ingo pointed out numerous options that would increase latency. Karim said they'd try to use Ingo's modified .config file for their new tests.

James R. Bruce asked Ingo to document the best options to use for the lowest latency, but Ingo replied, "i'm not a big doc writer, but i'm taking patches :-)".

5. DevFS Removed From Linux

10 Jun 2005 - 16 Jun 2005 (42 posts)
Archive Link: "[RFC] Patch series to remove devfs [00/22]"
Topics: FS: devfs
People: Greg KH, Adrian Bunk, Christoph Hellwig, Armin Schindler, Ed Tomlinson

Greg KH said:

    As everyone knows[1], devfs is going to be removed from the kernel soon. To accomplish this, here is a series of patches (22 in all) that do just that. Surprisingly enough, devfs was almost everywhere in the kernel, that's why it takes so many patches :)

    Anyway, here's the whole series against 2.6.12-rc6-git4. If some of them don't make it through to lkml (due to size restrictions, or just failing on a "taste" filter), you can find them all at:

    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-05-devfs/

    along with a quilt series file to apply them with.

    Andrew, please do not pick these up for your -mm tree. Odds are it will just cause too many conflicts to make it worth your while :)

    Comments welcome.

    Oh, and the best part? Here's the summary of the diffstat:

    222 files changed, 112 insertions(+), 8545 deletions(-)

    It's nice to remove code from the kernel for a change...

    thanks,

    greg k-h

    [1] What? You don't know this? Didn't you get the memo[2]? Did you miss the huge flame war almost a year ago[3]? Are you living under a rock[4]?
    [2] http://lxr.linux.no/source/Documentation/feature-removal-schedule.txt
    [3] http://thread.gmane.org/gmane.linux.kernel/219278
    [4] http://www.balloonplanet.com/shop/images/products/product_1788_small.jpg

Adrian Bunk said, "Please don't remove the !CONFIG_DEVFS_FS dummies from devfs_fs_kernel.h. I'm sure some driver maintainers will want to keep the functions in their code because they share their drivers between 2.4 and 2.6." But Greg replied, "All drivers should be in the mainline kernel tree, so why would they need this? Remember, out-of-the-tree drivers are on their own..." Adrian said:

    I'm talking about drivers in the mainline kernel tree.
    In some cases the driver author supports both 2.4 and 2.6 and prefers to support them in one file. Sometimes he submits the latest version of his driver to Marcelo or Linus.

    If you remove the global function dummies, you force every driver maintainer who works this way to add the function dummies to their drivers.

    Yes, there are many places where 2.4 and 2.6 are not source compatible for good reasons. But if the effort for maintaining compatibility between 2.4 and 2.6 in one area is as easy as keeping a header file with some dummy functions it's worth considering. And keeping the compatibility stuff in one file instead of spreaded through the kernel sources makes the cleanup to remove the last occurrences a few years from now easier.

Christoph Hellwig said, "The devfs calls for 2.4 and 2.6 are totally incompatible. And there's a trivial way to support both 2.4 and 2.6 in this area: don't support devfs at all, it always was marked either experimental or deprecated anyway." Adrian was surprised at how much had changed in DevFS during the 2.5 timeframe, and withdrew his objections.

Elsewhere, under the Subject: "[PATCH] Remove devfs from the partition code" (http://groups-beta.google.com/group/fa.linux.kernel/msg/d3d86f6fa5d1530c?hl=en), Greg posted many patches to remove DevFS not only from the partition code, but everywhere. In this thread, Armin Schindler expressed surprise that "the removal will be done in the middle of a stable line..." Adrian reminded him, "According to the current development model, 2.6 is a development kernel..." And Ed Tomlinson added, "The current Linus kernel is 2.6.11.12, where the last .12 is the latest 2.6.11 kernel with VIF (very important fixes) applied." Armin said he was aware of all this, but it was still surprising to see major upheavals like this, without a bump of the minor version number.

6. Linux 2.6.11.12 Released

11 Jun 2005 (2 posts)
Archive Link: "Linux 2.6.11.12"
People: Chris Wright

Chris Wright said:

    We (the -stable team) are announcing the release of the 2.6.11.12 kernel.

    The diffstat and short summary of the fixes are below.

    I'll also be replying to this message with a copy of the patch between 2.6.11.11 and 2.6.11.12, as it is small enough to do so.

    The updated 2.6.11.y git tree can be found at:

    rsync://rsync.kernel.org/pub/scm/linux/kernel/git/gregkh/linux-2.6.11.y.git

    and can be browsed at the normal kernel.org git web browser:

    http://www.kernel.org/git/

7. Linux 2.4 Will Not Support GCC 4

12 Jun 2005 - 13 Jun 2005 (2 posts)
Archive Link: "[PATCH 2.4.31 0/9] gcc4 fixes overview"
People: Mikael Pettersson, Marcelo Tosatti

Mikael Pettersson said:

    This set of patches fixes gcc4 problems in the 2.4.31 kernel's 'core' code. I've been running gcc4-compiled 2.4 kernels for several months on i386, x86_64, and ppc32, and there are currently no known regressions compared to gcc34.

    Note: you'll want to use recent gcc-4.0.1 snapshots as gcc-4.0.0 is known to be broken.

    This set of patches do not include fixes to drivers, file systems, or architectures I don't use myself. I have a preliminary patch kit for those, but as it has received only limited compile testing I'm not submitting it unless these core patches are accepted.

Marcelo Tosatti replied, "I believe its about time for v2.4 to reject such kind of modifications, they can live outside the mainline repository."

8. Linux Docs Viewable On Only Non-Linux Browsers

14 Jun 2005 - 16 Jun 2005 (10 posts)
Archive Link: "Design Level Documentation for the Linux kernel (V2.6)"
People: Nick Newcomb

Nick Newcomb said:

    I'm working with the Software Revolution and I thought you guys might like to know that we just completed the automatic generation of a full, design-level documentation of the LINUX kernel and associated sub-systems.
    This documentation set is made up of hyperlinked graphics and text documents of all the major subsystems and all of the source code fields and functions and is organized by complexity and file-system location. It covers the Linux kernel, memory management, file-system, security, cryptography, initialization, drivers, architecture and interprocess communication subsystems.

    Furthermore, we're offering this for... well free. I just thought it was something maybe you guys could use.

    If you would like to view this information, just go to:

    http://www.softwarerevolution.com/jeneral/open-source-docs.html

Parag Warudkar said he would like to look at the docs, but the page was only viewable with Windows-only browser plugins. Several other folks also had this problem, and discussed workarounds.

Kyle Moffett was able to view the pages from Mac OS X, and liked them very much, but he noticed that the site used a lot of image maps. He suggested that this be changed, so that his favorite browsers, OmniWeb and Safari, would have an easier time with the pages.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.