Kernel Traffic

Kernel Traffic #108 For 23 Feb 2001

By Zack Brown

Mailing List Stats For This Week

We looked at 1423 posts in 6616K.

There were 470 different contributors. 222 posted more than once. 164 posted last week too.

1. Hot-Swapping CPUs In 2.4.1

4 Feb 2001 - 11 Feb 2001 (9 posts) Archive Link: "[PATCH] Hot swap CPU support for 2.4.1"

Topics: FS: sysfs, SMP

People: Anton Blanchard, Lars Marowsky-Bree, Rusty Russell

Rusty Russell announced that he and Anton Blanchard had written a patch to allow hot-swapping CPUs in 2.4.1; the patch enabled root users to bring down a CPU with a simple echo 0 > /proc/sys/cpu/0/online command, or bring one up with echo 1 > /proc/sys/cpu/0/online. He added that this would only work on PowerPC machines at the moment. Lars Marowsky-Bree asked what would be needed in order to add new CPUs, instead of simply turning existing ones on and off. Anton replied, "In order to bring a new cpu up you will need to duplicate a lot of the stuff in smp_boot_cpus or else just set up all NR_CPUS of these structures (eg NR_CPUS idle threads etc) at boot time." Elsewhere, Matthew Fredrickson asked if he'd need any special hardware to experiment with the patch, and Anton replied:

You should be able to run it on any SMP machine assuming you write the arch specific code (PPC could be used as an example). Of course it isn't very interesting if the hardware doesn't support hot swap :)

As soon as I get the SMP ultra booting again (I arrived one morning to hear the disk was making loud grinding noises) I'll code up sparc (ie E10K) support. It sounds like S390 support will be trivial, I'd love to get my hands on one of those :)
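The echo interface described in the announcement could be wrapped in a small helper. What follows is only a sketch: the /proc/sys/cpu/<n>/online layout is taken from the announcement and exists only on a kernel carrying that patch, while the function name set_cpu_online and the CPU_CTL_DIR override are hypothetical, added so the helper can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: the /proc/sys/cpu/<n>/online layout comes from the
# announcement and is present only on a kernel with the hot-swap patch.
# CPU_CTL_DIR is a hypothetical override for testing elsewhere.
CPU_CTL_DIR="${CPU_CTL_DIR:-/proc/sys/cpu}"

# set_cpu_online CPU STATE
#   CPU   - CPU number, e.g. 0
#   STATE - 0 to bring the CPU down, 1 to bring it up
set_cpu_online() {
    cpu="$1"
    state="$2"

    # Only the two values shown in the announcement are accepted.
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac

    ctl="$CPU_CTL_DIR/$cpu/online"

    # Refuse to write if the control file is missing or not writable
    # (unpatched kernel, wrong CPU number, or not running as root).
    if [ ! -w "$ctl" ]; then
        echo "no writable control file at $ctl" >&2
        return 1
    fi

    echo "$state" > "$ctl"
}
```

On a patched kernel, root would run `set_cpu_online 0 0` to take CPU 0 down and `set_cpu_online 0 1` to bring it back.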

2. Summary Of Reiserfs Problems In 2.4

7 Feb 2001 - 12 Feb 2001 (46 posts) Archive Link: "Apparent instability of reiserfs on 2.4.1"

Topics: FS: NFS, FS: ReiserFS

People: Hans Reiser, Chris Mason, David Rees, Daniel Stone

Hans Reiser reported on the recent reiserfs problems:

I know that our number of users has increased, but I doubt that the increase is sufficient to match the marked increase in bug reports on reiserfs-list. Please be patient as we work on this. We will issue a patch this week that will fix some bugs (NFS i_generation count losing, and space leakage on crash due to preallocated blocks being lost).

We will also change the default for mkreiserfs to creating the new 2.4 only format, as this (we have belatedly realized) is probably the cause of many users reporting they can't create large files.

We have a bug affecting add_entry which we suspect is due to our rename not being adequately atomic and leaving hidden directory entries in the filesystem, and we are exploring how this might happen (improper journaling, we don't yet know....) Treat this description with the usual skepticism attached to any explanation of a bug not fixed yet, our diagnosing continues.... This is the most worrisome bug for us stability wise. It seems ~ a user a day encounters it.

This patch for sure also won't fix the zeros getting added to syslog files bug which we are desperate to learn how to reproduce at our site.

Chris Mason added:

how about we list the known bugs:

zeros in log files, apparently only between bytes 2048 and 4096 (not reproduced yet).

preallocated block leak on crash (fix in testing)

hidden directory entry cleanup (still reproducing, very hard to hit).

knfsd (patches in testing).

oops in reiserfs_symlink, create_virtual_node (bug in redhat gcc 2.96, fixed by downloading the update).

We've also had a few reports of other corruptions, most of which have been traced to hardware problems. There are two where I'm not sure of the cause yet, but the method to trigger the bug was too simple to not be a hardware problem.

Regarding the filesystem corruption, David Rees asked if it might be related to the corruption reported on systems with VIA chipsets. For the details of that situation, see Issue #107, Section #6 (5 Feb 2001: VIA Disk Corruption: Continued). Chris replied that no, the corruption reports in this case happened on other chipsets as well. He figured the problem was really in reiserfs.

Daniel Stone reported some mbox corruption, and various folks dug around for the cause. At one point, Chris remarked, "I suspect the bugfixes in pre2 will fix some of the more exotic corruption reports we've seen, but this one (nulls in log files) probably isn't caused by a random (or semi-random) lower layer corruption. These users are not seeing random metadata corruption, so I suspect this bug is different (and reiserfs specific)." Hans asked for some clarification, and Chris summarized, "Ok, I'll try again ;-) People have been seeing null bytes in data files on reiserfs. They see this without seeing any other corruption of any kind, and they only see it on files of very specific sizes. They see this without crashing, and without hard drive suspend kicking in. They see it on scsi and ide, on servers and laptops."

3. Boot Messages Vs. Animated Logo

7 Feb 2001 - 12 Feb 2001 (25 posts) Archive Link: "Re: [ANNOUNCE] Animated framebuffer logo for 2.4.1"

Topics: Framebuffer, Small Systems

People: Pavel Machek, Christophe Barbe, Mike Galbraith, Adrian Cox, Miles Lane

In response to a recent announcement of an animated framebuffer boot-logo, Pavel Machek remarked wryly, "Long time ago I joked that win2000 will have 30-minute film at the bootup. [3.1 had picture, 95+ had static logo with moving line...] And now it looks like _linux_ is getting that feature..." Christophe Barbe pointed out that a nice boot-logo might relax Linux newcomers who might have been afraid to see all those boot messages scrolling past. He added:

I use LPP (linux patch progress). It's a little patch. The main idea is : redirect all boot messages on the second console, display on the first one a bigger framebuffer logo (screen size) and draw on it the progress bar, progress text and warning messages. A proc interface is provided for the second part of the boot process (echo "starting X Font Server" > /proc/progress).

The boot is not significantly longer (and with a well fitted kernel, is really faster than M$ Wx) and suddendly the first linux impression is really good.

He said he hoped the patch would make it into the kernel, but Miles Lane felt it should be optional, if included at all; and Mike Galbraith came down even harder on it, saying, "I hope that nothing like this is _ever_ integrated (and doubt I need be concerned;). IMHO, hiding output from users arrogantly assumes that they are too stupid/ignorant to have any use for such information." Christophe argued that most users weren't interested in the internals of their boot process, but that in any case, "there is no need to be ignorant. With LPP, messages are displayed during the boot process and if something goes wrong an little picture inform you. And you can switch to the classic console when you want (by a simple CTRL-ALT-F2)." Mike replied that he felt most folks would be interested in seeing the boot-messages. Adrian Cox replied with a different take, "I want to use this for embedded systems. For example, last weekend I was on a bus where the advertising screen at the front went through a complete (uncustomised) Windows 2000 boot. I want to do better than that, and build an application specific splash screen early into the boot process, with the detailed messages coming out through the serial port." There was no reply.
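The /proc/progress interface Christophe mentions could be driven from an init script along the following lines. This is a sketch, not LPP's own code: /proc/progress is the only name taken from his description, and it exists only on a kernel carrying the LPP patch. The report_progress function, the PROGRESS_FILE override, and the fallback to the ordinary console are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: /proc/progress is the interface named in Christophe's
# description and exists only on a kernel with the LPP patch.
# PROGRESS_FILE is a hypothetical override for trying this elsewhere.
PROGRESS_FILE="${PROGRESS_FILE:-/proc/progress}"

# report_progress MESSAGE
#   Sends MESSAGE to the boot screen if the progress file is writable,
#   otherwise prints it on the ordinary console.
report_progress() {
    msg="$1"
    if [ -w "$PROGRESS_FILE" ]; then
        echo "$msg" > "$PROGRESS_FILE"
    else
        echo "$msg"
    fi
}
```

An init script would call `report_progress "starting X Font Server"` just before launching each service. The fallback keeps the same script usable on a kernel without the patch.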

4. The VM Subsystem In 2.4

8 Feb 2001 - 18 Feb 2001 (28 posts) Archive Link: "Linux 2.4.1-ac7"

Topics: Bug Tracking, Kernel Release Announcement, Virtual Memory

People: Rik van Riel, Marcelo Tosatti, Alan Cox

Alan Cox announced 2.4.1-ac7, which included some virtual memory rebalancing code from Rik van Riel, intended to give a good speedup, especially on lower memory machines. Rik added:

I'd really like feedback from people when it comes to this change. The change /should/ fix most paging performance bugs because it makes kswapd do the right amount of work in order to solve the free memory shortage every time it is run.

This, in turn, should make it far less likely that user processes will *ever* need to call try_to_free_pages() themselves, unless the system really goes into overload mode.

It would be good to know if this change really fixes the bug or if it only helps for certain workloads and not for others. I'd really like to close the following bug but need confirmation that it works first ;)

Several folks replied with reports of bad problems, either OOM (out of memory) lockups, or just huge amounts of swapping. Rik and Marcelo Tosatti banged around on the code, and Marcelo presented several new patches, but a significant problem was the fact that it's really impossible to tune any Virtual Memory manager "correctly" for all cases; and so tuning for any particular case can leave other cases out in the cold. This also tends to make it difficult to distinguish between issues related to tuning, and actual bugs. At one point Rik remarked:

so we're back to the old VM magic number game again ;(

In short, we have to be more agressive towards unmapped cache pages than towards mapped pages in processes, except that this horribly breaks down when somebody does streaming IO using mmap while somebody else is at the same time re-using data from cached files (say, .h files)...

Now the question is ... WHY do we need to change this behaviour and HOW exactly should it be changed ?

I don't really feel comfortable just tweaking stuff until we get a half-dozen benchmarks right, I think we need to understand what is happening and change things accordingly.

It's fine with me to put some temporary thing in place to get at least -ac5 behaviour back, but I don't think we should have this as a long-term thing.

5. Lost Keypresses In 2.4.1

13 Feb 2001 (12 posts) Archive Link: "lost charaters -- this is becoming annoying!"

People: Alan Cox, Tigran Aivazian, Andrew Morton

Tigran Aivazian noticed that 2.4.1 would lose keystrokes on his Dell Latitude CPx. The same machine under 2.2.x was fine. Alan Cox said that 2.2 and 2.4 handled keyboard error cases quite differently; and asked Tigran to try 2.2.18 or the 2.2.19 pre-patches, adding, "Those if my first guess is right will behave like 2.4 does to you." Tigran tried 2.2.19pre9, but was unable to reproduce the problem.

Elsewhere, Tigran added, in response to a question from Andrew Morton, that the lost keystrokes only occurred when the laptop was in its docking station. He eliminated X as the culprit, because he still lost characters under 2.4.1 in console mode. Andrew tried using an external keyboard on his own Dell Latitude, and could not reproduce the problem. Elsewhere, Ulf Carlsson reported a related problem on identical hardware, where only his Caps-Lock keypresses were being lost. The thread ended inconclusively.

6. Kernel Debugger In 2.4.x?

13 Feb 2001 (3 posts) Archive Link: "To Linus: kdb in 2.4?"

Topics: FS: NTFS, User-Mode Linux

People: Linda Walsh, Jeff Dike

Linda Walsh argued the case for including the kdb kernel debugger as part of the standard 2.4 tree. She said:

I'm thinking that it could be a great teaching tool to break and examine structures, variables, process states, as well as an aid to people who may not have a grasp of the entire kernel but need to write device drivers.

It's easy for someone who's "grown up" with Linux to know it all so thoroughly that such a tool seems fluff. But even the best mechanics on new cars use complex diagnostic tools to do car repair. Sure there may be experts that designed the engine that wouldn't need it, but large numbers of people need to repair cars or modify them for their purposes. Having tools to aid in that isn't so much a crutch as it is a learning tool. It's like being able to look at the characters of the alphabet individually before one learns to comprehend the entirety of the writings of Buddha.

Certainly Buddha doesn't need to know how to read to know his own writings -- and certainly, if everyone meditates and 'evolves' to their Buddha nature, they wouldn't need to read the texts or recognize the letters either.

But not everyone is at the same place on the mountain (or even the same mountain, for that matter).

In wisdom, one would, I posit, understand others are in different places and may find it useful to have tools to learn to read before they comprehend.

Someone gave a pointer to Issue #87, Section #1 (2 Sep 2000: Possible GPL Violations By Microsoft; Kernel Debugger In Official Sources), and replied that it was best to "not nudge sleeping penguins." But Jeff Dike said he was highly sympathetic to Linda's point, "assuming that a kernel debugger doesn't change the kernel's behavior." But he suggested that folks wanting a native kernel debugger should check out User Mode Linux. He explained, "A number of kernel hackers are very successfully using UML for doing filesystem and mm development and debugging. With some help from the host, it's also possible to do driver development under UML. I also know of a number of people using UML to further their education by using it to poke around a running kernel."

There was no reply.

7. New Filesystem Corruption In 2.4.2-pre2

13 Feb 2001 (4 posts) Archive Link: "2.4.2-pre2 ext2fs corruption"

People: Alex Romosan, Alan Cox

Alex Romosan experienced some massive filesystem corruption after a crash under 2.4.2-pre2, and posted some logs. Alan Cox asked for some hardware information, and Alex replied, "intel piii, with an adaptec AHA-294X Ultra2 scsi adapter. the disk in question is a 9gb IBM disk Model: DNES-309170W Rev: SA30." Alan added this information to his growing pool of data on these recent corruption issues, and mentioned, "doesnt tally with other corruption reports (other aic7xxx reports with the older driver in the non-ac tree are of the it doesnt work/hung variety)". There was no reply.

8. Video Drivers In The Kernel

13 Feb 2001 - 16 Feb 2001 (10 posts) Archive Link: "Video drivers and the kernel"

Topics: FS: NFS, Virtual Memory

People: Louis Garcia, Jeff Garzik, Albert D. Cahalan, Timur Tabi

Louis Garcia suggested (CCing the XFree86 developers mailing list) adding video drivers to the kernel, instead of just letting the X Window System deal with all the video hardware. He said, "if video drivers were part of the kernel and had a nice API for X or any other windowing system, would not only improve performance but would allow competing windowing systems without having to develop drivers for each. Has anyone thought or rejected this idea?" Jeff Garzik replied briefly, "See linux/drivers/video and linux/drivers/char/drm in kernel 2.4." There was no reply to that, but elsewhere Mark Vojkovich (from the XFree86 team) suggested immediately terminating the thread in order to avoid Flames O' Death. Elsewhere, Albert D. Cahalan braved the heat, saying:

Problem is, X is a big old wad of code. It wasn't designed to run in a kernel environment. It isn't easy to rewrite, and getting rid of it isn't currently reasonable for normal desktop Linux systems.

So then what, split X, with only the hardware access in the kernel? This can actually reduce performance, by a small or great amount depending on how it is done. Stability would improve a bit, assuming the new drivers have Linux quality rather than XFree86 quality. The gain is tiny, while the difficulty is large. At least we'd get a safe and reliable way to print an oops though.

Both options could eat some memory. (but NOT anything like the VM size of an X server, much of which is the video memory itself) Putting the whole thing in the kernel does allow for memory pressure hooks though.

Both options cause political troubles. Currently the X server is shared with OS/2 and other crummy systems. If the Linux kernel had serious video drivers for PC hardware, then driver support for the other operating systems would mostly go away. Linux would become a better desktop OS, at the expense of various crummy systems.

Both options would tend to hurt people who like to leave X running on a low-memory web or NFS server. For a kernel X server, swapping must be done more-or-less explicitly.

Both options cause more work for Linus. This totally kills the idea. See his past postings flaming the GGI/KGI developers.

If you ever write this, go ahead and throw in the rest. I mean the window manager, xterm, and a GDK system call even. My hardware can spare the memory, but CPU cycles are way too scarce. Clean design can go screw itself when it eats CPU time. Don't worry about being accepted into the main kernel, because that won't happen no matter what you do. Have fun hacking, and whip XFree86's ass.

Timur Tabi replied, "just because the drivers move into the kernel doesn't mean that other OS's can't be supported. A video driver could be compiled for the kernel on Linux, but be compiled as something else for other OS's. In fact, on OS/2, a special driver is provided with XFree86 that effectively allows the X Server to run with the same capabilities as an OS/2 device driver. In fact, by strict standards, it's a security and reliability loophole, but it still works pretty well."

There was not much discussion.

9. Kernel Autoconfiguration Utility v.

15 Feb 2001 (4 posts) Archive Link: "[ANNONCE] Kernel Autoconfiguration utility v."

Topics: Kernel Build System, PCI

People: Giacomo Catenazzi, William Stearns, Andreas Schwab, Andrey Panin

Giacomo Catenazzi announced a version of his kernel autoconfig utility, a tool to help any user detect hardware and configure the kernel appropriately (though of course, only the root user could install the kernel once it had been configured). Since the project was still in the test phase, it would only output the proposed configuration, as opposed to actually changing the configuration automatically. He listed the items in his hardware database:

He also added:

I need some help:

I will do:

Andrey Panin offered some information on particular hardware that he felt Giacomo was misdetecting. Elsewhere, William Stearns was pleased by the project, but pointed out that currently, the main script required bash2 in order to run. He posted a patch to enable the script to run under either bash2 or bash1, but Andreas Schwab replied that William's patch was completely wrong. There was no further discussion.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.