Kernel Traffic #78 For 31 Jul 2000

By Zack Brown

There were a number of important threads about IDE problems, but some of them seem to still be ongoing, so I've moved all the related threads into the same bucket and will cover them together when they're done. Thanks go to Dylan Griffiths for pointing out a number of significant Subject lines in the whole debate.

Mailing List Stats For This Week

We looked at 748 posts in 3152K.

There were 396 different contributors. 139 posted more than once. 131 posted last week too.


1. Reliable MDA Card Detection
9 Jul 2000 - 24 Jul 2000 (27 posts) Subject: "MDA video detection request."
Topics: Assembly
People: Mike A. Harris, James Simmons, Alan Cox, Matan Ziv-Av

Mike A. Harris asked succinctly, "Anyone know how to reliably detect an MDA card (in-kernel) in a system that may have multiple adaptors (PCI/vga/MDA) multihead setup?" Edward Betts pointed him to 'drivers/video/mdacon.c', but James Simmons objected, "The current code to do detect MDA fails for some cards. I tried it on my Matrox Millenium card and it detected it as a mda card. At the time I was also running vgacon and I was testing to see what would happen if mdacon was called with no MDA card. I was hoping it would fail." Soeren Sonnenburg posted some interesting assembly code he'd found somewhere on the net, that attempted to do the detection. Matan Ziv-Av said this wouldn't work on non-Hercules MDA cards. He added that VGA cards might also emulate the behavior the algorithm checked for. Alan Cox replied:

The reliable way to detect an HGA card is to run through the VGA/EGA detect and only if VGA/EGA fails then check that the light pen port is changing value as you would expect. The light pen port check is used heavily - clone Herc cards emulate its behaviour even if they dont have one

This is well documented in some of the PC video books

But Matan pointed out, "The system has both vga card and mda card, so the vga test will not fail. The light pen ports are not available on mda cards, but only on hercules and compatibles."

Later in the thread, James posted a very short patch from Matan, that attempted to correctly identify the presence of MDA cards. He invited testers, and Ferenc Bakonyi reported success with 3 different Hercules cards, and a Riva 128 (Asus V3000) VGA card.
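The disagreement between Alan and Matan comes down to the order of the checks. The following is a minimal sketch of the detect-by-elimination approach Alan describes, written as a simulation rather than real port I/O. The function names, the `toggling_port` stand-in for hardware, and the sample count are all assumptions made for illustration; none of this comes from 'mdacon.c' or from Matan's patch.

```python
def toggling_port():
    """Hypothetical stand-in for a status port whose value keeps changing."""
    state = {"value": 0}

    def read_status():
        state["value"] ^= 0x80  # flip one bit on every read
        return state["value"]

    return read_status


def status_port_changes(read_status, samples=1000):
    """Return True if repeated reads of the port ever differ from the first."""
    first = read_status()
    return any(read_status() != first for _ in range(samples))


def detect_by_elimination(vga_or_ega_found, read_status):
    """Alan's order: VGA/EGA detection first, the port check only if it fails."""
    if vga_or_ega_found:
        return "vga/ega"
    if status_port_changes(read_status):
        return "hercules-compatible"
    return "undetermined"
```

In this sketch, `detect_by_elimination(True, toggling_port())` returns "vga/ega" no matter what the second card does, which is Matan's objection for multihead systems: the port check is never reached when a VGA card is also present.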


2. Filesystem Corruption On Western Digital Hard Drives
13 Jul 2000 - 18 Jul 2000 (12 posts) Subject: "2.4.0-test4 Corrupt filesystems"
People: Andre Hedrick, Roger Larsson

Someone reported filesystem corruption with 2.4.0-test3-pre7 and then again with 2.4.0-test4, and listed some of his system information.

Andre Hedrick replied, "How many FS's have to die before people quit trying to do UltraDMA on WDC drives?" For more on WDC drives, see Issue #54, Section #2 (28 Jan 2000: WDC Drives: Strange Requirements).

Roger Larsson checked out the WDC home page, and said that it looked like newer disks should be OK, and that for the AC22100H it depended on the drive's revision. The original poster gave a pointer to the Western Digital Linux FAQ that seemed to be what Roger had referred to. Andre Hedrick was extremely skeptical, and replied to Roger:

Until they prove it and I have an ata-analyser to observe the traces, I will not believe them. You know it is funny what you get to here and know when you are a full voting member of NCITS-T13-AT-Attachemnt Standards Committee. I believe the folks that I work with on the committee. I also asked WDC point blank about the issue, and got a NOP answer.

They know that all they have to do is call me on my cell and tehy will get me. So unless you are with WDC and will make a public counter statement to the facts and submit proof, web-propaganda does not cut it for me.

Roger posted a link he'd found to Western Digital EIDE Hard Drive Data Transfer Rates. Andre replied to this on the list, CCing Darrin Bulik of Western Digital. He said:

This does make it messy, now.

Hey Darrin,

Here is a chance for WDC to save some face. Talk to me and tell me what to make of this doc? Since the CCC marking is on the outside of drive, will WDC generate a firmware list? I know that all drive makers make OEM products that have things disable, but if this is the case, could someone consider correcting the IDENTIFY table to correspond tp the capabilities shipped not total.

There was no reply.


3. Some Explanation Of Edge-Triggered Versus Level-Triggered Interrupts
14 Jul 2000 - 20 Jul 2000 (5 posts) Subject: "APIC and edge triggered vs level triggered"
Topics: Backward Compatibility, Disks: SCSI
People: Miquel van Smoorenburg, Helge Hafting, Gerard Roudier

Miquel van Smoorenburg tried to get a news server going (he listed his hardware: AMI Megarum II board with 2xPIII/450; 1 GB RAM; dual onboard symbios SCSI; one 9 GB system disk on scsi0; four 18 GB spool disks on scsi1; Linux 2.2.17pre11), but found that the system hung after 5 or 10 minutes. He saw that the symbios chipset had stopped delivering interrupts, and posted this to the linux-scsi mailing list. He reported that Gerard Roudier had replied that this behavior was only to be expected, since APIC PCI interrupts had to be level-triggered rather than edge-triggered. Miquel asked, "Is there a way in the Linux PCI kernel code to change an interrupt from edge to level triggered? I had a look at the PCI code, but it seems that it just takes it all from the BIOS and that there's no way to change it." Two and a half days later he replied to himself:

I found the solution myself this weekend. It appears that the BIOS reports the interrupts as edge triggered, while in reality they are level triggered. So we need a way to override the information from the BIOS.

Jos van de Ven posted such a patch on the linux-smp mailinglist about a year ago. Fortunately the patch still applies and solves my problem completely.

See for Jos' posting and the patch.

Someone asked for an explanation of edge-triggered versus level-triggered interrupts, and Helge Hafting explained:

Edge triggered: The interrupt line voltage goes like this:
I.e. normally low, and a pulse to signal the interrupt.

This makes interrupt sharing hard, as two devices could do this simultaneously resulting in only one interrupt, leaving you to wonder if one or several devices need attention. ISA devices works like this. Two simultaneous interrupts may seem unlikely, but keep in mind that this problem also occur if the second interrupt happens while the first is being serviced. Some interrupt handlers takes a long time, and some devices makes lots of interrupts. Having one of either kind sharing is a recipe for disaster.

Level triggered: The interrupt voltage goes like this:
I.e. normally low, but it goes high and remains high until the device driver/kernel turns it off. This makes interrupt sharing easier. Two or more devices sharing an interrupt can both pull the voltage up. One of them will have its device driver called. The driver will do whatever the device needs, and turn off its interrupt signal. But the signal will remain high because of the other device that isn't serviced yet. So a new interrupt happens as soon as the first completes. Now, the first driver see that its device isn't active, so control is passed to the next device sharing the interrupt. This one services its device, and turn off its interrupt signal. The interrupt line goes low again when all active sharing devices have been serviced, and normal operation continues. PCI devices normally use level-triggered interrupts. Some bioses allow edge-triggered operation for backward compatibility (i.e. a PCI card supposed to work with existing drivers for ISA cards.)
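Helge's two cases can be played out in a toy simulation. This is only a sketch, not kernel code: the `Device` class, the function names, and the rule that each interrupt services exactly one device (following Helge's "one of them will have its device driver called") are assumptions made for illustration.

```python
class Device:
    def __init__(self, name, pending=False):
        self.name = name
        self.pending = pending  # True while the device still needs service


def both_pending():
    """Two devices sharing one line, both needing attention at the same time."""
    return [Device("scsi0", True), Device("scsi1", True)]


def service_one(devices):
    """Service the first device that needs attention and return its name."""
    for dev in devices:
        if dev.pending:
            dev.pending = False
            return dev.name
    return None


def run_edge(devices):
    """Edge-triggered: one interrupt per low-to-high transition of the line."""
    # All pending devices pulse the shared line at the same moment, so the
    # controller sees a single transition and delivers a single interrupt.
    interrupts = 1 if any(d.pending for d in devices) else 0
    serviced = []
    for _ in range(interrupts):
        serviced.append(service_one(devices))
    return serviced


def run_level(devices):
    """Level-triggered: interrupts keep arriving while the line stays high."""
    serviced = []
    while any(d.pending for d in devices):  # some device still holds the line up
        serviced.append(service_one(devices))
    return serviced
```

With both devices pending, `run_edge` services only the first one and leaves the second waiting with no further interrupt to wake it, while `run_level` keeps going until the line drops and services both.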


4. Japanese-Encoded Spam And linux-kernel Policy
16 Jul 2000 - 24 Jul 2000 (28 posts) Subject: "F*ck*ng japanese garbage postings and possible HACK."
Topics: Spam
People: Aaron Lehmann, Rik van Riel, Johan Kullstam, Andrew van der Stock, Mike A. Harris

Mike A. Harris complained about Japanese-encoded spam hitting linux-kernel. Aaron Lehmann suggested, "Why not just get the list admin to require posters to be subscribed?? lkml is one of the most backwards mailing lists I've ever been on, and AFAIK anyone can spam it without being subscribed (in which case they could easily be unsubscribed and therefore blocked)." But Rik van Riel explained, "Because that would mean we could no longer get bug reports from non-subscribers. Also, some sites (eg transmeta) run an internal (read-only) linux-kernel newsgroup because some people find it easier to deal with the volume in news clients."

Johan Kullstam pointed out that the Japanese 8-bit characters caused weird errors in certain applications. He said, "check out how kernelnotes mail listing in netscape goes nuts after it hits one of these japanese spam land mines. you can still read it, but the subjects are in italics and each post has a bunch of empty lines after it," and gave a pointer to an example. Several people pointed out that this was just a bug in the way the page was set up. Andrew van der Stock remarked pointedly, "kernelnotes is broken - it doesn't deal properly with the language of over 126 million people." He went on, "Spam is evil (and spammers are first against the wall when the revolution comes), but this thread is missing the point. Spam != language support. Many recipients will now block Japanese posts due to a persistent and annoying spammer. This is - in my opinion - monumentally stupid."


5. Some Explanation Of Elevator Code
17 Jul 2000 - 18 Jul 2000 (3 posts) Subject: "elevator algorithm questions"
People: Vitez Gabor, Jens Axboe, Chris Wedgwood

Vitez Gabor noticed in 'll_rw_blk.c' that the elevator algorithm, used to minimize drive head movement while writing, "is used for all kind of block devices, including hardware-raid controllers (like IBM ServerRaid and Compaq SMART2). It also seems to me that you are not using the head position for block write sequence reordering, but the linear address of the blocks." He asked if these conclusions were correct, and Jens Axboe replied, "Yes. But low level drivers are free to override that choice by either defining their own I/O scheduler or use the no-op scheduler provided by the kernel as well." Chris Wedgwood added, "With many modern drives you don't know the disks position; there is all sorts of cunning gymnastics going on inside the drive, you just access it by sector number -- legacy CHS access doesn't describe real-life especially on large disks which may contain many zones (regions where the number of sectors per cylinder is different from elsewhere on the platter)."
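To make the idea of reordering by linear block address concrete, here is a simplified one-way elevator. It is a sketch for illustration only and is not the code in 'll_rw_blk.c'; the function names and the sample sector numbers are assumptions.

```python
def elevator_order(pending, last_sector):
    """Order requests by linear sector number, sweeping upward then wrapping.

    Requests at or beyond the last dispatched sector are served first in
    ascending order; the rest follow, again in ascending order.
    """
    ahead = sorted(s for s in pending if s >= last_sector)
    behind = sorted(s for s in pending if s < last_sector)
    return ahead + behind


def seek_distance(order, start):
    """Total distance travelled, measured in linear sector numbers."""
    total, position = 0, start
    for sector in order:
        total += abs(sector - position)
        position = sector
    return total
```

For pending requests `[90, 10, 55, 70, 20]` and a last dispatched sector of 50, `elevator_order` yields `[55, 70, 90, 10, 20]`, which covers 130 sectors of linear distance against 230 for arrival order. As Chris notes, that distance is measured in sector numbers, which need not match what the head actually does inside a modern drive.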

For more on the elevator algorithm, check out Issue #64, Section #3 (10 Apr 2000: Cornering A Slowdown) and Issue #67, Section #3 (28 Apr 2000: Modularizing Elevator Code).


6. Phasing Out Kernel-Based IP Configuration
18 Jul 2000 - 20 Jul 2000 (12 posts) Subject: "DHCP in the kernel"
Topics: FS: NFS
People: Alexandre STEFANI, Andrzej Krzysztofowicz, Robert M. Love, Werner Almesberger

Alexandre STEFANI reported that "On the 2.2.16 kernel, I could choose "IP; kernel level autoconfiguration" and then choose "DHCP", "BOOTP" or "ARP". I made successfull experiments with BOOTP and DHCP," and added, "I tried the 2.3 and 2.4 kernels, but this feature seems to have disapear. Does anyone know why or knows if DHCP autoconfiguration is abandonned forever." Andrzej Krzysztofowicz explained, "It did not disappear. It appeared in 2.2 very late and has never been ported to 2.3 / 2.4." Robert M. Love gave a different take:

IP configuration, such as that, has been relegated to userland, where it belongs. There exists (somewhere, i am sure) a userspace dhcp client for IP configuration.

So, yes, kernel-level IP autoconfiguration is gone forever -- but the alternative userspace implementation is the Right Thing.

Someone replied, wanting to ensure the possibility of mounting the root filesystem over NFS, which was possible in 2.2; alternatively, the poster asked if there were any other options for diskless workstations. Eric Lammerts put the poster's fears to rest, saying that yes, root filesystem over NFS was still possible in 2.4, but he added that he felt userspace solutions for bootp/DHCP and NFSroot would be better. And later in the thread, Werner Almesberger remarked, "Kernel-based NFSroot isn't dead yet, but I'd consider it an endangered species."


7. Forcing Partition 'umount'
18 Jul 2000 (7 posts) Subject: "[patch-2.4.0-test5-pre1] nullfs and forced umount"
Topics: SMP
People: Tigran Aivazian, Manfred Spraul

Continuing his train of thought from Issue #76, Section #2 (23 Jun 2000: Closing The File Descriptors Of Arbitrary Processes), Tigran Aivazian posted a patch to implement the 'nullfs' filesystem, which would support generic forced 'umount' of active partitions. He added:

There are known problems with the patch:

  • access to tsk->files->fd[fd] inside disable_fd() is not SMP safe. I will probably need to take tsk->files_lock. Still thinking about it.
  • the mnt_count accounting works for 1-1 sb<->mnt case but seems to break randomly when there are multiple mounted instances of a filesystem and some of them are forcibly umounted.

Of course I will not send this to Linus until these (and anything else you find) problems are fixed but it is better to release early so - I hope to have your feedback.

Manfred Spraul had some criticism of the patch, and remarked, "IMHO the design is flawed: you can't kill an inode while another thread is working with that inode." They went back and forth on various technical points, and Tigran eventually said, "I will redo the patch with all the above in mind. Tomorrow I will be thinking about heavier issues you discovered."
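The first known problem Tigran lists, unlocked access to the descriptor table, can be illustrated with a userspace analogy. This is only a sketch and is unrelated to the actual patch: `files_lock` and `disable_fd` borrow their names from Tigran's message, while `FileTable`, `NullFile` and everything else are hypothetical.

```python
import io
import threading


class NullFile:
    """Stand-in for a 'nullfs' file: every operation on it fails."""

    def read(self, size=-1):
        raise OSError("file was forcibly closed")


class FileTable:
    def __init__(self):
        self.files_lock = threading.Lock()  # guards every access to self.fd
        self.fd = {}

    def install(self, fd, fileobj):
        with self.files_lock:
            self.fd[fd] = fileobj

    def disable_fd(self, fd):
        """Swap the entry for a NullFile under the lock; return the old file."""
        with self.files_lock:
            old = self.fd.get(fd)
            if old is None:
                return None
            self.fd[fd] = NullFile()
            return old

    def read(self, fd, size=-1):
        with self.files_lock:
            fileobj = self.fd[fd]  # look the entry up under the lock
        return fileobj.read(size)  # the I/O itself runs without the lock


table = FileTable()
table.install(3, io.BytesIO(b"data"))
first = table.read(3, 2)   # ordinary read through the table
old = table.disable_fd(3)  # forced close: fd 3 now refers to a NullFile
```

The lock only makes the table swap itself safe. A thread that fetched the old file object before the swap can still be using it afterwards, which is the kind of problem Manfred's objection points at.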


8. Confusion Over 'ext2' Maintainership
19 Jul 2000 - 22 Jul 2000 (4 posts) Subject: "BUG in fs/ext2/super.c"
Topics: FS: ext2, MAINTAINERS File
People: Alexander Viro, Stephen C. Tweedie, Andreas Gruenbacher, Theodore Y. Ts'o

Andreas Gruenbacher reported that recent 2.4-test kernels had a bug in 'fs/ext2/super.c', in which the 'parse_options()' function would be called with a pointer to an uninitialized 'new_mount_opt' variable. He posted a one-line patch against 2.4.0-test4 and added that he'd reported this earlier to Alexander Viro and been ignored. He urged folks to test the patch and convince Alexander to accept it.

Alexander replied, "Convince *whom*???" He did a quick 'grep' of the 'MAINTAINERS' file, and pointed out that Remy Card was listed as the 'ext2' filesystem maintainer. He groused:

While Remy seems to be inactive these days, trivial search through archives will show that ext2 is de-facto maintained by tytso and SCT.

Gentlemen, could we _PLEASE_ kill this "Al Viro controls all filesystems" myth? It is Not True. False. !1. NIL. Whatever it is spelled in your language of choice.

When you see a bug in filesystem, SEND PATCHES TO PEOPLE WORKING ON THAT FILESYSTEM. Everyone will be happier that way. Look: I'm not Ted, I'm not Stephen, I'm not Remy and I'm not Linus. We are 5 different people. And only Linus has a power to force something into the tree against the will of maintainers, so convincing anybody other than him and maintainers is an exercise in futility. I don't have such power, never pretended to have it and do not want it. Period. Is it _that_ hard to understand?

<shudder> patch looks sane, BTW, but I didn't look much into that area. And yes, when I find a bug in ext2 I send patch to maintainers. Honest. I can bounce your posting to them, indeed, but WTF?

Andreas apologized and scuttled the patch over to Theodore Y. Ts'o and Stephen C. Tweedie. Stephen replied, "I've already seen it, but I'm away from home right now -- I'll deal with it once I'm back." End Of Thread.







We Hope You Enjoy Kernel Traffic

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.