Kernel Traffic #107 For 16 Feb 2001

By Zack Brown

Introduction

NOTE: The Kernel Traffic and Cousin pages have moved, and the old homepage will no longer be updated. The new homepage is http://kt.zork.net. Please update your bookmarks and web pages.

Kernel Traffic and the Cousins are now available as text versions (see the TOC of any issue). You can sign up to receive the text version each week by email. See the mailing lists page (../lists.html) for that option.

Mailing List Stats For This Week

We looked at 1133 posts in 5398K.

There were 437 different contributors. 195 posted more than once. 179 posted last week too.

1. APIC: The Saga Continues

29 Jan 2001 - 5 Feb 2001 (12 posts) Archive Link: "[patch] 2.4.0, 2.4.0-ac12: APIC lock-ups"

People: Maciej W. Rozycki, Manfred Spraul, Linus Torvalds, Andrew Morton, Gerard Roudier

Maciej W. Rozycki felt he'd traced the recent APIC lockups to some incorrect interrupt masking. He explained what he thought happened in the code, and posted a modification to a patch by Manfred Spraul. Manfred replied that this didn't seem to fix the whole problem. There were no more lockups, he said, but every few minutes several packets were lost (though he acknowledged that his own patch had caused much more packet loss than Maciej's). He felt the patch merely hid the problem without fixing it. Maciej replied that the problem was essentially in the hardware, and that "To fix the bug we'd have to modify the silicon. It's not feasible at this time, so we can only write worse or better workarounds, i.e. hide the bug." He posted a new patch, which Andrew Morton reported passed all his tests. Maciej then asked Linus Torvalds to include the patch in the official kernel, but it was just at this point that Manfred and Gerard Roudier started playing with other workarounds. As a result of some of this poking around, Manfred made a new guess as to the nature of the bug, saying at one point, "If an io apic io redirection entry is unmasked while the irq pin is active, then the io apic sends out the interrupt as edge triggered, but nevertheless sets the IRR bit." There were a few more posts trying various approaches.

2. shm Shared Memory Management Taken Out Of tmpfs

1 Feb 2001 - 6 Feb 2001 (13 posts) Archive Link: "[patch] tmpfs for 2.4.1"

Topics: FS: devfs, FS: sysfs, POSIX

People: Christoph Rohland, J.A. Magallon, H. Peter Anvin

Christoph Rohland posted the latest version of his tmpfs patch, and H. Peter Anvin noticed that part of the patch included removing a large comment:

System V shared memory is now implemented via a virtual filesystem. You do not have to mount it to use it. SYSV shared memory limits are set via /proc/sys/kernel/shm{max,all,mni}. You should mount the filesystem under /dev/shm to be able to use POSIX shared memory. Adding the following line to /etc/fstab should take care of things:

none /dev/shm shm defaults 0 0

Remember to create the directory that you intend to mount shm on if necessary (The entry is automagically created if you use devfs). You can set limits for the number of blocks and inodes used by the filesystem with the mount options nr_blocks and nr_inodes.

H. Peter asked what had happened to this feature, of using tmpfs as a management tool for shared memory. Christoph replied:

Unfortunately we lost this ability in the 2.4.0-test series. SYSV shm now works only on an internal mounted instance and does not link the directory entry to the deleted state of the segment.

IMNSHO the new implementation is so much cleaner that it was worth it. Probably we should fix ipcrm to be more flexible.

J.A. Magallon asked if this meant the shm interface was simply not needed anymore, which would actually improve the ability to switch back and forth between kernels 2.2 and 2.4; but H. Peter felt that the interface had been appealing, and that it was a shame to see it go. Christoph agreed with the sentiment, but confirmed that the interface was not needed. He added, though, "You will need for POSIX shm, but there are not a lot of program out there using it." H. Peter replied that either the shm interface was needed or it wasn't. If it was needed for POSIX shm, then he felt it was needed as part of the package. Christoph confirmed that from that perspective it was indeed needed, but that it was not required for any 2.4 functionality.
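For readers who have not used POSIX shared memory, the following is a minimal sketch of what such a program looks like from user space. It uses Python's standard multiprocessing.shared_memory module (Python 3.8 or later), which on Linux is built on shm_open(); with glibc the resulting objects normally live under /dev/shm, which is why that mount point matters. The function name roundtrip is made up for this illustration and is not part of Christoph's patch or of the discussion.

```python
from multiprocessing import shared_memory


def roundtrip(payload: bytes) -> bytes:
    """Write payload into a new POSIX shared memory segment and read it back.

    Hypothetical helper for illustration only. payload must be non-empty,
    because a segment cannot be created with size 0.
    """
    # Create a new named segment sized to the payload (the name is auto-generated).
    segment = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        segment.buf[:len(payload)] = payload
        # Attach a second handle by name, the way an unrelated process would.
        reader = shared_memory.SharedMemory(name=segment.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        # Close our mapping and remove the name so nothing is left behind.
        segment.close()
        segment.unlink()


if __name__ == "__main__":
    print(roundtrip(b"kernel traffic"))
```

On a system where the shared memory filesystem is not mounted at /dev/shm, the create step would be expected to fail, which matches Christoph's remark that the mount is needed for POSIX shm only.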

3. Filesystem Corruption In 2.4.1

2 Feb 2001 - 5 Feb 2001 (5 posts) Archive Link: "Version 2.4.1 has ext2 problems."

People: Russell King, Richard B. Johnson, Petr Vandrovec

Richard B. Johnson reported that files appearing in the lost+found directory after running e2fsck could not be removed. Petr Vandrovec felt that Richard's fileutils, and possibly his gcc, were too old. As a workaround, he suggested truncating the files to zero length via the '>' shell redirect. Richard replied that this idea also didn't work, and that the final solution had been to remake the entire filesystem. Russell King confirmed the problem, with, "this isn't an isolated incident. I was hitting 2.4.1 hard last night on ARM, and ended up loosing my /usr and /var mountpoints and a few other files to this exact corruption. I resorted to using debugfs to remove these entries, and re-running e2fsck." Richard replied that 2.4.1 seemed to be a buggy version in general.
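As a rough illustration of the workaround Petr suggested, the sketch below truncates an existing file to zero length by opening it with O_TRUNC, which is essentially what the shell does for a '>' redirect (the shell also passes O_CREAT). The helper name truncate_to_zero is invented for this example. As Richard reported, this approach did not help on his corrupted filesystem.

```python
import os
import tempfile


def truncate_to_zero(path):
    """Truncate an existing file to zero length and return its new size.

    Hypothetical helper: opening with O_TRUNC mirrors a shell '> path'
    redirect, except that it does not create a missing file.
    """
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.close(fd)
    return os.stat(path).st_size


if __name__ == "__main__":
    # Demonstrate on a scratch file rather than anything in lost+found.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"stale data")
    print(truncate_to_zero(scratch.name))
    os.remove(scratch.name)
```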

4. Some Ideas Behind Linus And Alan's Kernel Releases

3 Feb 2001 - 6 Feb 2001 (5 posts) Archive Link: "2.4.2-pre1"

Topics: Kernel Release Announcement, PCI, Power Management: ACPI

People: Linus Torvalds, David D.W. Downey, Vojtech Pavlik, Kanoj Sarcar

Linus Torvalds announced 2.4.2-pre1, saying, "Mainly a number of small details and some driver updates. The socket datagram handling one is important, and has already been posted separately here on linux-kernel. The VIA driver update is rather important if you have one of the newer VIA chipsets." He included the changelog with the announcement.

In a private post, David D.W. Downey asked Linus, "How often does Alan's patches get rolled into your main line? I'm having difficulty following the divergence here. I'm trying to run THE latest release(s) of your kernel with applicaple patches. I'm just trying to figure out if everything that is in the ac## line is ALWAYS rolled into your pre## line or not. Which patch sequence am I supposed to follow to have THE most current release of all fixes et. al.?" Linus replied on the list:

Alan tends to have much more experimental patches than I do - and we don't sync up more often than maybe once a month or so. And even then, the sync-up won't be complete, exactly because I don't take the experimental parts (or more accurately, Alan mostly doesn't even try to send them to me and we tend to agree pretty well on what is appropriate and what is not).

We're nearing another sync-point right now - I actually have a lot of Alans patches in my mail-box, and -pre2 will probably contain a lot of the -ac stuff. But don't expect a complete sync, as explained above.

5. New Maintainer For Configure.help

4 Feb 2001 (2 posts) Archive Link: "Configure.help typo fix"

Topics: MAINTAINERS File

People: Jeremy M. Dolan, Alan Cox, Linus Torvalds

Keitaro Yosimura gave Axel Boldt, Alan Cox, and Linus Torvalds a patch to fix a small typo in the 'Configure.help' file, and Jeremy M. Dolan replied, "This might be a good time to mention Axel has passed maintainership of Configure.help to myself. I'm currently working to combine Axel's fork against Linux 2.4.1's Configure.help, which is requiring hand merging a 260 kbyte diff, so it may be a week or two off." He also posted a patch to update the MAINTAINERS file.

6. VIA Disk Corruption: Continued

5 Feb 2001 - 7 Feb 2001 (13 posts) Archive Link: "VIA silent disk corruption - likely fix"

People: Peter Horton, Udo A. Steinberg

For more on the VIA situation, see Issue #106, Section #12 (29 Jan 2001: General Priorities For 2.4; ACPI Unstable In 2.4). This week, Peter Horton found what he thought to be the cause of silent disk corruption on his A7V motherboard, and added, "it might affect all boards with the same North bridge (KT133 etc)." A few folks reported no problems on similar hardware, but there was no discussion. Later, under the Subject: VIA silent disk corruption - patch (http://www.uwsg.indiana.edu/hypermail/linux/kernel/0102.0/0896.html), Peter posted a patch that seemed to fix the problem, at least in his own case. Dale Farnsworth confirmed the problem this time, and Udo A. Steinberg gave a link to an updated BIOS (ftp://ftp.asuscom.de/pub/ASUSCOM/BIOS/Socket_A/VIA_Chipset/Apollo_KT133/A7V/1005D.zip). Peter replied, "Good news here, looks like the new BIOS fixes it (1005D). I've run a heavy test for at least 10 hours without a single blip. The BIOS is set for "optimal". Hoorah!"

7. Status Of Matrox Marvell G400

5 Feb 2001 - 6 Feb 2001 (8 posts) Archive Link: "Matrox Marvell G400"

Topics: Framebuffer, I2C, PCI

People: Gregory Maxwell, Petr Vandrovec, David Woodhouse, Wakko Warner

Wakko Warner asked how well Linux supported the Matrox Marvell G400's capture capabilities and dual head, and there were several replies. Gregory Maxwell said flatly, "Capture and dual head are almost totally unsupported without using a proprietary, binary only driver chunk which will soundly place your system as 'unsupported' as far as this list is concerned due to the difficulty of debugging a system when sourceless software bangs the hardware. If this situation is not ideals for you, I suggest you address the issue with Matrox." Petr Vandrovec also replied to Wakko, explaining that those features of the card could work under X Windows. He said:

Under framebuffer both heads of G400 will work for you if it is your primary video devices. For capture capabilities see http://marvel.sourceforge.net. It for sure worked sometime in the past, but I'm not sure about current state. But I believe that at least watching TV works correctly.

And if you insist on X, you can run first head through mga with usefbdev /dev/fb0 with hwcursor off, and secondary head through fbdev /dev/fb1. But it is not supported by me (and neither by XFree guys AFAIK, not even talking about Matrox support guys) - I support only first head in X and secondary head used for 'fbtv -k'.

Regarding taking the issue up with Matrox, he added, "I'm trying... more or less. Next G450 BIOSes will have fix for matroxfb deadlock on boot, so there is at least some move. Although now when workaround is implemented in matroxfb, it is a bit late..."

David Woodhouse asked, "Petr - how much of the matroxfb code is yours to give, and would you permit chunks of it to be reused under the XFree86 licence? Clean-room reverse-engineering is such a PITA :)" Petr replied:

Initialization code is entirely mine, sometime written with Matrox docs in hand, sometime without; except G100 initialization, which was written with cooperation with others. But proper initialization have to parse BIOS - now when PCI subsystem can enable/disable ROM, maybe I should try it.

Accelerator code was written/enhandced by couple of peoples except me, so it is probably impossible to get it under X - but they have some acceleration already, right ? ;-)

Dualhead code is written entirely by me, and at least some portions are already used in BeOS driver, probably under GPL, but I have no problem releasing code under any other license you can imagine, as long as it does not impose additional restrictions on my (me personally, not future of matroxfb) further work.

There maybe problem that i2c examples were used when writting core of maven driver. But real useful code should not be affected by this.

BTW, http://platan.vc.cvut.cz/~vana/maven/mavenreg.html contains partial MAVEN documentation, as I assembled it more than year ago for my own needs. But it is really partial, as most of TVOut equations are present only in code, and not in `datasheet'. I have some about half year old updates to that datasheet which were submitted by someone who had access to TV signal analyser, but I did not integrate them to HTML yet. And my code does not support original G200 TV Out, only late (non-US, MGA-TVO-C) G200 and G400 are supported for TV.

8. Hotplugging With Regular PCI Cards

6 Feb 2001 - 8 Feb 2001 (4 posts) Archive Link: "hotplugging with regular PCI cards"

Topics: Hot-Plugging, PCI

People: Adam J. Richter, Tim Wright, Christoph Hellwig

Adam J. Richter mentioned excitedly:

I saw an interesting demonstration at LinuxWorld last week. Compaq had a machine that did hot plugging with regular PCI cards (not Compact PCI). If anyone out there is familiar with this machine, I would be interested in knowing what the status is on getting the support for that backplain integrated into the stock kernels.

When that occurs, that will be yet another reason to treat all new style PCI drivers as potentially hot pluggable, even if those cards are not currently available in a CardBus or CompactPCI form, and in particular to change all of the xxx_pci_tbl declarations in PCI drivers that are currently marked as __initdata back to __devinitdata.

Tim Wright added:

I saw the same demo. It's not the machine as such that's interesting. The hotplug is achieved because of the chipset support. In fact the Compaq chipset that supports hotplug PCI is used in quite a few of the IBM Netfinity machines, and, I'm sure, many other servers. I'm going to be testing their code on the Netfinities that I have access to shortly, but see no reason to believe that it shouldn't work. In fact it would be good if anybody with machines using the Compaq hotplug PCI chips would test the code.

As you mention, there is driver work needed, both the change you mention and to make sure that all the drivers are using the newer 2.4 PCI infrastructure in the first place (the hotplug support relies on this).

Jamey Hicks gave a link to a PCI Hotplug For Linux (http://opensource.compaq.com/sourceforge/project/?group_id=13) page. Christoph Hellwig had trouble accessing the server, but by KT press time it seemed to be up and running.

9. Attack On linux-kernel: The Saga Continues

7 Feb 2001 (3 posts) Archive Link: "FYI: Mailing list subscribed to l-k ?"

Topics: Real-Time

People: David S. Miller, Matthew D. Pitts, Dave Jones

Continuing the story covered in Issue #106, Section #19 (2 Feb 2001: Hostile Activity Against The linux-kernel Mailing List), Dave Jones reported that linux-kernel-admin@lists.real-time.com had been sending replies to all his linux-kernel posts, saying it was "awaiting moderator approval". He asked if this was related to the earlier problem, and David S. Miller replied, "Yes, and those bozos (linux-kernel@tachyon.miralink.com) keep resubscribing too. I've removed them both. Blacklisting the miralink.com domain doesn't seem to be stopping the subscriptions so I need to delve further where those are coming from." Matthew D. Pitts said, "So that's why I'm getting duplicate posting on some meesages." A number of other folks reported seeing duplicate postings throughout the week as well.

10. Single Copy Pipe/FIFO Implementation

7 Feb 2001 (4 posts) Archive Link: "single copy pipe/fifo"

Topics: POSIX, SMP

People: Manfred Spraul, David S. Miller

Manfred Spraul announced:

I finished my single copy pipe/fifo implementation.

Main changes:

  1. it's more a rewrite of pipe_read() and pipe_write(). Both functions were a nightmare of nested loops and gotos. I wrote a test app - with the right timing multiple writers on a fifo can race and then they busy loop in the current pipe_write() - adding another set of goto's for single copy is imho a bad idea.
  2. slightly faster for non-zero copy transfers due to the code simplification.
  3. No single copy for exactly 4096 byte writes, only > PIPE_BUF. Single copy (and thus blocking) such writes could trigger bugs in user space apps that errorneously assume that a pipe write of PIPE_BUF bytes after a successful poll(POLLOUT) doesn't block even if O_NONBLOCK is not set. It's not defined in posix or susv2, but no unix version I tested blocks in such writes.
  4. on P II/350 single cpu it's a big win (~+70 % bw_pipe)
  5. if you run 2 instances on a dual cpu P II/350 it's a big win, but if you run only one instance, then the bw_pipe processes will jump from one cpu to the other and it's only a small improvement (~+15%).

I've attached the patch, the test app is at http://colorfullife.com/~manfred/kiopipe/fail.cpp

He invited folks to test it out. David S. Miller had a problem with the code, and said that regarding Manfred's item 5 above, wake_up_interruptible_sync() had been designed to avoid precisely that CPU-hopping behavior. Manfred explained that he'd been aware of this, but that his code covered a case in which wake_up_interruptible_sync() couldn't be used. He listed the sequence of events to show why this was true, and David agreed.
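The user-space assumption behind Manfred's item 3 can be sketched as follows: a program polls a pipe for POLLOUT and then writes exactly PIPE_BUF bytes on a blocking descriptor, expecting the write to return immediately. This is only an illustration of that pattern, not Manfred's test app; the function name write_after_poll is invented here, and select.PIPE_BUF is Python's view of the platform's PIPE_BUF value.

```python
import os
import select


def write_after_poll(chunk_size=select.PIPE_BUF, timeout_ms=1000):
    """Poll a fresh pipe for POLLOUT, then write chunk_size bytes to it.

    Hypothetical helper. Returns (bytes_written, bytes_read_back), or
    (0, 0) if the pipe never became writable within the timeout.
    """
    read_fd, write_fd = os.pipe()  # both ends are blocking (O_NONBLOCK not set)
    try:
        poller = select.poll()
        poller.register(write_fd, select.POLLOUT)
        events = poller.poll(timeout_ms)
        writable = any(fd == write_fd and (ev & select.POLLOUT)
                       for fd, ev in events)
        if not writable:
            return (0, 0)
        # The assumption under discussion: this blocking write of at most
        # PIPE_BUF bytes completes without waiting for a reader.
        written = os.write(write_fd, b"x" * chunk_size)
        data = os.read(read_fd, chunk_size)
        return (written, len(data))
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(write_after_poll())
```

Neither POSIX nor SUSv2 promises this behavior, as Manfred noted; the sketch simply shows why a single-copy path that blocked such writes could surprise existing programs.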

11. Status Of aacraid In 2.4

7 Feb 2001 - 8 Feb 2001 (11 posts) Archive Link: "aacraid 2.4.0 kernel"

People: Jason Ford, Byron Stanoszek, Matt Domsch, Alan Cox

Jason Ford summarized, "I see in the archives of this mailing list that someone was working on the aacraid driver for the 2.4 kernel however that post was almost 2 months old. I know Alan Cox denied inclusion of the driver due to the poor nature it was written for the 2.2 tree. Every post that I have seen so far has just said that Adaptec is working on it." He asked about the status of the driver, and less than half an hour later Byron Stanoszek gave a pointer to a patch (ftp://ftp.winds.org/linux/patches/2.4.1/aacraid-2.4.1-1.0.6.patch) and replied, "While it's totally unofficial, I have a patch for aacraid 1.0.6 for 2.4.1-ac5. I have not tested it yet, but it compiles cleanly. I'd like to hear any results (good or bad) you have on it." Jason replied a couple hours later with the results of his tests. Apparently the patch compiled fine, but had the same problems as the older driver. He posted some output and asked if he'd done anything wrong. Byron replied, "Nope. It looks horribly broken. Oh well.. I guess I'd stick to 2.2.19-pre on the Dell machines for the time being." Elsewhere, Matt Domsch from Dell explained, "Adaptec is still working on it. Basically (and as Jason discovered), the driver and firmware can't handle single I/O requests larger than 64KB. Even when scatter/gathered, if the total is >64KB, it chokes. This was just fine for 2.2.x (no one has ever run into this problem there), but the much-improved block layer of 2.4.x throws larger I/Os at the driver. So, the developers at Adaptec are busy trying to add support to break large requests into smaller chunks, and then gather them back together." He went on:

there are three objectives:

  1. Get and maintain a working 2.2.x driver. Yes, Alan Cox doesn't want to merge this into the stock kernel, so until then, it's available separately, and several distributions have picked it up, such as Red Hat Linux 7.
  2. Get a working 2.4.x driver. Dell and Adaptec both believe this is critical. Again, we don't expect this driver to make it into the 2.4.x stock kernel, it'll be made available separately to those who want it. This is where development time is being spent today. The best I can say here is "we hope to have something soon".
  3. Develop an aacraid driver for both 2.2.x and 2.4.x that will be accepted into the stock kernels. For this to happen, Adaptec engineers will be re-writing the driver from the ground up as a Linux driver. Due to schedule constraints (wanting 2.4.x support sooner rather than later), and because we didn't expect the 64K issue, this has been delayed until 2) is finished. Hopefully the 64K limit will be eradicated then too.

I've made a web page http://domsch.com/linux on which I've posted all the 2.2.x aacraid patches, and where I'll post a 2.4.x patch when it's available. I've also created an announcements-only mailing list http://domsch.com/mailman/listinfo/linux-aacraid-announce which you may subscribe to and receive notices of new driver availability. I've created a developers list http://domsch.com/mailman/listinfo/linux-aacraid-devel for discussion of the driver if you wish to contribute.

Both the web page and mailing lists will likely be moved to a Dell.com server in the near future.
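The idea Matt described, breaking a large scatter/gather request into pieces that each stay within the 64KB limit, can be sketched in a few lines. This is a toy model in Python under stated assumptions, not Adaptec's driver code: a request is represented as a list of (address, length) segments, and the names split_request and MAX_REQUEST_BYTES are invented for the illustration.

```python
MAX_REQUEST_BYTES = 64 * 1024  # the per-request limit mentioned in the thread


def split_request(segments, limit=MAX_REQUEST_BYTES):
    """Split a scatter/gather list into sub-requests of at most limit bytes.

    Hypothetical model: segments is a list of (address, length) pairs and
    limit must be positive. Segments that straddle a boundary are cut in two.
    """
    requests = []
    current = []
    current_bytes = 0
    for address, length in segments:
        while length > 0:
            room = limit - current_bytes
            if room == 0:
                # The current sub-request is full; start a new one.
                requests.append(current)
                current, current_bytes = [], 0
                room = limit
            take = min(length, room)
            current.append((address, take))
            current_bytes += take
            address += take
            length -= take
    if current:
        requests.append(current)
    return requests


if __name__ == "__main__":
    # 40KB + 40KB + 64KB = 144KB in total, so three sub-requests are needed.
    example = [(0, 40960), (100000, 40960), (200000, 65536)]
    for number, request in enumerate(split_request(example), start=1):
        print(number, sum(length for _, length in request), request)
```

A real driver would also have to track completion of each piece and report a single result for the original request, which is the "gather them back together" part of Matt's description and is not modeled here.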

12. Linux 2.4.2-pre2 Released

8 Feb 2001 - 11 Feb 2001 (2 posts) Archive Link: "Linux-2.4.2-pre2"

Topics: Disks: IDE, Disks: SCSI, Kernel Release Announcement, Networking, Raw IO, USB

People: Linus Torvalds, Paul Mackerras, Russell King, Andrew Morton

Linus Torvalds announced 2.4.2-pre2, explaining:

Ok, the patch is reasonably big, mainly due to a new architecture (cris) and some updates to others (arm and mips).

But what's interesting here are actually three very small patches:

The first would only hit you if you used raw IO (and had some unlucky timing etc), and very few people do.

The second can cause disk corruption with pretty much any disk (seen at least on SCSI under heavy load). Not necessarily easy to trigger, but still..

The third can cause disk corruption on IDE disks if you are using PIO writes with multi-mode and irq unmasking enabled.

All three are quite nasty, but not all that easy to trigger (and have been around for ages in the 2.3.x series - which only goes to show you how important it is to have gotten a lot of new testers). Special thanks go to Russell King for debugging the IDE driver thing with some heroic tracing stuff.

I'd like people to test it out a bit before I'll make a real 2.4.2 release, but the three bugs do make it clear that a 2.4.2 will have to happen soonish. The rest of the patches are quite cosmetic in comparison even if they are much bigger..

He also included his changelog.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.