Kernel Traffic #109 For 2 Mar 2001

By Zack Brown

Mailing List Stats For This Week

We looked at 1254 posts in 5390K.

There were 434 different contributors. 217 posted more than once. 172 posted last week too.

 

1. New Starfire Driver Maintainer
7 Feb 2001 - 19 Feb 2001 (51 posts) Archive Link: "[PATCH] starfire reads irq before pci_enable_device."
Topics: BSD
People: Ion Badulescu, Jeff Garzik, David S. Miller

In the course of discussion, Ion Badulescu announced, "Well, I decided to bite the bullet and port my zerocopy starfire changes to the official tree, properly ifdef'ed. So here it goes, the patch was made against 2.4.1 vanilla and includes all the fixes from Jeff and myself that were sent to the list so far." [...] "Alan, if you want, I can rediff this against 2.4.1-ac. I'll also try and send a 2.2.19 patch shortly." He also added himself to the maintainers file as the "STARFIRE/DURALAN NETWORK DRIVER" maintainer. Jeff Garzik replied, "If you've got the hardware and time, I'm always happy to see someone step up .. I must confess that I haven't seen much of your work to date, however." Ion replied, ".. the hardware, the docs, the time, and the day-to-day duty to maintain the starfire driver (and the eepro100 driver) for an older version of BSDI. It's the job that pays my salary..." Jeff said, "excellent :)" and that was that.

Jeff felt that instead of porting the zerocopy patches into the main tree immediately, Ion should let them live in David S. Miller's external patch until they were ready to be fully merged. He said, "Zerocopy is still changing and being actively debugged, so it is possible that we might have to patch starfire.c again with zerocopy updates, before the final patch makes it to Linus." Ion reminded him that zerocopy support was optional in the driver, and would be easy to remove if David's patch was eventually rejected from the kernel. Jeff explained, "If you have some code that will not work at all in the current tree, it should not be in the current tree. If Alan or Linus applies a starfire.c patch that includes ZEROCOPY support while the tree as a whole does not include such support, you are effectively including a developer-local change in the global tree. With your patch but without zercopy infrastructure, defining ZEROCOPY is completely pointless without an additional, experimental patch." Ion replied that "It's an issue of maintainer convenience vs. esthetics. And (last but not least) it's also about other people's ability to easily make changes to the driver, changes they can understand and test." But Jeff said, "Remember: we are in a stable series of kernels. This is experimental code. Maintain a separate branch of development like everyone else. :) Yes it's a bit more effort, but that's what being a maintainer is all about. The kernel needs a -stable- starfire.c, let's talk about adding experimental code later." Ion said fine, and they and others had some technical discussion about other fixes for the driver.

 

2. New Protocol For Network Console During Bootup
12 Feb 2001 - 23 Feb 2001 (45 posts) Archive Link: "LILO and serial speeds over 9600"
Topics: Assembly, BSD: FreeBSD, Networking, Security
People: H. Peter Anvin, James Sutherland, Alan Cox, Werner Almesberger, Tim Wright

In the course of discussion, H. Peter Anvin remarked, "I have toyed a few times about having a simple Ethernet- or UDP-based console protocol (TCP is too heavyweight, sorry) where a machine would seek out a console server on the network." James Sutherland's ears pricked up and he replied:

Excellent plan: data centre sysadmins the world over will worship your name if it works...

What exactly do you have in mind: a bidirectional connection you could use to control everything from LILO/Grub onwards? Should be feasible, anyway.

I'd go with UDP for this, rather than raw Ethernet. Use DHCP to get the IP address(es) to connect to as console hosts? (That or a command line option...)

The first thing is the kernel: just wrap around printk so as soon as eth0 is up, you set up a session and start sending packets.

I'll do a server to receive these sessions - simple text (no vt100 etc), one window per session - and work on the protocol spec. Anyone willing to do the client end of things - lilo, grub, kernel, etc??
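The thread never settles on a wire format, so the following is only a minimal sketch of the idea James describes, under assumed details: each line of console output is wrapped in a datagram payload carrying a session id and a sequence number, and the server keeps one plain-text buffer per session. The header layout and every name here (`HEADER`, `encode_line`, `ConsoleServer`) are hypothetical and not part of any proposed specification.

```python
import struct
from collections import defaultdict

# Hypothetical header: 4-byte session id + 4-byte sequence number,
# both in network byte order. Not taken from any real protocol spec.
HEADER = struct.Struct("!II")


def encode_line(session_id, seq, text):
    """Wrap one line of console output in a datagram payload."""
    return HEADER.pack(session_id, seq) + text.encode("ascii", "replace")


def decode_datagram(payload):
    """Split a payload back into (session_id, seq, text)."""
    session_id, seq = HEADER.unpack_from(payload)
    text = payload[HEADER.size:].decode("ascii", "replace")
    return session_id, seq, text


class ConsoleServer:
    """Keeps one plain-text 'window' per session (no vt100 handling)."""

    def __init__(self):
        # session id -> {sequence number: line of text}
        self.windows = defaultdict(dict)

    def receive(self, payload):
        session_id, seq, text = decode_datagram(payload)
        # A duplicate datagram simply overwrites the same slot.
        self.windows[session_id][seq] = text

    def window_text(self, session_id):
        """Return the session's lines joined in sequence-number order."""
        lines = self.windows[session_id]
        return "\n".join(lines[seq] for seq in sorted(lines))
```

Because UDP may reorder or duplicate datagrams, the sketch sorts by sequence number on display rather than trusting arrival order; it makes no attempt at retransmission.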

H. Peter replied, "this sounds like it's turning into a group effort. Would you (or someone else) like to set up a sourceforge project for this? I would prefer not to have to deal with that end myself." James filled in the paperwork, and gave the project an interim name, "Network Console Protocol". He also added, "I put the license type as "Other", since the heart of the project is the protocol, and patches to add support to the kernel, FreeBSD etc. will have to be under the license of the OS in question." He said there should be a project announcement some time over the next day or so.

Elsewhere, Alan Cox said the project reminded him of MOP on the old Vaxen. He also argued in favor of TCP as the underlying protocol (H. Peter had agreed with James that "A DHCP/BOOTP option seems to be the obvious way, and I'd hate to use non-obvious ways when there is a perfectly good obvious way."). Alan said:

TCP btw isnt as heavyweight as people sometimes think. You can (and people have) implemented a simple TCP client and IP and SLIP in 8K of EPROM on a 6502. There is a common misconception that a TCP must be complex.

All you actually _have_ to support is receiving frames in order, sending one frame at a time when the last data is acked and basic backoff. You dont have to parse tcp options, you dont have to support out of order reassembly.
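The three requirements Alan lists (in-order receive, one frame in flight until it is acked, basic backoff) amount to a stop-and-wait scheme. The sketch below illustrates that scheme in isolation; it is not a TCP implementation, it does no real timing, and the `channel` callable, the timeout values and the class names are all assumptions made for the example.

```python
def stop_and_wait_send(frames, channel, base_timeout=1.0, max_tries=6):
    """Send frames one at a time; each must be acked before the next goes out.

    `channel(seq, data)` is a hypothetical callable that returns True when
    the ack for that frame arrives in time and False when it does not.
    Returns a list of (seq, attempt, timeout) records, one per transmission.
    """
    log = []
    for seq, data in enumerate(frames):
        timeout = base_timeout
        for attempt in range(1, max_tries + 1):
            log.append((seq, attempt, timeout))
            if channel(seq, data):
                break            # acked: move on to the next frame
            timeout *= 2         # basic exponential backoff
        else:
            raise TimeoutError(f"frame {seq} never acknowledged")
    return log


class InOrderReceiver:
    """Accepts only the next expected frame; no out-of-order reassembly."""

    def __init__(self):
        self.expected = 0
        self.data = []

    def deliver(self, seq, data):
        if seq == self.expected:
            self.data.append(data)
            self.expected += 1
        # Ack anything already received in order (duplicates are re-acked).
        return seq < self.expected
```

With only one frame outstanding, the sender needs to remember a single frame and a single timer, which is the property that keeps the footprint small.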

James replied that UDP would probably still be simpler than TCP, and added that on the flip side, TCP also didn't seem to have any real benefits over UDP; though he acknowledged that perhaps in the future TCP might be useful for the kernel-side code. Alan replied that a UDP implementation would probably be just as big as a TCP one "by the time your UDP code has dealt with retransmits, out of order acks, and backoff." He added, "The IP layer is easy. Thats about 30 lines of code for a minimal IP. You'll need more code to implement ARP, which you will require."

Elsewhere, H. Peter argued, "one thing I'd really like to have is controlled buffer overrun, which TCP *doesn't* have." Alan asked what H. Peter meant by "controlled buffer overrun", and Werner Almesberger explained:

the ability to send new data even if there's unacked old data (e.g. because the receiver can't keep up or because we've had losses).

Such a feature would be mainly useful in cases where data becomes useless if too old, e.g. VoIP. Ironically, for the console, the opposite may be true: if the kernel all of a sudden starts vomiting printks, the relevant information is more likely to be at the beginning than at the end.

One advantage of TCP would be that such an implementation is more likely to get congestion control right, so it would be safer to use over the Internet. (And using UDP wouldn't make this any easier.) Also, when using TCP, it's more likely that some reasonable session management is built into the design.
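The trade-off Werner describes comes down to which end of a full buffer gets sacrificed. As a rough sketch, assuming a fixed-size buffer of unsent console lines (the class name and `keep` parameter are invented for illustration):

```python
from collections import deque


class BoundedLog:
    """Fixed-size buffer of unsent console lines with a selectable drop policy."""

    def __init__(self, capacity, keep="oldest"):
        if keep not in ("oldest", "newest"):
            raise ValueError("keep must be 'oldest' or 'newest'")
        self.capacity = capacity
        self.keep = keep
        self.lines = deque()
        self.dropped = 0

    def push(self, line):
        if len(self.lines) < self.capacity:
            self.lines.append(line)
            return
        self.dropped += 1
        if self.keep == "newest":
            # "Controlled overrun": discard the stalest line to admit new data.
            self.lines.popleft()
            self.lines.append(line)
        # keep == "oldest": the new line is discarded and early output survives.
```

A VoIP-style sender would pick `keep="newest"`; for a console flooded with printks, `keep="oldest"` preserves the first messages, which is where Werner expects the useful information to be.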

Alan didn't think unacknowledged old data would build up to the point of causing a problem, since the typical PC-class host would have a 32K window. That seemed like enough to him, but H. Peter replied, "Depends on what the client can handle. For the kernel, that might be true, but for example a boot loader may only have a few K worth of buffer space." Alan replied that the same constraint would be true with a UDP implementation, and reiterated that a minimal TCP implementation could be very light on resources. H. Peter said these arguments didn't mean that TCP was necessarily better. Alan repeated that a UDP implementation would be at least as big as a TCP version, once the UDP code had implemented retransmit handling. And if it didn't have retransmit handling, he added, it would just be junk. Tim Wright then aphorized, "those who fail to learn from TCP are doomed to re-invent it, badly, at the wrong level." H. Peter said he still felt UDP would be best, but that he'd take a closer look at a TCP implementation.

Elsewhere, about a week later, H. Peter gave a link to a sourceforge page (http://sourceforge.net/projects/netconsole/) and announced, "We have set up a network console project on sourceforge and are starting to work on actual details. If you're interested in this subject please do join that list."

 

3. Innovative Microsoft Clustering Solution. Order Now!
14 Feb 2001 - 20 Feb 2001 (5 posts) Archive Link: "*grin* Windows 2000 & HPC: Scalable, Inexpensive Supercomputing Solutions"
Topics: Clustering: Beowulf, Microsoft
People: Jonathan Morton, Mike Harrold, Dr. Kelsey Hudson, David Howells

David Howells gave a link to an article by Microsoft (http://www.microsoft.com/WINDOWS2000/hpc/indstand.asp) announcing a clustering solution under Windows 2000. Jonathan Morton remarked, "I bet you need a W2K license for every box you hook up, too." And Mike Harrold said:

The sad thing is, 3/4 of the page is an outright lie. It isn't a first, W2k is not the de facto standard OS, and the TCO is significantly higher than any cluster running Linux.

It's a sad day when companies can get away with blatant lies all in the name of "marketing."

Dr. Kelsey Hudson replied, "No shit, not to mention that Linux is going to be faster and better suited to the task." And someone else gave a link to Scyld Linux (http://www.scyld.com) (formerly Extreme Linux), adding that installation was so simple that only the master node required any installation at all. Simply booting the CD in the remaining machines was sufficient for each to become a node.

 

4. Status Of aic7xxx Drivers
14 Feb 2001 - 19 Feb 2001 (25 posts) Archive Link: "aic7xxx plans"
Topics: BSD: FreeBSD, Networking
People: Alan Cox, Chip Salzenberg, Peter Samuelson, Justin T. Gibbs, Doug Ledford, J.A. Magallon, Matthew Jacob, Wakko Warner

J.A. Magallon knew that Doug Ledford would no longer be maintaining the aic7xxx drivers, and that Doug recommended using Justin T. Gibbs' FreeBSD versions of those drivers. J.A. asked if there were any plans to follow Doug's advice on that, in terms of including Justin's drivers in the main tree; and Alan Cox replied, "I dont plan to switch them yet a while, and never for 2.2. For 2.5 its a total nobrainer that we move to Justins driver or move to Justins driver post crudfixing that may be needed to make it clean and Linuxish." Justin thought Alan meant that the crudfixing would be necessary, and asked what specifically was the problem; Alan replied that he hadn't had anything particular in mind, and was just aware of the possibility. At one point Wakko Warner asked where to find Justin's drivers, and Matthew Jacob replied with a pointer to Justin's page (http://www.freebsd.org/~gibbs/linux). Chip Salzenberg remarked, "Here at VA we're already using" [Justin's] "driver -- it works on the Intel STL2 motherboard, while Doug's driver doesn't (or didn't, a month ago)."

At one point Peter Samuelson remarked:

Have you any idea the breadth of cards and chips that aic7xxx supports? Sure, Justin's driver does great with your shiny new 7899, but can you verify that it also drives the 8-year-old EISA AHA-2740 I still have sitting around (actually retired to the parts pile, but that's beside the point, I'm sure some still exist in the wild)? How about the VLB card I have in my 486 at home?

IMHO there is no way Linus should consider replacing aic7xxx with 6.1 in a stable kernel. Not until it has gotten as much testing on as much obscure hardware as the old driver, which is not going to happen soon. Breaking existing working setups in 2.4.x is not an option. Possible solution: let the two drivers coexist, like ncr53c8xx vs sym53c8xx or tulip vs old_tulip.

Justin detailed:

I use a Dual Pentium-90 with PCI/EISA slots to test a 2742T and a 2740W. I haven't tested a 284X card for some time just for lack of a VLB machine (I have a card), but since it uses the aic7770 just like the 274X does, I'd be very surprised if it didn't just work.

Version 6.1.2 of the driver has been tested on a G3 PowerMac, a Compaq Blazer IA64 machine, and about 14 different PC motherboards. We have an AS1200 on the way from Compaq too so we can test EISA and PCI support on the Alpha. I've verified the driver's functionality on 25 different cards thus far covering the full range of chips from aic7770->aic7899. Lots of people here at Adaptec look at me funny when I pull a PC from the scrap-heap, or pull an old, discontinued card from an unused marketing display for use in my lab, but I'm well aware of how these cards get used in 386sx routers/firewalls etc, and those configurations will be supported.

Peter was pleased to hear all this, and asked, "is there really enough common ground between the whole series of AIC chips to justify a single huge driver? I know they ship three separate NT drivers to cover this range.." And Justin replied, "The chips are very similar. I think the single driver for Linux is actually a smaller binary than any of the individual drivers for NT. 8-)"

 

5. Status Of ServeRAID Driver
15 Feb 2001 - 21 Feb 2001 (4 posts) Archive Link: "ServeRaid 4M with IBM netfinity and kernel 2.4.x"
Topics: Disk Arrays: RAID
People: Alan Cox, Pim Zandbergen

Stephane Borel reported some Netfinity ServeRAID filesystem crashes under 2.4 during transfer of files larger than a megabyte. Under 2.2.18 there was no problem. Alan Cox replied, "I don't believe IBM have provided an 'official' 2.4 patch set for the serveraid yet so there may be bugs lurking." Pim Zandbergen replied:

They have, but they keep it pretty well hidden. Version 4.50 of the ServeRAID driver seems to support kernel 2.4 and can be downloaded from ftp://ftp.pc.ibm.com/pub/pccbbs/pc_servers/24p2809.tgz

There is also driver disk that lets you install Red Hat 7.0 on a ServeRAID array which can be found at ftp://ftp.pc.ibm.com/pub/pccbbs/pc_servers/24p2811.exe This too was hard to find on the IBM web sites, and there is no mention of it at all on the Red Hat web site.

While you're at it, you might just as well download ftp://ftp.pc.ibm.com/pub/pccbbs/pc_servers/24p2817.iso and burn it on a CD. This is a bootable (windows) CD that contains the above files plus everything else you need to get your ServeRAID running with Linux or other operating systems.

 

6. Linux Boot FAQ And Other Boot-Related Docs
18 Feb 2001 - 21 Feb 2001 (9 posts) Archive Link: "Linux OS boilerplate"
People: Scott Long, Jeremy Jackson, Rick Hohensee, H. Peter Anvin

Scott Long emerged from the dark and mysterious land of x86 boot code, and said he was considering writing a FAQ on the boot process, which "would include all relevant information on setting up the x86 hardware for a boot (timers, PIC, A20, protected mode, GDT, initial page tables, initial TSS, etc)." He explained that he really wanted to start a little OS project of his own, and he figured the Linux bootup code would be much cleaner than anything else he could find. He asked if folks knew of any existing docs, and Jeremy Jackson replied that he'd been over that code and would be willing to help. He also suggested, "read all of the LILO documentation, and check out some of the LinuxBIOS project at http://www.linuxbios.org." Rick Hohensee also suggested, "Have you seen Janet_Reno? ftp://linux01.gwdg.de/pub/cLIeNUX/interim/Janet_Reno.tgz IIRC. Janet is an x86 bootsector that gets into protected mode and can use the AT BIOS in pmode interrupts. It's written with a bunch of m4 macros I call asmacs that I'm currently basing an assembler in Bash on. That's shasm in the same directory as Janet." Someone else also gave a link to some CS class notes (http://www.eecs.wsu.edu/~cs460/); and H. Peter Anvin also gave a pointer to Documentation/i386/boot.txt, which had some explanation of the Linux boot protocol.
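For readers unfamiliar with one item on Scott's list, the GDT, here is a small sketch of how a single 8-byte x86 segment descriptor is laid out. It is not taken from the Linux boot code or from any of the documents mentioned above; the helper name `gdt_descriptor` is hypothetical, and the access and flag values are simply the ones conventionally used for a flat 4 GiB protected-mode setup.

```python
import struct


def gdt_descriptor(base, limit, access, flags):
    """Pack one 8-byte x86 segment descriptor.

    base   -- 32-bit segment base address
    limit  -- 20-bit segment limit
    access -- access byte (present bit, privilege level, type bits)
    flags  -- 4-bit field (granularity, default operand size, ...)
    """
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                        # limit bits 0..15
        base & 0xFFFF,                         # base bits 0..15
        (base >> 16) & 0xFF,                   # base bits 16..23
        access,                                # access byte
        ((flags & 0xF) << 4) | (limit >> 16),  # flags + limit bits 16..19
        (base >> 24) & 0xFF,                   # base bits 24..31
    )


# A flat layout: null descriptor, then 4 GiB code and data segments.
gdt = b"".join([
    gdt_descriptor(0, 0, 0, 0),
    gdt_descriptor(0, 0xFFFFF, 0x9A, 0xC),  # code: present, ring 0, execute/read
    gdt_descriptor(0, 0xFFFFF, 0x92, 0xC),  # data: present, ring 0, read/write
])
```

The scattered base and limit fields are a legacy of the 286 descriptor format, which is part of why boot code in this area tends to be opaque without documentation.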

 

7. Status Of VIA Driver For 2.2 Kernels
20 Feb 2001 - 21 Feb 2001 (9 posts) Archive Link: "[patch] VIA 4.2x driver for 2.2 kernels"
People: Shane Wegner, Vojtech Pavlik

Vojtech Pavlik announced that he had ported the VIA driver from kernel 2.4 to 2.2; he posted the patch, and Shane Wegner reported, "This drivers breaks with my HP 8110 CD-R drive. It's sitting on primary slave of a Via 686B controler. When I try to do a hdparm -d1 -u1 -k1 /dev/hdb, the kernel locks up hard. Not even an oops. Reverting to the old driver works fine." Vojtech suggested using the kernel option to enable DMA instead of going back to the old driver. They went back and forth for a while, and eventually Shane said, "I have investigated this problem further. The hdparm triggers the error but is not the cause. hdparm accesses /dev/hdb which is my cd-r drive. This triggers the loading of the cdrom and ide-cd modules. Manually loading cdrom succeeds, after which, manually loading ide-cd crashes the system. No need to even open() the device. This works fine with the VIA driver from 2.2.19pre14+ide-2.2.18-1221." There was no more discussion.

 

8. Status Of NFS In 2.4
21 Feb 2001 - 25 Feb 2001 (6 posts) Archive Link: "TESTERS PLEASE - improvements to knfsd for 2.4.2"
Topics: BSD: FreeBSD, BSD: NetBSD, FS: NFS, FS: ReiserFS, FS: XFS, FS: ext2
People: Neil Brown, Henning P. Schmiedehausen, Matthias Andree, James Rich, Alan Cox

Neil Brown announced:

I have a bunch of patches that change the way knfsd interacts with filesystems. In particular it makes it possible to export reiserfs and other modern filesystesm (providing they have been told how to work with knfsd).

This patch makes some substantial changes to the way knfsd maps a filehandle into an actual file, and this has been an easy place for obscure bugs to hide in the past.

So, I am asking for testers. Anyone who is feeling at all adventurous, and uses knfsd for any filesystem type, and is using 2.4 series kernels: please grab my latest patch, apply it to 2.4.2, and try it out. Then let me know about any problems.

I am looking forward to seeing lots of downloads and absolutely no problem reports.... but is seems unlikely.

Alan Cox has suggested that these changes may not be appropriate for 2.4, so we might have to wait for 2.5 to see them on kernel.org, but we don't have to wait till then to find the bugs.

The jumbo-patch is at

http://www.cse.unsw.edu.au/~neilb/patches/linux/2.4.2/patches-A-H-knfsd

The individual bits that make it up can be seen by looking a little higher in the tree. e.g.

http://www.cse.unsw.edu.au/~neilb/patches/linux/

The reiserfs code in this patch is from the reiserfs team.

Henning P. Schmiedehausen bewailed, "Oh, please not again a stable kernel series with NFS problems, we're locked in for ages. 2.2 was bad enough up to 2.2.18. We have ReiserFS in 2.4.1 (and not in 2.4.0), could we _please_ get NFS-exportable ReiserFS in 2.4.4 or 2.4.5?" Matthias Andree added heatedly:

2.2.18 is still broken, won't play NFSv3 games with FreeBSD clients. Neil has posted a patch here which fixes this.

And, ReiserFS messes NFSv3 up, I'm currently switching all my boxes back to ext2, because I'm really pissed. And if these NFS annoyances continue, it might be about time to try FreeBSD or NetBSD. Journalling file systems which hide their files away for maintainer incompetence and uncoordinated patching around don't buy us anything except continued "don't use Linux as NFS server" reputation.

James Rich suggested, "If you need journaled file systems and NFS I have been using XFS and it seems to be fine when exported over NFS (Yes I know it isn't in the main kernel - hopefully that changes soon)." And Alan Cox put in, "2.2.19pre has all the changes Neil has sent me. One reason I wanted to avoid NFS changes was that they would do somthings we didnt want. And they did although nothing too bad. The 2.2.19 schedule btw is about another week." Matthias replied that he was happy to hear 2.2.19 would be coming soon; and the thread ended.

 

9. Status Of 3c59x Driver And Zerocopy Patches
22 Feb 2001 - 24 Feb 2001 (2 posts) Archive Link: "3c59x in 2.4.{0,1,2}"
People: Igor Mozetic, Andrew Morton

Igor Mozetic reported, "There is probably just some miscoordination between the kernel mainteiners, but anyway. The 3c59x driver shipped with all official 2.4.x kernels lacks the 'medialock' feature. The result on 3c900 10M/combo cards can be unpleasant: kernel log fills up quickly and only reboot helps. However, Andrew's unofficial drivers at http://www.uow.edu.au/~andrewm/linux/ work fine so this is just a plea to include them into the official kernel." Andrew Morton explained:

The latest 3c59x driver is in the zerocopy patch, as well as at the above site.

Until things converge I'd suggest that you run a zerocopy kernel rather than updating just the driver. We need the testing.

Alexey has done wonders recently, and for 3com cards a zerocopy kernel now performs at least as well as a stock kernel.

 

10. 2.4 VM Improvement Over 2.2; Status Of VM
22 Feb 2001 (2 posts) Archive Link: "2.4 vs 2.2 performance under load comparison"
Topics: Big Memory Support, Clustering, SMP, Virtual Memory
People: Lars Marowsky-Bree, Rik van Riel

Lars Marowsky-Bree reported:

I did a comparison between 2.4 and 2.2.18 (+ Andrea's patches), using the respective latest SuSE kernels, but the results should apply to the versions in general.

Situation: SAP R/3 + SAP DB + benchmark driver running on a single node 4 CPU SMP machine, tuned down to 1GB of RAM.

Running the SAP benchmark with 75 users on 2.2 yields for the first benchmark run:

  • 7018ms average response time
  • 2967s CPU time in 1136s elapsed time
  • ~500MB swap allocated
  • ~1500 pages paged in/s, 268 pages/out/s on average

Running the same benchmark on 2.4:

  • ~700ms average response time
  • 1884s CPU time in 669s elapsed time
  • ~500MB swap allocated
  • ~50 pages paged in, ~212 pages paged out per second on average

Running the same benchmark the second time on both machines to get them warmed up, 2.2 stays in approximately the same range, while 2.4 gets even _better_, dropping down to ~350ms response time and ~20 pages in/out.

This is a rather amazing improvement in swapping performance.
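To put the first-run figures side by side, the short script below works out the ratios. The numbers are copied from Lars' report (several are approximate, marked "~" there); the CPU utilisation calculation assumes the reported CPU time is summed across all four processors, which the report does not state explicitly.

```python
# Figures copied from the first benchmark run in the report above.
runs = {
    "2.2": {"response_ms": 7018, "cpu_s": 2967, "elapsed_s": 1136, "pages_in_per_s": 1500},
    "2.4": {"response_ms": 700, "cpu_s": 1884, "elapsed_s": 669, "pages_in_per_s": 50},
}
CPUS = 4  # "single node 4 CPU SMP machine"


def ratio(metric):
    """How many times larger the 2.2 figure is than the 2.4 figure."""
    return runs["2.2"][metric] / runs["2.4"][metric]


def cpu_utilisation(kernel):
    """Fraction of total CPU capacity used, assuming cpu_s is summed over CPUs."""
    r = runs[kernel]
    return r["cpu_s"] / (r["elapsed_s"] * CPUS)


for metric in ("response_ms", "elapsed_s", "pages_in_per_s"):
    print(f"{metric}: 2.2 / 2.4 = {ratio(metric):.1f}x")
for kernel in runs:
    print(f"{kernel}: {cpu_utilisation(kernel):.0%} of {CPUS} CPUs busy")
```

On these figures, response time improves by roughly a factor of 10, elapsed time by about 1.7, and page-ins per second by about 30, while the share of CPU capacity in use rises slightly, from about 65% to about 70%.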

Rik van Riel summarized:

Actually, in 2.4 we have one big VM balancing problem left.

We have no way to auto-balance between refill_inactive_scan() and swap_out(), so we can (and probably do) still end up paging out the wrong pages lots of times ... this is alleviated somewhat by having a 1-second inactive list, but still...

Another problem is a lack of smarter IO clustering, when we get that better I'm sure we can increase performance even more.

 

11. loopback Broken In 2.4.2
22 Feb 2001 (5 posts) Archive Link: "2.4.2 seems to break loopback and/or mount"
People: Mohammad A. Haque, Jim Murray, Jens Axboe

Jeff Wiegley had been mounting CD images with loopback under 2.4.1-pre10 with no problem, but as soon as he upgraded to 2.4.2 the mount would hang in an uninterruptible sleep, and all subsequent mounts failed, even for non-loopback devices. Mohammad A. Haque replied, "loopback is broken in 2.4.2 AFAIK. You can grab the loop-6 patch and apply it to 2.4.2 and it should work." Jim Murray mentioned, "Compiling with kgcc compiler from RedHat 7.0 breaks loopback in the way you describe on 2.4.2-prex kernels and I suspect also in the real 2.4.2." But J. Sloan and Mohammad disagreed, and reiterated that loopback was simply broken in 2.4.2; J. Sloan gave a pointer to Jens Axboe's patches (ftp://ftp.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.2-pre4/), and the thread ended.

 

We Hope You Enjoy Kernel Traffic
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.