Kernel Traffic #23 For 17 Jun 1999

By Zack Brown

Introduction

If you don't know yet, KC gimp-devel (../KC/gimp-devel/index.html) is out, and the KC home page (../KC/index.html) will now also carry general Kernel Cousin news like announcements of new KCs.

For those of you who've written me about the unavailable articles referenced from the Subject lines, all I can say is, I still can't find them myself. I think the archives have been interrupted by kernelhq.com becoming kernelnotes.org; hopefully the problem won't persist.

Mailing List Stats For This Week

We looked at 1123 posts in 4323K.

There were 443 different contributors. 188 posted more than once. 170 posted last week too.

The top posters of the week were:

 

1. IRQ Autoconfig
28 May 1999 - 8 Jun 1999 (10 posts) Archive Link: "Hard coding default COM3/4 IRQ."
People: Theodore Y. Ts'o, Richard Gooch, Riley Williams, Mike A. Harris

Mike A. Harris wanted to change the kernel's default COM3 and COM4 IRQ. He knew about 'setserial', but he was hoping it could be a config option so he wouldn't have to always remember to use it. He was also curious why Linux didn't automatically configure the IRQ on the COM port, especially since MS systems seemed to do it fine.

Theodore Y. Ts'o said he could edit the table in include/asm/serial.h; as far as remembering to use 'setserial', Ted suggested putting an rc.serial into /etc/rc.d; but he also pointed out that there already was a config option for doing autodetection of the IRQ: answer yes to CONFIG_SERIAL_EXTENDED, and select CONFIG_SERIAL_SHARE_IRQ.

But he added that it didn't always work, which was why it was off by default; and it was labelled "unsafe" because it could occasionally mess up other devices that wanted the same IRQ. As far as the MS question went, he answered, "I suspect that Linux folk very often have a lot more random hardware attached to their machines than Windows folks, and these random ISA bus cards can confuse an autodetection algorithm."

Richard Gooch asked if 'setserial' was always reliable, and Ted replied, "For some systems, yes, if the boot-time configuration is not reliable, then setserial is not reliable either. This is why I recommend that folks manually specify the IRQ in /etc/serial.conf, and have /etc/rc.serial set the IRQ's appropriately from /etc/serial.conf. This is guaranteed to work, all the time."

Richard also asked what kind of "confusion" Ted meant, and Ted explained:

The problems fall into two categories. One is there are random UART's which claim to be National Semiconductor 16550A compatible, but which are really Taiwan specials that really aren't. Some of these bogus UART's don't signal an interrupt when they should, and tweaking the autoconfiguration code to handle such UART's is an exercise in frustration when (a) you don't have an example of this particular bad UART emulation (b) there is no spec sheet, and (c) even when there is a spec sheet, there's no guarantee the hardware follows it.

The other set of problems is that some random ISA hardware may signal interrupts at unexpected times, and confuse the serial driver into grabbing the wrong IRQ. This is admittedly a less common error case.

The bottom line is that ISA hardware is sh*t, and trying to make autoconfiguration work on ISA hardware is very, very, very difficult. This is especially true in the case of serial UART's, since everyone and their brother feel that they are competent to design their own National Semiconductor UART ripoff, and unfortunately many of them get it wrong.

Ted also indicated the direction he might go in the future, with, "something which I might try is to do the boot-time autoconfiguration, and if it fails, fall back to setting the IRQ to the default 3/4, instead of setting the IRQ to 0. This may confuse some folks since you won't be able to distinguish between a failed IRQ autoconfiguration and an IRQ configuration that happened to choose 3/4. But it will help out the naive folks with the cheap hardware, and it will help some of the folks who have custom hardware, and are too lazy to configure their own /etc/serial.conf."
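The fallback Ted describes can be sketched in a few lines of Python. This is only an illustration, not driver code: the function choose_irq and the probe callable are hypothetical names, and the table holds the conventional PC defaults (IRQ 4 for COM1 and COM3, IRQ 3 for COM2 and COM4).

```python
# Conventional PC defaults for the four standard serial ports.
DEFAULT_IRQ = {"COM1": 4, "COM2": 3, "COM3": 4, "COM4": 3}


def choose_irq(port, probe):
    """Pick an IRQ for a serial port.

    `probe` is a hypothetical stand-in for boot-time autodetection:
    it takes a port name and returns an IRQ number, or 0 on failure.
    """
    detected = probe(port)
    if detected:
        return detected
    # Proposed behaviour: fall back to the conventional default
    # instead of leaving the port at IRQ 0.
    return DEFAULT_IRQ[port]


if __name__ == "__main__":
    failed = choose_irq("COM3", lambda port: 0)     # probe gave up
    succeeded = choose_irq("COM3", lambda port: 4)  # probe really found 4
    # Both results are 4, which is the ambiguity Ted warns about.
    print(failed, succeeded)
```

Because the caller only sees the final number, a failed probe and a probe that genuinely found the default are indistinguishable, which is the trade-off mentioned in the quote.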

But the subject is not uncontroversial. Riley Williams said:

I checked out the kernel autoirq for COM3 and COM4 on standard IO addresses a while back, when 2.0.35 was current, and even on systems where the hardware was set to irq's that conflicted with other hardware in the system, it ALWAYS got the IRQ correct, and NEVER locked one of those systems up.

Because of this, I proposed a patch that enabled the autoirq feature for these two ports. It was rejected by the maintainer, Ted Ts'o, on the grounds that enabling autoirq for HUB6 and the AST FourPort cards (neither of which were affected by my patch) could result in the kernel locking the system up.

I tried to point out to Ted that my patch did not touch support for any of the cards he claimed to be worried about, but found Ted less than interested in that fact. As a result, I will be rather surprised to hear that any such patch has been accepted, unless it was submitted by Linus himself.

Since the serial probing is done differently in the 2.2.x kernels, I'm not planning on resubmitting my patch, and I'm also not planning on wasting my time developing an equivalent patch for the 2.2.x kernels.

There was no reply.

 

2. Fix For Uninterruptible Sound
29 May 1999 - 7 Jun 1999 (10 posts) Archive Link: "sound and control-c"
Topics: Sound
People: Nimrod Zimerman, Alan Cox, Andreas Schwab

Navindra Umanee found that catting some (but not all) .wav or .mid files (but not .au files) to /dev/dsp (or using 'wavplay' or 'playmidi') could not be interrupted by ctrl-c, unless ctrl-z was pressed shortly thereafter. Then the process really was killed (not just suspended). He had experienced this under several 2.2.x kernels including 2.2.9 (and not under 2.0.36), on 2 different machines and 3 different sound cards.

Alan Cox said it was a known problem, but that no one had fixed it yet. After some discussion, Nimrod Zimerman tracked down the culprit, and explained, "What happens is that the pending signal is added back to the signal set (it was previously removed, when the processing started). However, the sigpending bit of the task isn't updated. This simply means that this signal can have no effect from this point on, unless some other signal is sent." So it didn't have to be ctrl-z that triggered the interruption, it could have been any signal.
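A toy model may make Nimrod's explanation easier to follow. The Python sketch below is not kernel code: the Task class, its method names, and the use of a plain set plus a boolean flag are all assumptions made for illustration. It only mimics the described situation, where a signal is put back into the pending set without the summary flag being updated.

```python
SIGINT = 2     # ctrl-c (number as used on Linux/x86)
SIGTSTP = 20   # ctrl-z (number as used on Linux/x86)


class Task:
    """Toy model of a task's pending-signal bookkeeping."""

    def __init__(self):
        self.signal = set()       # pending signal numbers
        self.sigpending = False   # summary flag: "anything to deliver?"

    def send_signal(self, sig):
        self.signal.add(sig)
        self.sigpending = True

    def dequeue(self):
        """Take one pending signal, but only if the flag says to look."""
        if not self.sigpending:
            return None
        sig = min(self.signal)
        self.signal.discard(sig)
        self.sigpending = bool(self.signal)
        return sig

    def requeue_buggy(self, sig):
        # The signal goes back into the set, the flag is left alone.
        self.signal.add(sig)

    def requeue_fixed(self, sig):
        # Putting the signal back also refreshes the flag.
        self.signal.add(sig)
        self.sigpending = True


if __name__ == "__main__":
    task = Task()
    task.send_signal(SIGINT)
    sig = task.dequeue()          # processing of ctrl-c starts
    task.requeue_buggy(sig)       # ...and it is put back
    print(task.dequeue())         # None: ctrl-c is stuck
    task.send_signal(SIGTSTP)     # any other signal sets the flag
    print(task.dequeue())         # 2: the old ctrl-c finally arrives
```

In this model the stuck SIGINT is delivered as soon as any other signal sets the flag again, which matches the observation that ctrl-z ended up killing the process rather than suspending it.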

Nimrod posted a one-line patch, and Navindra reported total success. Andreas Schwab echoed the patch into another part of the code that he'd noticed had the same kind of error. End Of Thread.

 

3. XTP And Bell Labs' IL In The Standard Kernel
1 Jun 1999 - 8 Jun 1999 (11 posts) Archive Link: "XTP: A better TCP than TCP"
Topics: Networking
People: Jordan Mendelson, Jamie Lokier, Vince Lo Faso, Greg Lindahl, Matthew Wilcox, David S. Miller

Jordan Mendelson gave pointers to http://www.mentat.com/xtp/xtp.html and http://www.ca.sandia.gov/xtp/, and asserted that XTP "provides all the features of TCP, plus built-in reliable multicasts, better speed over networks with packet loss, better speed overall, maximum bandwidth limiting built-in, and a few other features which make it noteworthy." He added, "It appears that they have defined a few protocols over XTP already, including SMTP and FTP," and asked if XTP would be added to the standard kernel. Jamie Lokier was slightly critical of Jordan's post, saying, "The sites you refer to claim XTP operates better than TCP over unreliable networks because it does selective retransmission whereas TCP does go-back-N. This is no longer true of TCP -- TCP has SACKs, implemented in Linux for a while now, to do limited selective retransmission."
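The difference between the two retransmission strategies Jamie mentions can be shown with a small Python sketch. It is a deliberate simplification, not a model of XTP or of Linux's SACK code: both function names are made up here, a single window of packets is considered, and retransmitted packets are assumed to arrive.

```python
def go_back_n_resends(window, lost):
    """Go-back-N: resend everything from the first lost packet onward."""
    if not lost:
        return []
    first_lost = min(lost)
    return [seq for seq in window if seq >= first_lost]


def selective_resends(window, lost):
    """Selective retransmission: resend only the packets that were lost."""
    return [seq for seq in window if seq in lost]


if __name__ == "__main__":
    window = list(range(10))   # sequence numbers 0..9
    lost = {3, 7}
    print(go_back_n_resends(window, lost))   # [3, 4, 5, 6, 7, 8, 9]
    print(selective_resends(window, lost))   # [3, 7]
```

With two losses in a ten-packet window, go-back-N resends seven packets where the selective scheme resends two, which is the advantage the XTP sites were claiming.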

David S. Miller said someone was working on it, but they only fed him very dirty patches, and then disappeared for months at a time without cleaning them up. While not blaming the person (who might be busy, etc), David explained that without the cleanup, he simply couldn't put the code in.

Jordan volunteered to help, and Vince Lo Faso said:

I've been working on an XTP implementation for Linux for some time (and will be the first to admit that I'm overdue for releasing it publicly).

I recently ported it to kernel 2.3.x and am in the process of (re)sending a patch to David.

I'm doing this strictly on a volunteer basis, which means I can respond to emails during the evenings and have limited time for its development--but unlimited interest in its growth and potential. More info will follow.

Quick synopsis of XTP:

XTP is quite flexible and versatile, capable of offering UDP-like and TCP-like services--and everything in between--in one protocol stack. However, its flexibility is also its weak point. There is little public data-experience-testing-implementation on making XTP work in an IP environment, hence this XTP-Linux project. There are also several transport issues that need to be explored in an XTP/IP environment.

If anyone is really interested in this and wants to help out, let me know offline.

Matthew Wilcox felt that a better protocol to put in the kernel would be Bell Labs' IL. He gave a pointer to it (http://plan9.bell-labs.com/plan9/doc/il.html), and announced he had implemented a bare-bones Linux version and was open to collaborators.

Greg Lindahl pointed out that unlike TCP, IL had no flow control, and added, "You may not think you need flow control, but the minute you have several people talking to 1 at high speed, life gets interesting. A group I used to work for (Legion) had its own IL-like protocol, and we finally decided that TCP's flow control was a huge win locally (our Legion-MPI programs were overrunning the 64k OS buffers), and TCP's adaptive stuff was far superior over WANs."

Matthew replied, "As I understand it, the idea behind not requiring flow-control in IL was that the higher level protocols would take care of this; in 9P (which is the main user of IL), there's a reply to each request, so if the server is replying slowly, the client will slow down to compensate. I don't think IL is suitable for simply sending packets to hosts, it must have a request-response protocol layered on top of it."

 

4. Linux Reminiscence
2 Jun 1999 - 8 Jun 1999 (77 posts) Archive Link: "zero-copy TCP fileserving"
Topics: FS: NFS, History, Networking
People: Matthew Wilcox, David S. Miller, Richard B. Johnson

This was a long thread, but in the course of discussion, Richard B. Johnson mentioned that at one time Sun didn't bother to checksum packets that arrived via Ethernet. Matthew Wilcox clarified with, "SunOS 4.x had UDP packet checksums disabled at boot to improve NFS performance. Hey, who cares about DNS?", and David S. Miller related this story:

Even Linus himself saw this in an amusing incident while still back in Finland. The LAN's NFS server at the university, on the network where he did his kernel work, was a Sun machine which skipped the checksums as you describe for UDP, and a few weeks after a new ethernet card was put into the NFS server, his kernel tarballs would become mysteriously corrupted with single bit errors here and there.

Linus and myself had thought it was a bug in Linux, but later it was shown that the network card in the Sun NFS server would let packets through which had gotten corrupted and the networking stack bypassed verifying the checksum and then boom.
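To see why skipping the checksum mattered, here is a small Python sketch of the 16-bit ones' complement Internet checksum (the algorithm described in RFC 1071, which UDP uses). The helper names and the sample payload are made up for illustration, and the UDP pseudo-header is left out for brevity.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    payload = b"pretend this is part of a kernel tarball"
    good = internet_checksum(payload)
    # Every possible single-bit corruption changes the checksum.
    detected = all(
        internet_checksum(flip_bit(payload, bit)) != good
        for bit in range(len(payload) * 8)
    )
    print(detected)   # True
```

A receiver that verifies this checksum would have dropped each of those damaged datagrams; one that skips the check passes them straight up to NFS, as in the story above.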

 

5. Unfixable TCP Slowdown?
4 Jun 1999 - 8 Jun 1999 (28 posts) Archive Link: "TCP/PPP bug 2.3.5?"
Topics: Networking
People: John Hayward-Warburton, Rik van Riel, Alan Cox, Byron Stanoszek

Rik van Riel noticed that on 2.3.5, heavy incoming TCP traffic seemed to cause outgoing traffic to stall forever.

Rodel T. Viado confirmed this for 2.2.9 as well. Byron Stanoszek confirmed a similar problem on all kernels since 2.2.5, and John Hayward-Warburton confirmed it going back to late 2.1.x and pointed out that the problem had been mentioned on the list recently. He described it as follows: "after a few hours or days of uptime, outgoing connections with large amounts of data stall, though other connections continue OK. Unloading the ISDN and PPP modules, then reloading them, fixes the problem until the next time it dies."

Rik posted a TCP dump in the hopes that a TCP guru would solve it soon. There was a bit of discussion, and a number of folks including Alan Cox concluded that the problem was with the connection. According to them, the folks who experienced this problem were experiencing packet loss somewhere down the pike, causing TCP to interpret the situation as congestion and slow down. The reason it was happening only on outgoing traffic was because along the connection (out of the user's control), input and output were handled by different hardware.
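The "loss is read as congestion" behaviour can be pictured with a heavily simplified additive-increase, multiplicative-decrease loop in Python. This is not how the Linux TCP stack is written: the function name, the one-segment-per-round growth, and the halving rule are textbook simplifications assumed here, and slow start and timeouts are ignored.

```python
def simulate_cwnd(rounds, loss_rounds, start=1.0):
    """Return the congestion window (in segments) after each round trip.

    loss_rounds is the set of round numbers in which a packet is lost.
    """
    cwnd = start
    history = []
    for rnd in range(rounds):
        if rnd in loss_rounds:
            cwnd = max(1.0, cwnd / 2)   # loss: assume congestion, back off
        else:
            cwnd += 1.0                 # no loss: probe for more bandwidth
        history.append(cwnd)
    return history


if __name__ == "__main__":
    clean = simulate_cwnd(12, set())
    flaky = simulate_cwnd(12, {3, 6, 9})   # a loss every third round
    print(clean[-1], flaky[-1])
```

On a link that keeps dropping packets for reasons unrelated to load, the window in this model never gets a chance to grow, which is one way to picture the stalls people were reporting.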

This was a big letdown for the affected folks, and Rik said, "Well. I guess that still doesn't explain how to fix the problem I and a lot of other people are seeing. Since we have workarounds for just about every other hardware problem, I guess we will want to do something to have at least decent worst-case performance on slightly flakey networks..."

No solution presented itself during the thread.

 

6. Status Of Patches To Use Extra RAM As A Ramdisk
4 Jun 1999 - 9 Jun 1999 (9 posts) Archive Link: "96M of RAM on machine that caches only 64M"
People: Pavel Machek, Ingo Molnar, B. James Phillippe, Linus Torvalds, Mike A. Harris

Adding more RAM than the motherboard caches can lower system performance. Mike A. Harris had 32M of RAM sitting unused on his machine, and was looking for the patches to make a ramdisk out of it. He also wondered why those patches were not in the standard kernel.

B. James Phillippe gave a pointer to the slram patches at http://www.andrew.cmu.edu/~keryan/slram/, and Ingo Molnar surmised that the patches were not in the main kernel because whoever wrote them hadn't submitted them to Linus Torvalds or linux-kernel. Pavel Machek, who had co-authored the patches with Bradley M Keryan, replied a little bitterly that they definitely had been submitted. Mike urged him to submit them again, and offered to take over maintenance. Pavel said the version on the web site was probably more current than what he had in his tree, and added that if Bradley didn't mind, Mike could take over maintenance.

But apparently Bradley was still active as the slram maintainer (B. James had found him quite responsive to email and changes), and said he'd resubmit the patches soon.

 

7. Cross Compiling Alpha Kernel On X86
4 Jun 1999 - 9 Jun 1999 (12 posts) Archive Link: "building 2.2.9 on alpha -- where's alpha/regdef.h?"
People: David S. Miller, Ivan Kokshaysky, Benjamin LaHaise

Benjamin LaHaise was cross compiling an Alpha kernel on an x86 box and found compilation of arch/alpha/lib/stx*.S failing due to the #include <alpha/regdef.h> referencing a non-existent file.

Several people replied that cross compiling a 64 bit kernel on a 32 bit host had problems, but David S. Miller clarified, "It works perfectly fine, and in fact right now is the only way, to build sparc64 kernels on a Sparc* system. In fact the Alpha folks were the first ones ever, real early on, to make ix86 --> Alpha kernel cross build working with gcc."

Ivan Kokshaysky pointed out that alpha/regdef.h was part of glibc, not the kernel. Benjamin posted a patch, and there was some discussion about include paths. Apparently the compile worked, though.

 

8. FireWire Conflicting Development
5 Jun 1999 - 8 Jun 1999 (5 posts) Archive Link: "FireWire subsystem in development"
People: Andreas Bombe, Srdjan Sobajic, Emanuel Pirker

Andreas Bombe announced he was developing a FireWire subsystem (IEEE 1394) for Linux, including a TI PCILynx chip driver. He knew that Emanuel Pirker's previous attempt was now defunct, so he'd started his own. He summed up the status of his work, saying, "The subsystem runs fine and responds to outside requests. Things that are not tested yet are outgoing transactions. Things that don't even exist yet are any interface to user space and a real bus manager implementation. And drivers for the Adaptec chip and OHCI chips." He added that he'd only tested it as a module.

He gave the URL of his home page (http://homepages.munich.netsurf.de/Andreas.Bombe/), but added that the patches there weren't current.

Srdjan Sobajic came in, saying he'd been working on the same project, but he'd started from Emanuel Pirker's work. He summed up his own progress, saying, "Now I can get the drivers to work with 2.3.*, and I can access my Sony DCR-TRV7 (maybe this is a Japan-only model?) video camera, at least for starting and stopping the camera. However, I've been trying to see how to get the isochronous mode stuff to work, and haven't had any luck yet."

He offered to get into co-development with Andreas, and asked Andreas to email him privately to work out the details.

 

9. Kernel Based Web Server
5 Jun 1999 - 9 Jun 1999 (10 posts) Archive Link: "Announce: kHTTPd 0.1.0"
Topics: Web Servers
People: Tigran Aivazian, Jakub Jelinek, David S. Miller, Arjan van de Ven, Bjorn Wesen

Arjan van de Ven implemented a kernel-based web server as a loadable module. He was inspired by a debate on the linux-future mailing list, in which there was no agreement over whether such a thing would be a good idea or not. He gave a URL (http://www.fenrus.demon.nl), where his benchmarks showed it competitive with Zeus (though he acknowledged that really good benchmarks were impossible).

In the course of discussion, it came out that while Zeus and khttpd seemed to perform almost equally, Zeus used up 100% of the CPU, while khttpd used only 50%.

Tigran Aivazian suggested modifying the networking code to load and unload the module at need, but David S. Miller replied that such a thing would slow down a very often-traversed code-path, and Bjorn Wesen objected to the entire idea, saying that if the web server worked better as a module, it was only an indication that the OS itself was broken and should be fixed.

Jakub Jelinek took a look at the code and found a memory leak, which Arjan fixed.

 

10. Riva Framebuffer Driver Started But Nonfunctional
6 Jun 1999 - 7 Jun 1999 (2 posts) Archive Link: "[patch] Riva framebuffer driver (non-working)"
Topics: Framebuffer
People: Jeff Garzik

Jeff Garzik announced he was starting work on a driver for Riva 128/TNT/TNT2 cards. He gave a URL at http://havoc.gtf.org/garzik/kernel/files/UNTESTED/ and described the status of the project, "It compiles, and the logic is 90% in place, but don't consider it even close to working."

Robbert Muller tried the patch and had many problems. The thread didn't continue, but development is probably ongoing.

 

11. 'Linux Input Driver' Suite
6 Jun 1999 - 8 Jun 1999 (15 posts) Archive Link: "[announce] Linux Input Driver suite version 0.1.0"
Topics: Hot-Plugging, USB
People: Vojtech Pavlik, Martin Mares, James H. Cloos

Vojtech Pavlik said:

I've just released the first really useable version of the new Linux Input Drivers, a suite of drivers that intends to make handling of devices like keyboards, mice and joysticks simpler in Linux.

It tries to get rid of all the old cruft and dust that has gathered in the keyboard and mouse drivers present in the kernel, and I think it is quite useful in doing that.

A couple of immediate advantages you'll see after installing it:

  • If you have a Logitech M-S48 mouse, the wheel will work.
  • If you have a 3-button serial MouseSystems mouse, it'll have twice as fast updates when using the drivers than when you let GPM or X talk to it directly.
  • The Pause key repeat works
  • You can have an XT or Sun keyboard attached to your PC
  • No more mouse problems when switching to and from X
  • You can unplug and re-plug AT keyboards and PS/2 mice, and they will keep working, and won't lose the LED and autorepeat settings
  • You plug any mouse in your computer, be it PS/2, Serial, USB, or busmouse, and you get an emulated /dev/psaux, allowing for a simpler configuration

There are more benefits for the future, like allowing easy multihead support and an easy transition to the bus & hotplug scheme described by Martin Mares earlier.

The only two drawbacks of these patches I see:

  • integrating PS/2 and Serial mouse decoding routines into the kernel, but because those are rather small in size I think it's way worth it.
  • The large size of the patch (14000 lines), that comes from rewriting drivers from scratch and moving them around in the directory tree.

He posted two URLs, a web page at http://atrey.karlin.mff.cuni.cz/~vojtech/input and the sources at ftp://atrey.karlin.mff.cuni.cz/pub/linux/input.

James H. Cloos Jr. found a big speed improvement with the patches. But he'd lost his meta and menu keys in X on his 105-key AT keyboard. He posted a patch to Xwindows to get the keys back, but Vojtech said it would probably be better to fix the drivers to behave normally. Folks had various problems and there was a bit of discussion about fixes and configuration, and by the end of the thread Vojtech had released version 0.1.1.

 

12. Performance-Monitoring Counters Patch Version 0.2
7 Jun 1999 - 10 Jun 1999 (6 posts) Archive Link: "release 0.2 of x86 performance-monitoring counters support patch"
Topics: FS, Profiling, SMP
People: Mikael Pettersson, Andi Kleen

Mikael Pettersson announced version 0.2 of his x86 performance-monitoring counters patch (against kernel 2.3.5). He gave a pointer to http://www.csd.uu.se/~mikpe/linux/, and offered a changelog:

  • Added support for WinChip CPUs.
  • Restart counters from zero, not their previous values. This corrected a problem for Intel P6 (WRMSR writes 32 bits to a PERFCTR MSR and then sign-extends to 40 bits), and also simplified the code.
  • Added support for syncing the kernel's counter values to a user-provided buffer each time a process is resumed. This feature, and the fact that the driver enables RDPMC in processes using PMCs, allows user-level computation of a process' accumulated counter values without incurring the overhead of making a system call.
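The P6 behaviour mentioned in the second changelog item (only 32 bits are written, then sign-extended to 40) can be modelled in a few lines of Python. This is only an arithmetic sketch of what the changelog describes; the function name is hypothetical and nothing here touches real MSRs.

```python
def p6_perfctr_after_write(value: int) -> int:
    """Model the described write: keep the low 32 bits, sign-extend to 40."""
    low32 = value & 0xFFFFFFFF
    if low32 & 0x80000000:
        # Bit 31 is set, so bits 32..39 are filled with ones.
        return low32 | 0xFF00000000
    return low32


if __name__ == "__main__":
    saved = 0x123456789A                       # a 40-bit count saved earlier
    restored = p6_perfctr_after_write(saved)
    print(hex(restored))                       # 0x3456789a: upper bits lost
    print(hex(p6_perfctr_after_write(0)))      # 0x0: zero survives intact
```

In this model a previously saved 40-bit value cannot be written back faithfully, whereas zero always can, which is consistent with the decision to restart the counters from zero.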

There was some discussion of implementation, and Andi Kleen suggested adding a way to access the performance counters of remote processes. Mikael replied that he was already working on the design of that feature. He explained, "I will probably represent the counter state objects as files in a pseudo filesystem, in order to get sharing, lifetimes [the state must survive process death if there's another process controlling it], and mmap() access right. SMP will also have to be considered, but I guess I can steal code from ptrace() or kill()."

Andi found this to be too complicated and costly, and suggested just using a ptrace extension to read/write the performance registers. End Of Thread.

 

13. 2.2.9-ac1 Panic And Fix
7 Jun 1999 (5 posts) Archive Link: "Kernel Panic: 2.2.9-ac1"
People: Alan Cox, Trond Myklebust

Theo Van Dinter upgraded from 2.2.4-ac? to 2.2.9-ac1 and after about a day had a panic on his news server. Trond Myklebust tracked the problem to a 'page_cache_release(page);' statement in the 2.2.9-ac series' nfs_unlock_page definition in nfs_cluster.h. He posted a patch to fix it. Alan Cox claimed responsibility for the bug, and several people posted complete success with the patch.

 

14. Framebuffer HOWTO Version 1.0 Released
7 Jun 1999 (1 post) Archive Link: "HOWTO-framebuffer goes to 1.0 release!"
Topics: Framebuffer
People: Alex Buell

Alex Buell announced:

You can peruse the HOWTO at http://www.tahallah.demon.co.uk/programming/prog.html

These files are also available from the same place:

    HOWTO-framebuffer-1.0.html.tar.gz
    HOWTO-framebuffer-1.0.sgml.tar.gz
    HOWTO-frmaebuffer-1.0.txt.tar.gz

Enjoy, and _please_ _please_ _please_ mail me privately if you wish to discuss issues with the HOWTO, we don't want to pollute the l-k mailing list.

Changes:

  • Moved 'Setting up X11 FBdev' into its own section
  • Moved to 1.0 release from 1.0pre7 (Got bored!)

Distributors, HOWTO list maintainers, feel free to take copies and put on CDs or web sites.

Next releases will add new sections on changes in 2.3.x & 2.2.x kernels.

 

15. Strace Bug When Tracing Recursive System Calls
7 Jun 1999 - 8 Jun 1999 (5 posts) Archive Link: "Bug: Tracing recursive system calls"
People: Nate Eldredge, Alan Cox

Nate Eldredge reported, "There is a minor bug with system-call tracing. Some system calls call functions in the kernel which make their own system calls (kernel_thread is an example). If the process is being traced, these recursive calls are also traced. This confuses strace, since it (reasonably, IMHO) expects syscall entrances and exits to happen in strict succession."
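The confusion Nate describes can be illustrated with a short Python sketch. It assumes, purely for illustration, a tracer that labels its stops by strict alternation (first stop is an entry, the next an exit, and so on); how strace actually keeps this state is not covered in the thread, and the function names are made up.

```python
def naive_labels(stops):
    """Label stops as a strictly alternating tracer would: enter, exit, ..."""
    return ["enter" if i % 2 == 0 else "exit" for i in range(len(stops))]


def mislabelled(stops):
    """Return indexes of stops whose real kind differs from the naive label."""
    guesses = naive_labels(stops)
    return [
        i for i, ((actual, _name), guess) in enumerate(zip(stops, guesses))
        if actual != guess
    ]


if __name__ == "__main__":
    flat = [("enter", "read"), ("exit", "read"),
            ("enter", "write"), ("exit", "write")]
    nested = [("enter", "outer"), ("enter", "inner"),
              ("exit", "inner"), ("exit", "outer")]
    print(mislabelled(flat))     # []
    print(mislabelled(nested))   # [1, 2]
```

With ordinary calls every stop is labelled correctly, but a single nested call is enough to make the tracer mistake an entry for an exit and the following exit for an entry.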

Alan Cox came in with some implementation discussion (which later turned out to be wrong), and Nate posted a patch against 2.2.10pre2, recommending that other arches port the patch as well (though he didn't have the skill to do it).

 

16. aic7xxx Version 5.1.17 Is Out And Fixes All Known Bugs
9 Jun 1999 - 10 Jun 1999 (4 posts) Archive Link: "aic7xxx-5.1.17 is released"
People: Doug Ledford

Doug Ledford announced, "This version of the aic7xxx driver is a bug fix release that fixes all known bugs in the 5.1.15 and 5.1.16 driver versions that I have any way to reproduce. There are still a few sporadic reports of systems here or there that have some odd problems, but nothing that I have been able to identify and reproduce. Anyone having any aic7xxx related problems is encouraged to give this release their full attention and let me know if anything breaks. I'm going away for about a week starting Friday so people will have plenty of time to work it over by the time I get back. This update is not available via ftp for various reasons at this time. In order to get this update, please go to http://www.redhat.com/~dledford/aic7xxx.html. If you have any problems, let me know. There are 2.0.36 and 2.2.9 patches at that web site. No 2.3 patches are available yet, but the 2.2 patch should go in fairly cleanly."

 

17. Advice And Doc Update For Swapping
10 Jun 1999 - 13 Jun 1999 (5 posts) Archive Link: "Documentation fix for 2.2.9+2.3.5"
Topics: Disk Arrays: RAID, Documentation, FS
People: Adam Fritzler, Pavel Machek, Linus Torvalds

Pavel Machek posted a documentation patch, which offered three pieces of advice to users:

There was some discussion; in particular, Adam Fritzler recommended changing Pavel's third suggestion to:

Pavel saw no urgent need for that change, but said he'd include it in his next submission, if Linus Torvalds rejected the first one.

 

 

 

 

 

 

We Hope You Enjoy Kernel Traffic
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.