Kernel Traffic #55 For 21 Feb 2000

By Zack Brown

Mailing List Stats For This Week

We looked at 1319 posts in 5664K.

There were 472 different contributors. 196 posted more than once. 146 posted last week too.

1. Keyboard Lockup Bug Hunt

28 Jan 2000 - 12 Feb 2000 (56 posts) Archive Link: "Keyboard is frozen on boot of 2.3.41"

Topics: USB

People: Miles Lane, Linus Torvalds, Wakko Warner, Michael Neuffer, Harold Oga

Miles Lane reported that his keyboard was frozen in 2.3.4x. He'd seen similar reports, and was throwing his in as well. He also reported problems with the Yenta driver, though he would be unable to send debugging info until his keyboard problem was fixed. He added, "I'll be glad to send any desired system diagnostic info to anyone willing to look into the keyboard and Yenta problems. Just let me know." Linus Torvalds replied, "I have a few reports from you that seem to imply that the same kernel sources have worked for you or not, depending on exact configuration options. One of the differences I see in your previous working/nonworking setup (using 2.3.38) is that a previous try with USB enabled caused problems, while USB off worked." He had seen other folks with keyboard problems that were also connected to USB, and suggested Miles check that out.

Michael Neuffer reported having a similar problem with his keyboard, but USB on or off made no difference. However, accessing /dev/psaux with gpm or X would result in a keyboard/mouse lockup. The error he saw was

Jan 29 13:46:22 schlepflop kernel: keyboard: Timeout - AT keyboard not present?

Logging in remotely and killing the processes would free the keyboard. Linus replied:

Ahh. Good, this helps. Pinpointing it to a mouse interaction is always helpful.

Exactly when did this start happening, do you happen to know?

Michael estimated it started around 2.3.39pre or so. But he added that removing yenta/pccard support eliminated the problem. Linus asked what /proc/interrupts contained when the problem occurred, and added, "The only interaction I can imagine with the pcmcia code is that it might share an interrupt with the mouse, and the shared interrupts might just confuse the mouse/kbd driver.." Michael posted a lot of debugging information, but nothing came of it.

Harold Oga reported a similar problem, and Linus replied:

Ok, one final suggestion:

Could you who see the keyboard hang please do

If that fixes it, then we have another clue.

Wakko Warner replied with a report of the problem on his system, which he described: "NEC Versa SX. Ricoh Cardbus controller RL5c478, Intel PIIX4 chipset. The irq's for the devices agree with the bios (lspci -v and seeing the device list before booting)" Linus' suggestion did not help.

Michael also described his system as a Dell Inspiron 7000 laptop. He also reported that Linus' suggestion didn't help, but he added:

Opening the psaux device causes the keyboard lockup and no interrupts are beeing received by the psaux device. Closing the device also unlocks the keyboard.

The only test that so far had any impact was the suggestion to disable the toshiba workaround. With the workaround enabled I get the "Timeout - AT keyboard not present ?" warning after starting gpm. If it is disabled, I do not hit this code path.

Miles, however, noticed a difference on his system, after trying Linus' suggestion. He explained, "After making the change that Linus suggested, my mouse at least started working. When I looked at my system log, I noticed the keyboard seemed to be locking up around the time the gpm driver got loaded. However, removing gpm from my system did not help. I guess that confirms your theory that the problems are at least not identical. However, since most of these recent keyboard lockup problems started recently, it seems likely that there is some underlying change that was introduced which is manifesting differently depending on system hardware. At least, that's my theory."

Harold also eventually noticed a change after following Linus' suggestion. He explained:

It appears that as long as I have irq 12 manually assigned to something in the bios, I don't get any lockups if there is no ps/2 mouse connected when gpm attempts to open /dev/psaux. Here are the results of my latest testing:

stock 2.2.14:

2.2.14 with the above init sequence:

stock 2.3.41:

I didn't try 2.3.41 with the above init sequence change, since it seems not to lockup if irq 12 is assigned. Here's how I currently have the irq's assigned in the bios:

         current  old
PIRQ_0   3        3
PIRQ_1   11       9
PIRQ_2   12       11
PIRQ_3   5        5

It seems to me that this is a bios bug, or at least a bios quirk.

After further debugging over the course of a couple weeks, the issue was still not solved.

2. Private Header File Debate

31 Jan 2000 - 9 Feb 2000 (7 posts) Archive Link: "linux-2.3.41/drivers/char/mxser.c does not compile"

Topics: PCI

People: Theodore Y. Ts'o, Alan Cox, Rogier Wolff, Adam J. Richter

This was first covered in Issue #33, Section #34 (29 Aug 1999: PCI Serial Driver Ready For Testing), and then again in Issue #43, Section #8 (2 Nov 1999: Serial Driver Restructuring). This time, Adam J. Richter reported that linux-2.3.41/drivers/char/mxser.c would not compile, because it relied on private declarations in linux/serialP.h that were no longer pulled in. He offered to submit a patch, but admitted that he didn't understand the issues involved in the current situation. Theodore Y. Ts'o explained:

The best way to deal with this is to #include <linux/serialP.h> in mxser.c.

serialP.h is an internal "private" header file for the serial driver, and for those drivers which are developed by grabbing serial.c and lightly tweaking it. I removed SERIAL_XMIT_SIZE and struct async_icount from serial.h and moved them to serialP.h since they should only be used by the driver internally.

Note: In the future (i.e., post 2.4) it's likely that things like async_icount will move to tty.h, and the code for handling it will be moved to tty_io.c, so that different drivers can share the code instead of sharing it by cut-and-paste.

But Alan Cox also replied to Adam, with, "Just reverse the pieces that Ted keeps trying to move into the private header and put them back. This is not the first time this has happened and its beginning to annoy me."

Ted replied:

It's very simple. Those definitions are private definitions, and have nothing to do with the exported interface to userland. As such, they don't belong in serial.h. If driver authors want to "borrow" huge amounts of the serial driver code by doing code reuse by cut-and-paste, they should #include serialP.h, which is a private header file for non-exported #defines and structures.

In the long-term, the kind of functionality should be moved into the high-level tty layer, so there isn't the need to share code by cutting and pasting from serial.c. This approach isn't a good one in the long term, since I've in the past fixed bugs in serial.c that don't get fixed in other drivers that have borrowed huge amounts of code from serial.c --- in many cases, I have no idea they've borrowed code, so I couldn't notify them even if I had the time to track all of the places that have borrowed from serial.c

The other solution (for now) is to use the generic_serial.c by R.E.Wolff@BitWizard.nl, but I don't personally like that approach since it introduces yet another abstraction layer, when in fact most of this functionality should be going into the tty layer.

Alan pointed out that doing a 'make modules' would show which drivers had problems. He also asked, "Does it belong in the tty layer when it appears relevant purely to physical port hardware?"

Rogier Wolff also replied to Ted, with:

First, Let me say that I think that the TTY layer doesn't help me as a driver writer enough: There is too much "administration" that a driver needs to do. For example, the whole "open" logic is now implemented again and again in every serial driver. I think a driver should be responsible for talking to the hardware, and the "driving layer" should do the thinking.

I asked around, and was told that I should make a "library" like layer that drivers could use to simplify stuff. A driver would then be able to choose the "generic" (but maybe a bit slow) stuff, or implement things by itself. For example a driver for a board that has a 4k output buffer, may want to skip the 4k write buffer in the kernel.

It's not quite finished completely the way I want it, but it's a lot better than before. (SX and RIO now share around 2k lines-of-code through "generic-serial". That's good.) So far, generic-serial simply has the stuff from serial.c that isn't talking to the hardware.

To Rogier's point about the "open" logic, Ted replied:

Post-2.4 most of the open logic is GOING AWAY, along with the /dev/cua* devices. This will simplify the open logic immensely.

I want to move the CD logic into the tty layer, both for blocking opens, and for hangup processing. Hangup processing will happen in a bottom-half handler, and not in the middle of an interrupt, where all sorts of races can happen

And to Rogier's point about the "driving layer" doing the thinking, Ted replied:

Oh, agreed, 100%. I just think the "driving layer" should be the tty layer; we don't need another one.

The basic idea is that we'll have some generic functions for interfacing to physical devices (face it most tty devices are physical; pty and console are the exceptions), which can be overriden by the driver if they wish.
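
As a rough illustration of the arrangement Ted describes (generic functions that a driver may override), here is a minimal C sketch. It is not kernel code: every name in it (port_ops, port_open, port_carrier_raised, board_ops) is hypothetical, and the real tty layer may end up looking quite different. The shared layer supplies defaults, and a driver fills in only the hooks it wants to replace.

```c
#include <assert.h>
#include <stddef.h>

struct port;

/* Hypothetical table of hooks a driver may override. A NULL entry
 * means "use the generic version supplied by the shared layer". */
struct port_ops {
    int (*open)(struct port *p);
    int (*carrier_raised)(struct port *p);
};

struct port {
    const char *name;
    int open_count;
    int carrier;                 /* 1 if carrier detect is asserted */
    const struct port_ops *ops;  /* driver overrides; may be NULL */
};

/* Generic defaults: the bookkeeping a driver should not have to repeat. */
static int generic_open(struct port *p)
{
    p->open_count++;
    return 0;
}

static int generic_carrier_raised(struct port *p)
{
    return p->carrier;
}

/* Dispatch helpers: prefer the driver's hook, else fall back. */
int port_open(struct port *p)
{
    assert(p != NULL);
    if (p->ops && p->ops->open)
        return p->ops->open(p);
    return generic_open(p);
}

int port_carrier_raised(struct port *p)
{
    assert(p != NULL);
    if (p->ops && p->ops->carrier_raised)
        return p->ops->carrier_raised(p);
    return generic_carrier_raised(p);
}

/* A made-up driver that overrides carrier detection only and
 * inherits the generic open logic. */
static int board_carrier_raised(struct port *p)
{
    (void)p;
    return 1;  /* this imaginary board always reports carrier */
}

const struct port_ops board_ops = {
    .open = NULL,
    .carrier_raised = board_carrier_raised,
};
```

A port with ops set to NULL gets the generic behaviour throughout, while a port pointing at board_ops keeps the generic open bookkeeping but reports carrier through its own hook.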

End Of Thread.

3. Encryption In The Kernel: Saga Continues

1 Feb 2000 - 11 Feb 2000 (23 posts) Archive Link: "Encrypted File systems implementation into the kernel?"

Topics: FS: ReiserFS

People: H. Peter Anvin, Michael H. Warfield, Hans Reiser, Linus Torvalds

This was last covered in Issue #53, Section #3 (14 Jan 2000: US Crypto Laws), and before that in Issue #52, Section #6 (11 Jan 2000: Relaxing Of US Crypto Laws). This time, someone gave a pointer to the sources and libraries for the 3DES and RC5 modules (http://tcfs.dia.unisa.it/) and suggested adding encrypted filesystems to the kernel. Robert de Bath replied bitterly that the US and French governments had to change their laws first, but H. Peter Anvin explained, "Actually, the French government got a clue. The U.S. government got half of one. It looks like it might be enough (just *barely*) to make it happen." Michael H. Warfield went into more detail:

As of January 14, the US export restrictions on cryptopgraphy were relaxed (outside of one minor reporting irritation which does NOT affect Open Source Software - only commercial software). We are in a 120 day "comment period" but the changes in the regulations themselves are in effect, NOW. The regulations on Open Source Software SOURCES was almost totally relaxed with no real restrictions on download sites and no reporting requirements (please note emphasis on SOURCES - binaries are still restricted somewhat). The policy at kernel.org has now changed to allow cryptography and they are in the process of making crypto available from their sites and mirrors. One gotcha was the loss of one or more mirror sites that reside in the T-7 countries (7 countries listed as restricted due to Terrorist activities) because the kernel.org gang do want to include some binaries on the sites. I know we lost at least one.

The French relaxed their restrictions on possession of cryptography last year some time and even the crypto law survey site has that updated. There are still restrictions in Russia, China, and a few other countries, but we can't be playing to everyone elses' lowest common denominator (plus we want to provide strong encouragement and incentive to lower those regulations as well).

He went on to describe a meeting with Linus Torvalds:

I spoke directly with Linus a couple of days ago at LinuxWorld Expo in New York. I specifically wanted to get his views on where he saw crypto in the kernel proceeding, both from an implimentation standpoint and from a time-frame standpoint. I don't want to "speak for the man" but he told me that he wants to move forward on crypto in the kernel but he wants to move slowly as things play out.

That gave me the impression that it is unlikely that we will see hardened crypto integrated into the kernel sources in the early 2.4 releases.

Hans Reiser replied, "If you need help from the ReiserFS/Namesys team, just let us know how best to assist, and we can put what hooks you need into our next major version."

4. e2fs Compression In The Main Kernel?

1 Feb 2000 - 11 Feb 2000 (46 posts) Archive Link: "2.4 Features"

Topics: Compression, FS: NFS, FS: ext2, Virtual Memory

People: Peter Moulder, Stephen C. Tweedie, Alan Cox, Rik van Riel, Riley Williams

Pasi Korkkoinen asked for a list of new features in 2.4; specifically large filesize support on 32 bit machines, 32 bit uid/gid support, and NFSv3 support. Rik van Riel suggested searching the archives or going to Kernel Traffic (http://kt.zork.net). Stephen C. Tweedie also replied to Pasi, saying that all three features were in 2.3, and would be in 2.4 as well. Riley Williams replied, asking if the e2fs-compressed patch would make it into 2.4 or (preferably) 2.2; Stephen replied that it was probably too late for 2.4.0, but that it might go in later if someone took responsibility for it. He asked if e2compr was still being maintained, and if there was a robust 2.3 version.

Peter Moulder replied, "I'm the current maintainer of e2compr. I continue to work on it, but I've announced my desire that someone else take on `maintainer' title, as for quite some time I've been busy with other things." As far as whether there was a stable 2.3 version out, he added, "Nope. The change to Linux' write mechanism in early 2.3 (namely having writes go through writepage instead of having ext2_file_write create disk buffers) means that there's no longer a convenient place to do the compression (in the most common case, where we wish to compress more than a page full at a time). Probably the fix is to organise for multiple pages to be written at once. It just requires that someone spend a chunk of time getting familiar with how pages usually get written out, and the locking issues and so on."

He asked Stephen's opinion of that plan, and Stephen replied that it would not be difficult to do in 2.3; Stephen explained, "The 2.3 VM doesn't force you to use generic_file_write(). grab_cache_page() and read_cache_page() can be used by other filesystems to access the page cache, and you can use that in your own file_write function to do any sort of transformations you want during the write() system call. You don't want to use generic_file_write(), as that assumes that the disk blocks we're using for IO correspond exactly to the contents of the page cache. That doesn't prevent you from using your own file_write function to update the page cache and then do any other write-through you want to the buffer cache." This made sense to Peter, who thanked him.

Elsewhere, Riley had said he wasn't sure about the stability of the 2.3 version, but the 2.2 version seemed rock solid to him. He also gave a pointer to the e2compr (http://opensource.captech.com/e2compr) home page. Stephen replied, "We still get problem reports trickling in for e2compr. It doesn't appear to be quite 100% solid just yet." Riley offered to help fix the code if he could get some of the problem reports. Stephen didn't have any reports on hand, but thought Alan Cox might. He added, "One of the problems is that nobody, to my knowledge, has done any serious sustained stress testing on e2compr." Alan replied, "I've been filing e2compr problems that went away when it was disabled in the 'not mainstream, someone elses problem' section (the folder accessed rapidly by the 'd' key). Sorry"

At this point Riley volunteered to be co-maintainer with Peter, with the possibility of taking over completely at some future time. He also asked Alan to bounce any further problem reports over to him. He was also interested in stress testing the code along the lines of Stephen's suggestion, but he wasn't sure exactly what that would entail. Peter replied that Riley was welcome to be co-maintainer.

At this point the discussion skewed off into an unrelated argument between different people.

5. Makefile Cleanup; Module Init Order

6 Feb 2000 - 9 Feb 2000 (21 posts) Archive Link: "SCSI Makefile cleanup"

Topics: Executable File Format

People: Alan Cox, Peter Samuelson, Matthew Wilcox, Eric Youngdale, Jeff Garzik, David Parsons, Michael Elizabeth Chastain

Matthew Wilcox posted a patch to drivers/scsi/Makefile, to make it use lists instead of conditionals; and attributed the idea to Michael Elizabeth Chastain. Alan Cox warned him not to change the linking order, explaining, "if you re-order the object files (eg sorting them) then you change the init order, and you end up initialising boards in the wrong order, causing AHA1542 to grab buslogic cards etc." Matthew replied that although he hadn't reordered the lines in the file, he had used a 'sort' to remove duplicates. He didn't think removing them was all that important, and asked how the patch looked without the 'sort'. Alan said fine, and Peter Samuelson suggested splitting out the three drivers (wd33c93.o, 53c7xx.o, NCR53C9x.o) that could be duplicated, from everything else. He added, "A hack? Yes, horrible. But I would still like it better than the current makefile." But Matthew replied, "i really don't think it's worth it. it was only a minor optimisation i put in because it was essentially free."

Eric Youngdale also replied to Alan's admonition, saying that currently, the init order was not controlled by the link order, but by the order of their appearance in hosts.c; but he added, "I was kind of thinking that a better way of doing this is to use a special ELF section that contains the initializer. Then hosts.c would just walk the list until it got to whatever was the end - in this case, the link order would become quite critical." Alan replied, "That facility already exists and I believe Jeff Garzik has scsi on his current hit list for it. So the order will matter real soon now." Jeff confirmed, "Yep. After I update a bunch of drivers there will be another initcall bombing run." Matthew suggested that Jeff apply his Makefile patch and test everything at once (to which Jeff agreed, though he couldn't do it immediately); then asked, "Any suggestions on which Makefile to tackle next? I was planning on romping through the rest of the drivers/ directory, maybe drivers/block next." Jeff replied, "I already sent Linus a pending patch against 2.3.42 which converts drivers/char, and drivers/net is already done. Unless there are problems, I hope that will appear after the LinuxWorld dust settles on Linus' desk. :) drivers/block would be good, though step lightly as some of the dependencies in there have teeth, IIRC. :) And there are a ton of sub-directories inside the major driver directories which are worth evaluating."
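
To make the ordering concern concrete, here is a small user-space C model of initializers being walked in a fixed order, where the first probe to recognise a board claims it. It is only a sketch: the struct, the enum and the two probe functions are hypothetical stand-ins, and the plain arrays stand in for whatever order hosts.c or the linker would actually produce. The one fact taken from the thread is Alan's remark that an AHA1542 probe can grab a BusLogic card.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of which driver ended up owning a board. */
enum owner { OWNER_NONE, OWNER_AHA1542, OWNER_BUSLOGIC };

struct board {
    int answers_aha1542_probe;   /* nonzero if an AHA1542-style probe succeeds */
    int answers_buslogic_probe;  /* nonzero if a BusLogic probe succeeds */
    enum owner claimed_by;
};

typedef void (*initcall_fn)(struct board *b);

/* Each initializer claims the board only if nobody has claimed it yet. */
static void aha1542_init(struct board *b)
{
    if (b->claimed_by == OWNER_NONE && b->answers_aha1542_probe)
        b->claimed_by = OWNER_AHA1542;
}

static void buslogic_init(struct board *b)
{
    if (b->claimed_by == OWNER_NONE && b->answers_buslogic_probe)
        b->claimed_by = OWNER_BUSLOGIC;
}

/* Two candidate orders, standing in for two different link orders. */
initcall_fn buslogic_first[] = { buslogic_init, aha1542_init };
initcall_fn aha1542_first[]  = { aha1542_init, buslogic_init };

/* Walk the table front to back, as an initcall list would be walked. */
enum owner run_initcalls(initcall_fn *calls, size_t n, struct board *b)
{
    assert(calls != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        calls[i](b);
    return b->claimed_by;
}
```

For a board that answers both probes, running buslogic_first leaves it owned by the BusLogic driver, while aha1542_first hands it to the AHA1542 driver. Sorting the object list could cause exactly that kind of silent change.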

Elsewhere, David Parsons suggested adding a "requirements function call to the kernel, so you can have drivers explicitly load what they depend on before they start loading." He envisioned something like need("aha1542"); in the buslogic driver, for example. Jeff replied, "I imagine "need()" would have to be some sort of trigger which ld (or a pre-ld script) uses to figure out the link order for the static kernel image?" David liked this idea better than what he'd been thinking about, which was more like a kernel registry.
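
One way to picture what a pre-link script might do with such need() declarations is a depth-first dependency sort. The C sketch below is purely illustrative: nothing like it was posted in the thread, the struct driver layout and resolve_order() are invented for the example, and each driver is limited to a single dependency to keep it short. It reuses David's example of the buslogic driver needing aha1542.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DRIVERS 8

/* Hypothetical record of one driver and the driver it declared
 * with need(); NULL means it has no dependency. */
struct driver {
    const char *name;
    const char *needs;
};

/* Return the index of the named driver, or -1 if it is unknown. */
static int find_driver(const struct driver *d, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(d[i].name, name) == 0)
            return (int)i;
    return -1;
}

/* state[i]: 0 = unvisited, 1 = being visited, 2 = already placed. */
static int visit(const struct driver *d, size_t n, size_t i,
                 int *state, size_t *order, size_t *count)
{
    if (state[i] == 2)
        return 0;
    if (state[i] == 1)
        return -1;                      /* dependency cycle */
    state[i] = 1;
    if (d[i].needs) {
        int dep = find_driver(d, n, d[i].needs);
        if (dep < 0)
            return -1;                  /* need() names an unknown driver */
        if (visit(d, n, (size_t)dep, state, order, count) < 0)
            return -1;
    }
    state[i] = 2;
    order[(*count)++] = i;              /* place after its dependency */
    return 0;
}

/* Fill order[] with driver indices so that every driver comes after
 * the one it needs. Returns 0 on success, -1 on a cycle or a missing
 * dependency. */
int resolve_order(const struct driver *d, size_t n, size_t *order)
{
    int state[MAX_DRIVERS] = {0};
    size_t count = 0;

    assert(n <= MAX_DRIVERS);
    for (size_t i = 0; i < n; i++)
        if (visit(d, n, i, state, order, &count) < 0)
            return -1;
    return 0;
}
```

Given the list buslogic (needs aha1542), aha1542, sd, the resulting order is aha1542, buslogic, sd. Drivers without dependencies keep their listed position relative to one another, so an existing hand-tuned order is disturbed as little as possible.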

6. Character I/O Problems With SMP In Stable Series

7 Feb 2000 - 8 Feb 2000 (11 posts) Archive Link: "PPP is not SMP safe in 2.2.X, Oops"

Topics: Networking, SMP

People: Mitchell Blank Jr, Rik van Riel, Alan Cox, Oleg Drokin

Oleg Drokin felt that PPP was not SMP safe in 2.2.x; he posted a decoded oops, and Mitchell Blank Jr replied, "Hmmm... looks like you're right. Someone is stepping on ppp->tpkt in between the time ppp_tty_push called ppp_async_encode and it was done with the packet." There was a bit of discussion about the problem, and Rik van Riel added, "The tty code in 2.2 (and 2.3) isn't SMP safe either. SMP Linux seems to have some serious problems performing character I/O :(" Alan Cox replied, "I hope to have a cleaned up version of A J Kroll's patch in by the final 2.2.15. That fixes the main known tty race."

7. Development Process And Corporate Politics

7 Feb 2000 - 11 Feb 2000 (42 posts) Archive Link: "Source Code Release of NWFS 2.0 for 2.2/2.3/2.4"

Topics: Clustering, Disk Arrays: RAID, FS: ext2, Microsoft, Patents

People: Jeff V. Merkey, Gregory Maxwell, Taso Hatzi, Jeff Garzik, Matthew Wilcox, Erik Andersen, Alan Cox, Lawrence Walton, J. Scott Kasten, Matthew Kirkwood, Pavel Machek, Theodore Y. Ts'o, Rik van Riel, Linus Torvalds

Jeff V. Merkey said:

I think I really don't like Pavel Machek, Ingo Molinar, and Al Viro's shitty attitudes towards us and TRG in general (and that they completely ignore emails and requests for help in areas they claim to own). If you guys want us to feel like a part of what you are doing, then I think out of courtesy, when we ask questions, they should get answered. We are more than happy to **NOT** publish any more source code for general use by the Linux Community if our reward for doing so is to be shit on by a some of the Linux career politicians just because our views are different or that we come at Linux from different angles.

The "partisan" commercial politics (since a lot of these folks have gotten sucked up into RedHat, Suse, etc.) are preventing folks from getting help or participating in the process of developing Linux, and WILL HURT LINUX AND FRAGMENT THE MARKET. I think all the RedHat stock and Suse stock everyone now has has gone to some peoples heads. The partisan politics are destructive to Linux in general, and if RedHat employees are going to play them, then so can everyone else, including us.

I do appreciate your help and have enjoyed our dialogue, but hey, there has to be something in it for us too, either in the form of collaboration, technology adoption, or $$$. To date, less your help, I have not seen it from the other Linux folks except Linus, and Anvin, both who try very hard to stay above the petty BS with the commerical Linux crap.

I guess what I am trying to say is that if I keep seeing a shift towards partisan RedHat politics in terms of who gets to contribute and getting included in the process of what's going on, we can just as easily act like everyone else, and be "greedy" too, an withhold our stuff.

I think the next time one of my requests is ignored, we will simply not publish any more source code, and just let CALDERA customers get access to binaries. With what we get from our Microsoft Windows NT/2000 customers, why should we tolerate this type of crap when we aren't getting what we expect out of our hard work on Linux.

There were a lot of individual replies to this that did not become threads. Gregory Maxwell said, "I almost split a gut here. "Feel a part of the community" coming out of the guy whos earlier (first?) l-k posts said thing to the effect of 'The GPL is meaningless, we plan on ignoring it'."

Taso Hatzi also replied to Jeff, "Get to the point. If you have a specific gripe then let's hear it. If you just want to generally complain, email the individuals. If you have some good s/w you want to let out that's great. If you don't want to let it out, that's your choice. People choose to live in Linuxland because they see a benefit to themselves. If you don't like the place you are free to leave."

Jeff Garzik also replied to Jeff M.:

First, regardless of any real or imagined "partisan attitudes" I have yet to see a good implementation of a good idea rejected.

Second, and more importantly, having 'redhat.com' (or mandrakesoft.com that for matter) in your e-mail address does not automatically imply that that 'helping Jeff Merkey' is included in the job description. Many kernel hackers (a) do what they want, and (b) have friggin' HUGE mailboxes.

If you have a specific technical issue, I am sure we are more than happy to discuss it in public on lkml. Sure maintainers are sometimes slow to respond, or don't respond at all. Just keep trying, and keep civil about it.

So chill out... the harder you push, the more doors you are closing for yourself.

Matthew Wilcox also replied to Jeff M., "Jeff, please refrain from such accusations. Having just spent the past three days with two Red Hat employees, one VA employee and two LinuxCare employees working on some VFS and ext2 projects, I can assure you that partisan politics are not involved in the development process. Email sometimes gets dropped. Live with it."

Erik Andersen also replied to Jeff M., "I've never had a piece of code rejected due to my race, gender, religion, sexual preference, bank account contents, or the company where I am employed. I _have_ however had code rejected for being broken, poorly documented, insufficiently granular, and I've even had code rejected for being well implemented but philosophically wrong. The philosophically wrong stuff is harder emotionally, but after a bit of a flame fest where everybody explains to everyone else why the code is or is not wrong, the right thing happens. I'm afraid I've failed to see where Alan and/or Linus played politics."

Alan Cox also replied to Jeff M., starting a longer thread. He said, "I can't find a contract between any of the linux kernel people and you for support. On the basis of your ridiculous allegations I regret it would be unreasonable to continue to work with you on anything NWFS related. We have no 'deal' about tools as you claim either. Please direct all your 2.2.x discussions directly to Linus since I would hate to risk them being touched by some potentially impartial body." And to Jeff's signature, "Your friend," Alan replied, "Ex..." Jeff replied:

Contract? What the hell is this? I have been observing what goes on here now for almost 18 months, and from my vantage point, it's clear that RedHat just sits at the mouth of each womb ready to devour every baby that is born like a pack of hungry, ravaging wolves. Everyone tries their best ot put on a happy face while they knife each other in the back and plot and scheme ways to rip each other off, and shaft each other in the Linux Community.

This is why Linux will ***ALWAYS*** be inferior to Windows NT/2000. It can only be as good as the people who write it, and when they're second rate unix hackers, that's the ceiling on the quality level of the effort. Anyone who suggest any direction that's not understandble by you "gods" gets ignored, knifed, bad mouthed, or character assisinated, and if what's proposed is not undergrad unix computer science, you guys don't seem to understand it, or care (and you show a definite unwillingness to even try).

One good example is the VFS in Linux. EVERY release, you guys break something or there is MASSIVE file system corruption, or memory corruption, or some other catastrophe that takes days to sort out. Commercial OS vendors never tolerate this lack of quality/compatibility. I'm sorry if you are offended, and i withdraw the allegations (man did I get your dander up -- jeeeeez), but we are spending money on developing on Linux, and the obvious lack of COURTESY, PROFESSIONALISM, and QUALITY increases support effort (I have to rewrite the VFS interface EVERYTIME you post a new kernel. You guys are constantly BREAKING stuff and LEAVING IT BROKEN and inflicting your laziness and bugs on the entire planet. If a Microsoft engineer (or Novell engineer) operated at this level of quality, they would have their work heavily scrutinized.

And get over it, would you.

There was another batch of single replies that did not become threads. Lawrence Walton said, "Jeff, been a fan of yours, and have read many of your email's on LK, but this last email smacks of basic misunderstanding of Linux in general, and seems rather self-defeating. Until very recently no one made money coding Linux kernel, it's been a club, a "gee thats fun, I did not know I could do that" sort thing, a hobby, like it or not MOST people still think of it that way. If your trying just to make money, (I rather hope your not) you have to play that game. If your not; start thinking about what you just said, it basically bad mouths everyone, people of different skill sets, backgrounds and cultures. Not just Alan Cox. Think about it."

J. Scott Kasten also replied to Jeff M., saying that Jeff M.'s flames were not appropriate on the list. In response to Jeff M.'s argument, he added, "Perhaps you've never worked in a real Operating System development environment before. Things get broken all the time in coporations as the developers grow their products into the next official product release. The difference is that customers don't see that process because it's all internal. At my company, day to day firmware builds may or may not even work. That's what the development kernels are, more or less frequent builds where ideas are tried. The official releases are the even numbered kernels, and they do as a rule work solidly with the same quality as any commercial end product release would. If your worried about how it affects your development, then you have the same opptions you would in any commercial environment. Either develop against an official release, or stick with a particular development snapshot that works for what you need to do in the near term and wory about integrating it into the mainstream when the right time comes."

Taso replied as well, "That statement is unfair and silly. Don't be duped by the slick veneer MS marketing put on (often) crufty merchandise. Linux's dirty linen is hanging out there for all the world to see. MS's dirty linen is stashed away in a locked cupboard. They only show you what they want you to see, and only tell you what they think you want to hear. Where's the NT-kernel development newsgroup?"

Matthew Kirkwood also replied to Jeff M., "My shame at responding to your post is surpassed only by my rage at your abuse." In response to Jeff M.'s assertions about the quality of Linux developers, he went on, "No. There is a philosophy surrounding Linux and Unix, and your refusal even to try to appreciate it loses you the ears of many here who would otherwise offer useful help and insight." To Jeff M.'s remarks about getting people's dander up, Matthew replied, "Alan is a very nice guy, with basically infinite patience for those who are prepared usefully to help themselves. Many of the others on the list are the same." To Jeff M.'s complaints about each new release breaking important things, Matthew replied, "We don't have to get everything right the first time. Once it exists, people use it. Then we make it better. Think of it as pipelining." To Jeff M.'s assertion that a Microsoft engineer would not be able to get away with such low quality work, Matthew replied, "So *fuck off*. Many people might like to use your nwfs, but when you persist in so fundamentally and rudely misunderstanding Linux development, you repel them." And finally, to Jeff M.'s "get over it" comment, Matthew replied, "How *dare* you? You posted a bizarre rant attacking Alan for some perceived injustice inflicted upon you by his employers. You may have interest in Linux. You may release bits of source code. But you patently have no interest in the development of Linux, so you have no place here. Please desist from posting your vitriol to this list."

A brief thread came out of Pavel Machek's reply to Jeff M.; Pavel said, "What are you? Every second post from you is gem. You want to develop striping and do not know what raid is. Then you want to sue (forgot who) because he said something (and post Cc: of flames to l-k). Then you start attacking Alan, Alexander and everyone else. In the next mail you write that linux will be always inferior to WNT (and attack RedHat, just btw). Are you a elisa-like software trying to create biggest flames possible on l-k?" Jeff M. replied, "At least your talking to me again now -- this is progress. Nobody ever attacked anything, I asked questions about your stuff, you guys take everything so personal -- it's not personal -- it's just business. I do thank you for finally responding. Had you responded to my requests in November regaring the raid driver rather than ignore me because you didn't like the way we approached the problem, we would not have had to write our own buffer cache (we could have simply collaborated on adding what we needed to your RAID drivers which was my preference). You cost me an additional $250,000 in salaries to engineers to do this. It would have been much simpler to just help us use your stuff (which is very good by the way)." Theodore Y. Ts'o replied, "If calling the entire linux kernel development community "second-rate Unix hackers" is just business, and not something to be taken personally, remind me never to invest in your company..... it's certainly not smart business, at any rate." He added, "If Ray Noorda is really funding this character, someone with his e-mail address (perhaps at Caldera) should forward some of the e-mail on this thread to him. He should know where his money is going."

Jeff M. replied, "sticks and stones may break my bones, but words will never hurt me." He added, "Some HUGE egos out here. I guess I'd better learn to duck," and went on, "by the way, we don't need your money for investment. Unlike most of the Linux companies, TRG actually makes PROFIT because we are SMART enough to sell Windows NT/2000 software (which makes money, unlike Linux where only hardware vendors and vertical app writers can make money)."

In a small subthread, Rik van Riel replied to Jeff M. on a technical level:

Your stuff seems to have solved some of the problems that the current Linux code hasn't solved yet. Maybe it would be a good idea if you worked closer together with the people who are implementing RAID and clustering for Linux?

With that I don't just mean that you keep us up to date and we'll have the chance to look at your stuff, but also that you keep an eye on our developments and help plan the future of the subsystems you depend on.

When you help design the future, not only will you face less unpleasant surprises, but the code will also be closer to what you want it to be and we can anticipate on your needs (instead of receiving a flame afterwards). Also, your view of the matter might have improved Linux...

Sure, doing all that might cost you quite a bit of money, but it would have avoided the $250.000 bill of duplicated effort, the open source community could have handed you some valuable ideas that could have saved you even more work or improved your product -- for free.

That the open source world works differently is not at all a matter of "get over it".

It's a matter of adapting to the environment and using it to your advantage...

Jeff M. replied:

If someone owns an area, the SMART thing is to work through that person so we don't duplicate effort. What Pavel already has is 98% of what we need, and I'd rather let him own this area. If he doesn't want to that's his call, but it makes more work, costs more money, and takes more time.

If you want to mediate my projects, then perhaps I should coordinate through you. I need someone to look at our stuff (who is competent with the RAID stuff, and tell us whether it would make more sense to use what's already there(my first choice). We can also try to junk up our brain cells pouring over Pavel's Code, but why should we waste brain cells storing information we won't use after it's implemented? Particularly when we have a STUDLY expert like Pavel in this area -- and who knows, maybe he'll learn something from seeing how someone else does mirroring and create something even better down the road for Linux?

There was a bit more argument elsewhere.

(ed. I've read a number of threads involving Jeff M., and I still don't know quite what to make of him. Although intelligent and knowledgeable, it seems he is very much entrenched in a certain pre-Open-Source way of thought; at the same time, he is not above hurling threats and insults, and overreacting in very obtuse ways to others who behave similarly. On the other hand, big-time hackers like Rik van Riel seem to have some respect for his technical work, so it's impossible to dismiss him altogether. In any event, he is a prolific poster to linux-kernel, and is apparently here to stay. Maybe in time he'll come to understand Linus Torvalds' discoveries well enough to participate in them. --Ed)

8. Standards Of Behavior In linux-kernel

10 Feb 2000 - 12 Feb 2000 (26 posts) Archive Link: "Standards of Conduct"

People: Karen Shaeffer, Nathan Hand, Lauri Tischler

Karen Shaeffer proposed:

I really do not care about the colorful history of the linux-kernel mailing list and respect the right of individuals to speak their mind--without constraint, when they are conversing one-on-one either in person or via email. But considering the rapid adoption of Linux by global corporations and the connection of this list to the work environment of peoples of all types and background (and including women) as a side effect of that adoption, it is imperative that some minimum set of Standards of Conduct be enforced on this list.

I would urge those controlling this list to take responsibility and put all list subscribers on notice that certain behaviors can no longer be tolerated. It is clearly in the best interest of Linux as we move forward.

I hope I haven't offended anyone by suggesting that we, as a community, can expect a minimum standard of conduct to be observed on our most important mailing list. Think about it.

Mathijs Mohlmann referred to the FAQ, to the question, "Are there any implicit rules on this list that I should be aware of?" He quoted part of the answer, "Be nice, there is no need to be rude. Avoid expressions that may be interpreted as aggressive towards other list participants, even if the subject being treated is particularly relevant to you and/or controversial."

Phillip Ezolt suggested, perhaps facetiously, forming a "linux-conduct" mailing list to discuss such issues. Nathan Hand also replied to Karen, saying, "You would further the freedom of software by censoring the freedoms of programmers? Sounds weird to me." Lauri Tischler also replied to Karen, describing her as an "Apparently narrowminded bigot trying to push her morals to everybody." To this, Karen replied:

Actually, I'm not a bigot and accept people for who they are. I've also learned in life that running around insulting people never leads to positive outcomes. The main intent of my post was concerned with treating others with civility.

I've received a significant number of positive and thoughtful private emails due to my original post. In retrospect, I should not have spoken my mind on this matter. Each individual can draw their own conclusions about the conduct of others and are free to /dev/null them or filter their mail or just ignore them. Enough said, I won't be discussing this matter on this list in the future.

9. devfs With reiserfs

11 Feb 2000 (2 posts) Archive Link: "devfs + reiserfs?"

Topics: Disk Arrays: RAID, FS: ReiserFS, FS: devfs

People: Richard Gooch

Martin Maciaszek noticed that the devfs patch wouldn't apply cleanly to a kernel that had already been patched for reiserfs. Richard Gooch replied, "Depending on how bad the breakage is, you could try producing a devfs-for-reiserfs patch yourself. If you whip up a working patch, I'm happy to plonk it onto my ftp site and announce it. I'll even work with you to ease ongoing maintenance. It's been done before with the RAID patches." End Of Thread.

10. Running User-Space Helper Programs

12 Feb 2000 - 13 Feb 2000 (4 posts) Archive Link: "Exec'ing user space helper programs"

Topics: Hot-Plugging, Networking

People: Thomas Sailer, Rik van Riel, Linus Torvalds, Chris Proctor

Thomas Sailer explained:

kmod calls a user mode helper program (modprobe) to do the work. I find this concept generally useful, I have a device that requires complex initialization, which is currently done in userspace with a helper program. I'd like to keep it that way.

Exec'ing a user mode program however is not so straigtforward, everytime I look into kmod.c the code gets more complex 8-) Up to now I just copied the code from kmod.c into my driver, but the code duplication starts to get annoying, furthermore a couple of routines (free_uid and flush_signal_handlers) are not even exported.

So I'd like to export the exec-facility of kmod.

The following is a patch to do this.

Rik van Riel replied, "Sounds like a winning idea. I can imagine that we want to do exactly the same when we want to implement per-user resource limits (the beancounter stuff)." And Linus Torvalds also replied to Thomas, "Looks good. This is something we'll probably need for other things: hot-plug devices need to inform user space about [dis-]appearance etc. --Applied." But Chris Proctor replied to Linus, "In 2.3.44 this change causes one of the ppp-* files to not compile if CONFIG_KMOD is not defined." He posted a patch to fix it, but there was no reply.
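As a rough user-space analogy to the helper-program idea Thomas describes, the sketch below starts an external helper with a small fixed environment and collects its exit status. It is not the kernel patch and does not use any kernel interface; the function name `run_helper` and the environment values are assumptions made purely for illustration.

```python
import subprocess

# Assumed minimal environment for the helper; these values are illustrative only.
HELPER_ENV = {
    "HOME": "/",
    "TERM": "linux",
    "PATH": "/sbin:/usr/sbin:/bin:/usr/bin",
}


def run_helper(argv, timeout=30):
    """Run a helper program and return its exit status.

    argv is the full argument vector, e.g. ["/sbin/some-helper", "arg"].
    The helper gets a fixed environment, "/" as working directory, and
    no inherited standard streams.
    """
    completed = subprocess.run(
        argv,
        env=HELPER_ENV,
        cwd="/",
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
        check=False,  # report the status to the caller instead of raising
    )
    return completed.returncode
```

The caller decides what a non-zero status means, much as a driver would have to decide what to do when its initialization helper fails.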

11. Bug Hunt In Unstable Series

13 Feb 2000 - 14 Feb 2000 (16 posts) Archive Link: "2.3.44 bug"

Topics: FS: NFS, FS: ext2, SMP

People: Harold Oga, Rik van Riel, David Wragg, Manfred Spraul, Garst R. Reese, Tigran Aivazian, Andrea Arcangeli, Arjan van de Ven, Brendan Cully, Alexander Viro

Tigran Aivazian reported that in 2.3.44 on his dual PIII, he couldn't start any large program such as X, and 'gcc' would randomly die with a SIGSEGV. David Wragg confirmed the problem, though different programs were failing for him. Brendan Cully also saw the problem, again with different programs. Garst R. Reese had no problems on his UP machine, though he was using the same kernel. Arjan van de Ven also saw the problem on his dual Celeron. Alexander Viro saw no problem on his UP box.

Harold Oga had no problems on his Dual Celeron 400, and reported that for him, 2.3.44 was the best kernel he'd seen in the unstable series. But he replied to himself, "Looks like fat is broken on 2.3.44 though, as all the kernel tarballs I have stored an a zip disk are giving me errors with unexpected EOF, whereas 2.3.42 and 2.2.14 say the same files is ok." Alexander replied that this had been broken in 2.3.43, and posted a one-line patch to fix it.

Elsewhere, Rik van Riel also replied to Tigran's initial report, saying that 2.3.42 and 2.3.43 also had the same problem. He explained, "It's a typical SMP problem, supposedly caused by bugs in the TLB flush code." Kernel 2.3.37 was the most recent kernel he'd tried, where the error did not show up. Rik posted an exploit:

for i in `seq 50` ; do md5sum <file_bigger_than_ram> ; done

and explained, "Over NFS this works, gives the same (correct) md5 sum every time. Now I copy this file to my ext2 /tmp partition and try again. About one in five md5sums is incorrect ... but the file _write_ has always been correct." Manfred Spraul replied that the Translation Look-aside Buffer flush code only changed in 2.3.43 and 2.3.44, so if Rik's exploit worked in 2.3.42 as well, the cause could not be those changes. He suggested that some patches from Andrea Arcangeli might have been broken, but David replied that he couldn't reproduce the bug in 2.3.42; he also asked, "But wouldn't TLB problems be intermittent, and tend to affect everything rather than very speific programs? When I see a program SIGSEGV unusually (sometimes it's SIGILL or SIGBUS), and md5sum on the relevant binary, then reboot into a plain old 2.2 kernel, then md5sum on the binary again, I get a different result! Unless the TLB changes in 43 could plausibly cause these symptoms, aren't the changes to fs/buffer.c a more likely culprit?"
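For readers who want to count mismatches rather than eyeball fifty checksums, Rik's loop can be sketched in Python roughly as follows. The function names are hypothetical, and the sketch only detects differing results; as in the original test, the file would need to be larger than RAM so that each pass really re-reads it from disk instead of from the page cache.

```python
import hashlib
from collections import Counter


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_md5(path, runs=50):
    """Checksum the same file `runs` times and count each distinct digest."""
    return Counter(md5_of_file(path) for _ in range(runs))


def is_consistent(counts):
    """True when every run produced the same digest."""
    return len(counts) == 1
```

On a healthy read path `repeated_md5` returns a single digest seen `runs` times; more than one key in the result would correspond to the "about one in five" failures Rik describes.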

Rik amended his post, saying he might have first noticed the behavior in a 2.3.43 pre-patch; but he agreed that fs/buffer.c might be the culprit, perhaps in combination with TLB problems. He went on:

A repeated md5sum on a big file over NFS always gives the correct result, ext2fs and isofs don't always work...

This suggests a slight bug in the read path. Did anyone do something suspicious with that code? :)

There was no reply.

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.