Kernel Traffic

Kernel Traffic #43 For 15 Nov 1999

By Zack Brown


This week I'd like to highlight a new study being conducted on the Linux development process. Essentially, their plan consists first of developing a questionnaire via group participation in a mailing list; next, presenting the questionnaire on the internet so people can write in their answers; and finally, making the raw data and their own findings available to the public.

In some ways this experiment itself is an Open Source project, so the scientists may find themselves filling out their own questionnaire. I hope they preserve the mailing list archives.

I suppose I should say something about the Microsoft verdict, but it's a little noisy here because of all the popping champagne corks. Hey guys! Hold it down! Tom, take that lampshade off your head!

Can't take them anywhere. ;-)

Oh yeah, Microsoft. Well, it's been a tremendous break for Linux that MS has had to restrict itself to largely conventional warfare all this time. Who knows what (metaphorical) busses might have been lurking in the Transmeta parking lot on a dark and stormy night if they'd had a free hand? And now that MS will be caught up in appeals for the next few years or more, the future of Linux looks very bright.

Which is not to say there are no obstacles to be overcome. In the economic world, as more and more Linux companies start gaining power, competition between some of them will undoubtedly get ugly. Hopefully the best of them will strive to remember the ideals that made Linux and Open Source possible, and will emulate those ideals both in their dealings with the community and within their own corporate structure as well.

In the technical world as well there is much to be done. The craze over the Linux graphical desktop continues to make Linux a viable home system; at the same time Linux desktops remain embryonic in, among other things, the area of word processing and desktop publishing. Wine and VMWare have managed to cover up this problem to some extent, but they still can't improve the quality of the Windows packages they allow us to run. Microsoft Office will always be terribly bloated, buggy, susceptible to macro viruses, and a slave to a binary file format designed specifically to be incompatible with competing software. I look forward to a day when Wine, VMWare, and DOSEMU will merely be cute toys, as the TRS-80 emulator is today. For now, however, we are still dependent upon them for some of the larger apps.

The kernel as well is of course in a constant state of development, but there are perennial issues that keep coming up. Kernel code is still not fully compatible with the latest C compilers, and Linus Torvalds still recommends a compiler that is years old by now (GCC 2.7.2). The kernel source is gradually being updated for more recent compilers, but for now the problem remains. In addition, as we move closer to a new stable series, the last stable series still has not fully stabilized. In particular, a file corruption bug has been biting people for quite a while now, and seems no closer to being found than when it was first reported, in spite of determined hunting by big-time hackers like Alan Cox.

In spite of these and other problems, there is much to rejoice at, not least of which is everyone's ability to openly discuss the things that still need work. Unlike Microsoft and other members of the competitive software industry, we don't have to put a rosy face on everything. We don't have to say, "Linux is perfect," and we don't have to say, "Death to FreeBSD. Death to the Hurd." We can acknowledge the problems, and we can affirm the alternatives. And of course, we can continue to experience uptimes measured in months.

Mailing List Stats For This Week

We looked at 1038 posts in 3866K.

There were 375 different contributors. 168 posted more than once. 136 posted last week too.

The top posters of the week were:

1. /proc/pci Confusion

25 Oct 1999 - 2 Nov 1999 (49 posts) Archive Link: "/proc/pci unknown devices"

Topics: Hot-Plugging, PCI, Small Systems

People: Dan Hollis, Gerard Roudier, Theodore Y. Ts'o, Miquel van Smoorenburg, Martin Mares, David Woodhouse, Bret Indrelee, Dave Jones, Jes Sorensen, Jeff Garzik, Linus Torvalds

All but about 4 posts in this discussion took place within the span of a single day. This accounts for the form of the discussion, which was a little repetitive I think. Keith Duthie started it off by reporting on some PCI devices that his system refused to recognize.

Keith had included a copy of his /proc/pci in his post, and Dan Hollis replied that /proc/pci was being dropped in future kernels, and that Keith should use 'lspci' instead. But Jeff Garzik quizzically pointed out that on the contrary, /proc/pci had just been made non-optional in the 2.3.x series.

Dave Jones seemed to recall that Martin Mares had submitted a patch to kill /proc/pci, and that Linus Torvalds had rejected it; and then that Martin was now working on a patch to put PCI data directly in the kernel, as a table of descriptive, human-readable strings. Martin confirmed this, but an upset Dan Hollis asked, "Does this mean we will be stuck forever with an ever-growing unswappable static text table in the kernel?" David Woodhouse pointed out that the text table was defined as __initdata, and would be discarded at bootup, after the kernel identified the devices that were actually installed. Bret Indrelee remarked that even as __initdata, it still had to fit into the kernel binary, which might make it too big for 'lilo' eventually. Torsten Landschoff added that bootdisks would also suffer from the extra size.

Elsewhere, Dan said:

For those running embedded pci systems the extra 35kb of wasted text strings is not really nice. Especially if you are going to have to ROM the image. For some users this could make the difference between kernel too large for lilo, and one that works.

I thought we were trying to avoid M$-thinking, or is redmond starting to influence developers, "16kb here, 32kb there, it won't matter".

Gerard Roudier put in, "May-be some kernel projects are still avoiding what you called M$-thinking, but userland seems to have been definitely converted."

Jeff Garzik objected that 32K, which would also be compressed in its final form, was not so bad. Dan felt it should at least be a kernel option.

Going back to an earlier point in the discussion, when it was revealed that /proc/pci had been made non-optional, Dan had expressed his surprise. As far as he had been aware, /proc/pci had not only been optional but had also been deprecated as obsolete. He asked about this strange reversal, and Jes Sorensen explained that Linus had posted recently and said he liked /proc/pci and wanted to keep it. Later, Dan said, "Why does the *kernel* need access to descriptive text strings of devices in order to function? This is purely for user convenience, hence it can be in userspace with pciutils."

Later, Theodore Y. Ts'o said:

This whole discussion is silly; I believe Linus has already made a decision on this issue. Namely, that /proc/pci is going to stay, because it's useful to have the ASCII text description attached to elements in the struct pci tree. Yes, this means the initialization tables take up 35k of space, but (a) that's uncompressed; and the text probably compresses quite nicely, and (b) it's __initdata, which means the space is released back to the system after the kernel boots.

That means that the only real cause of angst might be (a) trying to fit a rescue kernel on a floppy disk along with rescue tools, and (b) embedded systems. For the general case (and remember that Linus tends to like optimizing for the general case), the cost/benefit ratio for having text description of pci devices is in favor of keeping the text descriptions. I imagine if someone came up with a patch that removed the __initdata table, in the interests of embedded or rescue disk kernels, either (a) Linus would accept it, or (b) it would be trivial to maintain separately outside of the kernel. Either way, it's a lot of discussion over a fairly minor point.

Elsewhere, Miquel van Smoorenburg objected that hotswapping seemed to require keeping the table in memory throughout the life of the running system. Ted replied:

It's a good point, but we may end up wanting to use different mechanisms to deal with hot-plug devices versus devices detected on boot. Yes, it's best if as much of the code path is shared as possible, but there may be enough other differences with how we want to handle hot-plug PCI that this isn't an issue.

At the very least, it's probably better to try to design a complete hot-plug solution than to make guesses about what's the best way to handle one tiny bit of the problem now. :-)
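
The trade-off argued over in this thread can be sketched in a few lines. This is only an illustration in Python, not kernel code: the vendor/device IDs, the names, and every function here are made up for the example. The table stands in for the `__initdata` text table; names are copied for the devices found at boot and the table is then dropped, which is also why a device that appears later (Miquel's hot-plug case) can no longer be named from it.

```python
# Hypothetical boot-time name table; IDs and strings are illustrative only.
PCI_NAMES = {
    (0x1111, 0x0001): "Example host bridge",
    (0x1111, 0x0002): "Example ISA bridge",
    (0x2222, 0x0001): "Example network card",
}


def resolve_names(detected, table):
    """Return a name for each detected (vendor, device) pair.

    IDs missing from the table get a generic fallback string.
    """
    return {
        dev: table.get(dev, "Unknown device %04x:%04x" % dev)
        for dev in detected
    }


def boot(detected):
    """Resolve names for the devices present, then discard the table."""
    table = dict(PCI_NAMES)      # stands in for the __initdata table
    names = resolve_names(detected, table)
    table.clear()                # space is given back once boot is done
    return names, table


names, table = boot([(0x1111, 0x0001), (0xABCD, 0x0002)])
print(names)

# A device plugged in after boot finds an empty table:
print(resolve_names([(0x1111, 0x0002)], table))
```

Only the strings for devices that are actually installed survive boot; the cost that remains is the size of the table in the kernel image itself, which is the point Bret and Dan were pressing.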

2. Journalled Filesystem For Linux

28 Oct 1999 - 2 Nov 1999 (16 posts) Archive Link: "jfs/linux"

Topics: FS: ReiserFS, FS: XFS, FS: ext2, FS: ext3, Microsoft, POSIX, Patents

People: Stephen C. Tweedie, Shawn Leas, Mike Touloumtzis, Bjorn Wesen, David Woodhouse, Hans Reiser

Journalling was first covered in a flame war in Issue #7, Section #6 (12 Feb 1999: fsync(); syslogd; Ext2 Extensions; Linus Chastized), where it came out that Stephen C. Tweedie was extending ext2 for journalling. Then it came up in another flame war in Issue #15, Section #2 (31 Mar 1999: Journalling And 'Capabilities' In ext3). The announcement that XFS might go Open Source appeared in a heated discussion covered in Issue #21, Section #2 (20 May 1999: XFS Going Open Source). There was a heated discussion about reiserfs and ext2 in Issue #34, Section #4 (29 Aug 1999: ReiserFS Nears Readiness; Difficulties Discussed). Most recently, an ext3 status report and discussion took place in Issue #38, Section #2 (16 Sep 1999: ext3 Filesystem Status; ACLs).

This time, Josef Höök asked if anyone was working on a journalled filesystem for Linux. Stephen replied, "There is journaling in test for both reiserfs and ext2, and SGI are porting their XFS journaled filesystem to Linux." Hans Reiser, the author of reiserfs, added that the latest SuSE already came with reiserfs. Shawn Leas also gave a link to the DTFS homepage, but Stephen pointed out that DTFS hadn't been under active development for months, and was also not really a journalled filesystem, but more of a log structured filesystem. Christian Czezatke, the author of the DTFS, objected that he'd been preoccupied by other things, but that he was still actively developing the package. Elsewhere, Shawn added that DTFS had certain problems. For one thing, since it didn't de-allocate blocks, it was still in need of a good FS cleaner. Also, being a log-based filesystem it had performance issues because of writing its data via 'append' rather than 'modify in place'. But he acknowledged, "The benefit comes in things like snapshots, where you can preserve metadata at some point in time, basically having a readonly copy of the whole FS from time HH:SS." Mike Touloumtzis added, "Log-structured file systems are also very interesting for Flash-ROMs in embedded devices. Wear leveling is a big concern there, seek time is not. Reconciling proper GC with decent random access times is still the trick, though."

Bjorn Wesen and David Woodhouse both had interesting replies to this. Bjorn said:

Yes. We've designed a log-structured flash FS for embedded linux (which we'll probably release next month as GPL). It is quite simple and optimal for small flashes with large sectors (normally the erase granularity for flashes is big, so you can't use a normal filesystem). I'd expect it to work for those cheaper sequential flashes as well, although I have never built anything with that.

The GC is still under tuning. As it's a log-cleaner it can obviously run pretty incremental, but the flash device's erase delay still sets the minimum latency.

And David said:

All the solutions we currently have for filesystems on flash present a block device rather than a filesystem interface - that is, both FTL and NFTL act as an extra translation layer below the actual filesystem, providing 512-byte random block access for use by a traditional filesystem.

It would probably be more efficient to implement a filesystem directly on the flash, much like Microsoft's FFS2, except that FFS2 obviously isn't POSIX-compliant, and hence isn't very interesting to us, although we do have the beginnings of a filesystem driver for it.

As there are patent problems with both FTL and NFTL, it would be extremely useful to have a POSIX-compliant filesystem designed specifically for use on flash devices.

I think that there is already such a beast in existence, designed for QNX. It would be nice to either support this or write a replacement.
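
The 'append' versus 'modify in place' distinction Shawn describes, and the snapshot benefit that comes with it, can be illustrated with a toy store. This is a minimal sketch under loud assumptions: the `LogStore` class and its methods are invented for this example and have nothing to do with the actual DTFS code or with any flash filesystem mentioned in the thread.

```python
class LogStore:
    """Toy append-only store: every write adds a record, nothing is overwritten."""

    def __init__(self):
        self.log = []  # list of (key, value) records, oldest first

    def write(self, key, value):
        """Append a new record instead of modifying the old one in place."""
        self.log.append((key, value))

    def snapshot(self):
        """A snapshot is just the current log length; older records never change."""
        return len(self.log)

    def read(self, key, at=None):
        """Return the newest value for key, optionally as of a snapshot."""
        end = len(self.log) if at is None else at
        for k, v in reversed(self.log[:end]):
            if k == key:
                return v
        raise KeyError(key)

    def dead_records(self):
        """Count superseded records, i.e. the space a cleaner could reclaim."""
        live_keys = {k for k, _ in self.log}
        return len(self.log) - len(live_keys)


store = LogStore()
store.write("motd", "hello")
snap = store.snapshot()
store.write("motd", "goodbye")
print(store.read("motd"), store.read("motd", at=snap), store.dead_records())
```

The read-only view "as of" a snapshot costs nothing extra, but superseded records pile up until something reclaims them, which is the cleaner (GC) problem both Shawn and Bjorn mention.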

3. Boot-time Tests For RAM Size And Integrity

29 Oct 1999 - 4 Nov 1999 (31 posts) Archive Link: "Perform minimal RAM test at boot"

People: Pavel Machek, Alan Cox, Peter Steiner

Pavel Machek posted a patch to perform a simple memory test at boot time. He explained, "I have been bitten by non-working memory detection and stale mem=XXX option in lilo by 5 times now. Once, system even went to full multiuser and then corrupted disk like hell."

Adrian Bridgett suggested using something similar to Pavel's patch to check memory sizes; then users wouldn't have to set boot parameters to indicate the amount of installed RAM. Alan Cox replied that Pavel's code had theoretical problems that made it bad for detecting memory sizes, though he acknowledged that in practice it might be a different story.

Peter Steiner replied with a patch that he felt did the trick. Pavel was impressed, but suggested that when Peter's code found a memory error, rather than marking the page as reserved and moving on as Peter had coded it to, the patch should cause an immediate kernel panic. Peter was ambivalent, and they appear to have had some private emails on the subject.

4. CPU Speed-Change On Running Systems

29 Oct 1999 - 2 Nov 1999 (37 posts) Archive Link: "Wrong bogomips after plugging in AC power"

People: Pavel Machek, Alan Cox, Jeremy Fitzhardinge, Ralf Baechle, Chad Miller, Erik Mouw, Mark Hahn

Pavel Machek discovered that his Toshiba's CPU would change speed under certain circumstances, e.g. it would start up at 150MHz when booting at less than 20% battery power. The problem was that if he then plugged in the AC power, the CPU would speed up to 300MHz, while the bogomips still retained their old value. He added, "Therefore all udelays are wrong by factor of two -- udelay(50) will only wait approx. 25usec. That seems pretty dangerous to me. Maybe we need some other source of short loops?"

Harald Koenig pointed out that this also worked in reverse: if Pavel started his machine in fast mode and then went to slow mode, udelay(50) would wait 100usec. He added that on the Tecra 750/780 it was possible to switch between three different CPU speeds via the Fn-F2 key combo. He agreed that there was a problem, but didn't see when or how to check the CPU for this change.
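
The arithmetic behind both observations is simple enough to write down. The sketch below assumes a busy-wait loop whose iteration count is fixed at calibration time and whose iterations take a constant number of CPU cycles; the function names and the `cycles_per_loop` parameter are hypothetical, chosen only for this illustration.

```python
def loops_per_usec(cpu_mhz, cycles_per_loop=1):
    """Busy-loop iterations completed per microsecond at a given clock speed."""
    return cpu_mhz / cycles_per_loop


def actual_delay_usec(requested_usec, calibrated_mhz, current_mhz):
    """Real duration of a delay whose loop count was calibrated at another speed."""
    # The loop count is computed from the speed seen at calibration time...
    loops = requested_usec * loops_per_usec(calibrated_mhz)
    # ...but the loops execute at whatever speed the CPU runs now.
    return loops / loops_per_usec(current_mhz)


# Calibrated at 150MHz on battery, then AC power is plugged in (Pavel's case).
print(actual_delay_usec(50, calibrated_mhz=150, current_mhz=300))  # 25.0

# Calibrated at 300MHz, then switched to slow mode (Harald's case).
print(actual_delay_usec(50, calibrated_mhz=300, current_mhz=150))  # 100.0
```

The too-long case merely wastes time; the too-short case is the dangerous one, since hardware that needs the full delay does not get it.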

Mark Hahn suggested that '__udelay' should base its timing on rdtsc (a clock cycle counter) on machines that fiddled with their clock in this way. Pavel objected that when his machine changed the speed of his CPU, it changed the speed of the cycle counter as well. "That sounds like show stopper," he said. But Alan Cox replied, "Thats a gift not a show stopper. Just check the rdtsc change between each timer tick looks believeable. Timer ticks might get delayed but if you start to get an excessive numbers of ticks per n cycles of the tsc you know your CPU slowed down."

Pavel replied to this with some code to detect CPU speed changes. It didn't recalculate bogomips, but it did detect the change. However, he pointed out that the machine would not be aware of the speed change for at least one timer tick. He said, "If by chance some flaky device is used for that one-tick period, we already lost the game..."
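
Alan's idea of checking that the cycle counter advances by a believable amount between timer ticks might look roughly like the following. This is not Pavel's code, which is not reproduced here; it is a sketch that assumes a list of counter readings taken at successive ticks, with a hypothetical `tolerance` threshold for what counts as believable.

```python
def detect_speed_changes(tsc_at_tick, tolerance=0.25):
    """Return (tick_index, ratio) for each tick where cycles-per-tick jumps.

    tsc_at_tick: cycle-counter readings taken at successive timer ticks.
    ratio: new cycles-per-tick divided by the previous baseline.
    """
    changes = []
    baseline = None
    for i in range(1, len(tsc_at_tick)):
        delta = tsc_at_tick[i] - tsc_at_tick[i - 1]
        if baseline is None:
            baseline = delta          # first interval sets the baseline
            continue
        ratio = delta / baseline
        if abs(ratio - 1.0) > tolerance:
            changes.append((i, ratio))
            baseline = delta          # adopt the new speed as the baseline
    return changes


# Assumed 100 ticks per second: 150MHz gives 1,500,000 cycles per tick.
# The CPU doubles its speed during the third interval.
readings = [0, 1_500_000, 3_000_000, 6_000_000, 9_000_000]
print(detect_speed_changes(readings))  # [(3, 2.0)]
```

The change is only reported at the tick that ends the first fast interval, so any delay issued during that interval has already run at the wrong length. That lag is exactly the weakness Pavel describes.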

Elsewhere, Jeremy Fitzhardinge said, "Bogomips calculation is pretty slow and CPU consuming. The basic problem is that the premise, the CPU always runs at the same rate, is flawed. The solution is to find some other timebase." And Ralf Baechle added, "This is a result of all current CPUs being synchronous designs. There is very promising research work about asynchronous processor designs. One of the key properties is that these processors don't have a processor speed that is exactly defined by the frequency of an oscilator but rather by the temperature, production process used and many more. Even two chips that were born on the same wafer side by side will probably differ somewhat. They use significantly less power, so they'll probably be used in battery powered devices like laptops first."

Chad Miller felt Jeremy was on the right track, and suggested that if no other suitable timebase could be found, one solution might be to "better control the APM hardware to detect what it intends to do to the system, and adjust our bogomip constant with a multiplier."

Elsewhere, Erik Mouw said, "I don't know if APM warns you about the changed processor speed, but if it does, I'm sure it will vary wildly between computers. A userland program to inform the kernel would be nice. If APM doesn't tell you, the userland program will still be able to recompute the processor speed using things as load average, initial CPU speed, and the time do to spin in a closed loop."

But Pavel replied:

That is not going to work. It is too late by then. One single wrong udelay() can corrupt your data, or crash your machine [assuming broken hw. On good hw, no udelay is needed]. You should not run with wrong bogomips, not even for short periods between real change and you noticing it...

What are other possible timebases? [I'm talking i386 architecture]

Any other ideas?

No solution presented itself on the list.

5. Mirroring Via The Buffer Cache

29 Oct 1999 - 8 Nov 1999 (81 posts) Archive Link: "Linux Buffer Cache Does Not Support Mirroring"

Topics: Disk Arrays: RAID, Disks: IDE, Disks: SCSI, FS: ext3, Virtual Memory

People: Rik van Riel, Gadi Oxman, Stephen C. Tweedie, Jeff V. Merkey, Linus Torvalds, Andrea Arcangeli, Pavel Machek, Gerard Roudier

This was a long and very interesting thread, worth reading in its entirety in the archives. Jeff V. Merkey started it off, saying that the Netware filesystem for Linux was now running mirroring, with up to 8 mirrors per logical partition. However, using Linux's buffer cache to handle the mirroring resulted in data being cached multiple times. He and his fellows had rewritten the Linux buffer cache to handle their needs, and they were anxious to get some changes into the main kernel tree.

Rik van Riel replied:

Basically, the buffercache is going to be reduced hardly more than an I/O mapping layer (with probably some extras for transactions and RAID). Caching will have to be done in the page cache.

But that's something for kernel version 2.5 (when we'll probably overhaul most of the cache stuff).

In the meantime, it would probably be best to use either your own buffer.c or accept the fact that it's not going to be as efficient as you'd like it to be.

Jeff said this was the conclusion they had come to as well, and offered to write the subsystems they'd require for 2.5; he added that this would allow Linux file systems to support multi-segmented mirroring and fault tolerant failover without the RAID drivers. Pavel Machek asked what was wrong with the RAID drivers, and Jeff replied that the RAID code was extremely primitive. Gadi Oxman replied:

I take personal offence from the *extremely primitive* description. The current RAID architecture supports:

  1. Compatibility with *every* block device driver -- every block device can be part of the RAID array, be it SCSI, IDE, whatever.
  2. Compatibility with *every* file system, which just sees the RAID array as any other block device.

The above two features are achieved especially because the RAID code sits just at the request queueing layer, above the drivers but below the buffer/page cache and filesystems.

In addition, the RAID subsysystem provides the designed redundancy, good performance, and support for hot rebuilding. The RAID-5 code, for example, includes "full stripe" write optimization, read-modify-write cycles, completion-write cycles, maintains an internal cache, etc.

Perhaps primitive, but clearly not an *extremely primitive* architecture.
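
For readers unfamiliar with the RAID-5 terms Gadi lists, the underlying parity arithmetic is plain XOR. The sketch below is a generic illustration of that technique, not the Linux RAID driver's code; the function names are invented for the example, and blocks are assumed to be equal-length byte strings.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(data_blocks):
    """Full-stripe write: parity comes from the new data alone, no reads needed."""
    return xor_blocks(*data_blocks)


def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write: update parity after changing a single block."""
    return xor_blocks(old_parity, old_data, new_data)


def rebuild_block(surviving_blocks, parity):
    """Recover one lost data block from the survivors and the parity block."""
    return xor_blocks(parity, *surviving_blocks)


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_stripe_parity(stripe)

# Overwrite the middle block using only its old contents and the old parity.
new_block = b"\xaa\xbb"
new_parity = rmw_parity(parity, stripe[1], new_block)
print(new_parity == full_stripe_parity([stripe[0], new_block, stripe[2]]))

# Lose the middle block and rebuild it from the rest.
print(rebuild_block([stripe[0], stripe[2]], new_parity) == new_block)
```

A full-stripe write avoids reading anything back from disk, which is why it is treated as an optimization over the read-modify-write path.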

Andrea Arcangeli took exception to Gadi's statement that RAID supported every filesystem, pointing out that jfs would break with RAID. Someone asked for more of an explanation of this, and Stephen C. Tweedie replied, "Currently raid expects access to be able to do things in the buffer cache bypassing the normal ll_rw_block() device driver interface, and this violates the IO ordering requirements imposed by jfs. We're working on it." Someone else gave a pointer to a post by Stephen in the archives of the linux-fsdevel mailing list.

Returning to Jeff's problem, Gerard Roudier said elsewhere that Jeff's implementation of mirroring had been done at the wrong layer out of laziness. Jeff replied, "Actually, if you guys would design something a littler newer than circa 1973 for your buffer cache design, things would be easier. The buffer cache acts this way becuase it is based on a PRIMITIVE design that's not far off from the textbook description found in "UNIX: a practical implementation." Novell was doing mirroring and distributed mirroring since about 1984. We are just trying to get linux closer to the 1990's."

And Stephen replied:

The buffer cache is not intended to provide primary support for doing advanced IO control. If you want to do mirroring, then fine, do it at a different layer. We have 2 open source journaling implementations for Linux right now which work above the buffer cache, and software raid working below it. In 2.3, _all_ file write activity bypasses the buffer cache entirely. The fact that the process performing write-behind in 2.3 happens to be bdflush in buffer.c is almost incidental: it could just as easily be done at the VFS layer. Ext3 journaling on 2.2 also bypasses the buffer cache to perform IO.

The buffer cache isn't the place to provide mirroring. Do it above or below. Why wouldn't a low-level mirroring driver below ll_rw_block do what you want? Even if it didn't, you could achieve the same effect within the page cache using specialised write-back routines.

The main lack in this area is one you pointed out in an earlier email --- the lack of any way for the VM to request that the owner of an arbitrary page of the page cache or buffer cache release that page prevents easy memory balancing between cache consumers. We're discussing ways to deal with that in 2.3/2.5.

Jeff replied, "What everyone is side-stepping is that the interface between the drivers and the buffer cache is incestuous -- this prevents folks from building async I/O based FS's on Linux. The solution is not a simple one -- the drivers and buffer cache interface needs to be changed to elimnate these dependencies."

At this point Linus Torvalds came in, with:

Actually, not at all.

The way to fix what you call "incestuous" is to just marry the two off for good. They aren't incestuous, they're just living in sin..

The buffer cache is closely related to the drivers, and will become only _MORE_ closely related to the drivers. That's because the buffer cache is quite consciously being evolved into just a driver interface for IO: the "buffer cache" is transforming away from a "cache", and into a pure "block IO interface".

So no, the buffer cache doesn't support the kind of mirroring you want, and almost certainly never will. But the page cache may eventually evolve into the direction you're looking for.

At this point several folks made technical suggestions that appealed to Jeff, or at least that he couldn't refute right away, and the discussion petered out.

6. Some Explanation Of The 'dentry' Struct

1 Nov 1999 - 4 Nov 1999 (36 posts) Archive Link: "structure dentry help..."

Topics: FS: NFS

People: Alexander Viro, Neil Brown, Richard Gooch

Someone asked for an explanation of the dentry structure. Neil Brown gave a pointer to his article (based on work by Richard Gooch) on The Linux Virtual File System Layer, which included a comprehensive description of the dentry structure.

Meanwhile, Alexander Viro gave his own explanation:

It's slightly messy, but the current layout looks so:

Notice that:

  1. we can't handle more than one dentry for a directory. VFS does not allow that, partially due to the d_alias abuse in NFS and friends.
  2. _all_ work with dcache still requires big lock. Again, fixing that will require cleanup of d_alias/i_dentry mess.
  3. we have only two states for dentry - hashed and unhashed. Life would be much easier if we had finer separation (e.g. special case for dentry in process of lookup()). To be changed.
  4. all locking is done on inode level. It makes race prevention much harder.
  5. cached_lookup() acts funny if we find a directory dentry _and_ attempt to revalidate it gives negative. Compare with the case of non-directory. Handling of invalidation is tied with the d_alias/i_dentry problem - same offenders hold the thing. It will take a large and nasty rewrite.

There followed a bit of a technical discussion regarding possible improvements.

7. SBLive Driver Source

1 Nov 1999 - 2 Nov 1999 (7 posts) Archive Link: "SBLive driver source"

Topics: Sound: SoundBlaster

People: Alan Cox, Jeff Garzik

Jeff Garzik gave a link to the sources of some Soundblaster drivers for Linux, written by the company itself. Chris Jones jubilantly pointed out that the drivers were GPLed, and asked if they would be going into the main kernel soon. Alan Cox replied, "I've been sitting on a copy of the sblive code for a few days actually. I did an initial run through of the code and sent them some comments back. It'll be there fairly soon."

(ed. I don't know about anyone else, but I'm still always shocked to find hardware vendors releasing specs and publishing GPLed drivers. It ain't like the old days. -- Zack)

8. Serial Driver Restructuring

2 Nov 1999 - 4 Nov 1999 (10 posts) Archive Link: "Bogus serialP.h patch?"

People: Theodore Y. Ts'o, Alan Cox, Linus Torvalds

Theodore Y. Ts'o complained that a patch that had made it into 2.3.25 was bad. According to him, it defined two structures that were "internal serial structures that should *not* be exposed outside of the serial driver; they're not part of the exported interface." Alan Cox pointed out in reply that they were not just internal to Ted's driver, but to many other serial drivers as well.

Ted's point had been that these structures had been moved from a private header file (serialP.h) into a public one (serial.h), and he suggested that any drivers needing those structures should just include the private file. To this, Alan replied, "Then the drivers pick up all the other junk too. If you want to split serial.h 3 ways according to whether an item is private to your driver, private to the kernel serial drivers or public to the whole kernel, sure."

Ted replied that if drivers wanted that code, which amounted to 10 lines, they should include those lines in their .c files, not make them part of a public header file. He added, "Putting private declarations in the public serial.h is Just Wrong."

Linus Torvalds came in at this point, saying:

Actually, I've always hated the notion of "public" versus "private", and I think "serialP.h" is just a band-aid around a real problem which is that there is quite a lot of incestuous knowledge about the tty layer and serial devices all over the place.

First off, if it's really a _private_ header file, then it shouldn't be in <linux/serialP.h> in the first place. It should be in drivers/char, and you should use #include "16550.h" to make it clear that (a) it's not about "serial devices", it's about a specific _class_ of serial devices and (b) it's really just private to a specific driver (or two similar drivers), and not a generic Linux header file.

Btw, calling the dang thing "16550.h" may be technically inaccurate (it obviously is used a lot more chips than the 16550), but I think it is _psychologically_ a lot closer to what you seem to have in mind for the use. It would tell people what the file is about - which the current name does not at all. The current name probably makes most people think that the programmer was spastic and wrote an extra 'P' by mistake.

But I don't care all that much. I think that either we should do the one-liner to make existing things happy as things stand (which is basically what Alan did), or we should just fix the thing _right_, in which case the "serialP.h" file goes away - moved or integrated, I don't much think there is all that much of a difference (the integration should be much more complete, with a "serial device layer" kind of support structure etc).

Ted replied, "One of the things which has been on my todo list for a long time, but which has languished due to lack of time, has been to move more functionality into the tty layer. Break handling and tty baud calculations have already been moved into the tty layer; the same goes for the tty flip buffers, and so on. That's probably far cleaner than creating a new "serial device layer". But other than that, I agree."

9. Patent Infringement Or Prior Art In Linux Code

2 Nov 1999 - 6 Nov 1999 (23 posts) Archive Link: "Patent"

Topics: Patents

People: Gregory Maxwell, Richard M. Stallman, Joe Acosta, Michael H. Warfield, Theodore Y. Ts'o, Richard B. Johnson, H. Peter Anvin

Gregory Maxwell gave a pointer to a search in the US Patent And Trademark Office, and added:

I thought you all might want to know: Almost all Linux kernels today are infringing on US patent #5,806,063. The infringing code is in linux/arch/i386/kernel/time.c:get_cmos_time. It deals with using 'windowing' to convert non-y2k-ok dates into 4 digit dates.

Nevermind the fact that Linux had this code more then a year before the patent was applied for. :)

How does the GPL look opon this, can I still distribute Linux since I dont agree with the patent? If I (as say a linux distro) license the patent (to cover my ass) could I still distribute Linux?
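
For context, 'windowing' here means choosing a pivot year and mapping each two-digit year into the hundred-year span that starts at the pivot. The sketch below shows the general technique in Python; it is not a copy of the kernel's get_cmos_time code, and the pivot of 1970 and the function name are assumptions made for the illustration.

```python
def window_year(two_digit_year, pivot=1970):
    """Map a two-digit year onto the 100-year window starting at pivot.

    With pivot=1970, 70..99 become 1970..1999 and 00..69 become 2000..2069.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year, got %r" % two_digit_year)
    year = 1900 + two_digit_year
    if year < pivot:
        year += 100   # earlier than the pivot, so it must be the next century
    return year


print(window_year(99))  # 1999
print(window_year(0))   # 2000
print(window_year(69))  # 2069
print(window_year(70))  # 1970
```

The whole technique is one addition and one comparison, which gives some idea of why people in the thread were surprised to see it patented.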

Richard M. Stallman replied, "I will ask our lawyer to double-check whether Linux constitutes prior art for the patent. If it does, it would be grounds to render the patent invalid. Could you tell me precisely what Linux does with the dates, and in what context, for what purpose? The lawyer may need to know those things."

In response to Greg's question, "If I (as say a linux distro) license the patent (to cover my ass) could I still distribute Linux," Richard went on:

If the patent license you get covers redistribution by the people who get copies indirectly from you, that is consistent with the GPL. However, if the license does not cover this, if you would not be able to extend that permission to redistributors, you would not be able to distribute in a way that satisfies the GPL.

The situation is the same whether you are distributing just Linux or a whole Linux-based GNU system.

I don't think you need to worry about getting a license for this particular patent, though.

Elsewhere, Joe Acosta said:

As a former Patent Examiner, I feel that I can shed some light on patents. Someting to keep in mind when looking at them is the claims. Claims are really the major part of a patent that have any relevance.

Claims 1 and 11 are the independant claims and both of those claims specifically state "A method of processing dates in a database,..."

Does the linux kernel use a database? (retorical q here) I think not .. thus this patent is irrelevant to what is being done in kernel code.

However, all the companies that are using 'pivot logic' with databases are at risk of possible infringement. Now even if you have proof of this logic in the Linux kernel, one must prove that it is obvious to use such logic within a database. Seeing as just about everyone in the industry uses this kind of logic in their database apps, I'd say it was not rocket science, but I am not an attorney 'I have legs' (LOL).

Personally I am amazed at what people today call novel and want patent protection over. Companies want to patent their data structures... no joke, a certain co in a certain northwest state was and probably still is trying this. I had to leave that place cause my waders were not tall enough.

So next patent you see, read the claims carefully first and look for the independent claims, as they are the 'meat'. The rest of the patent is there for clarification as to what the claims are referring to and for the legal b******t that goes into patents.

Elsewhere, Gregory McLean suggested pointing out to the US Patent Office that Linux represents prior art, and getting the patent annulled on that basis. But Michael H. Warfield caught him by the sleeve and said:

Be careful there! You do NOT want to raise the issue with the PTO! Under their administrative rules, they can review the patent and only the patent holder is allowed to present "evidence" and challengers are not permitted any standing or any position to counter that evidence. If they "win" the review, which they often do, that fact is then admissible in court. At least one individual was known to try and get people to challenge him at the PTO knowing full well that he would lose in open court. After winning an administrative ruling, he then held the advantage in subsequent court challenges and had an improved chance of winning at court.

You are generally better off challenging a patent in court FIRST before any PTO review.

Elsewhere, Theodore Y. Ts'o suggested:

the best thing to do is ignore it. Let the patent holders try to sue us first, at which point it can be defeated pretty easily.

It would be interesting for someone to set up a web site, dedicated towards finding and exposing stupid USPTO tricks; the problem is that it would be a legal lightning rod, and it would have to be careful to disclaim that it was giving anything that might be construed as legal advice, or inducements to infringe patents (valid or otherwise); but just as a data repository of data that might or might not be accurate. Followups on this should go elsewhere, as it's not really a kernel issue.

Elsewhere, Richard B. Johnson said:

If anybody's interested, I can provide source-code, dated in the first part of this decade (1991), that uses the obvious "windowing" mechanism to set the century byte of the CMOS chip. This is used in the Analogic 2030 arbitrary function generator. I wrote the BIOS. This source-code is proprietary; however, to support a petition against MD, it could certainly be referenced and possibly forced into evidence if a complaint ever went that far.

Just because a patent was issued, it does not mean it's valid. If the patent holder is informed that your use predates his, and it becomes obvious that there was prior art not cited in the application, the patent holder will usually issue an "unrestricted license" so that nobody has to show patent validity or otherwise.

H. Peter Anvin added, "Actually, the windowing approach was used in PC-DOS 2.0 (1981-or-so): if you enter a two-digit date it is mapped on the 1980-2079 window."
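
For readers who have not run into it, the "windowing" trick discussed in this thread fits in a few lines of C. What follows is only a minimal sketch of the general idea, not the code from time.c or from PC-DOS; the function name window_year() and its pivot parameter are made up for illustration. A two-digit year is mapped onto the single year within a 100-year window that ends in those two digits:

```c
#include <assert.h>

/*
 * Expand a two-digit year (0..99) into a four-digit year using a
 * 100-year window that starts at `pivot`. The result is the unique
 * year in [pivot, pivot + 99] whose last two digits equal `yy`.
 * Returns -1 if `yy` is out of range.
 *
 * window_year() and its parameters are hypothetical names chosen
 * for this sketch.
 */
static int window_year(int yy, int pivot)
{
    /* The window start is a design choice, not input data. */
    assert(pivot >= 0);

    if (yy < 0 || yy > 99)
        return -1;

    /* First guess: the same century as the pivot year. */
    int year = pivot - (pivot % 100) + yy;

    /* If that falls before the window, it belongs to the next century. */
    if (year < pivot)
        year += 100;

    return year;
}
```

With a pivot of 1980, which gives the 1980-2079 window H. Peter Anvin describes, window_year(80, 1980) yields 1980 and window_year(79, 1980) yields 2079. A different pivot simply slides the window.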

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.