Kernel Traffic #231 For 10 Sep 2003

By Zack Brown

Mailing List Stats For This Week

We looked at 2456 posts in 12233K.

There were 633 different contributors. 325 posted more than once. 192 posted last week too.

1. Linux 2.6.0-test4 Released

22 Aug 2003 - 30 Aug 2003 (20 posts) Subject: "Linux 2.6.0-test4"

Topics: Kernel Release Announcement, Power Management: ACPI

People: Linus Torvalds, Erik Andersen

Linus Torvalds announced 2.6.0-test4, saying:

There should be a lot of compile fixes here, along with updates for ia64, and the (painful) move of the 'name' entry out of the "struct device" that helps avoid unnecessary memory waste.

It's a lot of small stuff all over: nothing really stands out in diffstat, except the big update of the Zoran video capture driver, and the blkmtd driver - both updated from their respective development trees (and the ips scsi driver, but that was due to massive whitespace fixing).

Normal merges with Andrew and arch maintainers (x86-64, ia64, sparc64, arm), and AGP updates (notably the merging of the ATI IGP). And network driver updates, ACPI and power management infrastructure.

Erik Andersen posted a patch and reported:

In both 2.4 and in 2.6, error handling for bad cdrom media is wrong. And it is my fault I'm afraid, since I botched an earlier fix for the problem by putting the fix in the wrong spot.

My kids have a "Jumpstart Toddlers" cd they have long since completely killed, which makes a great test disc. Without this fix, the best time projection I can get for completing a dd type sector copy is about 2 years... Most of that is spent thrashing about in kernel space trying to re-read sectors we already know are not correctable.... After the fix, I was able to rip a copy of the CD (or rather muddle through it getting lots of EIO errors) in about 15 minutes.

2. Status Of ReiserFS 4

26 Aug 2003 - 28 Aug 2003 (35 posts) Subject: "reiser4 snapshot for August 26th."

Topics: FS: ReiserFS, FS: rootfs, SMP

People: Oleg Drokin, Felipe Alfaro Solana, Hans Reiser, Steven Cole

Oleg Drokin announced:

I have just released another reiser4 snapshot that I hope all interested parties will try. It is released against 2.6.0-test4. You can find it at http://namesys.com/snapshots/2003.08.26. I include release notes below.

Reiser4 snapshot for 2003.08.26

WARNING!!! This code is experimental! WE ARE NOT KIDDING! DO NOT PUT ANY VALUABLE DATA ON REISER4 YET!

Fixed some bugs. And finally reiser4 should compile on 64bit boxes (hm. somebody try it, as I am unable to build any 2.6 kernel for ia64). Also reiser4 should now build without debug enabled. Important SMP bug was fixed (only was in effect for SMP kernels on boxes with less than 3 CPUs). There are still some OOM problems sometimes that we are working on, but generally I hope problems reported by various people about compile failures should be fixed now. Readonly mounts (and hence - reiser4 as rootfs) are not supported too.

reiser4progs update includes some 64 bit fixes too along with other stuff. fsck still does not work, so don't even try to run it.

A couple posts down the line, Felipe Alfaro Solana said he'd run into problems building ReiserFS as a module as opposed to building it in the kernel binary itself; and Oleg replied, "Building as module is also not yet supported." Steven Cole posted a patch to warn about this situation, but Hans Reiser said it would be better to disable the feature entirely if it wouldn't work, rather than just give a warning.

3. Some Discussion Of Binary Modules

27 Aug 2003 - 28 Aug 2003 (6 posts) Subject: "binary kernel drivers re. hpt370 and redhat"

Topics: FS: initramfs, FS: ramfs

People: Joe Briggs, Stephen Hemminger, Alan Cox

Joe Briggs asked, "I have a client who has a raid controller currently supported under windows, and now wants to support linux as a bootable device. Currently, some of their trade secrets are contained in the driver as opposed to the controller firmware, etc., so for now they wish to release a binary-only driver to certain beta customers. (i.e., 1st stage of porting is similar functionality as windows). Am I correct that in order to boot off of this device that the driver would have to be statically linked in vs. a module which could be distributed as a binary-only driver keyed to the kernel.revision of the distribution's kernel?" Stephen Hemminger replied, "The driver could be a module and live in initramfs. If you can get the initial Linux image and initramfs loaded, you would be okay. The problem is more in the bootloader (LILO or GRUB) would not know how to do raid. The /boot partition would have to be on a non-raid partition. Same problem if driver is statically linked in the kernel." Alan Cox remarked, "its not IMHO so much trade secrets as "improving the barrier to vendor change" 8). Pretty much all of the older PATA controllers don't actually do hardware raid but bios/driver raid - ie its the equivalent (or roughly so) of the md layer but locks you into the vendor. The notable exception here is the 3ware card (there are a couple of others too - Promise Supertrak100, SX6000)". And Joe replied, "I believe that companies will eventually see it costs less and the benefits higher to open source their drivers. But that is a conclusion and paradigm that they will have to evolve to themselves, and shoving it down their throats will only lengthen the process. The first step is to support their hardware under the platform (linux), and that is what I am focusing on."

4. Status Of CFQ Scheduler

28 Aug 2003 - 29 Aug 2003 (6 posts) Subject: "State of the CFQ scheduler"

Topics: SMP

People: David Nielsen, Jens Axboe, Felipe Alfaro Solana

David Nielsen asked, "What ever happened to Jens Axboe's CFQ scheduler - as a regular users I really enjoyed the CFQ scheduler as it made my desktop feel a bit smoother. Is any work at all been done to this fine piece of code or has it been dropped completely in favor of AS ?" Jens Axboe replied, "I'm glad you enjoyed it. No CFQ hasn't been dropped, it was/is just on hold waiting for the loadable scheduler infrastructure. The reason for that is that I made lots of changes to that code base, not the old one that was in -mm. It shouldn't be too hard to adapt the latest version from before -mm dropped it and adapting to the current kernels." Felipe Alfaro Solana asked for a patch against the current kernel version, and Jens replied, "Alright, here's a version for 2.6.0-test4. It builds, it survived a 128 client dbench on SMP. And that's all the testing I did, so be careful. You need to boot with elevator=cfq to activate it." Felipe was happy.

5. New netplug Daemon To Handle Network Cable Hotplugging

28 Aug 2003 - 3 Sep 2003 (13 posts) Subject: "[ANNOUNCE] netplug, a daemon that handles network cables getting plugged in and out"

Topics: Hot-Plugging, Ioctls, Networking

People: Bryan O'Sullivan, Aaron Lehmann, Jeff Garzik, J.A. Magallon, Stefan Rompf

Bryan O'Sullivan announced:

Netplug is a daemon that responds to network cables being plugged in or out by bringing a network interface up or down. This is extremely useful for DHCP-managed systems that move around a lot, such as laptops and systems in cluster environments.

For more details and download instructions, see the netplug homepage: http://www.red-bean.com/~bos/

Aaron Lehmann replied gratefully, "Thank you, thank you, thank you. I was just thinking today how annoying it is that whenever I boot up my laptop, dhclient runs and tries to get an IP address on the ethernet interface until it's ^C'd. Since I often use the Ethernet interface this is not a bad default, but dhclient can't even realize on its own that there's no cable plugged in." But elsewhere, J.A. Magallon asked if Bryan had known of the ifplugd (http://www.stud.uni-hamburg.de/users/lennart/projects/ifplugd/) project, which seemed to perform a similar task. Jeff Garzik replied, "ifplugd doesn't appear to use netlink. Did I miss something? netlink is definitely the preferred way to get link notification. Maybe the two authors can work together to merge the best parts of both..." J.A. replied, "That would be very nice, but there is still a problem. Does netlink solve the fact that there are cards (at least in 2.4) that do not support any detection method?" And Bryan said, "netlink doesn't work through the ioctl interface at all. If a card is capable of reporting that its flags include IFF_UP or IFF_RUNNING via the netlink interface, then netplug will work." And Stefan Rompf added, "even in 2.6 not all cards support link state via netlink, it requires some updates to the driver. Maintainers should take this as a hint to add netif_carrier_on()/_off() or mii_check_link()/mii_check_media()-calls ;-). This does not hurt for 2.4 as these functions are already available there, but do not create notifications in the stock kernel."
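As a rough illustration of the flag test Bryan describes, the sketch below decodes the fixed header of an rtnetlink link message and decides whether a link-watching daemon might treat the cable as plugged in. The 16-byte layout follows `struct ifinfomsg`, and the `IFF_UP` and `IFF_RUNNING` values are those in the Linux headers. The function names, the sample message, and the policy of treating `IFF_RUNNING` as "carrier present" are assumptions made for illustration; this is not netplug's actual code.

```python
import struct

# Flag values as defined in <linux/if.h>.
IFF_UP = 0x1        # interface is administratively up
IFF_RUNNING = 0x40  # driver reports that the link is operational

# struct ifinfomsg: family (u8), pad (u8), type (u16), index (s32),
# flags (u32), change mask (u32), 16 bytes in total.
IFINFOMSG = struct.Struct("=BxHiII")


def parse_ifinfomsg(payload: bytes) -> dict:
    """Decode the ifinfomsg header at the start of a link message payload."""
    family, if_type, index, flags, change = IFINFOMSG.unpack_from(payload)
    return {"family": family, "type": if_type, "index": index,
            "flags": flags, "change": change}


def cable_plugged_in(flags: int) -> bool:
    """Hypothetical policy: IFF_RUNNING set means a cable is present."""
    return bool(flags & IFF_RUNNING)


# Made-up message for interface index 2 with UP and RUNNING both set.
sample = IFINFOMSG.pack(0, 1, 2, IFF_UP | IFF_RUNNING, 0xFFFFFFFF)
info = parse_ifinfomsg(sample)
print(info["index"], cable_plugged_in(info["flags"]))
```

As Stefan's remark implies, a check like this is only as good as the driver behind it: if the driver never calls netif_carrier_on()/_off(), the flags never change and no notification arrives.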

6. Kdb 4.3 Released

29 Aug 2003 (1 post) Subject: "Announce: kdb v4.3 is available for kernel 2.4.22"

People: Keith Owens

Keith Owens announced, "ftp://oss.sgi.com/projects/kdb/download/v4.3/. Current versions are kdb-v4.3-2.4.22-common-1.bz2, kdb-v4.3-2.4.22-i386-1.bz2. Other platforms will follow as they get updated to 2.4.22. This is just a maintenance version to sync with kernel 2.4.22, kdb v4.4 will have more changes. Changelog extracts since 2.4.21."

7. Status And Discussion Of EFI (Extensible Firmware Interface) Support

29 Aug 2003 - 5 Sep 2003 (16 posts) Subject: "[UPDATED PATCH] EFI support for ia32 kernels"

Topics: Assembly, BSD: FreeBSD, Disks: SCSI, PCI, Patents, Power Management: ACPI, USB

People: Matt Tolentino, Andrew Morton, Eric W. Biederman, Linus Torvalds, Mark Doran, Jamie Lokier

Matt Tolentino announced:

Attached is an updated patch against 2.6.0-test4 that enables Extensible Firmware Interface (EFI) awareness in ia32 Linux kernels. I've incorporated the feedback I've received since my initial posting (http://marc.theaimsgroup.com/?l=linux-kernel&m=105848983307228&w=2) including:

He went on:

I've been able to successfully boot kernels on EFI systems with this patch using version 3.4 of the ELILO boot loader released last week by Stephane Eranian as well as using GRUB on ia32 systems with legacy BIOS.

Special thanks to Bjorn for providing valuable feedback on the initial patch.

Andrew Morton asked:

Just for my edification: why does EFI exist?

"The EFI specification defines a new model for the interface between operating systems and platform firmware. The interface consists of data tables that contain platform-related information, plus boot and runtime service calls that are available to the operating system and its loader. Together, these provide a standard environment for booting an operating system and running pre-boot applications.

"The EFI specification is primarily intended for the next generation of IA-32 and Itanium Architecture-based computers, and is an outgrowth of the "Intel Boot Initiative" (IBI) program that began in 1998."

It sounds like it's filling in some gaps in ACPI?

Matt replied, "Not really. EFI is a broader interface to platform firmware and the hardware that has been designed to be generic, such that it may be implemented on any architecture and/or any platform. You can think of it as an interface to the traditional BIOS. In a pure EFI environment, the device model, various defined services and protocols, and structure negate the need for traditional BIOS calls. For example, you would no longer call int10h to change the video modes - instead you would call a function of a video/console protocol for the video device. Another example is the int15h call to get the e820 memory map is no longer required - instead EFI provides a memory map of all usable memory in the system, along with attributes, ranges, types, etc. As for its relationship to ACPI it is complementary. The EFI specification does not rewrite or redefine accepted standards such as ACPI. Instead it enables this type of platform configuration information to be obtained in a standard fashion."

Elsewhere, Eric W. Biederman answered Andrew's question of why EFI existed. Eric said:

As I have heard the story.

The guys at Intel were having problems getting a traditional PC style BIOS to run on the first Itaniums, realized they had a opportunity to come up with a cleaner firmware interface and came up with EFI. Open Firmware was considered but dropped because it was not compatible with ACPI, and they did not want to dilute the momentum that had built up for ACPI.

And now since Intel has something moderately portable, they intend to back port it to x86 and start using/shipping it sometime early next year.

What I find interesting is that I don't see it addressed how the 16bit BIOS calls in setup.S can be bypassed on x86. And currently while it works to enter at the kernels 32bit entry point if you know what you are doing it is still officially not supported.

A couple posts later, he added that even though EFI was present in the first Itaniums, "EFI came very late in the game. I have talked to the Intel guys who thought it up. And from a practical standpoint the EFI interface is still stabilizing."

In parallel to these comments, Matt had remarked, "the EFI sample implementation can be used on boxes with legacy BIOSes and the interface is consistent with what is currently shipped on ia64 platforms. The intention is to have an interface to the firmware that is portable and consistent. For example, much of the linux loader is shared between ia64 and x86. Assuming add-in cards have EFI compliant drivers, this also makes option ROM and even system BIOS upgrades easy with EFI utilities and without the need for DOS." Eric replied, "Getting EFI drivers in a byte code format would of course be nice. But mostly this helps the Itanium, not x86. I can already get standard x86 option roms." And Matt replied, "It would be nice. It is especially nice for vendors because they can reuse a single driver image for multiple architectures assuming there is an interpreter and EFI support." At this point Linus Torvalds said:

No. It would be a total nightmare.

Vendor-supplied drivers without source are going to be BUGGY.

They are going to be doubly buggy if they are run with a compiler that has a buggy back-end.

And that back-end is going to be buggy if it's for some random bytecode that isn't widely used except for some silly EFI thing and is tested exclusively with just a few versions of Windows and _maybe_ occasionally on Linux.

Face it: firmware bytecode is a total braindamage. The only thing that works is _source_code_ that can be fixed, and lacking that, we're better off with a well-defined ISA that people are used to and that has stable simple compilers.

In other words: x86 object code is a better choice than some random new bytecode. It's a "bytecode" too, after all. And it's one that is stable and runs fast on most hardware. But as long as it's some kind of binary (and byte code is binary, don't make any mistake about it), it's going to always be broken.

EFI is doing all the wrong things. Trying to fix BIOSes by being "more generic". It's going to be a total nightmare if you go down that path.

What will work is:

Don't screw this up. EFI is not going in the right direction.

Mark Doran of Intel replied:

As one of the people responsible for the EFI Specification and our industry enabling efforts around that spec, I'd like to offer some background that I hope will illuminate some of the issues discussed in this thread. This is going to be a bit long...let me apologize in advance for that but I think there's quite a bit of context here and sharing that may help people understand why EFI works the way it does for Option ROMs.

In 1999 when we were first working on the EFI spec in draft form, a number of the OEMs and IHV companies that we talked to told us that an EFI spec without a solution for the "option ROM" problem would not be accepted in the industry.

At that time, I tried to make the case that instead of propagating the problem into the future we should focus on moving the industry to "architectural hardware" that wouldn't even need option ROMs. What I meant by that was add-in cards with common register-level hardware interfaces to allow operating systems code to carry driver and boot loader code that would be able to work across a range of vendors' products. Perhaps the UNDI network card interface that Intel developed would be a good model for a start at this approach as an example; both in terms of how to do it and the level of traction (or lack thereof) one can expect taking this approach.

The trouble with the "architectural hardware" argument proved to be that PCI is already well established and there is a vibrant industry churning out innovative PCI cards on a regular basis. The idea of a single interface definition for all cards of each of the network, storage or video classes is viewed as simply too limiting and the argument was made to us that to force such a model would be to stifle innovation in peripherals. So effectively the feedback we got on "architectural hardware" was therefore along the lines of "good idea but not practical..."

Faced with that and what amounts to a demand for a solution, we tried to scope the problem. Today's IA-32 Option ROMs are typically 16-bit, IA-32 real mode code, they must live in a magic 128k (192 on some boxes) window below 1MB, and there are no hard and fast rules about what resources on the machine they may or may not touch. The reason the OEM folks asked us to look at solving this issue set in the context of EFI is to try and improve the real nightmares that they face every day. The kind of thing where you plug in an adapter card and suddenly the floppy doesn't work anymore. The kind of thing where you plug in four SCSI controllers and it's highly likely that you can't reach a perfectly good OS install on a drive connected to one of those controllers because the BIOS can't shadow that much ROM code. The kind of thing where ROM code uses I/O reads in lieu of calibrated delays causing controllers to fail on newer, faster systems.

EFI's origins come from the 64-bit side of the house. It was originally conceived in the context of a need for a means to handle programmatic transfer of control from the platform code (BIOS/firmware) to the OS; in other words an abstraction for the platform to support booting shrink-wrap OSes, installed right off the distribution CDs. However, we also worked hard at building a C language binding for those interfaces that would work just as well for IA-32 or even XScale or perhaps even for non-Intel Family processors in fact. The idea being a piece of code written to consume EFI services can compile unmodified and without gratuitous #ifdef's for 32-bit or 64-bit system merely by choice of compiler.

In the context of option ROMs then, this approach would say that you can write a single chunk of C language EFI driver code, your option ROM equivalent. This code can load anywhere in the address space of the machine (EFI uses protected mode, virtual equals physical addressing model), it can use the full address width of the machine for data references and you can compile it for your target machine architecture of choice. So far so good.

However there are some other practical deployment issues that add-in cards bring to bear that we also had to address.

These cards have a habit of traveling from machine to machine. Customers have a reasonable expectation that cards just work when you move them from one system and plug them in to another. Since the receiving system motherboard might have no knowledge of the card you just added, how does it present devices connected to that card as candidates for booting?? Motherboards cannot reasonably carry code for every device a customer might choose to plug in. Addressing that problem is what the Option ROM does for you.

We also tried to advocate for having the drivers/Op ROM images be separately distributed. Some of the IHVs liked that: just ship a floppy with the card, much cheaper than putting NVRAM memory on card. The OEM folks however point out that the floppy gets lost and now the card is useless to the customer...support calls ensue. Thus the code needs to travel as part of the card.

If a card can travel from system to system, that also means it can cross processor architecture boundaries too - there are Itanium Family and IA-32 family machines with PCI slots that are electrically compatible and the expectation is that the cards work equally well in both system types. For the Option ROM content though that presents a dilemma - what do you carry in the ROM?? Native compiled IA-32 code and also native compiled Itanium family code perhaps. Well that works, the PCI spec says a ROM container can have multiple images; we take advantage of that now to build cards that carry a 16-bit conventional ROM and an EFI driver together and there are also Forth images out there for SPARC and Power systems.

As a practical matter carrying multiple instruction set versions of the same code gets expensive in FLASH memory terms. Consider an EFI compiled driver for IA-32 as the index, size: one unit. With code size expansion, an Itanium compiled driver is going to be three to four times that size. Total ROM container requirement: one unit for the legacy ROM image plus one for an EFI IA-32 driver plus three to four units for an Itanium compiled driver image; to make the card "just work" when you plug it into a variety of systems is starting to require a lot of FLASH on the card. More than the IHVs were willing to countenance in most cases for cost reasons.

EFI Byte Code was born of this challenge. Its goals are pretty straightforward: architecture neutral image, small foot print in the add-in card ROM container and of course small footprint in the motherboard which will have to carry an interpreter. We also insisted that the C source for a driver should be the same regardless of whether you build it for a native machine instruction set or EBC.

We did some other things with EBC's definition too; like not including direct I/O instructions. That may sound odd for an environment specifically designed for I/O devices but if you think about it, it's the motherboard code that knows what is and is not "safe" to do by way of I/O more than the device itself that could find itself in pretty much any old machine design. This we believe will significantly improve the reliability of ROM code...it relieves the add-in card Op ROM writer of any attempts to guess and assume what the I/O environment is that the card will encounter out in the field.

You may ask why we didn't just use an existing definition as opposed to making a new one. We did actually spend quite a bit of time on that very question. Most alternatives would have significantly swelled the ROM container size requirement or the motherboard support overhead requirement or had licensing, IP or other impediments to deployment into the wider industry that we had no practical means to resolve. With specific reference to why we chose not to use the IA-32 instruction set for this purpose, it was all about the size of an interpreter for that instruction set. To provide compatibility 100% for the universe of real mode option ROM binaries out there would require a comprehensive treatment of a very rich instruction set architecture. We could see no practical way to persuade OEMs building systems using processors other than IA-32 to carry along that much interpreter code in their motherboard ROM.

Consider the model of Alpha and FX32 as an example; FX32 would be impractical to carry on the motherboard outside the scope of a running OS. At one point we did have an EFI draft that included a processor binding for Alpha at Compaq's request. That material isn't in the final spec for reasons that don't really relate to EFI. Nevertheless, making an Option ROM solution that could plausibly work on multiple CPU architectures (including ones from outside the IA family) and before there is an OS on the box to support an expansive interpreter loomed large in our thinking at the time. [By the by, we remain open to adding other CPU bindings into the EFI spec should anyone approach us with such a proposal in hand.]

By contrast EBC requires a small interpreter with no libraries (roughly 18k uncompressed total on IA-32 for example) and the average add-in card ROM image size is 1.5 units relative to native IA-32 code. And keep in mind that using byte code for this purpose is in widespread, long time use on other CPU architectures so we felt the technique in general was viable based on industry experience with it. Yes, it's a compromise but the best balance point we have been able find to date.

I agree that the compiler back end for EBC will be used for small chunks of code and relatively few of them at that. That compiler and its back end will by definition end up with less code-mileage on it, if you will. I can only say that Intel is supporting the compiler as a commercial product and we stand behind it just as much as we do the native IA-32 and Itanium compilers. Find a bug, let us know - we'll fix it. We run the same tests on the EBC compiler as we do on the native compilers and a few more besides that do EBC torture exercises. The compiler has been in testing for more than a year and in release for nearly that long now. At any rate, feedback we've received so far doesn't seem to indicate stability problems in the compiler; if your experience varies from that please let me know - I'd like to help fix it for you! Incidentally, nothing prevents someone from retargeting GCC for this application.

It is with no small trepidation, given the assembled company, that I turn to the question of Open Source as it relates to EFI and Option ROM code. However...

There is nothing about the definition of the EFI spec or the driver model associated that prevents vendors from making add-in card drivers and presenting them in Open Source form to the community. In fact we've specifically included the ability to "late bind" a driver into a system that speaks EFI. In practice that late binding means that code that uses EFI services and that is GPL code can be used on systems that also include EFI code that is not open source.

The decision on whether to make any given driver Open Source or not therefore lies with the creator of that code. In the case of ROM content for an add-in card that will usually be the IHV that makes the card.

Now, we observe that the high-end add-in card makers often preserve their intellectual property behind proprietary code (via binary drivers) and/or "object models" that they implement in the ROM. In today's lexicon that means an INT13 service for a SCSI card or an INT10 service for a video card. Even for Linux OS-present drivers I understand that some open source drivers for such cards don't actually touch the metal directly for all operations they perform - they use some abstraction between the driver and the actual hardware, something that is carried around in the ROM. I suspect that this paradigm is one that will continue for some time to come. Any change in this approach will have to be worked with the vendors who feel commercial pressure to protect their IP with these kinds of mechanisms.

The EFI spec itself is published with a simple copyright statement and not one of Intel's "colored" NDA covers. The sample code that you can download from our web site is free to you and comes with what amounts to a patent license grant so that you can implement EFI, perhaps using our code in derivative fashion, without royalty or other concern. We also have support tools for folks building EFI code, like Option ROM drivers, that is distributed under the FreeBSD license.

Jamie Lokier asked about the EFI patents, and Mark replied:

Actually no, there are no patents that I'm aware of that read on EFI. We made a point of not filing any on the spec. We told everyone we talked to during the spec's development that we wouldn't file any so that the spec would end up free of any IP considerations when complete. This was a deliberate effort to support the goal of minimizing any potential barriers to adoption of EFI as much as possible.

The patent license grant is thus in some sense a double coverage approach...you don't really need a patent license grant since there aren't any patents that read but to reinforce that you don't need to worry about patents we give you the grant anyway. This helped make some corporate entities more comfortable about implementing support for EFI.

In practice we have required that any feedback or contribution to the EFI spec or code from third parties that is given to us also comes without any IP encumbrances. There are a couple of things that I've been offered that I would like to have included but couldn't in the end because it would not have been possible to continue telling folks using the EFI spec that they can do so without concern for IP issues.
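The ROM sizing argument in Mark Doran's longer message above can be tallied in a few lines. The sketch below uses only the figures he quotes: one unit for the legacy ROM, one for an IA-32 EFI driver, three to four for an Itanium driver, and 1.5 for an average EBC image. The assumption that an EBC-based card still carries its legacy ROM alongside the EBC image is mine, not something the message states.

```python
# Sizes are in "units", where 1 unit = one EFI driver compiled for IA-32.
LEGACY_ROM = 1.0          # conventional 16-bit option ROM image
EFI_IA32 = 1.0            # native IA-32 EFI driver (the index)
EFI_ITANIUM = (3.0, 4.0)  # "three to four times" the IA-32 size
EBC_IMAGE = 1.5           # average EBC image relative to native IA-32


def native_total() -> tuple:
    """Legacy ROM plus native IA-32 and Itanium drivers, as (low, high)."""
    return tuple(LEGACY_ROM + EFI_IA32 + itanium for itanium in EFI_ITANIUM)


def ebc_total() -> float:
    """Assumption: the card keeps its legacy ROM and adds one EBC image."""
    return LEGACY_ROM + EBC_IMAGE


print(native_total())  # (5.0, 6.0)
print(ebc_total())     # 2.5
```

Under these assumptions, the EBC approach needs roughly half the flash of carrying both native images.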

8. Documentation For /proc/stat

30 Aug 2003 - 31 Aug 2003 (3 posts) Subject: "[PATCH] Add documentation for /proc/stat"

Topics: Version Control

People: Bryan O'Sullivan

Bryan O'Sullivan said, "This patch adds documentation for the contents of the /proc/stat file. The BK version of the patch at the URL below also instructs BK to ignore cscope database files."

9. Revising BK ChangeLog Entries After Acceptance

31 Aug 2003 - 1 Sep 2003 (18 posts) Subject: "bitkeeper comments"

Topics: Version Control

People: Larry McVoy, Albert Cahalan, Geert Uytterhoeven, Linus Torvalds, Christoph Hellwig, Jakob Oestergaard

Albert Cahalan noticed that a false changelog entry ended up in BitKeeper. He asked if it could be fixed. Larry McVoy said, "If you want the comments changed I can do that on bkbits.net and anyone who grabs the update from there will get the new comments. If you want the patch gone out of BK anyone can do that with a cset -x." Albert said the code itself was fine, and only the comment needed to be adjusted. He said, "I'm OK with whatever ensures that somebody looking back through the BitKeeper logs isn't going to come to the conclusion that I broke something." Larry said, "Unfortunately the checkin comments themselves are not revision controlled. You have to run a command on each repository that needs to be fixed, if you send me the desired comments I'll post the command. Then if Linus or Marcelo says do it I'll do it on bkbits.net. That should be good enough, the logs there are what people tend to browse." Albert posted the comment he wanted to see replace the original, but Christoph Hellwig objected to the whole idea of retroactively modifying the changelog entry. He felt it was censorship, and Larry replied, "it's not my place to make that call. I said "if Linus or Marcelo says do it" specifically for the case that there is some hanky panky going on. On the other hand, it's perfectly possible that the wrong comment got stuck in there and if that's the case why shouldn't it get fixed?" Close by, Geert Uytterhoeven cautioned, "Retroactively changing a commit message may be a dangerous precedent. While there may be legitimate reasons (E.g. plain wrong comments or `actually this part was not written by x but by y'), one day The Evil Empire may claim we changed the evidence of who did what. Putting comments under revision control is another option, but may be too deep-involving..." At around this point, Linus Torvalds remarked:

Actually, I do that all the time. In fact, it was I who asked Larry to add the "bk comment" command in to make it easy to do so.

The thing is, it's hard to do after the message has already gone out into the public - but I fix up peoples email commentary by hand both in the email and often later after it has hit my BK tree too. I try to fix obvious typos, and just generally make the things more readable.

And if the comment was wrong, then it should be fixed. Not because of any "censorship", but because it's misleading if the comment says it fixes something it doesn't fix - and that might make people overlook the _real_ thing the change does.

Jakob Oestergaard objected (and Geert agreed) that this defeated the revision control of the archive, making it possible for changelogs to say one thing at one time, and another at another time, with no indication that the first changelog version ever existed. However, at this point, the thread petered out.

10. dontdiff In The Kernel Tree

1 Sep 2003 (17 posts) Subject: "dontdiff for 2.6.0-test4"

Topics: Kernel Build System, Version Control

People: Tigran Aivazian, Jeff Garzik, Sam Ravnborg, Herbert Poetzl, Christoph Hellwig

Tigran Aivazian announced:

I have updated dontdiff in the usual place:

http://www.moses.uklinux.net/patches/dontdiff

for the 2.6 kernels. Obviously this was only tested on my configuration(s) so any additions are welcome. Just email them to me and I will add them.

For those who don't know what "dontdiff" is --- grep the file:

/usr/src/linux/Documentation/SubmittingPatches

Christoph Hellwig suggested putting this in the kernel source itself, and Tigran agreed, saying, "Probably a good idea, because I hesitated whether to call this "dontdiff-2.6" and leave the existing dontdiff for 2.4 or just switch to 2.6 (assuming it is applicable to 2.4 as well). But if it is in the kernel tree then no need to worry about which dontdiff matches which kernel." Jeff Garzik replied, "I'll throw it into 2.6. I use dontdiff all the time :) FWIW I use the same dontdiff for 2.4 and 2.6..."

Elsewhere, however, Sam Ravnborg said he didn't think dontdiff should go into the kernel tree. He said, "What is included in dontdiff is redundant information already known by kbuild. Effectively dontdiff should not list any files that would not be removed during a "make mrproper". Instead why not use the knowledge kbuild has and implement 'make dontdiff'? This could generate the list of files used for 'diff -X'. I can try to hack up something during the week just to see if it looks ok." Herbert Poetzl agreed, but Jeff objected that dontdiff "must know about many things that 'make mrproper' need not care about", such as files with a '.bak' or '~' suffix, or various version control directories used by BitKeeper, CVS, Subversion, RCS, and SCCS. Sam replied that 'make mrproper' already knew about all of those, and Jeff took a look and admitted (with a smile) this was true. But Jeff added, "dontdiff is a file that's useful precisely because of the form its in. So, as something that's proven itself useful to a bunch of people, I definitely think it has a home somewhere in Documentation/* It need not be referenced in any way by kbuild; that's not a big deal. The two really serve different purposes." Sam in turn discovered he himself had been wrong to assume that dontdiff could be handled safely by kbuild. On further inspection he found this was not necessarily true, and concluded, "So I have changed my mind - do not autogenerate it. Stuff in the dontdiff file somewhere (scripts/?)."
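
For readers unfamiliar with how an exclude file such as dontdiff is used with 'diff -X', the following Python sketch may help. It is only an approximation: it checks each path component against a list of shell-style patterns, roughly the way GNU diff skips files and directories whose base names match a pattern in the exclude file. The pattern list and the function names are made up for illustration and are not the contents of Tigran's file.

```python
from fnmatch import fnmatchcase

# Hypothetical sample patterns, loosely based on the kinds of files
# mentioned in the thread; the real dontdiff file is not reproduced here.
SAMPLE_PATTERNS = ["*.o", "*.bak", "*~", "CVS", "RCS", "SCCS", ".svn", "BitKeeper"]

def is_excluded(path, patterns=SAMPLE_PATTERNS):
    """Return True if any component of path matches any exclude pattern.

    Checking every component approximates the fact that an excluded
    directory hides everything beneath it.
    """
    parts = [part for part in path.split("/") if part]
    return any(fnmatchcase(part, pat) for part in parts for pat in patterns)

def files_to_diff(paths, patterns=SAMPLE_PATTERNS):
    """Keep only the paths that a patch should actually cover."""
    return [p for p in paths if not is_excluded(p, patterns)]

if __name__ == "__main__":
    tree = [
        "fs/ext3/inode.c",
        "fs/ext3/inode.o",
        "fs/ext3/inode.c~",
        "drivers/CVS/Entries",
        "Makefile.bak",
        "Makefile",
    ]
    for path in files_to_diff(tree):
        print(path)
```

With the sample tree above, only the two source files that are neither build products, editor backups, nor version control metadata remain in the list.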

11. Status Of i8xx Maintainership; Alan On Sabbatical

3 Sep 2003 - 4 Sep 2003 (9 posts) Subject: "Who maintains drivers/sound/i810_audio.c?"

Topics: MAINTAINERS File

People: Alan Cox, Jeff Garzik, Andrew Morton, Marc-Christian Petersen

Mehmet Ceyran found a small bug in the "Intel ICH (i8xx), SiS 7012, NVidia nForce Audio or AMD 768/811x" driver, and wanted to report it to the maintainer, but couldn't find anyone in the MAINTAINERS file to match that driver. He asked what he should do, and Jeff Garzik said to post the bug report or patch to the linux-kernel mailing list, and CC Alan Cox. Marc-Christian Petersen confirmed that Alan was maintaining i8xx audio, but was away on sabbatical for one year. Mehmet posted his patch and there was some discussion, in which Alan also participated. Alan was also the second most active poster of the week (after Andrew Morton), so perhaps his sabbatical had not yet begun.

12. Code Fork In Software Suspend In 2.6-test

3 Sep 2003 - 5 Sep 2003 (13 posts) Subject: "swsusp: revert to 2.6.0-test3 state"

Topics: Software Suspend

People: Patrick Mochel, Pavel Machek, Nigel Cunningham, Brian Litzinger

In the course of various threads, Pavel Machek had really blasted Patrick Mochel for some software suspend (and other) changes that had slipped past Pavel (the software suspend maintainer) and broken various things in the official 2.6-test tree. In this thread, Pavel posted a patch to undo all of Patrick's work, and Patrick replied, "I realize you're sore that I modified the code you maintain and unintentionally broke. However, it does not benefit either of us for you to intentionally break my code in return, especially considering I've since fixed the outstanding problems in my changes." A couple of posts later, he said:

No, you have to understand that I don't want to call software_suspend() at all. You've made the choice not to accept the swsusp changes, so we're forking the code. We will have competing implementations of suspend-to-disk in the kernel.

You may keep the interfaces that you had to reach software_suspend(), but you may not modify the semantics of my code to call it. At some point, you may choose to add hooks to swsusp that abide by the calling semantics of the PM core, so that you may use the same infrastructure.

Please send a patch that only removes the calls to swsusp_* from pm_{suspend,resume}. That would be a minimal patch.

Brian Litzinger was surprised to hear of such a significant code fork so late in the game before the next stable series. Elsewhere, Pavel said to Patrick, "I've said I want the patch reverted. I still want that, because you changed way too quickly with too little testing. That does not mean I'm not going to accept your patches in future. (In fact, my plan is to get -test3 version of swsusp back for -test5, then fix up driver model/swsusp until we have -test3 functionality back, then start taking your patches). Of course, that is going to be easier with your cooperation." Patrick replied:

That's fine. Do what you want at your own pace, with your own code.

I don't think you understood my assertion of not working with you, though. I'm not going to wait around for you to merge my patches, or take more abuse from you. I have better things to do, and a stringent time frame in which to do them.

I recommend either a) accepting my changes and fixes, and help merge Nigel's 2.4 changes into the base or b) accepting the fork, merging Nigel's changes, and later trying to merge the two source bases.

Nigel Cunningham said, "Okay. Since you've both given 'comforting' replies, I'll stop getting worried, get on with finishing the 2.4 version of 1.1 and then get the port up-to-date and going. But I'm still not sure what to prepare patches against or who to send them to. Hopefully you'll have that sorted by the time I'm ready to release 1.1 for 2.4." And the thread ended.

13. ReiserFS/ext3 Comparison

4 Sep 2003 - 5 Sep 2003 (19 posts) Subject: "precise characterization of ext3 atomicity"

Topics: FS: ReiserFS, FS: ext3, POSIX

People: Hans Reiser, Andrew Morton, Daniel Phillips

Hans Reiser asked:

Is it correct to say of ext3 that it guarantees and only guarantees atomicity of writes that do not cross page boundaries?

I am trying to define the difference between "Atomic Reiser4" and ext3, as it seems to be a frequently asked question, and I am thinking of saying something like:

Reiser4 allows you to define a set of up to A separate arbitrary filesystem operations (where A by default is not allowed to exceed 64) that are to be committed to disk atomically. Every individual filesystem operation is atomic without the need to specify it.

By contrast, ext3 only guarantees the atomicity of a single write that does not span a page boundary, and it guarantees that its internal metadata will not be corrupted even if your applications data is corrupted after the crash.

Andrew Morton confirmed that ext3 only guaranteed atomicity for writes that did not cross page boundaries. Daniel Phillips asked whether this was just "happenstance" or a POSIX requirement. Andrew replied:

Happenstance.

It's semi-trivial to do this in ext3. You'd open the file with O_ATOMIC and a write() would either be completely atomic or would return -EFOO without having written anything.

The thing which prevents this is the ranking order between journal_start() and lock_page().

It's not trivial but also not too hard to change things so that journal_start() can rank outside lock_page() - this would also offer some CPU savings.

Can't say that I'm terribly motivated about the feature though.

A couple of posts later, Daniel said to Hans, "More power to you for adding a transaction interface to Reiser4, and blazing that trail. It's totally missing as a generic api at the moment, and needs a push."

Back in his initial reply to Hans at the start of the thread, Andrew also questioned Hans' statement that ext3 "guarantees that its internal metadata will not be corrupted even if your applications data is corrupted after the crash." He said, "Not sure that I understand this. In data=writeback mode, metadata integrity is preserved but data writes may be lost. In data=journal and data=ordered modes the data and the metadata which refers to it are always in sync on-disk." A couple of posts later, Andrew suggested using the text:

"In all journalling modes ext3 guarantees metadata consistency after a crash. In its data=journal and data=ordered modes ext3 also guarantees that user data is consistent with metadata after a crash.

However ext3 does not provide user data atomicity guarantees beyond the scope of a single filesystem disk block (usually 4 kilobytes). If a single write() spans two disk blocks it is possible that a crash partway through the write will result in only one of those blocks appearing in the file after recovery"

Hans suggested in turn:

"Ext3 guarantees that its metadata will be comitted sufficiently atomically that after a crash it will be consistent with itself.

In data=journal and data=ordered modes ext3 also guarantees that the metadata will be committed atomically with the data they point to. However ext3 does not provide user data atomicity guarantees beyond the scope of a single filesystem disk block (usually 4 kilobytes). If a single write() spans two disk blocks it is possible that a crash partway through the write will result in only one of those blocks appearing in the file after recovery."
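
To make the block-boundary condition in both proposed texts concrete, here is a small Python sketch of the arithmetic. It assumes a 4096-byte block size (the "usually 4 kilobytes" above) and byte offsets counted from the start of the file; the function names are invented for this illustration and do not come from ext3 itself.

```python
BLOCK_SIZE = 4096  # assumed block size; the real value depends on the filesystem

def blocks_touched(offset, length, block_size=BLOCK_SIZE):
    """Return the range of block indexes covered by a write.

    offset is the starting byte position in the file and length is the
    number of bytes written. A zero-length write touches no blocks.
    """
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)

def spans_multiple_blocks(offset, length, block_size=BLOCK_SIZE):
    """True if a single write() would cover more than one block."""
    return len(blocks_touched(offset, length, block_size)) > 1

if __name__ == "__main__":
    for offset, length in [(0, 4096), (0, 4097), (4090, 10), (4096, 100)]:
        status = "spans blocks" if spans_multiple_blocks(offset, length) else "single block"
        print(offset, length, status)
```

Under these assumptions, a 10-byte write at offset 4090 covers blocks 0 and 1, so it is the kind of write for which, per the texts above, only one of the two blocks might appear in the file after a crash. A 4096-byte write at offset 0 stays within a single block.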

At this point other folks came into the discussion, and the thread ended inconclusively.

Sharon And Joy
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.