Kernel Traffic #71 For 12 Jun 2000

By Zack Brown

Introduction

Thanks again go to Tom Davey for typo corrections. That's three issues in a row. Thanks, Tom!

William Astle had some information to add about the 6809 chip in Issue #70, Section #1 (16 May 2000: Linux On Hand Calculators). Thanks a lot, William!

Also, many thanks go to Stephen Landamore for pointing out that the multi-mount feature discussed in Issue #70, Section #2 (18 May 2000: Things To Do Before 2.4: Saga Continues) had a predecessor in the devfs discussion covered in Issue #64, Section #9 (14 Apr 2000: More 'devfs' Discussion). Stephen, you totally rock! Those are tough connections to spot, but they really help make sense out of what's going on.

As of KT publication time, kernelnotes.org seems to be down, so there are no links into the mailing list archives at the moment. I'll fix them when it comes back up.

Mailing List Stats For This Week

We looked at 1855 posts in 7757K.

There were 527 different contributors. 272 posted more than once. 167 posted last week too.

1. Troubles Getting IDE Code Into The Stable Series

21 May 2000 - 30 May 2000 (15 posts) Archive Link: "ide.2.2.15.20000509 patch breaks WDC31200F w/ VIA 82C586B"

Topics: Disks: IDE

People: Alan Cox, Andre Hedrick, Sasi Peter, Andrzej Krzysztofowicz, Bartlomiej Zolnierkiewicz

In the course of discussion, Bartlomiej Zolnierkiewicz tried pre-2.2.16-2 with the ide.2.2.15.20000504 patch and VIA82CXXX support, and got an "unknown partition table" error during bootup. Alan Cox traced this to the ide.2.2.15.20000504 patch, and added that 2.3.x had the same patch merged in and thus had the same problem. He finished, "I imagine Andre" [Hedrick] "is still looking for this one." Andre Hedrick replied, "2.3 is going to be clean or cleaner.........do not try to drop ide.2.2.15.20000509 on a 2.2.16-preX tree........the fragmentation is bad." (He added peripherally, "The real pain to try and write blind code! I have ZERO VIA systems to date, but they are promised to send something?!"). Sasi Peter remarked, "At least if they succeed to bring 2.2.16 performance back up to the level of 2.2.14 even in some 2.2.16preX, it would be good to have an ide patch for it..." But Alan replied that even if this were the case, he wouldn't merge the patch because too many systems broke with it. Andrzej Krzysztofowicz asked if there was a chance to merge the patch after all the known bugs had been fixed, and Alan replied, "By then it'll be 2.4 time. (and I dont mean that rudely - thats a serious estimate on timescales for the two things)"

2. Some Discussion Of Signals And Threads

23 May 2000 - 1 Jun 2000 (9 posts) Archive Link: "pthread problem with asynchronous signals"

People: Robert M. Hyatt, Christopher Smith

In the course of discussion, Robert M. Hyatt remarked:

Signals and threads just don't mix. Signals terminate many system calls with EINTR (for example). Multiple threads can receive the signal, including every thread _but_ the one you really want to see the thing.

I've been doing threads for a long time, dating back to Cray's first version of Unicos. I found that signals are simply something to be completely avoided if threads are being used.

But Christopher Smith objected:

This is not entirely fair. Certainly, lots of things that are done via signals in single-thread models don't map so well in multi-thread models (and indeed are better done in the first place using threads and blocking), but signals have a very valid place in a threaded environment, if for no other reason than a threaded program should still be able to respond to SIGINT. ;-)

Certainly it's not easy to provide a good interface to signals in a threaded environment, but it can and has been done. Linux's implementation isn't exactly perfect yet, but it's getting there.
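
[Illustration, not from the thread: one widely used way to make signals tolerable in a threaded program is to block them in every thread and let a single dedicated thread collect them synchronously with sigwait(). The sketch below shows that pattern in Python on a POSIX system. The function name and the choice of SIGUSR1 are assumptions made for the example, and it says nothing about how the Linux thread implementation of the time behaved.]

```python
import os
import signal
import threading


def wait_for_signal_in_dedicated_thread(signo=signal.SIGUSR1):
    """Block `signo` everywhere, then accept it in one chosen thread."""
    # Block the signal in the main thread *before* creating other threads.
    # New threads inherit the creator's signal mask, so no thread can be
    # interrupted asynchronously (and no system call fails with EINTR).
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signo})
    received = []

    def signal_thread():
        # sigwait() returns the next pending signal from the given set;
        # no handler runs, so there is no asynchronous interruption.
        received.append(signal.sigwait({signo}))

    try:
        worker = threading.Thread(target=signal_thread)
        worker.start()
        # Send a process-directed signal; it stays pending until the
        # dedicated thread picks it up.
        os.kill(os.getpid(), signo)
        worker.join(timeout=5)
    finally:
        # Restore the original mask once the signal has been consumed.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received
```

The design choice here is the one Christopher alludes to: the program can still respond to a signal, but only in the one place that is prepared for it.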

3. Some Discussion Of Compiler Optimization Switches

24 May 2000 - 1 Jun 2000 (17 posts) Archive Link: "-O2 vs -O3"

People: Dave Jones, Jamie Lokier, Rask Ingemann Lambertsen, Andreas Schwab, Johan Kullstam, Matthew Wilcox, Thomas Pornin, Matthias Andree, Christopher Thompson

Christopher Thompson asked why the kernel was compiled with -O2 optimization, instead of the more aggressive -O3. Several sub threads branched out in reply. Dave Jones said anecdotally, "During the early development of Powertweak I set the Makefile files to compile some routines with -O3 instead of -O2. I then spent hours looking for a bug which wasn't there. The optimiser was over-zealous, and removed a pointer assignment." Jamie Lokier explained, "-O3 turns on extra inlining. Inlining can allow the optimisers to do fuller alias analysis. Type alias analysis breaks things which recast pointer types such as malloc, memcpy, graphics rendering loops, etc. (But usually only when they're inlined)." Dave had also remarked that -O2 probably used only known-to-be-safe optimizations, but Rask Ingemann Lambertsen corrected, "It doesn't. None of the -Ox options turn on unsafe optimisations. While compiler bugs do occasionally happen, most of the time it is the programmer that violated the rules."

Andreas Schwab also replied to Christopher, explaining, "the only difference between -O2 and -O3 is -finline-functions, which is bad for the kernel sources (which wants to control inlining explicitly)." Matthias Andree replied that this could in turn be defeated with a '-fno-inline-functions' argument, but Johan Kullstam objected, "sure, but what exactly would be the point? the compile command line is already long enough without this completely gratuitous bloat."

Matthew Wilcox also replied to Christopher, saying, "-O3 introduces some optimisations which may not be appropriate for the kernel. In particular, it tries to know better about function-inlining. It is asserted (note I don't necessarily agree with the assertion :-) that the kernel developers in their infinite wisdom have analysed which functions would best be inlined and which are best out-of-line." And Thomas Pornin elaborated, "Just for completeness: there are a couple of functions that must not be inlined (for instance, __delay() that is in arch/i386/lib/delay.c); but the main reason seems to be the following: the compiler uses some heuristics to guess whether a function is worth inlining, or not. These heuristics are tuned for userland code, not kernel code, where the situation is different (for instance, memory is much more expensive in the kernel, since kernel memory cannot be swapped out)."

There was a bit more discussion.

4. CML2 Replacement For The 'kbuild' System; Language Dispute

24 May 2000 - 3 Jun 2000 (181 posts) Archive Link: "Announcing CML2, a replacement for the kbuild system"

Topics: Disks: SCSI, Kernel Build System

People: Eric S. Raymond, Alexander Viro, Alan Cox, Peter Samuelson, Larry McVoy, Giacomo Catenazzi, James Sutherland, Michael Elizabeth Chastain

Eric S. Raymond announced (quoted in full):

For some weeks now, I have been developing a replacement for the kbuild system used to configure Linux kernels. This effort has had the support of Michael Elizabeth Chastain, the principal kbuild maintainer, and has benefited from input by others on the kbuild list.

The project is not yet complete, but it has reached a beta stage at which it is usable and in significant ways functionally superior to the present system. I am confident that it will complete. I am announcing now rather than holding off until I'm completely done because there are some preparations which, if begun now, will significantly reduce total transition costs. These preparations will *not* break the present kbuild system.

Why this project at all? It all started when I realized that building kernels is way too hard. I wanted to simplify the configuration task enough to make configuration accessible to non-gurus. It needs to have more policy options. Rather than hundreds of questions like "Include FOOBAR2317 driver?", the novice should see stuff like "Compile in all modular drivers as modules without prompting?"

This just can't be done with the existing kbuild system. The existing config-language programs are hard to read and modify, and the code that interprets them has become a huge, unmaintainable hairball of Tcl/Tk, C, makefiles, and shell. It has all become terminally brittle, and the maintainers agree that it needs to be nuked and rebuilt from scratch.

It happens that I love writing domain-specific minilanguages, so I have tackled this problem head-on. I have designed a new configuration language I call CML2 (the existing language I have retrospectively named CML1). The implementation has two parts:

  1. I have implemented a CML2 compiler that validates CML2 rulesets and generates a rulebase that can be used to drive a configuration process. I have translated almost all of the 7049 lines of CML1 in the 2.3.99-pre9 source tree to validated CML2 (and *that*, believe me, was hard work -- it took longer than the CML2 design and coding!).
  2. I have written a configurator that is ready for testing. This program reads in the rulebase and uses it to do the actual config-file generation from a dialog with the user. Though not yet equipped with a Tk interface, this program fully demonstrates the capabilities of CML2. It runs in either line-oriented or curses mode depending on the display environment (line-oriented mode can be forced with a command-line switch).

The line-oriented mode of the new configurator is much more powerful than the original Configure. It's possible to move backward or jump around in the configuration sequence; the constraints that were expressed by if-then-else logic in CML1 are now checked every time the value of a relevant symbol is changed. It also has full access to the help system.

The curses mode, unlike the old menuconfig code, also has full access to the help text. It reports attempts to set symbol combinations that would result in an invalid configuration.

The configurator should be able to present a Tk-based menu interface when it detects that it's running on an X display. This is the part I haven't written yet.

The code needs more testing, which is one reason I am announcing now. It would be useful for configure maintainers to begin running through odd configurations to see if they can get it to misbehave.

The first alpha of CML2 is available at

http://www.tuxedo.org/~esr/kbuild/

It includes the alpha implementation, documentation, and a transition guide for maintainers of CML1 code.

OK, here's the bad news: the new system will not be an instant, painless replacement for the old. I tried hard, but there was just too much cruft to clean up for that to be possible.

The major source of problems is, as you might expect, that 6747-line mass of old code -- the new language is nontrivially different than the old, and the CML1 corpus is so tangled and nasty that I am certain I have made at least a few mistakes in the translation. There are a couple of places where I didn't understand the author's intentions well enough to translate some particularly grotty CML1 code. I'll need some help untangling these knots.

I apologize for this, but the translation overhead would only have been avoidable if CML1 had been good enough not to replace. It's a cost we have to pay to clean up a mess that would otherwise only have gotten worse, and eventually have become a serious drag on kernel development and porting.

There are some other minor problems, which we can fix up front. Mostly they have to do with cleaning up the configuration-symbol namespace (which would be a good idea even if we planned to keep CML1).

Now the good news: we will win big by changing over. Here are some of the advantages of the new language:

  1. Single parser and front end
    CML1 had three different interpreters, none perfectly compatible with any of the others. CML2 has one rule compiler and one rulebase-interpreter front end. This will be good for consistency and economy.
  2. A more expressive, easier-to-program configuration language
    The rather spiky and cluttered shell-like syntax of CML1 is replaced with a much simpler and more regular format resembling that of .netrc or .fetchmailrc. More importantly, the semantics of the language are declarative rather than imperative -- a better match for the problem domain, and thus more expressive and easier to code in.
  3. Drastic reduction in code size and complexity
    The 7049 lines of CML1 in the 2.3.99-pre9 kernel translate to a hair less than 2400 lines of CML2, a reduction by a factor of about three. The CML2 compiler and prototype interpreter are the same factor of three smaller than the nearly 10,000 lines of code in the CML1 interpreters and tools. Where CML1 is a complex mixture of C, shell, Tcl/Tk, and Makefiles, CML2 is all written in a single language (Python).
  4. Eliminates (or at least drastically reduces) lag between port configurations
    The fact that the top-level CML1 files of the nine ports in the kernel tree are separate means there have been plenty of opportunities for the common code in them to suffer from version skew -- I point out about a dozen bugs of this kind in the list of errors at the end of this post. CML2's design and compilation rules should effectively prevent future bugs of this kind.
  5. Clean separation between configuration language and configuration UI
    CML2 decouples the configuration language from the configuration user interface (they communicate with each other only through the compiled rulebase). This means that it will be relatively easy to improve the UI and the language separately.
  6. Internationalization
    CML2 query prompts and menu banners are separated from the symbol dependency declarations. Thus CML2 system definitions can be internationalized and localized.
  7. Language is fully documented
    CML2 has a complete, explicit description. Syntax, language semantics, and front-end policy options are all discussed in detail.
  8. Policy-based options
    The declarative semantics of CML2 makes it much easier to set up and check interdependencies among symbols. I have done only enough of this in the CML1 translation for demonstration purposes (there are new symbols TUNING, EXPERT and WIZARD that change some visibilities). Once CML2 is in place, it should be a relatively small effort to give the user a rich set of policy and don't-bother-me options.

So, how do we get there from here?

Obviously, I have to finish the CML2 front end. This is not a large job; I already have a demonstrable prototype that runs in tty and curses modes, and even on my heavy travel schedule I expect to have the Tk version ready in about two weeks.

I have designed the CML2 implementation to coexist with CML1, so both methods can be used while CML2 is being field-tested and debugged. I anticipate a phase-in over three or four point releases during 2.5.x, followed by a back-port to 2.4.x. Once CML2 is reported OK by the various porting groups, Linus can quietly nuke the CML1 machinery.

There are a couple of preparation steps that can fruitfully begin now and should happen before 2.4 in order to minimize backporting hassles later.

  1. Notably, I would appreciate it if config-file maintainers made the following changes in those files and relevant C code:

    CONFIG_6xx -> CONFIG_PPC_6xx
    CONFIG_4xx -> CONFIG_PPC_4xx
    CONFIG_PPC64 -> CONFIG_PPC_64
    CONFIG_8260 -> CONFIG_PPC_8260
    CONFIG_8xx -> CONFIG_PPC_8xx
    CONFIG_060_WRITETHROUGH -> CONFIG_M68060_WRITETHROUGH
    CONFIG_21285_WATCHDOG -> CONFIG_DC21285_WATCHDOG
    CONFIG_3C515 -> CONFIG_ISA3C515
    CONFIG_8139TOO -> CONFIG_RTL8139
    CONFIG_82C710_MOUSE -> CONFIG_CT82C710_MOUSE
    CONFIG_977_WATCHDOG -> CONFIG_WB83C977_WATCHDOG
    CONFIG_3215 -> CONFIG_IBM3215
    CONFIG_3215_CONSOLE -> CONFIG_IBM3215_CONSOLE

    The reason for these is that CML2 symbol names drop the CONFIG_ prefix. It's unneeded clutter, and made CML1 programs harder to read (the eye-brain systems that handle spelling look for prefix matches to recognize things).

    Also, I had to change the KEYBOARD and MOUSE symbols used in the MIPS branch to MIPS_KEYBOARD and MIPS_MOUSE. This is because the MOUSE symbol seems to be used in different ways on different architectures (notably in the Intel branch).

  2. I also found some apparent errors. I need these explained so I'll know how to handle them in the translation. A summary of these apparent errors is included at the end of this post.
  3. Those are the easy parts. The hard part is that I'd like to ask config maintainers to eyeball-check the CML2 translation of their work *now*. Where I am most likely to have erred is in setting visibility constraints by architecture. Ideally, I'd like everyone to have confidence that the translation is correct by the time the Tk-based front end comes out.
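
[Illustration, not part of Eric's post: the renames requested in step 1 could be applied mechanically with a whole-word substitution. The sketch below uses a few of the pairs from the list above; the function names are hypothetical. The digit check reflects an assumption about why the renames matter, namely that a name such as 3C515 would start with a digit once the CONFIG_ prefix is dropped.]

```python
import re

# A few of the old -> new pairs taken from the list in the announcement.
RENAMES = {
    "CONFIG_6xx": "CONFIG_PPC_6xx",
    "CONFIG_3C515": "CONFIG_ISA3C515",
    "CONFIG_3215": "CONFIG_IBM3215",
    "CONFIG_3215_CONSOLE": "CONFIG_IBM3215_CONSOLE",
}


def rename_symbols(text, renames=RENAMES):
    """Replace each old symbol with its new name, matching whole words only."""
    # Longest names first, so CONFIG_3215_CONSOLE is tried before CONFIG_3215.
    names = sorted(renames, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)], text)


def cml2_name(config_symbol):
    """Drop the CONFIG_ prefix, as the announcement says CML2 names do."""
    prefix = "CONFIG_"
    if config_symbol.startswith(prefix):
        return config_symbol[len(prefix):]
    return config_symbol
```

For example, rename_symbols("#ifdef CONFIG_3215_CONSOLE") leaves the longer symbol intact as a unit and rewrites it to CONFIG_IBM3215_CONSOLE rather than touching only the CONFIG_3215 part.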

Presently the entire CML2 translation lives in a single file, rather than being distributed into per-subdirectory files like the CML1 corpus. This is a temporary expedient to make the transition easier. CML2's "source" facility is quite powerful enough to support distributing the information later on.

The current CML2 menu tree is ugly and poorly organized -- that is to say, it has changed relatively little from the CML1 version. I am deliberately refraining from large changes yet. Once we have tested and switched over to CML2, it will be possible to do a complete redesign of the kbuild user experience. The most important feature of CML2 is that it will give us the capability to explore that design space without risking breaking the ability to build kernels at all.

Here are the apparent errors I found in the CML1 corpus:


There is what appears to be an error in the M68K configuration sequence. Inside a PARPORT guard, the question 'Q40 Parallel port' sets PARPORT again. I created a PARPORT_Q40 symbol and set it from this question.

The symbol SGI occurs in conditionals in config files but is never set or associated with a query, nor is it used in C code anywhere.

The symbol SUN3 is used in conditionals and set to n at one point, but there is no place where it is set to y.

The symbol FB_CONSOLE is set at one point but never used in either C or config language code.

The symbol ABSTRACT_CONSOLE is not used in C code, nor set anywhere in config code.

The symbol AIC7XXX_TAGGED_QUEUEING is set in the sparc64 configuration code, but not used in C code. I suspect it should be AIC7XXX_TCQ_ON_BY_DEFAULT.

The symbol ADB_PMU68K defined in the M68K driver is not referenced anywhere in the config code and not used in the C code. It seems to be a misspelling of ADB_PMU. I have eliminated it.

In the ARM port configuration, symbols ARCH_TBOX and ARCH_SHARK and ARCH_NEXUSPCI and ARCH_NEXUSPCI are referenced but never associated with a query or defined.

There are two symbols in the configuration code that seem to relate to endianness on processors that can operate in either big-endian or little-endian mode. One is CPU_LITTLE_ENDIAN (in the MIPS ports) and the other is LITTLE_ENDIAN (in the SuperH port). Neither is used in the C code; CPU_LITTLE_ENDIAN is used in a guard, once, in the MIPS32 config. I have changed LITTLE_ENDIAN to CPU_LITTLE_ENDIAN.

The symbol PRINTER_READBACK is queried for once, but never used in the config or C code. I suspect it should have been asking for PARPORT_1284.
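
[Illustration, not part of Eric's post: most of the apparent errors above are of two kinds, symbols that are set but never used, and symbols that are tested but never set. A cross-reference of that sort can be sketched with plain set arithmetic. The function names and the tiny inputs are hypothetical; this is not the tool Eric used.]

```python
import re


def symbols_in_c(c_source):
    """Collect NAME for every CONFIG_NAME that appears in a piece of C source."""
    return set(re.findall(r"\bCONFIG_([A-Za-z0-9_]+)", c_source))


def audit(set_in_config, tested_in_config, c_source):
    """Report symbols that look orphaned in one direction or the other."""
    used_in_c = symbols_in_c(c_source)
    return {
        # Set by a query or assignment, but no config guard or C code reads it.
        "set_but_never_used": sorted(set_in_config - tested_in_config - used_in_c),
        # Tested in a config conditional, but nothing ever gives it a value.
        "tested_but_never_set": sorted(tested_in_config - set_in_config),
    }


# Toy inputs loosely modelled on two of the reports above.
report = audit(
    set_in_config={"FB_CONSOLE", "PARPORT"},
    tested_in_config={"ARCH_TBOX", "PARPORT"},
    c_source="#ifdef CONFIG_PARPORT\n#endif\n",
)
```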


Note: this announcement was crossposted to the linux-kernel and linux-kbuild lists. You may want to use group reply in responding to it to reach both populations.

Port maintainers and others with a continuing interest in the development of CML2 should probably join the kbuild list -- subscribe in the usual way via linux-kbuild-request@torque.net

There were a few minor complaints (the URL didn't work at first, etc.) and one big one: the code was written in Python. Alexander Viro put it, "Dependency on Python is definitely a problem - not everyone uses 'everything and a kitchen sink' type of userland." Alan Cox replied, "Python is actually pretty small and you need it on the build box not on the runtime host." Eric also replied to Alexander, saying:

I'm aware of the problem :-). Python programs can be compiled to portable C sources using the `freeze' tool. The translation is inelegant (it embeds a Python byte-code interpreter in the generated C) but it works. So a working CML2 configurator can be shipped for an environment without Python.

Even if this weren't true, we'd be trading dependencies and not adding one. The Perl stuff in the scripts directory will go away shortly (that is, assuming that Linus approves the CML1->CML2 change). This would be a net gain in kernel autonomy, as Perl *can't* be compiled away.

James Sutherland pointed out that in fact it could, and various other linguistic comparisons started throwing off sparks. A Perl vs. Python war seemed likely to break out, with various people trying to pour buckets on the flames before the curtains caught.

In one place, on the possibility of rewriting CML2 in C, Peter Samuelson remarked historically, "When Eric first brought his proposal to linux-kbuild a few months ago, we made him promise to draft a full language spec (as opposed to just a RTSL spec). This is one reason -- so it can be reimplemented accurately. In fact, depending on how well I end up liking his language (haven't downloaded it yet, but I remember roughly what it looked like in earlier design stages -- it did seem sane) I might take this on myself. Because I don't know if I want to see the extra tool requirement."

Eric replied, "I feel impelled to note that the kbuild guys didn't "make" me promise that -- I would have written a careful language description anyway because I just *do* things that way. It's called professionalism, or something." And Peter amended, "I know, sorry for implying otherwise. MEC" [Michael Elizabeth Chastain] "asked you to, as I recall, and you said you had already planned on it."

Elsewhere, Larry McVoy interrupted the arguing to remark, "I'd just like to welcome Eric back to programming. I personally think that you lead by example, and what better way to lead a bunch of programmers than by programming? Welcome back, Eric, it's good to see you doing what you do so well." Eric replied:

The truth is that I never stopped coding. But I guess this list can't be expected to be au courant on SNG or pnglib or the other stuff I've been up to in the last year.

I wrote a lot of CML2 on my VAIO in hotel rooms and on plane flights. Coding is how I stay sane amidst the suits and journalists.

Elsewhere, under the Subject: CML2 version 0.18 is available (http://kernelnotes.org/lnxlists/linux-kernel/lk_0005_05/msg00342.html), Eric announced:

This version fixes some curses-mode lockups reported by Giacomo Catenazzi and David Kamholz; I had a crucial test reversed :-(.

This version also adds online command help, a go-to command, and a search command to the Tk interface.

Later on, he announced 0.2.0, 0.2.3, and 0.2.9 (at which point he said, "We seem to be closing in on production-ready status."). At around this point, under the Subject: State of CML2 (http://kernelnotes.org/lnxlists/linux-kernel/lk_0006_01/msg00824.html), he explained:

It occurs to me that amidst the noise of debate about CML2's implementation and my point release announcements, many people on the list may not be aware of the capabilities CML2 has grown since I announced it last week.

So here is a brief list of things CML2 can do, *right now*, given a correct rulebase:

  1. When a symbol is turned on, all the stuff it requires is turned on as well. Thus, features can be selected in any order.
  2. Invalid configurations will be caught at the moment you try to create them. In fact, the configurator won't let you create an inconsistent configuration at all; instead, you get notified of the rules it would break and your last change (including its side effects) is backed out.
  3. You can search for symbols matching a given regexp in their name or prompt. The search hits are presented as a menu like any other menu. You can also go to a symbol by name.
  4. In the normal, "top-down" view, questions are not visible until you have enabled all their pre-requisites. Thus (for example) you won't even see questions about individual SCSI drivers unless you specify that you want to support SCSI.
  5. However, there is a switch you can flip which will cause all symbols to show, so you can configure "bottom-up" by specifying your hardware and letting the front end deduce what it needs.
  6. I fibbed a bit in point 5. It is possible to commit or "freeze" symbols so they are never queried for, even in bottom-up mode. In particular, the equivalent of "make oldconfig" works by reading in your .config, freezing all the symbols in it, and then using those to deduce most of the rest of your configuration.
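
[Illustration, not part of Eric's post: the behaviours in the list above can be modelled in a few dozen lines. The sketch below is a toy, not CML2's code or rule syntax. The class, its methods, the TINY symbol, and the conflict rule are all hypothetical, chosen only to show prerequisite propagation, backing out an invalid change, regexp search, top-down visibility, and freezing.]

```python
import re


class Configurator:
    """Toy model of the listed behaviours; not CML2's real implementation."""

    def __init__(self, prompts, requires, conflicts):
        self.prompts = prompts      # symbol -> prompt text
        self.requires = requires    # symbol -> set of prerequisite symbols
        self.conflicts = conflicts  # sets of symbols that may not all be on
        self.enabled = set()
        self.frozen = set()

    def enable(self, symbol):
        """Turn on a symbol and all it requires; back out if a rule breaks."""
        snapshot = set(self.enabled)
        stack = [symbol]
        while stack:
            sym = stack.pop()
            if sym not in self.enabled:
                self.enabled.add(sym)
                stack.extend(self.requires.get(sym, ()))
        broken = [rule for rule in self.conflicts if rule <= self.enabled]
        if broken:
            # Undo the change together with its side effects.
            self.enabled = snapshot
        return broken

    def visible(self, show_all=False):
        """Top-down view hides questions whose prerequisites are still off."""
        return sorted(
            sym for sym in self.prompts
            if sym not in self.frozen
            and (show_all or self.requires.get(sym, set()) <= self.enabled)
        )

    def search(self, pattern):
        """Find symbols whose name or prompt matches a regular expression."""
        rx = re.compile(pattern, re.IGNORECASE)
        return sorted(
            sym for sym, prompt in self.prompts.items()
            if rx.search(sym) or rx.search(prompt)
        )

    def freeze(self, symbols):
        """Commit symbols so that they are never queried for again."""
        self.frozen |= set(symbols)


# Hypothetical rulebase: one SCSI driver, plus a made-up TINY option
# that is declared incompatible with SCSI.
PROMPTS = {
    "SCSI": "SCSI support",
    "SCSI_AIC7XXX": "Adaptec AIC7xxx support",
    "TINY": "Minimal kernel (hypothetical)",
}
REQUIRES = {"SCSI_AIC7XXX": {"SCSI"}}
CONFLICTS = [frozenset({"TINY", "SCSI"})]
```

With this rulebase, enabling SCSI_AIC7XXX in the show-all view pulls in SCSI automatically, and a later attempt to enable TINY is reported and rolled back rather than leaving an inconsistent configuration.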

For balance, here is a list of CML2's known problems.

  1. The Tk front end sometimes creates panels too large for the screen. The xconfig trick using a canvas and scrollbar is the *only* CML1 thing that CML2 can't yet do.
  2. The curses front end cannot yet edit string values longer than 8 chars.
  3. The curses front end, as of two days ago, still had some crash bugs. These may be gone now but I'm not holding my breath. It will probably take another week to be reasonably sure they're nailed.
  4. Turning module support on and then off may leave some module symbols stranded in a bad state.

These things will be fixed. I have some long plane flights coming up :-).

5. Adaptec Blows Off Kernel Developers

26 May 2000 - 30 May 2000 (7 posts) Archive Link: "Adaptec AAA-131U2 RAID"

Topics: Disk Arrays: RAID, Disks: IDE, Disks: SCSI

People: Andre Hedrick, Steven N. Hirsch, Alan Cox

Rick Stevens asked if the Adaptec AAA-131U2 RAID controller was supported under Linux. Andre Hedrick said that if that was their ATA RAID then no, it was not supported, and he went on with gar, "If they can not get off their ASS to meet me when I call them about it and show up in their campus the day they announce a press release, they can write it themselves for all I care at this point in time......" But Alan Cox corrected him, saying that Rick's hardware was Adaptec's Ultra2 SCSI RAID, and was not supported. But someone else pointed out that the card could be induced to work as a regular SCSI adapter, using the aic7xxx driver.

Elsewhere, Andre went into more detail:

I need to explain why I am upset with them...... Back at LinuxWorld they asked me to work with them on development. They never called upon the product release. They choose to ignore me after asking for help. It is a matter of professionalism given to the OS regardless of the people involved.

Think of it as not showing up for a blind date.

Steven N. Hirsch replied with darkened brow, "Actually, I think of it as being "business as usual" for Adaptec. They seem to go out of their way to be uncooperative."

6. Troubles Getting NFS Fixes Into 2.2.x

27 May 2000 - 3 Jun 2000 (76 posts) Archive Link: "Linux 2.2.16pre5"

Topics: FS: NFS

People: Chip Salzenberg, Alan Cox, Andi Kleen, Neil Brown, David Weinehall, Stephen Frost

Alan Cox announced 2.2.16pre5, and Chip Salzenberg implored, "Trond, Neil, and Dave Higgen have NFS rock-solid. Could we please get it into 2.2.16? Please?!?!" Alan replied, "How many thousand testers on how many OS's ? I have Trond's small lockup fix in. That'll do for now. There are reasons for doing 2.2.16 promptly or I'd be willing to try the new NFS patches." Chip sighed, and asked if they might make it into 2.2.17. Alan replied, "Never a guarante, but we definitely need to get the NFS stuff in and tested." Chip argued that it had been well tested by various folks, and Alan compromised, "we'll give it a spin but understand that if it generates a pile of 'it broke' mail it goes back out."

Andi Kleen pointed out, "The new NFS code requires user land updates for nfs-tools/knfsd. Is that appropiate for a 2.2 kernel?" Elsewhere but close by, Alan said, "Either they make it work with the existing tools as shipped by the vendors or it doesnt go in. It has to work as well as before with the old tools." But, also close by, Neil Brown objected, "Which came first, the chicken or the egg? Having full v3 support in the kernel might help encourage util-linux to get up-to-date."

Elsewhere, under the Subject: Linux 2.2.16pre6 (http://kernelnotes.org/lnxlists/linux-kernel/lk_0005_05/msg00409.html), in the course of discussion, David Weinehall lamented Alan's plan to release 2.2.16 without NFS. He said, "I know that this seems a little narrow-sighted, but NFSv3 (or indeed, any NFS-related fixups) has been on hold for so long now that I've lost count. For every new kernel release, I hold my hopes up "Yes, now we can spend the time between this minor and the next one merging NFS-fixes and testing them out properly", but so far, nothing of the kind has happened." Alan replied that there were good reasons to get 2.2.16 out as soon as possible. Ben McCann asked what these reasons were, and Stephen Frost replied, "2.2.15 ain't the best, from what I understand." There followed a few scattered reports against 2.2.15, but nothing comprehensive.

7. Backporting Filesystem Fixes To 2.2/2.0

27 May 2000 - 1 Jun 2000 (7 posts) Archive Link: "[PATCH] 2.2.X fix ext2 socket filetype"

Topics: FS: ext2, FS: ext3

People: Andreas Dilger, David Weinehall, Stephen C. Tweedie, Alan Cox, Jamie Lokier, Theodore Y. Ts'o

Andreas Dilger posted a short patch against linux-2.2.14/fs/ext2/namei.c, and said, "Alan, can you please add the following patch into the 2.2.16-pre tree. It is a back-port of a fix in 2.3 which adds the missing de->file_type information for sockets in ext2. Without this patch, e2fsck complains about all of the inodes in the filesystem that are sockets because the filetype hasn't been set. This is easily seen after a crash and e2fsck on /tmp when X/Enlightenment/Gnome have been running, since they create a lot of named sockets." Alan Cox replied that he'd forwarded the patch along to Stephen C. Tweedie and Theodore Y. Ts'o, but hadn't heard back one way or the other. He said if he didn't hear from them soon he'd just use his own judgement. Jamie Lokier said he thought the patch looked OK; and Theodore replied that he seemed to have lost Alan's email, but that if the backport from 2.3 to 2.2 was straightforward, it should be OK. David Weinehall came in at this point, to ask if he should port the code back to 2.0; but Andreas replied, "Nope, the filetypes feature is not implemented in the 2.0 kernels at all for ext3, so it isn't just a matter of fixing a few lines. However, you may be interested in Ted's backport of the "sparse super" ext2 support for 2.0. This reduces filesystem overhead considerably for large filesystems, as well as speeding up e2fsck a bit." And David ended the thread with, "Ted's sparse superblocks patch has been included in pre-patch-2.0.39-pre2"
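
[Illustration, not Andreas' patch: ext2 directory entries can carry a small file-type code alongside the inode number, and the bug described above amounts to sockets being left without one. The sketch below maps a file mode to such a code in Python. The function name is hypothetical, and the numeric values are the EXT2_FT_* codes as I understand them from the ext2 headers; the actual kernel change is in C and is not reproduced here.]

```python
import stat

# Directory-entry file type codes, in the order used by ext2.
(EXT2_FT_UNKNOWN, EXT2_FT_REG_FILE, EXT2_FT_DIR, EXT2_FT_CHRDEV,
 EXT2_FT_BLKDEV, EXT2_FT_FIFO, EXT2_FT_SOCK, EXT2_FT_SYMLINK) = range(8)

# Each predicate from the stat module is paired with the code it implies.
_TYPE_CHECKS = [
    (stat.S_ISREG, EXT2_FT_REG_FILE),
    (stat.S_ISDIR, EXT2_FT_DIR),
    (stat.S_ISCHR, EXT2_FT_CHRDEV),
    (stat.S_ISBLK, EXT2_FT_BLKDEV),
    (stat.S_ISFIFO, EXT2_FT_FIFO),
    (stat.S_ISSOCK, EXT2_FT_SOCK),  # the case the report says was missing
    (stat.S_ISLNK, EXT2_FT_SYMLINK),
]


def ext2_file_type(mode):
    """Return the directory-entry type code for an inode mode."""
    for is_type, code in _TYPE_CHECKS:
        if is_type(mode):
            return code
    # Nothing matched: this is what a checker would flag as unset.
    return EXT2_FT_UNKNOWN
```

If the socket entry were dropped from the table, a named socket in /tmp would come back as EXT2_FT_UNKNOWN, which is the kind of mismatch e2fsck was said to complain about.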

8. Sound Confusion On Dell Latitudes; No Docs From Neomagic

29 May 2000 - 30 May 2000 (6 posts) Archive Link: "NM256 audio trouble"

People: Per Lundberg, Alan Cox

Per Lundberg reported trouble with trying to get sound working on his Dell Latitude LS with the Dell Neomagic chip. The entire machine would hang right after the NM256 module initialized (later, he elaborated, "it's the correct chip. The driver detects it and everything, but there seems to be some incompatibility. If it would just oops or crash normally, it would be so much easier to debug..."). Alan Cox replied that this driver wouldn't work on Dell Latitudes, and suggested that the OPL3-SA might work with some luck. Kjartan Maraas replied that actually, the NM256 did work on his Dell Latitude CPiA PII 366 with the 256 AV chip. Sean Harding also reported success with an identical machine. Alan admitted that he hadn't meant to say that the driver didn't work on any Dell Latitude, only that it wouldn't work on that particular one. He went on:

Its all really odd. It appears there are three species of neomagic audio - or one that has 3 microcode sets. We have no docs from neomagic so it is tricky to tell.

Some of them are nm256 driver, some are opl3sa* and some think they are sound blasters.

Confused ? join the club.

9. Porting International Patch To Recent Stable Kernels

29 May 2000 - 30 May 2000 (7 posts) Archive Link: "2.2.15int"

People: Bartlomiej Zolnierkiewicz, Igmar Palsenberg, H. Peter Anvin, Andrew Pam, Gleb Natapov, Andrew Morton

Someone asked when the international/crypto patch would be available for 2.2.15; Bartlomiej Zolnierkiewicz gave a pointer to the patch (http://www.sericyb.com.au/patch-int-2.2.15.1.bz2) and added, "It's dated 2000-05-07 and Xanni posted announce the same day." Igmar Palsenberg took a stauncher approach, telling the original poster, "stop wining, and patch it yourself." Andrew Morton felt that this was a bit harsh, and that on the scale of cluelessness, the original poster could have scored a lot worse. At this point the original poster came back and explained that the intention was not to hurry anyone; the patch was just needed for a server at work. Gleb Natapov quoted Andrew Pam's announcement of the latest international patch, and H. Peter Anvin ended the thread with, "Note, if someone is actually maintaining the kerneli patches these days, with the new U.S. export rules we can actually make space for them on kernel.org. Please contact <ftpadmin@kernel.org>." There was no reply.

10. Anti-Open Source Article Discussed

29 May 2000 - 31 May 2000 (10 posts) Archive Link: "Bertrand Meyer challenges some open-source assumptions"

Topics: Sound: OSS

People: Henrik Eriksson, Alan Cox, James Sutherland, Horst von Brand, David Parsons, Ian Soboroff, Eric S. Raymond, Richard M. Stallman, Steve Dodd

Henrik Eriksson gave a pointer to "The Ethics Of Free Software" (http://www.sdmagazine.com/features/2000/03/f4.shtml) by Bertrand Meyer, "a famous guru in objectorientation." Alan Cox asked sardonically, "Is this the guy who designed Eiffel and now has a competing free Eiffel compiler to worry about ?" And James Sutherland replied, quoting the article:

You'd never guess - "For example the GNU Eiffel compiler was developed at the University of Nancy by employees of that university who (in contrast with commercial Eiffel vendors, who need paying customers to survive) get every month a salary from the State, whether the users are happy or not with the product. This is a typical case of taxpayer-funded software. "

Translation: "I can't take the competition, so I'm trying FUD instead."

I'm halfway through the article so far, and he's contradicted himself in every paragraph. I can't wait to see the rest...

Steve Dodd gave a pointer to a discussion of the article (http://www.advogato.org/article/94.html) on Advogato (http://www.advogato.org), but added that the linux-kernel thread was probably getting off-topic. Horst von Brand gave the article a read, and reported (referring to Richard M. Stallman and Eric S. Raymond by their initials):

No hard facts, quotes other people without giving a shred of evidence on why they say what they say against Linux (Ken Thompson might find it worse than Windows, but on what grounds?). RMS's opinions on free software are automatically to be dismissed because of his inflamatory writings, and somewhat peculiar views, and with them the whole GNU idea. Whatever ESR advocates is wrong because he is a firearm nut (I'm against firearms myself, that doesn't make me respect ESR's opinion on software less). So, due to this "highly moral" problems, OSS is bad for you. After telling you in detail that "The observation [that highly moral people can advocate evil causes] works the other way too: bad people can defend good causes. A corrupt and dishonest politician may sincerely support principles of democracy and freedom. His personal failings do not disqualify the ideas of democracy and freedom any more than the Nazi regime's impressive building of autobahnen disqualifies the merits of freeways."

Nice troll, this one. Best done example of ad hominem I've seen in a long time. But please take it elsewhere.

David Parsons gave his reaction as well, saying, "Well, it will certainly be good for pushing up the circulation, but chumming the water with guns and communism usually is. It would have been better if he'd not spent so much time poking holes in rms, but had edited his article to make it a little bit tighter."

Almost all of those were single-post subthreads stemming off of the initial post. Ian Soboroff started an actual (though small) discussion, with his take on the article:

it mostly seems to be a diatribe on (a) RMS's definition of free software and (b) ESR's perspectives on gun rights. he has a couple interesting but certainly not new things to say about RMS's views, some rather bald statements on the topics of ethics in general, some more rather bald statements about economics in general, and lots of vitriol about guns.

bondage-and-discipline ethics to go with a b&d language.

Eric S. Raymond came into the discussion, remarking on Ian's final sentence:

Heh. Putting it that way is funnier than you know -- because the term "bondage-and-discipline language" was originated by one of the targets of Meyer's vitriol.

(It was me. But it could just as easily have been RMS; he loves the term.)

Ian replied:

that i didn't know... hah! mind if i ask the context?

of course, i have two misgivings about the line... (1) there are a couple things i actually like about Eiffel, and (2) i'm jewish... we invented b&d ethics ;-)

Eric cited one of his Usenet posts from December 1988 (quoted in part):

From postnews Thu Dec 22 15:07:46 1988
Newsgroups: comp.lang.misc
Subject: Re: colon-equal vs equal
Message-ID: <eWbWj#4OkIf8=eric@snark.UUCP>
Date: 22 Dec 88 20:02:04 GMT
References: <3300001@uxg.cso.uiuc.edu>
Status: RO

In article <3300001@uxg.cso.uiuc.edu>, phil@uxg.cso.uiuc.edu writes:

> How did the := come into being in languages like Algol, Pascal, and Ada?

It originated with ALGOL-60. The European `bondage-and-discipline' school of language design (the people who brought you Algol-68, Pascal, Modula, Ada, and Modula-2 and are now having yet another try at getting their mistakes right in Modula-3) likes to claim apostolic descent from that language, and they've retained := and some of its other crotchets.

11. aic7xxx Problems In Latest ac Kernels

29 May 2000 - 30 May 2000 (3 posts) Archive Link: "2.4.0test1-ac4/5 problems"

People: Alan Cox

Someone was having trouble getting the aic7xxx driver to work under 2.4.0-test1-ac4 and ac5. Jon Akers confirmed this on ac5, and Alan Cox replied briefly, "The scsi fixes broke the aic7xxx driver. At first glance it seems to be the driver that is the problem."

12. Some Success With X Under user-mode Linux

29 May 2000 - 1 Jun 2000 (5 posts) Archive Link: "user-mode port 0.24-2.3.99-pre9"

Topics: Framebuffer, User-Mode Linux

People: Jeff Dike, Pavel Machek

Jeff Dike announced:

The user-mode port of 2.3.99-pre9 is available.

There is now a real hardware interrupt mechanism, which I got by copying the i386 irq code, and wrapping user-mode stuff around it. The consoles and network device now do their I/O off interrupts rather than the timer, which greatly reduces latency. The interactive feel is much better, especially under X.

As a side-effect of this, 'cat /proc/interrupts' will no longer hang the kernel :-)

I fixed the stair-stepping problem with the console output.

I also fixed the problem that some people had running kernels that they had built themselves. So, if you built a -pre8 kernel from source, and it did nothing but hang, that's fixed.

I've also got some caveats to go with this batch of good news. Now that this port is much more interrupt-driven, it is more prone to races. I've fixed a bunch of them, but I still see an occasional process segfault.

There is also a slight difficulty at times with the network. Sometimes packets will stop flowing. I have no idea why, but typing at a console will wake things up and get those packets flowing again. This is most easy to reproduce under X (start an xterm and a window manager and wave the mouse in and out of the xterm, and after a while, the xterm will stop blinking and the mouse will stop changing shape), but I've also seen it affect ping.

The project's home page is http://user-mode-linux.sourceforge.net
The project's download page is http://sourceforge.net/project/filelist.php?group_id=429

Pavel Machek asked if, since Jeff was talking about reproducing a problem under X, this meant that X was working under user-mode Linux. Jeff replied:

Yeah, that's actually worked for a while. I just haven't gone out of my way to publicize it.

This is how to do it:

Install the X clients in your favorite user-mode root filesystem. Make sure you get Xnest. Boot it up and bring the network up if it isn't already. If necessary, xhost the virtual machine on the host. In the virtual machine run 'DISPLAY=host:0 Xnest &' You'll get an Xnest window, and you can then set your DISPLAY to :0 and run whatever X clients you want. What I normally do is 'DISPLAY=:0 fvwm2 &' That gives me a window manager and enough of an environment to do what I want without needing to go back to the console.
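
Jeff's two shell commands can also be expressed as a small launcher. The sketch below is only an illustration in Python: the function names xnest_launch_plan and launch are hypothetical, the display names "host:0" and ":0" and the fvwm2 window manager are taken from Jeff's description, and it assumes Xnest and the window manager are installed inside the virtual machine.

```python
import os
import subprocess


def xnest_launch_plan(host_display="host:0", nested_display=":0", wm="fvwm2"):
    """Return (env_overrides, argv) pairs mirroring the two commands Jeff gives."""
    return [
        # 'DISPLAY=host:0 Xnest &': the nested server draws on the host's display.
        ({"DISPLAY": host_display}, ["Xnest"]),
        # 'DISPLAY=:0 fvwm2 &': clients then talk to the nested server.
        ({"DISPLAY": nested_display}, [wm]),
    ]


def launch(plan):
    """Start each command in the background with its DISPLAY override applied."""
    procs = []
    for env_overrides, argv in plan:
        env = dict(os.environ, **env_overrides)
        procs.append(subprocess.Popen(argv, env=env))
    return procs
```

Calling launch(xnest_launch_plan()) inside the virtual machine would start both programs. As with the shell version, nothing here waits for Xnest to be ready before the window manager starts, so a real script might need a short delay or a retry.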

This is a rerun url (I ran it last week as a demo of the virtual network), but have a look at http://user-mode-linux.sourceforge.net/net.html. It's a screenshot of a virtual net, but Xnest is also involved.

Pavel remarked that he'd been hoping Jeff would have used the framebuffer, which would prevent malicious apps from connecting to the X server directly and capturing keystrokes. Jeff replied:

I did it that way to exercise the kernel and to demonstrate that it could run its own local X server.

If you were running it as a sandbox, you would put the Xnest outside, running on the host. You'd make sure that it accepted connections from the virtual machine and that your native X server didn't. That way, the only server available to evil proggies running inside the virtual machine is the Xnest, which can be made to go away with a click of the mouse if it starts trying nasty things like creating infinite windows.

13. Trying To Reserve System Call Table Entries

2 Jun 2000 - 5 Jun 2000 (7 posts) Archive Link: "Syscall number allocation"

People: Josh Huber, Victor Khimenko, H. Peter Anvin, Alan Cox, Olaf Titz

Josh Huber asked about the procedure for reserving syscall table entries. He explained, "I'm working on getting a new version of our crash dump system ready, and binary compatibility for the user-space part between kernel versions would be a big plus for users," and went on, "I'm told that Linus is somewhat anti debugging systems in the kernel, so it seems like the only route to make using our system as painless as possible would be to have system call entries marked as unimplemented, which would be filled in when you apply our patch."

Victor Khimenko explained Linus' stance, "Not anti-debugging. His opinion is simple: kernel debuggers tend to affect system functionaly and it's NOT what you want when kernel works non-properly and when kernel works properly why you need kernel debugger in first place ? Some debugging facilities can be added (was added, will be added) if it does not affect work of kernel when there are no bugs... On other hand he has VERY strong opinion AGAINST crash dump systems. EVEN if it does not affect normal system behaviour. Of course he can change his mind sometime in future but before it happens probability of getting ANY support for such system (including syscall numbers) is exactly zero." H. Peter Anvin remarked in reply, "Actually, the "procedure" is to convince Linus to let you reserve a syscall number; he will put it in the kernel as unimplemented. It has been done, I belive, twice: once for afs and once for STREAMS."

Alan Cox and Olaf Titz both wondered why Josh needed this to be a syscall, and Josh replied to Alan, "That's a good question. The answer is, it doesn't really -- I'll take a look at alternatives. Using a /dev/ entry would also alleviate the problem of updating architecture dependant syscall entries."

14. Hotplugging All Hardware

3 Jun 2000 - 5 Jun 2000 (24 posts) Archive Link: "Hot pluggable CPUs ( was Linux 2.5 / 2.6 TODO (preliminary) )"

Topics: Hot-Plugging, PCI

People: James Sutherland, Bruce Guenter, Andrew Sharp

In the course of discussion, Andrew Sharp remarked that hot-plug CPUs would be easy to implement. Hot-plug RAM, on the other hand, he felt would be much more difficult. James Sutherland went with it, musing:

It shouldn't be all that harder. Harder than killing a CPU, certainly, but far from impossible. Mark the range as unavailable, then page out any user pages from that region. Most kernel pages could just be moved elsewhere; the big problem will be devices using buffers for DMA. Presumably the best way to solve that would be either to reinitialise that driver (rmmod, insmod) or to have a way to tell the driver to perform the move itself.

Except for reinitialising the drivers, the only impact should be a performace hit while the memory contents are moved. With the right driver modifications, we could avoid any actual loss of functionality during the transition.

This is, IMHO, quite an attractive idea: a fully hot-swappable system, where any failed component can be replaced without any downtime.
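
James's outline for removing a range of RAM (mark it unavailable, page out user pages, move kernel pages, and deal with DMA buffers through the driver) can be sketched as a toy planner. This is a thought experiment in Python, not kernel code: the Page class, the evacuate function and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    frame: int        # physical frame number
    kind: str         # "user", "kernel", or "dma"
    owner: str = ""   # driver name, only meaningful for DMA buffers


def evacuate(pages, lo, hi, free_frames):
    """Plan the removal of frames lo..hi-1 and return a list of actions.

    Each action is a tuple (action_name, frame, detail).
    """
    # Step 1: mark the range unavailable by dropping it from the free list.
    free = [f for f in free_frames if not lo <= f < hi]
    actions = []
    for page in pages:
        if not lo <= page.frame < hi:
            continue  # outside the region being removed
        if page.kind == "user":
            # Step 2: user pages can simply be paged out.
            actions.append(("page_out", page.frame, None))
        elif page.kind == "kernel":
            # Step 3: most kernel pages move to a frame outside the region.
            # (Raises IndexError if no free frame is left; a real design
            # would have to handle that case.)
            actions.append(("move", page.frame, free.pop(0)))
        else:
            # Step 4: DMA buffers need the driver's help (reinitialise it,
            # or ask it to move the buffer itself).
            actions.append(("reinit_driver", page.frame, page.owner))
    return actions
```

For example, evacuating frames 10 through 19 with one user page, one kernel page and one DMA buffer in that range yields one page-out, one move and one driver reinitialisation, while pages outside the range are left alone.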

Bruce Guenter tried to grab the reins, with, "And how do you plan on swapping out the motherboard that everything connects into?" But James bucked, with:

Every "component" is mounted on a carrier board; this then connects to a pair of backplanes. Each individual component can, obviously, be replaced; you can also remove/disable one backplane at once without downtime.

The next issue is to enable software upgrades without downtime. For applications, this can be done by installing the new version, then signalling the old version to "exec" the new one. (Apache can do something similar with configuration files already.) For a WWW server, for example, this can be done without dropping or refusing a single connection.

The kernel itself would be harder, of course. Kernel modules could do something similar - just unload the old one and reload the new one, taking care to avoid anything trying to use the module in the mean time - leaving just the core code - memory management etc., which would be much more difficult.

The discussion continued in good fun, with various objections and answers. At one point, James remarked:

There are quite a few specialist systems which do similar things; the one which springs to mind is Tandem's Himalaya series, but they aren't the only player by any means.

Right now, it's a very specialised area - but it is certainly becoming less so. Servers like Sun's E10k Starfire systems implement partitioning, for example, and things like hotplug PCI are appearing. Soon, this sort of N+N redundancy could be relatively common in high-end servers: it would be nice to see Linux getting there first...

15. Winmodem Progress

4 Jun 2000 - 5 Jun 2000 (12 posts) Archive Link: "RFC Maestro-3i"

Topics: Modems, Sound: Maestro

People: Jeff Garzik, Martin Dalecki, Zach Brown, Pavel Machek

In the course of discussion, Jeff Garzik summed up the winmodem situation:

I think when people initially make predictions about winmodem-type modems not working, they didn't think much about the momentum of cheap PC hardware and open source.. With so many winmodems out there today, Linux programmers would inevitably get irked enough to reverse engineer the hardware support.

In just a year or two, we have gone from "winmodem support is impossible!" to hardware support for Lucent (and easily added for AC97 modem codecs), and a V.32 protocol stack in the works. Watch http://www.linmodems.org/ periodically.

Martin Dalecki (who'd started the thread) replied, "As far as I can see there isn't much for reverse engine for in the ESS Maestro-3i part. I have the specs (anybody can get them on ftp://esstech.com.tw. However this part is *really really cheap* Blame on HP for including such a piece of CRAP into such an expensive notebook! And I doubt the timeframe needed to reverse engineer this thing will be shorter then the time frame during which it will get obsoleted..."

Pavel Machek also replied to Jeff's statement about the inevitability of reverse engineering winmodems. He said, "This has already happened. I reverse engineered lucent winmodem; I can now do v.21 with fully open-source components [I've actually used it debugging usb]. (Don't get too excited, v.21 is 300bps)."

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.