Kernel Traffic #140 For 5 Nov 2001

By Zack Brown

Introduction

Well, the indexing features are back, though no promises as to how long they'll stay up. If I run into more database problems, I may have to take them down again.

Basically, you can now see a list of all the people who've written summaries for KT or any of the Cousins, at http://kt.zork.net/authors.html. Unfortunately, translator attributions are still not supported. Maybe one day...

But the author lists are just a perk. The really cool feature is that you can now see indices of people on the mailing lists who are quoted or referred to in KT or the Cousins. There are actually two master index pages: one at http://kt.zork.net/people.html, which lists everyone alphabetically, and another at http://kt.zork.net/peoplebycontrib.html, which lists everyone in order of how many times they appear in KT or the Cousins.

Each of these main index pages in turn refers to a HUGE number of smaller indices for each individual person. These smaller indices are also linked from those people's names as they appear in each issue. The links in KT and the Cousins show up as a "[*]" following the person's name.

Each person's index page contains a variety of information. Take Wichert Akkerman (../people/Wichert_Akkerman.html) as an example. All of his appearances, organized by Cousin (or KT), are listed, complete with the date and title of the discussion. Underneath all that are two columns (also organized by Cousin) containing the names of all the people who appeared in discussions with Wichert, and the number of times they appeared with him. The left column lists each person alphabetically, while the right lists each person by the number of discussions they shared. Each person also has a link leading to their own personal index page, so you can actually surf around through these indices.

Enjoy! And if you find any errors, I'd appreciate hearing about them.

Mailing List Stats For This Week

We looked at 1731 posts in 7129K.

There were 587 different contributors. 266 posted more than once. 159 posted last week too.

 

1. Al Viro Planning To Fork devfs
16 Oct 2001 - 28 Oct 2001 (54 posts) Archive Link: "Poor floppy performance in kernel 2.4.10"
Topics: FS: devfs, Version Control, Virtual Memory
People: Ryan Cumming, Alexander Viro, Rik van Riel, Richard Gooch, Roman Zippel

In the course of discussion, Alexander Viro started pumping notices of devfs bugs into the mailing list. After Alexander had replied to himself half a dozen times with more and more notices, Ryan Cumming suggested, "It might be more productive to provide patches or at least generate constructive ideas as how to fix these problems, as you are obviously quite capable of doing so. Digging through the code and sending a new email to this list for every few flaws you find is doing little good, and your personal attacks on the maintainer are even less benefical. Cooperation will get you a lot farther than conflict." Alexander replied:

Been there, tried that, had been told by Richard that he would rather fix devfs bugs himself. Quite a few months ago. If you have better suggestions they would be more than welcome.

As far as I can see, if maintainer doesn't fix the bugs himself and tells that patches are not welcome there are only two things that can be done - going into full-disclosure mode in hope that it will change the situation or taking over the code in question.

By that point I'm sorely tempted to do the latter (i.e. full-blown fork, maintained with no regard to Richard's preferences + sumbitting [very massive] fixes directly to Linus), but I want to give a try to less drastic approach first.

Rik van Riel confirmed these statements independently, saying to Ryan:

  1. yes, Al Viro is very capable of sending in devfs fixes and he has done so in the past (IIRC around 2 months ago)
  2. Richard Gooch then told Al he'd just started working on a patch to fix the problem and he'd rather fix things himself ... as far as I can see this hasn't happened yet

Richard Gooch replied to this, saying that Alexander had submitted "A truely horrible, busy-wait patch that was quickly superceeded by a much cleaner patch that I wrote shortly thereafter. And was applied by Linus in due course." As for Rik's and Alexander's take on the history of the discussion, he added:

Complete fucking bullshit. Over the last several months, I've been sending a steady stream of bugfix patches to Linus and the list, and if you'd been paying attention, you'd notice that in time they've been applied.

Furthermore, I've nearly finished the big rewrite of devfs which adds proper locking and refcounting. That work was progressing nicely (but it's a big job), although it's temporarily stalled because of some important travel. Work on that will resume in the next couple of weeks. There's no point sending in an incomplete version of the code.

It's beyond me why you state that there has been no progress by me when my announcements of new devfs patches have been posted to the list and even Linus' ChangeLog messages have shown stuff going in. If you don't actually know what's going on, why do you bother posting on this subject in the first place? How would you like it if I started flaming about how long the VM code was taking to get working? Our VM has sucked for *years*.

Alexander replied:

OK, _now_ I'm really pissed off. As far as I can see there is only one way to get you fix anything - posting to l-k. This "steady stream" consists of what? Let's see:

0.118: buffer underrun in try_modload(). Source: some Al Viro had hit the function in question in grep over tree and took a couple of minutes to read it.

0.118: moving down_read() - yes, it had fixed the instance of deadlock pointed to you by, damn, what a coincidence, same bastard. Come to think of that, let me grep for down_read()... Aha.

static int devfs_follow_link (struct dentry *dentry, struct nameidata *nd)
{
    int err;
    struct devfs_entry *de;

    de = get_devfs_entry_from_vfs_inode (dentry->d_inode, TRUE);
    if (!de) return -ENODEV;
    down_read (&symlink_rwsem);
    err = de->registered ? vfs_follow_link (nd, de->u.symlink.linkname) : -ENODEV;
    up_read (&symlink_rwsem);
    return err;
}   /*  End Function devfs_follow_link  */

Umm... Hadn't we just been there? Recursive down_read(&symlink_rwsem)...

0.117: oh, wow - finally. devfs_link() is gone.

0.116: reverted previous broken patch, but result contained a deadlock instead of race. Result of race scenario described on l-k by... damn, this asshole again.

0.115: bogus fix for breakage introduced by blkdev-in-pagecache patch. Hadn't got into Linus' tree, actually.

0.114: introduced broken refcounting for symlinks (see 0.116)

0.113: "quick and dirty hack" to protect symlink bodies. Broken, at that. BTW, breakage in 0.113 and 0.114 hadn't stopped Mandrake from deciding that it fixed readlink() race and shipping the thing. Funny, but race it was supposed to fix had been described in private email several months before. Then it was described on l-k. Then description had been forwarded to Mandrake - after a question about potential breakage. _Then_ (and I assume that it was a coincidence) said patches had appeared.

0.111, 0.112: not a fix by any stretch of imagination.

Oh, and before that we have a (finally, only after a year of mentioning the crap in question, heavy-weight rant on l-k when I've finally ran out of patience _and_ detailed discussion on the possible fixes) fix for expand-entry-table races.

So far all I see is that beating you hard enough in public can make you fix the bugs explicitly pointed to you. That's it. As far as I can see you don't read your own code, judging by the fact that every damn look at fs/devfs/base.c shows a new hole within a couple of minutes _and_ said holes stay until posted on l-k. Private mail doesn't work. You read it, reply and ignore. About hundred kilobytes of evidence available at request.

Richard replied:

You don't get to see the bug reports or questions I respond to which are sent to me privately or on the devfs list (I know you're not subscribed:-). And you seem to have forgotten that I've responded to questions or bug reports *from you* that you send privately to me, sometimes Cc:ed to Linus. I've even responded to questions that you've placed in the code. So it's simply not true that I only respond if beat upon in public. Progress *is* being made, irrespective of your "input".

As for the recent bug reports, yes, I've just seen them. I'll respond (not because you've been flaming about it on the list) later this week once I clear through my email backlog which accumulated while I was off the 'net for a week. Yeah, it does take time to wade through all the email, especially when I get greeted with a huge pile of flames.

Regarding the work Richard has been doing lately, Roman Zippel asked, "What about putting them somewhere in a CVS repository, so people can see what's going on and maybe even can help out? BTW you should really do something about your coding style, your code is very confusing to read. I wouldn't care if it would be just some driver, but devfs is supposed to be a very important part, so it would be nice to use the same rules that apply to other important parts of the kernel." Alexander replied, "Looks like I'll get around to creating a CVS repository starting at the last known code in a couple of days anyway..."

 

2. Searching For A Monotonic Clock
22 Oct 2001 - 26 Oct 2001 (7 posts) Archive Link: "How should we do a 64-bit jiffies?"
Topics: POSIX, SMP
People: Keith Owens, Linus Torvalds, George Anzinger, Brian Gerst

George Anzinger wanted to create a POSIX timer that would not roll back to 0 at any point. He figured that the best way to implement this would be in terms of the system's uptime, which meant using the jiffies value. But since jiffies would eventually roll back to 0, he figured he'd have to work around that in some way. Making jiffies a 64-bit value, as opposed to just 32, seemed the way to go. He made several proposals, each of which had drawbacks that he pointed out. Keith Owens offered:

If you want to leave existing kernel code alone so it still uses 32 bit jiffies, just maintain a separate high order 32 bit field which is only used by the code that really needs it. On 32 bit machines, the jiffie code does

old_jiffies = jiffies++;
if (jiffies < old_jiffies)
    ++high_jiffies;

You will need a spin lock around that on 32 bit systems, but that is true for anything that tries to do 64 bit counter updates on a 32 bit system. None of your suggestions will work on ix86, it does not support atomic updates on 64 bit fields in hardware.
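Keith's carry scheme can be sketched in user space as follows. This is only an illustration under stated assumptions: `low_jiffies`, `tick()` and `read_jiffies64()` are hypothetical names, and a C11 `atomic_flag` busy-wait lock stands in for the kernel spinlock he mentions. It is not the kernel's timer code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins: low_jiffies plays the role of the 32-bit jiffies. */
static uint32_t low_jiffies;
static uint32_t high_jiffies;   /* separate high-order word, bumped on wrap */

/* Assumption: a C11 atomic_flag busy-wait lock replaces the kernel spinlock. */
static atomic_flag jiffies_lock = ATOMIC_FLAG_INIT;

static void lock_jiffies(void)
{
    while (atomic_flag_test_and_set_explicit(&jiffies_lock, memory_order_acquire))
        ;   /* spin until the flag is released */
}

static void unlock_jiffies(void)
{
    atomic_flag_clear_explicit(&jiffies_lock, memory_order_release);
}

/* One timer tick: bump the low word and carry into the high word on wrap. */
static void tick(void)
{
    lock_jiffies();
    uint32_t old_jiffies = low_jiffies++;
    if (low_jiffies < old_jiffies)   /* unsigned wrap from 0xffffffff to 0 */
        ++high_jiffies;
    unlock_jiffies();
}

/* Read both halves under the same lock so they belong to the same tick. */
static uint64_t read_jiffies64(void)
{
    lock_jiffies();
    uint64_t value = ((uint64_t)high_jiffies << 32) | low_jiffies;
    unlock_jiffies();
    return value;
}

/* Small self-check: one tick across the 32-bit boundary carries into bit 32. */
static void wrap_demo(void)
{
    low_jiffies = 0xffffffffu;
    high_jiffies = 0;
    tick();
    assert(read_jiffies64() == 0x100000000ULL);
}
```

Code that only needs the 32-bit value can keep reading `low_jiffies` directly, which is the point of the suggestion: existing users are left alone, and only the few callers that need the full range pay for the lock.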

Brian Gerst pointed out that cmpxchg8b did support atomic updates on 64-bit fields in hardware, but Keith replied, "Not on 386, only on 486 and above. Besides, you want to avoid arch specific asm code."

George replied to Keith's suggestion, saying, "As it turns out I already have a spinlock on the update jiffies code. The reason one would want to use a 64-bit integer is that the compiler does a MUCH better job of the ++, i.e. it just does an add carry. No if, no jmp. I suppose I need to lock the read also, but it is not done often and will hardly ever block."

He added that something like "#define jiffies (unsigned long volitial)jiffies_u64" was looking like the best solution at the moment, simply casting jiffies to the proper size. But Linus Torvalds objected:

except for gcc being bad at even 64->32-bit casts like the above. It will usually still load the full 64-bit value, and then only use the low bits.

The efficient and sane way to do it is:

/*
 * The 64-bit value is not volatile - you MUST NOT read it
 * without holding the spinlock
 */
u64 jiffies_64;

/*
 * Most people don't necessarily care about the full 64-bit
 * value, so we can just get the "unstable" low bits without
 * holding the lock. For historical reasons we also mark
 * it volatile so that busy-waiting doesn't get optimized
 * away in old drivers.
 */
#if defined(__LITTLE_ENDIAN) || (BITS_PER_LONG > 32)
#define jiffies (((volatile unsigned long *)&jiffies_64)[0])
#else
#define jiffies (((volatile unsigned long *)&jiffies_64)[1])
#endif

which looks ugly, but the ugliness is confined to that one place, and none of the users will ever have to care.
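The idea of reading just the low-order word of a 64-bit counter in place can be tried out in user space with a sketch like the one below. The assumptions are loud ones: it uses a union instead of the pointer cast in Linus' macro (to stay within ISO C aliasing rules), it fixes the low word at 32 bits rather than `unsigned long`, it detects byte order at run time rather than with `__LITTLE_ENDIAN`, and the names `counter`, `low_half_index()` and `jiffies_low()` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The same eight bytes seen either as one 64-bit counter or as two words. */
union jiffies_view {
    uint64_t full;              /* full counter; writers would hold a lock */
    volatile uint32_t half[2];  /* volatile so busy-wait loops re-read it */
};

/* Hypothetical stand-in for jiffies_64. */
static union jiffies_view counter;

/* Index of the low-order 32-bit word: 0 on little-endian, 1 on big-endian. */
static int low_half_index(void)
{
    union { uint64_t full; uint32_t half[2]; } probe = { .full = 1 };
    return probe.half[0] == 1 ? 0 : 1;
}

/* Read only the low 32 bits, without touching the high word. */
static uint32_t jiffies_low(void)
{
    return counter.half[low_half_index()];
}

/* Self-check: the low view follows the full counter across a 32-bit wrap. */
static void jiffies_demo(void)
{
    counter.full = 0x0000000100000002ULL;
    assert(jiffies_low() == 2u);
    counter.full += 0xffffffffULL;      /* becomes 0x0000000200000001 */
    assert(jiffies_low() == 1u);
}
```

As in the macro above, all of the byte-order handling lives in one place, so callers of `jiffies_low()` never need to know which half of the counter they are reading.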

George felt this could be improved by avoiding the lock altogether. He posted a version that handled UP systems, but wasn't sure whether there was a corresponding version that would work on SMP systems. There was no reply.

 

3. More Discussion Of Compile-Time VM Selection
25 Oct 2001 (10 posts) Archive Link: "concurrent VM subsystems"
Topics: Virtual Memory
People: Lars Marowsky-Bree, Rik van Riel

Marton Kadar asked if it would be possible to make the VM subsystem a compile-time option. Reid Hekman felt the issue had already been beaten to death, but Lars Marowsky-Bree added in reply, "this might be 2.5 material, but I think the subsystem should be modularized; I think it has been proven that this part of the code is definitely subject for discussion, and I would go as far as saying it just might be possible that the optimal VM, catering to different approaches, plain out doesn't exist, and that being able to switch VM personalities during runtime would be useful." Rik van Riel commented:

Interestingly, of all the people saying that we should have different VM systems for different situations, NOBODY has managed to point out what specific things should be different.

The current situation of having 2 competing VMs seems to work out nicely, though. Especially when ideas get merged all the time.

 

4. Which Compiler To Use
25 Oct 2001 - 26 Oct 2001 (15 posts) Archive Link: "kernel compiler"
People: David Weinehall, Alan Cox

Madhav Diwan asked which compiler would be best for compiling the kernel; he'd been using Red Hat's gcc-2.96-85, but had been warned that it would break things. He hadn't noticed that behavior himself, but was curious whether there was a better alternative. David Weinehall replied, "some people are still living with the misconception that all gcc-2.96 releases are buggy. They are not; only early versions are. gcc-2.95.[34] and gcc-2.96-(newer versions) are viable choices if you want a working kernel. Some other versions might work, but then again, they might not :-)" Alan Cox added that he currently used gcc-2.96-85 (precisely what Madhav had been using). He also mentioned, "Gcc 3.0 doesn't always build correct kernels. Its very much a .0 release - new infrastructure, the core to do far better thinga than gcc 2.* but not yet the actual results as the bugs all get kicked out."

 

5. Alan Leans Toward Andrea's VM
26 Oct 2001 (8 posts) Archive Link: "Linux 2.4.13-ac1"
Topics: Kernel Release Announcement, Virtual Memory
People: Christopher S. Swingley, Alan Cox, Rik van Riel, Linus Torvalds

Alan Cox announced 2.4.13-ac1, which he said included a merge from the Linus Torvalds tree. Christopher S. Swingley asked, "Does this mean the ac tree now uses the AA VM, or is this a merge with everything but the VM, like the earlier 2.4.1x-ac trees?" Alan replied that it still used Rik van Riel's VM, but that "the way things are panning out I suspect 2.4.14ac* may well be a point I switch to the Andrea/Marcelo/Linus VM."

 

6. More Discussion Of License Tainting
26 Oct 2001 - 29 Oct 2001 (8 posts) Archive Link: "Non-standard MODULE_LICENSEs in 2.4.13-ac2"
Topics: BSD
People: Keith Owens, Andreas Dilger, Alan Cox

Keith Owens posted a list of licenses that would taint the kernel, explaining, ""BSD without advertising clause" is not quite good enough for the kernel, that licence allows for binary only modules. Kernel debuggers insist on general source availability. Since the source is already in the kernel which is distributed as a GPL work, these sources are effectively dual BSD/GPL. Could the owners please convert them to "Dual BSD/GPL"?" Andreas Dilger replied, "Being included in the kernel source isn't "general source availability"? I can see that you want to make this whole tainted-kernel mess work, but I think you are confusing intent with implementation. The intent (AFAICS) is to mark the kernel tainted ONLY if a closed-source module is loaded, rather than to be a "license police" mechanism, especially for sources that have been included in the kernel for a long time." Alan Cox replied:

"BSD" can indicate totally closed source material as well as other stuff

Also Keith is right - if it is GPL compatible BSD code linked with the kernel then its correct to describe it as Dual BSD/GPL anyway.
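The kind of check being argued over can be pictured with a small user-space sketch. Everything here is an assumption made for illustration: `license_taints()` is a hypothetical helper, and the list holds only the two strings this discussion implies are acceptable ("GPL" and "Dual BSD/GPL"). The real kernel's list of accepted MODULE_LICENSE strings may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list only; the kernel's actual set of strings is not shown here. */
static const char *const gpl_compatible[] = {
    "GPL",
    "Dual BSD/GPL",
};

/*
 * Hypothetical helper: returns true if loading a module that declares this
 * license string should mark the kernel as tainted. A missing license is
 * treated as tainting.
 */
static bool license_taints(const char *license)
{
    if (license == NULL)
        return true;
    for (size_t i = 0; i < sizeof gpl_compatible / sizeof gpl_compatible[0]; i++) {
        if (strcmp(license, gpl_compatible[i]) == 0)
            return false;   /* exact match against an accepted string */
    }
    return true;
}
```

With an exact string match like this, `license_taints("BSD without advertising clause")` is true while `license_taints("Dual BSD/GPL")` is false, which is why Keith asks the owners to change the string rather than the licence itself.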

 

7. Binary-Only Sigma 8400/8401 Chip Support
31 Oct 2001 (5 posts) Archive Link: "EM8400/8401 support?"
People: Anton Altaparmakov, Roy Sigurd Karlsbakk, Torrey Hoffman

Roy Sigurd Karlsbakk asked whether there was support for Sigma 8400/8401 chips. Anton Altaparmakov replied:

http://www.sigmadesigns.com/support/download_netstream2000_linux.htm

contains official binary only drivers for the Netstream2000 card which uses the em8400 chip.

If you are using the chip to implement your own board then you should contact sigma designs for Linux drivers. They say the chip has linux support on:

http://www.sigmadesigns.com/products/em8400.htm

As Sigma designs do not release specs nor sourcecode there is no open source driver available and I am not aware of any non-official efforts to produce drivers.

If you want the em8300 chip then have a look at http://dxr3.sf.net/ where you can find the inofficial Linux GPL drivers for the Sigma designs Realmagic Hollywood+ and Creative dxr3 cards (which are the same).

Roy Sigurd Karlsbakk replied, "strange... I found a package called NetStream2000-0.2.047.1.tar.gz with these drivers with source on Sigma's site. I also found tech spec on the EM840[01] open on their sites, although the document was marked 'confidential'." But Torrey Hoffman explained:

That GPL'ed source code (from the "kernelmode" directory of the tarball) contains only the source for the interface between the driver and the kernel. Compiling that gives you a small module, but AFIK, there is no way (well, no documentation) to use that module to actually do anything useful or interesting.

To actually do anything (like decode MPEG-2 video) with the hardware, you use the large (400K) closed-source libEM8400.so library. That library talks to the hardware using the module. I suppose you could try to reverse-engineer that by observing all the communication between the lib and the driver, but that's probably not allowed.

So, in short: The only documentation is on how to use libEM8400, and that's closed source. But hey, it works, so things could be worse.

(I suppose one could have an discussion on the legality of this GPL'ed kernel module / closed driver, but I'm sure most readers of the list are sick and tired of amateur legal discussion, I guess Sigma's lawyers decided it was legal, and they know better than me.)

 

We Hope You Enjoy Kernel Traffic
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.