Kernel Traffic #128 For 30 Jul 2001

Editor: Zack Brown

By Adam Buchbinder  and  Zack Brown

linux-kernel FAQ (http://www.tux.org/lkml/) | subscribe to linux-kernel (http://www.tux.org/lkml/#s3-1) | linux-kernel Archives (http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html) | kernelnotes.org (http://www.kernelnotes.org/) | LxR Kernel Source Browser (http://lxr.linux.no/) | All Kernels (http://www.memalpha.cx/Linux/Kernel/) | Kernel Ports (http://perso.wanadoo.es/xose/linux/linux_ports.html) | Kernel Docs (http://jungla.dit.upm.es/~jmseyas/linux/kernel/hackers-docs.html) | Gary's Encyclopedia: Linux Kernel (http://members.aa.net/~swear/pedia/kernel.html) | #kernelnewbies (http://kernelnewbies.org/)

Table Of Contents

  1. Hash Functions
  2. Approaching 2.5
  3. Status Of Kernel Debuggers
  4. Status Of Journaling Filesystems
  5. Status Of NTFS
  6. Autorun/autodetect for RAID
  7. New Edition Of "Linux Device Drivers"
  8. New Inlining Conventions For GCC 3.0
  9. Maximum Number Of Open Files

Introduction

I'd like to draw your attention to the text and link at the bottom of every page on this site. If you haven't already heard, the Russian programmer Dmitry Sklyarov was arrested this month for violating the DMCA (The Digital Millennium Copyright Act). The Electronic Frontier Foundation (http://www.eff.org/) was instrumental in getting Adobe to back off from their initial complaint, but Sklyarov is still charged with violating the DMCA and may be sent to prison. Many people are trying to convince the US government to let him go, and the free-sklyarov mailing list (http://zork.net/mailman/listinfo/free-sklyarov/) is available if you want to participate in that effort. Many protests have already taken place world-wide, and more are planned. For a lot of information on this issue, see http://www.freesklyarov.org/.

Mailing List Stats For This Week

We looked at 1024 posts in 4349K.

There were 406 different contributors. 168 posted more than once. 151 posted last week too.

The top posters of the week were:

 

1. Hash Functions
17 Jul 2001 - 24 Jul 2001 (16 posts) Archive Link: "Common hash table implementation"
Summary By Zack Brown
Topics: BitKeeper, Version Control
People: Larry McVoy, Daniel Phillips, Brian J. Watson, Richard Guenther

Brian J. Watson wanted to work up a common hash table implementation, along the lines of include/linux/list.h; when he stumbled across include/linux/ghash.h he thought someone had already done it, until he noticed the copyright notice from 1997. He also found that no one actually included that code, so he asked if there was interest in something a bit newer. Richard Guenther suggested checking out some code (http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/glame/glame/src/include/hash.h?rev=1.5&content-type=text/plain) which would generate code for static hash tables. Larry McVoy also said:

We've got a fairly nice hash table interface in BitKeeper that we'd be happy to provide under the GPL. I've always thought it would be cool to have it in the kernel; we use it everywhere.

http://bitmover.com:8888//home/bk/bugfixes/src/src/mdbm

will let you browse it. The general interface is gdbm() like, and there are both file-backed and memory-backed versions. It was designed to be useful in small and large configs; you can get a hash into 128 bytes if I recall correctly.
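
For readers unfamiliar with what a gdbm-style interface looks like, here is a minimal userspace sketch using GNU gdbm itself rather than BitKeeper's mdbm, whose actual function names we have not checked; treat it only as an illustration of the open/store/fetch/close shape Larry describes.

/* Minimal sketch of a gdbm-style key/value interface, shown with GNU
 * gdbm for illustration; build with -lgdbm. BitKeeper's mdbm follows
 * the same general shape, but its exact names may differ. */
#include <gdbm.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    GDBM_FILE db = gdbm_open("demo.db", 0, GDBM_WRCREAT, 0644, NULL);
    datum key, val, out;

    if (!db)
        return 1;

    key.dptr = "answer";
    key.dsize = strlen("answer");
    val.dptr = "42";
    val.dsize = strlen("42");
    gdbm_store(db, key, val, GDBM_REPLACE);   /* insert or overwrite */

    out = gdbm_fetch(db, key);                /* lookup; dptr is malloc()ed */
    if (out.dptr) {
        printf("answer = %.*s\n", out.dsize, out.dptr);
        free(out.dptr);
    }

    gdbm_close(db);
    return 0;
}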

Daniel Phillips licked his lips at the prospect of new hash tables to test, but added, "I think the original poster was thinking more along the lines of a generic insertion, deletion and lookup interface, which we are now doing in an almost-generic way in a few places. One place that is distinctly un-generic is the buffer hash, for no good reason that I can see. This would be a good starting point for a demonstration." Brian was also very excited by Larry's post, but concurred with Daniel that it wasn't quite what he was looking for.
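
As a rough sketch of the kind of generic insertion, deletion and lookup interface being discussed, the following userspace code may help; all of the names (hash_table, hash_add, hash_del, hash_lookup) and the trivial string hash are invented for this illustration and do not come from include/linux/list.h or any other kernel header.

/* Hypothetical sketch of a generic insertion/deletion/lookup hash
 * interface in the spirit of include/linux/list.h. All names here are
 * invented for illustration; this is not code from the kernel tree. */
#include <stddef.h>
#include <string.h>

struct hash_node {
    struct hash_node *next;
    const char *key;
    void *value;
};

struct hash_table {
    unsigned int nbuckets;
    struct hash_node **buckets;    /* array of nbuckets chain heads */
};

static unsigned int hash_str(const char *s, unsigned int nbuckets)
{
    unsigned int h = 0;

    while (*s)
        h = h * 31 + (unsigned char)*s++;   /* simple multiplicative hash */
    return h % nbuckets;
}

static void hash_add(struct hash_table *t, struct hash_node *n,
                     const char *key, void *value)
{
    struct hash_node **head = &t->buckets[hash_str(key, t->nbuckets)];

    n->key = key;
    n->value = value;
    n->next = *head;               /* push onto the front of the chain */
    *head = n;
}

static void *hash_lookup(struct hash_table *t, const char *key)
{
    struct hash_node *n = t->buckets[hash_str(key, t->nbuckets)];

    for (; n; n = n->next)
        if (strcmp(n->key, key) == 0)
            return n->value;
    return NULL;
}

static void hash_del(struct hash_table *t, const char *key)
{
    struct hash_node **p = &t->buckets[hash_str(key, t->nbuckets)];

    for (; *p; p = &(*p)->next) {
        if (strcmp((*p)->key, key) == 0) {
            *p = (*p)->next;       /* unlink; caller owns the node memory */
            return;
        }
    }
}

A kernel version would presumably be macro- or inline-based like list.h, so the node type could be embedded in arbitrary structures rather than carrying its own key and value pointers.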

Daniel emerged from deep within the code, announcing:

I tested almost all of them to see how well they worked in my directory index application. There are really only two criteria:

  1. How random is the hash
  2. How efficient is it

My testing was hardly what you would call rigorous. Basically, what I do is hash a lot of very unrandom strings and see how uniform the resulting hash bucket distribution is. The *only* function from Larry's set that did well on the randomness side is the linear congruential hash - it did nearly as well as my dx_hack_hash.

Surprisingly, at least to me, the CRC32 turned in an extremely variable performance. With a small number of buckets (say 100) it did ok, but with larger numbers it showed a very lumpy distribution. Yes, this is way too imprecise a way of describing what happened and I should take a closer look at it. I don't have the mathematical background to be really sure about this, but I suspect CRC32 isn't optimized at all for randomness - it's optimized for detecting bit errors and has good properties with respect to neighbouring bits, properties that are no use at all to a randomizing function. Anyway, I wasn't all that unhappy to see CRC32 turn in a poor performance for two reasons: a) the 1K xor table would represent a 25% increase of the indexing code and b) hashing through the table eats an extra 1K of precious L1 cache.

The linear congruential hash from Larry's set and my dx_hack_hash share a common characteristic: they both munge each character against a pseudorandom sequence. In Larry's hash it's a linear congruential sequence, and in my case it's a feedback shift register. In addition, I use a multiply to spread the effect of each character over a broader range of bits.

Larry's hash doesn't do this and you can see right away that strings that vary only in the last character aren't going to be distributed very randomly. It might work a little better with the hashing step spelled this way:

- ((h) = 0x63c63cd9*(h) + 0x9c39c33d + (c))
+ ((h) = 0x63c63cd9*(h + (c)) + 0x9c39c33d)

I haven't tried this, but I will.

There are people out there who know a lot more about analyzing hash functions than I do, and I have their names somewhere in my mailbox. I'll go look them up soon and submit for proper testing the whole batch of functions that have been suggested to me over the last few months. By the way, in case you haven't already deduced this, this stuff is really time consuming.
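
To make the comparison concrete, here is a standalone sketch of the two hashing steps Daniel contrasts, applied character by character over a string; the multiplier and addend constants come from his diff, while the function names and test loop are purely illustrative.

/* Standalone sketch of the two hashing steps Daniel compares. The
 * constants are from his message; everything else (function names,
 * the driver loop) is illustrative only. */
#include <stdio.h>

/* Original step: the last character is only added in after the final
 * multiply, so strings that differ only in their last character are
 * not spread very randomly. */
static unsigned int hash_v1(const char *s)
{
    unsigned int h = 0;

    while (*s)
        h = 0x63c63cd9 * h + 0x9c39c33d + (unsigned char)*s++;
    return h;
}

/* Proposed step: the character is folded in before the multiply, so
 * even a final-character change is spread across the whole word. */
static unsigned int hash_v2(const char *s)
{
    unsigned int h = 0;

    while (*s)
        h = 0x63c63cd9 * (h + (unsigned char)*s++) + 0x9c39c33d;
    return h;
}

int main(void)
{
    const char *names[] = { "file1", "file2", "file3", "file4" };

    for (int i = 0; i < 4; i++)
        printf("%-6s v1=%08x v2=%08x\n",
               names[i], hash_v1(names[i]), hash_v2(names[i]));
    return 0;
}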

 

2. Approaching 2.5
20 Jul 2001 (2 posts) Archive Link: "Linux 2.5"
Summary By Zack Brown
People: Thiago Vinhas de Moraes, Andre Dahlqvist, Alan Cox, Linus Torvalds

Thiago Vinhas de Moraes asked:

I would just like to know what's missing before development of kernel 2.5 can start, and before maintenance of 2.4 goes to Alan Cox?

I'm asking this because I see very good stability in the 2.4 tree, and a need to start development of 2.5.

Currently, 2.4 is just getting small fixes, which could easily be managed by Alan.

Does Linus have any schedule to pass the control of 2.4 management to someone else, and start developing the great 2.5 kernel?

Andre Dahlqvist replied:

On the 21st of June Linus said this in a message to linux-kernel:

"2.5.x looks like it will open in a week or two, so we're not talking about long timeframes".

So he probably has plans to start 2.5.x soon (my personal guess is that he'll do it at the same time as 2.4.8 is released, but that's just me :-)

Linus Torvalds had nothing to say.

 

3. Status Of Kernel Debuggers
21 Jul 2001 - 23 Jul 2001 (6 posts) Archive Link: "kgdb and/or kdb for RH7.1"
Summary By Zack Brown
Topics: FS: XFS
People: Keith Owens, Amit S. Kale, Tigran Aivazian

Michael S. Miles asked if patches existed for the kgdb or kdb kernel debuggers, for kernel 2.4.2-pre2, and offered to port the patches to that version if none were available. Keith Owens replied:

ftp://oss.sgi.com/projects/xfs/download/Release-1.0/patches/linux-2.4.2-kdb-04112001.patch.gz is kdb v1.8 against Redhat 7.1. There are no XFS dependencies in that patch, but kdb and xfs hit a couple of common files so you might need to resolve some patch failures.

It is a lot easier to start from that patch instead of trying to convert a kdb patch from a standard kernel onto Redhat's kernel. RH took patches from the -ac tree as well, which really messed up kdb; it took me several hours to work out what RH had done to each file, and I had all the kdb patches. AFAICR, the IKD patch in RH 7.1 does not fit correctly.

He replied to himself, saying, "Correction, that patch is against a standard 2.4.2 kernel. The closest I could find is ftp://oss.sgi.com/projects/xfs/download/testing/Release-1.0.1-PR3/patches/patch-RH2.4.3-xfs-1.0.1-kdb That is against Rawhide rather than RH 7.1 but it should be fairly close. So many patches, so little time :(."

Elsewhere, Tigran Aivazian gave a pointer to http://kgdb.sourceforge.net/, saying it was maintained by Amit S. Kale. Amit replied:

I am not maintaining a kgdb patch for RH7.1 as yet. This is an extract from the newly uploaded FAQ page on the kgdb website.

Why is only one kernel version supported? I enhance kgdb and add documentation to the kgdb webpage frequently. This process is easy with a single kernel version, as I can work on enhancing and supporting newer kernel versions at the same time. I myself need kgdb for kernel debugging on newer kernels for the translation filesystem. Supporting older kernels involves backporting enhancements and testing them. Usually a kgdb patch works for multiple kernel versions with a bit of hand-application of failed hunks. I plan to support a fixed 2.4 kernel version and a top-of-the-line 2.5 kernel, once the 2.5 kernel branch starts.

 

4. Status Of Journaling Filesystems
22 Jul 2001 - 23 Jul 2001 (10 posts) Archive Link: "OT: Journaling FS Comparison"
Summary By Zack Brown
Topics: FS: JFS, FS: NFS, FS: ReiserFS, FS: XFS, FS: ext2, FS: ext3
People: Ian Chilton, Hans Reiser, Tigran Aivazian, Martin Knoblauch, Steven Cole

Ian Chilton asked about the relative merits and status of the various journaling filesystems: ext3, reiserfs, XFS and JFS. He said:

ext3 stands out because of its compatibility with ext2 - this makes it easy to 'upgrade' from ext2 to ext3 without losing/moving data. Also, it would be much easier to move a drive into another machine without worrying about the kernel having reiserfs etc. compiled in.

However, I have heard ext3 is slower (obviously, because it has extra writes) and sometimes has instabilities.

I also heard that ReiserFS is the fastest of the bunch, but all data is lost on conversion, and obviously rescuing and moving disks is harder. But it is in the main kernel tree.

Steven Cole gave a pointer to a page that now appears dead (http://aurora.zemris.fer.hr/filesystems/) , and Constantin Loizides gave a link to his reiserfs page (http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/) .

There were some other comments sprinkled throughout the thread. Hans Reiser said, "The last ReiserFS patch for NFS in Linux 2.4 seems to have resulted in no more complaints regarding nfs and reiserfs used in combination since it went in. It went in quite recently though." Tigran Aivazian said elsewhere, "at the time when I did the comparison using SPEC SFS to benchmark, the choice was not hard at all -- the absolute and obvious winner was reiserfs. That is, amongst the freely available ones. (this was not too long ago, a mere 2 months or so)." Hans remarked, "SPEC SFS is a proprietary and expensive benchmark which precludes us from optimizing for it, which is a pity; I suspect we'd learn something from analyzing its results." Elsewhere, Martin Knoblauch asked, "what is the status of integration of the various ReiserFS patches in the mainstream or AC kernels? e.g. the "unmount" patch does not seem to be incorporated in 2.4.[5-7]." Hans replied, "Its functional substitute is in 2.4.6".

 

5. Status Of NTFS
23 Jul 2001 - 25 Jul 2001 (2 posts) Archive Link: "Status of NTFS support was Re: [PATCH] 2.4.7 More tiny NTFS fixes"
Summary By Zack Brown
Topics: FS: NTFS
People: Anton Altaparmakov

Gabriel Rocha asked about the status of NTFS support under Linux, as it seemed to have been "poorly supported" for a long time. Anton Altaparmakov replied:

I will comment as the current maintainer. (-:

If by "poorly supported" you mean that it was a more or less abandoned project, then that has changed a lot indeed. NTFS is now under active development, on both the kernel and user-space sides. I am happy to receive patches and forward them for inclusion if appropriate, or to integrate them in my local development tree and submit them as a larger patch later (depending on the triviality of the patches). And I try to respond asap to requests/bug reports/etc. Currently my personal response times are between 5 minutes and a week or so, depending on how busy I am.

If by "poorly supported" you mean it doesn't work very well, then that has improved as well. We have a fully functional mkntfs program already on the userspace side, and ntfsfix, which repairs some of the damage done by the ntfs driver, making it somewhat safer to use. The driver itself has much improved in recent months; writing is now relatively ok, as long as it happens on a UP system, to simple files and directories. There is still a lot not implemented, so only the simple case works for now. Reading is relatively stable and most things are implemented with respect to reading the normal data attribute of both uncompressed and compressed files. To mention one of the improvements, we can now cope with large files to the full potential of NTFS (i.e. we cope with 2^63-byte-sized files).

So to summarize: we are working on it but don't hold your breath. NTFS is highly complex and extremely poorly documented. Most of our knowledge is based on reverse engineering and looking at on-disk structures with hex/disk editors and it will take considerable time to have a fully working fully featured NTFS implementation...

End Of Thread (tm).

 

6. Autorun/autodetect for RAID
24 Jul 2001 (3 posts) Archive Link: "[PATCH] add "autorun" interface to md"
Summary By Adam Buchbinder
Topics: Disk Arrays: MD, Disk Arrays: RAID, USB
People: Kees Cook, Neil Brown

Kees Cook was setting up a removable RAID, and noticed "After boot-up (or as a module) the "md" driver has no interface to run the "autostart_arrays" function. In the case of removable disks (eg USB, or in my case, FireWire), since the disks may not appear in the same place, or in the same order, the standard raidtools' "raidstart" will not work (calling the md.c "raidstart" interface) because the device names don't match up." He posted a preliminary patch to enable autodetection of his array, and asked for comments.

Neil Brown replied that he was also working on the md driver, and suggested joining the linux-raid@vger.kernel.org (mailto:majordomo@vger.kernel.org?body=subscribe linux-raid) mailing list. He went on, "autorun/autodetect just doesn't belong in the kernel. It should be done in user space. The only time the kernel should assemble a raid array itself is for the root device." He went on to say, "It is true that there is not currently any userlevel tool which does the equivalent of autodetect, but there will be soon." He posted a link to the current pre-release of his code at http://www.cse.unsw.edu.au/~neilb/source/mdctl/, and the thread ended.

 

7. New Edition Of "Linux Device Drivers"
25 Jul 2001 (1 post) Archive Link: "Linux Device Drivers book available online"
Summary By Adam Buchbinder
People: Jonathan Corbet, Alessandro Rubini

Jonathan Corbet wrote:

Finally, _Linux_Device_Drivers, second edition, by Alessandro Rubini and myself, is available online. Find it at:

http://www.xml.com/ldd/chapter/book/index.html

It's there in HTML, PDF, and XML (DocBook) forms. The license is the GNU FDL, which allows redistribution and all that cool stuff.

There was no reply.

 

8. New Inlining Conventions For GCC 3.0
26 Jul 2001 - 27 Jul 2001 (12 posts) Archive Link: "[PATCH] gcc-3.0.1 and 2.4.7-ac1"
Summary By Adam Buchbinder
People: Alan Cox, Linus Torvalds, Petr Vandrovec

Petr Vandrovec posted a patch to convert several "extern inline" functions to "static inline" to work with gcc version 3.0.1 20010721 (Debian prerelease), and sparked a flurry of discussion. Alan Cox suggested, "Fix gcc. We use extern inline to say 'must be inlined' and that was the semantic it used to have. Some of our inlines will not work if the compiler uninlines them." Linus Torvalds gave his opinion:

We had this fight with the gcc people a few years back, and they have a very valid argument for the current semantics.

  • "static inline" means "we have to have this function, if you use it but don't inline it, then make a static version of it in this compilation unit"
  • "extern inline" means "I actually _have_ an extern for this function, but if you want to inline it, here's the inline-version"

The only problem with "static inline" was some _really_ old gcc versions that did the wrong thing and made a static version of the function in _every_ compilation unit, whether it was needed or not. Those versions of gcc do not work on the kernel anyway these days, so..

I think the current gcc semantics are (a) more powerful than the old one and (b) have been in effect long enough that it's not painful for Linux to just switch over to them. In short, we might actually want to start taking advantage of them, and even if we don't we should just convert all current users of "extern inline" to "static inline".
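
A small example may help illustrate the semantics Linus describes, under gcc's traditional (gnu89) inline rules; the function names below are made up for the purpose.

/* Illustration of the gcc ("gnu89") inline semantics Linus describes;
 * the function names are invented for this example. */

/* "static inline": if the compiler decides not to inline a call, it
 * emits a private out-of-line copy in this compilation unit, so the
 * code always links. */
static inline int add_static(int a, int b)
{
    return a + b;
}

/* "extern inline": the body is only an inline hint; any call the
 * compiler does not inline becomes a reference to an external symbol
 * add_extern, which some other object file must define. Built without
 * optimization, this example will therefore fail to link unless a
 * non-inline add_extern exists elsewhere - exactly the breakage seen
 * when gcc declines to inline a kernel "extern inline" function. */
extern inline int add_extern(int a, int b)
{
    return a + b;
}

int use_both(int x)
{
    return add_static(x, 1) + add_extern(x, 2);
}

Converting such functions to "static inline", as Petr's patch does, guarantees that any out-of-line fallback is emitted in the same object file, so nothing is left unresolved at link time.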

 

9. Maximum Number Of Open Files
26 Jul 2001 (2 posts) Archive Link: "Increase number of open files"
Summary By Adam Buchbinder
People: Edouard Soriano, Nick DeClario

Edouard Soriano reported having to "close some Windows on my system to perform some other tasks" , and thought that his system might be using the maximum number of concurrent files. He asked if there was a /proc setting to modify this.

Nick DeClario replied, saying that "/proc/sys/fs/file-max contains the max files. The default is 4096. Try changing it to 8192, that should do the trick." There was no reply.
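
For those who prefer to check the limit from a program rather than with cat(1), here is a minimal sketch; it only reads /proc/sys/fs/file-max, and raising the limit means writing a larger number back to the same file as root, which is what Nick's suggestion amounts to.

/* Minimal sketch: read the system-wide open-file limit from procfs.
 * Raising it means writing a new value to the same file as root
 * (equivalent to `echo 8192 > /proc/sys/fs/file-max`). */
#include <stdio.h>

int main(void)
{
    char buf[64];
    FILE *f = fopen("/proc/sys/fs/file-max", "r");

    if (!f) {
        perror("/proc/sys/fs/file-max");
        return 1;
    }
    if (fgets(buf, sizeof(buf), f))
        printf("current file-max: %s", buf);
    fclose(f);
    return 0;
}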

 

We Hope You Enjoy Kernel Traffic
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License, version 2.0.