Kernel Traffic #268 For 19 Jul 2004

By Zack Brown

Mailing List Stats For This Week

We looked at 1426 posts in 8571K.

There were 416 different contributors. 226 posted more than once. 183 posted last week too.

1. Saving Version Number And Date In .config Files

17 Jun 2004 - 27 Jun 2004 (16 posts) Archive Link: "[PATCH] save kernel version in .config file"

People: Willy Tarreau, Sam Ravnborg, Randy Dunlap

Randy Dunlap proposed that kernel version information be included automatically in the .config file. This turned out to be one of those very obvious ideas that no one thinks of for years until someone finally does. A bunch of folks said this was a great idea that would make configuration files a lot easier to handle. Willy Tarreau suggested adding some date information as well, and Randy updated his patch to include it. Randy was curious why the file timestamp wouldn't do just as well, and Sam Ravnborg said having the date in the file would be easier to grep for. Willy added, "there may be lots of reasons. The first one which comes to my mind is when I archive several config files in a same directory, I rarely think about adding '-a' to cp to preserve the dates. And when you're experimenting with a kernel and you're at the 20th at the end of the day, the date in the config file is often more reliable than yourself to keep track of what you have tried." Willy also suggested porting the patch to the 2.4 tree. Randy replied that this would be trivial for 'make menuconfig', and posted a patch. But 'make xconfig', he said, would be a different story, as it would require tcl/tk instead. Willy was happy enough with the 'make menuconfig' hack, and added that he hadn't used 'make xconfig' for 3 or 4 years.
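To illustrate why a stamp inside the file is easier to work with than a file timestamp, here is a minimal Python sketch that prepends version and date comment lines to a config file's text and reads them back. The header layout ("# Linux kernel version:" and "# Generated:") and the function names stamp_config and read_stamp are assumptions made for this example; the thread does not quote the exact format Randy's patch writes.

```python
from datetime import datetime

# Hypothetical header layout; the thread does not spell out the real format.
VERSION_PREFIX = "# Linux kernel version: "
DATE_PREFIX = "# Generated: "


def stamp_config(body: str, version: str, when: datetime) -> str:
    """Return the config text with version and date comment lines prepended."""
    header = [
        VERSION_PREFIX + version,
        DATE_PREFIX + when.strftime("%Y-%m-%d %H:%M:%S"),
    ]
    return "\n".join(header) + "\n" + body


def read_stamp(text: str) -> dict:
    """Recover the version and date from the leading comment lines, if present."""
    found = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # the stamp lives only in the leading comment block
        if line.startswith(VERSION_PREFIX):
            found["version"] = line[len(VERSION_PREFIX):]
        elif line.startswith(DATE_PREFIX):
            found["date"] = line[len(DATE_PREFIX):]
    return found
```

Because the stamp is ordinary text inside the file, it survives a plain 'cp' without '-a' and can be found with grep, which is the point Sam and Willy made.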

2. SMP Support For Software Suspend (swsusp)

23 Jun 2004 - 25 Jun 2004 (4 posts) Archive Link: "SMP support for swsusp (this one actually works for me)"

Topics: SMP, Software Suspend

People: Pavel Machek, Patrick Mochel

Pavel Machek said, "Here's SMP support for swsusp; this one actually works for me [with keyboard hack], but I'd like more testers. If it looks okay, I'll merge simple pieces with andrew." Patrick Mochel liked the patch, and offered some aesthetic criticism, but there was no real discussion.

3. Elastic Quota File System (EQFS) Proposal

23 Jun 2004 - 30 Jun 2004 (46 posts) Archive Link: "Elastic Quota File System (EQFS)"

People: Amit Gud, Olaf Dabrunz, Mark Cooke

Amit Gud said:

Recently I'm into developing an Elastic Quota File System (EQFS). This file system works on a simple concept ... give it to others if you're not using it, let others use it, but on the guarantee that you get it back when you need it!!

Here I'm talking about disk quotas. In any typical network, e.g. sourceforge, each user is given a fixed amount of quota. 100 Mb in case of sourceforge. 100 Mb is way over some project requirements and too small for some projects. EQFS tries to solve this problem by exploiting the users' usage behavior at runtime. That is the user's quota which he doesn't need is given to the users who need it, but on 100% assurance that the original user can any time reclaim his/her quota.

Before getting into implementation details I want to have public opinion about this system. All EQFS tries to do is it maximizes the disk space usage, which otherwise is wasted if the user doesn't really need the allocated user..on the other hand it helps avoid the starvation of the user who needs more space. It also helps administrator to get away with the problem of variable quota needs..as EQFS itself adjusts according to the user needs.

Mark Watts asked how it would be possible to "guarantee" that the user would get the space back when they wanted it. Amit expanded:

Ok, this is what I propose:

Lets say there are just 2 users with 100 megs of individual quota, user A is using 20 megs and user B is running out of his quota. Now what B could do is delete some files himself and make some free space for storing other files. Now what I say is instead of deleting the files, he declares those files as elastic.

Now, moment he makes that files elastic, that much amount of space is added to his quota. Here Mark Cooke's equation applies with some modifications: N no. of users, Qi allocated quota of ith user Ui individual disk usage of ith user ( should be <= allocated quota of ith user ), D disk threshold; thats the amount of disk space admin wants to allow the users to use (should be >= sum of all users' allocated quota, i.e. summation Qi ; for i = 0 to N - 1).

Total usage of all the users (here A & B) should be at _anytime_ less than D. i.e. summation Ui <= D; for i = 0 to N - 1.

The point to note here is that we are not bothering how much quota has been allocated to an individual user by the admin, but we are more interested in the usage pattern followed by the users. E.g. if user B wants additional space of say 25 megs, he picks up 25 megs of his files and 'marks' them elastic. Now his quota is increased to 125 megs and he can now add more 25 megs of files; at the same time allocated quota for user A is left unaffected. Applying the above equation total usage now is A: 20 megs, B: 125 megs, now total 145 <= D, say 200 megs. Thus this should be ok for the system, since the usage is within bounds.

Now what happens if Ui > D? This can happen when user A tries to reclaim his space. i.e. if user A adds say more 70 megs of files, so the total usage is now - A: 90 megs, B: 125 megs; 215 ! <= D. The moment the total usage crosses the value, 'action' will be taken on the elastic files. Here elastic files are of user B so only those will be affected and users A's data will be untouched, so in a way this will be completely transparent to user A. What action should be taken can be specified by the user while making the files elastic. He can either opt to delete the file, compress it or move it to some place (backup) where he know he has write access. The corresponding action will be taken until the threshold is met.

Will this work?? We are relying on the 'free' space ( i.e. D - Ui ) for the users to benefit. The chances of having a greater value for D - Ui increases with the increase in the number of users, i.e. N. Here we are talking about 2 users but think of 10000+ users where all the users will probably never use up _all_ the allocated disk space. This user behavior can be well exploited.

EQFS can be best fitted in the mail servers. Here e.g. I make whole linux-kernel mailing list elastic. As long as Ui <= D I get to keep all the messages, whenever Ui > D, messages with latest dates will be 'acted' upon.

For variable quota needs, admin can allocate different quotas for different users, but this can get tiresome when N is large. With EQFS, he can allocate fixed quota for each user ( old and new ) , set up a value for D and relax. The users will automatically get the quota they need. One may ask that this can be done by just setting up value of D, checking it against summation Ui and not allocating individual quotas at all. But when summation Ui crosses D value, whose file to act on? Moreover with both individual quotas and D, we give users 'controlled' flexibility just like elastic - it can be stretched but not beyond a certain range.

What happens when an user tries to eat up all the free ( D - Ui ) space? This answer is implementation dependent because you need to make a decision: should an user be allowed to make a file elastic when Ui == D . I think by saying 'yes' we eliminate some users' mischief of eating up all free space.
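Amit's worked example (users A and B, 100 megs of quota each, D = 200 megs) can be traced with a small model. The following Python sketch is only an illustration of the accounting he describes, not his implementation: the class and method names are hypothetical, the 'action' taken on elastic files is simply deletion (one of the three options he lists), and the order in which users' elastic files are acted upon is an assumption, since the proposal does not specify it. It also assumes D >= summation Qi, as the proposal requires.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    quota: int                                   # Qi: quota set by the admin, in megs
    regular: int = 0                             # megs held in ordinary files
    elastic: list = field(default_factory=list)  # sizes of files marked elastic

    @property
    def usage(self) -> int:
        """Ui: everything the user currently has on disk."""
        return self.regular + sum(self.elastic)

    @property
    def limit(self) -> int:
        """Effective quota: marking files elastic adds their size to the quota."""
        return self.quota + sum(self.elastic)


class ElasticPool:
    def __init__(self, threshold: int):
        self.threshold = threshold               # D: total space the admin allows
        self.users = {}

    def add_user(self, name: str, quota: int, regular: int = 0) -> None:
        self.users[name] = User(quota, regular)

    def total(self) -> int:
        """Summation of Ui over all users."""
        return sum(u.usage for u in self.users.values())

    def mark_elastic(self, name: str, size: int) -> None:
        """Reclassify `size` megs of a user's existing files as elastic."""
        u = self.users[name]
        if size > u.regular:
            raise ValueError("cannot mark more than the user stores")
        u.regular -= size
        u.elastic.append(size)

    def write(self, name: str, size: int) -> list:
        """Store `size` more megs of ordinary files, then enforce D."""
        u = self.users[name]
        if u.usage + size > u.limit:
            raise ValueError("over quota")
        u.regular += size
        return self.enforce()

    def enforce(self) -> list:
        """Delete elastic files until total usage <= D; return what was acted on."""
        acted = []
        for name, u in self.users.items():       # visiting order is an assumption
            while self.total() > self.threshold and u.elastic:
                acted.append((name, u.elastic.pop()))
        return acted
```

Following the numbers in the email: after B marks 25 megs elastic and writes 25 more, the total is 145 megs and nothing happens; when A then writes 70 megs, the total would reach 215 megs, so B's 25 elastic megs are acted upon and the total drops to 190 megs, while A's files are untouched.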

Olaf Dabrunz replied:

So this narrows down to the effective handling of backup procedures and the effective administration of fixed quotas and centralization of data.

If you have many users it is also likely that there are more people interested in big data-files. So you need to help these people organize themselves e.g. by helping them to create mailing-list, web-pages or letting them install servers that makes the data centrally available with some interface that they can use to select parts of the data.

I would rather suggest that if the file does not fit within a given quota, the user should apply for more quota and give reasons for that.

I believe that flexible or "elastic" allocation of resources is a good idea in general, but it only works if you have cheap and easy ways to control both allocation and deallocation. So in the case of CBQ in networks this works, since bandwidth can easily and quickly be allocated and deallocated.

But for filesystem space this requires something like a "slower (= less expensive), bigger, always accessible" third level of storage in the "RAM, disk, ..." hierarchy. And then you would need an easy or even transparent way to access files on this third level storage. And you need to make sure that, although you obviously *need* the data for something, you still can afford to increase retrieval times by several orders of magnitude at the discretion of the filesystem.

But usually all this can be done by scripts as well.

Still, there is a scenario and a combination of features for such a filesystem that IMHO would make it useful:

Now you can use the third-level storage as a backing store for hard-drive space, analogous to what swap-space provides for RAM. And you can "swap in" parts of files from there and cache them on the hard drive. So "elastic" files are actually files that are "swappable" to backing store.

This assumes that the "elastic" files meet the requirements for a "working set" in a similar fashion as for RAM-based data. I.e. the swap operations need only be invoked relatively seldom.

If this is not the case, your site/customer needs to consider buying more hard drive space (and maybe also RAM).

The tradeoff for the user now is:

Maybe this is a good tradeoff for a significant amount of users. Maybe there are sites/customers that have the required backing store (or would consider buying into this). I do not know. Find a sponsor, do some field research and give it a try.

4. linux-libc-headers Updated To 2.6.7; Status Of ABI Cleanup

23 Jun 2004 - 25 Jun 2004 (15 posts) Archive Link: "[ANNOUNCE] linux-libc-headers 2.6.7.0"

People: Mariusz Mazur, Chris Friesen, Andries Brouwer, Sam Ravnborg, Matthew Wilcox, Jeff Garzik, Rob Landley, Krzysztof Halasa, H. Peter Anvin

Mariusz Mazur announced a new version of linux-libc-headers, updated to Linux 2.6.7, with additional minor fixes; he said, "Llh is all good and nice, cause it works (most of the times anyway), but with every new release the possibility of desync from kernel increases - downfalls of maintaining it as a separate package. Could anybody point me to some conclusions about how the thing should be done The Right Way (preferably with some input from high profile kernel hackers, so I can have some assurance that once something gets done it will get merged)?" Jeff Garzik replied that H. Peter Anvin had suggested adding an include/abi directory in the sources, and that there had been no objection to it at that time; but that this would most likely have to wait until the 2.7 time frame. Mariusz thought the sooner the better, and Krzysztof Halasa floated a patch. Close by, Chris Friesen googled around and posted his findings regarding the earlier discussion:

Andries Brouwer suggested include/linuxabi with arch-specific dirs

H. Peter Anvin apparently suggested putting them in include/abi, with arch-specific dirs. However, he thinks it's too much work for 2.6 and sees it as an early 2.7 thing.

Matthew Wilcox apparently suggested something similar

Jeff Garzik approved the idea

Rob Landley suggested moving your headers there and then cleaning up the other headers, and expressed willingness to submit patches.

Sam Ravnborg supported the idea

Eric Biederman supported the idea, suggested linux-only namespace and version-based naming, figured it was 2.7 work

David Miller approved the idea

At some point down the line in the discussion, Andries Brouwer said:

Several people, probably independently, submitted a header setup and a patch that did the required work for a small handful of header files.

As far as I know Linus has not reacted to such patches.

Since the total amount of work is, like Jeff says, incredibly long and tedious, it is unreasonable to expect that all be done before anything is put in the default kernel tree.

At some point in time Linus either has to describe his setup, or accept a setup someone submits. Maybe a BOF would be useful to find out precisely what requirements there are, but only if Linus is present, because we have had enough discussion already.

Sam Ravnborg remarked:

Header file cleanup has a tendency to break the compile in some configurations. This was obvious during the effort to clean up the include mess in the 2.5 time.

That's maybe the primary reason to postpone it to 2.7.

5. Linux 2.6.7-mm2 Released

24 Jun 2004 - 29 Jun 2004 (24 posts) Archive Link: "2.6.7-mm2"

Topics: Kernel Release Announcement, Virtual Memory

People: Andrew Morton

Andrew Morton announced Linux 2.6.7-mm2, saying:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm2/

6. GFS Clustering Filesystem Goes GPL

24 Jun 2004 - 25 Jun 2004 (4 posts) Archive Link: "GFS cluster filesystem re-released"

Topics: Clustering, Disk Arrays: LVM

People: Ken Preslan, Bernd Eckenfels

Ken Preslan said:

Red Hat has re-released the GFS cluster filesystem and its related infrastructure under the GPL. The different projects that make up the infrastructure are:

GFS - shared-disk cluster file system
CLVM - clustering extensions to the LVM2 logical volume manager toolset
CMAN - general-purpose symmetric cluster manager
DLM - general-purpose distributed lock manager
CCS - cluster configuration system to manage the cluster config file
GULM - alternative redundant server-based lock/cluster manager for GFS
GNBD - network block device driver shares storage over a network
Fence - I/O fencing system

The source code and patches for 2.6 are available at http://sources.redhat.com/cluster/. 2.4 source should show up early tomorrow.

We're looking for people to help us work on this project so we can eventually get it included into the Linux kernel. Comments, suggestions, patches, and testers are more than welcome.

Bernd Eckenfels was very happy to see this. Jonathan Fors asked what exactly GFS was, and whether it was the 'Google FS' clustering filesystem he'd read about; Ken said no, this was something completely different.

7. Merging ext2 And ext3

24 Jun 2004 - 25 Jun 2004 (30 posts) Archive Link: "Collapse ext2 and 3 please"

Topics: FS: ext2, FS: ext3

People: Helge Hafting, Sean Neakums, Andrew Morton, Linus Torvalds

John Richard Moser suggested collapsing the ext2 and ext3 filesystems into one, because their similarities seemed to make it appropriate. Helge Hafting pointed out that when the issue of extending ext2 to support journaling first came up, Linus Torvalds "said that creating a journalled fs was fine, but they had to make it a new fs so as to not make ext2 unstable while working on it. Therefore - ext3. Now ext3 was based on ext2 so it basically started out as a copy." Close by, Sean Neakums also said that once upon a time, someone had suggested "that a no-journal mode be added to ext3 so that ext2 could be removed." Andrew Morton replied:

I think it could be done, mainly as a kernel-space-saving exercise. But the two filesystems are quite different nowadays.

ext2 uses per-inode pagecache for directories, ext3 uses blockdev pagecache. The truncate algorithms are significantly different. Other stuff.

Much pain, little gain.

8. Linux 2.6.7-mm3

26 Jun 2004 - 28 Jun 2004 (13 posts) Archive Link: "2.6.7-mm3"

Topics: Kernel Release Announcement, Version Control

People: Andrew Morton

Andrew Morton announced Linux 2.6.7-mm3, saying:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm3/

9. Transparent Interprocess Communication Protocol (TIPC)

28 Jun 2004 (1 post) Archive Link: "[ANNOUNCE] supporting cluster communication with TIPC"

Topics: BSD, Networking, Ottawa Linux Symposium, Version Control

People: Jon Maloy

Jon Maloy said:

I would like to announce the availability of TIPC (Transparent Inter Process Communication protocol). TIPC is a protocol specially designed for high-performance, location transparent communication within loosely connected clusters, and has been used successfully in various Ericsson products over the last years.

In cooperation with colleagues from OSDL and Intel, I have ported TIPC to Linux, and rewritten large parts of the code to fit the Linux kernel environment and coding requirements. TIPC can be compiled either as a part of the kernel or as a loadable module, and is now released as open source code under a dual GPL/BSD license.

Overview

TIPC provides a good support for designing scalable, distributed, site independent, highly available, and high-performance applications.

It provides features such as:

Implementation Status:

There exist two main source code lines:

tipc-1.2.X: this is the most stable and tested release. It works well on both Linux 2.4 and Linux 2.6. This code is not compliant with Linux kernel code requirements regarding code style etc, and now only has interest for comparative reasons. The corresponding CVS modules are "source/stable_ericsson" and "source/unstable_ericsson". A downloadable example using the API of this version is found under "tipc-test", the file "tipc-benchmark-0.93.tar.gz"

tipc-1.3.X: the most recent code, written for Linux 2.6, and compliant with requirements on such code. It works well, but is still slightly less stable than the 1.2 line. The corresponding CVS module is "source/unstable", while we have not had the guts to check in anything under "source/stable" yet. We have not been able to verify that this code has the same performance as the 1.2 code, but we have every reason to believe it will be comparable once the proper optimization work is done. A downloadable example using the API of this version is found under "tipc-test", the file "tipc_test-1.5.tar.gz"

Links:

The TIPC page at SourceForge:
http://tipc.sourceforge.net

Downloading source code and documentation:
http://sourceforge.net/projects/tipc/

A draft protocol specification presented at IETF-59 in Seoul last March.
http://www.ietf.org/internet-drafts/draft-maloy-tipc-00.txt

An article written for the April issue of Linux World Magazine:
http://www.linux.ericsson.ca/papers/tipc_lwm/index.shtml

To be presented at OLS in Ottawa next month:
http://www.linux.ericsson.ca/papers/tipc_ols.pdf

We would appreciate your feedback and advice.

10. Linux 2.6.7-mm4

29 Jun 2004 (4 posts) Archive Link: "2.6.7-mm4"

Topics: Kernel Release Announcement

People: Andrew Morton

Andrew Morton announced Linux 2.6.7-mm4, saying:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm4/

Sharon And Joy
Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.