Kernel Traffic #321 For 3 Sep 2005

By Zack Brown

Mailing List Stats For This Week

We looked at 1504 posts in 9MB. See the Full Statistics.

There were 628 different contributors. 218 posted more than once. The average length of each message was 97 lines.

The top posters of the week were:

63 posts in 297KB by adrian bunk
54 posts in 226KB by akpm@osdl.org
28 posts in 87KB by lee revell
27 posts in 153KB by dtor_core@ameritech.net
23 posts in 83KB by pavel machek

The top subjects of the week were:

73 posts in 362KB for "merging relayfs?"
38 posts in 152KB for "kernel guide to space"
28 posts in 127KB for "memory pressure handling with iscsi"
26 posts in 128KB for "2.6.13-rc3-mm1 (ckrm)"
22 posts in 85KB for "[patch 2.6.13-rc3a] i386: inline restore_fpu"

These stats generated by mboxstats version 2.8

1. Guide To Using Whitespace In Kernel Sources

11 Jul 2005 - 22 Jul 2005 (38 posts) Archive Link: "kernel guide to space"

People: Michael S. Tsirkin

Michael S. Tsirkin said:

I've been tasked with educating some new hires on linux kernel coding style. While we have Documentation/CodingStyle, it skips detail that is supposed to be learned by example. Since I've been burned by this a couple of times myself till I learned, I've put together a short list of rules complementing Documentation/CodingStyle. This list is attached, below. Please cc me directly with comments, if any.

He gave a link to his list of rules (http://www.mellanox.com/mst/boring.txt), and various folks offered suggestions of greater or lesser obscurity.

2. RelayFS Likely To Go Into -mm

11 Jul 2005 - 25 Jul 2005 (89 posts) Archive Link: "Merging relayfs?"

Topics: Networking

People: Tom Zanussi, Andrew Morton, Dave Airlie, Baruch Even, Jason Baron, Steven Rostedt, Bert Hubert, Christoph Hellwig

Tom Zanussi said to Andrew Morton:

can you please merge relayfs? It provides a low-overhead logging and buffering capability, which does not currently exist in the kernel.

relayfs key features:

The relayfs code has been in -mm for more than three months following the extensive review that took place on LKML at the beginning of the year, at which time we addressed all of the issues people had. Since then only a few minor patches to the original codebase have been needed, most of which were sent to us by users; we'd like to thank those who took the time to send patches or point out problems.

The code in the -mm tree has also been pounded on very heavily through normal use and testing, and we haven't seen any problems with it - it appears to be very stable.

We've also tried to make it as easy as possible for people to create 'quick and dirty' (or more substantial) kernel logging applications. Included is a link to an example that demonstrates how useful this can be. In a nutshell, it uses relayfs logging functions to track kmalloc/kfree and detect memory leaks. The only thing it does in the kernel is to log a small binary record for each kmalloc and kfree. The data is then post-processed in user space with a simple Perl script. You can see an example of the output and the example itself here:

http://relayfs.sourceforge.net/examples.html#kleak

Last but not least, it's still small (40k worth of source), self-contained and unobtrusive to the rest of the kernel.

In summary, relayfs is very stable, is useful to current users and with inclusion, would be useful to many others. If you can think of anything we've overlooked or should work on to get relayfs to the point of inclusion, please let us know.
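The kleak idea Tom describes (log one small binary record per kmalloc and kfree, then find the leaks in user space) can be sketched in a few lines. The sketch below uses Python rather than the Perl of the real example, and the record layout (a one-byte event type, a 64-bit address and a 32-bit size) is purely an assumption made for illustration; it is not the format the actual kleak example writes.

```python
import struct

# Hypothetical record layout (an assumption, not the real kleak format):
# 1-byte event type, 8-byte address, 4-byte size, little-endian, no padding.
RECORD = struct.Struct("<BQI")
KMALLOC, KFREE = 1, 2


def find_leaks(log: bytes) -> dict:
    """Return {address: size} for allocations never freed in the log."""
    live = {}
    for event, addr, size in RECORD.iter_unpack(log):
        if event == KMALLOC:
            live[addr] = size
        elif event == KFREE:
            live.pop(addr, None)  # ignore frees of addresses we never saw
    return live


# Build a small fake log: two allocations, only the first one is freed.
sample_log = b"".join([
    RECORD.pack(KMALLOC, 0xC1000000, 128),
    RECORD.pack(KMALLOC, 0xC1000100, 64),
    RECORD.pack(KFREE, 0xC1000000, 0),
])

for addr, size in find_leaks(sample_log).items():
    print(f"leaked {size} bytes at {addr:#x}")
```

The point of the split is the one made in the post: the kernel side only appends fixed-size records, and all matching of frees to allocations happens afterwards in user space.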

Andrew said he was willing, but asked, "Would you have time to prepare a list of existing and planned applications?" Dave Airlie replied:

I have a plan to use it for something that no-one knows about yet..

I was going to use it for doing a DRM packet debug logger... to try and trace hangs in the system, using printk doesn't really help as guess what it slows the machine down so much that your races don't happen... I wrote some basic code for this already.. and I'm hoping to use some work time to get it finished at some stage...

Baruch Even also said, "I'm using relayfs during my development work to log the current TCP stack parameters and timing information. There is no reason that I can see to merge this into the kernel, but it's very useful for my development work. I'd like to see relayfs merged." And Tom also added:

I know that systemtap (http://sourceware.org/systemtap/) is using relayfs and that LTT (http://www.opersys.com/ltt/index.html) is also currently being reworked to use it.

I've also added a couple of people to the cc: list that I've consulted with in getting their applications to use relayfs, one of which is the logdev debugging device recently posted to LKML.

I also know that there are still users of the old relayfs around; I don't however know what their plans are regarding moving to the new relayfs.

My own personal interest is to start playing around with creating some visualization tools using data gathered from relayfs. Hopefully, I'll have more time to do that if relayfs gets merged. ;-)

Elsewhere, Christoph Hellwig remarked that the code itself looked very good. And close by, Andrew remarked that he was inclined to merge RelayFS even without a large contingent of users, because "relayfs is more for in-kernel "applications" than for userspace ones, if you like."

Elsewhere, Jason Baron objected, "regarding its use of vmap, http://marc.theaimsgroup.com/?l=linux-kernel&m=110755199913216&w=2 On x86, the vmap space is at a premium, and this space is reserved over the entire lifetime of a 'channel'. Is the use of vmap really critical for performance?" Tom confirmed:

Yes, the vmap'ed area is reserved over the lifetime of the channel, but the typical usage of a channel is transient - allocate it at the start of say a tracing run, and then vunmap it and free the memory when done. Unless you're using huge buffers, you wouldn't run into a problem running out of vmalloc space, and typical applications should be able to use relatively small buffers.

I don't really know how we would get around using vmap - it seems like the alternatives, such as managing an array of pages or something like that, would slow down the logging path too much to make it useful as a low overhead logging mechanism. If you have any ideas though, please let me know.

Steven Rostedt remarked, "My logdev device was pretty quick! The managing of the pages were negligible to the copying of the data to the buffer. Although, sometimes you needed to copy across buffers, but this too wouldn't be too much of an impact." He also said in a different post:

I believe that (Tom correct me if I'm wrong) the use of vmap was to allocate a large buffer without risking failing to allocate. Since the buffer does not need to be in continuous pages. If this is a problem, maybe Tom can use my buffer method to make a buffer :-)

See http://www.kihontech.com/logdev where my logdev debugging tool that allocates separate pages and uses an accounting system instead of the more efficient vmalloc to keep the data in the pages together. I'm currently working with Tom to get this to use relayfs as the back end. But here you can take a look at how the buffering works and it doesn't waste up vmalloc.

Tom replied, "The main reason we use vmap is so that from the kernel side we have a nice contiguous address range to log to even though the pages aren't actually contiguous." He added, "It might be worthwhile to try out different alternatives and compare them, but I'm pretty sure we won't be able to beat what's already in relayfs. The question is I guess, how much slower would be acceptable?" Steven replied, "I totally agree that the vmalloc way is faster, but I would also argue that the accounting to handle the separate pages would not even be noticeable with the time it takes to do the actual copying into the buffer. So if the accounting adds 3ns on top of 500ns to complete, I don't think people will mind." Tom replied, "OK, it sounds like something to experiment with - I can play around with it, and later submit a patch to remove vmap if it works out."
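To make the trade-off in this exchange concrete, here is a toy user-space model of a log buffer built from separately allocated pages, where the write path has to do the "accounting" Steven mentions: mapping a logical offset to a page and splitting a record that straddles a page boundary. The class name PagedLog and the 4096-byte page size are assumptions for the sketch; this is neither logdev nor relayfs code.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


class PagedLog:
    """Toy log buffer built from separately allocated pages.

    This only models the bookkeeping; it is not logdev or relayfs code.
    """

    def __init__(self, num_pages: int):
        # Each page is its own allocation, so no contiguous range is needed.
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.pos = 0  # logical write offset across all pages

    def write(self, data: bytes) -> int:
        """Copy data into the pages; return how many pages were touched."""
        if self.pos + len(data) > PAGE_SIZE * len(self.pages):
            raise ValueError("log full")
        touched = 0
        while data:
            # The "accounting": map the logical offset to (page, offset).
            page, offset = divmod(self.pos, PAGE_SIZE)
            chunk = data[:PAGE_SIZE - offset]
            self.pages[page][offset:offset + len(chunk)] = chunk
            self.pos += len(chunk)
            data = data[len(chunk):]
            touched += 1
        return touched

    def read_all(self) -> bytes:
        """Reassemble everything written so far, in order."""
        return b"".join(self.pages)[:self.pos]


log = PagedLog(num_pages=2)
log.write(b"a" * 4090)                # stays within the first page
crossed = log.write(b"0123456789")    # straddles the page boundary
print(crossed, log.read_all()[-10:])
```

With a single contiguous mapping, which is what vmap gives relayfs, the loop collapses to one copy at one offset. The disagreement in the thread is only about whether the extra divmod-and-split work is measurable next to the copy itself.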

Elsewhere along a different train of thought, Bert Hubert said:

I'm running into a wall with relayfs, which I intend to use to convey large amounts of disk statistics towards userspace.

Now, I've read Documentation/filesystems/relayfs.txt many times over, and I don't get it.

It appears there is relayfs, and 'klog' on top of that. It also appears that to access relayed data from the kernel in userspace there is librelay.c.

On reading librelay.c, I find code sending and receiving netlink messages, but relayfs.txt doesn't even contain the word netlink!

I then launched the 'kleak-app' sample program, but told it to look at /relay/diskstat* instead of its own file, but it gives me unspecified netlink errors.

Things I need to know, and which I hope to find documented somewhere:

  1. Do I need to do the netlink thing?
  2. What kind of messages do I need to send/receive?
  3. What is the exact format userspace sees in the relayfs file? Iow, can I access that file w/o using librelay.c?
  4. What are the semantics for reading from that file?
  5. When using klog, is there only one channel?
  6. does librelay.c talk to regular relayfs or to klog?

Don't get me wrong, relayfs sure looks nice for what I'm trying to do but from userspace it is sort of a black box right now..

Tom answered a lot of Bert's specific questions, and Bert posted a patch to add some real documentation for the filesystem.

3. Linux 2.6.13-rc3-mm1 Released; Some Consideration Of CKRM

15 Jul 2005 - 28 Jul 2005 (91 posts) Archive Link: "2.6.13-rc3-mm1"

Topics: Kernel Release Announcement, Ottawa Linux Symposium, Version Control

People: Andrew Morton, Christoph Hellwig, Paul Jackson, Mark Hahn, Gerrit Huizenga, Helge Hafting, Adrian Bunk, Alan Cox

Andrew Morton announced Linux 2.6.13-rc3-mm1, saying:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

(http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until kernel.org syncs up)

Christoph Hellwig asked, "Andrew, do we really need to add every piece of crap lying on the street to -mm? It's far away from mainline enough already without adding obviously unmergeable stuff like this." Andrew replied:

My gut reaction to ckrm is the same as yours. But there's been a lot of work put into this and if we're to flatly reject the feature then the developers are owed a much better reason than "eww yuk".

Otherwise, if there are certain specific problems in the code then it's best that they be pointed out now rather than later on.

What, in your opinion, makes it "obviously unmergeable"?

Paul Jackson replied:

Thanks to some earlier discussions on the relation of CKRM with cpusets, I've spent some time looking at CKRM. I'm not Christoph, but perhaps my notes will be of some use in this matter.

CKRM is big, it's difficult for us mere mortals to understand, and it has attracted only limited review - inadequate review in proportion to its size and impact. I tried, and failed, sometime last year to explain some of what I found difficult to grasp of CKRM to the folks doing it. See further an email thread entitled:

Classes: 1) what are they, 2) what is their name?
http://sourceforge.net/mailarchive/forum.php?thread_id=5328162&forum_id=35191

on the ckrm-tech@lists.sourceforge.net email list between Aug 14 and Aug 27, 2004

As to its size, CKRM is in a 2.6.5 variant of SuSE that I happen to be building just now for other reasons. The source files that have 'ckrm' in the pathname, _not_ counting Doc files, total 13044 lines of text. The CONFIG_CKRM* config options add 144 Kbytes to the kernel text.

The CKRM patches in 2.6.13-rc3-mm1 are similar in size. These patch files total 14367 lines of text.

It is somewhat intrusive in the areas it controls, such as some large ifdef's in kernel/sched.c.

The sched hooks may well impact the cost of maintaining the sched code, which is always a hotbed of Linux kernel development. However others who work in that area will have to speak to that concern.

I tried just now to read through the ckrm hooks in fork, to see what sort of impact they might have on scalability on large systems. But I gave up after a couple layers of indirection. I saw several atomic counters and a couple of spinlocks that I suspect (not at all sure) lay on the fork main code path. I'd be surprised if this didn't impact scalability. Earlier, according to my notes, I saw mention of lmbench results in the OLS 2004 slides, indicating a several percent cost of available cpu cycles.

A feature of this size and impact needs to attract a fair bit of discussion, because it is essential to a variety of people, or because it is intriguing in some other way.

I suspect that the main problem is that this patch is not a mainstream kernel feature that will gain multiple uses, but rather provides support for a specific vendor middleware product used by that vendor and a few closely allied vendors. If it were smaller or less intrusive, such as a driver, this would not be a big problem. That's not the case.

The threshold of what is sufficient review needs to be set rather high for such a patch, quite a bit higher than I believe it has obtained so far. It will not be easy for them to obtain that level of review, until they get better at arousing the sustained interest of other kernel developers.

There may well be multiple end users and applications depending on CKRM, but I have not been able to identify how many separate vendors provide middleware that depends on CKRM. I am guessing that only one vendor has a serious middleware software product that provides full CKRM support. Acceptance of CKRM would be easier if multiple competing middleware vendors were using it. It is also a concern that CKRM is not really usable for its primary intended purpose except if it is accompanied by this corresponding middleware, which I presume is proprietary code. I'd like to see a persuasive case that CKRM is useful and used on production systems not running substantial sole sourced proprietary middleware.

The development and maintenance costs so far of CKRM appear (to this outsider) to have been substantial, which suggests that the maintenance costs of CKRM once in the kernel would be non-trivial. Given the size of the project, its impact on kernel code, and the rather limited degree to which developers outside of the CKRM project have participated in CKRM's development or review, this could either leave the Linux kernel overly dependent on one vendor for maintaining CKRM, or place an undue maintenance burden on other kernel developers.

CKRM is in part a generalization and descendent of what I call fair share schedulers. For example, the fork hooks for CKRM include a forkrates controller, to slow down the rate of forking of tasks using too much resources.

No doubt the CKRM experts are already familiar with these, but for the possible benefit of other readers:

UNICOS Resource Administration - Chapter 4. Fair-share Scheduler
http://oscinfo.osc.edu:8080/dynaweb/all/004-2302-001/@Generic__BookTextView/22883

SHARE II -- A User Administration and Resource Control System for UNIX
http://www.c-side.com/c/papers/lisa-91.html

Solaris Resource Manager White Paper
http://wwws.sun.com/software/resourcemgr/wp-mixed/

ON THE PERFORMANCE IMPACT OF FAIR SHARE SCHEDULING
http://www.cs.umb.edu/~eb/goalmode/cmg2000final.htm

A Fair Share Scheduler, J. Kay and P. Lauder
Communications of the ACM, January 1988, Volume 31, Number 1, pp 44-55.

The documentation that I've noticed (likely I've missed something) doesn't do an adequate job of making the case - providing the motivation and context essential to understanding this patch set.

Because CKRM provides an infrastructure for multiple controllers (limiting forks, memory allocation and network rates) and multiple classifiers and policies, its critical interfaces have rather generic and abstract names. This makes it difficult for others to approach CKRM, reducing the rate of peer review by other Linux kernel developers, which is perhaps the key impediment to acceptance of CKRM. If anything, CKRM tends to be a little too abstract.

Inclusion of diffstat output would help convey to others the scope of the patchset.

My notes from many months ago indicate something about a 128 CPU limit in CKRM. I don't know why, nor if it still applies. It is certainly a smaller limit than the systems I care about.

A major restructuring of this patch set could be considered. This might involve making the metric tools (that monitor memory, fork and network usage rates per task) separate patches useful for other purposes. It might also make the rate limiters in fork, alloc and network i/o separately useful patches. I mean here genuinely useful and understandable in their own right, independent of some abstract CKRM framework.

Though hints have been dropped, I have not seen any public effort to integrate CKRM with either cpusets or scheduler domains or process accounting. By this I don't mean recoding cpusets using the CKRM infrastructure; that proposal received _extensive_ consideration earlier, and I am as certain as ever that it made no sense. Rather I could imagine the CKRM folks extending cpusets to manage resources on a per-cpuset basis, not just on a per-task or task class basis. Similarly, it might make sense to use CKRM to manage resources on a per-sched domain basis, and to integrate the resource tracking of CKRM with the resource tracking needs of system accounting.

And Mark Hahn said:

CKRM is all about resolving conflicting resource demands in a multi-user, multi-server, multi-purpose machine. this is a huge undertaking, and I'd argue that it's completely inappropriate for *most* servers. that is, computers are generally so damn cheap that the clear trend is towards dedicating a machine to a specific purpose, rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.

this is *directly* in conflict with certain prominent products, such as the Altix and various less-prominent Linux-based mainframes. they're all about partitioning/virtualization - the big-iron aesthetic of splitting up a single machine. note that it's not just about "big", since cluster-based approaches can clearly scale far past big-iron, and are in effect statically partitioned. yes, buying a hideously expensive single box, and then chopping it into little pieces is more than a little bizarre, and is mainly based on a couple assumptions:

CKRM is one of those things that could be done to Linux, and will benefit a few, but which will almost certainly hurt *most* of the community.

let me say that the CKRM design is actually quite good. the issue is whether the extensive hooks it requires can be done (at all) in a way which does not disproportionately hurt maintainability or efficiency.

CKRM requires hooks into every resource-allocation decision fastpath:

but really, this is only for CKRM-enforced limits. CKRM really wants to change behavior in a more "weighted" way, not just causing an allocation/fork/packet to fail. a really meaningful CKRM needs to be tightly integrated into each resource manager - affecting each scheduler (process, memory, IO, net). I don't really see how full-on CKRM can be compiled out, unless these schedulers are made fully pluggable.

finally, I observe that pluggable, class-based resource _limits_ could probably be done without callbacks and potentially with low overhead. but mere limits doesn't meet CKRM's goal of flexible, wide-spread resource partitioning within a large, shared machine.

Paul agreed with all of this. Several folks spoke in favor of CKRM. Gerrit Huizenga commented, "CKRM's goal is to do simple workload management both on laptops and on servers. I'm not opposed to doing a few things overly simply as long as we get some basic capability. And we can refine with experience. I'm definitely not looking to make CKRM any more complex than it has to be, and yet I also want it to be useful on a laptop, small single CPU machine, as well as larger servers."

At one point, Alan Cox remarked that anyone could simply choose to configure CKRM out of their kernel at compilation time. There was no need to accept any of its drawbacks unless its benefits were more valuable to that user. To this, Gerrit said:

I'm actually trying to keep the impact of CKRM=y to near-zero, ergo only an impact if you create classes. And even then, the goal is to keep that impact pretty small as well.

And yes, a hypervisor does have a lot more overhead in many forms. Something like an overall 2-3% everywhere, where the CKRM impact is likely to be so small as to be hard to measure in the individual subsystems, and overall performance impact should be even smaller. Plus you won't have to manage each operating system instance which can grow into a pain under virtualization. But I still maintain that both have their place.

The debate continued, with no real resolution.

Elsewhere, regarding the kernel release, Helge Hafting reported:

I usually compile without module support. This time, I turned modules on in order to compile an external module.

To my surprise, drivers/scsi/qla2xxx/qla2xxx.ko were built even though no actual modules are selected in my .config, and the source is not patched at all except the mm1 patch.

Adrian Bunk replied, "Known bug, already fixed in -mm3."

4. Adding -Wundef To CFLAGS During Kernel Compilation

21 Jul 2005 - 23 Jul 2005 (5 posts) Archive Link: "[PATCH] add -Wundef to global CFLAGS"

People: Olaf Hering, Sam Ravnborg

Olaf Hering said, "A recent change to the aic scsi driver removed two defines to detect endianness. cpp handles undefined strings as 0. As a result, the test turned into #if 0 == 0 and the wrong code was selected. Adding -Wundef to global CFLAGS will catch such errors." Sam Ravnborg replied, "To my surprise it did not spew out a lot of warnings in my build. In the kernel we quite consistently use #ifdef - good! Applied." Olaf also posted another patch to "turn many #if $undefined_string into #ifdef $undefined_string to fix some warnings after -Wundef was added to global CFLAGS".
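The failure mode Olaf describes comes from one preprocessor rule: inside #if, any identifier that is not a defined macro is silently replaced by 0. The following is only a toy model of that rule, written in Python so it can be run anywhere; it handles integer macros and simple comparisons, and the macro names HOST_ORDER and BIG_ENDIAN_ORDER are hypothetical, since the summary does not name the two defines that were removed.

```python
import re


def eval_if(expr: str, macros: dict) -> tuple:
    """Evaluate a simple '#if' expression the way cpp treats identifiers.

    Returns (result, undefined_names). Only integer macros and operators
    that Python shares with C (==, !=, <, >, +, -) are modelled.
    """
    undefined = []

    def substitute(match):
        name = match.group(0)
        if name in macros:
            return str(macros[name])
        undefined.append(name)   # the names -Wundef would warn about
        return "0"               # without the warning, cpp just uses 0

    expanded = re.sub(r"\b[A-Za-z_]\w*", substitute, expr)
    return bool(eval(expanded, {"__builtins__": {}})), undefined


# Hypothetical macro names; the source does not name the removed defines.
before = {"HOST_ORDER": 1234, "BIG_ENDIAN_ORDER": 4321}
after = {}  # both defines removed

print(eval_if("HOST_ORDER == BIG_ENDIAN_ORDER", before))
print(eval_if("HOST_ORDER == BIG_ENDIAN_ORDER", after))
```

In the second call the test degenerates to 0 == 0 and becomes true, which is the "wrong code was selected" case. The list of undefined names in the return value stands in for the diagnostics that -Wundef adds.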

5. Adding Hotswap Support To libata

21 Jul 2005 - 28 Jul 2005 (7 posts) Archive Link: "[PATCH 0/3] Add disk hotswap support to libata"

Topics: Disks: SCSI, Serial ATA

People: Lukasz Kosewski, Jeff Garzik

Lukasz Kosewski said:

This sequence of patches will add a framework to libata to allow for hot-swapping disks in and out.

There are three patches:
01-promise_sataII150_support
02-libata_hotswap_infrastructure
03-promise_hotswap_support

The rationale for each will be described in following emails.

I encourage anyone with design ideas to come forward and contribute, and anyone who can see concurrency problems (I will describe what I see as issues along with the second patch) to suggest fixes.

Thus far, I've tested this HEAVILY with a 2.6.11.12 kernel + 2.6.11-libata-dev1.patch. I have found no issues outstanding on that kernel. All testing was done with Promise SATA150 and SATAII150 Tx4/Tx2 Plus controllers and a huge variety of Western Digital and Seagate disks.

I have ported my patches to 2.6.13-rc3 and 2.6.13-rc3-mm1, and have tested on the latter as well; they work identically to the 2.6.11 tests except for a breakage in the SCSI layer.

The patches I will attach apply to the latter (2.6.13-rc3-mm1) tree, since I expect that by the time people start looking at them seriously, the existing libata patches in that tree will have become part of mainline. If this is NOT the right thing to do, please tell me, and I'll submit patches for the requested kernel version.

Jeff Garzik replied, "Pretty cool stuff! As soon as I finish SATA ATAPI (this week[end]), I'll take a look at this. A quick review of the patches didn't turn up anything terribly objectionable, though :)" He suggested Ccing the linux-ide mailing list on further discussion; and Lukasz reposted the patches to that list. Meanwhile, Doug Maxey offered to help test the patches, for which Lukasz was very grateful.

6. Some Developer Disconnect Over Touchscreen Support For Sharp SL-5500

22 Jul 2005 - 25 Jul 2005 (18 posts) Archive Link: "[patch 1/2] Touchscreen support for sharp sl-5500"

Topics: FS: sysfs, Touchscreen

People: Pavel Machek, Russell King, Dmitry Torokhov

Pavel Machek posted a patch, saying, "This adds support for reading ADCs (etc), necessary to operate touch screen on Sharp Zaurus sl-5500. Please apply." Russell King replied:

I would like to know what the diffs are between my version (attached) and this version before they get applied.

The only reason my version has not been submitted is because it lives in the drivers/misc directory, and mainline kernel folk don't like drivers which clutter up that directory. In fact, I had been told that drivers/misc should remain completely empty - which makes this set of miscellaneous drivers homeless.

Pavel checked out Russell's version, and found the diff to be quite large. He added, "I have made quite a lot of cleanups to touchscreen part, and it seems to be acceptable by input people. I think it should go into drivers/input/touchscreen/collie_ts.c... Also it looks to me like mcp.h should go into asm/arch-sa1100, so that other drivers can use it..." Dmitry Torokhov said he had some technical suggestions and "one bigger concern - I am surprised that a driver for a physical device is implemented as an interface to a class device. This precludes implementing any kind of power management in the driver and pushes it into the parent and is generally speaking a wrong thing to do (IMHO)." He said, "If the problem is that you have a single piece of hardware you need to bind several drivers to - I guess you will have to create a new sub-device bus for that. Or just register sub-devices on the same bus the parent device is registered on - I am not sure what is best in this particular case - I am not familiar with the arch. It is my understanding that the purpose of interfaces is to present different "views" to userspace and therefore they are not quite suited for what you are trying to do..." Russell replied:

That is exactly the problem - these kinds of devices do _not_ fit well into the device model. A struct device for every different possible sub-unit is completely overkill.

For instance, you may logically use one ADC and some GPIO lines on the device for X and something else for Y and they logically end up in different drivers.

The problem is that the parent doesn't actually know how many devices to create nor what to call them, and they're logically indistinguishable from each other so there's no logical naming system.

Dmitry replied, "Then we should probably not try to force them into driver model. Have parent device register struct device and when sub-drivers register they could attach class devices (like input devices) directly to the "main" device thus hiding presence of sub-sections of the chip from sysfs completely. My point is that we should not be using class_interface here - its purpose is different." And Russell said:

If you look at _my_ version, you'll notice that it doesn't use the class interface stuff. A previous version of it did, and this seems to be what the collie stuff is based upon.

What I suggest is that the collie folk need to update their driver to my version so that we don't have two different forks of the same driver in existence. Then we can start discussing whether things should be using kthreads or not.

Pavel said he'd take this suggestion; and as it turned out, his version was based on an earlier version by Russell, and Pavel made plans to catch up, and apologized for the confusion.

7. Some Discussion Of Users Tracking Kernel Releases

25 Jul 2005 - 27 Jul 2005 (8 posts) Archive Link: "Question re the dot releases such as 2.6.12.3"

People: Gene Heskett, Brian Gerst, Kurt Wall, Steven Rostedt, Valdis Kletnieks

Gene Heskett reported, "I just built what I thought was 2.6.12.3, but my script got a tummy ache because I didn't check the Makefile's EXTRA_VERSION, which was set to .2 in the .2 patch. Now my 2.6.12 modules will need a refresh build. :( So what's the proper patching sequence to build a 2.6.12.3?" Brian Gerst replied, "The dot-release patches are not incremental. You apply each one to the base 2.6.12 tree." Kurt Wall replied, "This bit me a while back, too. I'll submit a patch to the top-level README to spell it out." Steven Rostedt added:

Someone should also fix the home page of kernel.org. Since there's no link on that page that points to the full 2.6.12. Since a lot of the patches on that page go directly against the 2.6.12 kernel and not 2.6.12.3, it would be nice to get the full source of that kernel from the home page.

If I want to incrementally build the 2.6.13-rc3-mm1, would I need to download the 2.6.12 tar ball, followed by the 2.6.13-rc3 patch and then the 2.6.13-rc3-mm1 patch and apply them that way? If so, I can get all the patches but the starting point. Yes I could also download the full version of any of these, but it still seems to make sense to include the starting point of the patches on the home page.

Valdis Kletnieks added, "Even more to the point - when 2.6.13 comes out, there will be a patch against 2.6.12, not 2.6.12.N, which means you get to download the 2.6.12.N tarball, the 2.6.12.N patch, patch -R that, and *then* apply the 2.6.13 patch."
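The rule that emerges from this thread (every patch is made against a base 2.6.x release, never against a dot-release) can be written down as a small planning function. This is a sketch under that stated assumption only: the function name upgrade_steps and the step labels are made up for illustration, and -rc and -mm versions are not handled.

```python
def upgrade_steps(current: str, target: str) -> list:
    """List the patch steps between two 2.6.x or 2.6.x.y source trees.

    Assumes, as described in the thread, that every patch is made against
    a base 2.6.x release rather than against a dot-release.
    """
    def split(version):
        parts = version.split(".")
        return parts[:3], (parts[3] if len(parts) > 3 else None)

    (cur_base, cur_dot), (new_base, new_dot) = split(current), split(target)
    if cur_base[:2] != new_base[:2] or int(new_base[2]) < int(cur_base[2]):
        raise ValueError("only upgrades within one series are modelled")
    if current == target:
        return []

    steps = []
    if cur_dot is not None:
        # Get back to the base release first (patch -R).
        steps.append(f"revert patch-{current}")
    prefix = ".".join(cur_base[:2])
    for n in range(int(cur_base[2]) + 1, int(new_base[2]) + 1):
        steps.append(f"apply patch-{prefix}.{n}")
    if new_dot is not None:
        steps.append(f"apply patch-{target}")
    return steps


print(upgrade_steps("2.6.12.2", "2.6.12.3"))  # Gene's situation
print(upgrade_steps("2.6.12.3", "2.6.13"))    # the case Valdis describes
```

Gene's problem corresponds to skipping the revert step: applying the .3 patch on top of a tree that already had the .2 patch applied.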

8. Importing Older Kernel History From BitKeeper To git

26 Jul 2005 - 27 Jul 2005 (6 posts) Archive Link: "Linux BKCVS kernel history git import.."

Topics: Compression, Version Control

People: Linus Torvalds, David Woodhouse

Linus Torvalds said:

Ok, I'm uploading my current git CVS import results to kernel.org right now, which is my current best effort (meaning: I may try to improve on it even if there aren't any more cvsps bugs/features I have to fix, and obviously I'll re-create it if there _are_ cvsps or cvsimport bugs that cause the import to have problems).

I've "verified" it in the sense that I've done a "git-whatchanged -p" at various stages of the import, and it looked sane. I also compared doing a tar-tree-export of the 2.6.12-rc2 release, which exists both in my current git tree _and_ in the old bkcvs tree, and they compared identically apart from the fact that the bkcvs tree has the BitKeeper/ directory and a ChangeSet file.

It's also pretty aggressively packed - I used "--window=50 --depth=50" (rather than the default 10 for both) to make the archive smaller, so it's going to be somewhat more CPU-intensive to use (due to the possibly longer delta chains), but it got the pack-file down from 204MB to 166MB, which I think is pretty damn good for three years of history or whatever it is.

Especially considering that a gzip -9'd tar-file of the 2.6.12-rc2 release is 45MB all on its own, that archive is just 3.6 times a single tree.

Of course, this _is_ the cvs import, which means that it's basically just a straight-line linearization of the real BK history, but it's a pretty good linearization and so it's certainly useful.

If somebody adds some logic to "parse_commit()" to do the "fake parent" thing, you can stitch the histories together and see the end result as one big tree. Even without that, you can already do things like

git diff v2.6.10..v2.6.12

(which crosses the BK->git transition) by just copying the 166MB pack-file over, along with the tags that come with the thing. I've not verified it, but if that doesn't work, then it's a git bug. It _should_ work.

BIG NOTE! This is definitely one archive you want to "rsync" instead of cloning with a git repack. The unpacked archive is somewhere in the 2.4GB region, and since I actually used a higher compression ratio than the default, you'll transfer a smaller pack that way anyway.

It will probably take a while to mirror out (in fact, as I write this, the DSL upload just from my local machine out still has fifteen minutes to go), but it should be visible out there soonish. Please holler if you find any problems with the conversion, or if you just have suggestions for improvements.

It actually took something like 16 hours to do the conversion on my machine (most of it appears to have been due to CVS being slow, the git parts were quick), so I won't re-convert for any trivial things.

I'm planning on doing the 2.4 tree too some day - either as a separate branch in the same archive, or as a separate git archive, I haven't quite decided yet. But I was more interested in the 2.6.x tree (for obvious reasons), and before I do the 2.4.x one I'd like to give that tree some time for people to check if the conversion was ok.

One thing that could be verified, for example (but that I have _not_ done), is to do a few random "git diff v2.6.x..v2.6.y" and compare the result with the standard diffs that are out there. Just to verify that the archive looks ok. I assume there is some "diff-compare" out there that can handle the fact that the files are diffed in a different order (and with different flags) etc.
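The order-insensitive comparison Linus describes could be approximated along the following lines. This is only a rough sketch in Python, not an existing "diff-compare" tool: the function names `changes_by_file` and `same_changes` are made up for illustration, it assumes unified diffs whose paths carry one leading directory component (such as `a/` or `linux-2.6.10/`), and it compares only the added and removed lines, so differing context sizes or file order do not matter.

```python
from typing import Dict, List


def _clean(header_path: str) -> str:
    """Drop any timestamp after a tab and the leading directory component."""
    path = header_path.split("\t")[0].strip()
    if path == "/dev/null":
        return path
    return path.split("/", 1)[1] if "/" in path else path


def changes_by_file(diff_text: str) -> Dict[str, List[str]]:
    """Map each file path to the ordered list of its +/- lines.

    Limitation (assumption of this sketch): a removed line whose content
    starts with "-- " is indistinguishable from a "--- " file header here.
    """
    files: Dict[str, List[str]] = {}
    old_path = None
    current = None
    for line in diff_text.splitlines():
        if line.startswith("--- "):
            old_path = _clean(line[4:])
            current = None
        elif line.startswith("+++ ") and old_path is not None:
            new_path = _clean(line[4:])
            # For deleted files the new side is /dev/null; key on the old path.
            key = old_path if new_path == "/dev/null" else new_path
            current = files.setdefault(key, [])
            old_path = None
        elif current is not None and line.startswith(("+", "-")):
            current.append(line)
    return files


def same_changes(diff_a: str, diff_b: str) -> bool:
    """True if both diffs make the same per-file changes, in any file order."""
    return changes_by_file(diff_a) == changes_by_file(diff_b)
```

Feeding it the output of a "git diff" between two tags and the corresponding published patch would then report whether the two agree on every changed line; a real check would also have to cope with binary files, renames and mode changes, which this sketch ignores.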

Regarding the 'git diff v2.6.10..v2.6.12' command Linus posted, David Woodhouse remarked:

That's a bit of a hack which really doesn't belong in the git tools. It's not particularly hard to reparent the tree for real -- I'd much rather see a tool added to git which can _actually_ change the 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 commit to have a parent of 0bcc493c633d78373d3fcf9efc29d6a710637519, and ripple the corresponding SHA1 changes up to the current HEAD.

Note that the latter commit ID I gave there was actually the 2.6.12-rc2 commit in Thomas' history import, not your own. Thomas has done a lot of work on it, and it has the full names extracted from the shortlog script, full timestamps, branch/merge history and consistent character sets in the commit logs. I'd definitely suggest that you use that instead of the import from bkcvs.

http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=summary
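To illustrate what "ripple the corresponding SHA1 changes up to the current HEAD" involves, here is a toy model in Python. It is a sketch under stated assumptions rather than how git stores commits: each commit is just a tuple of parent IDs plus a text payload, the ID is a SHA1 over those, and the helper names (`commit_id`, `add_commit`, `reparent`) are invented for this example.

```python
import hashlib
from typing import Dict, Sequence, Tuple

# Toy commit: (parent ids, payload). Real git commits hold more fields.
Commit = Tuple[Tuple[str, ...], str]


def commit_id(parents: Sequence[str], payload: str) -> str:
    """Hash the parents together with the payload, so parents affect the ID."""
    data = "".join(f"parent {p}\n" for p in parents) + "\n" + payload
    return hashlib.sha1(data.encode()).hexdigest()


def add_commit(store: Dict[str, Commit], parents: Sequence[str], payload: str) -> str:
    """Store a commit under its computed ID and return that ID."""
    cid = commit_id(parents, payload)
    store[cid] = (tuple(parents), payload)
    return cid


def reparent(store: Dict[str, Commit], head: str, target: str,
             new_parents: Sequence[str]) -> str:
    """Give `target` new parents and re-hash every descendant up to `head`.

    Returns the new head ID. Commits that do not descend from `target`
    keep their IDs. Recursive, so only suitable for small toy histories.
    """
    rewritten: Dict[str, str] = {}

    def rewrite(cid: str) -> str:
        if cid in rewritten:
            return rewritten[cid]
        parents, payload = store[cid]
        if cid == target:
            parents = tuple(new_parents)
        else:
            parents = tuple(rewrite(p) for p in parents)
        rewritten[cid] = add_commit(store, parents, payload)
        return rewritten[cid]

    return rewrite(head)
```

Because every commit ID covers its parents' IDs, changing one parent link forces a new ID for that commit and for everything built on top of it, which is why a real reparenting would change the ID of every commit from the stitched point up to HEAD.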

Linus replied:

I used to think I wanted to, but these days I really don't. One of the reasons is that I expect to try to pretty up the old bkcvs conversion some time: use the name translation from the old "shortlog" scripts etc, and see if I can do some other improvements on the conversion (I think I'll remove the BK files - "ChangeSet" etc).

And it's really much easier and more general to have a "graft" facility. It's something that git can do trivially (literally a hook in "parse_commit" to add a special parent), and it's actually a generic mechanism exactly for issues like this ("project had old history in some other format").

Somebody already asked for having the import history for old historic patches - which we _do_ actually have as patches, but which obviously don't have any changelogs except for the version information. Most people may not want that, but the thing is, with a "graft" facility, the people who _do_ want that can easily see it all, and it is totally seamless.

So it's not even a one-time hack - it's a real feature that just in the kernel would have several cases we'd be able to use it for, and the same is likely true for almost any other project that wasn't started purely from git.

David said, "OK. That works and can also be used for the "fake _absence_ of parent" thing -- if I'm space-constrained and want only the history back to some relatively recent point like 2.6.0, I can do that by turning the 2.6.0 commit into an orphan instead of also using all the rest of the history back to 2.4.0." And Linus replied:

Yes. The grafting really should work pretty well for various things like this, and at the same time I don't think it's ever going to be a huge problem: people may have a couple of graft-points (if you want to drop history, you may well have more than one point you need to "cauterize": you may not be able to just cut it off at 2.6.0, since there may be merges further back in history), but I don't think it's going to explode and become unwieldy.

I just don't see people having more than a few trees that they might want to graft together, and while the "drop history" thing might cause more issues, even that is bounded by the amount of development parallelism, so while it probably causes more graft-points than the "join trees" usage, it should still be just a small handful of points.
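The graft idea discussed above, covering both the "join trees" and the "drop history" uses, can be modelled in a few lines of Python. This is a minimal sketch and not git's implementation: the commit names, the `recorded` parent table and the functions `parents_of` and `history` are hypothetical, standing in for a hook at the point where a commit's parents are read.

```python
from typing import Dict, List, Optional


def parents_of(commit: str, recorded: Dict[str, List[str]],
               grafts: Dict[str, List[str]]) -> List[str]:
    """Return grafted parents if a graft exists, else the recorded ones."""
    if commit in grafts:
        return grafts[commit]
    return recorded.get(commit, [])


def history(tip: str, recorded: Dict[str, List[str]],
            grafts: Optional[Dict[str, List[str]]] = None) -> List[str]:
    """List every commit reachable from `tip`, honouring graft points."""
    grafts = grafts or {}
    seen, order, stack = set(), [], [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(parents_of(commit, recorded, grafts))
    return order


if __name__ == "__main__":
    # Two separate histories: the git tree starts at a parentless root,
    # while the imported tree (hypothetical "bk-" names) goes further back.
    recorded = {
        "v2.6.12": ["v2.6.12-rc2"],
        "v2.6.12-rc2": [],
        "bk-v2.6.12-rc2": ["v2.6.10"],
        "v2.6.10": ["v2.6.0"],
        "v2.6.0": ["v2.4.0"],
        "v2.4.0": [],
    }
    join = {"v2.6.12-rc2": ["bk-v2.6.12-rc2"]}  # fake parent
    cut = {"v2.6.0": []}                         # fake absence of parent
    print(history("v2.6.12", recorded))
    print(history("v2.6.12", recorded, join))
    print(history("v2.6.12", recorded, {**join, **cut}))
```

Nothing stored is rewritten in this model: the grafts live in a separate table consulted at lookup time, so each person can choose a different set of graft points without any commit ID changing.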

9. Linux 2.4.32-pre2 Released

27 Jul 2005 - 28 Jul 2005 (5 posts) Archive Link: "Linux 2.4.32-pre2"

Topics: Compression, SMP, USB

People: Marcelo Tosatti, Larry Woodman, David S. Miller, Jakub Bogusz, Alan Stern, Pete Zaitcev

Marcelo Tosatti announced Linux 2.4.32-pre2, saying:

Here goes another -pre, after a long period.

A couple of USB corrections, a socket hashing bugfix and ipvs race condition, avoidance of rare inode cache SMP race.

And a zlib security update (erratic changelog for that one, my fault), whose CAN number is: CAN-2005-1849

Summary of changes from v2.4.32-pre1 to v2.4.32-pre2
============================================

Alan Stern:
file_storage and UHCI bugfixes

David S. Miller:
[NETLINK]: Fix two socket hashing bugs.

Jakub Bogusz:
[SPARC64]: fix sys32_utimes(somefile, NULL)

Larry Woodman:
workaround inode cache (prune_icache/__refile_inode) SMP races

Marcelo Tosatti:
Change VERSION to 2.4.32-pre2
Merge with rsync://rsync.kernel.org/.../davem/net-2.4.git
Revert [NETLINK]: Fix two socket hashing bugs.

Neil Horman:
[IPVS]: Close race conditions on ip_vs_conn_tab list modification

Pete Zaitcev:
usb: printer double up()

Tim Yamin:
Merge with rsync://rsync.kernel.org/.../davem/sparc-2.4.git/
The gzip description is as good as the ChangeLog says it is -: "Set n to

Sharon And Joy
 

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.