Table Of Contents
|1.||21 Jul 2004 - 22 Jul 2004||(93 posts)||Status Of 2.6 Development Style; DevFS To Be Gone In 2.8|
|2.||22 Jul 2004 - 31 Jul 2004||(61 posts)||New DumpFS API For RAS Components|
|3.||27 Jul 2004 - 2 Aug 2004||(12 posts)||Altix System Controller Interface To User-Space|
|4.||28 Jul 2004 - 3 Aug 2004||(40 posts)||Linux 2.6.8-rc2-mm1 Released|
|5.||29 Jul 2004 - 4 Aug 2004||(49 posts)||Allowing Non-Root User To mlock Memory|
|6.||30 Jul 2004 - 4 Aug 2004||(13 posts)||Token-Based Thrashing Control|
|7.||31 Jul 2004 - 1 Aug 2004||(3 posts)||Linux 2.4.27-rc4|
|8.||2 Aug 2004 - 5 Aug 2004||(55 posts)||Linux 2.6.8-rc2-mm2 Released|
|9.||3 Aug 2004 - 4 Aug 2004||(3 posts)||Linux 2.4.27-rc5 Released|
Mailing List Stats For This Week
We looked at 1724 posts in 9677K.
There were 434 different contributors. 241 posted more than once. 167 posted last week too.
The top posters of the week were:
1. Status Of 2.6 Development Style; DevFS To Be Gone In 2.8
21 Jul 2004 - 22 Jul 2004 (93 posts) Archive Link: "[PATCH] delete devfs"
Topics: FS: devfs
People: Jonathan Corbet, Andrew Morton
Greg KH posted a patch to remove DevFS from the official kernel tree, and there was a whopping big discussion about it. In the course of this discussion, Jonathan Corbet said that "Andrew's vision, as expressed at the summit, is that the mainline kernel will be the fastest and most feature-rich kernel around, but not, necessarily, the most stable. Final stabilization is to be done by distributors (as happens now, really), but the distributors are expected to merge their patches quickly." As for how this would affect DevFS, given the goal of not destabilizing 2.6, Andrew remarked:
mid-2005 would be an appropriate time to remove devfs. If that schedule pushes things along faster than they would otherwise have progressed, well, good.
Nothing is cast in stone here btw - we're pushing the envelope, trying new things, keeping that which works well and reexamining things which perhaps don't work so well. Feel free to disagree - we're listening.
Greg agreed to wait another year before removing DevFS, and Andrew affirmed that DevFS would be gone by the 2.8 release.
2. New DumpFS API For RAS Components
22 Jul 2004 - 31 Jul 2004 (61 posts) Archive Link: "Announce: dumpfs v0.01 - common RAS output API"
Topics: Compression, Executable File Format, Kexec, Ottawa Linux Symposium
People: Keith Owens, Andrew Morton, Eric W. Biederman, Suparna Bhattacharya
Keith Owens said:
Announcing dumpfs - a common API for all the RAS code that wants to save data during a kernel failure and to extract that RAS data on the next boot. The documentation file is appended to this mail.
ftp://oss.sgi.com/projects/kdb/download/dumpfs - current version is v0.01, patch against 2.6.8-rc2.
This is a work in progress, the code is not complete and is subject to change without notice.
dumpfs-v0.01 handles mounting the dumpfs partitions, including reliable sharing with swap partitions and clearing the dumpfs partitions. I am working on the code that reads and writes dumpfs data from kernel space, it is incomplete and has not been tested yet. After dumpfs_kernel is working, dumpfs_user is trivial. The code is proof of concept, some sections of the API (including polled I/O and data compression) are not supported yet, and some of the code is ugly.
Why announce incomplete and untested code? Mainly because RAS and kernel dumping are being discussed at OLS this week. Since I cannot be at OLS, this is the next best thing. Also the dumpfs API has stabilized for the first cut, so it is time to get more discussion on the API and to determine if it is worth continuing with the dumpfs approach. If dumpfs is discussed at OLS then I would appreciate any feedback.
Questions for the other people who care about RAS (which rules out most of the kernel developers) -
Obviously I think that this makes sense. At the moment every bit of RAS code has its own dedicated I/O mechanism, not to mention its own user space tools to interface with the kernel, and to initialize, extract and clear its own data.
dumpfs consolidates a lot of common code that is scattered over several RAS tools. dumpfs removes the need for special RAS tools to extract dump data on reboot, instead standard user space commands will do the job.
Making mount dumpfs share the partition with swap is ugly. OTOH most of the existing code that dumpfs is intended to replace makes no attempt to verify its partition usage. At least dumpfs tries to verify its partition data, ugly though the code is.
One obvious extension is to make compression selective, so that some sections of the file can be compressed and others be in clear text. The lcrash header springs to mind. Omitted for now since this version does not support compression yet.
One thing that is absolutely required for reliable RAS output is a polling mode method. netdump is available for the network, we need the equivalent for disk I/O. What is the best way to integrate polling mode I/O into the block device subsystem?
If the people who care about RAS think that a common RAS output API is worthwhile then I will continue working on dumpfs. Otherwise it will be just another idea that did not get taken up, and each RAS tool will continue to be developed and maintained in isolation.
dumpfs provides a common API for RAS components that need to dump kernel data during a problem. The dumped data is expected to be copied and cleared on the next successful boot.
dumpfs consists of two layers, with completely different semantics. These are dumpfs (kernel only) and dumpfs_user (user space view of any saved dump data).
dumpfs uses one mount for each dump partition. Each dumpfs partition can be mounted with option share or noshare, the default is noshare. The only allowable user space operations on a dumpfs partition are mount and umount, user space cannot directly access the dumpfs data. Each dumpfs partition is mounted with "mount -t dumpfs /dev/partition /mnt/dumpfs". /mnt/dumpfs must be a directory; it never contains anything useful but the mount semantics require a directory here.
A shared dumpfs partition will normally coexist with a swap partition; the dumpfs superblock is stored at an offset which leaves the swap signature alone. A shared dump partition has no superblock on disk until the first dump file is created. Mounting a dumpfs partition with "-o clear" will completely zero the dumpfs superblock, including the magic field. This ensures that old dumpfs data in a shared partition will not be used, its contents are unreliable because of the data sharing.
When mounting a shared dumpfs partition, no check is made to see if the disk contains a dumpfs superblock. Mounting a dumpfs partition with -o share will only share with a swap partition, it will not share with any other mounted partition.
A non-shared dumpfs partition must have a superblock before being mounted. mkfs.dumpfs and fsck.dumpfs (only used for non-shared partitions) are trivial. Mounting dumpfs with "-o noshare,clear" will clear the metadata in the dumpfs superblock, but preserve the magic field.
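# zero the first 64 KiB of the partition named in $1
dd if=/dev/zero of="$1" bs=64k count=1
# write 'dum0' at offset 64 KiB; conv=sync pads the block to a full 64 KiB
echo 'dum0' | dd of="$1" bs=64k seek=1 conv=sync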
Each dumpfs partition can be mounted with option poll or nopoll, the default is poll. Poll uses low level polled mode I/O direct to the partition, completely bypassing the normal interrupt driven code. This is done in an attempt to get the data out to disk even when the kernel is so badly broken that interrupts are not working. Poll requires that the device driver for the dumpfs partition supports polling mode I/O. Nopoll uses the standard kernel I/O mechanisms, so it is not guaranteed to work when the kernel is crashing. Nopoll should only be used when your device driver does not support polling mode I/O yet; you must accept that dumpfs may hang waiting for the I/O to be serviced.
Another option when mounting a dumpfs partition is to specify the size of its data buffer, in kibibytes. This buffer is permanently allocated as long as the dumpfs partition is mounted, it is only used when writing RAS data via dumpfs. The buffer size will be rounded up to a multiple of the kernel page size. The default is buffer=128.
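The mount options described so far (share or noshare, poll or nopoll, clear, and buffer=) combine in the usual comma-separated -o list. The following is only a sketch: the helper names dumpfs_mount_cmd and dumpfs_buffer_kib are hypothetical and not part of dumpfs, the first helper prints a command rather than running it, and the second merely shows what rounding a buffer size up to whole pages looks like.

```shell
#!/bin/sh
# Hypothetical helpers (not part of dumpfs). Option names and defaults are
# taken from the announcement; everything else is an assumption.

# Print the mount command for one dumpfs partition without running it.
# usage: dumpfs_mount_cmd DEVICE DIR [share|noshare] [poll|nopoll] [BUFFER_KIB] [clear]
dumpfs_mount_cmd() {
    # announced defaults: noshare, poll, buffer=128
    opts="${3:-noshare},${4:-poll},buffer=${5:-128}"
    if [ "${6:-}" = "clear" ]; then
        opts="$opts,clear"
    fi
    echo "mount -t dumpfs -o $opts $1 $2"
}

# Show the size a buffer would have after rounding up to whole pages.
# usage: dumpfs_buffer_kib BUFFER_KIB [PAGE_BYTES]
dumpfs_buffer_kib() {
    page=${2:-$(getconf PAGESIZE)}
    pages=$(( ($1 * 1024 + page - 1) / page ))   # ceiling division
    echo $(( pages * page / 1024 ))
}

dumpfs_mount_cmd /dev/partition /mnt/dumpfs
dumpfs_mount_cmd /dev/partition /mnt/dumpfs share nopoll 130 clear
dumpfs_buffer_kib 130 4096
```

With 4 KiB pages, a request for buffer=130 would end up as 132 KiB, while the default of 128 is already page aligned.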
The user space view of the RAS data held in the dumpfs partitions is created by "mount -t dumpfs_user none /mnt/dumpfs". It logically merges and validates all the dumpfs partitions that have been mounted and provides a user space view of the files that have been written to dumpfs. The only user space operations supported on dumpfs_user are llseek, read, readdir, open (read only), close and unlink. Just enough to copy the files out of dumpfs_user and remove them. User space cannot write to dumpfs_user.
The kernel can write to files held in dumpfs partitions, to save RAS data over a reboot. Note that when kernel RAS components write to dumpfs they do not use the normal VFS layer, it may not be working during a failure. Instead a RAS component makes direct calls to the following dumpfs_kernel functions.
Create and open for writing a file in dumpfs. It returns a file descriptor within dumpfs.
The dumpfs filename is constructed from "prefix-" followed by the value of xtime in the format CCYY-MM-DD-hh:mm:ss.n, where n starts at 0 and is incremented for each dumpfs file in the current boot.
There is no requirement that a dumpfs_user mount point exist before the kernel can dump its data. The first call to dumpfs_kernel_open will automatically create a kernel view that merges all the mounted dumpfs partitions. The first call to dumpfs_kernel_open also writes the dumpfs superblocks to any shared partitions.
Flags select compression, if any.
dumpfs_kernel_open() is the simple interface. It automatically stripes the data across all dumpfs partitions that are not currently being used.
Most RAS code will open one dump file at a time, mainly because most users will only have one dumpfs partition. The dumpfs code has a module_parm called dumpfs_max_open, with a default value of 1.
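The file name format described above ("prefix-" followed by CCYY-MM-DD-hh:mm:ss.n) can be illustrated from user space. This is a sketch only: the helper name dumpfs_file_name and the prefix "oops" are made up for the example, and the kernel builds the name itself from xtime.

```shell
#!/bin/sh
# Hypothetical helper: format a dumpfs-style file name from its parts.
# Pass plain decimal numbers without leading zeros (printf would read a
# leading 0 as octal); the helper adds the zero padding itself.
# usage: dumpfs_file_name PREFIX YEAR MONTH DAY HOUR MIN SEC N
dumpfs_file_name() {
    printf '%s-%04d-%02d-%02d-%02d:%02d:%02d.%d\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8"
}

# First dump file of the current boot (n = 0), then the second (n = 1)
dumpfs_file_name oops 2004 8 1 9 5 3 0
dumpfs_file_name oops 2004 8 1 9 5 3 1
```

The trailing counter keeps names unique even when two files are created within the same second of one boot.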
dumpfs_kernel_open_choose("prefix", flags, bdev_list)
Some platforms may need to have multiple output streams open in parallel. For example a system with large amounts of memory and multiple disks may wish to assign different sections of memory to each cpu and to write to separate partitions.
dumpfs_kernel_bdev_list() returns the list of usable dumpfs partitions. If all partitions are in use then the list is empty.
dumpfs_kernel_open_choose() opens a file using only the selected bdev entries.
Systems that use concurrent parallel dumps should set module_parm dumpfs_max_open to a suitable value.
Note: The following problems are inherently architecture and platform specific and are outside the scope of dumpfs. That is not to say that we should not have an API for handling these problems on large systems, but it would be a separate API from dumpfs.
Deciding which cpus to use for parallel dumping. Deciding which block devices each cpu should use. Getting the chosen cpus into the RAS code. Assigning the range of work to each cpu and each partition. Watching the dumping cpus for problems, recovering from those problems and reassigning the work to another cpu. Reconstructing the parallel dumps into a format for analysis. dumpfs_user makes each dump file available to user space, but some code may be required to merge the separate files together.
Sync the file's data to disk, close the file and update the dumpfs metadata.
dumpfs_kernel_write(fd, buffer, length)
Write the buffer at the current dumpfs file location. The data may or may not be written to disk immediately. It returns the current location, including the data that was just written.
For performance, the dumpfs data is striped over all the assigned partitions, in round robin. The stripe unit is the minimum of the buffer= value across all the assigned partitions.
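The striping rule can be worked through with a little arithmetic. The sketch below assumes the simplest reading of "round robin": stripe number k lands on partition k modulo the number of assigned partitions. The helper names are hypothetical, and the real on-disk layout may differ.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the striping rule described above.

# stripe_unit_kib BUFFER_KIB... : the smallest buffer= value is the stripe unit
stripe_unit_kib() {
    unit=$1
    for b in "$@"; do
        if [ "$b" -lt "$unit" ]; then
            unit=$b
        fi
    done
    echo "$unit"
}

# stripe_partition OFFSET_KIB UNIT_KIB NPARTS : 0-based index of the
# partition holding the stripe that contains OFFSET_KIB (assumed layout)
stripe_partition() {
    echo $(( ($1 / $2) % $3 ))
}

# Three partitions mounted with buffer=128, buffer=64 and buffer=256
unit=$(stripe_unit_kib 128 64 256)
for offset in 0 64 128 192; do
    echo "offset ${offset}K -> partition $(stripe_partition "$offset" "$unit" 3)"
done
```

Under these assumptions the stripe unit is 64 KiB and the four offsets map to partitions 0, 1, 2 and 0 again.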
dumpfs_kernel_read(fd, buffer, length)
Read the buffer from the current dumpfs file location. It returns the current location, including the data that was just read.
Set the current dumpfs file location. It returns the previous location. Only absolute seeking is supported.
Sync the file's data to disk and update the dumpfs metadata.
Returns true if any shared partitions have been dirtied, in which case the kernel must be rebooted after all the RAS components have completed their work.
Returns true if all dumpfs partitions can support polling mode I/O. Otherwise the RAS code that calls dumpfs should enable interrupts, if at all possible.
Sample /etc/fstab entries for dumpfs partitions.
Sample code in /etc/rc.sysinit to save dump data from the previous boot. If you are sharing dumpfs with swap, these commands must be executed before mounting swap. Note that dumpfs does not require any special user space tools to poke inside partitions to see if there is any useful data to save, everything is a file.
# mount all the dumpfs partitions
mount -a -t dumpfs
# merge all dumpfs into dumpfs_user on /mnt/dump
mount -t dumpfs_user none /mnt/dump
# copy the data out
(cd /mnt/dump; for f in `find -type f`; do echo saving $f; mv $f /var/log/dump; done)
# drop dumpfs_user
umount /mnt/dump
# clear all the dumpfs metadata
umount -a -t dumpfs
mount -a -t dumpfs -o clear
umount -a -t dumpfs
rc.sysinit will later mount the swap partitions, then mount all the other partition types. That will remount the dumpfs partitions, ready for the next kernel crash.
Regarding the question of how to get a clean API to do polling mode input/output to disk, Andrew Morton replied:
We hope to not have to. The current plan is to use kexec: at boot time, do a kexec preload of a small (16MB) kernel image. When the main kernel crashes or panics, jump to the kexec kernel. The kexec kernel will hold a new device driver for /dev/hmem through which applications running under the kexec'ed kernel can access the crashed kernel's memory.
Write the contents of /dev/hmem to stable storage using whatever device drivers are in the kexeced kernel, then reboot into a real kernel again.
That's all pretty simple to do, and the quality of the platform's crash dump feature will depend only upon the quality of the platform's kexec support.
People have bits and pieces of this already - I'd hope to see candidate patches within a few weeks. The main participants are rddunlap, suparna and mbligh.
Eric W. Biederman asked, "Does anyone have a proof of concept implementation? I have been able to find a little bit of time for this kind of thing lately and have just done the x86-64 port. (You can all give me a hard time about taking a year to get back to it :) I am in the process of breaking everything up into their individual change patches and doing a code review so I feel comfortable with sending the code to Andrew. So this would be a very good time for me to look at any code for reporting a crash dump with a kernel started with kexec." Suparna Bhattacharya replied, "Hari has a nice POC implementation - it might make sense for him to post it rightaway for you to take a look. Basically, in addition to hmem (oldmem), the upcoming kernel exports an ELF core view of the saved register and memory state of the previous kernel as /proc/vmcore.prev (remember your suggestion of using an ELF core file format for dump ?), so one can use cp or scp to save the core dump to disk. He has a quick demo, where he uses gdb (unmodified) to open the dump and show a stack trace of the dumping cpu."
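The user-space half of the kexec approach, as Suparna describes it, needs nothing more than a file copy. The sketch below assumes the /proc/vmcore.prev path named in the thread; the helper name save_vmcore, the destination directory and the output file naming are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy the previous kernel's ELF core view to disk.
# /proc/vmcore.prev is the path mentioned in the thread; the default
# destination and the file naming scheme are assumptions.
# usage: save_vmcore [SOURCE] [DEST_DIR]
save_vmcore() {
    src=${1:-/proc/vmcore.prev}
    dest_dir=${2:-/var/log/dump}
    if [ ! -r "$src" ]; then
        echo "no readable core at $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/vmcore.$(date -u +%Y%m%d%H%M%S)"
    # plain cp is enough because the dump is exported as an ordinary file
    cp "$src" "$out" && echo "$out"
}
# Afterwards an unmodified gdb can open the saved file, for example:
#   gdb vmlinux /var/log/dump/vmcore.<timestamp>
```

Because the core is just a file, scp works equally well when the dump should go straight to another machine.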
3. Altix System Controller Interface To User-Space
27 Jul 2004 - 2 Aug 2004 (12 posts) Archive Link: "[PATCH] Altix system controller communication driver"
People: Greg Howard, Christoph Hellwig, Jes Sorensen, Andrew Morton
Greg Howard said:
The following patch ("altix-system-controller-driver.patch") implements a driver that allows user applications to access the system controllers on SGI Altix machines. It applies on top of the 2.6.8-rc-mm1 patch.
Most of the patch is just the new file drivers/char/snsc.c. It allows system-controller-related applications (e.g., "flashsc" which flashes the system controller firmware) to forward data to SAL; SAL contains the code that multiplexes this system controller traffic with other such traffic (including console I/O). It's expected that each node will have a corresponding system controller device file, and each such device file can be used to open a number of "subchannels". The data structures and macros for the new driver are kept in a separate header file (snsc.h), since I anticipate eventually adding an additional file that will leverage some of this code to log environmental event notifications coming from the system controllers. Inline wrapper functions for the SAL services used by the driver have been added to include/asm-ia64/sn/sn_sal.h.
The only other significant (though small) change is in the Altix console driver, drivers/serial/sn_console.c. This driver must share an interrupt with snsc.c. A few config-related files are also patched (sn2_defconfig and drivers/char/[Kconfig,Makefile]).
Jes Sorensen and Andrew Morton offered some technical suggestions for the patch, and after a little back-and-forth, Greg posted an update. Christoph Hellwig and Andrew then offered further suggestions, and the thread petered out with Greg preparing another update.
4. Linux 2.6.8-rc2-mm1 Released
28 Jul 2004 - 3 Aug 2004 (40 posts) Archive Link: "2.6.8-rc2-mm1"
Topics: Disks: IDE, I2O, Kernel Build System, Kernel Release Announcement, Software Suspend, Version Control
People: Andrew Morton
Andrew Morton announced Linux 2.6.8-rc2-mm1, saying:
5. Allowing Non-Root User To mlock Memory
29 Jul 2004 - 4 Aug 2004 (49 posts) Archive Link: "[patch] mlock-as-nonroot revisted"
People: Arjan van de Ven, Andrea Arcangeli, Rik van Riel, Andrew Morton
Arjan van de Ven said:
Below is a fixed up patch to allow non-root to mlock memory (but only if the rlimit allows it, which defaults to 0). This is needed/useful for oracle and co to be allowed to mlock/use hugetlb fs running as non-privileged user. Also setting the limit to 4Kb can be very useful for gnupg and similar apps.
Compared to the previous revision of this patch; shm accounting has been changed to be per user struct, while keeping track of which user struct allocated the shm segment in the first place. This is done in order to avoid the security bug where one process/user could mlock and another munlock which would screw up the accounting.
Andrew Morton seemed to recall Andrea Arcangeli having some technical objections to a previous incarnation of the patch, and asked Andrea if these still held true; Andrea replied:
yep, the rlimit for mlocked stuff works only for the pagetables pinning. In turn it works perfectly for mlock. But shared memory or hugetlbfs obviously aren't pinned via the pagetables in the virtual address space, such things are persistent and non-swappable objects, even killing the task won't change a thing.
So as described some month ago such patch is insecure and conceptually flawed since they're using rlimits to control persistent objects that have absolutely nothing to do with the task itself, which in turns make the rlimit useless.
the very best one can do right now is the below.
If you remove the shm/hugetlbfs brokenness from the rlimit patch that will become a safe feature for mlock (for mlock it works fine since mlock is all about pinning the pagetables, not about persistent objects that have nothing to do with the task), but it won't change almost anything for oracle standpoint since it doesn't allow hugetlbfs usage anyways.
Rik van Riel said that Andrea's objections had already been addressed in the current patch, but Andrea said no, the patch was still broken. Andrea remarked, "I'm looking forward to the next fixed revision. I'm not against per-user myself, but it's not like doing it for transient memory or transient objects associated with the task itself. Furthermore I'm not convinced rlimits should be used for such persistent things that have nothing to do with running tasks but ok, I can live with it if it works." He and Rik argued about it for a while, but were not able to see eye to eye. The thread ended inconclusively.
6. Token-Based Thrashing Control
30 Jul 2004 - 4 Aug 2004 (13 posts) Archive Link: "[PATCH] token based thrashing control"
People: Rik van Riel, Andrew Morton, Con Kolivas, Song Jiang
Rik van Riel said:
The following experimental patch implements token based thrashing protection, using the algorithm described in:
When there are pageins going on, a task can grab a token, that protects the task from pageout (except by itself) until it is no longer doing heavy pageins, or until the maximum hold time of the token is over.
If the maximum hold time is exceeded, the task isn't eligible to hold the token for a while more, since it wasn't doing it much good anyway.
I have run a very unscientific benchmark on my system to test the effectiveness of the patch, timing how a 230MB two-process qsbench run takes, with and without the token thrashing protection present.
normal 2.6.8-rc6: 6m45s
2.6.8-rc6 + token: 4m24s
This is a quick hack, implemented without having talked to the inventor of the algorithm. He's copied on the mail and I suspect we'll be able to do better than my quick implementation ...
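The token rules in Rik's description boil down to a small decision made for the current holder. The following toy model is not the kernel patch: the function name, the tick-based timing and the three result strings are all assumptions chosen to make the rules concrete.

```shell
#!/bin/sh
# Toy model of the token rules Rik describes; not the kernel patch itself.
# All names and the tick-based timing are assumptions for illustration.
# usage: token_next_state HEAVY_PAGEINS HELD_TICKS MAX_HOLD_TICKS
#   HEAVY_PAGEINS: 1 if the holder is still paging in heavily, else 0
# prints one of: keep | release | release-ineligible
token_next_state() {
    if [ "$1" -eq 0 ]; then
        # no longer doing heavy pageins: the token is simply given up
        echo release
    elif [ "$2" -ge "$3" ]; then
        # maximum hold time is over: give up the token and sit out a while,
        # since holding it was not doing this task much good anyway
        echo release-ineligible
    else
        # still paging in heavily and within the hold time: stay protected
        echo keep
    fi
}

token_next_state 1 5 100     # keep
token_next_state 0 5 100     # release
token_next_state 1 100 100   # release-ineligible
```

While the result is "keep", the holder is protected from pageout by other tasks, though not from its own.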
The next day he replied to himself:
I've now also run day-long kernel compile tests, 3 times each with make -j 10, 20, 30, 40, 50 and 60 on my dual pIII w/ 384 MB and a 180 MB named in the background.
For make -j 10 through make -j 50 the differences are in the noise, basically giving the same result for each kernel.
However, for make -j 60 there's a dramatic difference between a kernel with the token based swapout and a kernel without.
normal 2.6.8-rc2: 1h20m runtime / ~26% CPU use average
2.6.8-rc2 + token: 42m runtime / ~52% CPU use average
Time to dig out a dedicated test machine at the office and do some testing with (RE-)AIM7, I wonder if the max number of users supported will grow...
Andrew Morton remarked, "OK. My test is usually around 50-60% CPU occupancy so we're not gaining in the moderate swapping range." Con Kolivas said:
We have some results that need interpreting with contest.
mem_load:
Kernel       [runs]  Time   CPU%  Loads  LCPU%  Ratio
2.6.8-rc2      4       78  146.2   94.5    4.7   1.30
2.6.8-rc2t     4      318   40.9   95.2    1.3   5.13
The "load" with mem_load is basically trying to allocate 110% of free ram, so the number of "loads" although similar is not a true indication of how much ram was handed out to mem_load. What is interesting is that since mem_load runs continuously and constantly asks for too much ram it seems to be receiving the token most frequently in preference to the cc processes which are short lived. I'd say it is quite hard to say convincingly that this is bad because the point of this patch is to prevent swap thrash.
It would get far more complicated to create a list of tasks trying to get the token and refuse to hand it back to the same task until it cycled through all the other tasks to prevent this... and I'm not even sure that would help since these are all short lived tasks... Any other thoughts?
To be honest I dont think this contest result is truly a bad thing...
Rik speculated, "It may be worth trying with a shorter token timeout time - maybe even keeping the long ineligibility ?" Con replied, "Give them a "refractory" bit which is set if they take the token? Next time they try to take the token unset the refractory bit instead of taking the token." Running with it, Con replied to himself, "Or take that concept even further; Give them an absolute refractory period where they cannot take the token again and a relative refractory bit which can only be reset after the refractory period is over." Song Jiang came in at this point, suggesting, "When there is memory competition among multiple processes, Which process grabs the token first is important. A process with its memory demand exceeding the total ram gets the token first and finally has to give it up due to a time-out would have little performance gain from token, It could also hurt others. Ideally we could make small processes more easily grab the token first and enjoy the benefits from token. That is, we want to protect those that are deserved to be protected. Can we take the rss or other available memory demand information for each process into the consideration of whether a token should be taken, or given up and how long a token is held." Rik replied:
I like this idea. I'm trying to think of a way to skew the "lottery" so small processes get an advantage, but the only thing I can come up with is as follows:
What do you think ?
To item 4, Song said:
So the score of each registered process, with or without token, is calculated periodically. After each calculation, a registered process with the highest score will take the token. So a process gives up its token in these 4 cases: (1) its page fault rate below a threshold (2) its score below a threshold; (3) it holds a token for too long time (4) it is done.
However, we have to avoid "token thrashing": a token is transferred among processes too frequently, which could actually create unnecessary additional page faults. So once a process gets the token, we can let it hold the token for at least a minimal period of time. The intention behind the score = time/size is very sound, but I am not sure how sensitive the performance is to the formula. We may need to tune it carefully to make it valid.
Which process will register itself? In my original design, I allow a process with any major page faults to take the token. However, I think now we should only allow the processes with their page fault rate higher than a threshold to register themselves. In this way we can limit the queue size.
To item 6, he said, "Do we need to periodically compare the scores of registered processes? If yes, that would take queueing complexity." Rik replied:
Hmmm, good points. And my "queue of one" idea has the danger of registering a process that doesn't want the token any more by the time it's handed off...
Maybe we should use the "time/size" score to influence the chance that a process gets to try and steal the token, in effect just modifying the odds.
After all, thrashing should be a relatively rare situation, so the code should be as low impact as possible...
The thread ended here.
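The "score = time/size" idea from this thread, in the variant where the registered process with the highest score takes the token, can be tried out on made-up numbers. This is a sketch only: the thread does not pin down what "time" and "size" measure, so the units here (for example seconds waited and resident pages) are assumptions, as is the function name.

```shell
#!/bin/sh
# Pick the registered process with the highest time/size score.
# Input lines on stdin: "PID TIME SIZE". The meaning of TIME and SIZE is
# an assumption; only the shape of the formula comes from the thread.
pick_token_holder() {
    awk '$3 > 0 {
             score = $2 / $3
             if (best == "" || score > best_score) {
                 best = $1
                 best_score = score
             }
         }
         END { if (best != "") print best }'
}

# Two processes with the same TIME but very different SIZE, plus a third
printf '%s\n' '101 30 6000' '102 30 200' '103 5 100' | pick_token_holder
```

Here the small process 102 wins over the much larger 101, which is the skew toward small processes that Song asked for. Rik's closing suggestion was softer than this: use the score to modify the odds of taking the token rather than to pick a strict winner.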
7. Linux 2.4.27-rc4
31 Jul 2004 - 1 Aug 2004 (3 posts) Archive Link: "Linux 2.4.27-rc4"
Topics: FS: JFS, USB
People: Marcelo Tosatti, Stephen Hemminger, David S. Miller, Adrian Bunk
Marcelo Tosatti announced Linux 2.4.27-rc4, saying:
Here goes the fourth 2.4.27 release candidate.
It includes a dozen USB fixes, a JFS update, IA64 fixes and a networking update, amongst others.
2.4.27 final should be out soon.
Adrian Bunk noticed that a Configuration.help entry for CONFIG_NET_SCH_NETEM had been left out of one of Stephen Hemminger's patches; he posted a fix, and David S. Miller accepted it.
8. Linux 2.6.8-rc2-mm2 Released
2 Aug 2004 - 5 Aug 2004 (55 posts) Archive Link: "2.6.8-rc2-mm2"
Topics: Kernel Release Announcement, Spam, Virtual Memory
People: Andrew Morton, Rik van Riel, Con Kolivas
Andrew Morton announced 2.6.8-rc2-mm2, saying:
Added Con's staircase CPU scheduler.
This will probably have to come out again because various people are still fiddling with the CPU scheduler. But my feeling here is that the current 1st-gen CPU scheduler has been tweaked as far as it can go and is still not 100% right. It is time to start thinking about a new design which addresses the requirements and current problems by algorithmic means rather than by tweaking. Removing over 300 lines from the scheduler is a good sign.
Feedback on this patch is sought.
Rik van Riel was excited that Andrew had added his token-based load control switch, and said:
I would really appreciate any testing results on this, both good and bad. I want to get this thing tuned and into a generally good shape for use by everybody upstream.
I'm especially interested in how this affects compute servers, desktops and heavily overloaded network servers (the "spamassassin slowed my system to a crawl" symptom would be one to test ;)).
I suspect the patch may need some tweaking to help interactivity in some cases, but maybe it'll already work magically by itself...
Hideo Aoki did some testing and reported back to Rik.
Con Kolivas was also excited to see his staircase scheduler in Andrew's kernel, and said, "Anyone with feedback on this please cc me. This was developed separately from the -mm series which has heaps of other scheduler patches which were not trivial to merge with so there may be teething problems. Good reports dont hurt either ;)" In a later post he remarked of his work, "The performance on both reaim and hackbench has always equalled or exceeded mainline."
9. Linux 2.4.27-rc5 Released
3 Aug 2004 - 4 Aug 2004 (3 posts) Archive Link: "Linux 2.4.27-rc5"
Topics: FS: XFS
People: Marcelo Tosatti, Geert Uytterhoeven
Marcelo Tosatti announced Linux 2.4.27-rc5, saying:
Here goes the fifth release candidate of kernel v2.4.27.
It includes a handful of XFS fixes and a network update (Bluetooth, Netfilter, bridge), and it reverts the problematic DVD-RW support for now (should be back in 2.4.28).
Most importantly this release fixes an exploitable race in file offset handling which potentially allows unprivileged users to read kernel memory.
This touches several drivers and generic proc code. This issue is covered by
Vendors should be releasing their updates real soon now.
Here are the most important security issues fixed by the 2.4.27 release:
CAN-2004-0495 (Al Viro sparse fixes)
CAN-2004-0497 (users could modify group ID of arbitrary files on the system)
CAN-2004-0535 (e1000 minor info leak)
CAN-2004-0685 (backported Conectiva usb sparse fixes)
CAN-2004-0415 (file offset pointer handling race)
CAN-2004-0565 (information leak ia64)
-final should be out in a few days if nothing bad shows up.
Geert Uytterhoeven reported a small bug that prevented the kernel from compiling with GCC 2.95; he posted a fix and Marcelo accepted it gratefully.
Sharon And Joy
Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.