Table Of Contents
|1.||1 May 2000 - 9 May 2000||(44 posts)||CAPP Conformance: The Saga Continues|
|2.||6 May 2000 - 10 May 2000||(58 posts)||Treatment Of Contributors|
|3.||7 May 2000 - 10 May 2000||(95 posts)||Kernel Versioning Discussion|
|4.||8 May 2000 - 11 May 2000||(18 posts)||PC Speaker Driver|
|5.||8 May 2000 - 13 May 2000||(70 posts)||Virtual Memory Problems Persist In Development Series|
|6.||9 May 2000 - 11 May 2000||(15 posts)||'eepro100' Driver Problems In Stable Series|
|7.||9 May 2000||(10 posts)||'/proc/' Bug In Latest Development Kernels|
|8.||10 May 2000 - 11 May 2000||(10 posts)||More On Kernel Versioning|
|9.||10 May 2000 - 11 May 2000||(7 posts)||Standard Kernel Or RTLinux For Real-Time Needs?|
Thanks go to Petr Vandrovec, for sending in a corrected URL for his patch in Issue #67, Section #12 (5 May 2000: VMWare Breaks Under Latest Development Kernels). The one I'd originally posted was incorrect. Thanks a lot, Petr!
Many thanks also go to Tom Davey, for catching a couple typos in last week's issue. Thanks a lot, Tom! ;-)
Mailing List Stats For This Week
We looked at 1319 posts in 5961K.
There were 460 different contributors. 204 posted more than once. 173 posted last week too.
The top posters of the week were:
1. CAPP Conformance: The Saga Continues
1 May 2000 - 9 May 2000 (44 posts) Archive Link: "Linus: [PATCH] (for 2.3.99pre6) audit_ids system calls"
Topics: Feature Freeze, Sound: OSS
People: Alexander S. Guy, Linda Walsh
Linda Walsh posted a patch to implement some initial elements of the code necessary for CAPP security compliance. See Issue #65, Section #2 (14 Apr 2000: Proposal: LUID For Secure Auditing) for the first coverage of this debate.
This week the discussion continued in much the same tone. Linda (speaking for SGI) argued that CAPP conformance was necessary for systems needing high security; but other folks raised many objections. Some reminded her that the kernel was in feature freeze, so this sort of thing should wait until 2.5; but Linda countered that it was important to get as much of the code into 2.4 as possible, so specialized CAPP patches could be smaller and more palatable to security-minded organizations. Another objection was that CAPP was a broken standard, and should be ignored. But Linda argued that it was "government issue", and had a lot of influence even if it wasn't perfect.
(ed.  This part of the debate is probably key to the eventual acceptance of the code into the kernel, but Linda does not seem to be convincing people of CAPP's integrity, try as she might. I suspect that she puts more weight on its status as an "official" standard than other folks on the list, who are concerned primarily with its technical merits.)
This led into another objection, which was that folks felt they were being asked to accept patches into the main kernel without really knowing what they were for or where they were really headed. As Alexander S. Guy put it at one point in the discussion, "I couldn't give a rat's ass about claiming certification. I want an architected security solution that is comprehensive, and actually functions." In response to this, Linda gave a lot of technical details of her proposed implementation, adding, "We have been letting folks know -- we are active in other security project lists, my manager has been on the speaking tour and mentioned more than once on slashdot -- the whole giving away the B1 on OSS has been mentioned more than once. I spoke in Florida last week and have 2 European speaking engagements in June, another one in August, and another unconfirmed possibility in July. Most of these talks (SGI "Linux" University) are open to the public, the others are at public conferences. Meanwhile we are trying to move while the movin' is good. :-)" She went on, "For the user level programs we are looking at distro industry partnering. We've already partnered w/RH in an earlier release to add in some I/O enhancements or something (I'm not sure of the exact details, 6.0 I think) for Oracle. We are partnering in the Trillian project for ia64 Linux and want to see Linux be a fully commercial player -- hopefully scaling up to the number of processors IRIX supports (currently 256P). My and my group's work on 'trusted' Linux is just another aspect of our desire to contribute and enhance Linux."
As in the previous discussion, Linda stood almost entirely alone throughout the thread, although it's clear that she's highly qualified. It'll be interesting to see how the debate progresses through 2.4 and 2.5.
2. Treatment Of Contributors
6 May 2000 - 10 May 2000 (58 posts) Archive Link: "[PATCH] address_space_operations unification"
People: Brian J. Murrell, Alexander Viro, David Parsons, Nathan Hand, Linus Torvalds
Roman V. Shaposhnick posted a patch that Linus Torvalds did not think much of. In fact, Linus flamed it and the mailing list that discussed it. Brian J. Murrell could not believe his eyes and analyzed the post to see if Linus' address had been forged. He exclaimed, "I would hate to think Linus would "rip anyone a new one" in this fashion. It certainly does not ring too well to the tune of "The Cathedral and the Bazaar"'s discussion on cultivating OpenSource programmers." Alexander Viro replied, "Nah, not a forgery. WTF, who decided that Linus is not allowed to flame? If you want to start linux-kernel-nicey-nicey - feel free to do so." David Parsons added, "Linus, thank goodness, does not hire style consultants to teach him how to follow the political fashions of the day. Linus quite cheerfully rips people new ones whenever a proposed patch is submitted that he cares enough to comment on its shortcomings." And Nathan Hand put in, "A leader with neither bark nor bite isn't worth following." Linus also explained:
Hey, the very firsts posts in the history of Linux were flames from me - read the historical flame-war bewteen me an Andy Tanenbaum some day. I get quite agitated sometimes.
But I do dislike flaming, even if it occasionally feels good to release a bit pressure over a patch I don't like. And I prefer flaming people that I know can take it (or that I really don't care about - a _really_ good flame can be quite catharctic ;). And for that reason I should, and will, apologize to Roman about the posting. Not because I suddenly started really liking his patch, but because I don't know if he has thick enough skin to just brush off my occasional bursts of emotion.
So Roman, my apologies. I won't promise it won't happen again, but let's discuss why you felt your patch was needed, ok?
Roman accepted the apology, and there followed a bit of technical discussion in which Linus seriously considered the merits of Roman's patch.
3. Kernel Versioning Discussion
7 May 2000 - 10 May 2000 (95 posts) Archive Link: "Future Linux devel. Kernels"
People: Marco Colombo
Ron Van Dam felt that new stable series were always "delayed" because of feature creep during the later stages of each development series. He suggested running two separate development trees concurrently. In the current case these would be 2.5 and 2.7: 2.5 would have specific, predefined goals not to be exceeded, while 2.7 would get all the fall-through features that hadn't been determined beforehand. That way 2.6 could be completed in a predictable amount of time. The discussion quickly veered off into other areas, but Marco Colombo gave links to a prior discussion, having the Subject: "Standard Development Integration" (http://boudicca.tux.org/hypermail/linux-kernel/2000week02/0292.html), continuing here (http://boudicca.tux.org/hypermail/linux-kernel/2000week03/0261.html), and here (http://boudicca.tux.org/hypermail/linux-kernel/2000week04/0055.html).
4. PC Speaker Driver
8 May 2000 - 11 May 2000 (18 posts) Archive Link: "PC speaker driver"
People: Ian Carr-de Avelon, Vladimir Dergachev, Richard B. Johnson, David L. Nicol
In the course of discussion, Ian Carr-de Avelon observed, "I think part of the general problem with the speaker driver is that progresively smaller cheaper speakers have been put in by the makers, so generally only older PCs give something approaching AM sound." Vladimir Dergachev replied, "Most computers that I saw have a small nice speaker (the sort you could see in radios couple decades ago). The problem however is not with speaker but with the way it is driven (on and off if I remember correctly). That is the speaker can either be told to emit a square wave at a certain frequency (which is used for beeps) or you can drive that square wave with cpu. Hence unlike sound cards that have from 256 to 65536 gradations (or more) pc speaker has 2." David L. Nicol asked if it were possible to control the tone by driving the speaker for very short periods of time, and Richard B. Johnson replied:
Yes. And my very first Linux-box had that software (version 0.99). It played music quite well right out of the tiny speaker.
When I got my first XT clone, one of the first things I did was to make it 'talk'. There were no 'wav' files, I sampled a mike preamp at 10 kHz, made an 8-bit A/D converter using the printer-port, captured the resulting data into a file, then I could play it back into the speaker by varying a 10kHz pulse-width from timer channel 2 (the one connected to the speaker). All the source-code and the schematic was published on my BBS in the 80's.
Even though the distortion was probably greater than 10%, and the 10KHz rate with within audible range (the speaker becomes a LPF), the voice sounded quite okay.
Now practically everybody has BOOM-BOX Audio boards. I still haven't bought one.
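The pulse-width trick Richard describes can be sketched roughly as follows. This is only an illustration, not his published code: the function names are hypothetical, the 1,193,182 Hz timer input clock is the usual PC value rather than something stated in the thread, and the scaling rule is an assumption. Each 8-bit sample is turned into an "on" time, measured in timer ticks, within one 10 kHz carrier period.

```python
# Hypothetical sketch: map 8-bit audio samples to pulse widths for a
# two-state (on/off) speaker driven at a fixed carrier rate.

PIT_HZ = 1_193_182  # assumed input clock of the PC interval timer


def carrier_period_ticks(carrier_hz=10_000):
    """Number of timer ticks in one carrier period."""
    return round(PIT_HZ / carrier_hz)


def pulse_widths(samples, carrier_hz=10_000):
    """Convert 8-bit samples (0..255) to 'on' times in timer ticks.

    A louder sample keeps the speaker on for a larger share of each
    carrier period; the speaker's own inertia smooths the result.
    """
    period = carrier_period_ticks(carrier_hz)
    widths = []
    for sample in samples:
        if not 0 <= sample <= 255:
            raise ValueError("samples must be 8-bit values")
        # Clamp to at least one tick so the timer always gets a non-zero count.
        widths.append(max(1, (sample * period) // 256))
    return widths


if __name__ == "__main__":
    print(carrier_period_ticks())
    print(pulse_widths([0, 128, 255]))
```

With only about 119 ticks per period at 10 kHz, the effective resolution is closer to 7 bits than 8, which fits Richard's remark about audible distortion.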
5. Virtual Memory Problems Persist In Development Series
8 May 2000 - 13 May 2000 (70 posts) Archive Link: "[PATCH] Recent VM fiasco - fixed"
Topics: Big Memory Support, Sound: ALSA, Virtual Memory
People: Zlatko Calusic, Rik van Riel, Linus Torvalds, Paul Barton-Davis, Mikael Grahn, Yoann Vandoorselaere, Andrea Arcangeli, Christoph Rohland, Simon Kirby, James H. Cloos
Zlatko Calusic finally had enough of bad virtual memory behavior in recent development kernels, and cited some history:
He posted a patch, and explained, "this patch mostly *removes* cruft recently added, and returns to the known state of operation. After that is achieved it is then easy to selectively add good things I might have removed, and change behaviour as wanted, but I would like to urge people to test things thoroughly before releasing patches this close to 2.4." Rik van Riel pointed out that his patch was broken for high memory machines, and remarked, "Think of a 1GB machine which has a 16MB DMA zone, a 950MB normal zone and a very small HIGHMEM zone. With the old VM code the HIGHMEM zone would be swapping like mad while the other two zones are idle." Zlatko replied that now he understood what the VM code was aiming for, but complained, "still, optimizing for 1GB, while at the same time completely killing performances even *usability* for the 99% of users doesn't look like a good solution, does it?" Linus Torvalds replied:
I'll make a new pre7 that has a lot of the simplifications discussed here over the weekend, and seems to work for me (tested both on a 512MB setup and a 64MB setup for some sanity).
This pre7 almost certainly won't be all that perfect either, but gives a better starting point.
The discussion continued, and at one point Paul Barton-Davis remarked:
The only usable 2.3 kernel for me (doing professional hard disk audio recording with Linux) has been 2.3.51. I switched to the 2.3.99 series because ALSA switched to matched new isapnp interfaces amongst other things, so 2.3.51 was no longer an option.
But its worse: because i got tired of 2.3.99pre-anything's performance, I switched to 2.2.15. This kernel seems almost as bad as 2.3.99pre6 when it comes to disk i/o performance. Stuff that used to work fine on 2.2.10 just dies with horrendous performance problems on 2.2.15. I am not sure if its kswapd or not - I just gave up with a distinct sense of frustration that such a basic function - lots and lots of disk i/o - was now broken in 2.2.15 as well!
I will be applying Andrea's classzone24 patch tonight to a pre7 kernel, but I am, frankly, worried that the work done on the VM system between 2.3.51 and 2.3.99 might have been a complete mistake that no-one seems able or willing to back out of. I see what Rik has said about his & Linus' attempts, and it sounds good, but I am bothered that neither of the two current "stable" and "development" kernels can do what 2.2.10 or 2.3.51 could do. This seems pretty grave.
A bit elsewhere, Linus suggested, "Try out the really recent one - pre7-8. So far it has some good reviews, and I've tested it both on a 20MB machine and a 512MB one.." But Simon Kirby, Niels Kristian Bech Jensen, Florin Andrei, Christoph Rohland and James H. Cloos Jr. all reported little or no improvement. Linus felt there might be some bad interaction with some other code, but he couldn't pinpoint it.
Later, Linus announced pre7-9, but folks reported no improvement whatsoever. At one point Linus commented, "I think Ingo already posted a very valid concern about high-memory machines, and there are other issues we should look at. I just want to be in a position where we can look at the code and say "we do X because Y", rather than a collection of random tweaks that just happens to work." The discussion went on for a while, as Linus and others got down and dirty with the code, but no clear solution emerged.
However, under the Subject: VM performance... (http://kernelnotes.org/lnxlists/linux-kernel/lk_0005_02/msg00308.html) , Mikael Grahn reported, "I just thought i would give my word on the VM performance problems. There has been alot of talk around this problem the last week. New pre patches come and it doesn't seem to help. Linus said that the new pre7 patch would make it better. I applied it and not surprised i noticed no change. There is however some light in the darkness. The nice classzone patch by Andrea Arcangeli is very nice ! It has killed all the performance problems over here. Now i can at last do heave I/O work and still use the computer at the same time :) Great work Andrea !" Luca Montecchiani confirmed this experience, as did Florin. Florin added, though, that VMWare would die with the classzone patch. He and others acknowledged that this wasn't really the kernel's concern, and Yoann Vandoorselaere put it, "I think vmware contain many hack and is very dependant on the kernel, which make it very sensible to whatever kernel change. If the application is in cause, it need to be fixed... not the kernel ( so if including the classzone patch is good for the kernel VM gestion ( what i think ), we should include it, right ?) ." The thread ended there.
6. 'eepro100' Driver Problems In Stable Series
9 May 2000 - 11 May 2000 (15 posts) Archive Link: "eepro100 probs"
People: Tim Hockin, Andrew Morton, Henning P. Schmiedehausen, Alan Cox, Matthew Kirkwood
Matthew Kirkwood was getting a lot of stalls from the eepro100 driver under 2.2.15pre7 with Red Hat 6.2; Alan Cox recommended trying the driver from 2.2.14, and reporting to Andrey Savochin regardless of the result. Anthony J. Biacco confirmed Matthew's report, adding that he'd had problems with the driver on all stable kernels up to 2.2.14, but he went on to say that 2.2.15pre18 seemed to fix it. Elsewhere, Tim Hockin reported:
We're having lots of trouble with eepro100 and Cisco switches - apparently after some quiescent period, the eepro stops responding until outgoing traffic is generated. We're working on isolating it, but I don't know if it is actually an eepro issue, or a cisco issue yet. Just thought I'd throw it out for anyone to add a "me too!", or more info.
Also, we're seeing some strange behavior with eepro100 and a Linksys Etherfast switch - the tx line stays on constantly, as if it is trying to negotiate, but negotiation is done. Again, trying to track it.
Matthew replied that his machine was on a Cisco Catalyst switch, but that the problems he observed were different from what Tim had described. Matthew would regularly see stalls half-way through downloading web pages. Nothing came of this, but Andrew Morton also replied to Tim, saying:
Interesting. There are several identical reports for the 3com drivers. The so-called "sleepy NIC" problem, where a NIC which is completely idle for 20-30 minutes loses its ability to receive.
Some people have resorted to a cron job which pings another machine once every ten minutes.
I wonder if it is due to the switch? Or perhaps some Linux problem above the device driver? Or common to the MII transceiver h/w or driver?
I'll shake a few more details out of the 3com uers who have observed this behaviour.
There was a small amount of technical discussion, and Henning P. Schmiedehausen threw into the soup:
Just to annoy you a little more, I have lots and lots of Linux boxes with 3C905B (3c59x driver) and 3c905C (3C90X driver from 3COM), some EEPro 100 and some tulip boxes behind Cisco 2924XL switches and none of them show these symptoms from 2.0.35-2.0.39 and 2.2.1 to 2.2.16pre2.
The Ciscos do need some tweaking, though, on some of the boxes I had to disable auto-negoiation.
No clarity came through the list.
7. '/proc/' Bug In Latest Development Kernels
9 May 2000 (10 posts) Archive Link: "[bug-2.3.99-pre7-8] running fuser leaks mnt_count of /proc"
People: Alexander Viro, Tigran Aivazian
A one-day bug hunt. Tigran Aivazian reported that running 'fuser' under 2.3.99pre7 or pre8 would cause the mount count of '/proc/' to grow by 1 at each 'fuser' invocation. Alexander Viro said he'd look into it, then 3 and a half hours later puzzled, "How quaint... chdir("/proc/self/fd/"); gets the process into the state where it will correctly deal with further chdir() calls, but fail to release fs_struct (contents?) upon the exit. It looks like a change of some state: been there once and that's it - you are doomed. WTF??? More coffee needed - it's getting seriously weird..." Tigran replied privately, saying he would look into it as well, now that Alexander had narrowed it down. They went back and forth a bit, and Alexander posted some patches. At one point, Tigran said, "once we chdir("/proc/self/fs") (or any directory name containing that e.g. "/proc/self/fd/../..") our fs->count gets incremented one extra time so not only we leak /proc's mnt_count but also root's, i.e. the whole chunk of code in __put_fs_struct() is never executed. So, the question is - why/where do we increment fs->count the extra time?" And to one of Alexander's patches, he replied, "yes, that fixed all of it, i.e. both root and pwd not have correct mnt_count and /proc umounts just fine.."
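The mechanism Tigran describes, one stray increment of fs->count preventing the release code from ever running, can be modelled with a toy reference counter. This is a loose illustration only: the class and method names below are hypothetical and are not the kernel's data structures or code.

```python
# Toy model (not kernel code): an object that drops its references to two
# mounts only when its own count reaches zero.

class Mount:
    def __init__(self, name):
        self.name = name
        self.mnt_count = 1  # the mount itself holds one reference


class FsStruct:
    def __init__(self, root, pwd):
        self.count = 1
        self.root, self.pwd = root, pwd
        root.mnt_count += 1  # reference held for the root directory
        pwd.mnt_count += 1   # reference held for the working directory

    def put(self):
        """Drop one reference; release the mounts only on the last one."""
        self.count -= 1
        if self.count == 0:
            self.root.mnt_count -= 1
            self.pwd.mnt_count -= 1


def run_process(root, pwd, stray_increment):
    fs = FsStruct(root, pwd)
    if stray_increment:
        fs.count += 1  # stands in for the unexplained extra increment
    fs.put()           # process exit


if __name__ == "__main__":
    root, proc = Mount("/"), Mount("/proc")
    run_process(root, proc, stray_increment=False)
    print(root.mnt_count, proc.mnt_count)  # back to the starting values
    run_process(root, proc, stray_increment=True)
    print(root.mnt_count, proc.mnt_count)  # both counts have leaked
```

In the model, every "process" that hits the stray increment leaves both mount counts one higher, which matches the symptom of the count growing by 1 per 'fuser' run.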
8. More On Kernel Versioning
10 May 2000 - 11 May 2000 (10 posts) Archive Link: "Version numbering proposal (2.5.x.xx)"
Topics: Code Freeze, Feature Freeze, USB
People: Deven T. Corzine, Michael Poole
Deven T. Corzine had a long proposal for regularizing kernel version numbers so people could get an idea of where the development process was at any given time. To do this, he proposed adding an additional number to the version numbers of the development series. He summarized the current method, in which x.even.y represented kernels aiming for stability, while x.odd.y represented kernels adding new features and revamping bad ideas. To illustrate his idea, he took the next series as an example:
Final release candidates -- No architectural, new features or feature changes allowed at all. Bugfixes ONLY; final tuning before 2.6.xx stable release series. Final release candidates should be almost suitable for production use on mission-critical systems, as any stable series release should be. (This depends on getting 2.5.8.xx used on some production systems first...)
The 2.5.9.xx series should REPLACE the traditional initial stable series stabilization efforts. The final release in this series should be re-released as 2.6.0 and 126.96.36.199 with no changes but the version number -- if more bugfixes are needed, it's not time yet. Only when it's time to fork for a new development series should the stable series be declared. (This should avoid embarassments like 2.2.0 -- a "stable" release that crashed rather easily...)
Someone pointed out that this didn't really reflect the way the kernel was actually developed. He pointed out that generally the unstable series tended to add a bunch of patches and become really unstable, then stabilize for a bit, then add more patches, then stabilize some more, and so on. Deven replied that his new scheme would at least help keep people informed as to where the kernel was supposed to be. Michael Poole described his own observations of actual development:
I've been around to see the 2.1.xx and 2.3.xx trees be started, develop, and mature. In both cases there was 'feature freeze', 'code slush', and 'code freeze' .. followed by a bunch of new features getting in, and going back to a 'feature slush' before progressing to 'code freeze' state. 2.1.xx even repeated this cycle, having THREE points (that I remember) where it entered 'code freeze.' The reversions from 'code freeze' to an earlier state usually allowed newer patches to be incorporated even if they had not been accepted during the 'code freeze' state.
Also, if you think about it, different parts of the kernel aren't necessarily at the same point in their respective development cycles, although they are roughly synchronized. For example, even in 2.3.xx's first code slush, new USB drivers were being added and the architecture being revised because the USB support was otherwise very lacking.
If those two reasons aren't enough, I find the idea of having four parts to a version number extremely annoying. It's much easier and more practical to continue describing the releases as Linus has done in the past: with external descriptions of the criteria for changes being made. Sometimes these criteria do follow the 'open development,' 'feature freeze,' 'code slush,' 'code freeze,' 'release' progression. More often, they don't (even if one of those terms was used to describe the release).
As an example, take the stable kernel series: New stuff gets in, although the stability criteria are much higher for these additions. It wasn't really until 2.2 was out that 2.0 became bugfix-only; likewise, 2.2.15 adds some new features even as 2.4 seems to be knocking on the door.
There was not much support for Deven's idea, and the thread petered out.
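For reference, the existing convention Deven summarized (x.even.y for stable kernels, x.odd.y for development kernels) can be expressed in a few lines. This is a minimal sketch with a hypothetical function name; it looks only at the second field, so it says nothing about the extra number Deven proposed.

```python
def classify_series(version):
    """Classify a kernel version string by the parity of its second field.

    Suffixes such as 'pre6' sit in later fields and are ignored here.
    """
    fields = version.split(".")
    minor = int(fields[1])
    return "stable" if minor % 2 == 0 else "development"


if __name__ == "__main__":
    for v in ("2.2.15", "2.3.99pre6", "2.4.0", "2.5.9.xx"):
        print(v, classify_series(v))
```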
9. Standard Kernel Or RTLinux For Real-Time Needs?
10 May 2000 - 11 May 2000 (7 posts) Archive Link: "System response time for Linux"
Topics: PCI, Real-Time: RTLinux
People: Victor Yodaiken, Peter Monta, Alan Cox
Ling Su had a PCI device driver that required response times of approximately 50 microseconds. He asked if Linux could handle this on something like an 800 MHz Pentium III, or if he should look into RTLinux. Victor Yodaiken recommended RTLinux (http://www.rtlinux.com), but added, "Linux is actually amazingly good normally, so if you need "typical" response within 50us then you could probably do without RTLinux. But if you need, "worst case" to be less than 50us it's not a problem unless you need to do a 40us processing every 50us." Peter Monta agreed that "if you can tolerate occasional few-millisecond delays, the plain kernel may suffice," otherwise RTLinux was the way to go, he said. Later, Alan Cox also put in, "We don't try to be 'seriously realtime' in paticular we don't deal with priority inversion and kernel pre-emption. If it hurts when you miss deadlines look at RtLinux."
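Victor's caveat about doing 40us of processing every 50us is essentially a budget argument: the worst-case delay before the handler starts, plus the work itself, has to fit inside one period. A back-of-the-envelope check might look like this. The function names are hypothetical, and the 15us worst-case latency used in the example is an invented figure for illustration, not a measurement from the thread.

```python
def fits_deadline(period_us, work_us, worst_case_latency_us):
    """True if the work still finishes within the period even when the
    handler starts as late as the worst-case latency allows."""
    return worst_case_latency_us + work_us <= period_us


def cpu_share(period_us, work_us):
    """Fraction of each period spent doing the periodic work."""
    return work_us / period_us


if __name__ == "__main__":
    # 40us of work every 50us leaves only 10us of slack for latency.
    print(fits_deadline(50, 40, 15), cpu_share(50, 40))
    # 10us of work every 50us tolerates the same assumed latency easily.
    print(fits_deadline(50, 10, 15), cpu_share(50, 10))
```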
Sharon And Joy
Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.