<?xml version="1.0" ?>

<kc>

<title>Kernel Traffic</title>

<author contact="mailto:zbrown@tumblerings.org">Zack Brown</author>

<issue num="169" date="02 Jun 2002 23:00:00 -0800" />

<stats posts="1315" size="6084" contrib="352" multiples="178" lastweek="159">

<person posts="86" size="230" who="Alan Cox " />
<person posts="72" size="468" who="Martin Dalecki " />
<person posts="43" size="141" who="Linus Torvalds " />
<person posts="40" size="105" who="&quot;David S. Miller&quot; " />
<person posts="39" size="257" who="Vojtech Pavlik " />
<person posts="35" size="109" who="Pavel Machek " />
<person posts="31" size="185" who="Andrew Morton " />
<person posts="30" size="112" who="Rusty Russell " />
<person posts="29" size="215" who="Pavel Machek " />
<person posts="24" size="114" who="Christoph Hellwig " />
<person posts="20" size="71" who="Tomas Szepe " />
<person posts="16" size="60" who="William Lee Irwin III " />
<person posts="15" size="128" who="Jan Kara " />
<person posts="15" size="43" who="Alexander Viro " />
<person posts="13" size="70" who="&quot;Peter T. Breuer&quot; " />
<person posts="13" size="56" who="Arnaldo Carvalho de Melo " />
<person posts="13" size="42" who="Russell King " />
<person posts="12" size="49" who="Andre Hedrick " />
<person posts="12" size="45" who="&quot;J.A. Magallon&quot; " />
<person posts="12" size="44" who="Robert Love " />
<person posts="12" size="35" who="Dave Jones " />
<person posts="9" size="28" who="Denis Vlasenko " />
<person posts="8" size="57" who="Mark Gross " />
<person posts="8" size="50" who="Jens Axboe " />
<person posts="8" size="26" who="Greg KH " />
<person posts="8" size="24" who="" />
<person posts="8" size="23" who="Kasper Dupont " />
<person posts="7" size="45" who="Matthias Andree " />
<person posts="7" size="42" who="OGAWA Hirofumi " />
<person posts="7" size="42" who="Steven Whitehouse " />
<person posts="7" size="28" who="William Jhun " />
<person posts="7" size="24" who="Austin Gonyou " />
<person posts="7" size="23" who="Sebastian Droege " />
<person posts="7" size="20" who="James Simmons " />
<person posts="7" size="20" who="" />
<person posts="7" size="19" who="Benjamin LaHaise " />
<person posts="7" size="19" who="Andi Kleen " />
<person posts="6" size="27" who="george anzinger " />
<person posts="6" size="21" who="Wolfgang Wegner " />
<person posts="6" size="20" who="Dave McCracken " />
<person posts="6" size="19" who="&quot;H. Peter Anvin&quot; " />
<person posts="6" size="19" who="&quot;Thomas 'Dent' Mirlacher&quot; " />
<person posts="6" size="17" who="Roman Zippel " />
<person posts="6" size="16" who="Pete Zaitcev " />
<person posts="6" size="15" who="David Woodhouse " />
<person posts="5" size="87" who="Diego Calleja " />
<person posts="5" size="86" who="&quot;Gross, Mark&quot; " />
<person posts="5" size="32" who="Stephen Rothwell " />
<person posts="5" size="29" who="Tim Schmielau " />
<person posts="5" size="26" who="Robert Love " />
<person posts="5" size="22" who="Lionel Bouton " />
<person posts="5" size="20" who="Dipankar Sarma " />
<person posts="5" size="20" who="David Mosberger " />
<person posts="5" size="17" who="&quot;Randy.Dunlap&quot; " />
<person posts="5" size="17" who="" />
<person posts="5" size="15" who="Xavier Bestel " />
<person posts="5" size="15" who="Tom Rini " />
<person posts="5" size="15" who="Riley Williams " />
<person posts="5" size="14" who="" />
<person posts="5" size="14" who="Thunder from the hill " />
<person posts="5" size="14" who="Aaron Sethman " />
<person posts="4" size="67" who="Ed Tomlinson " />
<person posts="4" size="33" who="Frederik Nosi " />
<person posts="4" size="19" who="Brian Gerst " />
<person posts="4" size="19" who="A Guy Called Tyketto " />
<person posts="4" size="18" who="&quot;Nick Evgeniev&quot; " />
<person posts="4" size="16" who="Mike Fedyk " />
<person posts="4" size="16" who="Nathan Scott " />
<person posts="4" size="15" who="Zlatko Calusic " />
<person posts="4" size="15" who="Larry McVoy " />
<person posts="4" size="14" who="David Brownell " />
<person posts="4" size="13" who="&quot;Albert D. Cahalan&quot; " />
<person posts="4" size="13" who="Borsenkow Andrej " />
<person posts="4" size="12" who="Keith Owens " />
<person posts="4" size="12" who="Dan Kegel " />
<person posts="4" size="12" who="Oleg Drokin " />
<person posts="4" size="12" who="Neale Banks " />
<person posts="4" size="12" who="Jeremy White " />
<person posts="4" size="11" who="" />
<person posts="4" size="11" who="&quot;Stephen C. Tweedie&quot; " />
<person posts="4" size="10" who="Meelis Roos " />
<person posts="4" size="10" who="Frank Davis " />
<person posts="4" size="9" who="Erik McKee " />
<person posts="3" size="107" who="Anders Gustafsson " />
<person posts="3" size="74" who="Anton Altaparmakov " />
<person posts="3" size="42" who="&quot;Anthony R.&quot; " />
<person posts="3" size="25" who="Andrea Arcangeli " />
<person posts="3" size="20" who="&quot;Richard B. Johnson&quot; " />
<person posts="3" size="18" who="Ravikiran G Thirumalai " />
<person posts="3" size="16" who="Daniel Jacobowitz " />
<person posts="3" size="13" who="Paul P Komkoff Jr " />
<person posts="3" size="12" who="&quot;Kirk&quot; " />
<person posts="3" size="12" who="Corin Hartland-Swann " />
<person posts="3" size="12" who="jw schultz " />
<person posts="3" size="11" who="Andi Kleen " />
<person posts="3" size="11" who="Myrddin Ambrosius " />
<person posts="3" size="11" who="Ben Greear " />
<person posts="3" size="10" who="&quot;Mike Black&quot; " />
<person posts="3" size="10" who="Andreas Dilger " />
<person posts="3" size="10" who="Peter Chubb " />
<person posts="3" size="10" who="Neil Brown " />
<person posts="3" size="10" who="Kai Germaschewski " />
<person posts="3" size="10" who="Kai Germaschewski " />
<person posts="3" size="9" who="Ruth Ivimey-Cook " />
<person posts="3" size="9" who="Ingo Oeser " />
<person posts="3" size="9" who="Marcelo Tosatti " />
<person posts="3" size="9" who="Nivedita Singhvi " />
<person posts="3" size="9" who="Arnaud Launay " />
<person posts="3" size="8" who="Matt Bernstein " />
<person posts="3" size="8" who="Hugh Dickins " />
<person posts="3" size="8" who="Ian Molton " />
<person posts="3" size="8" who="John Levon " />
<person posts="3" size="8" who="Padraig Brady " />
<person posts="3" size="8" who="Bill Davidsen " />
<person posts="3" size="7" who="Gert Vervoort " />
<person posts="3" size="7" who="Zwane Mwaikambo " />
<person posts="3" size="6" who="" />
<person posts="2" size="85" who="John Kacur " />
<person posts="2" size="48" who="Manik Raina " />
<person posts="2" size="17" who="Hilbert Barelds " />
<person posts="2" size="17" who="d_vangreg " />
<person posts="2" size="13" who="sean darcy " />
<person posts="2" size="13" who="Paul Clements " />
<person posts="2" size="12" who="&quot;Gryaznova E.&quot; " />
<person posts="2" size="11" who="&quot;Stephen J. Gowdy&quot; " />
<person posts="2" size="10" who="Eric Seppanen " />
<person posts="2" size="10" who="Andreas Mohr " />
<person posts="2" size="10" who="&quot;Eloy A. Paris&quot; " />
<person posts="2" size="9" who="Andrey Panin " />
<person posts="2" size="9" who="Sven Koch " />
<person posts="2" size="9" who="&quot;Herman Oosthuysen&quot; " />
<person posts="2" size="8" who="&quot;Todd R. Eigenschink&quot; " />
<person posts="2" size="8" who="Colin Slater " />
<person posts="2" size="8" who="John Weber " />
<person posts="2" size="8" who="Miles Lane " />
<person posts="2" size="8" who="Terje Eggestad " />
<person posts="2" size="8" who="Jorge Nerin " />
<person posts="2" size="8" who="Abraham vd Merwe " />
<person posts="2" size="7" who="Erich Focht " />
<person posts="2" size="7" who="&quot;Petr Vandrovec&quot; " />
<person posts="2" size="7" who="Luca Barbieri " />
<person posts="2" size="7" who="Frank van Maarseveen " />
<person posts="2" size="7" who="&quot;Nix N. Nix&quot; " />
<person posts="2" size="7" who="Gunther Mayer " />
<person posts="2" size="7" who="Marc SCHAEFER " />
<person posts="2" size="7" who="Jan-Benedict Glaw " />
<person posts="2" size="6" who=" (Linus Torvalds)" />
<person posts="2" size="6" who="Peter Osterlund " />
<person posts="2" size="6" who="Jaroslav Kysela " />
<person posts="2" size="6" who=" (Nick Holloway)" />
<person posts="2" size="6" who="Florian Hars " />
<person posts="2" size="6" who="Josh Fryman " />
<person posts="2" size="6" who="Pierre Rousselet " />
<person posts="2" size="6" who="Kristian Peters " />
<person posts="2" size="6" who="Chris Friesen " />
<person posts="2" size="6" who="James Bottomley " />
<person posts="2" size="6" who="Peter W&#228;chtler " />
<person posts="2" size="6" who="Marc-Christian Petersen " />
<person posts="2" size="6" who="Ivan Kokshaysky " />
<person posts="2" size="6" who="&quot;Kevin P. Fleming&quot; " />
<person posts="2" size="6" who="Christopher Yeoh " />
<person posts="2" size="6" who="Alexander Trotsai " />
<person posts="2" size="6" who="Colin Gibbs " />
<person posts="2" size="5" who="Juan Quintela " />
<person posts="2" size="5" who="Frank Schaefer " />
<person posts="2" size="5" who="Geert Uytterhoeven " />
<person posts="2" size="5" who="Daniel Phillips " />
<person posts="2" size="5" who="Muli Ben-Yehuda " />
<person posts="2" size="5" who="Richard Gooch " />
<person posts="2" size="5" who="Skip Ford " />
<person posts="2" size="5" who="&quot;Sergey Kubushin&quot; " />
<person posts="2" size="5" who="Pawel Kot " />
<person posts="2" size="5" who="Chris Wedgwood " />
<person posts="2" size="5" who="Paul Mackerras " />
<person posts="2" size="5" who="Keith Whitwell " />
<person posts="2" size="4" who="Shanti Katta " />
<person posts="2" size="4" who="Edgar Toernig " />
<person posts="2" size="4" who="Jeff Garzik " />
<person posts="1" size="60" who="Georgi Chorbadzhiyski " />
<person posts="1" size="53" who="&quot;Hank Yang&quot; " />
<person posts="1" size="48" who="Helio Fujimoto " />
<person posts="1" size="35" who="&quot;Jonathan B. Horen&quot; " />
<person posts="1" size="30" who="Matteo Rinaudo " />
<person posts="1" size="27" who="mgross " />
<person posts="1" size="12" who="" />
<person posts="1" size="11" who="&quot;Guillaume Boissiere&quot; " />
<person posts="1" size="8" who="Sandy Harris " />
<person posts="1" size="8" who="Keith Thompson " />
<person posts="1" size="8" who="" />
<person posts="1" size="7" who="Samuel Flory " />
<person posts="1" size="7" who="&quot;Vamsi Krishna S.&quot; " />
<person posts="1" size="7" who="Petr Vandrovec " />
<person posts="1" size="7" who="Andr&#233; Bonin " />
<person posts="1" size="7" who="jay " />
<person posts="1" size="7" who="Michael Barker " />
<person posts="1" size="5" who="Rob Landley " />
<person posts="1" size="5" who="&quot;D.J. Barrow&quot; " />
<person posts="1" size="5" who="&quot;Vamsi Krishna S .&quot; " />
<person posts="1" size="5" who="Jon Grimm " />
<person posts="1" size="5" who="Marcel Holtmann " />
<person posts="1" size="4" who="Andre Bonin " />
<person posts="1" size="4" who="" />
<person posts="1" size="4" who="Hanna Linder " />
<person posts="1" size="4" who="James Yonan " />
<person posts="1" size="4" who="David Howells " />
<person posts="1" size="4" who="&quot;Holzrichter, Bruce&quot; " />
<person posts="1" size="4" who="Jakob &#216;stergaard " />
<person posts="1" size="4" who="&quot;Stefan M. Brandl&quot; " />
<person posts="1" size="4" who="Abdij Bhat " />
<person posts="1" size="4" who="Scorpion " />
<person posts="1" size="4" who="&quot;Paul G. Allen&quot; " />
<person posts="1" size="4" who="Michael Sinz " />
<person posts="1" size="4" who="Karim Yaghmour " />
<person posts="1" size="4" who="Jan Kara " />
<person posts="1" size="4" who="Ben Collins " />
<person posts="1" size="3" who="&quot;Steve Best&quot; " />
<person posts="1" size="3" who="Wiktor Wodecki " />
<person posts="1" size="3" who="Paul Larson " />
<person posts="1" size="3" who="Clemens Schwaighofer " />
<person posts="1" size="3" who="Ghozlane Toumi " />
<person posts="1" size="3" who="Gregoire Favre " />
<person posts="1" size="3" who="Jean Tourrilhes " />
<person posts="1" size="3" who="Joel Jaeggli " />
<person posts="1" size="3" who="Marco Colombo " />
<person posts="1" size="3" who="&quot;Udo A. Steinberg&quot; " />
<person posts="1" size="3" who="Marcus Sundberg " />
<person posts="1" size="3" who="Daniel Jacobowitz " />
<person posts="1" size="3" who="Andreas Roedl " />
<person posts="1" size="3" who="Nathan " />
<person posts="1" size="3" who=" (Kai Henningsen)" />
<person posts="1" size="3" who="Aschwin Marsman - aYniK Software Solutions " />
<person posts="1" size="3" who="Alex Brotman " />
<person posts="1" size="3" who="James Bottomley " />
<person posts="1" size="3" who="&quot;Adar Dembo&quot; " />
<person posts="1" size="3" who="Yogesh Swami " />
<person posts="1" size="3" who="&quot;Adam J. Richter&quot; " />
<person posts="1" size="3" who="&quot;DR VICTOR   UBA&quot; " />
<person posts="1" size="3" who="Benjamin Herrenschmidt " />
<person posts="1" size="3" who="Adrian Bunk " />
<person posts="1" size="3" who="&quot;Miquel van Smoorenburg&quot; " />
<person posts="1" size="3" who="David Lang " />
<person posts="1" size="3" who="Chris " />
<person posts="1" size="3" who="J Sloan " />
<person posts="1" size="3" who="David Gibson " />
<person posts="1" size="3" who="Aaron Lehmann " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="&quot;Fran&#231;ois Leblanc&quot; " />
<person posts="1" size="3" who="Marc Wilson " />
<person posts="1" size="3" who=" (Dagfinn Ilmari Manns&#229;ker)" />
<person posts="1" size="3" who="&quot;Shipman, Jeffrey E&quot; " />
<person posts="1" size="3" who="&quot;Stephane Charette&quot; " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Thierry Vignaud " />
<person posts="1" size="3" who="Joseph Mathewson " />
<person posts="1" size="3" who="Robert Schwebel " />
<person posts="1" size="3" who="Joe Thornber " />
<person posts="1" size="3" who="&quot;WWW.BAU-CENTER.COM&quot; " />
<person posts="1" size="3" who="Rui Sousa " />
<person posts="1" size="3" who="Ron Gage " />
<person posts="1" size="3" who="&quot;Timothy E. Jedlicka - wrk&quot; " />
<person posts="1" size="3" who="Petr Titera " />
<person posts="1" size="3" who="Marc-Christian Petersen " />
<person posts="1" size="3" who="Miles Bader " />
<person posts="1" size="3" who="Steven Augart " />
<person posts="1" size="3" who="&quot;Takuya Satoh&quot; " />
<person posts="1" size="3" who="Miles Bader " />
<person posts="1" size="3" who="Hanna V Linder " />
<person posts="1" size="3" who="Olaf Dietsche " />
<person posts="1" size="3" who="Joseph Pingenot " />
<person posts="1" size="3" who="Guest section DW " />
<person posts="1" size="3" who="Roger Luethi " />
<person posts="1" size="3" who="Rik van Riel " />
<person posts="1" size="3" who="Christoph Rohland " />
<person posts="1" size="3" who="&quot;Frederic Lochon (crazyfred)&quot; " />
<person posts="1" size="3" who="Tony Hoyle " />
<person posts="1" size="3" who="Frank v Waveren " />
<person posts="1" size="3" who="&quot;Jon Hedlund&quot; " />
<person posts="1" size="3" who="Mikael Pettersson " />
<person posts="1" size="3" who="David Weinehall " />
<person posts="1" size="2" who="Petro " />
<person posts="1" size="2" who="John Alvord " />
<person posts="1" size="2" who="&quot;Bloch, Jack&quot; " />
<person posts="1" size="2" who="Melchior FRANZ " />
<person posts="1" size="2" who="jfm3 " />
<person posts="1" size="2" who="Thomas Schenk " />
<person posts="1" size="2" who="Mike Jagdis " />
<person posts="1" size="2" who="Jochen Suckfuell " />
<person posts="1" size="2" who="Osamu Tomita " />
<person posts="1" size="2" who="Roland Dreier " />
<person posts="1" size="2" who="Christian Gennerat " />
<person posts="1" size="2" who="&quot;will fitzgerald&quot; " />
<person posts="1" size="2" who="Michael Dunsky " />
<person posts="1" size="2" who="Der Herr Hofrat " />
<person posts="1" size="2" who="Gianni Tedesco " />
<person posts="1" size="2" who="Trond Myklebust " />
<person posts="1" size="2" who="Karthik Thirumalai " />
<person posts="1" size="2" who="Chris Wright " />
<person posts="1" size="2" who="Go Taniguchi " />
<person posts="1" size="2" who="willy tarreau " />
<person posts="1" size="2" who="coody " />
<person posts="1" size="2" who="&quot;Martin J. Bligh&quot; " />
<person posts="1" size="2" who="Stelian Pop " />
<person posts="1" size="2" who="&quot;Simen Timian Thoresen&quot; " />
<person posts="1" size="2" who="Itai Nahshon " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Michael Clark " />
<person posts="1" size="2" who="&quot;Muthal Sangam&quot; " />
<person posts="1" size="2" who="Wayne Whitney " />
<person posts="1" size="2" who="Giuliano Pochini " />
<person posts="1" size="2" who="Christer Weinigel " />
<person posts="1" size="2" who="john slee " />
<person posts="1" size="2" who="Alastair Stevens " />
<person posts="1" size="2" who=" (Henrique de Moraes Holschuh)" />
<person posts="1" size="2" who="DervishD " />
<person posts="1" size="2" who="Martin Diehl " />
<person posts="1" size="2" who="J Sloan " />
<person posts="1" size="2" who="Justyna Bia&#322;a " />
<person posts="1" size="2" who="&quot;Christopher E. Brown&quot; " />
<person posts="1" size="2" who="GertJan Spoelman " />
<person posts="1" size="2" who="Michail Rusinov " />
<person posts="1" size="2" who="Francois Romieu " />
<person posts="1" size="2" who="Otto Wyss " />
<person posts="1" size="2" who="John Bullock " />
<person posts="1" size="2" who="Lars Marowsky-Bree " />
<person posts="1" size="2" who="&quot;Christian.Gennerat&quot; " />
<person posts="1" size="2" who="Christian Borntr&#228;ger " />
<person posts="1" size="2" who="&quot;Mohammad A. Haque&quot; " />
<person posts="1" size="2" who="Marcus Meissner " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Emmanuel Michon " />
<person posts="1" size="2" who="Bartlomiej Zolnierkiewicz " />
<person posts="1" size="2" who="Sebastian Szonyi " />
<person posts="1" size="2" who="Helge Hafting " />
<person posts="1" size="2" who="&quot;Peter J. Braam&quot; " />
<person posts="1" size="2" who="&quot;Hileman, Larry&quot; " />
<person posts="1" size="2" who="Rainer Ellinger " />
<person posts="1" size="2" who=" (khromy)" />
<person posts="1" size="2" who="Thomas Capricelli " />
<person posts="1" size="2" who="Stephan von Krawczynski " />
<person posts="1" size="2" who="Hayden James " />
<person posts="1" size="2" who="Urban Widmark " />
<person posts="1" size="2" who="Jes Sorensen " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Vinicius " />
<person posts="1" size="2" who="&quot;chen, xiangping&quot; " />
<person posts="1" size="2" who="Tobias Ringstrom " />
<person posts="1" size="2" who="samson swanson " />
<person posts="1" size="2" who="Matt Simonsen " />
<person posts="1" size="2" who="Pierre Cloutier " />
<person posts="1" size="2" who="Christoph Lameter " />
<person posts="1" size="1" who="Linux Kernel Newbie " />

</stats>

<section
  title="Multithreaded Core Dump Support For 2.5 And 2.4"
  subject="PATCH Multithreaded core dump support for the 2.5.14 (and 15) kernel."
  archive=""
  posts="33"
  startdate="13 May 2002 11:17:55 -0800"
  enddate="23 May 2002 14:27:23 -0800"
>
<topic>Debugging</topic>
<topic>Scheduler</topic>
<topic>Virtual Memory</topic>

<mention>Erich Focht</mention>

<p>Mark Gross posted a patch to implement multithreaded core dump support in
2.5.14 and 2.5.15 kernels. He explained, <quote who="Mark Gross">This work
has been tested on the 2.5.14 kernel using a few pthread applications to dump
core, from SIGQUIT and SIGSEV.  This unit test has been done on both 2 and 4
way systems.  Further, some stress testing has been done where, the core files
have been created while the system is under schedule stress from the chat
room benchmark running while creating the core files.  This implementation
seems to be quit stable under a busy scheduler, YMMV.</quote></p>

<p>Erich Focht asked how the patch would handle the case in which the suspended
thread happened to be in kernel mode, a possibility in the 2.5 kernels. A
couple of posts later, Vamsi Krishna S. replied, <quote who="Vamsi Krishna S.">if
a thread happens to be in kernel mode when some other thread is dumping core
(capturing register state of other threads, to be more accurate) then we
would capture the _user mode_ register of that thread from the bottom of
it's kernel stack. GDB will show back trace untill the thread entered kernel
(int 0x80), eip will be pointing to the instruction after the system call
(return address).</quote> Pavel Machek thought he had found a problem with
this. He described a deadlock scenario, <quote who="Pavel Machek">Thread 1 is in
kernel and holds lock A. You need lock A to dump state.  When you move 1
to phantom runqueue, you loose ability to get A and deadlock.</quote> But
Mark replied:</p>

<quote who="Mark Gross">

<p>Any pending tasklet / bottom half + top half get processes by the
real CPU's even thought the I/O bound process may have been moved to the
phantom run queue.  Its just that for the suspended processes sitting on
the phantom queue this processing stops with the call to try_to_wake_up,
until the process is moved back onto a run queue with a CPU.</p>

<p>The only way I can see what your talking about happening is for some
kernel code (or driver) to grab a lock and then hold it across a call to
one of the sleep_on functions pending some I/O.</p>

<p>Any driver that holds a lock across any sleep_on call I think is abusing
locks and needs adjusting.</p>

<p>Nothing prevents someone writing a driver that abuses locks.</p>

<p>If you know of such a case I need to worry about or there is another way
for this design to get into trouble please let me know.</p>

</quote>
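
<p>Pavel's scenario can be modelled in user space. The following is only a
sketch under stated assumptions, not code from Mark's patch: a Python
threading.Lock stands in for lock A, an Event stands in for the phantom run
queue, and the function name dump_while_holder_suspended is hypothetical. The
dumper uses a bounded wait so that the sketch reports the stall instead of
hanging the way a real deadlock would.</p>

<pre>
```python
import threading


def dump_while_holder_suspended(timeout=0.2):
    """Model a dumper that needs lock A while A's holder is suspended.

    Returns (got_while_suspended, got_after_resume).
    """
    lock_a = threading.Lock()
    holding = threading.Event()  # set once thread 1 owns lock A
    resume = threading.Event()   # stands in for leaving the phantom run queue

    def thread_one():
        with lock_a:
            holding.set()
            resume.wait()        # "suspended" while still holding A

    worker = threading.Thread(target=thread_one)
    worker.start()
    holding.wait()

    # The dumper needs A, but the only thread that can release it is parked.
    got_while_suspended = lock_a.acquire(timeout=timeout)
    if got_while_suspended:
        lock_a.release()

    # Resuming thread 1 lets it finish and release A.
    resume.set()
    worker.join()
    got_after_resume = lock_a.acquire(timeout=timeout)
    if got_after_resume:
        lock_a.release()

    return got_while_suspended, got_after_resume
```
</pre>

<p>Under these assumptions the call returns (False, True): the dumper cannot
take A while its holder is parked, and can once the holder runs again. With
an unbounded wait, the first attempt would never return.</p>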

<p>Pavel and Andi Kleen both took exception to Mark's statement that no driver
should hold a lock across a sleep_on() call. As Andi put it, <quote who="Andi
Kleen">That's true for spinlocks, but not for semaphores. The mm layer and
the vfs layer both use semaphores extensively and sleep with them hold,
also some other subsystems (like networking) use sleeping locks.</quote>
He, Mark, and a couple of other folks went back and forth on this for a few
posts. At first, Mark was unconvinced there was a problem, but after a
while he started to see some areas that would need to be fixed. He posted
a new patch, and said:</p>

<quote who="Mark Gross">

<p>After some investigations I concluded that the
down_write(current-&gt;mm-&gt;mmap_sem) in elf_core_dump was to protect
crashing multithreaded applications from dumping corrupted and possibly illegal
mm data due to the actions of the other, still running, thread processes.</p>

<p>As my patch has these thread processes suspended on the phantom run queue
we don't need to grab this semaphore in elf_core_dump any more.</p>

<p>However; we did see another potential issue on 4+ way systems with 3
or more processes of the same thread group entering suspend threads at
about the same time.  In tcore_suspend_threads between the release of the
spin locks and the calls to set_cpus_allowed, one of the other, crashing,
thread processes could move this task, currently in set_cpus_allowed, to
the phantom queue before it returns.  (a bad thing)</p>

<p>I've put in a fix for this possibility by doing down_write/up_write
on current-&gt;mm-&gt;mmap_sem for the scope of tcore_suspend_threads.
This also as the benefit of stopping VM operations for the thread group
until the thread group process are suspended.</p>

<p>This updated tcore patch has been tested on 2 and 4 way i386 systems,
dumping core for pthread applications with 300+ thread process, while running
the chat room benchmark.  It "seems" stable.</p>

</quote>

<p>This apparently resolved the objections folks had raised. A brief
discussion of integration followed, along with some talk of other problems
that might still arise.</p>

<p>At one point Mark remarked that he hoped to have the patch working on
2.4 soon as well.</p>

</section>

<section
  title="Improving Virtual Memory Balancing"
  subject="[RFC][PATCH] using page aging to shrink caches"
  archive=""
  posts="8"
  startdate="17 May 2002 20:10:51 -0800"
  enddate="29 May 2002 04:01:16 -0800"
>
<topic>Virtual Memory</topic>

<p>Ed Tomlinson posted a patch, and explained, <quote who="Ed Tomlinson">I
have never been happy with the way slab cache shrinking worked.  This is an
attempt to make it better.</quote> Benjamin LaHaise clapped Ed on the back,
and said, <quote who="Benjamin LaHaise">Thank you!  This is should help
greatly with some of the vm imbalances by making slab reclaim part of the
self tuning dynamics instead of hard coded magic numbers.  Do you have any
plans to port this patch to 2.5 for inclusion?  It would be useful to get
testing in the 2.5 before merging in 2.4.</quote> Over the next few days,
Ed replied with updated versions of the patch.</p>

</section>

<section
  title="Backward Compatibility"
  subject="Quota patches"
  archive=""
  posts="28"
  startdate="20 May 2002 05:55:31 -0800"
  enddate="24 May 2002 10:41:25 -0800"
>
<topic>Backward Compatibility</topic>
<topic>Executable File Format</topic>

<mention>Jan Kara</mention>

<p>In the course of discussing some new patches for disk quotas, Linus
Torvalds suggested removing some specific code that only provided backward
compatibility to older kernels. He asked, <quote who="Linus Torvalds">Are
there _any_ reasons to use the old stuff, if the fix is just to upgrade to
a newer quota tool?</quote> And Alan Cox added, <quote who="Alan Cox">Most
people use 2.4 with quota tools and 32bit uid quota already, so its not much
of a breakage at all. The 2.4 quota base code is unusable in the real world
so the problem got settled by the vendor trees.</quote></p>

<p>Jan Kara said he'd send a patch to Linus to remove the extra code, and
Martin Dalecki suggested, <quote who="Martin Dalecki">If we can do it for
quota - we could possible remove the IPC_OLD variant away as well. It's
looong overdue by now, becouse the IPC_OLD was not standard conformant
anyway.</quote> But Alan replied that it was:</p>

<quote who="Alan Cox">

<p>More code that takes almost no space, ensures old systems still work and
old XFree86 still runs on new kernels. Why remove it ?</p>

<p>If you want to design a mathematically elegant and small ultra clean
OS go do it. Linux however has to work in the real world not in the happy
clueless world of pure mathematical elegance.</p>

</quote>

<p>Martin replied, <quote who="Martin Dalecki">It is an illusion to think
that you can actually run *that old* a.out binaries on a modern kernel
I think.</quote> And Christoph Hellwig rejoined, <quote who="Christoph
Hellwig">Of course you can.  Even the latest OpenLinux release (shipping
2.4.13-ac) uses a libc4/a.out based installer fo space reasons.  Not to
forget the old quake1 binary from some redhat 4.x CD I run from time to time
:)</quote> Martin was impressed to hear that these things actually worked, and
Alan told him he should have tested the idea before making his suggestion. Alan
added, <quote who="Alan Cox">It btw goes beyond Libc4. Currently we have
almost 100% compatibility back to libc 2.2.2. The dated libc before that
doesn't work because we dropped some very very early obscure versions of a
few syscalls.</quote></p>

<p>At this point Christoph remarked, <quote who="Christoph Hellwig">For
2.5 I have some plans to make obsolete syscalls depend on CONFIG_COMPAT_*,
this allows to compile big and bloated kernel for compatiblity and smaller
kernels without that (e.g. for embedded devices).  And in fact we have
quite a loft of cruft that can go away for setups only having very modern
userspace..</quote> Martin and Alan both approved of this idea, and Alan added,
<quote who="Alan Cox">For embedded you also want config options to remove the
block layer and so forth. I'd been thinking about a set of options buried in
a config menu item like "Fine tune configuration for small/embedded devices"
CONFIG_SMALL.</quote></p>

</section>

<section
  title="Status Of /dev/port"
  subject="Linux-2.5.17"
  archive=""
  posts="153"
  startdate="20 May 2002 21:16:35 -0800"
  enddate="27 May 2002 01:07:46 -0800"
>
<topic>Development Strategy</topic>
<topic>FS</topic>
<topic>Kernel Release Announcement</topic>

<mention>Paul Mackerras</mention>
<mention>David S. Miller</mention>
<mention>Paul Rusty Russell</mention>

<p>Linus Torvalds announced 2.5.17 and there was a ton of discussion about
it. In one subthread, Martin Dalecki posted a patch to completely get rid of
the /dev/port interface. He argued:</p>

<quote who="Martin Dalecki">

<p>

<ol>

<li>It is not usable with ports which require 4 byte access.</li>

<li>The same can be achieved by using capabilities and su bits and so on.</li>

<li>__m68000__ doesn't even implement it and most other non i386 archs
"implement" it but apparently don't even care about endianess issues.</li>

<li>It's not standard.</li>

<li>seek() + port access is "racy" with respect to multiple usage.</li>

<li>Nothing is using it.</li>

</ol>

</p>

<p>... and so on and so on ...</p>

<p>And finally, kernel size with it:</p>

<pre>   text    data     bss     dec     hex filename
1480587  243280  259628 1983495  1e4407 vmlinux</pre>

<p>kernel size without it:</p>

<pre>[root@kozaczek linux]# size vmlinux
   text    data     bss     dec     hex filename
1480229  243184  259628 1983041  1e4241 vmlinux</pre>

<p>Which means a saving of 454 bytes :-).</p>

</quote>
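
<p>Martin's figure can be checked against the two size listings he posted. The
sketch below is a minimal one: the helper name parse_size_line is hypothetical,
and the two input strings are copied from the listings above.</p>

<pre>
```python
def parse_size_line(line):
    """Split one data line of `size` output into named integer fields."""
    text, data, bss, dec, _hex, filename = line.split()
    return {
        "text": int(text),
        "data": int(data),
        "bss": int(bss),
        "dec": int(dec),
        "filename": filename,
    }


# Data lines copied from the two listings in Martin's post.
with_port = parse_size_line("1480587  243280  259628 1983495  1e4407 vmlinux")
without_port = parse_size_line("1480229  243184  259628 1983041  1e4241 vmlinux")

# Per-segment difference, in bytes, between the two builds.
savings = {
    key: with_port[key] - without_port[key]
    for key in ("text", "data", "bss", "dec")
}
print(savings)  # {'text': 358, 'data': 96, 'bss': 0, 'dec': 454}
```
</pre>

<p>The 454 bytes split into 358 bytes of text and 96 bytes of data, with bss
unchanged.</p>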

<p>Paul Mackerras and David S. Miller both thought removing /dev/port was a
fantastic idea, but there was some speculation that Martin would be flamed to
Hell and back for making the suggestion. Several hours later folks started
expressing surprise at the absence of the expected flame-war. But at one point
Alan Cox did point
out, <quote who="Alan Cox">The /dev/port interface is used by various apps
and its a traditional x86 in paticular unix thing. For platforms like ARM
its poorly implemented since it ought to turn into a fraction of /dev/mem and
support mmap for speedier user space in/out emulation..</quote> Martin raised
an eyebrow at this; he'd thought /dev/port was entirely Linux-specific. But
Alan went on, <quote who="Alan Cox">The /dev/port interface is in a whole
variety of older Unixen for x86, and also in systems like Minix.</quote>
And elsewhere he came down more firmly against removing it. At this
point the temperature did start to rise slightly, until Paul Rusty Russell
suggested that the entire issue would be better discussed at the rapidly
approaching Kernel Summit. That pretty much ended that subthread.</p>

<p>However, elsewhere, Linus Torvalds weighed in on the issue, saying he was
OK with getting rid of /dev/port. He explained, <quote who="Linus Torvalds">It
was done purely because Minix did it that way, and it wasn't even compatible
with Minix (I think Minix actually supoorted 2- and 4-byte accesses by just
doign 2- and 4-byte read/write calls, the Linux code never did).</quote> He
added, <quote who="Linus Torvalds">Anybody: if you've ever used /dev/ports,
holler _now_.</quote> Alan replied:</p>

<quote who="Alan Cox">

<p>Holler. I posted a list of examples to linux-kernel already. iopl and
ioperm are not portable in the way /dev/port is. ioperm/iopl also doesnt
work with most scripting languages, java tools trying to avoid JNI etc</p>

<p>I've seen it used in tools written in java, python, perl, even tcl</p>

<p>Other examples include libieee1284, the pic 16x84 programmer, hwclock,
older kbdrate, /sbin/clock on machines that don't have /dev/rtc.</p>

<p>Not everything in the world is an x86, and not every app wants to be
Linux/x86 specific or use weird syscalls</p>

</quote>

<p>Pete Zaitcev also said to Linus:</p>

<quote who="Pete Zaitcev">

<p>I often use it as an alternative to #include &lt;asm/io.h&gt;, which
you decreed illegal. I understand &lt;sys/io.h&gt; is a legal alternative,
but a bunch of platforms forget to include &lt;sys/io.h&gt;, for instance
Jes cried bloody murder when asked to add it to ia-64. But if you decide to
drop /dev/port I can tough it out. Solaris lives without it, and so can we.</p>

<p>I saw this whining about outl not implemented for write(fd, &amp;my_int, 4),
and I think the guy had a little point.  Though if he wanted it, he ought
to submit a patch.</p>

</quote>

<p>Martin replied, <quote who="Martin Dalecki">if someone wants to use
/dev/port for development on some slow control experimental hardware for
example.  Why doesn't he just</quote> [...] <quote who="Martin Dalecki">compile
it as a *separate* character device module ?  That's linux - you have the
source, so use it.  You want to cheat around the OS abstractions - do it
for yourself!  There is no requirement that it has to be permanently in
the mainline kernel where it tends to attract people who shouldn't have
used it in first place for generic stuff like kbd rate settings and clock
device manipulation.</quote> But Linus said:</p>

<quote who="Linus Torvalds">

<p>That's not a productive approach, Martin.</p>

<p>Yes, with open source you can do whatever you want.</p>

<p>HOWEVER, there is a huge amount of advantage to having a common base that
is big enough to matter: why do you think MS does well commercially?</p>

<p>It's important to _not_ have to force people to do site-specific (or
problem-specific) hacks, even if they could do so. Because having to have
site-specific hacks detracts from the general usability of the code.</p>

<p>So when simplifying, it's not just important to say "we could do without
this". You have to also say "and nobody can reasonably expect to need it".</p>

<p>Which doesn't seem to be the case with /dev/ports. So it stays.</p>

</quote>

</section>

<section
  title="Status Of ext3 And RAID In 2.2"
  subject="2.2 kernel - Ext3 &amp; Raid patches"
  archive=""
  posts="11"
  startdate="21 May 2002 13:40:06 -0800"
  enddate="23 May 2002 14:25:04 -0800"
>
<topic>Disk Arrays: RAID</topic>
<topic>FS: ext3</topic>
<topic>Version Control</topic>

<p>Jon Hedlund seemed to remember hearing some warnings not to use ext3
with RAID in 2.2 kernels; but he'd been using ext3 and RAID 1 with almost no
problems for over nine months. He asked whether this was normal or whether he had just
been lucky. Andreas Dilger replied, <quote who="Andreas Dilger">You've just
been lucky.  I forget the exact scenario, but it is something like if journal
replay is happening while the RAID is being reconstructed after a crash you
can get garbage written to your disk.</quote> And Stephen C. Tweedie added:</p>

<quote who="Stephen C. Tweedie">

<p>Right --- the raid resync code in 2.2 uses the normal buffer cache, which
results in writes being scheduled for clean buffers, behind ext3's back.
That's not allowed --- it violates the write ordering requirements that
make ext3 work, and trips up debugging assert failures in the ext3 write
checking code.</p>

<p>You might get away with it, but a raid resync on ext3 on 2.2 is basically
not safe.  If you wait until after the resync before mounting the ext3
filesystem, you'll be OK.</p>

<p>It should work on 2.4.</p>

</quote>

<p>Elsewhere, Mike Fedyk took credit for warning people against using those two
patches in combination, and suggested:</p>

<quote who="Mike Fedyk">

<p>If I were you, I'd just test a 2.4 kernel on the configuration you want.
Unless there is some binary driver that use that doesn't support 2.4 there
isn't much use staying with 2.2.</p>

<p>This configuration is unsafe for 2.2, and I've used raid1 and raid5 with
ext3 without any trouble, even on degraded arrays (for as short a period as
possible of course).</p>

</quote>

<p>But Stephen put in, <quote who="Stephen C. Tweedie">Actually, you just
need to renumber one of the conflicting #defines to something unused, and
it will work fine.  Soft raid0 or linear mode will work quite happily with
ext3 on 2.2 after you do that, it's only the resync after a crash that you
get with raid1 or raid5 that is dangerous.</quote> Later, he reiterated that
this fix would only work for RAID 0.</p>

<p>Also in reply to Mike, Tomas Szepe objected that the recommendation to
just use 2.4 was not feasible on a Sparc 32 system because of some bugs
that had surfaced recently. But David S. Miller replied, <quote who="David
S. Miller">There have been several patches posted to deal with that problem,
you can apply them yourself or grab Marcelo's current 2.4.x BK tree.</quote>
After some work, Tomas also offered:</p>

<quote who="Tomas Szepe">

<p>Here comes for all sparc people who can't install BK:</p>

<p>All sparc32/sparc64 related changes since 2.4.19-pre8
in one diff copied and fixed up by hand from <a
href="http://linux.bkbits.net:8080/linux-2.4/ChangeSet@-3w?nav=index.html">http://linux.bkbits.net:8080/linux-2.4/ChangeSet@-3w?nav=index.html</a></p>

<p>All I can claim as to the patched kernel's functionality -- it has compiled
for me on sparc32. I'm going to try to boot it next week when I'm changing
disks in my server.</p>

</quote>

</section>

<section
  title="LVM Cleanup"
  subject="[RFC/PATCH] lvm sanitation in 2.5"
  archive=""
  posts="4"
  startdate="22 May 2002 17:15:19 -0800"
  enddate="26 May 2002 08:43:27 -0800"
>
<topic>Disk Arrays: LVM</topic>
<topic>FS</topic>
<topic>Ioctls</topic>

<mention>Alexander Viro</mention>

<p>Anders Gustafsson announced, <quote who="Anders Gustafsson">I have
started cleaning up lvm. The following patch contains the first steps. It
disables a lot of functionality but the basic things are there, I'm actually
running a kernel with this patch right now, with /home and /var on lvm. The
vg_t/lv_t..-structures are now available in two versions, one exported to
userspace (and that should remain constant through versions) and one used in
kernelspace containing stuff that should not be exposed to userspace (struct
block_device, kdev_t and such). (this also allows more flexibility making
changes in the driver without changing the userspace interface).</quote>
Alexander Viro was very pleased to see this, and gave some advice for the
ongoing work. And Joe Thornber also said to Anders:</p>

<quote who="Joe Thornber">

<p>I started a similar process last summer, if you want to pick up on
my work you can find it in cvs under that tag 'experimental' (cvs co -d
:pserver:cvs@tech.sistina.com:/data/cvs -r experimental LVM).  There are a
*lot* of changes in there, particularly I factored out the ioctl interface
into a file of its own and rewrote a lot of it.  I think I tidied up the
mapping functions a lot too.</p>

<p>However it soon became apparent that the end result would still be poor
due to the appalling ioctl interface.  Hence the LVM2 project, which the team
has been working on for the last 9 months.  So maybe the question should be
'is it time to switch from LVM1 to LVM2 in 2.5?'.</p>

<p>Just so that you are aware that no matter how much you tidy up LVM1 people
are not going to be happy with it - you have to compete against flawed design
as well as bad code.</p>

</quote>

<p>There was no reply.</p>

</section>

<section
  title="BitKeeper Repository Downtime"
  subject="bkbits.net downtime"
  archive=""
  posts="1"
  startdate="24 May 2002 09:11:46 -0800"
>
<topic>Version Control</topic>

<p>Larry McVoy announced:</p>

<quote who="Larry McVoy">

<p>Hi, we're working on an upgrade for bkbits.net and when we have it ready,
we'll want to switch the drives from one machine to another.  We're aiming
to do this later today, so please update your trees now.  Linus hasn't pushed
anything since yesterday so this is probably a good time.</p>

<p>One side effect of the upgrade is that we're going to get an online hot
spare out of the deal, so in the future, we'll be able to do this sort of
thing behind your back and you'll never know.</p>

</quote>

<p>There was no reply.</p>

</section>

<section
  title="Status Of I2O In 2.5 And 2.4"
  subject="Linux I2O Status"
  archive=""
  posts="1"
  startdate="24 May 2002 11:07:52 -0800"
>
<topic>Disks: SCSI</topic>
<topic>I/O</topic>
<topic>I2O</topic>
<topic>PCI</topic>

<p>Alan Cox announced:</p>

<quote who="Alan Cox">

<p>I asked folks to avoid touching the I2O stuff in 2.5 because major surgery
and work was needed in 2.4 before even tackling 2.5</p>

<p>The 2.4.19pre8-ac5  status is:</p>

<p>

<ul>

<li>The i2o_pci layer has been cleaned up massively and might well be both
32/64bit clean</li>

<li>The i2o_block layer has been half rewritten to deal with firmware bugs
now that the I/O layers can hand down blocks of 64K or more</li>

<li>The i2o_core should be pretty clean.</li>

<li>The i2o_scsi layer is still not 64bit clean and still uses the old eh
code. Basically it needs the same work as the other scsi drivers plus some
pointer into 32bit message field abuses fixing</li>

</ul>

</p>

<p>On x86 32bit it is all stable again and now works on my DPT and on the
AMI Megaraid as well as the cards it handled before. Block caching strategy
is now configurable.</p>

<p>I still have to do the pci mapping and try and find the rest of the 64bit
bogons, however the core code is now in a shape where it ought to be possible
to move it forward into 2.5 if anyone with i2o kit feels the urge.</p>

<p>I'll look at the SCSI 64bit cleanness and PCI mapping over time. They
are not priority items to me right now (at least until AMD Hammer hits the
mass market)</p>

</quote>

<p>There was no reply.</p>

</section>

<section
  title="BitKeeper Discussion"
  subject="2.4 SRMMU bug revisited"
  archive=""
  posts="21"
  startdate="27 May 2002 01:24:08 -0800"
  enddate="29 May 2002 13:42:12 -0800"
>
<topic>Version Control</topic>

<mention>Tomas Szepe</mention>

<p>In the course of discussion, Tomas Szepe was unable to find evidence
of a patch he'd been certain had been applied. He could not find it using
the <a href="http://linux.bkbits.net:8080/linux-2.4/">web interface to the
BitKeeper tree</a>. David S. Miller replied:</p>

<quote who="David S. Miller">

<p>The BK repository to use has the URL:</p>

<p>bk://linux.bkbits.net/linux-2.4</p>

<p>The web stuff is updated still by hand and is as a result chronically
out of date.</p>

</quote>

<p>But David Woodhouse at one point gave a link to <a
href="http://ftp.kernel.org/pub/linux/kernel/people/dwmw2/bk-2.4/">his own
BitKeeper web interface</a>, and said, <quote who="David Woodhouse">That web
stuff is updated by cron and is as a result never more than an hour out of date
(w.r.t. bk://linux.bkbits.net/linux-2.4) unless something breaks.</quote></p>

</section>

</kc>

