<?xml version="1.0" ?>

<kc>

<title>Kernel Traffic</title>

<author contact="mailto:zbrown@tumblerings.org">Zack Brown</author>

<issue num="49" date="03 Jan 2000 00:00:00 -0800" />

<stats posts="1023" size="4000" contrib="366" multiples="171" lastweek="156">

<person posts="43" size="134" who="Alan Cox " />
<person posts="34" size="118" who="Tigran Aivazian " />
<person posts="29" size="83" who="Tim Waugh " />
<person posts="23" size="102" who="James Simmons " />
<person posts="20" size="74" who="Linus Torvalds " />
<person posts="16" size="53" who="Alexander Viro " />
<person posts="15" size="49" who="Keith Owens " />
<person posts="14" size="62" who="" />
<person posts="13" size="61" who="Stephen Frost " />
<person posts="12" size="39" who="Mike Frisch " />
<person posts="11" size="31" who="Richard Gooch " />
<person posts="10" size="42" who="Jamie Lokier " />
<person posts="10" size="37" who="Marek Habersack " />
<person posts="10" size="37" who="Pavel Machek " />
<person posts="10" size="33" who="David S. Miller " />
<person posts="10" size="28" who="Thomas Sailer " />
<person posts="9" size="58" who="Bill Wendling " />
<person posts="9" size="35" who="Ted Sikora " />
<person posts="9" size="33" who="" />
<person posts="9" size="29" who="Richard B. Johnson " />
<person posts="9" size="25" who="Alan Cox " />
<person posts="8" size="28" who="Martin Dalecki " />
<person posts="8" size="26" who="Jes Sorensen " />
<person posts="8" size="25" who="Chris Wedgwood " />
<person posts="7" size="30" who="Mark Lord " />
<person posts="7" size="28" who=" (Linus Torvalds)" />
<person posts="7" size="23" who="" />
<person posts="7" size="23" who="Andrea Arcangeli " />
<person posts="7" size="22" who="Khimenko Victor " />
<person posts="7" size="22" who="Rik van Riel " />
<person posts="7" size="21" who="Rusty Russell " />
<person posts="6" size="33" who="Riley Williams " />
<person posts="6" size="29" who="Gerard Roudier " />
<person posts="6" size="28" who="Neil Brown " />
<person posts="6" size="27" who="Trond Myklebust " />
<person posts="6" size="23" who="Martin Mares " />
<person posts="6" size="23" who="Manfred Spraul " />
<person posts="6" size="21" who="Homme R. Bitter " />
<person posts="6" size="21" who="Horst von Brand " />
<person posts="6" size="20" who="Andi Kleen " />
<person posts="6" size="18" who="Petri Kaukasoina " />
<person posts="6" size="18" who="Jeff Garzik " />
<person posts="6" size="16" who="Ingo Molnar " />
<person posts="5" size="34" who=" (david parsons)" />
<person posts="5" size="27" who="Matt Robinson " />
<person posts="5" size="21" who="Wakko Warner " />
<person posts="5" size="18" who="Raul Miller " />
<person posts="5" size="17" who="Matthias Andree " />
<person posts="5" size="17" who="David Schwartz " />
<person posts="5" size="17" who=" (Rogier Wolff)" />
<person posts="5" size="17" who="Horst von Brand " />
<person posts="5" size="15" who="Dominik Kubla " />
<person posts="5" size="15" who="Roeland Th. Jansen " />
<person posts="5" size="15" who="Matthew Kirkwood " />
<person posts="5" size="14" who="Jens Axboe " />
<person posts="5" size="14" who="Matthew Wilcox " />
<person posts="5" size="12" who="Bernhard Rosenkraenzer " />
<person posts="4" size="92" who="Nicholas R LeRoy " />
<person posts="4" size="34" who="Christoph Brauckmann " />
<person posts="4" size="32" who="Ward Vandewege " />
<person posts="4" size="30" who="Sergey Kubushin " />
<person posts="4" size="20" who="Gerd Knorr " />
<person posts="4" size="19" who="Walter Hofmann " />
<person posts="4" size="17" who="David Ford " />
<person posts="4" size="16" who="Andre Hedrick " />
<person posts="4" size="15" who="TenThumbs " />
<person posts="4" size="13" who="Stephen C. Tweedie " />
<person posts="4" size="13" who="Daniel Silverstone (Kinnison) " />
<person posts="4" size="13" who="Prashant TR " />
<person posts="4" size="12" who="Heinz Diehl " />
<person posts="4" size="12" who="David Weinehall " />
<person posts="4" size="12" who="Bernard Wei " />
<person posts="3" size="22" who="Alexandre Hautequest " />
<person posts="3" size="17" who="Gabriel Paubert " />
<person posts="3" size="17" who="Ketil Malde " />
<person posts="3" size="16" who="Markus Schoder " />
<person posts="3" size="12" who="David Edwards " />
<person posts="3" size="12" who="Benjamin C.R. LaHaise " />
<person posts="3" size="12" who="Dwayne C . Litzenberger " />
<person posts="3" size="12" who="Dan Kegel " />
<person posts="3" size="12" who=" (Rogier Wolff)" />
<person posts="3" size="11" who="Brian Pomerantz " />
<person posts="3" size="11" who="Bret Indrelee " />
<person posts="3" size="11" who="Mike A. Harris " />
<person posts="3" size="11" who="Giacomo Catenazzi " />
<person posts="3" size="11" who="Zach Brown " />
<person posts="3" size="10" who="Francesco Chemolli " />
<person posts="3" size="10" who="Hans Reiser " />
<person posts="3" size="10" who="Jesse Pollard " />
<person posts="3" size="10" who="Theodore Y. Ts'o " />
<person posts="3" size="10" who="Dunlap, Randy " />
<person posts="3" size="10" who="" />
<person posts="3" size="9" who="Ralf Baechle " />
<person posts="3" size="9" who=" (Eugene Crosser)" />
<person posts="3" size="9" who="Richard Henderson " />
<person posts="3" size="9" who="Chris Noe " />
<person posts="3" size="8" who="Ron Flory " />
<person posts="3" size="8" who="Peter Samuelson " />
<person posts="3" size="8" who="Jeff Garzik " />
<person posts="3" size="7" who="Dan Hollis " />
<person posts="3" size="7" who="Robert A. Hayden " />
<person posts="3" size="6" who="inf " />
<person posts="2" size="63" who="Zach Brown " />
<person posts="2" size="46" who="Chuck Lever " />
<person posts="2" size="16" who="David Morton " />
<person posts="2" size="14" who="Arjan Filius " />
<person posts="2" size="13" who="James Mulcahy " />
<person posts="2" size="12" who="Brian Macy " />
<person posts="2" size="12" who="B. D. Elliott " />
<person posts="2" size="11" who=" (Jim Gettys)" />
<person posts="2" size="10" who="Chuck Phillips " />
<person posts="2" size="10" who="Derrick Steed " />
<person posts="2" size="10" who="Nicholas Waltham " />
<person posts="2" size="10" who="Marques Johansson " />
<person posts="2" size="10" who="Brian Hall " />
<person posts="2" size="9" who="Josef Höök " />
<person posts="2" size="9" who="water modem " />
<person posts="2" size="9" who="Andreas Scherbaum " />
<person posts="2" size="9" who="" />
<person posts="2" size="8" who="Sean Hunter " />
<person posts="2" size="8" who=" (Arjan van de Ven)" />
<person posts="2" size="8" who="Larry McVoy " />
<person posts="2" size="8" who="Anders Larsen " />
<person posts="2" size="8" who="Jon Leech " />
<person posts="2" size="8" who="Scott Henry " />
<person posts="2" size="8" who="Rik Faith " />
<person posts="2" size="7" who="Robert Dinse " />
<person posts="2" size="7" who=" (H. Peter Anvin)" />
<person posts="2" size="7" who="Rob Hall " />
<person posts="2" size="7" who=" (Steven S. Dick)" />
<person posts="2" size="7" who="Simon Kirby " />
<person posts="2" size="7" who="Robert Cohen " />
<person posts="2" size="7" who="Jan Kasprzak " />
<person posts="2" size="7" who="Vladimir Ivaschenko " />
<person posts="2" size="7" who="" />
<person posts="2" size="7" who="Wichert Akkerman " />
<person posts="2" size="7" who="Jakub Jelinek " />
<person posts="2" size="7" who="Jeff Uphoff " />
<person posts="2" size="7" who="Lincoln Dale " />
<person posts="2" size="7" who="Borek Lupomesky " />
<person posts="2" size="6" who="Kristian Koehntopp " />
<person posts="2" size="6" who="Ingo Oeser " />
<person posts="2" size="6" who="" />
<person posts="2" size="6" who="Adam Fritzler " />
<person posts="2" size="6" who="Artur Skawina " />
<person posts="2" size="6" who="Yasuhide OOMORI " />
<person posts="2" size="6" who="Chris Meadors " />
<person posts="2" size="6" who="Vikram " />
<person posts="2" size="6" who="Mark Hahn " />
<person posts="2" size="6" who="Lee Rhodes " />
<person posts="2" size="6" who="Dimitris Michailidis " />
<person posts="2" size="6" who="Aaron Holtzman " />
<person posts="2" size="6" who="Taso Hatzi " />
<person posts="2" size="6" who="H. Peter Anvin " />
<person posts="2" size="6" who=" (Peter Bornemann)" />
<person posts="2" size="6" who="" />
<person posts="2" size="5" who="Stanislav Meduna " />
<person posts="2" size="5" who="Petko Manolov " />
<person posts="2" size="5" who="Gaurav Yadav " />
<person posts="2" size="5" who="Martin Maciaszek " />
<person posts="2" size="5" who="Yuri Kuzmenko " />
<person posts="2" size="5" who="Willy Tarreau " />
<person posts="2" size="5" who="Pete Clements " />
<person posts="2" size="5" who="Wayne Pascoe " />
<person posts="2" size="5" who="James H. Cloos Jr. " />
<person posts="2" size="5" who="Edgar Toernig " />
<person posts="2" size="5" who="Catalin BOIE " />
<person posts="2" size="5" who=" (Hans-Joachim Baader)" />
<person posts="2" size="4" who="Matthew D. Pitts " />
<person posts="2" size="4" who="Gerald Haese " />
<person posts="2" size="4" who="Albert D. Cahalan " />
<person posts="1" size="13" who="Donn Washburn " />
<person posts="1" size="9" who="" />
<person posts="1" size="8" who="Frank Bernard " />
<person posts="1" size="8" who="James Thoenen " />
<person posts="1" size="7" who="Upshur Parks " />
<person posts="1" size="6" who="Abramo Bagnara " />
<person posts="1" size="6" who="John Leon " />
<person posts="1" size="6" who="Tom Zerucha " />
<person posts="1" size="6" who=" (Kai Henningsen)" />
<person posts="1" size="5" who="Jason Jordan " />
<person posts="1" size="5" who="Ulrich Windl " />
<person posts="1" size="5" who="ursus " />
<person posts="1" size="5" who="Daryll Strauss " />
<person posts="1" size="5" who="Sumner, Jeff " />
<person posts="1" size="5" who="Yuri Kuzmenko " />
<person posts="1" size="5" who="Dr.Nick P.Kostrov " />
<person posts="1" size="5" who="Jelle Foks " />
<person posts="1" size="5" who="Malcolm Beattie " />
<person posts="1" size="5" who="Joel Jaeggli " />
<person posts="1" size="5" who="" />
<person posts="1" size="5" who="Benno Senoner " />
<person posts="1" size="5" who="Peter J. Braam " />
<person posts="1" size="4" who="Wolfram Pienkoss " />
<person posts="1" size="4" who="Peter Rival " />
<person posts="1" size="4" who="De Schrijver Peter " />
<person posts="1" size="4" who="Rene Rebe " />
<person posts="1" size="4" who="Kristian Nielsen " />
<person posts="1" size="4" who="Serge Robyns " />
<person posts="1" size="4" who="vt " />
<person posts="1" size="4" who="Anton Ivanov " />
<person posts="1" size="4" who="Juergen Rose " />
<person posts="1" size="4" who="Ben LaHaise " />
<person posts="1" size="4" who="Anthony Barbachan " />
<person posts="1" size="4" who="Terry Katz " />
<person posts="1" size="4" who=" (Scott Lurndal)" />
<person posts="1" size="4" who="Zdenek Kabelac " />
<person posts="1" size="4" who="Pawel Krawczyk " />
<person posts="1" size="4" who="Tom Gilbert " />
<person posts="1" size="4" who="Frank de Lange " />
<person posts="1" size="4" who="Jim Breton " />
<person posts="1" size="4" who="Eric Lemar " />
<person posts="1" size="4" who="M Sweger " />
<person posts="1" size="4" who="pm " />
<person posts="1" size="4" who="Bradley M Keryan " />
<person posts="1" size="4" who="James Antill " />
<person posts="1" size="3" who="Benjamin J. Stassart " />
<person posts="1" size="3" who="Chris Chiappa " />
<person posts="1" size="3" who="Andreas Günther " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Pavel Machek " />
<person posts="1" size="3" who="Joel Becker " />
<person posts="1" size="3" who="Pol Muaddib " />
<person posts="1" size="3" who="Jan-Benedict Glaw " />
<person posts="1" size="3" who="Stephen Rothwell " />
<person posts="1" size="3" who="William Stearns " />
<person posts="1" size="3" who="Jonathan Disher " />
<person posts="1" size="3" who="Florian Heinz " />
<person posts="1" size="3" who="Mike Touloumtzis " />
<person posts="1" size="3" who="Frank v Waveren " />
<person posts="1" size="3" who="Brett Person " />
<person posts="1" size="3" who="Marcus Sundberg " />
<person posts="1" size="3" who="Russell King " />
<person posts="1" size="3" who="Antonio M. Trindade " />
<person posts="1" size="3" who="Mike Galbraith " />
<person posts="1" size="3" who="Jeremy Fitzhardinge " />
<person posts="1" size="3" who="Richard Adams " />
<person posts="1" size="3" who="Dale Amon " />
<person posts="1" size="3" who="Adam D. Bradley " />
<person posts="1" size="3" who="Tigran Aivazian " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Jason Gunthorpe " />
<person posts="1" size="3" who=" (Kanoj Sarcar)" />
<person posts="1" size="3" who="Jonathan Hseu " />
<person posts="1" size="3" who="Harald Koenig " />
<person posts="1" size="3" who="Raul Miller " />
<person posts="1" size="3" who="Dancer " />
<person posts="1" size="3" who="Jan-Friso Evers " />
<person posts="1" size="3" who="Patrick Mau " />
<person posts="1" size="3" who="Christopher E. Brown " />
<person posts="1" size="3" who="Dave Gilbert " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Harold Oga " />
<person posts="1" size="3" who="Vladislav Malyshkin " />
<person posts="1" size="3" who="Donald Becker " />
<person posts="1" size="3" who="Víctor R. Ruiz " />
<person posts="1" size="3" who="Folkert van Heusden " />
<person posts="1" size="3" who="François Désarménien " />
<person posts="1" size="3" who="Pavel Lajsner " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Michael Marxmeier " />
<person posts="1" size="3" who="Mike Harrelson " />
<person posts="1" size="3" who="Arjan van de Ven " />
<person posts="1" size="3" who=" (Greg KH)" />
<person posts="1" size="3" who="Philipp Rumpf " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Mark Montague " />
<person posts="1" size="3" who="Juan Carlos Castro y Castro " />
<person posts="1" size="3" who="Sasi Peter " />
<person posts="1" size="3" who="Jean-Luc " />
<person posts="1" size="3" who="Henning P. Schmiedehausen " />
<person posts="1" size="3" who=" (Juergen Fischer)" />
<person posts="1" size="3" who="jordi ros " />
<person posts="1" size="3" who="Stephen Waters " />
<person posts="1" size="3" who="Steven Whitehouse (SUCS Admin) " />
<person posts="1" size="3" who=" (Jon Leech)" />
<person posts="1" size="3" who=" (Arjan van de Ven)" />
<person posts="1" size="3" who="Kevin Xie (Xie Huagang) " />
<person posts="1" size="3" who="Henrik Olsen " />
<person posts="1" size="3" who="Darrell Wright " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Tim Coleman " />
<person posts="1" size="3" who="David Wragg " />
<person posts="1" size="3" who="Wichert Akkerman " />
<person posts="1" size="3" who="David Schleef " />
<person posts="1" size="3" who="" />
<person posts="1" size="3" who="Juan Jose Casero " />
<person posts="1" size="3" who="Robert L. Harris " />
<person posts="1" size="3" who="Ulrich Drepper " />
<person posts="1" size="3" who="Richard Ketchersid " />
<person posts="1" size="3" who="Petru Paler " />
<person posts="1" size="3" who="Borislav Deianov " />
<person posts="1" size="3" who="David Parsons " />
<person posts="1" size="2" who="Stephen Williams " />
<person posts="1" size="2" who="Alec Smith " />
<person posts="1" size="2" who="Jim Nance " />
<person posts="1" size="2" who="Doug Hass " />
<person posts="1" size="2" who="Mark H. Wood " />
<person posts="1" size="2" who="Erez Zadok " />
<person posts="1" size="2" who="Mark Anthony J. Mercado " />
<person posts="1" size="2" who="Stephane Dudzinski " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Richard Campbell " />
<person posts="1" size="2" who="Craig I. Hagan " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Jan Kara " />
<person posts="1" size="2" who="Marcin Dalecki " />
<person posts="1" size="2" who="Michael Elizabeth Chastain " />
<person posts="1" size="2" who="Ph. Marek " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Rick Hohensee " />
<person posts="1" size="2" who="Douglas Gilbert " />
<person posts="1" size="2" who="Thomas Molina " />
<person posts="1" size="2" who="Premek Marek " />
<person posts="1" size="2" who="Thomas Davis " />
<person posts="1" size="2" who="Pascal A. Dupuis " />
<person posts="1" size="2" who="Balaji Srinivasan " />
<person posts="1" size="2" who="Jim Brown " />
<person posts="1" size="2" who="Giovanni Picoli Tirloni " />
<person posts="1" size="2" who="Scott Thomason " />
<person posts="1" size="2" who="" />
<person posts="1" size="2" who="Robert " />
<person posts="1" size="2" who="Gregory Maxwell " />
<person posts="1" size="2" who="Matti Aarnio " />
<person posts="1" size="2" who="Tomas Franke " />
<person posts="1" size="2" who="Guido Cervone " />
<person posts="1" size="2" who="Dag Wieers " />
<person posts="1" size="2" who="Mattias Engdegård " />
<person posts="1" size="2" who="Ragnar Kjørstad " />
<person posts="1" size="2" who="Armin Schindler " />
<person posts="1" size="2" who="Manfred " />
<person posts="1" size="2" who="Elmer Joandi " />
<person posts="1" size="2" who="Alex Buell " />
<person posts="1" size="2" who="Andrew Clausen " />
<person posts="1" size="2" who="do do " />
<person posts="1" size="2" who="Chris Jones " />
<person posts="1" size="2" who="Tony Hoyle " />
<person posts="1" size="2" who=" (Michael Surenbrock)" />
<person posts="1" size="2" who=" (Miquel van Smoorenburg)" />
<person posts="1" size="2" who="Philip Blundell " />
<person posts="1" size="2" who="Matthias Riese " />
<person posts="1" size="2" who="mathijs " />
<person posts="1" size="2" who="vinny " />
<person posts="1" size="2" who="nag " />
<person posts="1" size="2" who="Ingo Molnar " />
<person posts="1" size="2" who="B. James Phillippe " />
<person posts="1" size="2" who=" (Aydin OKUTANOGLU)" />
<person posts="1" size="2" who="David Cougle " />
<person posts="1" size="2" who="Enrico Demarin " />
<person posts="1" size="2" who="Clint Adams " />
<person posts="1" size="2" who="Meino Christian Cramer " />
<person posts="1" size="2" who="Alan Curry " />
<person posts="1" size="2" who="Prashant TR " />
<person posts="1" size="2" who=" (Andreas Jellinghaus)" />
<person posts="1" size="2" who="bacon " />
<person posts="1" size="2" who="Bryan O'Sullivan " />
<person posts="1" size="2" who="Brian Gerst " />
<person posts="1" size="2" who="Azeem Shahjahan Jiva " />
<person posts="1" size="2" who="Eleonora Autore " />
<person posts="1" size="2" who="Erik McKee " />
<person posts="1" size="2" who="Jason Venner " />
<person posts="1" size="2" who="Sherif Abou Seda " />
<person posts="1" size="2" who="Vlad Petric " />
<person posts="1" size="2" who="Ahmad Hatami " />

</stats>

<section
  title="Thread-Private Mappings; Linus On Unix"
  subject="Thread-private mappings and graphics (was Re: Per-Processor Data Page)"
  archive="http://kernelnotes.org/lnxlists/linux-kernel/lk_9912_02/msg01170.html"
  posts="67"
  startdate="13 Dec 1999 00:00:00 -0800"
  enddate="21 Dec 1999 00:00:00 -0800"
>
<topic>PCI</topic>
<topic>SMP</topic>
<topic>Virtual Memory</topic>

<p>Jon Leech pointed out that thread-private mappings were very useful for 3D
graphics applications. He explained, <quote who="Jon Leech">The OpenGL API
relies on an implicit graphics context, so that multithreaded apps need to
do some sort of thread-specific lookup at each call.</quote> He added,
<quote who="Jon Leech">Many GL calls can be just a few instructions long on
suitable hardware (e.g. shove parameters into a DMA buffer or FIFO), so the
lookup needs to be very fast. Brian Paul has done some preliminary testing
demonstrating a roughly 3:1 performance hit for performing thread-specific
lookup via pthread_getspecific() on an otherwise empty call, giving some
guidance as to impact in a real driver.</quote> He suggested that some sort
of kernel implementation would be much easier than requiring changes to the
OpenGL API, which would in turn necessitate rewriting a lot of programs and
documentation.</p>
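
<p>The per-call lookup Jon describes can be sketched in C as follows. This is
only an illustration, not code from any OpenGL implementation: the
gfx_context structure and the gfx_make_current(), gfx_call() and gfx_demo()
functions are hypothetical names invented here. Only the pthread calls are
real. Each thread binds its own context once, and every API entry point then
pays for a pthread_getspecific() lookup to find it.</p>

<pre><![CDATA[
```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread graphics context; not a real OpenGL type. */
struct gfx_context {
    int id;     /* identifies the context */
    int seen;   /* id observed by the last API call made by its thread */
};

static pthread_key_t ctx_key;
static pthread_once_t ctx_once = PTHREAD_ONCE_INIT;

/* Creates the key exactly once, no matter which thread gets here first. */
static void make_key(void)
{
    pthread_key_create(&ctx_key, NULL);
}

/* Bind a context to the calling thread (similar in spirit to "make current"). */
int gfx_make_current(struct gfx_context *ctx)
{
    pthread_once(&ctx_once, make_key);
    return pthread_setspecific(ctx_key, ctx);
}

/* An API entry point with an implicit context: the lookup runs on every call. */
int gfx_call(void)
{
    pthread_once(&ctx_once, make_key);
    struct gfx_context *ctx = pthread_getspecific(ctx_key);
    if (ctx == NULL)
        return -1;          /* no context bound to this thread */
    return ctx->id;
}

/* Thread body: bind this thread's context, then make one API call. */
static void *worker(void *arg)
{
    struct gfx_context *ctx = arg;
    gfx_make_current(ctx);
    ctx->seen = gfx_call();
    return NULL;
}

/* Runs two threads with different contexts; returns 0 if each saw its own. */
int gfx_demo(void)
{
    struct gfx_context a = { 1, 0 }, b = { 2, 0 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return (a.seen == 1 && b.seen == 2) ? 0 : 1;
}
```
]]></pre>

<p>Calling gfx_demo() from a program linked against the threads library
should return 0. The cost under discussion is the pthread_getspecific() call
inside gfx_call(), which runs on every entry into the library.</p>
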

<p>James Simmons agreed, but Linus Torvalds said:</p>

<quote who="Linus Torvalds">

<p>It will never happen on Linux.</p>

<p>I'm surprised an SGI person hasn't learnt from past mistakes. IRIX is
unstable, and unmaintainable, and please just face it - it's because SGI had
the "cool feature of the day" disease.</p>

<p>Thread-private mappings WILL NOT HAPPEN. You can obviously do them in SGI
Linux, but that way lies madness, and it's something I'll keep pointing out
in public.</p>

<p>You can have thread-private _pointers_. Just have different mappings of the
same hardware context if you have to, and just have different pointers to it
in different threads.</p>

</quote>

<p>In the same post, he went on:</p>

<quote who="Linus Torvalds">

<p>thread-private mappings are
FUNDAMENTALLY broken. They completely break the whole point of having a
thread in the first place, and I can only suggest that if you want them you
should look at a great concept that was invented, oh, thirty years ago by
people more intelligent than you or I.</p>

<p>Namely "fork()". Which gives you the thread-private mapping you want.</p>

<p>If you want threads with shared address spaces, then make them SHARED. None
of this private stuff. It's not on the table, and it won't be. I've
explained to people why before, and I bet I'll end up explaining again, but
the short and sweet of it is that you CANNOT do thread-private mappings
without losing most of the performance advantages of a thread in the first
place.</p>

<p>And once you lose the performance-advantage of threads, not all that much
else remains. Threads certainly aren't any easier to program than
processes...</p>

</quote>
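
<p>Linus' remark that fork() already provides private mappings can be
observed from user space. The sketch below is a minimal illustration with a
hypothetical function name, private_after_fork(). After the fork, a write in
the child lands in the child's own copy of the page, and the parent's view is
unchanged.</p>

<pre><![CDATA[
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter;   /* ordinary memory: each process gets its own copy */

/* Forks a child that modifies `counter`; returns the parent's view
 * afterwards, or -1 on error. */
int private_after_fork(void)
{
    counter = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter = 42;   /* changes only the child's copy of the page */
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) != pid)
        return -1;

    return counter;     /* the parent still sees its own value */
}
```
]]></pre>

<p>In the parent, private_after_fork() should return 1, not 42. Two threads
sharing one address space would instead both see the same counter.</p>
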

<p>Larry McVoy replied:</p>

<quote who="Larry McVoy">

<p>I agree with Linus. IRIX is
full of things that no one but a marketing bunny would like, and Linux
shouldn't follow in those misguided footsteps.</p>

<p>My understanding of what IRIX had to do to get their semantics to work,
based on having been a kernel engineer at SGI for 3 years, so I've looked at
this code, is that when you have private mappings, you typically end up
replacing the TLB miss handler. The details get sort of messy, but the
problem can be summed up as the address space does not belong to a thread,
in fact it is the other way around. As long as your threads either share
completely or don't share completely, then you only have one VM struct and
one set of page tables to walk. As soon as you start having thread private
mappings, then you end up having to have different (usually overlapping)
page tables. Hence the new TLB miss handler which sorts out the mess.</p>

<p>You really really don't want to do what SGI did.  What you do want to do is
clearly state what your needs are - leaving out any details about how you
want those needs solved - and keep restating them until someone says "ah -
you do that like this" and Linus likes it.</p>

</quote>

<p>James also replied to Linus, arguing in favor of thread-private mappings. He
said:</p>

<quote who="James Simmons">

<p>I'm not talking about
implementing Thread-private mappings for the entire system. What I have done
is when the process mmaps the accel region set a flag. This way if the
process creates new threads or forks it will have a private mapping. The
impact of private mapping is thus minimized. Once a process unmaps the accel
region the flag is turned off. The only way performance can be killed on the
system is if the process that mmapped the accel region creates a bunch of
threads and none of the threads use the accel region. I just don't see that
happening. Also how many threads will a well designed OpenGL library have.
You don't want to go crazy here creating a bunch of threads.</p>

<p>The question we have to ask why do we want private mapping. The reason DRE
(Direct Rendering Engine) needs this is to ensure a page fault happens. If
you don't want private mapping for thread in this case then tell me a way to
ensure every thread or forked process that uses the accel engine page
faults. Why do you need a page fault? Because the page fault is what I used
to serialize access to the accel engine on SMP machines and to save and
restore the graphics context. As Jon Leech pointed out each threads need to
refer to *different* graphics contexts. OpenGL is based on the graphics
pipeline. Ideally each part of the pipeline would be used by a different
process. Say we have SMP machine with one video card. Ideally a different
process on each CPU would be in different places in the graphics pipeline.
The problem is one thread could have the hardware in a state that could
alter the desired results for the other threads since the other threads most
likely need the hardware in another state.</p>

<p>Also as Jon pointed out the API requires that these contexts be identified
by some thread-specific mechanism available to the graphics library, not by
explicit stack pointers in the application. Either the threaded OpenGL API
is broken or DRE for linux-SGI is. If this the case then linux will need its
own special threaded OpenGL library compared to all the other platforms
which don't require this special rewrite.</p>

</quote>

<p>Jon replied, <quote who="Jon Leech">Just to be
absolutely clear, the topic I brought up initially has to do only with use
of the OpenGL API from multithreaded applications; it has nothing whatsoever
to do with "DRE", which is not a project SGI is involved in, or with other
aspects of the graphics hardware access model.</quote></p>

<p>Linus also replied to James, with:</p>

<quote who="Linus Torvalds">

<p>The thing that makes me
admire UNIX is not that it's UNIX. The thing I like about UNIX is that even
after 30 years, the fundamentals stand out. The basic notion of files as
unstructured streams of data, and the UNIX fork/exec way of doing things
(most everybody else does a "spawn()", which does not have the same
philosophy at all).</p>

<p>And quite frankly, especially when it comes to threads, Linux has a DESIGN.
It's not the hodge-podge of threading issues that we call "pthreads" that
came about from different vendors trying to solve the same problem in
different ways (including user-mode only solutions). It's a CONCEPT, the same
way "fork()" is a concept.</p>

<p>The difference between a "concept" and a "random collection of routines" is
that the concept will survive, and the routines won't.</p>

<p>But that also requires that people don't mess up the concept by thinking
that it's "just an implementation". Because once lost, you won't be able to
undo the changes. You can fix bugs, but you can't easily remove the "feature
of the day".</p>

</quote>

<p>Linus went on:</p>

<quote who="Linus Torvalds">

<p>You've broken the concept of
shared TLB's. You've then superglued the broken parts together and said "if
you look at it from the right angle you cannot SEE the cracks".</p>

<p>But the cracks ARE THERE! They make the system more complex internally, and
even when you don't see the cracks you may notice that the thing doesn't
quite stand up straight.</p>

<p>You broke the whole notion of interchangeable, anonymous, shared TLB
mappings.</p>

<p>You broke the notion that when a thread is switched, no MM work needs to be
done.</p>

<p>You broke the notion that you can simply look at "process-&gt;mm" and
determine whether it can share the TLB state with another thread (even if
that TLB state in between was used by an anonymous kernel thread).</p>

<p>In short, you took a notion and dirtied it until it was just a random
collection of routines that superficially LOOK like it has a design.</p>

<p>In short, you didn't care about the BEAUTY.</p>

</quote>

<p>To James' statement that the question to ask was why we wanted private
mappings, Linus continued:</p>

<quote who="Linus Torvalds">

<p>No. You're asking the WRONG
question.</p>

<p>There is no way in HELL we want private mappings. End of story.</p>

<p>Use processes and SysV shared memory if that is what you're after. It works
today, and it gets you EXACTLY the same semantics that you are apparently
after. Sure, you have to think about the problem another way, but it's just
a mirror image.</p>

<p>Instead of saying "ok, I want private mappings in a shared address space",
you say "ok, I want shared mappings in private address spaces".</p>

<p>What's the difference? Doesn't sound like much, no?</p>

<p>But look at it from a NOTIONAL standpoint. You don't break anything by
taking a private address space and adding a shared object to it - we've had
that notion for a long time, and then you HAVE a private TLB and a private
page table to play with.</p>

<p>In contrast, if you break the sharedness of a shared address space, YOU
DON'T HAVE ANYTHING LEFT! You just broke the bubble, and it popped. You
turned it into a private address space, for no gain (you already HAD private
address spaces, so you just degenerated the whole system).</p>

</quote>
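
<p>The "shared mappings in private address spaces" arrangement Linus
recommends can be sketched with SysV shared memory and fork(). This is a
minimal illustration under stated assumptions: the function name
shared_in_private_spaces() is hypothetical, and a real graphics library would
share far more than a single int. The shmget(), shmat(), shmdt() and shmctl()
calls are the standard SysV interfaces.</p>

<pre><![CDATA[
```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child stored in the shared segment, or -1 on error
 * or if the child's private write leaked into the parent. */
int shared_in_private_spaces(void)
{
    int private_value = 1;   /* ordinary memory: private after fork() */
    int result = -1;

    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    int *shared = shmat(shmid, NULL, 0);
    if (shared == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *shared = 1;

    pid_t pid = fork();
    if (pid == 0) {
        *shared = 7;         /* explicitly shared object: parent sees this */
        private_value = 7;   /* private address space: parent does not */
        _exit(0);
    }

    if (pid > 0 && waitpid(pid, NULL, 0) == pid && private_value == 1)
        result = *shared;

    /* Detach and remove the segment so nothing is left behind. */
    shmdt(shared);
    shmctl(shmid, IPC_RMID, NULL);
    return result;
}
```
]]></pre>

<p>On a system with SysV IPC enabled, the function should return 7: the write
to the shared segment crosses the process boundary, while the write to
private_value does not. Each process keeps its own page tables, and only the
explicitly shared object is common to both.</p>
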

<p>Linus concluded, <quote who="Linus Torvalds">I'm not asking
you. I'm TELLING you that your idea will not be accepted in the standard
kernel. I can go on explaining all day WHY, but you don't seem to care.
You're ignoring the bigger picture, and that's your right. It's also your
right to f*ck up your own version of Linux, but you're not getting close to
mine.</quote></p>

<p>James replied, <quote who="James Simmons">You are
right about the idea of not breaking a concept. It's true I shouldn't break a
standard concept to do what a single driver/project wants done. If this was
the case there would be no standards. Egg on my face :(</quote> He added
that he <i>did</i> care about other people's opinions, and went on to say
that he would keep working on the problem, keeping Linus' admonitions in
mind.</p>

<p>David S. Miller also replied to Linus with some code, based on an
implementation suggestion in Linus' previous post. James replied that he'd
work on coding that into the kernel. Linus replied, <quote who="Linus Torvalds">Note that you really should look at what DRI
did with the 3dfx driver, that does all of this =and= tries to keep much of
the context locking in user space (so that only on clashes does it go to the
kernel to fix things up). The kernel side is in the standard 2.3.x kernel
these days, the X server side is in the unofficial 3dfx X server.</quote></p>

<p>David replied:</p>

<quote who="David S. Miller">

<p>Please keep in mind that
what the SGI folks are complaining about here and how the 3dfx has to do are
radically different issues.</p>

<p>Most commodity 3D graphics hardware these days cannot be interrupted
mid-operation, have its state fully saved, have another renderer's state
fully restored, and let the latter continue where he left off. You must
complete the full operation you are currently in the middle of before you
can let someone else have the card.</p>

<p>Whereas most SGI, Sun, and other vendor's higher end cards allows you to
arbitrarily stop a renderer mid-place, save the card state, and restore the
card state another thread has. This can happen at nearly arbitrary
locations, so the following works:</p>

<blockquote>

<table border="0">

<tr>    <th>Thread 1</th><th>Thread 2</th></tr>

<tr>    <td>OP = RECTANGLE</td><td></td></tr>
<tr>    <td>X = 0</td><td></td></tr>
<tr>    <td>Y = 0</td><td></td></tr>
<tr>    <td></td><td>           OP = LINE
                                (takes fault, thread1's graphics card
                                 state is saved, thread2's is restored
                                 and mappings are removed from thread1's
                                 space)</td></tr>
<tr>    <td></td><td>           X1 = 5</td></tr>
<tr>    <td></td><td>           Y1 = 10</td></tr>
<tr>    <td></td><td>           X2 = 10</td></tr>
<tr>    <td></td><td>           Y2 = 10 (draws the line)</td></tr>
<tr>    <td>W = 50
        (takes fault, graphics
         state restored, thread2's
         state saved and his mappings
         removed)</td><td></td></tr>
<tr>    <td>H = 50 (draws the rectangle)</td><td></td></tr>

</table>

</blockquote>

<p>Commodity PCI/AGI 3D graphics cards cannot do this, which is why the
userland locking solution exists at all. However for cards that can do the
above, this is what people want.</p>

</quote>
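
<p>The interleaving in David's table can be modelled in a few lines of C.
This is only a toy sketch and not code from the thread: the card structure,
the register layout and the touch_card() function are all invented here for
illustration, and the "fault" is an ordinary function call rather than a
real page fault. It assumes a card whose whole state can be copied out and
back in at any point, which is the property David attributes to the higher
end hardware.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <string.h>

enum { OP_NONE, OP_RECTANGLE, OP_LINE };
enum { NTHREADS = 2 };

/* Hypothetical register file of a card that can be preempted mid-operation. */
struct gfx_state {
    int op;        /* OP_RECTANGLE, OP_LINE, ... */
    int regs[4];   /* X, Y, W, H for a rectangle; X1, Y1, X2, Y2 for a line */
};

struct card {
    int owner;                         /* thread whose state is live, -1 if none */
    struct gfx_state live;             /* what the hardware currently holds */
    struct gfx_state saved[NTHREADS];  /* per-thread saved copies */
    int switches;                      /* number of save/restore cycles so far */
};

void card_init(struct card *c)
{
    memset(c, 0, sizeof *c);
    c->owner = -1;
}

/*
 * Stand-in for the fault handler: it runs whenever a thread that does not
 * currently own the card touches a register. The old owner's state is
 * saved, the new owner's state is restored, and the caller gets back a
 * pointer to the live registers.
 */
struct gfx_state *touch_card(struct card *c, int thread)
{
    assert(thread >= 0 && thread < NTHREADS);
    if (c->owner != thread) {
        if (c->owner >= 0) {
            c->saved[c->owner] = c->live;  /* save the preempted renderer */
            c->switches++;
        }
        c->live = c->saved[thread];        /* restore the incoming renderer */
        c->owner = thread;
    }
    return &c->live;
}
```
]]></pre>

<p>Replaying the table against this model (thread 1 as index 0, thread 2 as
index 1) runs the handler's save path twice, once at thread 2's first write
and once when thread 1 resumes with W, and thread 1 finds its earlier OP, X
and Y values intact.</p>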

<p>To David's "Most commodity 3D graphics hardware these days" comment, Linus replied:</p>

<quote who="Linus Torvalds">

<p>Look out when you're condescending.</p>

<p>Commodity hardware is where it is at when it comes to 3D. Forget about CAD
work etc - 3D is all about games, and probably always will be. It's just a
fact that the game market is about a million times larger than the
traditional 3D market ever was, and as a result they have more resources.</p>

<p>It's the i386 all over again. The "non-professional" 3D solutions are
already getting level with the "professional" ones.</p>

</quote>

<p>Raul Miller also replied to David, asking if there were benchmarks comparing
on-board and off-board state swapping. David mentioned one card that would
cache the states of multiple rendering contexts, and described the page fault
handler for such a card. Linus took up the new subject, saying:</p>

<quote who="Linus Torvalds">

<p>I don't understand why people are so hung up about page faults.</p>

<p>I think it's ENTIRELY because of historical baggage, and the particular
implementation under Irix.</p>

<p>What I'm surprised about is that nobody seems to just come out and say:</p>

<blockquote>

<p> Page faults are BAD. Playing with the page tables is EXPENSIVE. Page faults
 fundamentally are NOT thread-safe, because page tables are fundamentally
 shared among threads.</p>

</blockquote>

<p>Ok. Nobody else said it, so now I have.</p>

<p>YOU SHOULD NOT PLAY MM GAMES! They do not scale in SMP, they do not scale
with threads, and the costs of missing are absolutely huge. The whole thing
is also extremely hard to debug, and implies a much tighter coupling between
the kernel and the X server than there should ever be!</p>

<p>You can do a _regular_ SMP-safe lock with _real_ thread safety and no
faulting behaviour in a few instructions. We're talking maybe 50 cycles here
- about 40 cycles for the actual two locked instructions, and a very
generous 10 cycles to check whether you are the old owner and going to the
switch routine if not).</p>

<p>Note that IF you have to switch contexts, the regular lock will be a hell of
a lot faster than taking page faults, so let's just ignore that case: page
faulting obviously loses, and there's no way anybody can seriously claim
anything else.</p>

<p>So let's look at the no contention case, where you got the lock, and
everything was fine. You spent 50 cycles on verifying it. Big deal. That's
50 _CPU_ cycles. Not memory cycles, not PCI cycles. In exchange for those
50 cycles you get:</p>

<p>

<ul>

<li>you can debug things</li>

<li>it's thread-safe</li>

<li>you have much lower latency if you DO get graphical context switching,
and you can actually test it in user space first!</li>

</ul>

</p>

<p>Oh. And btw. It's already been done. See the 3dfx driver.</p>

<p>So forget this playing with mmap and page faults. Use mmap() as a way to
access the physical hardware, but NOT as a way to switch contexts.
Ok?</p>

</quote>
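
<p>The lock Linus describes might look roughly like the following C11
sketch. It is an illustration under stated assumptions and not the 3dfx/DRI
code: the hw_lock structure, its field names and switch_context() are
hypothetical, and the contended path simply spins where a real driver would
enter the kernel to sort out the clash.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word shared by every client of the card. */
struct hw_lock {
    atomic_int held;   /* 0 = free, 1 = taken */
    int last_owner;    /* context whose state is on the card; guarded by held */
    int switches;      /* how many times the slow path ran */
};

void hw_lock_init(struct hw_lock *l)
{
    atomic_init(&l->held, 0);
    l->last_owner = -1;
    l->switches = 0;
}

/*
 * Hypothetical slow path: a real implementation would save the previous
 * owner's graphics state and load ours. Here it only records the change.
 */
static void switch_context(struct hw_lock *l, int me)
{
    l->last_owner = me;
    l->switches++;
}

void hw_lock_acquire(struct hw_lock *l, int me)
{
    assert(me >= 0);
    /* First atomic operation: take the lock (spinning is an assumption). */
    while (atomic_exchange_explicit(&l->held, 1, memory_order_acquire))
        ;
    /* The cheap check: only switch if someone else used the card last. */
    if (l->last_owner != me)
        switch_context(l, me);
}

void hw_lock_release(struct hw_lock *l)
{
    /* Second atomic operation: drop the lock. */
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```
]]></pre>

<p>When the same client takes the lock repeatedly, switch_context() never
runs, and the cost is the two atomic operations plus one comparison. That is
the no-contention case Linus prices at about 50 cycles.</p>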

</section>

<section
  title="Preparing For Code Freeze"
  subject="Ok, making ready for pre-2.4 and code-freeze.."
  archive="http://kernelnotes.org/lnxlists/linux-kernel/lk_9912_02/msg01344.html"
  posts="167"
  startdate="14 Dec 1999 00:00:00 -0800"
  enddate="23 Dec 1999 00:00:00 -0800"
>
<topic>Code Freeze</topic>
<topic>Disk Arrays: RAID</topic>
<topic>Disks: SCSI</topic>
<topic>FS: NTFS</topic>
<topic>FS: procfs</topic>
<topic>I2C</topic>
<topic>Kernel Release Announcement</topic>
<topic>Networking</topic>
<topic>PCI</topic>
<topic>SMP</topic>

<mention>Tigran Aivazian</mention>
<mention>Roman Zippel</mention>
<mention>Ingo Molnar</mention>
<mention>Jes Sorensen</mention>
<mention>Dominik Kubla</mention>
<mention>Henrik Olsen</mention>

<p>Linus Torvalds announced Linux 2.3.33, saying:</p>

<quote who="Linus Torvalds">

<p>After doing too many last-minute
updates of critical code that we really shouldn't have left this late (Both
the mm layer and the SCSI layer was changed quite a lot: we'll be better for
it, but I'd have been happier if we hadn't needed to), I'm going to calm
things down. I've released 2.3.33 which fixes a few smaller problems with
2.3.32, and I'll let it quiet down a bit for a while.</p>

<p>We're obviously not going to have a 2.4 this millenium, but let's get the
pre-2.4 series going this year, with the real release Q1 of 2000.</p>

</quote>

<p>Henrik Olsen, Tigran Aivazian, Dominik Kubla, Henning P. Schmiedeh, Catalin
BOIE, and Ron Flory all pointed out that the millennium actually wouldn't
pass until next year. Ron said, <quote who="Ron Flory">Contrary to popular misconceptions, the year 2000 is
actually the LAST year of this millennium. After Dec 31 1999 we will have
completed 1999 full years. Jan 1 2001 is the first day of the next
millennium.</quote> Linus replied:</p>

<quote who="Linus Torvalds">

<p>Contrary to popular
misconceptions, PEOPLE DON'T CARE!</p>

<p>The fact that our forefathers were Pascal-programmers, and started counting
from one does not mean that we have to continue that mistake forever. We've
since moved on to C, and the change from 1999-&gt;2000 is a lot more
interesting in a base-10 system than the change from 2000-&gt;2001.</p>

<p>The reference point of our timekeeping is based on an event where the
uncertainty about the timing is much more than a year, and was made up
several hundred years AFTER the fact. As such, if you want to be a stickler,
you might as well say that the next millenium may have started several years
ago.</p>

<p>So please stop sending me email.  You don't have to celebrate if you don't
want to. But let the rest of the world who doesn't care about silly
irrelevant details (what's a millenium to you anyway) just go on with our
life.</p>

<p>NEXT year I may agree with you. I'll join the ranks of people with no life
but the ability to count in another 360 days or so. But that's mainly in
order to have an excuse to go out to town.</p>

</quote>

<p>Riley Williams added:</p>

<quote who="Riley Williams">

<p>For those of you
interested in this:</p>

<p>

<ul>

<li>According to the bible, Jesus was born when Herod Agrippa was ruling in
Judea. Herod Agrippa is documented has having died in 4 BC.</li>

<li>According to the bible, Joseph and Mary had gone to the city of
Bethlehem because of the first Great Roman Census. History records that
these censii were held every 14 years with the second being in AD 7 and the
third in AD 21.</li>

</ul>

</p>

<p>These two facts imply that the second millenium ended on 31st December 1993,
and we have been in the third millenium for nearly six years now.</p>

</quote>

<p>David S. Miller had a more on-topic reply, saying, <quote who="David S.
Miller">I'm still a few weeks away from getting my
platforms working again, currently I'm wedged at 2.3.27 with some weird
perhaps Sparc-specific issue that is preventing user apps from stating up
after boot. Could be the new zone code, who knows, no hard clues... been on
this for 4 days now.</quote></p>

<p>Alexander Viro added:</p>

<quote who="Alexander Viro">

<p>With the filesystems/VFS
situation looks so (and I'm not going into IWBNI area, only code that needs
fixing):</p>

<p>

<ul>

<li>ADFS, AFFS, HFS, NTFS, QNX4 - blatantly broken.</li>

<li>UFS - needs cleanup/fixes.</li>

<li>loopback, ramdisk, raid - more or less broken.</li>

<li>CODA - will need serious testing after the expected large patch.</li>

<li>procfs - needs decision on namespace policy/namespace rework. And proper
dealing with races on module offloading, but that's old story.</li>

</ul>

</p>

<p>That's just the most pressing stuff. I'm going to fork the -bird (aka
VFS-CURRENT) after 2.4.0 and will feed the stable/well-tested stuff back
into the main tree (with intention to collapse it after 2.5 will open), but
it would be nice if we could fix at least the stuff mentioned above _before_
2.4.</p>

</quote>

<p>Jes Sorensen mentioned that he thought Roman Zippel was working on the AFFS
code. Alan Cox also replied to Alexander. He said Ingo Molnar was working on
RAID and PIII patch merging. He added that CODA could wait until later.
Regarding the /proc issues, he went on, <quote who="Alan Cox">We have a ton
of other user exploitable races with module load/unload including basic
stuff like open which with a bit of care you can use to crash the machine as
any user. Procfs is only a part of this.</quote> He continued:</p>

<quote who="Alan Cox">

<p>Taking my working list the key items seem to be</p>

<p>

<ul>

<li>About 70% of device drivers dont have the setup code fixed. In many
cases that makes them unusable non modular</li>

<li>About 1/3rd of the ISA device drivers using memory mapped IO are still
broken</li>

<li>AF_UNIX sockets are broken, and there are numerous IP layer bugs only
fixed in DaveM's trees that have to be fixed, either by extracting them or
going softnet.</li>

<li>The SPX sockets are totally broken</li>

<li>The PIII/Athlon/K6/Pentium memcpy etc acceleration code needs merging</li>

<li>NCR5380 is broken for SMP</li>

<li>AHA152x is broken for SMP</li>

<li>Anyuser can crash the code due to module races (eg open)</li>

<li>RAID 0.90 needs merging</li>

<li>Vmalloc needs to take flags for DMA optionally (for 32bit PCI scatter
gather)</li>

<li>Getblk/mark buffer races in most file systems</li>

<li>Protection for inode size fields is wrong right now</li>

</ul>

</p>

<p>Less critical stuff</p>

<p>

<ul>

<li>Switching to the new i2c code core and the new bttv driver. Affects only
some tv cards so is ok</li>

<li>Per process rt signal queue limits</li>

<li>Syncppp should use the new generic ppp code</li>

<li>Back merge 2.2.13/2.2.14 fixes</li>

</ul>

</p>

<p>I have the i2c stuff sorted mostly (thanks to Gerd and co), the 2.2.13/14
stuff I will do next week and is all bug fixes, Im still trying to stomp all
the isa memory mapped I/O issues. I'll also sort the vmalloc cases out.</p>

<p>Most of the above is bug fixing or driver stuff so is post freeze work. The
raid and PIII stuff are not. I'd also prefer you to look at the core softnet
changes and say yes/no, then draw your line the side of it you choose. The
softnet work is the other 50% of the scaling work, without it 2.4 wont scale
much better than 2.2 for real world situations. That bothers me.</p>

<p>I've got a list of other stuff (Erez stackable fs, telephony API work,
performance counters, lm-sensors, ibcs) that are in the nice but so be it
category. I have no problem with those landing outside of 2.4.0 or staying
as add ons. Skipping softnet though would I think be a mistake.</p>

</quote>

</section>

<section
  title="ReiserFS Or Ext3 In Standard Kernel?"
  subject="JFS"
  archive="http://kernelnotes.org/lnxlists/linux-kernel/lk_9912_04/msg00055.html"
  posts="4"
  startdate="17 Dec 1999 00:00:00 -0800"
  enddate="23 Dec 1999 00:00:00 -0800"
>
<topic>Disk Arrays: LVM</topic>
<topic>FS: ReiserFS</topic>
<topic>FS: XFS</topic>
<topic>FS: ext2</topic>
<topic>FS: ext3</topic>
<topic>Ioctls</topic>

<mention>Stephen C. Tweedie</mention>
<mention>Theodore Y. Ts'o</mention>

<p>Ted Sikora knew that folks were talking about a journalling filesystem, and
asked if ReiserFS or ext3 would eventually be included in the Linux tree.
Stephen C. Tweedie, author of ext3, suggested that it would be best to have
both, and Hans Reiser, author of ReiserFS, agreed. In a different post, Hans
added, <quote who="Hans Reiser">We are working
through the holidays to port reiserfs to 2.3. Our not finishing the port is
what is holding up our introduction into 2.3. We are late but working hard
at it....</quote> EOT.</p>

<p>The possibility of an Ext3 was barely a glimmer in a developer's eye back
in <kcref subject="Re: linux capabilities and ACLs" startdate="03 Feb 1999 00:00:00 -0800"></kcref><!-- kt19990211_5.html#1 -->. The flamewar produced by
Stephen's early journalling work was covered in <kcref subject="fsync on large
files" startdate="12 Feb 1999 00:00:00 -0800"></kcref><!-- kt19990224_7.html#6
-->. Then in <kcref subject="softupdates and ext2" startdate="31 Mar 1999 00:00:00 -0800"></kcref><!-- kt19990422_15.html#2 -->, the possibility of
including 'capabilities' in ext3 was discussed, including a long analysis by
Theodore Y. Ts'o. In <kcref subject="[OT] SGI to OpenSource XFS" startdate="20 May 1999 00:00:00 -0800"></kcref><!-- kt19990603_21.html#2 -->, SGI's promise
to release XFS under an Open Source license made some folks wonder whether any
other journalling filesystem should be bothered with. This article was also
Kernel Traffic's first mention of ReiserFS. ReiserFS was next mentioned in
<kcref subject="RFC: from FIBMAP to FIONDEV" startdate="11 Jun 1999 00:00:00 -0800"></kcref><!-- kt19990623_24.html#6 -->, in the context of migrating the
kernel from the FIBMAP to FIONDEV ioctl. Next, in <kcref subject="[PATCH]
putting old-style lock handling back into 2.2.10" startdate="04 Jul 1999 00:00:00 -0800"></kcref><!-- kt19990715_27.html#10 -->, ReiserFS was listed
as a valuable 2.2.x feature, and a good reason to upgrade from 2.0 or 1.2;
its next appearance was in <kcref subject="vm kills processes in our 2.3.12
port of reiserfs - what was the story on the changes to mark_buffer_dirty()
and the too many dirty buffers issue?"  startdate="29 Aug 1999 00:00:00 -0800"></kcref><!-- kt19990913_34.html#4 -->, where Hans claimed it was
almost ready for inclusion in the 2.3.x series. Then ext3 came back under
discussion in <kcref subject="Ext3 filesystem info?"  startdate="16 Sep 1999 00:00:00 -0800"></kcref><!-- kt19991011_38.html#2 -->, where it was
revealed that Stephen had released version 0.01 for kernel 2.2.2; both ext3
and ReiserFS then came up in <kcref subject="jfs/linux" startdate="28 Oct 1999 00:00:00 -0800"></kcref><!-- kt19991115_43.html#2 -->, in which (aside
from another historical summary like this one) it came out that the latest
SuSE was shipping with ReiserFS. In the same issue, ext3 also got a brief
mention in <kcref subject="Linux Buffer Cache Does Not Support Mirroring"
startdate="29 Oct 1999 00:00:00 -0800"></kcref><!-- kt19991115_43.html#5
--> as part of a different argument. Issue #44 had an ext3 status report in
<kcref subject="ReiserFS" startdate="08 Nov 1999 00:00:00 -0800"></kcref><!--
kt19991122_44.html#2 -->, and some possible ReiserFS licensing conflicts in
<kcref subject="Reiserfs licencing - possible GPL conflict?"  startdate="08 Nov 1999 00:00:00 -0800"></kcref><!-- kt19991122_44.html#4 -->. They both came
up again in <kcref subject="Re: Announce: LVM Patch against kernel 2.3.28"
startdate="19 Nov 1999 00:00:00 -0800"></kcref><!-- kt19991206_45.html#11
-->, in the context of folding them (and LVM) into the 2.4 kernel. Finally,
in <kcref subject="Oops with ext3 journaling" startdate="01 Dec 1999 00:00:00 -0800"></kcref><!-- kt19991220_47.html#3 -->, there was some discussion of
compatibility between ext2 and ext3.</p>

</section>

<section
  title="How To Be A Kernel Hacker"
  subject="Kernel design"
  archive="../unavailable.html"
  posts="4"
  startdate="18 Dec 1999 00:00:00 -0800"
  enddate="21 Dec 1999 00:00:00 -0800"
>

<p>Sherif Abou Seda asked how to get the "algorithms of the kernel design."
Bill Wendling replied, <quote who="Bill Wendling">First, get a scalpal and
a saw. Then, once you have the kernel hacker strapped down to the table and
under anesthitized, you can remove his/her brain. Of course, you'll have to
translate the wet-ware to some other form of your choosing...</quote></p>

<p>Rik van Riel also replied to Sherif, saying, <quote who="Rik van
Riel">Algorithms for designing the kernel (for doing
the actual design and stuff) vary from developer to developer, but most seem
to involve staring, deep thought and lots of caffeine. The "graduate student
algorithm" should prove a sufficiently close approximation :)</quote> and
Riley Williams added:</p>

<quote who="Riley Williams">

<p>The algorithms I know
about seem to be along the lines of the following:</p>

<p>

<ol>

<li>Read a dozen or so pages of source code looking for whatever bug one is
after. Drink several large mugs of coffee whilst doing so.</li>

<li>Toss a coin. If heads, go to step 3; if tails, go to step 4; if it lands
on its edge, go to step 5.</li>

<li>Note a bug one wasn't looking for, and fix it. Go to step 1.</li>

<li>Overlook the bug one was looking for. Go to step 1.</li>

<li>Have a brainwave, and go directly to the bug and fix it, even though it
was nowhere near the code one was looking at. Go to step 1.</li>

</ol>

</p>

<p>Put any timescale you care to against that lot.</p>

</quote>

</section>

<section
  title="Protecting Permissions In NFS"
  subject="2.3.30 linuxNFS import is broken (Screwed up NFS/RPC credentials)"
  archive="../unavailable.html"
  posts="18"
  startdate="20 Dec 1999 00:00:00 -0800"
  enddate="22 Dec 1999 00:00:00 -0800"
>
<topic>FS: NFS</topic>
<topic>POSIX</topic>

<mention>Alexander Viro</mention>
<mention>Linus Torvalds</mention>
<mention>Horst von Brand</mention>

<p>Trond Myklebust noticed that the API for readpage() and writepage() had been
changed to no longer pass the file pointer. He explained, <quote who="Trond
Myklebust">This screws up any attempt to cache the RPC credentials at file
opening, since there's no longer any way to pass the credential down to the
read/write.</quote> The result was a broken 2.3.30 LinuxNFS tree. Alexander
Viro asked where the file pointer was used; he said the behavior seemed okay
to him. Trond elaborated:</p>

<quote who="Trond Myklebust">

<p>The problem is that NFS
relies on the user sending the RPC authentication each and every time we
access data on the server. In order not to get a permissions error suddenly
if the user changes euid while s/he is reading/writing to the open file, we
therefore want to use the same RPC authentication info throughout the file's
lifetime. Ideally that means taking the RPC auth info that was valid when
opening the file (since this is more or less in line with a POSIX
filesystem's behaviour with permission checking at file open only) and
caching it somewhere.</p>

<p>The most practical way of implementing this policy is therefore to hide the
RPC auth in the file descriptor structure (I use the private data field),
and pass that info via the file pointer to readpage/writepage/whatever else
needs it.</p>

</quote>

<p>He added that Linus Torvalds had rejected the patch for this as not yet
clean enough.</p>

<p>Horst von Brand objected that if several people open the same file, the
permissions might have changed in between, forcing the system to cache the
RPC auth information once for each open(), which would get complicated.
Trond felt this was a non-problem: he explained that the file structure was
allocated when the file was opened, and was not shared between users. Each
user could therefore cache their own set of permissions. This was why he
preferred to pass the file structure rather than a dentry (directory entry
structure), which would be common to all users and would prevent this
treatment.</p>
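
<p>Trond's argument can be illustrated with a small user-space model. This
is a sketch under stated assumptions, not the LinuxNFS code: every toy_ name
below is hypothetical, the credential is reduced to a single uid, and
toy_euid stands in for the calling process's effective uid. The point is
only that the per-open file structure has room for one cached credential per
open(), while the dentry is shared by every opener.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the RPC authentication info. */
struct toy_cred {
    int uid;
};

/* Shared by every opener of the same file. */
struct toy_dentry {
    const char *name;
};

/* Allocated once per open(), so never shared between users. */
struct toy_file {
    struct toy_dentry *dentry;
    void *private_data;   /* per-open slot used to cache the credential */
};

/* Hypothetical stand-in for the caller's effective uid. */
int toy_euid;

/* Capture the credential that is valid at open time. */
int toy_open(struct toy_file *f, struct toy_dentry *d)
{
    struct toy_cred *cred = malloc(sizeof *cred);
    if (cred == NULL)
        return -1;
    cred->uid = toy_euid;
    f->dentry = d;
    f->private_data = cred;
    return 0;
}

/*
 * Because the file pointer is passed down, the read path can use the
 * credential cached at open time instead of the caller's current euid.
 * Returns the uid that would be sent with the RPC request.
 */
int toy_readpage(const struct toy_file *f)
{
    const struct toy_cred *cred = f->private_data;
    assert(cred != NULL);
    return cred->uid;
}

void toy_release(struct toy_file *f)
{
    free(f->private_data);
    f->private_data = NULL;
}
```
]]></pre>

<p>Two users opening the same dentry each end up with their own cached
credential, and a later change of euid does not affect reads on a file that
is already open. Were only the dentry passed down, toy_readpage() would have
no per-open slot to consult.</p>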

</section>

</kc>

