This is a work in progress. Send your questions, suggestions, and corrections to Zack Brown.
This is very much a personal choice. Any mail reader with good threading should be fine. I use 'mutt', and generally I will read the entire thread through once, and then go back and do the summary.
Emacs and vim both have modes that will colorize XML tags in a convenient way. Other editors can do this as well. In general, any editor that will let you create "hot-keys" to produce XML tags for you, is a fine editor to use. If you have one that is your favorite and you use it for thread summaries, send in your configuration and I'll include it here.
In vim, here is a sample recipe for a <quote> tag:
imap ESCq <quote who=""></quote>ESCzhhhhhhhhhhi
Note that the "ESC" strings actually mean the escape character, and won't be produced if you just type the letters E-S-C into vim. You can produce the escape character alone by typing Ctrl-V Ctrl-[ in vim.
Currently you need a complete KT development environment (a fancy term for a few scripts and makefiles). You can get it at ktpub.tgz. A .deb file (or a couple) is in the works.
There are two parts to the 3-day rule. The primary part is, 'don't summarize threads that have been active within the past 3 days.' This is because if you summarize threads that are still ongoing, you will tend to draw the wrong conclusions, and thus get a wrong idea of what the thread is really saying. So the summary you write will be less good than if you'd waited. At the same time, since the thread is ongoing, there may be hundreds of new posts to be summarized for the next issue. In this case, the parts of the discussion may sprout from many different places in the thread, and there will be no good way to connect all the tendrils back to their roots. The new summary will tend to end up disjointed and broken. After several weeks of covering the new parts of a thread each week, it is almost impossible to make real sense out of it.
The other, less important part of the 3-day rule is, 'don't summarize threads that have been inactive since 3 days before the deadline of the previous issue.' This ensures that old threads won't be covered, and each issue will be fairly current. Covering the occasional old thread is OK, but you should be aware that it's an exceptional case.
Note that only the latter, less important part of the 3-day rule refers to the deadline of any given issue. The primary part, about not covering threads that have been active recently, can be true at any time. If you notice that a thread has been inactive for at least 3 days, you should feel free to cover that thread even if there are still several days before the next issue's deadline. That thread is officially over, so you don't have to worry about missing any future posts.
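Both parts of the 3-day rule can be written down as a small check. The following is only a sketch: the function names are made up for illustration, and the handling of a post that falls exactly on a boundary is an assumption, since the rule doesn't say.

```python
from datetime import datetime, timedelta

QUIET_PERIOD = timedelta(days=3)  # the "3 days" in the rule


def is_finished(last_post, now):
    """Primary part: no posts in the thread for at least 3 days."""
    return now - last_post >= QUIET_PERIOD


def is_current(last_post, previous_deadline):
    """Secondary part: the thread was still active no earlier than
    3 days before the previous issue's deadline (boundary is assumed)."""
    return last_post >= previous_deadline - QUIET_PERIOD


def may_summarize(last_post, now, previous_deadline):
    """A thread qualifies when it is both finished and still current."""
    return is_finished(last_post, now) and is_current(last_post, previous_deadline)


now = datetime(2000, 9, 20)
previous_deadline = datetime(2000, 9, 14)
print(may_summarize(datetime(2000, 9, 15), now, previous_deadline))  # quiet for 5 days
print(may_summarize(datetime(2000, 9, 19), now, previous_deadline))  # still ongoing
print(may_summarize(datetime(2000, 9, 1), now, previous_deadline))   # too old
```

Note that only is_current() looks at a deadline; is_finished() can be evaluated at any time, which matches the point made above.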
This will depend somewhat on the KC editor, and what their preferences are. They have to read and revise your submissions, and then send you feedback (maybe even diffs), so you should format submissions to make this as easy as possible for them.
In general, you should keep the lines shorter than 80 characters, and put blank spaces between paragraphs.
KC issues are written in XML, then processed by a compiler into output suitable for publication. There are a number of things to keep in mind about that:
There are only 4 non-HTML tags you need to know about: <section>, <quote>, <kcref>, and <editorialize>. You can also use HTML, but in order to maintain XML compliance, you should close all your tags (note: even though <p /> is legal HTML and XML, you should still start all paragraphs with <p> and end them with </p>).
A full submission consists of a single <section> tag. The other tags can appear within the <section> tag.
<section
title="The title you give to the section you're writing"
author="your name here"
contact="http://yourURL(or mailto:)" (optional)
subject="The full subject line from the mailing list"
archive="archive location of the first post of the thread"
posts="number of posts in the thread"
startdate="the date the thread began"
enddate="the date the thread ended"
>
summary goes here
</section>
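Before sending a submission in, it can be handy to check that none of the attributes listed above were forgotten. Here is a minimal sketch using Python's standard XML parser. It assumes every attribute except "contact" is required (only "contact" is marked optional above), and the function name and sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: everything except "contact" is required.
REQUIRED = ("title", "author", "subject", "archive",
            "posts", "startdate", "enddate")


def missing_section_attributes(xml_text):
    """Return the required attribute names absent from the <section> tag."""
    root = ET.fromstring(xml_text)
    if root.tag != "section":
        raise ValueError("a submission should be a single <section> element")
    return [name for name in REQUIRED if name not in root.attrib]


# Made-up sample submission with the enddate left out on purpose.
sample = """<section title="Example" author="Some One"
  subject="Example subject" archive="http://example.invalid/first-post"
  posts="12" startdate="13 Sep 2000 11:02:43 -0700">
<p>summary goes here</p>
</section>"""

print(missing_section_attributes(sample))  # ['enddate']
```

Because the parser rejects anything that is not well-formed XML, unclosed tags will also show up here as a parse error.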
The <quote> tag is used to quote a message from the thread being summarized. Each <quote> tag must contain a "who" attribute, to indicate the full name of the person being quoted.
The <quote> tag can be processed in two different ways. If it is embedded within a single paragraph (i.e. if there are no paragraphs or other line breaks within it) it is called a single-paragraph quote and is processed to flow within the surrounding paragraph. If it consists of multiple paragraphs of quoted text, it is called a multi-paragraph quote and is set off from the rest of the text via indentation.
Single-paragraph quotes are kept within the paragraph that surrounds them. They change their color to indicate that they are quotes, but aside from that, they are just regular text, so you can write around them fairly easily. For example, you could write, "Some One posted a patch and said, <quote who="Some One">Here is a patch for your pleasure,</quote> and Some One Else replied with some technical problems." In that passage, the quoted material flows within the summary. Notice the commas, both at the beginning and the end of the quote, which allow the summary to continue unbroken through its idea.
Multi-paragraph quotes are separated from the surrounding paragraphs and indented. You don't need to worry about doing the indentation yourself, but you should make sure to separate the quoted text from your own writing, and use the <p> tag properly. For instance, the following text is correctly formatted:
<p>Some One announced a new project for locating bad blocks on CDROMs and deriving the full amount of data from stored groupings over the rest of the CDROM. She said:</p>
<quote who="Some One">
<p>It's not necessary to lose data even from scratched CDROMs. As long as most of the disk is intact, you should be able to retrieve all your data, regardless of what part of the CD was damaged.</p>
<p>With this patch, your iso image is transformed into a redundant warehouse consisting of 600 separate pieces. Any 500 will be enough to restore your data. I'm working on bringing that number down to 448, but that will take more coding.</p>
</quote>
<p>Someone Else argued that this could never be made efficient, but A Third Person felt that this would best be the judgement of the users, more than anyone else.</p>
In this case, notice how the quote surrounds multiple paragraphs, but none of the paragraphs cross from one side of the <quote> tag to the other. They all fall squarely either inside or outside the quote.
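The distinction between the two kinds of quote can be sketched in a few lines. This is not the compiler's actual code, just an illustration of the rule as described; it assumes "paragraphs or other line breaks" means <p> or <br /> elements inside the quote, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def quote_kind(quote):
    """Classify a <quote> element by whether it contains <p> or <br />."""
    has_break = any(child.tag in ("p", "br")
                    for child in quote.iter() if child is not quote)
    return "multi-paragraph" if has_break else "single-paragraph"


doc = ET.fromstring(
    '<section>'
    '<p>He said, <quote who="Some One">here is a patch,</quote> and left.</p>'
    '<quote who="Some One"><p>First paragraph.</p><p>Second one.</p></quote>'
    '</section>'
)
kinds = [quote_kind(q) for q in doc.iter("quote")]
print(kinds)  # ['single-paragraph', 'multi-paragraph']
```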
The "who" attribute should also be handled carefully. The information in the "who" attribute will be used by the database to construct its indices, so you should try to use the full name of the person being quoted. If you don't know the person's full name, then try to avoid quoting them, rather than put a partial name in the database. If you really want to quote them, send them email asking for their full name.
Every time someone new is quoted, that person (as recorded in the "who" attribute) is added to the database, and the compiler creates index pages for them. These pages are linked from each issue, based on where that name appears in the non-marked-up text. In other words, for each section that is processed by the compiler, the compiler searches for all names of people that have been quoted. For each name it finds, it creates a link to that person's index file.
For this reason, it's important to use a person's full name, the first time you mention them in a given section. After that, you can refer to them by first name for the rest of the section (of course, the "who" attribute should always have the full name). So you might do something like this:
<p>Daniel Bowman announced a new library to take advantage of WinModems, and Peter MacAbee pointed out that such a library had existed already for quite a while. Daniel replied that his library was different, and would allow the WinModem to actually behave as a modem, instead of merely a sound card. Peter was impressed, but Rrlondai Stunsanscin thought the two projects should be merged, because, <quote who="Rrlondai Stunsanscin">at this point it would amount to no more than a fork in the codebase.</quote> Rrlondai went on to suggest creating a new mailing list, or simply joining the existing list, but Daniel objected that he had already tried to work with the other developers and found them to be indifferent to his patches. He felt that in this case, a fork was warranted, since it was the only way to produce the functionality he was after.</p>
<p>Peter and Rrlondai agreed with this, but Sandy Turrell, the maintainer of the 'WinSound' library, objected that Daniel had never made adequate proposals, but had posted a few vague messages to the 'WinSound' mailing list and then rushed off to start his own project. She went on, <quote who="Sandy Turrell">I don't think you made enough of an effort to justify a fork. The WinSound developers have nothing against extending the library to allow true modem functionality, we just have questions about your approach.</quote> Daniel replied that this hadn't been his impression, and added that Sandy had been very dismissive in private emails. At this point, a lot of people piled into the thread and urged Daniel and Sandy to find a peaceful way to integrate both projects together. Eventually, Daniel seemed willing to give it a try, and the thread petered out.</p>
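To see why the full name matters, here is a rough sketch of the kind of name matching described above. It is an illustration only, not the real compiler: it gathers the "who" attributes from a section, then counts where each full name occurs in the section's plain text. The function names are made up.

```python
import re
import xml.etree.ElementTree as ET


def quoted_names(section):
    """Collect the full names recorded in "who" attributes."""
    return {q.get("who") for q in section.iter("quote") if q.get("who")}


def name_mentions(section, known_names):
    """Count occurrences of each known full name in the text with markup removed."""
    text = "".join(section.itertext())
    return {name: len(re.findall(re.escape(name), text))
            for name in sorted(known_names)}


section = ET.fromstring(
    '<section><p>Sandy Turrell objected. She went on, '
    '<quote who="Sandy Turrell">I don\'t think so.</quote> '
    'Sandy later agreed.</p></section>'
)
mentions = name_mentions(section, quoted_names(section))
print(mentions)  # {'Sandy Turrell': 1}
```

Only the first mention, which uses the full name, is found. The later "Sandy" on its own is not, which is fine as long as the full name appears once in the section.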
To refer to an old issue, or an issue from a different newsletter, the <kcref> element mimics <section>'s attributes, though for <kcref> all attributes are optional. Also, leaving <kcref> as an empty element (no text between <kcref> and </kcref>) will cause the compiler to provide a good default text, so you should only put text in there if you really need to. A typical example of <kcref> usage is:
This came up in <kcref subject="the thread's subject line" startdate="the starting date"></kcref>, where so'n'so seemed to like the proposal...
Or:
This came up <kcref subject="the thread's subject line" startdate="the starting date"> a few weeks ago </kcref>, where so'n'so seemed to like the proposal...
You can also use any other attributes from the <section> tag, though "subject" and "startdate" are usually enough.
The <kcref> tag was designed to allow references between sections of a single issue, between issues of a given newsletter, and between newsletters in the KC project; yet also allow massive revisions of all issues, without breaking the reference.
Every new newsletter issue is parsed by a script, and the information from its <section> tags is put into a database along with other relevant data. When the compiler processes an issue, every time it encounters a <kcref> tag, it queries the database for equivalent <section> data, finds the proper reference, and inserts that into the issue. If any back issues are revised, the database is reconstructed and all changes republished, so inter-issue references remain accurate.
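The lookup can be pictured roughly like this. The sketch below stands in for the real database with a plain dictionary keyed on subject and start date; the record layout, the function name, and the issue and section numbers are all invented for illustration.

```python
from typing import NamedTuple, Optional


class SectionRecord(NamedTuple):
    issue: int
    section: int


# Hypothetical stand-in for the database built from <section> tags.
# The issue and section numbers here are invented.
SECTIONS = {
    ("Dates again.", "14 Sep 2000 08:18:14 -0700"): SectionRecord(7, 3),
}


def resolve_kcref(subject: str, startdate: str) -> Optional[str]:
    """Return default link text for a kcref, or None if nothing matches."""
    record = SECTIONS.get((subject, startdate))
    if record is None:
        return None
    return f"Issue {record.issue}, Section {record.section}"


print(resolve_kcref("Dates again.", "14 Sep 2000 08:18:14 -0700"))
print(resolve_kcref("No such thread", "1 Jan 2000 00:00:00 -0700"))
```

Since the reference is stored as subject and date rather than as an issue number, rebuilding the dictionary after a revision is enough to keep every reference pointing at the right place.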
In terms of "writing around" kcrefs, there are really two ways. The first is to put some text inside the kcref itself, and that text will be the link to the article referenced. So if you did <kcref subject="Some subject line" startdate="10 Oct 2000 00:00:00 -0700">something like this</kcref>, it would display as, "So if you did something like this, it would display as," with the "something like this" part rendered as a link.
However, by doing that, you lose out on the information the compiler would otherwise generate, such as issue number, section number, section title, etc.; so normally it's best to prefer a construction like, "this was covered in <kcref subject="Some subject line" startdate="10 Oct 2000 00:00:00 -0700"></kcref>, several weeks ago." The trick, when trying to write around a kcref, is to treat the entire kcref as if it said "Issue X, Section Y". This will always result in grammatically correct output from the compiler.
Currently, even for empty kcrefs (i.e. no text between the <kcref> and the </kcref>), you should not use <kcref subject="" etc. />. Always explicitly close the tag.
If you want to speak for a while without worrying about objectivity, use the <editorialize> tag. No need to put it in parentheses or anything, just do something like
<editorialize who="your name here">My own personal opinion is that this package should go in, though perhaps in a form more palatable to the upstream developers.</editorialize>
Yes. <QUOTE WHO="Some Name"> is not the same as <quote who="Some Name">. Kernel Traffic standardizes on lowercase tags. This applies to embedded HTML as well.
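If you want to catch stray uppercase tags before submitting, something like the following would do. It is only a sketch built on regular expressions, with made-up names; it assumes attribute values are double-quoted and contain no ">" character.

```python
import re

TAG_RE = re.compile(r"<\s*/?\s*([A-Za-z][\w-]*)([^<>]*)>")
ATTR_RE = re.compile(r"([A-Za-z][\w-]*)\s*=")


def non_lowercase_names(text):
    """Return tag and attribute names that are not entirely lowercase."""
    offenders = []
    for tag_name, rest in TAG_RE.findall(text):
        # Blank out attribute values so their contents are not inspected.
        rest = re.sub(r'"[^"]*"', '""', rest)
        names = [tag_name] + ATTR_RE.findall(rest)
        offenders.extend(name for name in names if name != name.lower())
    return offenders


print(non_lowercase_names('<QUOTE WHO="Some Name">text</QUOTE>'))
print(non_lowercase_names('<quote who="Some Name">text</quote>'))
```

The first call reports QUOTE, WHO, and the closing QUOTE; the second reports nothing.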
The basic and best guideline for picking threads is, "pick the threads that hold some interest for you, and summarize the parts that interest you." If you don't find part of a thread interesting enough to summarize, don't summarize that part.
Another way to think of it might be to look at a thread as making various points. If you like some of those points, you can use the thread to make those points yourself, in the form of a summary.
For technical constraints on which threads to consider summarizing, see Choosing Which Threads To Cover in the KC Authorship page.
You can use pretty much any HTML you want, but there are some exceptions and other things to bear in mind.
Quotes can be very useful in showing what went on in a thread. They reveal the personalities of the people speaking, they set a mood, and they also express the ideas someone is aiming for. At the same time, in a summary there is the danger that quotes can seem to hang in thin air with no context. How and when to quote is an interesting part of the overall problem of thread summaries.
In general, no. The basic rule is, someone should be able to cut and paste quoted text into the search engine of a mailing-list archive site, and match the post they're trying to find. In that case, a spelling or grammar error will tend to give them a dead-on match, while a spelling or grammar correction will never match the post they want.
On the other hand, sometimes an error in quoted text is so glaring that it makes the text ambiguous and harder to read. In those cases, it's preferable to silently fix the error.
In your own text, on the other hand, spelling should be as good as you can conveniently make it.
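The cut-and-paste test can be approximated mechanically. The sketch below checks whether quoted text still appears word for word in the original post, ignoring only differences in line wrapping. The function names and sample text are made up, and a real archive search engine may of course match differently.

```python
def normalize(text):
    """Collapse runs of whitespace so re-wrapped lines still compare equal."""
    return " ".join(text.split())


def is_verbatim(quote_text, original_post):
    """True if the quote appears word for word in the original post."""
    return normalize(quote_text) in normalize(original_post)


post = """Hi folks,
I did not recieve any reply to my
last patch, so here it is again."""

print(is_verbatim("I did not recieve any reply to my last patch", post))  # True
print(is_verbatim("I did not receive any reply to my last patch", post))  # False
```

The quote that keeps the poster's misspelling still matches the post; the corrected one no longer does.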
This is a judgement call, though in general you shouldn't do it without reason. If you can turn a multi-paragraph quote into a single-paragraph quote and still keep it looking good, that can be a good thing because it allows you to use real paragraph breaks more effectively in your own text. But turning several larger or even medium sized paragraphs into one big one can harm the intentions of the original author. After all, they may be using their own paragraph breaks effectively too.
In general, if you can say something better and more concisely than the person posting the email, you should paraphrase them. If they speak well and clearly, and don't take up more space than they need, quote them.
Paraphrasing is generally to be preferred if possible, because it tends to be easier to read, and the different points of view can be more clearly linked, than if the reader is just given the raw quote to sift for themselves.
Sometimes paraphrasing is not the best thing, though. If there is a particular significance to a given exchange, individual voices can become very important, and go beyond the bare information of the post. Again, someone might also speak well, to the point where a paraphrasing would do no justice, and at that point a quote is appropriate.
Generally only what is relevant. There's usually no need to include the "hi folks" at the top, or the "have a good fortnight" at the bottom, and there is often no need to include most of the rest of the post either.
If you want to quote two parts of a single post, you should never silently skip over the parts you don't want. Always give at least some indication that you are skipping over a portion of the text. Some standard methods are:
<quote who="Some One">Something they said</quote> [...] <quote who="Same Person">some more stuff they said</quote>
<quote who="Some One">Something they said.</quote> They went on to say, <quote who="Same Person">some more stuff they said</quote>
or some similar construct. If you want to break a multi-paragraph quote into pieces, don't use the '[...]' method, but instead try to talk your way through it, for clarity's sake.
It's good to put unordered lists inside <ul> tags and ordered lists inside <ol> tags (don't forget to close all your <li> tags with matching </li> tags).
You should try to always stay away from <pre> tags. They can interfere with the browser width, making a given issue harder to view on certain setups. You can usually achieve whatever look you're going for with <blockquote>, &nbsp;, and <br /> (don't use <br> alone because it's not XML compliant, and don't forget the space between the <br and the />; otherwise most browsers will misinterpret the tag). Also, remember that <quote> and <blockquote> are unrelated. Multi-paragraph <quote>s will take care of their own indentation, but you can use <blockquote>s to add your extra indentation as needed.
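A quick way to spot these two problems in a draft is a pair of pattern checks. This is just a sketch with made-up names, not part of the KT scripts.

```python
import re

# Each label maps to a pattern that signals the problem.
CHECKS = {
    "<pre> tag used": re.compile(r"<pre\b"),
    "<br> not written as <br />": re.compile(r"<br\s*>|<br/>"),
}


def html_warnings(text):
    """Return the label of every check whose pattern occurs in the text."""
    return [label for label, pattern in CHECKS.items() if pattern.search(text)]


print(html_warnings("first line<br>second line"))
print(html_warnings("first line<br />second line"))
```

The first call reports the bare <br>; the second, which uses <br /> with the space, reports nothing.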
<table> elements are also fine, though other methods should be preferred if possible.
It's also a good thing to wrap URLs into the appropriate <a href="URL">URL</a> tag so people can follow them. If you feel like hunting down URLs relevant to what someone is saying, even if they didn't actually give the URL themselves, that's also great, and you can turn their quote into a link to the site you found. For example:
<quote who="Some One">Now there is a site about the <a href="URL you found elsewhere">project I'm doing</a> later this year.</quote>
You should avoid using swearwords in the texts you write for the summaries, but anything someone says in a post is fair to quote. Don't feel you should "write around" swear words. Just quote them along with whatever else is relevant.
This is the fundamental question of the whole project. There's obviously plenty of room for your own creativity and voice, but there are some techniques that can save you a lot of grief, or at least give you a standard to deviate from.
If one post is directly in reply to another (this is the most common case), you have a lot of leeway in terms of how to express this. In fact, readers will probably assume that if you don't say anything about it, a summary of one post is in direct reply to the text you wrote just before it. This lets you be very natural in the way you describe an interaction. You can say:
Person A said they liked something, and Person B said they didn't.
Person A said they liked something. Person B replied that they didn't.
In both of the above examples, it's clear that person B is posting in direct reply to Person A. But suppose there was an intervening post between Person A's post and Person B's post. Now it's no longer true that B "replied to" A, and this should be indicated somehow. You could try to give some indication of just how far away the two posts are from each other, like this:
A couple posts along the subthread, Person B said that they didn't.
Ten posts along the subthread, Person B said that they didn't.
or you can be less specific about exactly how far along in the subthread B's post was from A's, like this:
Later in the subthread, Person B said that they didn't.
Note that you wouldn't generally say:
Later in the subthread, Person B replied that they didn't.
That's because B's post is not really in reply to A's, it's in reply to someone else's that you've skipped over in your summary. But you might say:
Later in the subthread, Person B replied to A's idea of modules, saying he disagreed.
In that case you've made it clear that B is not replying to A's post directly, but to A's idea.
Now suppose A says something, and B says something very much related to it, but in a completely different subthread. This is where the word "elsewhere" can come in very handy:
Person A said they liked modules. Elsewhere, Person B said they didn't like modules.
The word "elsewhere" is very useful for specifying any kind of jump. For instance, if you've finished summarizing one subthread and are ready to move on to the next, you can start a new paragraph, and say:
...said they didn't like modules.</p>
<p>Elsewhere, Person D reported a buggy driver and posted some code to fix it.
In the above case, you don't have to try to specify exactly where D's post is in relation to the one described above it. They're not really related in subject matter, so it doesn't matter whether they're near or far. A simple "elsewhere" suffices. Starting a new paragraph is also helpful in that case, because it ends the previous cluster of ideas, and starts a new cluster.
Now suppose there's a post somewhere in the thread that has more than one reply coming off of it. If you're going to cover some or all of those replies, you'll probably want to give some indication of the fact that they all stemmed from this one post. There are a number of possibilities, depending on the situation.
If there are only a small number (say 2 or 3) brief threads coming off of the one post, chances are you won't be writing a huge summary about it, so you can do something like this:
Person A said they didn't like modules. There were a few replies to this. Person B disagreed, and said modules were fine. Person C replied that this was hooey and that A was right. Person D also replied to A, saying they agreed completely, and felt that modules were no good at all. Person E also replied to A, and argued that the upstream maintainer should be contacted. Person B and Person C both felt there was no need for this, though person D replied to Person C, <quote who="Person D">Obviously we have to contact upstream maintainers if we find bugs. That's the whole point.</quote>
In the above summary, it is fairly clear that person A posted something initially that got a lot of attention. The summary covers 3 replies (there may have been more, but they were left out of the summary), one from B, one from D, and one from E. The reply from B got one reply from person C; the reply from D got no replies; and the reply from E got two replies, one from B and one from C. C's reply in turn got a reply from D. All of this is expressed adequately in the summary above, and can be graphed as follows. I'm not saying you should graph all your threads, but for purposes of explanation, it makes things easier.
A---B---C
|
|---D
|
|---E---B
    |
    |---C---D
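If it helps to think of the graph as data, the same reply structure can be written as a small tree. This is purely an aid to explanation, with made-up names; nobody needs to write code to summarize a thread.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    replies: list = field(default_factory=list)


def outline(post, depth=0):
    """Return one indented line per post; deeper indentation means a reply."""
    lines = ["  " * depth + post.author]
    for reply in post.replies:
        lines.extend(outline(reply, depth + 1))
    return lines


# The thread from the graph: B, D and E reply to A; C replies to B;
# B and C reply to E; D replies to that C.
thread = Post("A", [
    Post("B", [Post("C")]),
    Post("D"),
    Post("E", [Post("B"), Post("C", [Post("D")])]),
])

print("\n".join(outline(thread)))
```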
You might feel a little silly repeating certain phrases over and over, like "so-and-so also replied to A", but it's probably OK to not worry about it, let yourself be repetitive, and just concentrate on summarizing what was actually said.
Now suppose there are several big subthreads and a lot of smaller ones, and a lot of individual replies as well, all stemming off of a single post. There are many many ways to handle this, and there is no clear-cut recipe that takes care of all cases. You have to use your judgement.
One thing you could do is just ignore the problem. This is a perfectly acceptable way of dealing with it. Just summarize one subthread in the thicket, then say "elsewhere", and move on to the next subthread in the thicket. This will lose the fullest clarity of where exactly you are in the thread, but it will save you a lot of headaches, and if anyone is really dying to know the full thread structure, they can always check it out themselves via the "archive" link.
You can also combine "elsewhere"s with more specific expressions. So you can use the "elsewhere" thing for a few subthreads, then say something like, "Also in reply to A's earlier post on developer organization, Person R said that there should be an organization to deal with this sort of occurrence."
No, in fact it's often desirable to leave out large portions of a thread in a summary. When summarizing a thread, you should try to get a sense of the overall point of the thread. Sometimes there is more than one point, but often there are many posts that are just tangential. So for instance, coming back to the graph:
A---B---C
|
|---D
|
|---E---B
    |
    |---C---D
You might find that only one of the following simplified structures really gets to the point:
A
|
|---D
Person A said they didn't like modules. Person D replied that they agreed completely, and felt that modules were no good at all.
A
|
|---E
    |
    |---C---D
Person A said they didn't like modules. Person E argued that the upstream maintainer should be contacted. Person C felt there was no need for this, though person D then replied, <quote who="Person D">Obviously we have to contact upstream maintainers if we find bugs. That's the whole point.</quote>
A---B---C
|
|--...--B
Person A said they didn't like modules. There were a few replies to this. Person B disagreed, and said modules were fine. Person C replied that this was hooey and that A was right. Elsewhere, Person B said there was also no need to contact the upstream maintainer.
In that case, it's perfectly acceptable to simply skip over the posts you don't feel are so significant, and just treat the thread as if it had essentially this new simplified structure. You shouldn't say that B is replying to A if it's not true, but you don't have to go through the whole chain to get from A to B if you'd rather just go there directly.
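Pruning a thread down to the part you care about can be sketched as a tree operation. The code below is only an illustration with made-up names: it keeps the branches leading to the posts you name and drops the rest. It always keeps the posts in between, so it doesn't model skipping over intermediate posts as in the "..." graph.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    replies: list = field(default_factory=list)


def prune(post, wanted, path=()):
    """Copy the thread, keeping only branches that lead to a path in `wanted`."""
    here = path + (post.author,)
    kept = [p for p in (prune(r, wanted, here) for r in post.replies)
            if p is not None]
    if here in wanted or kept:
        return Post(post.author, kept)
    return None


def outline(post, depth=0):
    """Return one indented line per post; deeper indentation means a reply."""
    lines = ["  " * depth + post.author]
    for reply in post.replies:
        lines.extend(outline(reply, depth + 1))
    return lines


thread = Post("A", [
    Post("B", [Post("C")]),
    Post("D"),
    Post("E", [Post("B"), Post("C", [Post("D")])]),
])

# Keep only the chain that ends with D's reply to C under E.
simplified = prune(thread, {("A", "E", "C", "D")})
print("\n".join(outline(simplified)))
```

Asking for {("A", "D")} instead would leave just A and D's direct reply.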
Another case is when part of the thread is interesting, but the interesting part is not the early part. So you would like to just jump right into the middle of the thread and start summarizing, instead of starting at the beginning and jumping from post to post to get where you really want to be. In this case, you can make use of the very handy, "In the course of discussion" phrase. You can use this whenever you start a summary anywhere other than at the beginning of the thread.
...and then another replies to each of those, or maybe more than one post replies to each of those points, so that what might seem on the outside to be a small subthread, actually contains hidden "threadlets"? What is the best way to summarize those discussions?
Threadlets should be handled just as a special class of subthread. To graph a threadlet, suppose your mail reader shows the thread or subthread as having the following structure:
A
|
|---E---B
    |
    |---C
Now suppose that Person A actually talked about points a, b, and c; and that Person E replied to a and b; and that Person B replied to a and b; and that Person C replied to b. Now the graph looks like this:
Aabc
|
|---Eab-Bab
    |
    |---Cb
Or, to expand it out:
Aa
|
|---Ea--Ba

Ab
|
|---Eb--Bb
    |
    |---Cb

Ac
Now it is much easier to deal with. You can just treat these the same way you treat ordinary subthreads. Just try to give some indication that they are actually part of one threadlet (you can assume the reader knows what a threadlet is). So the above graph might be summarized like this:
<p>Person A started a threadlet that went on for several posts. She claimed that backups were unnecessary on RAID systems, because RAID encompassed its own backup within it. Person E replied that this was hogwash, and person B asserted that bad backup policy would always come back to haunt you.</p>
<p>In A's same post that claimed backups were unnecessary, she also added that her company never used backups and preferred hiring employees with very good memories. E replied that good memory was very important, and B added that bad memory could be disastrous. C also replied to E, saying that in the 1970's bad memory was more common than good memory.</p>
<p>Also in A's original post about backups, she added that modules should determine a fixed and unchanging API; but there was no reply to this.</p>
The above is a perfectly acceptable way to handle threadlets. If you feel that parts of the threadlet are more or less important than other parts, you can prune accordingly, and make your graph a little easier to manage:
Aa
|
|---Ea

Ab
|
|---Eb---Bb
or whatever seems best. All of this is to help break the problem of thread summarizing into manageable chunks. There's no need to follow any of these suggestions rigorously.
Also note that it is probably better to conceptualize threads in terms of subthreads and not threadlets wherever possible. Mail reading software doesn't help deal with threadlets, so they end up being more complex and difficult to handle than ordinary subthreads. If you find that after pruning, the threadlet can be made to resemble one or a few ordinary subthreads without too much of a problem, it's often best to choose that alternative. The temptation exists to call everything a threadlet, because almost all posts respond to others on multiple levels. The basic rule could be, if you can safely avoid treating something as a threadlet, do.
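For the curious, expanding a threadlet by point (as in the Aabc graph earlier) can be sketched as follows. The names are invented, and the code makes one simplifying assumption: if a post doesn't address a point, none of its replies are followed for that point either.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    points: str  # the points this post addresses, e.g. "ab"
    replies: list = field(default_factory=list)


def expand(post, point):
    """Return (label, replies) for one point, or None if the post skips it."""
    if point not in post.points:
        return None
    kept = [e for e in (expand(r, point) for r in post.replies)
            if e is not None]
    return (post.author + point, kept)


# A raised points a, b and c; E and B answered a and b; C answered only b.
thread = Post("A", "abc", [
    Post("E", "ab", [Post("B", "ab"), Post("C", "b")]),
])

threadlets = [expand(thread, point) for point in thread.points]
for threadlet in threadlets:
    print(threadlet)
```

Each printed tuple corresponds to one of the expanded graphs: Aa with Ea and Ba, Ab with Eb, Bb and Cb, and Ac on its own with no replies.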
Paragraph breaks can be very useful for conveying the course of a thread. They can also be over-used, in which case they can get confusing. A good place to consider ending a paragraph is right at the end of a subthread. When you move on to summarize the next subthread, you can symbolize this by starting a new paragraph.
Sometimes you have no choice but to start a new paragraph, as when you've given a multi-paragraph quote and are now continuing in your own voice. This is perfectly acceptable, and anyway there's no way to avoid it.
The thing not to do, is to create many paragraphs where they're not needed. Don't just break a paragraph because one post has been summarized and you're moving on to the next post. If the subject matter is related, let the actual physical text be related too. Once the subject changes, then it's time to start a new paragraph.
<section
title="Using Standard Date Formats"
author="Eusebio C Rufian-Zilbermann"
subject="NM Page info."
startdate="13 Sep 2000 11:02:43 -0700"
enddate="18 Sep 2000 13:16:55 -0700"
>
<p>This thread started when David Brown noted that the dates in the New Maintainer pages were expressed in a confusing format, and provided a <a href="http://email@example.com">link to an example</a>. He requested: <quote who="David Brown">Would it be possible to post the dates in ISO date format YYYY-MM-DD</quote>?</p>
<p>Further in the thread, Paul Slootman supported the idea. <quote who="Paul Slootman">The great thing about the ISO date format yyyy-mm-dd is that it can be sorted easily (with correct results :-)</quote></p>
<p>Elsewhere, Lars Wirzenius said:</p>
<quote who="Lars Wirzenius">
<p>It is, of course, not unheard of for normal people (non-nerds) to interpret 2000-09-01 as the ninth of January, year two thousand. I've even in my old life as Linux Software Map maintainer seen people express dates as 01-2000-09. No, I didn't understand what they meant, either.</p>
<p>That's why in normal text it tends to be best to express dates using non-encoded formats, such as "September 1, 2000" or "01-SEP-2000". They don't sort very well, and they're language dependent, but in normal text that isn't very important - having fewer errors is. When sorting or mechanical interpretation is important, the ISO date format shines.</p>
</quote>
<p>David continued in <kcref subject="Dates again." startdate="14 Sep 2000 08:18:14 -0700"></kcref> with another example.</p>
</section>
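The startdate and enddate values in the example above look like mail-header timestamps. Assuming that format (day, abbreviated English month, year, time, UTC offset), they can be read with Python's standard library as sketched below. The function name is made up, and the month abbreviations depend on an English or C locale.

```python
from datetime import datetime


def parse_kc_date(value):
    """Parse a timestamp such as '13 Sep 2000 11:02:43 -0700'."""
    return datetime.strptime(value, "%d %b %Y %H:%M:%S %z")


start = parse_kc_date("13 Sep 2000 11:02:43 -0700")
end = parse_kc_date("18 Sep 2000 13:16:55 -0700")

print(start.date().isoformat())  # 2000-09-13
print((end - start).days)        # 5
```

Once parsed, the dates sort correctly and can be printed in the ISO form the thread was asking for.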
Share And Enjoy!
Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.