best practices

Quote: Hunting Down Nearby Sillinesses

Submitted by Jeremy
on May 27, 2008 - 12:04pm

"I do think that if one is already changing a line which is incorrectly laid out then there's no point in _leaving_ it incorrect. There's no downside to fixing it. That being said, it's often sorely tempting to go hunting down nearby sillinesses. I succumb to that temptation and usually won't complain when others do also, up to a point."

Git Management

Submitted by Jeremy
on May 20, 2008 - 6:47pm

"Is there a write up of what you consider the 'proper' git workflow?" Theodore Ts'o asked Linux creator Linus Torvalds, "why do you consider rebasing topic branches a bad thing?" Linus replied, "rebasing branches is absolutely not a bad thing for individual developers. But it *is* a bad thing for a subsystem maintainer." He went on to differentiate between 'grunts' who write the code and 'managers' who primarily collect other people's code, "a grunt should use 'git rebase' to keep his own work in line. A technical manager, while he hopefully does some useful work on his own, should strive to make _others_ do as much work as possible, and then 'git rebase' is the wrong thing, because it will always make it harder for the people around you to track your tree and to help you update your tree." Linus compared his own patch management style and productivity from over six years ago before he started using BK and git, to his current style using git:

"You can either try to drink from the firehose and inevitably be bitched about because you're holding something up or not giving something the attention it deserves, or you can try to make sure that you can let others help you. And you'd better select the 'let other people help you', because otherwise you _will_ burn out. It's not a matter of 'if', but of 'when'. [...] And when you're in that kind of ballpark, you should at least think of yourself as being where I was six+ years ago before BK. You should really seriously try to make sure that you are *not* the single point of failure, and you should plan on doing git merges. [...] I think a lot of people are a lot happier with how I can take their work these days than they were six+ years ago."

Formatting Bug Fix Emails

Submitted by Jeremy
on October 24, 2007 - 3:31am
Linux news

"I'll just take this opportunity to ask people that when they send bug-fixes, please try to make the subject line and message make sense for a *reader*, not for yourself (or even to me, although if it's readable to some generic person, it's hopefully readable to me too!)," Linus Torvalds explained in response to a recent bugfix. He went on to provide some general rules:

"1)if it's not fairly generic, specify the area (architecture, subsystem, driver) that the fix is for in the subject line. [...] 2) don't use commit names in the subject line - and while it's great to use them in the body of the explanation, even there you don't want to assume that people read it from within git. [...] 3) write the commit message for an outsider, and use whitespace. The third-most common fixup I end up doing (after the above two) is to split things up into shorter paragraphs, after somebody wrote a good changelog entry, but made it one large unreadable blob of text."

Linus added, "I end up editing just about half of all the commit messages of stuff I get in email (except for Andrew's stuff, since Andrew largely does the same kinds of cleanups anyway, so I only need to edit up a small percentage of the patches he forwards). I'd like it to be *much* less than that, so I thought I should speak up since I had an example of this."

Exact Kernel Names

Submitted by Jeremy
on October 21, 2007 - 6:33am
Linux news

When asked how to best refer to kernels between official releases and release candidates, Linus Torvalds pointed to his automated git snapshots. "I still call them 'nightly snapshots', but they do in fact happen twice a day if there have been changes, so that's not technically correct," he noted. The latest snapshot is 2.6.23-git15, "this is an exact name, because you can go to kernel.org and look up the exact commit ID that was used to generate it (there's an 'ID' file associated with each snapshot there)." For git users, he suggested using the "git describe" command to get the git name, with the current head being named v2.6.23-6562-g8add244. He went on to explain that the name "tells you three things: (a) it's based on 2.6.23 (b) there's been 6562 commits since 2.6.23 and (c) the top-of-tree abbreviated commit is '8add244'."

When asked about the previously discussed usage of "-rc0" and other similar proposed naming conventions, Linus replied:

"Please don't use those names. They don't actually tell anything about where in the cycle it is, and as you can see above, there's been 6500+ commits since 2.6.23, so saying '2.6.23-rc0' or similar really isn't very helpful if anybody actually cares about just where in the release cycle you are."

Tracking Down Merge Errors With git

Submitted by Jeremy
on October 16, 2007 - 8:14pm
Linux news

In a short discussion following a patch titled "fix abdhid mismerge", Al Viro noted troubles in tracking down the changeset that caused the problem, "what's the right way to trace the things like that? Linus?" Linus Torvalds, as the original author of git, replied, "in general, I'm afraid that merge errors are simply not very easy to find." He then offered some general tips for tracking down mis-merges, noting, "if anybody can come up with a better way to find these kinds of mis-merges, I'd love to hear about it." In regards to this particular case, he explained:

"'-c' is for regular combined merges: any file that was modified in both parents will show up as a combination of the diffs of both sides, while a file that was taken in its *entirety* is ignored.

"In this case that's exactly what you wanted. It's just too noisy to necessarily be the default, and you can still have a silent mis-merge if the merger picked *only* one side.

But in general, I suspect that '-c' is often a good thing to try if you cannot find the cause of some change in a regular commit, and suspect a merge error."

Memset versus Memzero

Submitted by Jeremy
on September 23, 2007 - 9:53am
Linux news

A short thread on the lkml discussed the lack of a memzero function in the Linux kernel. Cyrill Gorcunov asked, "could anyone tell me why there is no official memzero function (or macros) in the kernel?" Arjan van de Ven explained, "it doesn't add value.... memset with a constant 0 is just as fast (since the compiler knows it's 0) than any wrapper around it, and the syntax around it is otherwise the same." Linux creator Linus Torvalds went on to explain:

"The reason we have 'clear_page()' is not because the value we're writing is constant - that doesn't really help/change anything at all. We could have had a 'fill_page()' that sets the value to any random byte, it's just that zero is the only value that we really care about.

"So the reason we have 'clear_page()' is because the *size* and *alignment* is constant and known at compile time, and unlike the value you write, that actually matters. So 'memzero()' would never really make sense as anything but a syntactic wrapper around 'memset(x,0,size)'."

Email Clients and Patches

Submitted by Jeremy
on September 15, 2007 - 2:27pm
Linux news

Randy Dunlap sent a patch to the Linux kernel mailing list described as adding "info about various email clients and their applicability in being used to send Linux kernel patches." The first revision generated quite a bit of discussion, quickly resulting in a second version, and eventually a third version that Andrew Morton referred to as "soon to be merged". In addition to some general suggestions about emailing patches, it also offers some specific configuration suggestions for a number of popular email clients. It begins:

"Patches for the Linux kernel are submitted via email, preferably as inline text in the body of the email. Some maintainers accept attachments, but then the attachments should have content-type 'text/plain'. However, attachments are generally frowned upon because it makes quoting portions of the patch more difficult in the patch review process.

"Email clients that are used for Linux kernel patches should send the patch text untouched. For example, they should not modify or delete tabs or spaces, even at the beginning or end of lines."

Linux: Enforcing the Merge Window

Submitted by Jeremy
on August 8, 2007 - 7:37pm
Linux news

Following a recent merge request, Linus Torvalds stressed that he was serious about not wanting to merge any big changes after the merge window closes, "get the changes in before -rc1, or just *wait*. If they aren't ready before the merge window opens, they simply shouldn't be merged at all." Jeff Garzik reiterated, "once -rc1 is out there, that means the focus should be on stabilizing the existing codebase. Pushing a big driver update means that effort must restart from scratch. We just don't want to go down that road, which a big reason for the merge window in general." Further when it was noted that the recent changes were heavily tested by the vendor, Jeff stressed the importance of community testing:

"Take a lesson from when I was on Linus's shit-list... twice: Twice, Intel submitted an e1000 update after the merge window closed. Twice, they claimed the driver passed their quite-exhaustive internal testing. And twice, the most popular network driver broke for large masses of users because I took a hardware vendor's word on testing rather than rely on the testing PROVEN to flush out problems: public linux kernel testing.

"I'm not singling out Intel, there are plenty of other hardware vendors that repeat the exact same pattern."

Linux: Formatting Merge Requests

Submitted by Jeremy
on July 20, 2007 - 5:46am
Linux news

In response to a recent merge request, Linus Torvalds explained how he preferred the request to be formatted, "please don't hide the branch name in the free-flowing text". He noted that he wanted the git URL indented and on it's own line, "so that I don't miss the right branch-name even by mistake." He went on to explain:

"I try to be careful, and I think I missed the branch-name just once (and noticed it when the diffstat didn't match), but this is something I want to teach every git user to know very intimately: make it impossible to make mistakes, by always keeping the whole git information together, and never mixed up with any free-flowing explanation."

Linux: Moving and Changing Code

Submitted by Jeremy
on July 16, 2007 - 1:59pm
Linux news

In response to a recent merge request, Linus Torvalds explained a best practice when moving and changing code, "when doing renames it is generally *much* nicer to do a 100% rename (perhaps with just _trivial_ changes to make it compile - the include statements etc change, and maybe you want to change the name in the comment header too)." He went on to explain, "doing 'move the code and change it at the same time' is considered bad form. Movement diffs are much harder to read anyway (a traditional diff will show it as a new-file + delete, of course), so the general rule is: move code around _without_ modifying it, so that code movement (whether it's a whole file, or just a set of functions between files) doesn't really introduce any real changes, and is easier to look through the changes; do the actual changes to the code as a separate thing." He went on to note why this is especially important during Linux development, "where patches are the main way people communicate.":

"And when using git, the whole 'keep code movement separate from changes' has an even more fundamental reason: git can track code movement (again, whether moving a whole file or just a function between files), and doing a 'git blame -C' will actually follow code movement between files. It does that by similarity analysis, but it does mean that if you both move the code *and* change it at the same time, git cannot see that 'oh, that function came originally from that other file', and now you get worse annotations about where code actually originated."

Linux: Documenting How Patches Reach The Kernel

Submitted by Jeremy
on May 23, 2004 - 9:34am
Linux news

Linux creator Linus Torvalds began a recent email, "this is a request for discussion.." describing a method of tracking how patches find their way into the Linux kernel. The incentive for this change in process was made clear:

"Some of you may have heard of this crazy company called SCO (aka "Smoking Crack Organization") who seem to have a hard time believing that open source works better than their five engineers do. They've apparently made a couple of outlandish claims about where our source code comes from, including claiming to own code that was clearly written by me over a decade ago."

He notes that though people have proven to be quite good at debunking these claims, the effort has been tedious. "So, to avoid these kinds of issues ten years from now, I'm suggesting that we put in more of a process to explicitly document not only where a patch comes from (which we do actually already document pretty well in the changelogs), but the path it came through." Read on for Linus' complete description of how this process would work.