Responding only to those portions where I think Windows experience and a
Windows perspective would be helpful...
On Mon, 15 Oct 2007, Eli Zaretskii wrote:
I believe the hassle is that readdir doesn't necessarily report a README in
a directory which is supposed to have a README, when it has a readme
instead. I think we want O(n) comparison of sorted lists, which doesn't
work if equivalent names don't sort the same.
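To make the sorting point concrete, here is a minimal sketch of the kind of single-pass comparison I have in mind. It is not git's actual code; the function name diff_sorted and the file names are made up for illustration, and it assumes both lists were sorted with the same byte-wise ordering.

```python
def diff_sorted(index_names, dir_names):
    """Compare two name lists in one pass.

    Assumes both lists are sorted with the same (byte-wise) ordering.
    Returns (only_in_index, only_in_dir).
    """
    only_index, only_dir = [], []
    i = j = 0
    while i < len(index_names) and j < len(dir_names):
        a, b = index_names[i], dir_names[j]
        if a == b:
            i += 1
            j += 1
        elif a < b:
            # a sorts before everything left in dir_names, so it is missing there
            only_index.append(a)
            i += 1
        else:
            only_dir.append(b)
            j += 1
    # Whatever remains on either side has no counterpart on the other.
    only_index.extend(index_names[i:])
    only_dir.extend(dir_names[j:])
    return only_index, only_dir


# Hypothetical data: the index expects "README", readdir reports "readme".
index_names = sorted(["Makefile", "README", "main.c"])
on_disk = sorted(["Makefile", "main.c", "readme"])
result = diff_sorted(index_names, on_disk)
```

With this input, result is (["README"], ["readme"]): the walk treats the two spellings as unrelated names, and because they also sort to different positions, there is no cheap point in the pass at which they could be matched up.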
We want to be able to get stat info for every file, using readdir to
figure out what files exist, in under 1s for 106083 files in 1603
directories with a hot cache; otherwise "git status" takes a noticeable
amount of time on a medium-big project, and we want people to be able to
see what has changed effectively instantly. My impression is that Windows'
native stat and
readdir are plenty fast for what normal Windows programs want, but we
actually expect reasonable performance on an unreasonably-big
metadata-heavy input. AFAICT, nothing but Linux is optimized for this, but
we're used to being able to find out if there's any change to a large
directory structure in practically no time. On the other hand, we really
just want to beat users' expectations for this operation, not our own
expectations, so this may only be a problem for people benchmarking
Windows git against Linux git.
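For anyone who wants to measure this on their own machine, a rough sketch of the access pattern follows. It is only an approximation of what "git status" does (it does not compare anything against an index), and the names stat_tree and time_stat_tree are invented for this example.

```python
import os
import time


def stat_tree(root):
    """lstat every non-directory entry under root.

    Returns (files, directories); the directory count includes root itself.
    """
    n_files = n_dirs = 0
    stack = [root]
    while stack:
        path = stack.pop()
        n_dirs += 1
        with os.scandir(path) as entries:      # readdir-style listing
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    entry.stat(follow_symlinks=False)  # lstat-style metadata
                    n_files += 1
    return n_files, n_dirs


def time_stat_tree(root):
    """Return ((files, directories), elapsed seconds) for one walk of root."""
    start = time.perf_counter()
    counts = stat_tree(root)
    return counts, time.perf_counter() - start
```

Run it twice on a large checkout and look at the second number to get the hot-cache figure. Note that on Windows os.scandir may answer entry.stat() from data the directory listing already returned, so this is not a pure per-file stat benchmark there.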
I believe the need here is quick setup and fast access to sparse portions
of several 100M files. It's hard to beat a page fault for read speed.
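As a sketch of what "fast access to sparse portions" means in practice (the helper name read_at is hypothetical, and this is not how git itself is written), mapping the file and slicing it lets the OS page in only the parts that are actually touched:

```python
import mmap


def read_at(path, offsets, length):
    """Return `length` bytes starting at each offset, via a read-only mapping.

    Only the pages that are touched need to be brought into memory.
    Assumes the file is non-empty (an empty file cannot be mapped).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[off:off + length] for off in offsets]
```

Setup is one open and one map call no matter how large the file is, which is the "quick setup" half of the requirement.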
We also expect to be able to make a sequence of file system operations
such that a program starting at any time sees the same database, even
while the files containing the database are being restructured. My
impression is that this is
very hard or impossible with Windows, and also that it doesn't matter for
Windows users, because they'll only have one program at a time accessing
the repository. A lot of our filesystem demands are about making a wide
variety of race conditions give the same result regardless of how the race
goes, and we're just being overly careful for a Windows environment
(although not necessarily for users with a UNIX background using Windows
only because they have to).
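One common building block for this kind of guarantee, shown here purely as an illustration and not as a description of what git does internally, is to write the new version to a temporary file and rename it over the old one. The function name replace_atomically is made up for the example.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to a temp file in the same directory, then rename it into place.

    On POSIX the rename is atomic, so a reader that opens `path` sees either
    the complete old contents or the complete new contents.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the new contents to disk before renaming
        os.replace(tmp, path)      # swap the new file in under the final name
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

On Windows the os.replace step can fail if another process has the destination open, which is the sort of difference I mean above.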
A unixy pipeline was convenient, given what else we had already written.
It's getting converted to single tasks, but it's not a top priority for
most developers, since streaming 100M from one program to the next under
most of our environments is trivial.
-Daniel
*This .sig left intentionally blank*