On 16/1/2008, at 16:43, Kevin Ballard wrote:
As far as I know, Subversion has basically exactly the same problem,
and any time you consume/produce files on Mac OS X that are to be
consumed/produced on other platforms you will run into this kind of
issue, with any software.
Tell Mac OS X to write a file with "ó" in the file name ("\xc3\xb3" in
UTF-8), and it will "normalize" it prior to writing by converting it
into a decomposed form (that is, ASCII "o" followed by "\xcc\x81", or
"combining acute accent"). So they're both valid Unicode, both valid
UTF-8, and they encode exactly the same characters but the byte stream
is different.
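
To make the difference concrete, here is a small Python sketch of my own
(not taken from the Apple docs) using the standard unicodedata module. It
only shows the two byte streams; it assumes plain NFD is a close enough
stand-in for what the filesystem does, and as far as I know HFS+ actually
uses its own variant of decomposition.

```python
import unicodedata

precomposed = "\u00f3"  # "ó" as a single code point
# Assumption: standard NFD approximates the filesystem's decomposition.
decomposed = unicodedata.normalize("NFD", precomposed)  # "o" + U+0301

precomposed_bytes = precomposed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# Same text once both are normalized, but different bytes.
same_text = unicodedata.normalize("NFC", decomposed) == precomposed
same_bytes = precomposed_bytes == decomposed_bytes

print(precomposed_bytes, decomposed_bytes, same_text, same_bytes)
```

Any tool that compares file names byte for byte, as Git does, will treat
the two spellings as different names.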
If you only work on Mac OS X then this will never be a problem because
all the files you create and therefore all the files you add to your
Git repository will have their names in decomposed UTF-8. But when you
start cloning repositories containing files added on other systems,
systems which might use precomposed rather than decomposed UTF-8, then
you'll run into exactly this kind of problem. The git.git repo has one
such file itself (gitweb/test/Märchen, if I remember correctly, which
Git reports as untracked).
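
As a rough illustration of how you might spot such files, here is a Python
sketch. The helper name find_normalization_mismatches is hypothetical, the
two lists are made-up inputs (in practice you would fill them from the
index and from a directory listing), and I am assuming the repository holds
the precomposed spelling. This is not how Git itself works, just a way of
seeing the mismatch.

```python
import unicodedata

def find_normalization_mismatches(tracked, on_disk):
    """Return (tracked_name, disk_name) pairs that are equal after NFC
    normalization but differ as raw strings."""
    # Index the on-disk names by their NFC form for quick lookup.
    by_nfc = {unicodedata.normalize("NFC", name): name for name in on_disk}
    pairs = []
    for name in tracked:
        disk_name = by_nfc.get(unicodedata.normalize("NFC", name))
        if disk_name is not None and disk_name != name:
            pairs.append((name, disk_name))
    return pairs

# Hypothetical inputs: precomposed name as committed on another system,
# decomposed name as Mac OS X would hand it back.
tracked = ["gitweb/test/M\u00e4rchen"]
on_disk = ["gitweb/test/Ma\u0308rchen"]

mismatches = find_normalization_mismatches(tracked, on_disk)
print(mismatches)
```

Each pair it reports is a file that would show up as untracked (and whose
tracked counterpart looks missing) even though the names read identically
on screen.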
Now, Mac OS X's behaviour is not entirely "insane" as some would
claim; there is indeed a rationale behind it even if you don't agree
with it, but it *does* produce some unfortunate teething problems for
people wanting to use Mac OS X in a cross-platform environment.
Here are some Apple docs on the subject:
http://developer.apple.com/qa/qa2001/qa1173.html
http://developer.apple.com/qa/qa2001/qa1235.html
I personally wish that Unicode didn't allow different normalization
forms; then this kind of problem wouldn't arise. But it has arisen and
we have to live with it. Some workarounds have been proposed for Git,
but I haven't seen any convincing proposals yet.
Cheers,
Wincent