[PATCH] Introduce core.keepHardLinks

From: Johannes Schindelin
Date: Saturday, October 11, 2008 - 4:45 am

When a tracked file was hard linked, we used to break the hard link
whenever Git wrote to that file.  Make that optional.

To keep the implementation simple, mode changes will still break the
hard links.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	With the current revision of the patch, I can set
	keep_hard_links = 1 and the test suite still passes.

	I briefly tried to fix the "mode changes" issue, but replacing
	the "st.st_mode != ce->ce_mode" with "S_ISREG(ce->ce_mode)"
	(and consequently also adding
	 "|| (keep_hard_links && chmod(path, mode))" to create_file()),
	made at least t3400 fail, and then I ran out of my Git time budget.

 Documentation/config.txt |    4 +++
 cache.h                  |    1 +
 config.c                 |    5 ++++
 entry.c                  |    7 ++++-
 environment.c            |    1 +
 t/t0056-hardlinked.sh    |   58 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 74 insertions(+), 2 deletions(-)
 create mode 100644 t/t0056-hardlinked.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 173386e..7bfe431 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -207,6 +207,10 @@ core.symlinks::
 	file. Useful on filesystems like FAT that do not support
 	symbolic links. True by default.
 
+core.keepHardLinks::
+	If true, do not break hard links by deleting and recreating the
+	files.  Off by default.
+
 core.gitProxy::
 	A "proxy command" to execute (as 'command host port') instead
 	of establishing direct connection to the remote server when
diff --git a/cache.h b/cache.h
index c89f2c6..c4bdece 100644
--- a/cache.h
+++ b/cache.h
@@ -479,6 +479,7 @@ enum rebase_setup_type {
 
 extern enum branch_track git_branch_track;
 extern enum rebase_setup_type autorebase;
+extern int keep_hard_links;
 
 #define GIT_REPO_VERSION 0
 extern int repository_format_version;
diff --git a/config.c b/config.c
index 18d305c..35ffdef 100644
--- a/config.c
+++ ...
From: Shawn O. Pearce
Date: Sunday, October 12, 2008 - 11:38 am

Why would anyone want to do this?

I cannot fathom why a user wants this much rope to hang themselves with.

-- 
Shawn.
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 1:58 am

Hi,


The question is not so much why anyone wants to do this, but _if_.

And the answer is: yes.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Monday, October 13, 2008 - 7:01 am

Sorry, I think the question should be _why_.

You can gain a sympathetic "Ah, that is sensible, and 'this much rope to
hang themselves with' comment was unwarranted" only by answering that
question.

--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 9:09 am

Hi,


Okay, here are a couple of reasons:

- all the editors that this guy tested keep the hard links, so it was 
  kinda hard to understand why Git insists on behaving differently,

- if the user asked for hard links, it is not Git's place to question that 
  decision,

- and in that user's scenario a certain file has to be shared between 
  different projects that are all version-controlled with Git, but in 
  different teams and with different servers they connect to.  So no, you 
  cannot even make a submodule of it, because the guys involved do not 
  share any repository/server access.  Besides, submodules are not 
  user-friendly enough yet.

Oh, and come to think of it, this could solve the old issue of "I want to 
track only a few files in my $HOME/".

Anyway, I think that breaking hard links is not a nice habit of Git (after 
all, from the user's point of view the file is not created, but 
modified!), and I would have expected others to need a lot fewer arguments 
to see it that way, too.

Ciao,
Dscho

--

From: Stephan Beyer
Date: Monday, October 13, 2008 - 9:21 am

Hi,


Despite the fact that I've never used hardlinks in a git repository, I
would have expected git to keep them.  So I'm one of the "others" who
thinks this config option is just sane (and should perhaps even be
enabled by default, if it does not break stuff on file systems that do
not have a hardlink feature... but ok)

Regards,
  Stephan

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F
--

From: Shawn O. Pearce
Date: Monday, October 13, 2008 - 9:23 am

My problem is many users do "cp -rl a b" to clone a->b and hardlink
the working directory.  They expect "cd b && git checkout foo" to
then only unlink the paths that differ.  Updating the original inode
would break repository a.

It's a change in behavior, to some of our oldest users.  So it can't
really be on by default.
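
The scenario above can be reproduced outside Git. The following is a minimal
sketch, not Git's actual code path: it assumes a POSIX shell with mktemp
available, uses "ln" on a single file as a portable stand-in for "cp -rl a b",
and the names "a", "b" and "file" are purely illustrative. It contrasts
unlinking and recreating a path with writing through the existing inode.

```shell
#!/bin/sh
# Illustrative only: two directories standing in for hard-linked work trees.
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir a b
echo "version 1" >a/file
ln a/file b/file            # portable stand-in for "cp -rl a b"

# Strategy 1: unlink the path, then recreate it (breaks the hard link).
rm b/file
echo "version 2" >b/file
unlinked_a=$(cat a/file)    # tree a keeps its old contents

# Strategy 2: write through the existing inode (keeps the hard link).
rm b/file
ln a/file b/file            # re-establish the link for the second experiment
echo "version 3" >b/file    # ">" truncates in place, so the inode is reused
inplace_a=$(cat a/file)     # tree a changed as a side effect

echo "after unlink+recreate, a/file: $unlinked_a"
echo "after in-place write,  a/file: $inplace_a"
```

With the first strategy a/file still reads "version 1"; with the second it
reads "version 3", which is the breakage described for repository a.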

-- 
Shawn.
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 10:55 am

Hi,


Which is the reason why my commit is not titled "Make Git respect hard 
links by default", but "Introduce core.keepHardLinks".  I also hate people 
who try to break my setup.

Which scenario (breaking someone's setup) was exactly what triggered this 
patch.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Monday, October 13, 2008 - 10:33 am

These are non-arguments, when you are asked to give rationale for adding
capability to "ask for hard links" to begin with.

IOW, the question was why git should support tracked contents being
hardlinked to something else.

[jc: Sorry for dropping Shawn from CC: list.  pobox.com seems to complain
on his address for some reason.  Here are the msmtp logs for an identical
message with and without him on the recipient list.

Oct 13 10:23:57 host=sasl.smtp.pobox.com tls=on auth=on user=junio@pobox.com from=junio@pobox.com recipients=Johannes.Schindelin@gmx.de,barkalow@iabervon.org,git@vger.kernel.org,spearce@spearce.org,gitster@pobox.com smtpstatus=451 smtpmsg='451 4.3.5 Server configuration problem' errormsg='recipient address spearce@spearce.org not accepted by the server' exitcode=EX_DATAERR
Oct 13 10:31:26 host=sasl.smtp.pobox.com tls=on auth=on user=junio@pobox.com from=junio@pobox.com recipients=Johannes.Schindelin@gmx.de,barkalow@iabervon.org,git@vger.kernel.org,gitster@pobox.com mailsize=2283 exitcode=EX_OK
]
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 10:54 am

Hi,


Actually, they are arguments.

The thing is: these editors do what they do for a reason.  Which is 
exactly the second reason.

When a user makes hard links, it is not just for fun and bullocks.  It is 
not for copy-on-write either, that's not what hard links are supposed to 
do.  It is for cases when you need the _same_ information in two places.

I am not that big a user of hard links myself, but when I do, I know 
exactly what I am doing.  And with my patch and that config variable set 
to true, Git will not interfere with that.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Monday, October 13, 2008 - 11:06 am

Ok, such a possible benefit should be described and defended better then.
I only thought of the scenario Shawn has brought up, which is a downside
of using this option without any conceivable upside, when I read your
patch and your rationale you repeated in a few messages that followed.

You could have said something like "The users may want to have a checkout,
and have the same contents always appear elsewhere i.e. 'installing by
hardlinking'.  In such a setup, editing the source with an editor
configured not to break hardlinks would still work fine, but git-checkout
will unconditionally break such links, which is undesirable.  Such a setup
would want a configuration variable like this."

"It is not for fun and bullocks" is not a description any clearer than
what you kept repeating, but if you stated it like the above, then I would
not have had trouble understanding what you wanted to say.

The documentation update needs to warn about Shawn's scenario.  I also
do not know what this should do when you have two tracked paths with
identical contents hardlinked to each other.  Because we do not track
hardlinks, I _think_ breaking links would be the right thing to do for
such paths regardless of this configuration variable.
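
The case of two paths hardlinked to each other can be illustrated with a small
sketch. It assumes a POSIX shell with mktemp, and the file names "one" and
"two" are hypothetical; it says nothing about how the patch itself handles
this case. It shows how the link count reveals the sharing, and what an
in-place update of one path does to the other.

```shell
#!/bin/sh
# Illustrative only: two paths that share a single inode.
set -e
tmp=$(mktemp -d)
cd "$tmp"
echo "same contents" >one
ln one two                           # two names, one inode

# "ls -l" prints the link count in the second column.
links=$(ls -l one | awk '{print $2}')

# An in-place update meant for "one" only...
echo "new contents for one" >one
two_now=$(cat two)                   # ...is visible through "two" as well

echo "link count: $links"
echo "two now reads: $two_now"
```

A link count above 1 is the only hint that a write through the existing inode
will also change some other path.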

It also raises another question.  Don't you want this to be an attribute
for paths, not all-or-nothing configuration per repository?

--

From: Johannes Schindelin
Date: Wednesday, October 15, 2008 - 2:07 am

Hi,


Sounds very nice.  Sorry for being grumpy, and not being able to come up 

When the user does that, it's the user's wish.  I'd not let Git play cute 

I'd rather not have it as an attribute, because it is not so much about 
file types that should show this behavior.  It is more like an option that 
I fully expect to be set in $HOME/.gitconfig.

Ciao,
Dscho

--