Re: pack.packSizeLimit, safety checks

Previous thread: [RFC/PATCH 0/6] add --ff option to cherry-pick by Christian Couder on Monday, February 1, 2010 - 12:55 am. (14 messages)

Next thread: Problem listing GIT repository with accents by =?iso-8859-1?Q?Elli=E9_Computing_Open_Source_Program?= on Monday, February 1, 2010 - 3:48 am. (8 messages)
From: Sergio
Date: Monday, February 1, 2010 - 2:20 am

Hi,

documentation about pack.packSizeLimit

says:

The default maximum size of a pack. This setting only affects packing to a file,
i.e. the git:// protocol is unaffected. It can be overridden by the
--max-pack-size option of git-repack(1).

I would suggest clarifying it into

The default maximum size of a pack in bytes. This setting only affects packing
to a file, i.e. the git:// protocol is unaffected. It can be overridden by the
--max-pack-size option of git-repack(1).

Since --max-pack-size takes MB and one might be tempted to assume that the same
is valid for pack.packSizeLimit.

Also note that some safety check on pack.packSizeLimit could probably be
desirable to avoid an unreasonably small limit. For instance:

Assume that pack.packSizeLimit is set to 1 (believing it would be 1MB, but it is
in fact 1B). With this at the first git gc every object goes in its own pack.
You realize the mistake, you fix pack.packSizeLimit to 1000000, but at this
point you cannot go back since git gc cannot run anymore (too many open files).

Maybe, considering all the possible use cases of pack.packSizeLimit could help
finding a reasonable lower bound.


--

From: Nicolas Pitre
Date: Monday, February 1, 2010 - 9:11 am

Grrrrr.  This is a terrible discrepency given that all the other 
arguments in Git are always byte based, with the optional k/m/g suffix, 
by using git_parse_ulong().  So IMHO I'd just change --max-pack-size to 
be in line with all the rest and have it accept bytes instead of MB.  
And of course I'd push such a change to be included in v1.7.0 along with 
the other incompatible fixes.


That's a totally orthogonal issue.  There are other ways to get into 
trouble with too many open files and that deserves a fix of its own 
(such as limiting the number of simultaneous opened packs).


Nicolas
--

From: Johannes Sixt
Date: Monday, February 1, 2010 - 9:26 am

While at it, also change --big-file-threshold that fast-import learnt the
other day...

-- Hannes
--

From: Shawn O. Pearce
Date: Monday, February 1, 2010 - 9:28 am

Yup.  WTF was I thinking when I did megabytes as the default unit
on the command line...

-- 
Shawn.
--

From: Junio C Hamano
Date: Monday, February 1, 2010 - 10:20 am

That thing is new, so it is worth fixing before 1.7.0 final.
Please make it so.

Thanks.
--

From: Junio C Hamano
Date: Monday, February 1, 2010 - 10:19 am

All of the "other incompatible" changes had ample leading time for
transition with warnings and all.

I am afraid that doing this "unit change" is way too late for 1.7.0, and
it makes me somewhat unhappy to hear such a suggestion.  It belittles all
the careful planning that has been done for these other changes to help
protect the users from transition pain.

Introduce --max-pack-megabytes that is a synonym for --max-pack-size for
now, and warn when --max-pack-size is used; warn that --max-pack-size will
count in bytes in 1.8.0. Ship 1.7.0 with that change.  --max-pack-bytes
can also be added if you feel like, while at it.

But changing the unit --max-pack-size counts in to bytes in 1.7.0 feels
a bit too irresponsible for the existing users.

--

From: Nicolas Pitre
Date: Monday, February 1, 2010 - 11:04 am

Thing is... I don't know if the --max-pack-size argument is really that 
used.  I'd expect people relying on that feature to use the config 
variable instead, given that 'git gc' has no idea about --max-pack-size 
anyway.  People using the --max-pack-size argument directly are probably 
doing so only to experiment with it, and then setting the config 
variable, probably using the wrong unit.  The fact that such a 
discrepancy just came to our attention after all the time this feature 
has existed is certainly a good indicator of its popularity.

I understand the really unfortunate timing for such a change.  OTOH 
there is a big advantage to bundle as much incompatibilities together at 
the same time, as people will be prepared for such things already.

While I share your concern for advance warning and such, I think those 
concerns are worth an effort proportional to the depth of user exposure.  
Like for the THREADED_DELTA_SEARCH case, I'm wondering how much pain if 
at all might be saved with a transition plan vs the cost of maintaining 
that plan and carrying the discrepancy further.


Nicolas
--

From: Junio C Hamano
Date: Monday, February 1, 2010 - 11:45 am

Absolutely.  I just wanted to hear that --max-pack-size command line
option is not widely used and it is Ok to change it.
--

From: Junio C Hamano
Date: Wednesday, February 3, 2010 - 7:14 pm

I suspect one of us need to be careful not to forget this thing...

-- >8 --
Subject: pack-objects --max-pack-size=<n> counts in bytes

The --window-memory argument and pack.packsizelimit configuration used by
the same program counted in bytes and honored the standard k/m/g suffixes.
Make this option do the same for consistency.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes-1.7.0.txt   |    6 ++++++
 Documentation/git-pack-objects.txt |    3 ++-
 builtin-pack-objects.c             |    7 +++----
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/RelNotes-1.7.0.txt b/Documentation/RelNotes-1.7.0.txt
index 323ae54..adf8824 100644
--- a/Documentation/RelNotes-1.7.0.txt
+++ b/Documentation/RelNotes-1.7.0.txt
@@ -46,6 +46,12 @@ Notes on behaviour change
    environment, and diff.*.command and diff.*.textconv in the config
    file.
 
+ * "git pack-objects --max-pack-size=<n>" used to count in megabytes,
+   which was inconsistent with its corresponding configuration
+   variable and other options the command takes.  Now it counts in bytes
+   and allows standard k/m/g suffixes to be given.
+
+
 Updates since v1.6.6
 --------------------
 
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 097a147..fdaf775 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -106,7 +106,8 @@ base-name::
 	default.
 
 --max-pack-size=<n>::
-	Maximum size of each output packfile, expressed in MiB.
+	Maximum size of each output packfile, expressed in bytes.  The
+	size can be suffixed with "k", "m", or "g".
 	If specified,  multiple packfiles may be created.
 	The default is unlimited, unless the config variable
 	`pack.packSizeLimit` is set.
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 4a41547..33e11d7 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -2203,11 +2203,10 @@ int cmd_pack_objects(int argc, ...
From: Nicolas Pitre
Date: Wednesday, February 3, 2010 - 7:31 pm

Please hold on.  I think I have a better patch.


Nicolas
--

From: Junio C Hamano
Date: Wednesday, February 3, 2010 - 7:34 pm

Ok, I dropped both fast-import.c mismerge fix and the one you are
responding to.
--

Previous thread: [RFC/PATCH 0/6] add --ff option to cherry-pick by Christian Couder on Monday, February 1, 2010 - 12:55 am. (14 messages)

Next thread: Problem listing GIT repository with accents by =?iso-8859-1?Q?Elli=E9_Computing_Open_Source_Program?= on Monday, February 1, 2010 - 3:48 am. (8 messages)