I am not sure if there is any practical difference between the two ;-). But in either case, it appears that we should first revert d9fa4a8 from 'next' and start from clean slate. It really seems that the patch series did upset too many people; personally I found the first patch still was follow-able, but I do agree that it should have been much smaller and not mixing too many things into one). So, let's do 1. -
This patch series implements part of the ground work for the 'notes'
feature discussed earlier in the thread "[PATCH 00/15] git-note: A
mechanism for providing free-form after-the-fact annotations on commits".
The following patches refactors the tag object by:
1. Unifying parsing and verification of tag objects (patches 1-9)
2. Do better and more thorough verification of tag objects (patches 10-13)
3. Making the "tagger" header mandatory as far as possible (patch 14)
4. Making the "tag" header optional (patch 15)
5. Introducing a new optional "keywords" header (patch 16)
6. Auxiliary changes supporting the above (patches 17-21)
This patch series replaces the earlier patch series of the same name
(plus the current bugfixes on top of that series). It's also much easier
on the eyes for those with 80 chars wide displays. Also, the selftest
suite should run successfully at any point in this patch series.
Here's the shortlog:
Johan Herland (21):
Remove unnecessary code and comments on non-existing 8kB tag object restriction
Return error messages when parsing fails.
Refactoring to make verify_tag() and parse_tag_buffer() more similar
Refactor verification of "tagger" line to be more similar to verification of "type" and "tagger" lines
Make parse_tag_buffer_internal() handle item == NULL
Refactor tag name verification loop to use index 'i' instead of incrementing pointer 'tag_line'
Copy the remaining differences from verify_tag() to parse_tag_buffer_internal()
Switch from verify_tag() to parse_and_verify_tag_buffer() for verifying tag objects in git-mktag
Remove unneeded code from mktag.c
Free mktag's buffer before dying
Rewrite error messages; fix up line lengths
Use prefixcmp() instead of memcmp() for cleaner code with less magic numbers
Collect skipping of header field names and calculation of line lengths in one place
Add proper parsing of "tagger" line, but only when thorough_verify is set
...Signed-off-by: Johan Herland <johan@herland.net> --- mktag.c | 7 ------- 1 files changed, 0 insertions(+), 7 deletions(-) diff --git a/mktag.c b/mktag.c index 070bc96..b82e377 100644 --- a/mktag.c +++ b/mktag.c @@ -12,15 +12,8 @@ * "object <sha1>\n" is 48 bytes, "type tag\n" at 9 bytes is the * shortest possible type-line, and "tag .\n" at 6 bytes is the * shortest single-character-tag line. - * - * We also artificially limit the size of the full object to 8kB. - * Just because I'm a lazy bastard, and if you can't fit a signature - * in that size, you're doing something wrong. */ -/* Some random size */ -#define MAXSIZE (8192) - /* * We refuse to tag something we can't verify. Just because. */ -- 1.5.2 -
This patch brings the already similar tag.c:parse_tag_buffer() and
mktag.c:verify_tag() a little bit closer to eachother.
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 39 ++++++++++++++++++++++++++++++---------
1 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/tag.c b/tag.c
index bbacd59..954ed8a 100644
--- a/tag.c
+++ b/tag.c
@@ -35,6 +35,12 @@ struct tag *lookup_tag(const unsigned char *sha1)
int parse_tag_buffer(struct tag *item, void *data, unsigned long size)
{
+#ifdef NO_C99_FORMAT
+#define PD_FMT "%d"
+#else
+#define PD_FMT "%td"
+#endif
+
int typelen, taglen;
unsigned char sha1[20];
const char *type_line, *tag_line, *sig_line;
@@ -45,28 +51,41 @@ int parse_tag_buffer(struct tag *item, void *data, unsigned long size)
item->object.parsed = 1;
if (size < 64)
- return -1;
- if (memcmp("object ", data, 7) || get_sha1_hex((char *) data + 7, sha1))
- return -1;
+ return error("failed preliminary size check");
+ /* Verify object line */
+ if (memcmp(data, "object ", 7))
+ return error("char%d: does not start with \"object \"", 0);
+
+ if (get_sha1_hex((char *) data + 7, sha1))
+ return error("char%d: could not get SHA1 hash", 7);
+
+ /* Verify type line */
type_line = (char *) data + 48;
- if (memcmp("\ntype ", type_line-1, 6))
- return -1;
+ if (memcmp(type_line - 1, "\ntype ", 6))
+ return error("char%d: could not find \"\\ntype \"", 47);
+ /* Verify tag-line */
tag_line = strchr(type_line, '\n');
- if (!tag_line || memcmp("tag ", ++tag_line, 4))
- return -1;
+ if (!tag_line)
+ return error("char" PD_FMT ": could not find next \"\\n\"", type_line - (char *) data);
+ tag_line++;
+ if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n')
+ return error("char" PD_FMT ": no \"tag \" found", tag_line - (char *) data);
sig_line = strchr(tag_line, '\n');
if (!sig_line)
return -1;
sig_line++;
+ /* Get the actual type */
typelen = tag_line - type_line - strlen("type ...A fair amount of refactoring is included in this patch, none of which
should affect the actual behaviour of the code in any way.
Here are the changes done:
- Refactor out all the (char *) casting in parse_tag_buffer(). Create a
wrapper function (parse_tag_buffer()) that casts _once_ and then calls
parse_tag_buffer_internal() which does the real work
- Variable renaming in parse_tag_buffer_internal():
- sig_line -> tagger_line
- typelen -> type_len
- taglen -> tag_len
- Variable renaming in verify_tag():
- buffer -> data
- typelen -> type_len
- Remove unnecessary variable 'object' from verify_tag(). It has always
the same value as 'data', so use 'data' directly instead
- Fix type of length variables (int -> unsigned long)
- Move null-termination of tag buffer out of verify_tag() and into main().
Signed-off-by: Johan Herland <johan@herland.net>
---
mktag.c | 41 +++++++++++++++++++----------------------
tag.c | 50 +++++++++++++++++++++++++++-----------------------
2 files changed, 46 insertions(+), 45 deletions(-)
diff --git a/mktag.c b/mktag.c
index b82e377..0bc20c8 100644
--- a/mktag.c
+++ b/mktag.c
@@ -32,52 +32,48 @@ static int verify_object(unsigned char *sha1, const char *expected_type)
return ret;
}
+static int verify_tag(char *data, unsigned long size)
+{
#ifdef NO_C99_FORMAT
#define PD_FMT "%d"
#else
#define PD_FMT "%td"
#endif
-static int verify_tag(char *buffer, unsigned long size)
-{
- int typelen;
- char type[20];
unsigned char sha1[20];
- const char *object, *type_line, *tag_line, *tagger_line;
+ char type[20];
+ const char *type_line, *tag_line, *tagger_line;
+ unsigned long type_len;
if (size < 64)
return error("wanna fool me ? you obviously got the size wrong !");
- buffer[size] = 0;
-
/* Verify object line */
- object = buffer;
- if (memcmp(object, "object ", 7))
+ if (memcmp(data, "object ", 7))
return error("char%d: does not start with \"object \"", 0);
- if ...Hi, Quite a lot of changes seem to do this object->data. The patch would have been much more compact if you just had renamed buffer to object instead of This renaming variables has nothing to do with refactoring. In fact, I have a hard time to find code changes (which your subject suggests, as you Ah, so you terminate the buffer here. From the patch, it is relatively hard to see if this line is always hit _before_ the function is called that evidently relies on NUL termination. By moving it here, I think it is much more likely to overlook the fact that the function, which made sure that its assumption was met, needs this line. Whereas if you left it where Is this really necessary? Even if (which I doubt), it has nothing to do with refactoring. If you _want_ to _explicitely_ do arithmetic on a char* instead of void*, I am really reluctant with renamings like these. IMHO they don't buy you much, except for possible confusion. It is evident that sig means the signer (and it is obvious in the case of an unsigned tag, who is meant, This cast (and indeed, this function, if you ask me) is unnecessary. I reviewed only this patch out of your long series, mostly because I found the subject line interesting. But IMHO the patch does not what the subject line suggests. Unfortunately, it's unlikely that I will have time until Monday night to continue with this patch series. Ciao, Dscho -
Hi. Thanks for taking the time to look at (some of) my patch. Most of your questions below can be answered with a single answer: The main purpose of the patch is (as the subject line says) to bring the two functions more in line with eachother. At the time I made the patch, I had made the observation that these function were trying to do much the same thing, albeit in a slightly different form. This patch is therefore about applying a series of (mostly non-functional) refactorings to make their diff as small as possible. This involves "stupid" changes such as renaming variables, tweaking whitespace, reordering the declaration of variables, etc. It's all to make the functions similar to the point where I can diff them, get a small and meaningful result, see the remaining _real_ differences, and in the end, _merge_ them (see patches 7-9). If this whole exercise didn't end up with merging the two functions into one, I would _totally_ agree with you that all this refactoring is more First, (and you'll see this in the commit message) I'm _moving_ (not removing) the NUL termination out of verify_tag() and into main() (which I can be sure is the only caller of verify_tag(), since verify_tag is declared static, and there is no other call in that file). Two reasons for doing this: 1. Make verify_tag more similar to parse_tag_buffer() (because parse_tag_buffer() does not NUL terminate) 2. Do the NUL termination as close to the code that actually populated the buffer with data (the read_pipe() in main()) So now you can ask: Why doesn't parse_tag_buffer() NUL terminate its input? It _that_ safe? And I ran around checking all the callers of parse_tag_buffer, and found that all of them use data (most of which originates from read_sha1_file()) that's already NUL terminated. In the end, I also put in a comment on the resulting function (parse_and_verify_tag_buffer()), explicitly saying the given data _must_ be NUL terminated. Side note: At first I actually thought the ...
Also update selftests to reflect that verification of "tagger" now happens _before_ verification of type name, object sha1 and tag name. Signed-off-by: Johan Herland <johan@herland.net> --- mktag.c | 16 ++++++++-------- t/t3800-mktag.sh | 3 +++ tag.c | 6 +++--- 3 files changed, 14 insertions(+), 11 deletions(-) diff --git a/mktag.c b/mktag.c index 0bc20c8..4dbefab 100644 --- a/mktag.c +++ b/mktag.c @@ -62,12 +62,18 @@ static int verify_tag(char *data, unsigned long size) /* Verify tag-line */ tag_line = strchr(type_line, '\n'); - if (!tag_line) + if (!tag_line++) return error("char" PD_FMT ": could not find next \"\\n\"", type_line - data); - tag_line++; if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n') return error("char" PD_FMT ": no \"tag \" found", tag_line - data); + /* Verify the tagger line */ + tagger_line = strchr(tag_line, '\n'); + if (!tagger_line++) + return error("char" PD_FMT ": could not find next \"\\n\"", tag_line - data); + if (memcmp(tagger_line, "tagger ", 7) || (tagger_line[7] == '\n')) + return error("char" PD_FMT ": could not find \"tagger\"", tagger_line - data); + /* Get the actual type */ type_len = tag_line - type_line - strlen("type \n"); if (type_len >= sizeof(type)) @@ -90,12 +96,6 @@ static int verify_tag(char *data, unsigned long size) return error("char" PD_FMT ": could not verify tag name", tag_line - data); } - /* Verify the tagger line */ - tagger_line = tag_line; - - if (memcmp(tagger_line, "tagger", 6) || (tagger_line[6] == '\n')) - return error("char" PD_FMT ": could not find \"tagger\"", tagger_line - data); - /* TODO: check for committer info + blank line? */ /* Also, the minimum length is probably + "tagger .", or 63+8=71 */ diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh index 7c7e433..b4edb4d 100755 --- a/t/t3800-mktag.sh +++ b/t/t3800-mktag.sh @@ -133,6 +133,7 @@ cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 ...
This is in preparation for unifying verify_tag() and
parse_tag_buffer_internal().
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 54 +++++++++++++++++++++++++++++-------------------------
1 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/tag.c b/tag.c
index 19c66cd..b134967 100644
--- a/tag.c
+++ b/tag.c
@@ -46,9 +46,11 @@ static int parse_tag_buffer_internal(struct tag *item, const char *data, const u
const char *type_line, *tag_line, *tagger_line;
unsigned long type_len, tag_len;
- if (item->object.parsed)
- return 0;
- item->object.parsed = 1;
+ if (item) {
+ if (item->object.parsed)
+ return 0;
+ item->object.parsed = 1;
+ }
if (size < 64)
return error("failed preliminary size check");
@@ -85,28 +87,30 @@ static int parse_tag_buffer_internal(struct tag *item, const char *data, const u
memcpy(type, type_line + 5, type_len);
type[type_len] = '\0';
- tag_len = tagger_line - tag_line - strlen("tag \n");
- item->tag = xmalloc(tag_len + 1);
- memcpy(item->tag, tag_line + 4, tag_len);
- item->tag[tag_len] = '\0';
-
- if (!strcmp(type, blob_type)) {
- item->tagged = &lookup_blob(sha1)->object;
- } else if (!strcmp(type, tree_type)) {
- item->tagged = &lookup_tree(sha1)->object;
- } else if (!strcmp(type, commit_type)) {
- item->tagged = &lookup_commit(sha1)->object;
- } else if (!strcmp(type, tag_type)) {
- item->tagged = &lookup_tag(sha1)->object;
- } else {
- error("Unknown type %s", type);
- item->tagged = NULL;
- }
-
- if (item->tagged && track_object_refs) {
- struct object_refs *refs = alloc_object_refs(1);
- refs->ref[0] = item->tagged;
- set_object_refs(&item->object, refs);
+ if (item) {
+ tag_len = tagger_line - tag_line - strlen("tag \n");
+ item->tag = xmalloc(tag_len + 1);
+ memcpy(item->tag, tag_line + 4, tag_len);
+ item->tag[tag_len] = '\0';
+
+ if (!strcmp(type, blob_type)) {
+ item->tagged = &lookup_blob(sha1)->object;
+ } else if (!strcmp(type, tree_type)) ...Hi, The patch would have been so much simpler, and so much more descriptive of what you're doing, had you just said if (!item) return 0; Ciao, Dscho -
Signed-off-by: Johan Herland <johan@herland.net>
---
mktag.c | 29 ++++++++++++++++-------------
1 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/mktag.c b/mktag.c
index 4dbefab..2e70504 100644
--- a/mktag.c
+++ b/mktag.c
@@ -81,19 +81,22 @@ static int verify_tag(char *data, unsigned long size)
memcpy(type, type_line + 5, type_len);
type[type_len] = '\0';
- /* Verify that the object matches */
- if (verify_object(sha1, type))
- return error("char%d: could not verify object %s", 7, sha1_to_hex(sha1));
-
- /* Verify the tag-name: we don't allow control characters or spaces in it */
- tag_line += 4;
- for (;;) {
- unsigned char c = *tag_line++;
- if (c == '\n')
- break;
- if (c > ' ')
- continue;
- return error("char" PD_FMT ": could not verify tag name", tag_line - data);
+ {
+ unsigned long i;
+
+ /* Verify that the object matches */
+ if (verify_object(sha1, type))
+ return error("char%d: could not verify object %s", 7, sha1_to_hex(sha1));
+
+ /* Verify the tag-name: we don't allow control characters or spaces in it */
+ for (i = 4;;) {
+ unsigned char c = tag_line[i++];
+ if (c == '\n')
+ break;
+ if (c > ' ')
+ continue;
+ return error("char" PD_FMT ": could not verify tag name", tag_line + i - data);
+ }
}
/* TODO: check for committer info + blank line? */
--
1.5.2
-
What is this change good for? How did you justify the type selection for your loop index variable? IOW, the patch looks very useless. -
I agree. By itself, the patch is useless. However, if you look at the next patch, you'll see that this exact piece of code is moved from verify_tag() to parse_and_verify_tag_buffer(), and in the new context, we can't increment tag_line, since the code that follows depends on tag_line not being moved. In other words this patch is here so that the next patch will be easier to follow. because it's _literally_ moving copying code from verify_tag() and pasting it in parse_and_verify_tag_buffer(). I'm sorry if this is not clear from the patches. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Hi, Then it shouldn't be there. It seems that you do not place the cuts between patches at the _conceptual_ layer. Therefore, they seem intrusive and often the meaning evades me. So, if I understood the purpose of this patch series correctly, namely to use the same verification routines both for creation as for validation of tags, you could have - moved one function into the library (the stricter one), saying "move this_function() into libgit.a to make it usable from git-bla" in the commit body, - used that from the other program, removing the now-unused function, - and then changed the behaviour to be more chatty or some such. As it is, you have a mix of conceptually different changes in almost every patch, and some changes that conceptually belong into the same patch, are not. Be that as may, I think it is not a good change to reuse the same function like you did, exactly because one version _should_ be more forgiving than the other. Ciao, Dscho -
Hi, Do you realize that half of your diff consists of reindenting, just because you introduced this ugly construct, instead of being a good boy and put the declarations where they belong -- at the beginning of the Yes, you can write this construct. That does not change the fact that it gives me eye cancer. Ciao, Dscho -
Rename parse_tag_buffer_internal() to parse_and_verify_tag_buffer() since
it now does tag object verification as well.
Add a new parameter 'thorough_verify' for turning on/off the extra code
to be run when verifying tag objects (as opposed to general parsing).
verify_tag() and parse_and_verify_tag_buffer() are now functionally
equivalent, provided that parse_and_verify_tag_buffer() is called with
item == NULL and thorough_verification != 0.
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
1 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/tag.c b/tag.c
index b134967..3896e45 100644
--- a/tag.c
+++ b/tag.c
@@ -33,7 +33,26 @@ struct tag *lookup_tag(const unsigned char *sha1)
return (struct tag *) obj;
}
-static int parse_tag_buffer_internal(struct tag *item, const char *data, const unsigned long size)
+/*
+ * We refuse to tag something we can't verify. Just because.
+ */
+static int verify_object(unsigned char *sha1, const char *expected_type)
+{
+ int ret = -1;
+ enum object_type type;
+ unsigned long size;
+ void *buffer = read_sha1_file(sha1, &type, &size);
+
+ if (buffer) {
+ if (type == type_from_string(expected_type))
+ ret = check_sha1_signature(sha1, buffer, size, expected_type);
+ free(buffer);
+ }
+ return ret;
+}
+
+static int parse_and_verify_tag_buffer(struct tag *item,
+ const char *data, const unsigned long size, int thorough_verify)
{
#ifdef NO_C99_FORMAT
#define PD_FMT "%d"
@@ -79,6 +98,10 @@ static int parse_tag_buffer_internal(struct tag *item, const char *data, const u
tagger_line = strchr(tag_line, '\n');
if (!tagger_line++)
return error("char" PD_FMT ": could not find next \"\\n\"", tag_line - data);
+ if (thorough_verify) {
+ if (memcmp(tagger_line, "tagger ", 7) || (tagger_line[7] == '\n'))
+ return error("char" PD_FMT ": could not find \"tagger\"", tagger_line - data);
+ }
/* Get the actual type */
type_len = ...This looks very familiar. Haven't you just made a very useless patch which had this very same code? How about putting it in its own function and just call it from these two places? And what problem do you have with pointers?! -
I just answered your comment on the previous patch, and that answer should apply here as well. I'm probably splitting this up into too small pieces, since I keep getting comments that fail to see the overall picture of what I'm trying to do, namely taking two similar pieces of code and slowly unifying them to the point where I can replace one of them by a call to the other (see the two next patches). Hope this helps, ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Hi, Maybe I said that your patch was too large. But then, I said something much more important: hard-to-review. These small patches, split in a manner making it even more difficult to understand what you want to accomplish, do not help. Yes, you should make small patches. Even small patch series. But in such a fashion that a reviewer can see that it is a good patch[*1*]. Just lean back, look at your patches, and ask yourself how you would have reacted if you had reviewed them. Ciao, Dscho *1* A good patch follows the immortal words of Saint Exupery: A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away. -
This involves exposing parse_and_verify_tag_buffer() in the tag API
(tag.h).
Also synchronize selftest with change in error message.
Signed-off-by: Johan Herland <johan@herland.net>
---
mktag.c | 5 ++---
t/t3800-mktag.sh | 2 +-
tag.c | 2 +-
tag.h | 2 ++
4 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/mktag.c b/mktag.c
index 2e70504..5780f33 100644
--- a/mktag.c
+++ b/mktag.c
@@ -125,9 +125,8 @@ int main(int argc, char **argv)
}
buffer[size] = 0;
- /* Verify it for some basic sanity: it needs to start with
- "object <sha1>\ntype\ntagger " */
- if (verify_tag(buffer, size) < 0)
+ /* Verify tag object data */
+ if (parse_and_verify_tag_buffer(0, buffer, size, 1))
die("invalid tag signature file");
if (write_sha1_file(buffer, size, tag_type, result_sha1) < 0)
diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh
index b4edb4d..0e87d2a 100755
--- a/t/t3800-mktag.sh
+++ b/t/t3800-mktag.sh
@@ -34,7 +34,7 @@ too short for a tag
EOF
cat >expect.pat <<EOF
-^error: .*size wrong.*$
+^error: .*size.*$
EOF
check_verify_failure 'Tag object length check'
diff --git a/tag.c b/tag.c
index 3896e45..e12b9fc 100644
--- a/tag.c
+++ b/tag.c
@@ -51,7 +51,7 @@ static int verify_object(unsigned char *sha1, const char *expected_type)
return ret;
}
-static int parse_and_verify_tag_buffer(struct tag *item,
+int parse_and_verify_tag_buffer(struct tag *item,
const char *data, const unsigned long size, int thorough_verify)
{
#ifdef NO_C99_FORMAT
diff --git a/tag.h b/tag.h
index 7a0cb00..f341b7f 100644
--- a/tag.h
+++ b/tag.h
@@ -13,6 +13,8 @@ struct tag {
};
extern struct tag *lookup_tag(const unsigned char *sha1);
+extern int parse_and_verify_tag_buffer(struct tag *item,
+ const char *data, const unsigned long size, int thorough_verify);
extern int parse_tag_buffer(struct tag *item, void *data, unsigned long size);
extern int parse_tag(struct tag *item);
extern struct object ...Signed-off-by: Johan Herland <johan@herland.net>
---
mktag.c | 94 ---------------------------------------------------------------
1 files changed, 0 insertions(+), 94 deletions(-)
diff --git a/mktag.c b/mktag.c
index 5780f33..4f226ab 100644
--- a/mktag.c
+++ b/mktag.c
@@ -14,100 +14,6 @@
* shortest single-character-tag line.
*/
-/*
- * We refuse to tag something we can't verify. Just because.
- */
-static int verify_object(unsigned char *sha1, const char *expected_type)
-{
- int ret = -1;
- enum object_type type;
- unsigned long size;
- void *buffer = read_sha1_file(sha1, &type, &size);
-
- if (buffer) {
- if (type == type_from_string(expected_type))
- ret = check_sha1_signature(sha1, buffer, size, expected_type);
- free(buffer);
- }
- return ret;
-}
-
-static int verify_tag(char *data, unsigned long size)
-{
-#ifdef NO_C99_FORMAT
-#define PD_FMT "%d"
-#else
-#define PD_FMT "%td"
-#endif
-
- unsigned char sha1[20];
- char type[20];
- const char *type_line, *tag_line, *tagger_line;
- unsigned long type_len;
-
- if (size < 64)
- return error("wanna fool me ? you obviously got the size wrong !");
-
- /* Verify object line */
- if (memcmp(data, "object ", 7))
- return error("char%d: does not start with \"object \"", 0);
-
- if (get_sha1_hex(data + 7, sha1))
- return error("char%d: could not get SHA1 hash", 7);
-
- /* Verify type line */
- type_line = data + 48;
- if (memcmp(type_line - 1, "\ntype ", 6))
- return error("char%d: could not find \"\\ntype \"", 47);
-
- /* Verify tag-line */
- tag_line = strchr(type_line, '\n');
- if (!tag_line++)
- return error("char" PD_FMT ": could not find next \"\\n\"", type_line - data);
- if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n')
- return error("char" PD_FMT ": no \"tag \" found", tag_line - data);
-
- /* Verify the tagger line */
- tagger_line = strchr(tag_line, '\n');
- if (!tagger_line++)
- return error("char" PD_FMT ": could not find next \"\\n\"", tag_line - ...So, you do some useless changes just to remove the function completely afterwards? -
Yes. Basically so that people can follow my process. If you don't want the intermediary/useless states, just look at my first patch series that was replaced by this series because it was too bulky and disruptive to follow. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Signed-off-by: Johan Herland <johan@herland.net>
---
mktag.c | 13 ++++++++-----
1 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/mktag.c b/mktag.c
index 4f226ab..31eadd8 100644
--- a/mktag.c
+++ b/mktag.c
@@ -21,7 +21,7 @@ int main(int argc, char **argv)
unsigned char result_sha1[20];
if (argc != 1)
- usage("git-mktag < signaturefile");
+ usage("git-mktag < tag_data_file");
setup_git_directory();
@@ -32,14 +32,17 @@ int main(int argc, char **argv)
buffer[size] = 0;
/* Verify tag object data */
- if (parse_and_verify_tag_buffer(0, buffer, size, 1))
- die("invalid tag signature file");
+ if (parse_and_verify_tag_buffer(0, buffer, size, 1)) {
+ free(buffer);
+ die("invalid tag data file");
+ }
- if (write_sha1_file(buffer, size, tag_type, result_sha1) < 0)
+ if (write_sha1_file(buffer, size, tag_type, result_sha1) < 0) {
+ free(buffer);
die("unable to write tag file");
+ }
free(buffer);
-
printf("%s\n", sha1_to_hex(result_sha1));
return 0;
}
--
1.5.2
-
This, and the similar one below are useless. You're destroying the process, what do you free that buffer for? Either handle the error case or do not needlessly complicate your change, which really also absolutely unneeded. -
Well, I was taught to treat my memory with care. Right now it doesn't make any difference in practice (except that Valgrind might be a bit happier with it), but in the future -- with the libifaction effort and whatnot -- you never know what might happen to this piece of code, and I'd like to stay on the safe side. Feel free to drop this patch from the series if I'm the only one thinking like this. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
How do you treat your performance? Besides, was that systems with common address space So that people have to check your free as well (they will have to, they come looking for die-calls). You just made more work for them. -
Hopefully with care, as well. However, I tend to look at performance _after_ Nope. Never programmed on either. I thought care with memory was generally considered a good principle. If I'm wrong, please point me at the relevant Ok. Drop it. This isn't particularily important to me. I just try to follow good principles when I can. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Hi, On Sat, 9 Jun 2007, Johan Herland wrote: relate with your commit message? You might understand that people _could_ get the impression that the patches were not very carefully crafted. Or even worse, the impression, that it was tried to slip some changes by, with a totally unrelated "official" purpose. (Again, reminds me of politics: it's like slipping an intrusive privacy law into an agriculture related law at the 11th hour, just before Christmas, when people do not really pay attention.) Ciao, Dscho -
Also update selftests to reflect new error messages. Signed-off-by: Johan Herland <johan@herland.net> --- t/t3800-mktag.sh | 22 +++++++++++----------- tag.c | 44 +++++++++++++++++++++++++++++--------------- 2 files changed, 40 insertions(+), 26 deletions(-) diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh index 0e87d2a..3bce5e0 100755 --- a/t/t3800-mktag.sh +++ b/t/t3800-mktag.sh @@ -49,7 +49,7 @@ tag mytag EOF cat >expect.pat <<EOF -^error: char0: .*"object "$ +^error: .*char 0.*"object ".*$ EOF check_verify_failure '"object" line label check' @@ -64,7 +64,7 @@ tag mytag EOF cat >expect.pat <<EOF -^error: char7: .*SHA1 hash$ +^error: .*char 7.*SHA1 hash.*$ EOF check_verify_failure '"object" line SHA1 check' @@ -79,7 +79,7 @@ tag mytag EOF cat >expect.pat <<EOF -^error: char47: .*"[\]ntype "$ +^error: .*char 47.*"[\]ntype ".*$ EOF check_verify_failure '"type" line label check' @@ -91,7 +91,7 @@ echo "object 779e9b33986b1c2670fff52c5067603117b3e895" >tag.sig printf "type tagsssssssssssssssssssssssssssssss" >>tag.sig cat >expect.pat <<EOF -^error: char48: .*"[\]n"$ +^error: .*char 48.*"[\]n".*$ EOF check_verify_failure '"type" line eol check' @@ -106,7 +106,7 @@ xxx mytag EOF cat >expect.pat <<EOF -^error: char57: no "tag " found$ +^error: .*char 57.*"tag ".*$ EOF check_verify_failure '"tag" line label check #1' @@ -121,7 +121,7 @@ tag EOF cat >expect.pat <<EOF -^error: char87: no "tag " found$ +^error: .*char 87.*"tag ".*$ EOF check_verify_failure '"tag" line label check #2' @@ -137,7 +137,7 @@ tagger a EOF cat >expect.pat <<EOF -^error: char53: type too long$ +^error: .*char 53.*Type too long.*$ EOF check_verify_failure '"type" line type-name length check' @@ -153,7 +153,7 @@ tagger a EOF cat >expect.pat <<EOF -^error: char7: could not verify object.*$ +^error: .*char 7.*Could not verify tagged object.*$ EOF check_verify_failure 'verify object ...
Hi, Had you split the patch conceptually, the changes to the tests would have been included where appropriate. I consider a patch _breaking_ a test case, with a follow up patch to the test case, a _bug_. And I consider a change in error messages not good, _unless_ the changes have a real value in practice. Which I maintain they do not, in this patch-series' case. Ciao, Dscho -
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 15 +++++++++++----
1 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/tag.c b/tag.c
index c7c75b3..ac76ec0 100644
--- a/tag.c
+++ b/tag.c
@@ -51,6 +51,13 @@ static int verify_object(unsigned char *sha1, const char *expected_type)
return ret;
}
+/*
+ * Perform parsing and verification of tag object data.
+ *
+ * The 'item' parameter may be set to NULL if only verification is desired.
+ *
+ * The given data _must_ be null-terminated.
+ */
int parse_and_verify_tag_buffer(struct tag *item,
const char *data, const unsigned long size, int thorough_verify)
{
@@ -75,7 +82,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
return error("Tag object failed preliminary size check");
/* Verify object line */
- if (memcmp(data, "object ", 7))
+ if (prefixcmp(data, "object "))
return error("Tag object (@ char 0): "
"Does not start with \"object \"");
@@ -84,7 +91,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
/* Verify type line */
type_line = data + 48;
- if (memcmp(type_line - 1, "\ntype ", 6))
+ if (prefixcmp(type_line - 1, "\ntype "))
return error("Tag object (@ char 47): "
"Could not find \"\\ntype \"");
@@ -94,7 +101,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
return error("Tag object (@ char " PD_FMT "): "
"Could not find \"\\n\" after \"type\"",
type_line - data);
- if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n')
+ if (prefixcmp(tag_line, "tag ") || tag_line[4] == '\n')
return error("Tag object (@ char " PD_FMT "): "
"Could not find \"tag \"", tag_line - data);
@@ -105,7 +112,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
"Could not find \"\\n\" after \"tag\"",
tag_line - data);
if (thorough_verify) {
- if (memcmp(tagger_line, "tagger ", 7) || (tagger_line[7] == '\n'))
+ if (prefixcmp(tagger_line, "tagger ") || (tagger_line[7] == '\n'))
return error("Tag object (@ char " ...This hunk really belongs into commit which introduced the function parse_and_verify_tag_buffer. -
Yes. I'm sorry it slipped out of that patch and into this one. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Hi, FWIW I think that _these_ changes are actually somewhat worth it. And they should have come _instead_ of moving the order of the memcmp() around (you know which patch that was). Ciao, Dscho -
For each of the parsed lines we at some point skip past its initial
identifier ("type ", "tag ", etc.). We also at some point calculate the
length of the remaining line. This patch moves these calculations into
one place. This provides _one_ place for all header lines where their
respective pointers start pointing at the header value (instead of the
start of the line), and their lengths are calculated.
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 21 ++++++++++++++-------
1 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/tag.c b/tag.c
index ac76ec0..9a6924f 100644
--- a/tag.c
+++ b/tag.c
@@ -118,12 +118,20 @@ int parse_and_verify_tag_buffer(struct tag *item,
tagger_line - data);
}
+ /*
+ * Advance header field pointers past their initial identifier.
+ * Calculate lengths of header fields.
+ */
+ type_line += strlen("type ");
+ type_len = tag_line - type_line - 1;
+ tag_line += strlen("tag ");
+ tag_len = tagger_line - tag_line - 1;
+
/* Get the actual type */
- type_len = tag_line - type_line - strlen("type \n");
if (type_len >= sizeof(type))
return error("Tag object (@ char " PD_FMT "): "
- "Type too long", type_line + 5 - data);
- memcpy(type, type_line + 5, type_len);
+ "Type too long", type_line - data);
+ memcpy(type, type_line, type_len);
type[type_len] = '\0';
if (thorough_verify) {
@@ -136,7 +144,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
sha1_to_hex(sha1));
/* Verify tag name: disallow control characters or spaces */
- for (i = 4;;) {
+ for (i = 0;;) {
unsigned char c = tag_line[i++];
if (c == '\n')
break;
@@ -154,9 +162,8 @@ int parse_and_verify_tag_buffer(struct tag *item,
}
if (item) {
- tag_len = tagger_line - tag_line - strlen("tag \n");
item->tag = xmalloc(tag_len + 1);
- memcpy(item->tag, tag_line + 4, tag_len);
+ memcpy(item->tag, tag_line, tag_len);
item->tag[tag_len] = '\0';
if (!strcmp(type, blob_type)) ...Hi, This change does not clarify anything. It is exactly as confusing as I know you introduced this in another patch. This is an ugly construct. If you want people to review your patches, you might as well put in the effort to make the reviewing more pleasant. Ciao, Dscho -
Be explicit about the fact that the "tagger" line is now considered a
mandatory part of the tag object. There are however old tags (from before
July 2005) that don't have a "tagger" line. We therefore consider the
"tagger" line _optional_ when parsing tags without thorough_verify set.
In practice this means that verification of a missing "tagger" line will
only fail when adding or fsck-ing tags.
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 48 ++++++++++++++++++++++++++++++++++--------------
1 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/tag.c b/tag.c
index 9a6924f..c373c86 100644
--- a/tag.c
+++ b/tag.c
@@ -69,8 +69,9 @@ int parse_and_verify_tag_buffer(struct tag *item,
unsigned char sha1[20];
char type[20];
- const char *type_line, *tag_line, *tagger_line;
- unsigned long type_len, tag_len;
+ const char *type_line, *tag_line, *tagger_line;
+ unsigned long type_len, tag_len, tagger_len;
+ const char *header_end;
if (item) {
if (item->object.parsed)
@@ -81,7 +82,7 @@ int parse_and_verify_tag_buffer(struct tag *item,
if (size < 64)
return error("Tag object failed preliminary size check");
- /* Verify object line */
+ /* Verify mandatory object line */
if (prefixcmp(data, "object "))
return error("Tag object (@ char 0): "
"Does not start with \"object \"");
@@ -89,13 +90,13 @@ int parse_and_verify_tag_buffer(struct tag *item,
if (get_sha1_hex(data + 7, sha1))
return error("Tag object (@ char 7): Could not get SHA1 hash");
- /* Verify type line */
+ /* Verify mandatory type line */
type_line = data + 48;
if (prefixcmp(type_line - 1, "\ntype "))
return error("Tag object (@ char 47): "
"Could not find \"\\ntype \"");
- /* Verify tag-line */
+ /* Verify mandatory tag line */
tag_line = strchr(type_line, '\n');
if (!tag_line++)
return error("Tag object (@ char " PD_FMT "): "
@@ -105,27 +106,46 @@ int parse_and_verify_tag_buffer(struct tag *item,
return ...Hi, No. The "before July 2005" part is _not_ the reason that we consider this line optional. The fact that it is bad to fail on a fetch, just because you happen to have an invalid tag in your repository, is a good reason not to. The fact that it is bad to fail on a git branch, just because you happen to have an invalid tag in your repository, is a good reason not to. The fact that it is bad to fail on an fsck, just because you happen to have an invalid tag in your repository, is a good reason not to. And yes, if I remember correctly, your original patch did exactly that. The paradigm to follow is: fail gracefully. I could have an invalid _commit_ in my repository, and would still want _every_ Git operation to succeed, _as long_ as it does not touch that bad object. And I damned well want git-fsck to not crash, just because some assumptions are made. Since this is a fundamental critique on your patch series, I will do the detailed review on _this_ patch in another mail. Ciao, Dscho -
Hi, Hmph. I think everybody reading C code understands that this is mandatory. I even think that the comment is useless. It is sort of a I maintain that even with thorough checking, it is _wrong_ to error on a missing tagger. Since we have to deal with tagger-less tags _anyway_, and since it is not like you could really do something about it (the tag is Does that really make sense? You effectively treat a missing field as if Hmpf. Ciao, Dscho -
The tag line is now optional. If not given in the tag object data, it defaults to the empty string ("") in the parsed tag object. The patch also adds a change to git-show; when asked to display a tag object with no name (missing "tag" header), we will show the tag's sha1 instead of an empty string. Finally the patch includes some tweaks to the selftests to make them work with optional tag names. Signed-off-by: Johan Herland <johan@herland.net> --- builtin-log.c | 2 +- t/t3800-mktag.sh | 6 +++--- tag.c | 51 +++++++++++++++++++++++++++++---------------------- tag.h | 2 +- 4 files changed, 34 insertions(+), 27 deletions(-) diff --git a/builtin-log.c b/builtin-log.c index 212cdfc..8a238c7 100644 --- a/builtin-log.c +++ b/builtin-log.c @@ -181,7 +181,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) printf("%stag %s%s\n\n", diff_get_color(rev.diffopt.color_diff, DIFF_COMMIT), - t->tag, + *(t->tag) ? t->tag : name, diff_get_color(rev.diffopt.color_diff, DIFF_RESET)); ret = show_object(o->sha1, 1); diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh index 3bce5e0..3381239 100755 --- a/t/t3800-mktag.sh +++ b/t/t3800-mktag.sh @@ -106,7 +106,7 @@ xxx mytag EOF cat >expect.pat <<EOF -^error: .*char 57.*"tag ".*$ +^error: .*char 57.*$ EOF check_verify_failure '"tag" line label check #1' @@ -121,7 +121,7 @@ tag EOF cat >expect.pat <<EOF -^error: .*char 87.*"tag ".*$ +^error: .*char 87.*$ EOF check_verify_failure '"tag" line label check #2' @@ -169,7 +169,7 @@ tagger a EOF cat >expect.pat <<EOF -^error: .*char 67.*Could not verify tag name.*$ +^error: .*char 66.*Could not verify tag name.*$ EOF check_verify_failure 'verify tag-name check' diff --git a/tag.c b/tag.c index c373c86..fb678d7 100644 --- a/tag.c +++ b/tag.c @@ -96,15 +96,21 @@ int parse_and_verify_tag_buffer(struct tag *item, return error("Tag object (@ char 47): " ...
Hi, If you don't actually _test_ missing tag names, you might just as well This is misleading. What you wanted to say is t->tag[0] == '\0', or *(t->tag) == '\0'. As you wrote it, you have to think a couple of times why it is okay to dereference t->tag, to check if you say t->tag. Besides, it breaks if you _do_ have an empty tag. In that case, I _want_ to see that it is actually empty, and _not_ the SHA1 substituted for it. Ciao, Dscho -
This patch introduces a new optional header line to the tag object, called
"keywords". The "keywords" line may contain a comma-separated list of
custom keywords associated with the tag object. There are two "special"
keywords, however: "tag" and "note": When the "keywords" header is
missing, its default value is set to "tag" if a "tag" header is
present; else the default "keywords" value is set to "note". The
"keywords" header is meant to be used by porcelains for classifying
different types of tag objects. This classification may then be used to
filter tag objects in the presentation layer (e.g. by implementing
extra filter options to --decorate, etc.)
Signed-off-by: Johan Herland <johan@herland.net>
---
tag.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
tag.h | 1 +
2 files changed, 54 insertions(+), 7 deletions(-)
diff --git a/tag.c b/tag.c
index fb678d7..1caec19 100644
--- a/tag.c
+++ b/tag.c
@@ -69,8 +69,8 @@ int parse_and_verify_tag_buffer(struct tag *item,
unsigned char sha1[20];
char type[20];
- const char *type_line, *tag_line, *tagger_line;
- unsigned long type_len, tag_len, tagger_len;
+ const char *type_line, *tag_line, *keywords_line, *tagger_line;
+ unsigned long type_len, tag_len, keywords_len, tagger_len;
const char *header_end;
if (item) {
@@ -103,15 +103,26 @@ int parse_and_verify_tag_buffer(struct tag *item,
"Could not find \"\\n\" after \"type\"",
type_line - data);
if (prefixcmp(tag_line, "tag ")) /* no tag name given */
- tagger_line = tag_line;
+ keywords_line = tag_line;
else { /* tag name given */
- tagger_line = strchr(tag_line, '\n');
- if (!tagger_line++)
+ keywords_line = strchr(tag_line, '\n');
+ if (!keywords_line++)
return error("Tag object (@ char " PD_FMT "): "
"Could not find \"\\n\" after \"tag\"",
tag_line - data);
}
+ /* Verify optional keywords line */
+ if (prefixcmp(keywords_line, "keywords ")) /* no ...Who frees the keywords and what's wrong with strndup? -
Hmm. Same as for the "tag" line, and the rest of the tag object, I guess. I agree this should probably be specified. What are the rules for other The code that should free the tag name should also free the keywords. No code appear to do this at the moment (regardless of whether you use my patch series or not), which is a shame. We should make sure all objects are properly deallocated. This is currently a general problem in git, isn't it? ...Johan -- Johan Herland, <johan@herland.net> www.herland.net -
Using xstrndup() yields more compact and readable code than using
xmalloc(), memcpy() and manual NUL termination.
Thanks to Alex Riesen <raa.lkml@gmail.com> for suggesting this.
Also fixes a buglet where item->keywords would always be set to "tag",
even if item->tag was empty.
Signed-off-by: Johan Herland <johan@herland.net>
---
Nothing.
...Johan
tag.c | 27 ++++++++++-----------------
1 files changed, 10 insertions(+), 17 deletions(-)
diff --git a/tag.c b/tag.c
index c3a2855..2307ec9 100644
--- a/tag.c
+++ b/tag.c
@@ -219,26 +219,19 @@ int parse_and_verify_tag_buffer(struct tag *item,
}
if (item) { /* Store parsed information into item */
- if (tag_len) { /* optional tag name was given */
- item->tag = xmalloc(tag_len + 1);
- memcpy(item->tag, tag_line, tag_len);
- item->tag[tag_len] = '\0';
- }
- else { /* optional tag name not given */
- item->tag = xmalloc(1);
- item->tag[0] = '\0';
- }
+ if (tag_len) /* optional tag name was given */
+ item->tag = xstrndup(tag_line, tag_len);
+ else /* optional tag name not given */
+ item->tag = xstrndup("", 0);
- if (keywords_len) { /* optional keywords string was given */
- item->keywords = xmalloc(keywords_len + 1);
- memcpy(item->keywords, keywords_line, keywords_len);
- item->keywords[keywords_len] = '\0';
- }
+ if (keywords_len) /* optional keywords string was given */
+ item->keywords = xstrndup(keywords_line, keywords_len);
else { /* optional keywords string not given. Set default */
/* if tag name is set, use "tag"; else use "note" */
- const char *default_kw = item->tag ? "tag" : "note";
- item->keywords = xmalloc(strlen(default_kw) + 1);
- memcpy(item->keywords, default_kw, strlen(default_kw) + 1);
+ if (*(item->tag))
+ item->keywords = xstrndup("tag", 3);
+ else
+ item->keywords = xstrndup("note", 4);
}
if (!strcmp(type, blob_type)) {
--
1.5.2.1.144.gabc40
-
It is DEL, but as the code uses uchar, it probably also error on 0x80 or higher, if the intent is "printable ASCII". -
Signed-off-by: Johan Herland <johan@herland.net>
---
Is this better?
...Johan
tag.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tag.c b/tag.c
index 2307ec9..2b92465 100644
--- a/tag.c
+++ b/tag.c
@@ -183,8 +183,8 @@ int parse_and_verify_tag_buffer(struct tag *item,
/* Verify tag name: disallow control characters or spaces */
if (tag_len) { /* tag name was given */
for (i = 0; i < tag_len; ++i) {
- unsigned char c = tag_line[i];
- if (c > ' ' && c != 0x7f)
+ char c = tag_line[i];
+ if (c > ' ' && c < 0x7f)
continue;
return FAIL("Tag object (@ char " PD_FMT "): "
"Could not verify tag name",
@@ -198,13 +198,13 @@ int parse_and_verify_tag_buffer(struct tag *item,
*/
if (keywords_len) { /* keywords line was given */
for (i = 0; i < keywords_len; ++i) {
- unsigned char c = keywords_line[i];
+ char c = keywords_line[i];
if (c == ',' && keywords_line[i + 1] == ',')
/* consecutive commas */
return FAIL("Tag object (@ char "
PD_FMT "): Found empty keyword",
keywords_line + i - data);
- if (c > ' ' && c != 0x7f)
+ if (c > ' ' && c < 0x7f)
continue;
return FAIL("Tag object (@ char " PD_FMT "): "
"Could not verify keywords",
--
1.5.2.1.144.gabc40
-
I said "*if* the intent is to limit to printable ASCII". It is still unclear what use cases the "keywords" need to support (e.g. do we want to have "pick tags that have keyword that matches this pattern"? if so, what kind of pattern language do we want to use? Glob? Regexp? Would it be more convenient if the keywords are treated case insensitively? Is there a useful use case if you are allowed to use people's names as keywords? Is it reasonable to assume that the keywords can be split with a comma? Do we want to allow a comma as part of keywords? If so, how would we quote a comma that is inside a keyword? etc.). It depends on the answers to these questions if "printable ASCII" is a good set. As a general rule, if you make the initially allowed set of values too wide, it gets _extremely_ hard to tighten it later. So if we are to stick to printable ASCII (iow, "no people's names or names of location as keywords"), I would even suggest to limit the valid set further, say "^[-A-Za-z0-9_+.]+$", at least initially. -
Also update minimum tag object length to the new minimum length after refactoring. Signed-off-by: Johan Herland <johan@herland.net> --- mktag.c | 30 ++++++++++++++++++++++-------- tag.c | 2 +- 2 files changed, 23 insertions(+), 9 deletions(-) diff --git a/mktag.c b/mktag.c index 31eadd8..af0cfa6 100644 --- a/mktag.c +++ b/mktag.c @@ -2,16 +2,30 @@ #include "tag.h" /* - * A signature file has a very simple fixed format: four lines - * of "object <sha1>" + "type <typename>" + "tag <tagname>" + + * Tag object data has the following format: two mandatory lines of + * "object <sha1>" + "type <typename>", plus two optional lines of + * "tag <tagname>" + "keywords <keywords>", plus a mandatory line of * "tagger <committer>", followed by a blank line, a free-form tag - * message and a signature block that git itself doesn't care about, - * but that can be verified with gpg or similar. + * message and an optional signature block that git itself doesn't + * care about, but that can be verified with gpg or similar. * - * The first three lines are guaranteed to be at least 63 bytes: - * "object <sha1>\n" is 48 bytes, "type tag\n" at 9 bytes is the - * shortest possible type-line, and "tag .\n" at 6 bytes is the - * shortest single-character-tag line. + * <sha1> represents the object pointed to by this tag, <typename> is + * the type of the object pointed to ("tag", "blob", "tree" or "commit"), + * <tagname> is the name of this tag object (and must correspond to the + * name of the corresponding ref (if any) in '.git/refs/'). <keywords> is + * a comma-separated list of keywords associated with this tag object, and + * <committer> holds the "name <email>" of the tag creator and timestamp + * of when the tag object was created (analogous to "committer" in commit + * objects). + * + * The first two lines are guaranteed to be at least 57 bytes: + * "object <sha1>\n" is 48 bytes, and "type tag\n" at 9 bytes is + * the shortest possible "type" line. The tagger line ...
Teach git-fsck to do the same kind of verification on tag objects that is
already done by git-mktag.
Signed-off-by: Johan Herland <johan@herland.net>
---
builtin-fsck.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/builtin-fsck.c b/builtin-fsck.c
index 944a496..7b4c36b 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -359,11 +359,26 @@ static int fsck_commit(struct commit *commit)
static int fsck_tag(struct tag *tag)
{
struct object *tagged = tag->tagged;
+ enum object_type type;
+ unsigned long size;
+ char *data = (char *) read_sha1_file(tag->object.sha1, &type, &size);
if (verbose)
fprintf(stderr, "Checking tag %s\n",
sha1_to_hex(tag->object.sha1));
+ if (!data)
+ return objerror(&tag->object, "could not read tag");
+ if (type != OBJ_TAG) {
+ free(data);
+ return objerror(&tag->object, "not a tag (internal error)");
+ }
+ if (parse_and_verify_tag_buffer(0, data, size, 1)) { /* thoroughly verify tag object */
+ free(data);
+ return objerror(&tag->object, "failed thorough tag object verification");
+ }
+ free(data);
+
if (!tagged) {
return objerror(&tag->object, "could not load tagged object");
}
--
1.5.2
-
The new structure of tag objects is documented. Also some much-needed cleanup is done. E.g. remove the paragraph on the 8kB limit, since this limit was removed ages ago. Signed-off-by: Johan Herland <johan@herland.net> --- Documentation/git-mktag.txt | 41 ++++++++++++++++++++++++++++++----------- 1 files changed, 30 insertions(+), 11 deletions(-) diff --git a/Documentation/git-mktag.txt b/Documentation/git-mktag.txt index 0ac3be1..411105d 100644 --- a/Documentation/git-mktag.txt +++ b/Documentation/git-mktag.txt @@ -8,38 +8,57 @@ git-mktag - Creates a tag object SYNOPSIS -------- -'git-mktag' < signature_file +[verse] +'git-mktag' < tag_data_file + DESCRIPTION ----------- -Reads a tag contents on standard input and creates a tag object +Reads tag object data on standard input and creates a tag object that can also be used to sign other objects. The output is the new tag's <object> identifier. -Tag Format + +DISCUSSION ---------- -A tag signature file has a very simple fixed format: three lines of +Tag object data has the following format +[verse] object <sha1> type <typename> - tag <tagname> + tag <tagname> (optional) + keywords <keywords> (optional) + tagger <committer> + +followed by a blank line and a free-form message and an optional signature +that git itself doesn't care about, but that may be verified with gpg or +similar. -followed by some 'optional' free-form signature that git itself -doesn't care about, but that can be verified with gpg or similar. +In the above listing, `<sha1>` represents the object pointed to by this tag, +`<typename>` is the type of the object pointed to ("tag", "blob", "tree" or +"commit"), `<tagname>` is the name of this tag object (and must correspond +to the name of the corresponding ref (if any) in `.git/refs/`). `<keywords>` +is a comma-separated list of keywords associated with this tag object, and +`<committer>` holds the "`name <email>`" of the tag creator ...
Some more tests are added to test the new "keywords" header, and to test the more thorough verification routine. Signed-off-by: Johan Herland <johan@herland.net> --- t/t3800-mktag.sh | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 200 insertions(+), 12 deletions(-) diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh index 3381239..b3b3121 100755 --- a/t/t3800-mktag.sh +++ b/t/t3800-mktag.sh @@ -46,6 +46,8 @@ cat >tag.sig <<EOF xxxxxx 139e9b33986b1c2670fff52c5067603117b3e895 type tag tag mytag +tagger foo + EOF cat >expect.pat <<EOF @@ -61,6 +63,8 @@ cat >tag.sig <<EOF object zz9e9b33986b1c2670fff52c5067603117b3e895 type tag tag mytag +tagger foo + EOF cat >expect.pat <<EOF @@ -76,6 +80,8 @@ cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 xxxx tag tag mytag +tagger foo + EOF cat >expect.pat <<EOF @@ -103,6 +109,8 @@ cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 type tag xxx mytag +tagger foo + EOF cat >expect.pat <<EOF @@ -118,6 +126,9 @@ cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 type taggggggggggggggggggggggggggggggg tag +keywords foo +tagger bar@baz.com + EOF cat >expect.pat <<EOF @@ -127,13 +138,15 @@ EOF check_verify_failure '"tag" line label check #2' ############################################################ -# 8. type line type-name length check +# 8. type line type name length check cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 type taggggggggggggggggggggggggggggggg tag mytag -tagger a +keywords foo +tagger bar@baz.com + EOF cat >expect.pat <<EOF @@ -149,7 +162,9 @@ cat >tag.sig <<EOF object 779e9b33986b1c2670fff52c5067603117b3e895 type tagggg tag mytag -tagger a +keywords foo +tagger bar@baz.com + EOF cat >expect.pat <<EOF @@ -159,13 +174,15 @@ EOF check_verify_failure 'verify object (SHA1/type) check' ...
This patch adds the check described by Junio.
It also includes a bugfix when tag object parsing fails, from
Johannes Schindelin <Johannes.Schindelin@gmx.de>.
Signed-off-by: Johan Herland <johan@herland.net>
---
builtin-fsck.c | 26 ++++++++++++++++++++++++++
1 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/builtin-fsck.c b/builtin-fsck.c
index 7b4c36b..a47976c 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -542,6 +542,30 @@ static int fsck_handle_reflog(const char *logname, const unsigned char *sha1, in
return 0;
}
+static void fsck_verify_ref_to_tag_object(const char *refname, struct object *obj)
+{
+ /* Verify that refname matches the name stored in obj's "tag" header */
+ struct tag *tagobj = (struct tag *) parse_object(obj->sha1);
+ size_t tagname_len;
+ size_t refname_len = strlen(refname);
+
+ if (!tagobj->tag) {
+ error("%s: Failed to parse tag object %s", refname, sha1_to_hex(obj->sha1));
+ return;
+ }
+ tagname_len = strlen(tagobj->tag);
+ if (!tagname_len) return; /* No tag name stored in tagobj. Nothing to do. */
+
+ if (tagname_len < refname_len &&
+ !memcmp(tagobj->tag, refname + (refname_len - tagname_len), tagname_len) &&
+ refname[(refname_len - tagname_len) - 1] == '/') {
+ /* OK: tag name is "$name", and refname ends with "/$name" */
+ return;
+ }
+ else
+ error("%s: Mismatch between tag ref and tag object's name: '%s'", refname, tagobj->tag);
+}
+
static int fsck_handle_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
{
struct object *obj;
@@ -556,6 +580,8 @@ static int fsck_handle_ref(const char *refname, const unsigned char *sha1, int f
/* We'll continue with the rest despite the error.. */
return 0;
}
+ if (obj->type == OBJ_TAG) /* ref to tag object */
+ fsck_verify_ref_to_tag_object(refname, obj);
default_refs++;
obj->used = 1;
mark_reachable(obj, REACHABLE);
--
1.5.2
-
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: R |
