aboutsummaryrefslogtreecommitdiff
path: root/sha1_file.c
diff options
context:
space:
mode:
authorJeff King <peff@peff.net>2017-11-21 18:17:39 -0500
committerJunio C Hamano <gitster@pobox.com>2017-11-22 10:50:11 +0900
commit87b5e236a14d4783b063041588c2f77f3cc6ee89 (patch)
treefe370056cc19a5e0e6ae39fab7c2172ad0113369 /sha1_file.c
parentc291293b2ecec8ca77dfd218fa820dd7a0137a2b (diff)
downloadgit-87b5e236a14d4783b063041588c2f77f3cc6ee89.tar.gz
git-87b5e236a14d4783b063041588c2f77f3cc6ee89.tar.xz
sha1_file: fast-path null sha1 as a missing object
In theory nobody should ever ask the low-level object code for a null sha1. It's used as a sentinel for "no such object" in lots of places, so leaking through to this level is a sign that the higher-level code is not being careful about its error-checking. In practice, though, quite a few code paths seem to rely on the null sha1 lookup failing as a way to quietly propagate non-existence (e.g., by feeding it to lookup_commit_reference_gently(), which then returns NULL). When this happens, we do two inefficient things: 1. We actually search for the null sha1 in packs and in the loose object directory. 2. When we fail to find it, we re-scan the pack directory in case a simultaneous repack happened to move it from loose to packed. This can be very expensive if you have a large number of packs. Only the second one actually causes noticeable performance problems, so we could treat them independently. But for the sake of simplicity (both of code and of reasoning about it), it makes sense to just declare that the null sha1 cannot be a real on-disk object, and looking it up will always return "no such object". There's no real loss of functionality to do so Its use as a sentinel value means that anybody who is unlucky enough to hit the 2^-160th chance of generating an object with that sha1 is already going to find the object largely unusable. In an ideal world, we'd simply fix all of the callers to notice the null sha1 and avoid passing it to us. But a simple experiment to catch this with a BUG() shows that there are a large number of code paths that do so. So in the meantime, let's fix the performance problem by taking a fast exit from the object lookup when we see a null sha1. p5551 shows off the improvement (when a fetched ref is new, the "old" sha1 is 0{40}, which ends up being passed for fast-forward checks, the status table abbreviations, etc): Test HEAD^ HEAD -------------------------------------------------------- 5551.4: fetch 5.51(5.03+0.48) 0.17(0.10+0.06) -96.9% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'sha1_file.c')
-rw-r--r--sha1_file.c3
1 files changed, 3 insertions, 0 deletions
diff --git a/sha1_file.c b/sha1_file.c
index bd5f82e66..fbb73f557 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2971,6 +2971,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
lookup_replace_object(sha1) :
sha1;
+ if (is_null_sha1(real))
+ return -1;
+
if (!oi)
oi = &blank_oi;