aboutsummaryrefslogtreecommitdiff
path: root/t/t5302-pack-index.sh
Commit message (Collapse)AuthorAge
* pack-objects: name pack files after trailer hashJeff King2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our current scheme for naming packfiles is to calculate the sha1 hash of the sorted list of objects contained in the packfile. This gives us a unique name, so we are reasonably sure that two packs with the same name will contain the same objects. It does not, however, tell us that two such packs have the exact same bytes. This makes things awkward if we repack the same set of objects. Due to run-to-run variations, the bytes may not be identical (e.g., changed zlib or git versions, different source object reuse due to new packs in the repository, or even different deltas due to races during a multi-threaded delta search). In theory, this could be helpful to a program that cares that the packfile contains a certain set of objects, but does not care about the particular representation. In practice, no part of git makes use of that, and in many cases it is potentially harmful. For example, if a dumb http client fetches the .idx file, it must be sure to get the exact .pack that matches it. Similarly, a partial transfer of a .pack file cannot be safely resumed, as the actual bytes may have changed. This could also affect a local client which opened the .idx and .pack files, closes the .pack file (due to memory or file descriptor limits), and then re-opens a changed packfile. In all of these cases, git can detect the problem, as we have the sha1 of the bytes themselves in the pack trailer (which we verify on transfer), and the .idx file references the trailer from the matching packfile. But it would be simpler and more efficient to actually get the correct bytes, rather than noticing the problem and having to restart the operation. This patch simply uses the pack trailer sha1 as the pack name. It should be similarly unique, but covers the exact representation of the objects. Other parts of git should not care, as the pack name is returned by pack-objects and is essentially opaque. One test needs to be updated, because it actually corrupts a pack and expects that re-packing the corrupted bytes will use the same name. It won't anymore, but we can easily just use the name that pack-objects hands back. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: do not accept "--index-version=version,"Nguyễn Thái Ngọc Duy2012-02-01
| | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* verify-pack: use index-pack --verifyJunio C Hamano2011-06-05
| | | | | | | | | | | | | | | This finally gets rid of the inefficient verify-pack implementation that walks objects in the packfile in their object name order and replaces it with a call to index-pack --verify. As a side effect, it also removes packed_object_info_detail() API which is rather expensive. As this changes the way errors are reported (verify-pack used to rely on the usual runtime error detection routine unpack_entry() to diagnose the CRC errors in an entry in the *.idx file; index-pack --verify checks the whole *.idx file in one go), update a test that expected the string "CRC" to appear in the error message. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* index-pack --verify: read anomalous offsets from v2 idx fileJunio C Hamano2011-02-27
| | | | | | | | | | | | | A pack v2 .idx file usually records offset using 64-bit representation only when the offset does not fit within 31-bit, but you can handcraft your .idx file to record smaller offset using 64-bit, storing all zero in the upper 4-byte. By inspecting the original idx file when running index-pack --verify, encode such low offsets that do not need to be in 64-bit but are encoded using 64-bit just like the original idx file so that we can still validate the pack/idx pair by comparing the idx file recomputed with the original. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* index-pack: --verifyJunio C Hamano2011-02-27
| | | | | | | | | | | | | | | | | | | | Given an existing .pack file and the .idx file that describes it, this new mode of operation reads and re-index the packfile and makes sure the existing .idx file matches the result byte-for-byte. All the objects in the .pack file are validated during this operation as well. Unlike verify-pack, which visits each object described in the .idx file in the SHA-1 order, index-pack efficiently exploits the delta-chain to avoid rebuilding the objects that are used as the base of deltified objects over and over again while validating the objects, resulting in much quicker verification of the .pack file and its .idx file. This version however cannot verify a .pack/.idx pair with a handcrafted v2 index that uses 64-bit offset representation for offsets that would fit within 31-bit. You can create such an .idx file by giving a custom offset to --index-version option to the command. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tests: add missing &&Jonathan Nieder2010-11-09
| | | | | | | | | | | Breaks in a test assertion's && chain can potentially hide failures from earlier commands in the chain. Commands intended to fail should be marked with !, test_must_fail, or test_might_fail. The examples in this patch do not require that. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tests: Skip tests in a way that makes sense under TAPÆvar Arnfjörð Bjarmason2010-06-25
| | | | | | | | | | SKIP messages are now part of the TAP plan. A TAP harness now knows why a particular test was skipped and can report that information. The non-TAP harness built into Git's test-lib did nothing special with these messages, and is unaffected by these changes. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* t5302: Use prerequisite tags to skip 64-bit offset testsJohannes Sixt2009-03-22
| | | | | | | | | | The effects of this patch can be tested on Linux by commenting out #define _FILE_OFFSET_BITS 64 in git-compat-util.h. Signed-off-by: Johannes Sixt <j6t@kdbg.org>
* t5300, t5302, t5303: Do not use /dev/zeroJohannes Sixt2009-03-19
| | | | | | | | | | We do not have /dev/zero on Windows. This replaces it by data generated with printf, perl, or echo. Most of the cases do not depend on that the data is a stream of zero bytes, so we use something printable; nor is an unlimited stream of data needed, so we produce only as many bytes as the test cases need. Signed-off-by: Johannes Sixt <j6t@kdbg.org>
* Force t5302 to use a single threadJohannes Schindelin2008-12-15
| | | | | | | | | If the packs are made using multiple threads, they are no longer identical on the 4-core Xeon I tested on. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'np/pack-safer'Junio C Hamano2008-11-12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * np/pack-safer: t5303: fix printf format string for portability t5303: work around printf breakage in dash pack-objects: don't leak pack window reference when splitting packs extend test coverage for latest pack corruption resilience improvements pack-objects: allow "fixing" a corrupted pack without a full repack make find_pack_revindex() aware of the nasty world make check_object() resilient to pack corruptions make packed_object_info() resilient to pack corruptions make unpack_object_header() non fatal better validation on delta base object offsets close another possibility for propagating pack corruption
| * close another possibility for propagating pack corruptionNicolas Pitre2008-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Abstract -------- With index v2 we have a per object CRC to allow quick and safe reuse of pack data when repacking. This, however, doesn't currently prevent a stealth corruption from being propagated into a new pack when _not_ reusing pack data as demonstrated by the modification to t5302 included here. The Context ----------- The Git database is all checksummed with SHA1 hashes. Any kind of corruption can be confirmed by verifying this per object hash against corresponding data. However this can be costly to perform systematically and therefore this check is often not performed at run time when accessing the object database. First, the loose object format is entirely compressed with zlib which already provide a CRC verification of its own when inflating data. Any disk corruption would be caught already in this case. Then, packed objects are also compressed with zlib but only for their actual payload. The object headers and delta base references are not deflated for obvious performance reasons, however this leave them vulnerable to potentially undetected disk corruptions. Object types are often validated against the expected type when they're requested, and deflated size must always match the size recorded in the object header, so those cases are pretty much covered as well. Where corruptions could go unnoticed is in the delta base reference. Of course, in the OBJ_REF_DELTA case, the odds for a SHA1 reference to get corrupted so it actually matches the SHA1 of another object with the same size (the delta header stores the expected size of the base object to apply against) are virtually zero. In the OBJ_OFS_DELTA case, the reference is a pack offset which would have to match the start boundary of a different base object but still with the same size, and although this is relatively much more "probable" than in the OBJ_REF_DELTA case, the probability is also about zero in absolute terms. Still, the possibility exists as demonstrated in t5302 and is certainly greater than a SHA1 collision, especially in the OBJ_OFS_DELTA case which is now the default when repacking. Again, repacking by reusing existing pack data is OK since the per object CRC provided by index v2 guards against any such corruptions. What t5302 failed to test is a full repack in such case. The Solution ------------ As unlikely as this kind of stealth corruption can be in practice, it certainly isn't acceptable to propagate it into a freshly created pack. But, because this is so unlikely, we don't want to pay the run time cost associated with extra validation checks all the time either. Furthermore, consequences of such corruption in anything but repacking should be rather visible, and even if it could be quite unpleasant, it still has far less severe consequences than actively creating bad packs. So the best compromize is to check packed object CRC when unpacking objects, and only during the compression/writing phase of a repack, and only when not streaming the result. The cost of this is minimal (less than 1% CPU time), and visible only with a full repack. Someone with a stats background could provide an objective evaluation of this, but I suspect that it's bad RAM that has more potential for data corruptions at this point, even in those cases where this extra check is not performed. Still, it is best to prevent a known hole for corruption when recreating object data into a new pack. What about the streamed pack case? Well, any client receiving a pack must always consider that pack as untrusty and perform full validation anyway, hence no such stealth corruption could be propagated to remote repositoryes already. It is therefore worthless doing local validation in that case. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'np/index-pack'Junio C Hamano2008-11-02
|\ \ | |/ |/| | | | | | | | | | | | | * np/index-pack: index-pack: don't leak leaf delta result improve index-pack tests fix multiple issues in index-pack index-pack: smarter memory usage during delta resolution index-pack: rationalize delta resolution code
| * improve index-pack testsNicolas Pitre2008-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9441b61dc5 introduced serious bugs in index-pack which are described and fixed by commit ce3f6dc655. However, despite the boldness of those bugs, the test suite still passed. This improves t5302-pack-index.sh so to ensure a much better code path coverage. With commit ce3f6dc655 reverted, 17 of the 26 tests do fail now. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | rehabilitate 'git index-pack' inside the object storeNicolas Pitre2008-10-21
|/ | | | | | | | Before commit d0b92a3f6e it was possible to run 'git index-pack' directly in the .git/objects/pack/ directory. Restore that ability. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tests: use "git xyzzy" form (t3600 - t6999)Nanako Shiraishi2008-09-03
| | | | | | | Converts tests between t3600-t6300. Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* t/: Use "test_must_fail git" instead of "! git"Stephan Beyer2008-07-13
| | | | | | | | | | | | | | | This patch changes every occurrence of "! git" -- with the meaning that a git call has to gracefully fail -- into "test_must_fail git". This is useful to - make sure the test does not fail because of a signal, e.g. SIGSEGV, and - advertise the use of "test_must_fail" for new tests. Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* verify-pack: test for detection of index v2 object CRC mismatchNicolas Pitre2008-06-24
| | | | | Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fix bsd shell negationJeff King2008-05-13
| | | | | | | | | | | | | | | | | | | | | | | | On some shells (notably /bin/sh on FreeBSD 6.1), the construct foo && ! bar | baz is true if foo && baz whereas for most other shells (such as bash) is true if foo && ! baz We can work around this by specifying foo && ! (bar | baz) which works everywhere. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* remove use of "tail -n 1" and "tail -1"Jeff King2008-03-13
| | | | | | | | | | | | | | | The "-n" syntax is not supported by System V versions of tail (which prefer "tail -1"). Unfortunately "tail -1" is not actually POSIX. We had some of both forms in our scripts. Since neither form works everywhere, this patch replaces both with the equivalent sed invocation: sed -ne '$p' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Sane use of test_expect_failureJunio C Hamano2008-02-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, test_expect_failure was designed to be the opposite of test_expect_success, but this was a bad decision. Most tests run a series of commands that leads to the single command that needs to be tested, like this: test_expect_{success,failure} 'test title' ' setup1 && setup2 && setup3 && what is to be tested ' And expecting a failure exit from the whole sequence misses the point of writing tests. Your setup$N that are supposed to succeed may have failed without even reaching what you are trying to test. The only valid use of test_expect_failure is to check a trivial single command that is expected to fail, which is a minority in tests of Porcelain-ish commands. This large-ish patch rewrites all uses of test_expect_failure to use test_expect_success and rewrites the condition of what is tested, like this: test_expect_success 'test title' ' setup1 && setup2 && setup3 && ! this command should fail ' test_expect_failure is redefined to serve as a reminder that that test *should* succeed but due to a known breakage in git it currently does not pass. So if git-foo command should create a file 'bar' but you discovered a bug that it doesn't, you can write a test like this: test_expect_failure 'git-foo should create bar' ' rm -f bar && git foo && test -f bar ' This construct acts similar to test_expect_success, but instead of reporting "ok/FAIL" like test_expect_success does, the outcome is reported as "FIXED/still broken". Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rehabilitate some t5302 tests on 32-bit off_t machinesNicolas Pitre2007-11-15
| | | | | | | | | Commit 8ed2fca458d085f12c3c6808ef4ddab6aa40ef14 was a bit draconian in skipping certain tests which should be perfectly valid even on platform with a 32-bit off_t. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* t5302-pack-index: Skip tests of 64-bit offsets if necessary.Johannes Sixt2007-11-14
| | | | | | | | There are platforms where off_t is not 64 bits wide. In this case many tests are doomed to fail. Let's skip them. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Rewrite "git-frotz" to "git frotz"Junio C Hamano2007-07-02
| | | | | | This uses the remove-dashes target to replace "git-frotz" to "git frotz". Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Don't use seq in tests, not everyone has itShawn O. Pearce2007-05-02
| | | | | | | For example Mac OS X lacks the seq command. So we cannot use it there. A good old while loop works just as good. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* t5302: avoid using tail -cJunio C Hamano2007-04-23
| | | | | | | A Large Angry SCM (gitzilla) noticed that on an unnamed platform, tail -c wants its byte count as part of the option, not as a separate argument. Signed-off-by: Junio C Hamano <junkio@cox.net>
* tests for various pack index featuresNicolas Pitre2007-04-11
This is a fairly complete list of tests for various aspects of pack index versions 1 and 2. Tests on index v2 include 32-bit and 64-bit offsets, as well as a nice demonstration of the flawed repacking integrity checks that index version 2 intend to solve over index version 1 with the per object CRC. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>