aboutsummaryrefslogtreecommitdiff
path: root/t/t1450-fsck.sh
Commit message (Collapse)AuthorAge
* Merge branch 'maint-1.8.5' into maint-1.9Junio C Hamano2015-01-07
|\ | | | | | | | | * maint-1.8.5: is_hfs_dotgit: loosen over-eager match of \u{..47}
| * is_hfs_dotgit: loosen over-eager match of \u{..47}Jeff King2014-12-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our is_hfs_dotgit function relies on the hackily-implemented next_hfs_char to give us the next character that an HFS+ filename comparison would look at. It's hacky because it doesn't implement the full case-folding table of HFS+; it gives us just enough to see if the path matches ".git". At the end of next_hfs_char, we use tolower() to convert our 32-bit code point to lowercase. Our tolower() implementation only takes an 8-bit char, though; it throws away the upper 24 bits. This means we can't have any false negatives for is_hfs_dotgit. We only care about matching 7-bit ASCII characters in ".git", and we will correctly process 'G' or 'g'. However, we _can_ have false positives. Because we throw away the upper bits, code point \u{0147} (for example) will look like 'G' and get downcased to 'g'. It's not known whether a sequence of code points whose truncation ends up as ".git" is meaningful in any language, but it does not hurt to be more accurate here. We can just pass out the full 32-bit code point, and compare it manually to the upper and lowercase characters we care about. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Sync with v1.8.5.6Junio C Hamano2014-12-17
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * maint-1.8.5: Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
| * fsck: complain about NTFS ".git" aliases in treesJohannes Schindelin2014-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the index can block pathnames that can be mistaken to mean ".git" on NTFS and FAT32, it would be helpful for fsck to notice such problematic paths. This lets servers which use receive.fsckObjects block them before the damage spreads. Note that the fsck check is always on, even for systems without core.protectNTFS set. This is technically more restrictive than we need to be, as a set of users on ext4 could happily use these odd filenames without caring about NTFS. However, on balance, it's helpful for all servers to block these (because the paths can be used for mischief, and servers which bother to fsck would want to stop the spread whether they are on NTFS themselves or not), and hardly anybody will be affected (because the blocked names are variants of .git or git~1, meaning mischief is almost certainly what the tree author had in mind). Ideally these would be controlled by a separate "fsck.protectNTFS" flag. However, it would be much nicer to be able to enable/disable _any_ fsck flag individually, and any scheme we choose should match such a system. Given the likelihood of anybody using such a path in practice, it is not unreasonable to wait until such a system materializes. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fsck: complain about HFS+ ".git" aliases in treesJeff King2014-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the index can block pathnames that case-fold to ".git" on HFS+, it would be helpful for fsck to notice such problematic paths. This lets servers which use receive.fsckObjects block them before the damage spreads. Note that the fsck check is always on, even for systems without core.protectHFS set. This is technically more restrictive than we need to be, as a set of users on ext4 could happily use these odd filenames without caring about HFS+. However, on balance, it's helpful for all servers to block these (because the paths can be used for mischief, and servers which bother to fsck would want to stop the spread whether they are on HFS+ themselves or not), and hardly anybody will be affected (because the blocked names are variants of .git with invisible Unicode code-points mixed in, meaning mischief is almost certainly what the tree author had in mind). Ideally these would be controlled by a separate "fsck.protectHFS" flag. However, it would be much nicer to be able to enable/disable _any_ fsck flag individually, and any scheme we choose should match such a system. Given the likelihood of anybody using such a path in practice, it is not unreasonable to wait until such a system materializes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fsck: notice .git case-insensitivelyJeff King2014-12-17
| | | | | | | | | | | | | | | | | | | | We complain about ".git" in a tree because it cannot be loaded into the index or checked out. Since we now also reject ".GIT" case-insensitively, fsck should notice the same, so that errors do not propagate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * t1450: refactor ".", "..", and ".git" fsck testsJeff King2014-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We check that fsck notices and complains about confusing paths in trees. However, there are a few shortcomings: 1. We check only for these paths as file entries, not as intermediate paths (so ".git" and not ".git/foo"). 2. We check "." and ".." together, so it is possible that we notice only one and not the other. 3. We repeat a lot of boilerplate. Let's use some loops to be more thorough in our testing, and still end up with shorter code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fsck: report integer overflow in author timestampsJeff King2014-02-24
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we check commit objects, we complain if commit->date is ULONG_MAX, which is an indication that we saw integer overflow when parsing it. However, we do not do any check at all for author lines, which also contain a timestamp. Let's actually check the timestamps on each ident line with strtoul. This catches both author and committer lines, and we can get rid of the now-redundant commit->date check. Note that like the existing check, we compare only against ULONG_MAX. Now that we are calling strtoul at the site of the check, we could be slightly more careful and also check that errno is set to ERANGE. However, this will make further refactoring in future patches a little harder, and it doesn't really matter in practice. For 32-bit systems, one would have to create a commit at the exact wrong second in 2038. But by the time we get close to that, all systems will hopefully have moved to 64-bit (and if they haven't, they have a real problem one second later). For 64-bit systems, by the time we get close to ULONG_MAX, all systems will hopefully have been consumed in the fiery wrath of our expanding Sun. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: warn about ".git" in treesJeff King2012-11-28
| | | | | | | | | | | | | | | | | | | Having a ".git" entry inside a tree can cause confusing results on checkout. At the top-level, you could not checkout such a tree, as it would complain about overwriting the real ".git" directory. In a subdirectory, you might check it out, but performing operations in the subdirectory would confusingly consider the in-tree ".git" directory as the repository. The regular git tools already make it hard to accidentally add such an entry to a tree, and do not allow such entries to enter the index at all. Teaching fsck about it provides an additional safety check, and let's us avoid propagating any such bogosity when transfer.fsckObjects is on. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: warn about '.' and '..' in treesJeff King2012-11-28
| | | | | | | | | | | | | | | | | A tree with meta-paths like '.' or '..' does not work well with git; the index will refuse to load it or check it out to the filesystem (and even if we did not have that safety, it would look like we were overwriting an untracked directory). For the same reason, it is difficult to create such a tree with regular git. Let's warn about these dubious entries during fsck, just in case somebody has created a bogus tree (and this also lets us prevent them from propagating when transfer.fsckObjects is set). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jc/maint-t1450-fsck-order-fix' into maintJunio C Hamano2012-10-17
|\ | | | | | | | | * jc/maint-t1450-fsck-order-fix: t1450: the order the objects are checked is undefined
| * t1450: the order the objects are checked is undefinedJunio C Hamano2012-10-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a tag T points at an object X that is of a type that is different from what the tag records as, fsck should report it as an error. However, depending on the order X and T are checked individually, the actual error message can be different. If X is checked first, fsck remembers X's type and then when it checks T, it notices that T records X as a wrong type (i.e. the complaint is about a broken tag T). If T is checked first, on the other hand, fsck remembers that we need to verify X is of the type tag records, and when it later checks X, it notices that X is of a wrong type (i.e. the complaint is about a broken object X). The important thing is that fsck notices such an error and diagnoses the issue on object X, but the test was expecting that we happen to check objects in the order to make us detect issues with tag T, not with object X. Remove this unwarranted assumption. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/maint-null-in-trees' into maint-1.7.11Junio C Hamano2012-09-10
|\ \ | |/ |/| | | | | | | | | | | | | | | | | "git diff" had a confusion between taking data from a path in the working tree and taking data from an object that happens to have name 0{40} recorded in a tree. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value
| * fsck: detect null sha1 in tree entriesJeff King2012-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Short of somebody happening to beat the 1 in 2^160 odds of actually generating content that hashes to the null sha1, we should never see this value in a tree entry. So let's have fsck warn if it it seen. As in the previous commit, we test both blob and submodule entries to future-proof the test suite against the implementation depending on connectivity to notice the error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fsck: --no-dangling omits "dangling object" informationJunio C Hamano2012-02-28
|/ | | | | | | | | | | | | | | | The default output from "fsck" is often overwhelmed by informational message on dangling objects, especially if you do not repack often, and a real error can easily be buried. Add "--no-dangling" option to omit them, and update the user manual to demonstrate its use. Based on a patch by Clemens Buchacher, but reverted the part to change the default to --no-dangling, which is unsuitable for the first patch. The usual three-step procedure to break the backward compatibility over time needs to happen on top of this, if we were to go in that direction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git rev-list: fix invalid typecastClemens Buchacher2012-02-13
| | | | | | | | | git rev-list passes rev_list_info, not rev_list objects. Without this fix, rev-list enables or disables the --verify-objects option depending on a read from an undefined memory location. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: improve committer/author checkDmitry Ivankov2011-08-11
| | | | | | | | | | | | | | | | | | | | fsck allows a name with > character in it like "name> <email>". Also for "name email>" fsck says "missing space before email". More precisely, it seeks for a first '<', checks that ' ' preceeds it. Then seeks to '<' or '>' and checks that it is the '>'. Missing space is reported if either '<' is not found or it's not preceeded with ' '. Change it to following. Seek to '<' or '>', check that it is '<' and is preceeded with ' '. Seek to '<' or '>' and check that it is '>'. So now "name> <email>" is rejected as "bad name". More strict name check is the only change in what is accepted. Report 'missing space' only if '<' is found and is not preceeded with a space. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: add a few committer name testsDmitry Ivankov2011-08-11
| | | | | | | | | | | | | fsck reports "missing space before <email>" for committer string equal to "name email>" or to "". It'd be nicer to say "missing email" for the second string and "name is bad" (has > in it) for the first one. Add a failing test for these messages. For "name> <email>" no error is reported. Looks like a bug, so add such a failing test." Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tests: add missing &&Jonathan Nieder2010-11-09
| | | | | | | | | | | Breaks in a test assertion's && chain can potentially hide failures from earlier commands in the chain. Commands intended to fail should be marked with !, test_must_fail, or test_might_fail. The examples in this patch do not require that. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* t1450 (fsck): remove dangling objectsJonathan Nieder2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fsck test is generally careful to remove the corrupt objects it inserts, but dangling objects are left behind due to some typos and omissions. It is better to clean up more completely, to simplify the addition of later tests. So: - guard setup and cleanup with test_expect_success to catch typos and errors; - check both stdout and stderr when checking for empty fsck output; - use test_cmp empty file in place of test $(wc -l <file) = 0, for better debugging output when running tests with -v; - add a remove_object () helper and use it to replace broken object removal code that forgot about the fanout in .git/objects; - disable gc.auto, to avoid tripping up object removal if the number of objects ever reaches that threshold. - use test_when_finished to ensure cleanup tasks are run and succeed when tests fail; - add a new final test that no breakage or dangling objects was left behind. While at it, add a brief description to test_description of the history that is expected to persist between tests. Part of a campaign to clean up subshell usage in tests. Cc: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: fix bogus commit header checkJonathan Nieder2010-05-28
| | | | | | | | | | | | | | | | | | | | | | | | daae1922 (fsck: check ident lines in commit objects, 2010-04-24) taught fsck to expect commit objects to have the form tree <object name> <parents> author <valid ident string> committer <valid ident string> log message The check is overly strict: for example, it errors out with the message “expected blank line” for perfectly valid commits with an "encoding ISO-8859-1" line. Later it might make sense to teach fsck about the rest of the header and warn about unrecognized header lines, but for simplicity, let’s accept arbitrary trailing lines for now. Reported-by: Tuncer Ayaz <tuncer.ayaz@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: check ident lines in commit objectsJonathan Nieder2010-05-01
| | | | | | | | | | Check that email addresses do not contain <, >, or newline so they can be quickly scanned without trouble. The copy() function in ident.c already ensures that ordinary git commands will not write email addresses without this property. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* t1450: fix testcases that were wrongly expecting failureThomas Rast2010-02-19
| | | | | | | | | | | | Almost exactly a year ago in 02a6552 (Test fsck a bit harder), I introduced two testcases that were expecting failure. However, the only bug was that the testcases wrote *blobs* because I forgot to pass -t tag to hash-object. Fix this, and then adjust the rest of the test to properly check the result. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Test fsck a bit harderThomas Rast2009-02-20
| | | | | | | | | | | | | | | git-fsck, of all tools, has very few tests. This adds some more: * a corrupted object; * a branch pointing to a non-commit; * a tag pointing to a nonexistent object; * and a tag pointing to an object of a type other than what the tag itself claims. Only the first two are caught. At least the third probably should, too, but currently slips through. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: check loose objects from alternate object stores by defaultJunio C Hamano2009-01-30
| | | | | | | | | | | | | | | "git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsck: HEAD is part of refsJunio C Hamano2009-01-30
By default we looked at all refs but not HEAD. The only thing that made fsck not lose sight of commits that are only reachable from a detached HEAD was the reflog for the HEAD. This fixes it, with a new test. Signed-off-by: Junio C Hamano <gitster@pobox.com>