aboutsummaryrefslogtreecommitdiff
path: root/builtin/prune.c
Commit message (Collapse)AuthorAge
* prune: turn on ref_paranoia flagJeff King2015-03-20
| | | | | | | | | | | | | | | | | | | | | | | Prune should know about broken objects at the tips of refs, so that we can feed them to our traversal rather than ignoring them. It's better for us to abort the operation on the broken object than it is to start deleting objects with an incomplete view of the reachability namespace. Note that for missing objects, aborting is the best we can do. For a badly-named ref, we technically could use its sha1 as a reachability tip. However, the iteration code just feeds us a null sha1, so there would be a reasonable amount of code involved to pass down our wishes. It's not really worth trying to do better, because this is a case that should happen extremely rarely, and the message we provide: fatal: unable to parse object: refs/heads/bogus:name is probably enough to point the user in the right direction. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* prune: keep objects reachable from recent objectsJeff King2014-10-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our current strategy with prune is that an object falls into one of three categories: 1. Reachable (from ref tips, reflogs, index, etc). 2. Not reachable, but recent (based on the --expire time). 3. Not reachable and not recent. We keep objects from (1) and (2), but prune objects in (3). The point of (2) is that these objects may be part of an in-progress operation that has not yet updated any refs. However, it is not always the case that objects for an in-progress operation will have a recent mtime. For example, the object database may have an old copy of a blob (from an abandoned operation, a branch that was deleted, etc). If we create a new tree that points to it, a simultaneous prune will leave our tree, but delete the blob. Referencing that tree with a commit will then work (we check that the tree is in the object database, but not that all of its referred objects are), as will mentioning the commit in a ref. But the resulting repo is corrupt; we are missing the blob reachable from a ref. One way to solve this is to be more thorough when referencing a sha1: make sure that not only do we have that sha1, but that we have objects it refers to, and so forth recursively. The problem is that this is very expensive. Creating a parent link would require traversing the entire object graph! Instead, this patch pushes the extra work onto prune, which runs less frequently (and has to look at the whole object graph anyway). It creates a new category of objects: objects which are not recent, but which are reachable from a recent object. We do not prune these objects, just like the reachable and recent ones. This lets us avoid the recursive check above, because if we have an object, even if it is unreachable, we should have its referent. We can make a simple inductive argument that with this patch, this property holds (that there are no objects with missing referents in the repository): 0. When we have no objects, we have nothing to refer or be referred to, so the property holds. 1. If we add objects to the repository, their direct referents must generally exist (e.g., if you create a tree, the blobs it references must exist; if you create a commit to point at the tree, the tree must exist). This is already the case before this patch. And it is not 100% foolproof (you can make bogus objects using `git hash-object`, for example), but it should be the case for normal usage. Therefore for any sequence of object additions, the property will continue to hold. 2. If we remove objects from the repository, then we will not remove a child object (like a blob) if an object that refers to it is being kept. That is the part implemented by this patch. Note, however, that our reachability check and the actual pruning are not atomic. So it _is_ still possible to violate the property (e.g., an object becomes referenced just as we are deleting it). This patch is shooting for eliminating problems where the mtimes of dependent objects differ by hours or days, and one is dropped without the other. It does nothing to help with short races. Naively, the simplest way to implement this would be to add all recent objects as tips to the reachability traversal. However, this does not perform well. In a recently-packed repository, all reachable objects will also be recent, and therefore we have to look at each object twice. This patch instead performs the reachability traversal, then follows up with a second traversal for recent objects, skipping any that have already been marked. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* prune: factor out loose-object directory traversalJeff King2014-10-16
| | | | | | | | | | | | | | | | | | | | | | | | Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'mh/replace-refs-variable-rename'Junio C Hamano2014-03-14
|\ | | | | | | | | | | | | * mh/replace-refs-variable-rename: Document some functions defined in object.c Add docstrings for lookup_replace_object() and do_lookup_replace_object() rename read_replace_refs to check_replace_refs
| * rename read_replace_refs to check_replace_refsMichael Haggerty2014-02-20
| | | | | | | | | | | | | | | | | | | | | | | | The semantics of this flag was changed in commit e1111cef23 inline lookup_replace_object() calls but wasn't renamed at the time to minimize code churn. Rename it now, and add a comment explaining its use. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | i18n: mark all progress lines for translationNguyễn Thái Ngọc Duy2014-02-24
|/ | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'nd/shallow-clone'Junio C Hamano2014-01-17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
| * prune: clean .git/shallow after pruning objectsNguyễn Thái Ngọc Duy2013-12-10
| | | | | | | | | | | | | | | | This patch teaches "prune" to remove shallow roots that are no longer reachable from any refs (e.g. when the relevant refs are removed). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'mh/path-max'Junio C Hamano2014-01-10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | A few places where we relied on a fixed length buffer to hold pathnames in these two programs have been converted to use strbuf. * mh/path-max: builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX prune-packed: use strbuf to avoid having to worry about PATH_MAX
| * | builtin/prune.c: use strbuf to avoid having to worry about PATH_MAXJeff King2013-12-18
| |/ | | | | | | | | | | | | | | | | | | | | | | While at it, rename prune_tmp_object(), which used to be a helper to remove temporary files that were created to become loose object files, to prune_tmp_file(), as the function is also used to remove any random cruft whose name begins with tmp_ directly in .git/object or .git/object/pack directories these days. Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | replace {pre,suf}fixcmp() with {starts,ends}_with()Christian Couder2013-12-05
|/ | | | | | | | | | | | | | | | | | | | | | | Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'nd/prune-packed-dryrun-verbose'Junio C Hamano2013-06-06
|\ | | | | | | | | * nd/prune-packed-dryrun-verbose: prune-packed: avoid implying "1" is DRY_RUN in prune_packed_objects()
| * prune-packed: avoid implying "1" is DRY_RUN in prune_packed_objects()Nguyễn Thái Ngọc Duy2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b60daf0 (Make git-prune-packed a bit more chatty. - 2007-01-12) changes the meaning of prune_packed_objects()'s argument, from "dry run or not dry run" to a bitmap. It however forgot to update prune_packed_objects() caller in builtin/prune.c to use new DRY_RUN macro. It's fine (for a long time!) but there is a risk that someday someone may change the value of DRY_RUN to something else and builtin/prune.c suddenly breaks. Avoid that possibility. While at there, change "opts == VERBOSE" to "opts & VERBOSE" as there is no obvious reason why we only be chatty when DRY_RUN is not set. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Merge branch 'jk/fully-peeled-packed-ref'Junio C Hamano2013-03-25
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not that we do not actively encourage having annotated tags outside refs/tags/ hierarchy, but they were not advertised correctly to the ls-remote and fetch with recent version of Git. * jk/fully-peeled-packed-ref: pack-refs: add fully-peeled trait pack-refs: write peeled entry for non-tags use parse_object_or_die instead of die("bad object") avoid segfaults on parse_object failure
* | | prune: introduce OPT_EXPIRY_DATE() and use itJunio C Hamano2013-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier we added support for --expire=all (or --expire=now) that considers all crufts, regardless of their age, as eligible for garbage collection by turning command argument parsers that use approxidate() to use parse_expiry_date(), but "git prune" used a built-in parse-options facility OPT_DATE() and did not benefit from the new function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/fully-peeled-packed-ref' into maint-1.8.1Junio C Hamano2013-04-03
|\ \ \ | |/ / |/| / | |/ | | | | | | | | * jk/fully-peeled-packed-ref: pack-refs: add fully-peeled trait pack-refs: write peeled entry for non-tags use parse_object_or_die instead of die("bad object") avoid segfaults on parse_object failure
| * use parse_object_or_die instead of die("bad object")Jeff King2013-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some call-sites do: o = parse_object(sha1); if (!o) die("bad object %s", some_name); We can now handle that as a one-liner, and get more consistent output. In the third case of this patch, it looks like we are losing information, as the existing message also outputs the sha1 hex; however, parse_object will already have written a more specific complaint about the sha1, so there is no point in repeating it here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'rj/path-cleanup'Junio C Hamano2012-09-14
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | * rj/path-cleanup: Call mkpathdup() rather than xstrdup(mkpath(...)) Call git_pathdup() rather than xstrdup(git_path("...")) path.c: Use vsnpath() in the implementation of git_path() path.c: Don't discard the return value of vsnpath() path.c: Remove the 'git_' prefix from a file scope function
| * | Call mkpathdup() rather than xstrdup(mkpath(...))Ramsay Jones2012-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | In addition to updating the xstrdup(mkpath(...)) call sites with mkpathdup(), we also fix a memory leak (in merge_3way()) caused by neglecting to free the memory allocated to the 'base_name' variable. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'nd/i18n-parseopt-help'Junio C Hamano2012-09-07
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A lot of i18n mark-up for the help text from "git <cmd> -h". * nd/i18n-parseopt-help: (66 commits) Use imperative form in help usage to describe an action Reduce translations by using same terminologies i18n: write-tree: mark parseopt strings for translation i18n: verify-tag: mark parseopt strings for translation i18n: verify-pack: mark parseopt strings for translation i18n: update-server-info: mark parseopt strings for translation i18n: update-ref: mark parseopt strings for translation i18n: update-index: mark parseopt strings for translation i18n: tag: mark parseopt strings for translation i18n: symbolic-ref: mark parseopt strings for translation i18n: show-ref: mark parseopt strings for translation i18n: show-branch: mark parseopt strings for translation i18n: shortlog: mark parseopt strings for translation i18n: rm: mark parseopt strings for translation i18n: revert, cherry-pick: mark parseopt strings for translation i18n: rev-parse: mark parseopt strings for translation i18n: reset: mark parseopt strings for translation i18n: rerere: mark parseopt strings for translation i18n: status: mark parseopt strings for translation i18n: replace: mark parseopt strings for translation ...
| * | i18n: prune: mark parseopt strings for translationNguyễn Thái Ngọc Duy2012-08-20
| |/ | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | prune.c: only print informational message in show_only or verbose modeBrandon Casey2012-08-07
|/ | | | | | | | | | | | | | | | | | | "git prune" reports removal of loose object files that are no longer necessary only under the "-v" option, but unconditionally reports removal of temporary files that are no longer needed. The original thinking was that the presence of a leftover temporary file should be an unusual occurrence that may indicate an earlier failure of some sort, and the user may want to be reminded of it. Removing an unnecessary loose object file, on the other hand, is just part of the normal operation. That is why the former is always printed out and the latter only when -v is used. But neither report is particularly useful. Hide both of these behind the "-v" option for consistency. Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fix deletion of .git/objects sub-directories in git-prune/repackKarsten Blees2012-03-07
| | | | | | | | | | | | | | | Both git-prune and git-repack (and thus, git-gc) try to rmdir while holding a DIR* handle on the directory. This can leave dangling empty directories in the .git/objects on platforms where directory cannot be removed while they are open. First call closedir() and then rmdir(); that is more logical ordering. Reported-by: John Chen <john0312@gmail.com> Reported-by: Stefan Naewe <stefan.naewe@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Improved-and-Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* prune: handle --progress/no-progressJeff King2011-11-07
| | | | | | | And have "git gc" pass no-progress when quiet. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* prune: show progress while marking reachable objectsNguyễn Thái Ngọc Duy2011-11-07
| | | | | | | | | prune already shows progress meter while pruning. The marking part may take a few seconds or more, depending on repository size. Show progress meter during this time too. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add description parameter to OPT__DRY_RUNRené Scharfe2010-11-15
| | | | | | | | | Allows better help text to be defined than "dry run". Also make use of the macro in places that already had a different description. No object code changes intended. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add description parameter to OPT__VERBOSERené Scharfe2010-11-15
| | | | | | | | | Allows better help text to be defined than "be verbose". Also make use of the macro in places that already had a different description. No object code changes intended. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* prune: allow --dry-run for -n and --verbose for -vRené Scharfe2010-08-09
| | | | | | | | For consistency with other git commands, let git prune accept the long options --dry-run and --verbose for the respective short ones -n and -v. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'lt/deepen-builtin-source'Junio C Hamano2010-03-10
| | | | | | | | * lt/deepen-builtin-source: Move 'builtin-*' into a 'builtin/' subdirectory Conflicts: Makefile
* Move 'builtin-*' into a 'builtin/' subdirectoryLinus Torvalds2010-02-22
This shrinks the top-level directory a bit, and makes it much more pleasant to use auto-completion on the thing. Instead of [torvalds@nehalem git]$ em buil<tab> Display all 180 possibilities? (y or n) [torvalds@nehalem git]$ em builtin-sh builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o [torvalds@nehalem git]$ em builtin-shor<tab> builtin-shortlog.c builtin-shortlog.o [torvalds@nehalem git]$ em builtin-shortlog.c you get [torvalds@nehalem git]$ em buil<tab> [type] builtin/ builtin.h [torvalds@nehalem git]$ em builtin [auto-completes to] [torvalds@nehalem git]$ em builtin/sh<tab> [type] shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o [torvalds@nehalem git]$ em builtin/sho [auto-completes to] [torvalds@nehalem git]$ em builtin/shor<tab> [type] shortlog.c shortlog.o [torvalds@nehalem git]$ em builtin/shortlog.c which doesn't seem all that different, but not having that annoying break in "Display all 180 possibilities?" is quite a relief. NOTE! If you do this in a clean tree (no object files etc), or using an editor that has auto-completion rules that ignores '*.o' files, you won't see that annoying 'Display all 180 possibilities?' message - it will just show the choices instead. I think bash has some cut-off around 100 choices or something. So the reason I see this is that I'm using an odd editory, and thus don't have the rules to cut down on auto-completion. But you can simulate that by using 'ls' instead, or something similar. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>