aboutsummaryrefslogtreecommitdiff
path: root/cache.h
Commit message (Collapse)AuthorAge
* Merge branch 'jc/shared-literally' into maintJunio C Hamano2009-04-08
|\ | | | | | | | | | | | | | | | | * jc/shared-literally: t1301: loosen test for forced modes set_shared_perm(): sometimes we know what the final mode bits should look like move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath Move chmod(foo, 0444) into move_temp_to_file() "core.sharedrepository = 0mode" should set, not loosen
| * set_shared_perm(): sometimes we know what the final mode bits should look likeJunio C Hamano2009-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | adjust_shared_perm() first obtains the mode bits from lstat(2), expecting to find what the result of applying user's umask is, and then tweaks it as necessary. When the file to be adjusted is created with mkstemp(3), however, the mode thusly obtained does not have anything to do with user's umask, and we would need to start from 0444 in such a case and there is no point running lstat(2) for such a path. This introduces a new API set_shared_perm() to bypass the lstat(2) and instead force setting the mode bits to the desired value directly. adjust_shared_perm() becomes a thin wrapper to the function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jc/maint-1.6.0-keep-pack' into maintJunio C Hamano2009-04-08
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jc/maint-1.6.0-keep-pack: pack-objects: don't loosen objects available in alternate or kept packs t7700: demonstrate repack flaw which may loosen objects unnecessarily Remove --kept-pack-only option and associated infrastructure pack-objects: only repack or loosen objects residing in "local" packs git-repack.sh: don't use --kept-pack-only option to pack-objects t7700-repack: add two new tests demonstrating repacking flaws is_kept_pack(): final clean-up Simplify is_kept_pack() Consolidate ignore_packed logic more has_sha1_kept_pack(): take "struct rev_info" has_sha1_pack(): refactor "pretend these packs do not exist" interface git-repack: resist stray environment variable Conflicts: t/t7700-repack.sh
| * Remove --kept-pack-only option and associated infrastructureBrandon Casey2009-03-20
| | | | | | | | | | | | | | | | | | | | | | This option to pack-objects/rev-list was created to improve the -A and -a options of repack. It was found to be lacking in that it did not provide the ability to differentiate between local and non-local kept packs, and found to be unnecessary since objects residing in local kept packs can be filtered out by the --honor-pack-keep option. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * is_kept_pack(): final clean-upJunio C Hamano2009-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now is_kept_pack() is just a member lookup into a structure, we can write it as such. Also rewrite the sole caller of has_sha1_kept_pack() to switch on the criteria the callee uses (namely, revs->kept_pack_only) between calling has_sha1_kept_pack() and has_sha1_pack(), so that these two callees do not have to take a pointer to struct rev_info as an argument. This removes the header file dependency issue temporarily introduced by the earlier commit, so we revert changes associated to that as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Consolidate ignore_packed logic moreJunio C Hamano2009-02-28
| | | | | | | | | | | | | | | | | | | | | | This refactors three loops that check if a given packfile is on the ignore_packed list into a function is_kept_pack(). The function returns false for a pack on the list, and true for a pack not on the list, because this list is solely used by "git repack" to pass list of packfiles that do not have corresponding .keep files, i.e. a packfile not on the list is "kept". Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * has_sha1_kept_pack(): take "struct rev_info"Junio C Hamano2009-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Its "ignore_packed" parameter always comes from struct rev_info. This patch makes the function take a pointer to the surrounding structure, so that the refactoring in the next patch becomes easier to review. There is an unfortunate header file dependency and the easiest workaround is to temporarily move the function declaration from cache.h to revision.h; this will be moved back to cache.h once the function loses this "ignore_packed" parameter altogether in the later part of the series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * has_sha1_pack(): refactor "pretend these packs do not exist" interfaceJunio C Hamano2009-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | Most of the callers of this function except only one pass NULL to its last parameter, ignore_packed. Introduce has_sha1_kept_pack() function that has the function signature and the semantics of this function, and convert the sole caller that does not pass NULL to call this new function. All other callers and has_sha1_pack() lose the ignore_packed parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Support 'raw' date formatLinus Torvalds2009-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Talking about --date, one thing I wanted for the 1234567890 date was to get things in the raw format. Sure, you get them with --pretty=raw, but it felt a bit sad that you couldn't just ask for the date in raw format. So here's a throw-away patch (meaning: I won't be re-sending it, because I really don't think it's a big deal) to add "--date=raw". It just prints out the internal raw git format - seconds since epoch plus timezone (put another way: 'date +"%s %z"' format) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint'Junio C Hamano2009-02-19
|\ \ | | | | | | | | | | | | | | | | | | * maint: More friendly message when locking the index fails. Document git blame --reverse. Documentation: Note file formats send-email accepts
| * | More friendly message when locking the index fails.Matthieu Moy2009-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Just saying that index.lock exists doesn't tell the user _what_ to do to fix the problem. We should give an indication that it's normally safe to delete index.lock after making sure git isn't running here. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Introduce the function strip_path_suffix()Johannes Schindelin2009-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function strip_path_suffix() will try to strip a given suffix from a given path. The suffix must start at a directory boundary (i.e. "core" is not a path suffix of "libexec/git-core", but "git-core" is). Arbitrary runs of directory separators ("slashes") are assumed identical. Example: strip_path_suffix("C:\\msysgit/\\libexec\\git-core", "libexec///git-core", &prefix) will set prefix to "C:\\msysgit" and return 0. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'ms/mailmap'Junio C Hamano2009-02-15
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * ms/mailmap: Move mailmap documentation into separate file Change current mailmap usage to do matching on both name and email of author/committer. Add map_user() and clear_mailmap() to mailmap Add find_insert_index, insert_at_index and clear_func functions to string_list Add mailmap.file as configurational option for mailmap location
| * | | Add mailmap.file as configurational option for mailmap locationMarius Storm-Olsen2009-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows us to augment the repo mailmap file, and to use mailmap files elsewhere than the repository root. Meaning that the entries in mailmap.file will override the entries in "./.mailmap", should they match. Signed-off-by: Marius Storm-Olsen <marius@trolltech.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Revert "Merge branch 'js/notes'"Junio C Hamano2009-02-10
| | | | | | | | | | | | | | | | | | | | This reverts commit 7b75b331f6744fbf953fe8913703378ef86a2189, reversing changes made to 5d680a67d7909c89af96eba4a2d77abed606292b.
* | | | Merge branch 'js/maint-1.6.0-path-normalize'Junio C Hamano2009-02-10
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * js/maint-1.6.0-path-normalize: Remove unused normalize_absolute_path() Test and fix normalize_path_copy() Fix GIT_CEILING_DIRECTORIES on Windows Move sanitary_path_copy() to path.c and rename it to normalize_path_copy() Make test-path-utils more robust against incorrect use
| * | | | Remove unused normalize_absolute_path()Johannes Sixt2009-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is now superseded by normalize_path_copy(). Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | Move sanitary_path_copy() to path.c and rename it to normalize_path_copy()Johannes Sixt2009-02-07
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function and normalize_absolute_path() do almost the same thing. The former already works on Windows, but the latter crashes. In subsequent changes we will remove normalize_absolute_path(). Here we make the replacement function reusable. On the way we rename it to reflect that it does some path normalization. Apart from that this is only moving around code. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'maint'Junio C Hamano2009-02-10
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | * maint: Clear the delta base cache during fast-import checkpoint
| * | | Merge branch 'maint-1.6.0' into maintJunio C Hamano2009-02-10
| |\ \ \ | | | | | | | | | | | | | | | | | | | | * maint-1.6.0: Clear the delta base cache during fast-import checkpoint
| | * | | Clear the delta base cache during fast-import checkpointShawn O. Pearce2009-02-10
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we may reuse the same memory address for a totally different "struct packed_git", and a previously cached object from the prior occupant might be returned when trying to unpack an object from the new pack. Found-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | Merge branch 'lt/maint-wrap-zlib' into maintJunio C Hamano2009-02-05
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * lt/maint-wrap-zlib: Wrap inflate and other zlib routines for better error reporting Conflicts: http-push.c http-walker.c sha1_file.c
* | \ \ \ Merge branch 'js/notes'Junio C Hamano2009-02-05
|\ \ \ \ \ | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * js/notes: git-notes: fix printing of multi-line notes notes: fix core.notesRef documentation Add an expensive test for git-notes Speed up git notes lookup Add a script to edit/inspect notes Introduce commit notes Conflicts: pretty.c
| * | | | Introduce commit notesJohannes Schindelin2008-12-21
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit notes are blobs which are shown together with the commit message. These blobs are taken from the notes ref, which you can configure by the config variable core.notesRef, which in turn can be overridden by the environment variable GIT_NOTES_REF. The notes ref is a branch which contains "files" whose names are the names of the corresponding commits (i.e. the SHA-1). The rationale for putting this information into a ref is this: we want to be able to fetch and possibly union-merge the notes, maybe even look at the date when a note was introduced, and we want to store them efficiently together with the other objects. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'tr/previous-branch'Junio C Hamano2009-01-28
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * tr/previous-branch: t1505: remove debugging cruft Simplify parsing branch switching events in reflog Introduce for_each_recent_reflog_ent(). interpret_nth_last_branch(): plug small memleak Fix reflog parsing for a malformed branch switching entry Fix parsing of @{-1}@{1} interpret_nth_last_branch(): avoid traversing the reflog twice checkout: implement "-" abbreviation, add docs and tests sha1_name: support @{-N} syntax in get_sha1() sha1_name: tweak @{-N} lookup checkout: implement "@{-N}" shortcut name for N-th last branch Conflicts: sha1_name.c
| * | | | checkout: implement "@{-N}" shortcut name for N-th last branchJunio C Hamano2009-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a shortcut @{-N} for the N-th last branch checked out, that works by parsing the reflog for the message added by previous git-checkout invocations. We expand the @{-N} to the branch name, so that you end up on an attached HEAD on that branch. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'kb/lstat-cache'Junio C Hamano2009-01-25
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kb/lstat-cache: lstat_cache(): introduce clear_lstat_cache() function lstat_cache(): introduce invalidate_lstat_cache() function lstat_cache(): introduce has_dirs_only_path() function lstat_cache(): introduce has_symlink_or_noent_leading_path() function lstat_cache(): more cache effective symlink/directory detection
| * | | | | lstat_cache(): introduce clear_lstat_cache() functionKjetil Barvik2009-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If you want to completely clear the contents of the lstat_cache(), then call this new function. Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | lstat_cache(): introduce invalidate_lstat_cache() functionKjetil Barvik2009-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases it could maybe be necessary to say to the cache that "Hey, I deleted/changed the type of this pathname and if you currently have it inside your cache, you should deleted it". This patch introduce a function which support this. Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | lstat_cache(): introduce has_dirs_only_path() functionKjetil Barvik2009-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The create_directories() function in entry.c currently calls stat() or lstat() for each path component of the pathname 'path' each and every time. For the 'git checkout' command, this function is called on each file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get lots and lots of calls. To fix this, we make a new wrapper to the lstat_cache() function, and call the wrapper function instead of the calls to the stat() or the lstat() functions. Since the paths given to the create_directories() function, is sorted alphabetically, the new wrapper would be very cache effective in this situation. To support it we must update the lstat_cache() function to be able to say that "please test the complete length of 'name'", and also to give it the length of a prefix, where the cache should use the stat() function instead of the lstat() function to test each path component. Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable comments to this patch! Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | lstat_cache(): introduce has_symlink_or_noent_leading_path() functionKjetil Barvik2009-01-18
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases, especially inside the unpack-trees.c file, and inside the verify_absent() function, we can avoid some unnecessary calls to lstat(), if the lstat_cache() function can also be told to keep track of non-existing directories. So we update the lstat_cache() function to handle this new fact, introduce a new wrapper function, and the result is that we save lots of lstat() calls for a removed directory which previously contained lots of files, when we call this new wrapper of lstat_cache() instead of the old one. We do similar changes inside the unlink_entry() function, since if we can already say that the leading directory component of a pathname does not exist, it is not necessary to try to remove a pathname below it! Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable comments to this patch! Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'cb/add-pathspec'Junio C Hamano2009-01-25
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cb/add-pathspec: remove pathspec_match, use match_pathspec instead clean up pathspec matching
| * | | | | remove pathspec_match, use match_pathspec insteadClemens Buchacher2009-01-14
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'lt/maint-wrap-zlib'Junio C Hamano2009-01-21
|\ \ \ \ \ | |_|/ / / |/| | / / | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | * lt/maint-wrap-zlib: Wrap inflate and other zlib routines for better error reporting Conflicts: http-push.c http-walker.c sha1_file.c
| * | | Wrap inflate and other zlib routines for better error reportingLinus Torvalds2009-01-11
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | R. Tyler Ballance reported a mysterious transient repository corruption; after much digging, it turns out that we were not catching and reporting memory allocation errors from some calls we make to zlib. This one _just_ wraps things; it doesn't do the "retry on low memory error" part, at least not yet. It is an independent issue from the reporting. Some of the errors are expected and passed back to the caller, but we die when zlib reports it failed to allocate memory for now. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | sha1_file: make "read_object" staticChristian Couder2009-01-13
| |/ |/| | | | | | | | | | | | | | | | | | | This function is only used from "sha1_file.c". And as we want to add a "replace_object" hook in "read_sha1_file", we must not let people bypass the hook using something other than "read_sha1_file". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint'Junio C Hamano2008-12-11
|\ \ | |/ | | | | | | | | * maint: fsck: reduce stack footprint make sure packs to be replaced are closed beforehand
| * make sure packs to be replaced are closed beforehandNicolas Pitre2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Especially on Windows where an opened file cannot be replaced, make sure pack-objects always close packs it is about to replace. Even on non Windows systems, this could save potential bad results if ever objects were to be read from the new pack file using offset from the old index. This should fix t5303 on Windows. Signed-off-by: Nicolas Pitre <nico@cam.org> Tested-by: Johannes Sixt <j6t@kdbg.org> (MinGW) Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Merge branch 'bc/maint-keep-pack' into maintJunio C Hamano2008-12-02
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * bc/maint-keep-pack: repack: only unpack-unreachable if we are deleting redundant packs t7700: test that 'repack -a' packs alternate packed objects pack-objects: extend --local to mean ignore non-local loose objects too sha1_file.c: split has_loose_object() into local and non-local counterparts t7700: demonstrate mishandling of loose objects in an alternate ODB builtin-gc.c: use new pack_keep bitfield to detect .keep file existence repack: do not fall back to incremental repacking with [-a|-A] repack: don't repack local objects in packs with .keep file pack-objects: new option --honor-pack-keep packed_git: convert pack_local flag into a bitfield and add pack_keep t7700: demonstrate mishandling of objects in packs with a .keep file
* | | git add --intent-to-add: fix removal of cached emptinessJunio C Hamano2008-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This uses the extended index flag mechanism introduced earlier to mark the entries added to the index via "git add -N" with CE_INTENT_TO_ADD. The logic to detect an "intent to add" entry for the purpose of allowing "git rm --cached $path" is tightened to check not just for a staged empty blob, but with the CE_INTENT_TO_ADD bit. This protects an empty blob that was explicitly added and then modified in the work tree from being dropped with this sequence: $ >empty $ git add empty $ echo "non empty" >empty $ git rm --cached empty Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'nd/narrow' (early part) into jc/add-i-t-aJunio C Hamano2008-11-28
|\ \ \ | | | | | | | | | | | | | | | | * 'nd/narrow' (early part): Extend index to save more flags
| * | | Extend index to save more flagsNguyễn Thái Ngọc Duy2008-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The on-disk format of index only saves 16 bit flags, nearly all have been used. The last bit (CE_EXTENDED) is used to for future extension. This patch extends index entry format to save more flags in future. The new entry format will be used when CE_EXTENDED bit is 1. Because older implementation may not understand CE_EXTENDED bit and misread the new format, if there is any extended entry in index, index header version will turn 3, which makes it incompatible for older git. If there is none, header version will return to 2 again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | | | Merge branch 'lt/preload-lstat'Junio C Hamano2008-11-27
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | * lt/preload-lstat: Fix index preloading for racy dirty case Add cache preload facility
| * | | | Add cache preload facilityLinus Torvalds2008-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This can do the lstat() storm in parallel, giving potentially much improved performance for cold-cache cases or things like NFS that have weak metadata caching. Just use "read_cache_preload()" instead of "read_cache()" to force an optimistic preload of the index stat data. The function takes a pathspec as its argument, allowing us to preload only the relevant portion of the index. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'jk/commit-v-strip'Junio C Hamano2008-11-16
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | * jk/commit-v-strip: status: show "-v" diff even for initial commit wt-status: refactor initial commit printing define empty tree sha1 as a macro
| * | | | define empty tree sha1 as a macroJeff King2008-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This can potentially be used in a few places, so let's make it available to all parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'np/pack-safer'Junio C Hamano2008-11-12
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * np/pack-safer: t5303: fix printf format string for portability t5303: work around printf breakage in dash pack-objects: don't leak pack window reference when splitting packs extend test coverage for latest pack corruption resilience improvements pack-objects: allow "fixing" a corrupted pack without a full repack make find_pack_revindex() aware of the nasty world make check_object() resilient to pack corruptions make packed_object_info() resilient to pack corruptions make unpack_object_header() non fatal better validation on delta base object offsets close another possibility for propagating pack corruption
| * | | | | make unpack_object_header() non fatalNicolas Pitre2008-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible to have pack corruption in the object header. Currently unpack_object_header() simply die() on them instead of letting the caller deal with that gracefully. So let's have unpack_object_header() return an error instead, and find a better name for unpack_object_header_gently() in that context. All callers of unpack_object_header() are ready for it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | close another possibility for propagating pack corruptionNicolas Pitre2008-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Abstract -------- With index v2 we have a per object CRC to allow quick and safe reuse of pack data when repacking. This, however, doesn't currently prevent a stealth corruption from being propagated into a new pack when _not_ reusing pack data as demonstrated by the modification to t5302 included here. The Context ----------- The Git database is all checksummed with SHA1 hashes. Any kind of corruption can be confirmed by verifying this per object hash against corresponding data. However this can be costly to perform systematically and therefore this check is often not performed at run time when accessing the object database. First, the loose object format is entirely compressed with zlib which already provide a CRC verification of its own when inflating data. Any disk corruption would be caught already in this case. Then, packed objects are also compressed with zlib but only for their actual payload. The object headers and delta base references are not deflated for obvious performance reasons, however this leave them vulnerable to potentially undetected disk corruptions. Object types are often validated against the expected type when they're requested, and deflated size must always match the size recorded in the object header, so those cases are pretty much covered as well. Where corruptions could go unnoticed is in the delta base reference. Of course, in the OBJ_REF_DELTA case, the odds for a SHA1 reference to get corrupted so it actually matches the SHA1 of another object with the same size (the delta header stores the expected size of the base object to apply against) are virtually zero. In the OBJ_OFS_DELTA case, the reference is a pack offset which would have to match the start boundary of a different base object but still with the same size, and although this is relatively much more "probable" than in the OBJ_REF_DELTA case, the probability is also about zero in absolute terms. Still, the possibility exists as demonstrated in t5302 and is certainly greater than a SHA1 collision, especially in the OBJ_OFS_DELTA case which is now the default when repacking. Again, repacking by reusing existing pack data is OK since the per object CRC provided by index v2 guards against any such corruptions. What t5302 failed to test is a full repack in such case. The Solution ------------ As unlikely as this kind of stealth corruption can be in practice, it certainly isn't acceptable to propagate it into a freshly created pack. But, because this is so unlikely, we don't want to pay the run time cost associated with extra validation checks all the time either. Furthermore, consequences of such corruption in anything but repacking should be rather visible, and even if it could be quite unpleasant, it still has far less severe consequences than actively creating bad packs. So the best compromize is to check packed object CRC when unpacking objects, and only during the compression/writing phase of a repack, and only when not streaming the result. The cost of this is minimal (less than 1% CPU time), and visible only with a full repack. Someone with a stats background could provide an objective evaluation of this, but I suspect that it's bad RAM that has more potential for data corruptions at this point, even in those cases where this extra check is not performed. Still, it is best to prevent a known hole for corruption when recreating object data into a new pack. What about the streamed pack case? Well, any client receiving a pack must always consider that pack as untrusty and perform full validation anyway, hence no such stealth corruption could be propagated to remote repositoryes already. It is therefore worthless doing local validation in that case. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'bc/maint-keep-pack'Junio C Hamano2008-11-12
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * bc/maint-keep-pack: t7700: test that 'repack -a' packs alternate packed objects pack-objects: extend --local to mean ignore non-local loose objects too sha1_file.c: split has_loose_object() into local and non-local counterparts t7700: demonstrate mishandling of loose objects in an alternate ODB builtin-gc.c: use new pack_keep bitfield to detect .keep file existence repack: do not fall back to incremental repacking with [-a|-A] repack: don't repack local objects in packs with .keep file pack-objects: new option --honor-pack-keep packed_git: convert pack_local flag into a bitfield and add pack_keep t7700: demonstrate mishandling of objects in packs with a .keep file