aboutsummaryrefslogtreecommitdiff
path: root/patch-ids.c
Commit message (Collapse)AuthorAge
* rebase: avoid computing unnecessary patch IDsKevin Willford2016-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `rebase` family of Git commands avoid applying patches that were already integrated upstream. They do that by using the revision walking option that computes the patch IDs of the two sides of the rebase (local-only patches vs upstream-only ones) and skipping those local patches whose patch ID matches one of the upstream ones. In many cases, this causes unnecessary churn, as already the set of paths touched by a given commit would suffice to determine that an upstream patch has no local equivalent. This hurts performance in particular when there are a lot of upstream patches, and/or large ones. Therefore, let's introduce the concept of a "diff-header-only" patch ID, compare those first, and only evaluate the "full" patch ID lazily. Please note that in contrast to the "full" patch IDs, those "diff-header-only" patch IDs are prone to collide with one another, as adjacent commits frequently touch the very same files. Hence we now have to be careful to allow multiple hash entries with the same hash. We accomplish that by using the hashmap_add() function that does not even test for hash collisions. This also allows us to evaluate the full patch ID lazily, i.e. only when we found commits with matching diff-header-only patch IDs. We add a performance test that demonstrates ~1-6% improvement. In practice this will depend on various factors such as how many upstream changes and how big those changes are along with whether file system caches are cold or warm. As Git's test suite has no way of catching performance regressions, we also add a regression test that verifies that the full patch ID computation is skipped when the diff-header-only computation suffices. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* patch-ids: add flag to create the diff patch id using header only dataKevin Willford2016-07-29
| | | | | | | | This will allow a diff patch id to be created using only the header data so that the contents of the file will not have to be loaded. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* patch-ids: replace the seen indicator with a commit pointerKevin Willford2016-07-29
| | | | | | | | | | The cherry_pick_list was looping through the original side checking the seen indicator and setting the cherry_flag on the commit. If we save off the commit in the patch_id we can set the cherry_flag on the correct commit when running through the other side when a patch_id match is found. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* patch-ids: stop using a hand-rolled hashmap implementationKevin Willford2016-07-29
| | | | | | | | | This change will use the hashmap from the hashmap.h to keep track of the patch_ids that have been encountered instead of using an internal implementation. This simplifies the implementation of the patch ids. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* patch-ids: make commit_patch_id() a public helper functionXiaolong Ye2016-04-26
| | | | | | | Make commit_patch_id() available to other builtins. Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Remove get_object_hash.brian m. carlson2015-11-20
| | | | | | | | | Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* Add several uses of get_object_hash.brian m. carlson2015-11-20
| | | | | | | | | | | Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* patch-ids.c: use ALLOC_GROW() in add_commit()Dmitry S. Dolzhenko2014-03-03
| | | | | Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff_setup_done(): return voidThomas Rast2012-08-03
| | | | | | | | | | | | | | | | | diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* patch-ids: use the new generic "sha1_pos" function to lookup sha1Christian Couder2009-04-04
| | | | | | | instead of the specific one from which the new one has been copied. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Make the diff_options bitfields be an unsigned with explicit masks.Pierre Habouzit2007-11-11
| | | | | | | reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Refactor patch-id filtering out of git-cherry and git-format-patch.Junio C Hamano2007-04-11
This implements the patch-id computation and recording library, patch-ids.c, and rewrites the get_patch_ids() function used in cherry and format-patch to use it, so that they do not pollute the object namespace. Earlier code threw non-objects into the in-core object database, and hoped for not getting bitten by SHA-1 collisions. While it may be practically Ok, it still was an ugly hack. Signed-off-by: Junio C Hamano <junkio@cox.net>