aboutsummaryrefslogtreecommitdiff
path: root/combine-diff.c
Commit message (Collapse)AuthorAge
* use st_add and st_mult for allocation size computationJeff King2016-02-22
| | | | | | | | | | | | | If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* convert trivial cases to FLEX_ARRAY macrosJeff King2016-02-22
| | | | | | | | | | Using FLEX_ARRAY macros reduces the amount of manual computation size we have to do. It also ensures we don't overflow size_t, and it makes sure we write the same number of bytes that we allocated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* use xmallocz to avoid size arithmeticJeff King2016-02-22
| | | | | | | | | | | | | | | | | | | | We frequently allocate strings as xmalloc(len + 1), where the extra 1 is for the NUL terminator. This can be done more simply with xmallocz, which also checks for integer overflow. There's no case where switching xmalloc(n+1) to xmallocz(n) is wrong; the result is the same length, and malloc made no guarantees about what was in the buffer anyway. But in some cases, we can stop manually placing NUL at the end of the allocated buffer. But that's only safe if it's clear that the contents will always fill the buffer. In each case where this patch does so, I manually examined the control flow, and I tried to err on the side of caution. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* convert trivial cases to ALLOC_ARRAYJeff King2016-02-22
| | | | | | | | | | | | | | | Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Remove get_object_hash.brian m. carlson2015-11-20
| | | | | | | | | Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* Add several uses of get_object_hash.brian m. carlson2015-11-20
| | | | | | | | | | | Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* Sync with 2.4.10Junio C Hamano2015-09-28
|\
| * Sync with 2.3.10Junio C Hamano2015-09-28
| |\
| | * react to errors in xdi_diffJeff King2015-09-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we call into xdiff to perform a diff, we generally lose the return code completely. Typically by ignoring the return of our xdi_diff wrapper, but sometimes we even propagate that return value up and then ignore it later. This can lead to us silently producing incorrect diffs (e.g., "git log" might produce no output at all, not even a diff header, for a content-level diff). In practice this does not happen very often, because the typical reason for xdiff to report failure is that it malloc() failed (it uses straight malloc, and not our xmalloc wrapper). But it could also happen when xdiff triggers one our callbacks, which returns an error (e.g., outf() in builtin/rerere.c tries to report a write failure in this way). And the next patch also plans to add more failure modes. Let's notice an error return from xdiff and react appropriately. In most of the diff.c code, we can simply die(), which matches the surrounding code (e.g., that is what we do if we fail to load a file for diffing in the first place). This is not that elegant, but we are probably better off dying to let the user know there was a problem, rather than simply generating bogus output. We could also just die() directly in xdi_diff, but the callers typically have a bit more context, and can provide a better message (and if we do later decide to pass errors up, we're one step closer to doing so). There is one interesting case, which is in diff_grep(). Here if we cannot generate the diff, there is nothing to match, and we silently return "no hits". This is actually what the existing code does already, but we make it a little more explicit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/color-diff-plain-is-context'Junio C Hamano2015-06-11
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | "color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"
| * | diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXTJeff King2015-05-27
| |/ | | | | | | | | | | | | | | | | The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | diff: convert struct combine_diff_path to object_idbrian m. carlson2015-03-13
|/ | | | | | | Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/pretty-empty-format'Junio C Hamano2014-09-02
|\ | | | | | | | | | | | | | | | | | | | | "git log --pretty/format=" with an empty format string did not mean the more obvious "No output whatsoever" but "Use default format", which was counterintuitive. * jk/pretty-empty-format: pretty: make empty userformats truly empty pretty: treat "--format=" as an empty userformat revision: drop useless string offset when parsing "--pretty"
| * pretty: make empty userformats truly emptyJeff King2014-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user provides an empty format with "--format=", we end up putting in extra whitespace that the user cannot prevent. This comes from two places: 1. If the format is missing a terminating newline, we add one automatically. This makes sense for --format=%h, but not for a truly empty format. 2. We add an extra newline between the pretty-printed format and a diff or diffstat. If the format is empty, there's no point in doing so if there's nothing to separate. With this patch, one can get a diff with no other cruft out of "diff-tree --format= $commit". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/diff-tree-t-fix'Junio C Hamano2014-08-26
|\ \ | |/ |/| | | | | | | | | Fix (rarely used) "git diff-tree -t" regression in 2.0. * jk/diff-tree-t-fix: intersect_paths: respect mode in git's tree-sort
| * intersect_paths: respect mode in git's tree-sortJeff King2014-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we do a combined diff, we individually diff against each parent, and then use intersect_paths to do a parallel walk through the sorted results and come up with a final list of interesting paths. The sort order here is that returned by the diffs, which means it is in git's tree-order which sorts sub-trees as if their paths have "/" at the end. When we do our parallel walk, we need to use a comparison function which provides the same order. Since 8518ff8 (combine-diff: optimize combine_diff_path sets intersection, 2014-01-20), we use a simple strcmp to compare the pathnames, and get this wrong. It's somewhat hard to trigger because normally a diff does not produce tree entries at all, and therefore the sort order is the same as a strcmp. However, if the "-t" option is used with the diff, then we will produce diff_filepairs for both trees and files. We can use base_name_compare to do the comparison, just as the tree-diff code does. Even though what we have are not technically base names (they are full paths within the tree), the end result is the same (we do not care about interior slashes at all, only about the final character). However, since we do not have the length of each path stored, we take a slight shortcut: if neither of the entries is a sub-tree then the comparison is equivalent to a strcmp. This lets us skip the extra strlen calls in the common case without having to reimplement base_name_compare from scratch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'mk/show-s-no-extra-blank-line-for-merges'Junio C Hamano2014-06-06
|\ \ | | | | | | | | | | | | * mk/show-s-no-extra-blank-line-for-merges: git-show: fix 'git show -s' to not add extra terminator after merge commit
| * | git-show: fix 'git show -s' to not add extra terminator after merge commitMax Kirillov2014-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When git show -s is called for merge commit it prints extra newline after any merge commit. This differs from output for commits with one parent. Fix it by more thorough checking that diff output is disabled. The code in question exists since commit 3969cf7db1. The additional newline is really needed for cases when patch is requested, test t4013-diff-various.sh contains cases which can demonstrate behavior when the condition is restricted further. Tests: Added merge commit to 'set up a bit of history' case in t7007-show.sh to cover the fix. Existing tests are updated to demonstrate the new behaviour. Earlier, the tests that used "git show -s --pretty=format:%s", even though "--pretty=format:%s" calls for item separator semantics and does not ask for the terminating newline after the last item, expected the output to end with such a newline. They were relying on the buggy behaviour. Use of "--format=%s", which is equivalent to "--pretty=tformat:%s" that asks for a terminating newline after each item, is a more realistic way to use the command. In the test 'merge log messages' the expected data is changed, because it was explicitly listing the extra newline. Also the msg.nologff and msg.nolognoff expected files are replaced by one msg.nolog, because they were diffing because of the bug, and now there should be no difference. Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | combine-diff: speed it up, by using multiparent diff tree-walker directlyKirill Smelkov2014-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As was recently shown in "combine-diff: optimize combine_diff_path sets intersection", combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(*), we can call multiparent tree-walker directly and get ready paths. (*) because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | combine-diff: move changed-paths scanning logic into its own functionKirill Smelkov2014-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move code for finding paths for which diff(commit,parent_i) is not-empty for all parents to separate function - at present we have generic (and slow) code for this job, which translates 1 n-parent problem to n 1-parent problems and then intersect results, and will be adding another limited, but faster, paths scanning implementation in the next patch. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | combine-diff: move show_log_first logic/action out of paths scanningKirill Smelkov2014-02-24
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | Judging from sample outputs and tests nothing changes in diff -c output, and this change will help later patches, when we'll be refactoring paths scanning into its own function with several variants - the show_log_first logic / code will stay common to all of them. NOTE: only now we have to take care to explicitly not show anything if parents array is empty, as in fact there are some clients in Git code, which calls diff_tree_combined() in such a way. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | combine-diff: simplify intersect_paths() furtherJunio C Hamano2014-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linus once said: I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like if (prev) prev->next = entry->next; else list_head = entry->next; and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common. People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a "*pp = entry->next". Applying that simplification lets us lose 7 lines from this function even while adding 2 lines of comment. I was tempted to squash this into the original commit, but because the benchmarking described in the commit log is without this simplification, I decided to keep it a separate follow-up patch. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | combine-diff: combine_diff_path.len is not needed anymoreKirill Smelkov2014-02-24
| | | | | | | | | | | | | | | | | | | | | | | | The field was used in order to speed-up name comparison and also to mark removed paths by setting it to 0. Because the updated code does significantly less strcmp and also just removes paths from the list and free right after we know a path will not be needed, it is not needed anymore. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | combine-diff: optimize combine_diff_path sets intersectionKirill Smelkov2014-02-24
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When generating combined diff, for each commit, we intersect diff paths from diff(parent_0,commit) to diff(parent_i,commit) comparing all paths pairs, i.e. doing it the quadratic way. That is correct, but could be optimized. Paths come from trees in sorted (= tree) order, and so does diff_tree() emits resulting paths in that order too. Now if we look at diffcore transformations, all of them, except diffcore_order, preserve resulting path ordering: - skip_stat_unmatch, grep, pickaxe, filter -- just skip elements -> order stays preserved - break -- just breaks diff for a path, adding path dup after the path -> order stays preserved - detect rename/copy -- resulting paths are emitted sorted (verified empirically) So only diffcore_order changes diff paths ordering. But diffcore_order meaning affects only presentation - i.e. only how to show the diff, so we could do all the internal computations without paths reordering, and order only resultant paths set. This is faster, since, if we know two paths sets are all ordered, their intersection could be done in linear time. This patch does just that. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c before 1.9s 20.4s after 1.9s 16.6s navy.git (private repo) log log -c before 0.83s 15.6s after 0.83s 2.1s P.S. I think linux.git case is sped up not so much as the second one, since in navy.git, there are more exotic (subtree, etc) merges. P.P.S. My tracing showed that the rest of the time (16.6s vs 1.9s) is usually spent in computing huge diffs from commit to second parent. Will try to deal with it, if I'll have time. P.P.P.S. For combine_diff_path, ->len is not needed anymore - will remove it in the next noisy cleanup path, to maintain good signal/noise ratio here. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jl/submodule-mv'Junio C Hamano2013-09-09
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
| * remove diff_tree_{setup,release}_pathsNguyễn Thái Ngọc Duy2013-07-15
| | | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'tr/log-full-diff-keep-true-parents'Junio C Hamano2013-09-09
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Output from "git log --full-diff -- <pathspec>" looked strange, because comparison was done with the previous ancestor that touched the specified <pathspec>, causing the patches for paths outside the pathspec to show more than the single commit has changed. Tweak "git reflog -p" for the same reason using the same mechanism. * tr/log-full-diff-keep-true-parents: log: use true parents for diff when walking reflogs log: use true parents for diff even when rewriting
| * | log: use true parents for diff even when rewritingThomas Rast2013-08-01
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using pathspec filtering in combination with diff-based log output, parent simplification happens before the diff is computed. The diff is therefore against the *simplified* parents. This works okay, arguably by accident, in the normal case: simplification reduces to one parent as long as the commit is TREESAME to it. So the simplified parent of any given commit must have the same tree contents on the filtered paths as its true (unfiltered) parent. However, --full-diff breaks this guarantee, and indeed gives pretty spectacular results when comparing the output of git log --graph --stat ... git log --graph --full-diff --stat ... (--graph internally kicks in parent simplification, much like --parents). To fix it, store a copy of the parent list before simplification (in a slab) whenever --full-diff is in effect. Then use the stored parents instead of the simplified ones in the commit display code paths. The latter do not actually check for --full-diff to avoid duplicated code; they just grab the original parents if save_parents() has not been called for this revision walk. For ordinary commits it should be obvious that this is the right thing to do. Merge commits are a bit subtle. Observe that with default simplification, merge simplification is an all-or-nothing decision: either the merge is TREESAME to one parent and disappears, or it is different from all parents and the parent list remains intact. Redundant parents are not pruned, so the existing code also shows them as a merge. So if we do show a merge commit, the parent list just consists of the rewrite result on each parent. Running, e.g., --cc on this in --full-diff mode is not very useful: if any commits were skipped, some hunks will disagree with all sides of the merge (with one side, because commits were skipped; with the others, because they didn't have those changes in the first place). This triggers --cc showing these hunks spuriously. Therefore I believe that even for merge commits it is better to show the diffs wrt. the original parents. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | many small typofixesOndřej Bílka2013-07-29
|/ | | | | | Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Reviewed-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'cb/log-follow-with-combined'Junio C Hamano2013-06-11
|\ | | | | | | | | * cb/log-follow-with-combined: fix segfault with git log -c --follow
| * fix segfault with git log -c --followClemens Buchacher2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In diff_tree_combined we make a copy of diffopts. In try_to_follow_renames, called via diff_tree_sha1, we free and re-initialize diffopts->pathspec->items. Since we did not make a deep copy of diffopts in diff_tree_combined, the original diffopts does not get the update. By the time we return from diff_tree_combined, rev->diffopt->pathspec->items points to an invalid memory address. We get a segfault next time we try to access that pathspec. Instead, along with the copy of diffopts, make a copy pathspec->items as well. We would also have to make a copy of pathspec->raw to keep it consistent with pathspec->items, but nobody seems to rely on that. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'mk/combine-diff-context-horizon-fix'Junio C Hamano2013-06-02
|\ \ | | | | | | | | | | | | | | | | | | | | | "git diff -c -p" was not showing a deleted line from a hunk when another hunk immediately begins where the earlier one ends. * mk/combine-diff-context-horizon-fix: combine-diff.c: Fix output when changes are exactly 3 lines apart
| * | combine-diff.c: Fix output when changes are exactly 3 lines apartMatthijs Kooijman2013-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a deletion is followed by exactly 3 (or whatever the number of context lines) unchanged lines, followed by another change, the combined diff output would hide the first deletion, resulting in a malformed diff. This happened because the 3 lines before each change are painted interesting, but also marked as no_pre_delete to prevent showing deletes that were previously marked as uninteresting. This behaviour was introduced in c86fbe53 (diff -c/--cc: do not include uninteresting deletion before leading context). However, as a side effect, this could also mark deletes that were already interesting as no_pre_delete. This would happen only if the delete was exactly 3 lines away from the next change, since lines farther away would not be touched by the "paint three lines before the change" code and lines closer would be painted by the "merge two adjacent hunks" code instead, which does not set the no_pre_delete flag. This commit fixes this problem by only setting the no_pre_delete flag for changes that were previously uninteresting. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | combine-diff: coalesce lost lines optimallyAntoine Pelisse2013-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This replaces the greedy implementation to coalesce lost lines by using dynamic programming to find the Longest Common Subsequence. The O(n²) time complexity is obviously bigger than previous implementation but it can produce shorter diff results (and most likely easier to read). List of lost lines is now doubly-linked because we reverse-read it when reading the direction matrix. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Allow combined diff to ignore white-spacesAntoine Pelisse2013-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The combined diff --cc output does not honor options to ignore whitespace changes (-b, -w, and --ignore-space-at-eol). Correct this by passing diff flags to diff engine, so that combined diff behaves as normal diff does with spaces, and by coalescing lines that are removed from both (or more) parents, honoring the same rule to ignore whitespace changes. With this change, a conflict-less merge done using a ignore-* strategy option will not show any conflict if shown in combined-diff using the same option. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/diff-graph-cleanup'Junio C Hamano2013-02-14
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactors a lot of repetitive code sequence from the graph drawing code and adds it to the combined diff output. * jk/diff-graph-cleanup: combine-diff.c: teach combined diffs about line prefix diff.c: use diff_line_prefix() where applicable diff: add diff_line_prefix function diff.c: make constant string arguments const diff: write prefix to the correct file graph: output padding for merge subsequent parents
| * | combine-diff.c: teach combined diffs about line prefixJohn Keeping2013-02-12
| |/ | | | | | | | | | | | | | | | | | | | | | | | | When running "git log --graph --cc -p" the diff output for merges is not indented by the graph structure, unlike the diffs of non-merge commits (added in commit 7be5761 - diff.c: Output the text graph padding before each diff line). Fix this by teaching the combined diff code to output diff_line_prefix() before each line. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Merge branch 'jk/maint-null-in-trees'Junio C Hamano2012-08-27
| |\ | | | | | | | | | | | | | | | | | | | | | | | | We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value
* | | combine-diff: lift 32-way limit of combined diffJunio C Hamano2013-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "raw" format of combine-diff output is supposed to have as many colons as there are parents at the beginning, then blob modes for these parents, and then object names for these parents. We weren't however prepared to handle a more than 32-way merge and did not show the correct number of colons in such a case. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/maint-null-in-trees' into maint-1.7.11Junio C Hamano2012-09-10
|\ \ \ | |/ / |/| / | |/ | | | | | | | | | | | | | | "git diff" had a confusion between taking data from a path in the working tree and taking data from an object that happens to have name 0{40} recorded in a tree. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value
| * diff: do not use null sha1 as a sentinel valueJeff King2012-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'rs/combine-diff-zero-context-at-the-beginning'Junio C Hamano2012-04-16
|\ \ | |/ |/| | | | | | | | | | | | | Fixes an age old corner case bug in combine diff (only triggered with -U0 and the hunk at the beginning of the file needs to be shown). By René Scharfe * rs/combine-diff-zero-context-at-the-beginning: combine-diff: fix loop index underflow
| * combine-diff: fix loop index underflowRené Scharfe2012-03-25
| | | | | | | | | | | | | | | | | | | | If both la and context are zero at the start of the loop, la wraps around and we end up reading from memory far away. Skip the loop in that case instead. Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | pass struct commit to diff_tree_combined_merge()René Scharfe2011-12-17
| | | | | | | | | | | | | | | | Instead of passing the hash of a commit and then searching that same commit in the single caller, simply pass the commit directly. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | use struct sha1_array in diff_tree_combined()René Scharfe2011-12-17
| | | | | | | | | | | | | | | | | | Maintaining an array of hashes is easier using sha1_array than open-coding it. This patch also fixes a leak of the SHA1 array in diff_tree_combined_merge(). Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/color-and-pager'Junio C Hamano2011-08-28
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jk/color-and-pager: want_color: automatically fallback to color.ui diff: don't load color config in plumbing config: refactor get_colorbool function color: delay auto-color decision until point of use git_config_colorbool: refactor stdout_is_tty handling diff: refactor COLOR_DIFF from a flag into an int setup_pager: set GIT_PAGER_IN_USE t7006: use test_config helpers test-lib: add helper functions for config t7006: modernize calls to unset Conflicts: builtin/commit.c parse-options.c
| * | diff: refactor COLOR_DIFF from a flag into an intJeff King2011-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This lets us store more than just a bit flag for whether we want color; we can also store whether we want automatic colors. This can be useful for making the automatic-color decision closer to the point of use. This mostly just involves replacing DIFF_OPT_* calls with manipulations of the flag. The biggest exception is that calls to DIFF_OPT_TST must check for "o->use_color > 0", which lets an "unknown" value (i.e., the default) stay at "no color". In the previous code, a value of "-1" was not propagated at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jc/combine-diff-callback'Junio C Hamano2011-08-28
|\ \ \ | | | | | | | | | | | | | | | | * jc/combine-diff-callback: combine-diff: support format_callback
| * | | combine-diff: support format_callbackJunio C Hamano2011-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This teaches combine-diff machinery to feed a combined merge to a callback function when DIFF_FORMAT_CALLBACK is specified. So far, format callback functions are not used for anything but 2-way diffs. A callback is given a diff_queue_struct, which is an array of diff_filepair. As its name suggests, a diff_filepair is a _pair_ of diff_filespec that represents a single preimage and a single postimage. Since "diff -c" is to compare N parents with a single merge result and filter out any paths whose result match one (or more) of the parent(s), its output has to be able to represent N preimages and 1 postimage. For this reason, a callback function that inspects a diff_filepair that results from this new infrastructure can and is expected to view the preimage side (i.e. pair->one) as an array of diff_filespec. Each element in the array, except for the last one, is marked with "has_more_entries" bit, so that the same callback function can be used for 2-way diffs and combined diffs. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'jc/maint-combined-diff-work-tree'Junio C Hamano2011-08-17
|\ \ \ \ | |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | * jc/maint-combined-diff-work-tree: diff -c/--cc: do not mistake "resolved as deletion" as "use working tree" Conflicts: combine-diff.c