aboutsummaryrefslogtreecommitdiff
path: root/line-log.c
Commit message (Collapse)AuthorAge
* Merge branch 'vn/line-log-memcpy-size-fix'Junio C Hamano2017-03-12
|\ | | | | | | | | | | | | | | The command-line parsing of "git log -L" copied internal data structures using incorrect size on ILP32 systems. * vn/line-log-memcpy-size-fix: line-log: use COPY_ARRAY to fix mis-sized memcpy
| * line-log: use COPY_ARRAY to fix mis-sized memcpyVegard Nossum2017-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This memcpy meant to get the sizeof a "struct range", not a "range_set", as the former is what our array holds. Rather than swap out the types, let's convert this site to COPY_ARRAY, which avoids the problem entirely (and confirms that the src and dst types match). Note for curiosity's sake that this bug doesn't trigger on I32LP64 systems, but does on ILP32 systems. The mistaken "struct range_set" has two ints and a pointer. That's 16 bytes on LP64, or 12 on ILP32. The correct "struct range" type has two longs, which is also 16 on LP64, but only 8 on ILP32. Likewise an IL32P64 system would experience the bug. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'ax/line-log-range-merge-fix'Junio C Hamano2017-03-12
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | The code to parse "git log -L..." command line was buggy when there are many ranges specified with -L; overrun of the allocated buffer has been fixed. * ax/line-log-range-merge-fix: line-log.c: prevent crash during union of too many ranges
| * | line-log.c: prevent crash during union of too many rangesAllan Xavier2017-03-03
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementation of range_set_union does not correctly reallocate memory, leading to a heap overflow when it attempts to union more than 24 separate line ranges. For struct range_set *out to grow correctly it must have out->nr set to the current size of the buffer when it is passed to range_set_grow. However, the existing implementation of range_set_union only updates out->nr at the end of the function, meaning that it is always zero before this. This results in range_set_grow never growing the buffer, as well as some of the union logic itself being incorrect as !out->nr is always true. The reason why 24 is the limit is that the first allocation of size 1 ends up allocating a buffer of size 24 (due to the call to alloc_nr in ALLOC_GROW). This goes some way to explain why this hasn't been caught before. Fix the problem by correctly updating out->nr after reallocating the range_set. As this results in out->nr containing the same value as the variable o, replace o with out->nr as well. Finally, add a new test to help prevent the problem reoccurring in the future. Thanks to Vegard Nossum for writing the test. Signed-off-by: Allan Xavier <allan.x.xavier@oracle.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | use QSORTRené Scharfe2016-09-29
|/ | | | | | | | | Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'bc/cocci'Junio C Hamano2016-07-19
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conversion from unsigned char sha1[20] to struct object_id continues. * bc/cocci: diff: convert prep_temp_blob() to struct object_id merge-recursive: convert merge_recursive_generic() to object_id merge-recursive: convert leaf functions to use struct object_id merge-recursive: convert struct merge_file_info to object_id merge-recursive: convert struct stage_data to use object_id diff: rename struct diff_filespec's sha1_valid member diff: convert struct diff_filespec to struct object_id coccinelle: apply object_id Coccinelle transformations coccinelle: convert hashcpy() with null_sha1 to hashclr() contrib/coccinelle: add basic Coccinelle transforms hex: add oid_to_hex_r()
| * diff: rename struct diff_filespec's sha1_valid memberbrian m. carlson2016-06-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * diff: convert struct diff_filespec to struct object_idbrian m. carlson2016-06-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'js/log-to-diffopt-file'Junio C Hamano2016-07-19
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commands in the "log/diff" family have had an FILE* pointer in the data structure they pass around for a long time, but some codepaths used to always write to the standard output. As a preparatory step to make "git format-patch" available to the internal callers, these codepaths have been updated to consistently write into that FILE* instead. * js/log-to-diffopt-file: mingw: fix the shortlog --output=<file> test diff: do not color output when --color=auto and --output=<file> is given t4211: ensure that log respects --output=<file> shortlog: respect the --output=<file> setting format-patch: use stdout directly format-patch: avoid freopen() format-patch: explicitly switch off color when writing to files shortlog: support outputting to streams other than stdout graph: respect the diffopt.file setting line-log: respect diffopt's configured output file stream log-tree: respect diffopt's configured output file stream log: prepare log/log-tree to reuse the diffopt.close_file attribute
| * | line-log: respect diffopt's configured output file streamJohannes Schindelin2016-06-24
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | The diff machinery can optionally output to a file stream other than stdout, by overriding diffopt.file. In such a case, the rest of the log tree machinery should also write to that stream. Currently, there is no user of the line level log that wants to redirect output to a file. Therefore, one might argue that it is superfluous to support that now. However, it is better to be consistent now, rather than to face hard-to-debug problems later. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jc/deref-tag'Junio C Hamano2016-06-27
|\ \ | |/ |/| | | | | | | | | Code clean-up. * jc/deref-tag: blame, line-log: do not loop around deref_tag()
| * blame, line-log: do not loop around deref_tag()Junio C Hamano2016-06-14
| | | | | | | | | | | | | | | | These callers appear to expect that deref_tag() is to peel one layer of a tag, but the function does not work that way; it has its own loop to unwrap tags until an object that is not a tag appears. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | convert trivial cases to ALLOC_ARRAYJeff King2016-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | convert manual allocations to argv_arrayJeff King2016-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are many manual argv allocations that predate the argv_array API. Switching to that API brings a few advantages: 1. We no longer have to manually compute the correct final array size (so it's one less thing we can screw up). 2. In many cases we had to make a separate pass to count, then allocate, then fill in the array. Now we can do it in one pass, making the code shorter and easier to follow. 3. argv_array handles memory ownership for us, making it more obvious when things should be free()d and and when not. Most of these cases are pretty straightforward. In some, we switch from "run_command_v" to "run_command" which lets us directly use the argv_array embedded in "struct child_process". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Remove get_object_hash.brian m. carlson2015-11-20
| | | | | | | | | | | | | | | | | | Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* | Add several uses of get_object_hash.brian m. carlson2015-11-20
|/ | | | | | | | | | | Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
* Sync with 2.3.10Junio C Hamano2015-09-28
|\
| * react to errors in xdi_diffJeff King2015-09-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we call into xdiff to perform a diff, we generally lose the return code completely. Typically by ignoring the return of our xdi_diff wrapper, but sometimes we even propagate that return value up and then ignore it later. This can lead to us silently producing incorrect diffs (e.g., "git log" might produce no output at all, not even a diff header, for a content-level diff). In practice this does not happen very often, because the typical reason for xdiff to report failure is that it malloc() failed (it uses straight malloc, and not our xmalloc wrapper). But it could also happen when xdiff triggers one our callbacks, which returns an error (e.g., outf() in builtin/rerere.c tries to report a write failure in this way). And the next patch also plans to add more failure modes. Let's notice an error return from xdiff and react appropriately. In most of the diff.c code, we can simply die(), which matches the surrounding code (e.g., that is what we do if we fail to load a file for diffing in the first place). This is not that elegant, but we are probably better off dying to let the user know there was a problem, rather than simply generating bogus output. We could also just die() directly in xdi_diff, but the callers typically have a bit more context, and can provide a better message (and if we do later decide to pass errors up, we're one step closer to doing so). There is one interesting case, which is in diff_grep(). Here if we cannot generate the diff, there is nothing to match, and we silently return "no hits". This is actually what the existing code does already, but we make it a little more explicit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/color-diff-plain-is-context' into maintJunio C Hamano2015-06-25
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | "color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"
| * | diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXTJeff King2015-05-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'sb/line-log-plug-pairdiff-leak' into maintJunio C Hamano2015-05-13
|\ \ \ | | | | | | | | | | | | | | | | * sb/line-log-plug-pairdiff-leak: line-log.c: fix a memleak
| * | | line-log.c: fix a memleakStefan Beller2015-03-30
| |/ / | | | | | | | | | | | | | | | | | | | | | The `filepair` is assigned new memory with any iteration via process_diff_filepair, so free it before the current iteration ends. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Sync with 2.3.8Junio C Hamano2015-05-11
|\ \ \ | | |/ | |/| | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | log -L: improve error message on malformed argumentMatthieu Moy2015-04-20
| |/ | | | | | | | | | | | | | | | | | | | | The old message did not mention the :regex:file form. To avoid overly long lines, split the message into two lines (in case item->string is long, it will be the only part truncated in a narrow terminal). Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | line-log.c: make line_log_data_init() staticJunio C Hamano2015-01-15
|/ | | | | | No external callers exist. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'tm/line-log-first-parent'Junio C Hamano2014-11-06
|\ | | | | | | | | | | | | "git log --first-parent -L..." used to crash. * tm/line-log-first-parent: line-log: fix crash when --first-parent is used
| * line-log: fix crash when --first-parent is usedTzvetan Mikov2014-11-04
| | | | | | | | | | | | | | | | | | | | | | | | line-log tries to access all parents of a commit, but only the first parent has been loaded if "--first-parent" is specified, resulting in a crash. Limit the number of parents to one if "--first-parent" is specified. Reported-by: Eric N. Vander Weele <ericvw@gmail.com> Signed-off-by: Tzvetan Mikov <tmikov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | use REALLOC_ARRAY for changing the allocation size of arraysRené Scharfe2014-09-18
| | | | | | | | | | Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | use commit_list_count() to count the members of commit_listsRené Scharfe2014-07-17
| | | | | | | | | | | | | | Call commit_list_count() instead of open-coding it repeatedly. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | line-log: use commit_list_append() instead of duplicating its codeRené Scharfe2014-07-08
| | | | | | | | | | Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | line-log: convert to using diff_tree_sha1()Kirill Smelkov2014-02-05
| | | | | | | | | | | | | | | | | | | | Since diff_tree_sha1() can now accept empty trees via NULL sha1, we could just call it without manually reading trees into tree_desc and duplicating code. Cc: Thomas Rast <tr@thomasrast.ch> Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'nd/magic-pathspec'Junio C Hamano2013-10-30
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | All callers to parse_pathspec() must choose between getting no pathspec or one path that is limited to the current directory when there is no paths given on the command line, but there were two callers that violated this rule, triggering a BUG(). * nd/magic-pathspec: Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flags
| * | Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flagsNguyễn Thái Ngọc Duy2013-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When parse_pathspec() is called with no paths, the behavior could be either return no paths, or return one path that is cwd. Some commands do the former, some the latter. parse_pathspec() itself does not make either the default and requires the caller to specify either flag if it may run into this situation. I've grep'd through all parse_pathspec() call sites. Some pass neither, but those are guaranteed never pass empty path to parse_pathspec(). There are two call sites that may pass empty path and are fixed with this patch. [jc: added a test from Antoine's bug report] Reported-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jl/submodule-mv'Junio C Hamano2013-09-09
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
| * | line-log: convert to use parse_pathspecNguyễn Thái Ngọc Duy2013-07-15
| | | | | | | | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | log: teach -L/RE/ to search from end of previous -L rangeEric Sunshine2013-08-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is complicated slightly by having to remember the previous -L range for each file specified via -L<range>:file. The existing implementation coalesces ranges for each file as each -L is parsed which makes it impossible to refer back to the previous -L range for any particular file. Re-implement to instead store each file's set of -L ranges verbatim, and then coalesce the ranges in a post-processing step. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | line-range: teach -L/RE/ to search relative to anchor pointEric Sunshine2013-08-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Range specification -L/RE/ for blame/log unconditionally begins searching at line one. Mailing list discussion [1] suggests that, in the presence of multiple -L options, -L/RE/ should search relative to the endpoint of the previous -L range, if any. Teach the parsing machinery underlying blame's and log's -L options to accept a start point for -L/RE/ searches. Follow-up patches will upgrade blame and log to take advantage of this ability. [1]: http://thread.gmane.org/gmane.comp.version-control.git/229755/focus=229966 Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | range-set: publish API for re-use by git-blame -LEric Sunshine2013-08-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git-blame is slated to accept multiple -L ranges. git-log already accepts multiple -L's but its implementation of range-set, which organizes and normalizes -L ranges, is private. Publish the small subset of range-set API which is needed for git-blame multiple -L support. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | log: fix -L bounds checking bugEric Sunshine2013-08-05
| |/ |/| | | | | | | | | | | | | | | | | When 12da1d1f added -L support to git-log, a broken bounds check was copied from git-blame -L which incorrectly allows -LX to extend one line past end of file without reporting an error. Instead, it generates an empty range. Fix this bug. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | line-log: fix "log -LN" crash when N is last line of fileEric Sunshine2013-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | range-set invariants are: ranges must be (1) non-empty, (2) disjoint, (3) sorted in ascending order. line_log_data_insert() breaks the non-empty invariant under the following conditions: the incoming range is empty and the pathname attached to the range has not yet been encountered. In this case, line_log_data_insert() assigns the empty range to a new line_log_data record without taking any action to ensure that the empty range is eventually folded out. Subsequent range-set functions crash or throw an assertion failure upon encountering such an anomaly. Fix this bug. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Acked-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | range-set: satisfy non-empty ranges invariantEric Sunshine2013-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | range-set invariants are: ranges must be (1) non-empty, (2) disjoint, (3) sorted in ascending order. During processing, various range-set utility functions break the invariants (for instance, by adding empty ranges), with the expectation that a finalizing sort_and_merge_range_set() will restore sanity. sort_and_merge_range_set(), however, neglects to fold out empty ranges, thus it fails to satisfy the non-empty constraint. Subsequent range-set functions crash or throw an assertion failure upon encountering such an anomaly. Rectify the situation by having sort_and_merge_range_set() fold out empty ranges. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Acked-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | range-set: fix sort_and_merge_range_set() corner case bugEric Sunshine2013-07-23
| | | | | | | | | | | | | | | | | | | | | | When handed an empty range_set (range_set.nr == 0), sort_and_merge_range_set() incorrectly sets range_set.nr to 1 at exit. Subsequent range_set functions then access the bogus range at element zero and crash or throw an assertion failure. Fix this bug. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Acked-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | range_set: fix coalescing bug when range is a subset of anotherEric Sunshine2013-07-09
|/ | | | | | | | | | | | | When coalescing ranges, sort_and_merge_range_set() unconditionally assumes that the end of a range being folded into a preceding range should become the end of the coalesced range. This assumption, however, is invalid when one range is a subset of another. For example, given ranges 1-5 and 2-3 added via range_set_append_unsafe(), sort_and_merge_range_set() incorrectly coalesces them to range 1-3 rather than the correct union range 1-5. Fix this bug. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* log -L: improve comments in process_all_files()Thomas Rast2013-04-12
| | | | | | | | | The funny range assignment in process_all_files() had me sidetracked while investigating what led to the previous commit. Let's improve the comments. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* log -L: store the path instead of a diff_filespecThomas Rast2013-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | line_log_data has held a diff_filespec* since the very early versions of the code. However, the only place in the code where we actually need the full filespec is parse_range_arg(); in all other cases, we are only interested in the path, so there is hardly a reason to store a filespec. Even worse, it causes a lot of redundant ->spec->path pointer dereferencing. And *even* worse, it caused the following bug. If you merge a rename with a modification to the old filename, like so: * Merge | \ | * Modify foo | | * | Rename foo->bar | / * Create foo we internally -- in process_ranges_merge_commit() -- scan all parents. We are mainly looking for one that doesn't have any modifications, so that we can assign all the blame to it and simplify away the merge. In doing so, we run the normal machinery on all parents in a loop. For each parent, we prepare a "working set" line_log_data by making a copy with line_log_data_copy(), which does *not* make a copy of the spec. Now suppose the rename is the first parent. The diff machinery tells us that the filepair is ('foo', 'bar'). We duly update the path we are interested in: rg->spec->path = xstrdup(pair->one->path); But that 'struct spec' is shared between the output line_log_data and the original input line_log_data. So we just wrecked the state of process_ranges_merge_commit(). When we get around to the second parent, the ranges tell us we are interested in a file 'foo' while the commits touch 'bar'. So most of this patch is just s/->spec->path/->path/ and associated management changes. This implicitly fixes the bug because we removed the shared parts between input and output of line_log_data_copy(); it is now safe to overwrite the path in the copy. There's one only somewhat related change: the comment in process_all_files() explains the reasoning behind using 'range' there. That bit of half-correct code had me sidetracked for a while. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* log -L: fix overlapping input rangesThomas Rast2013-04-05
| | | | | | | | | | | | | | | | The existing code was too defensive, and would trigger the assert in range_set_append() if the user gave overlapping ranges. The intent was always to define overlapping ranges as just the union of all of them, as evidenced by the call to sort_and_merge_range_set(). (Which was already used, unlike what the comment said.) Fix by splitting out the meat of range_set_append() to a new _unsafe() function that lacks the paranoia. sort_and_merge_range_set will fix up the ranges, so we don't need the checks there. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* log -L: check range set invariants when we look it upThomas Rast2013-04-05
| | | | | | | | | lookup_line_range() is a good place to check that the range sets satisfy the invariants: they have been computed and set in earlier iterations, and we now start working with them. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Speed up log -L... -MThomas Rast2013-03-28
| | | | | | | | | | | | | | | | | | | | | | So far log -L only used the implicit diff filtering by pathspec. If the user specifies -M, we cannot do that, and so we simply handed the whole diff queue (which is approximately 'git show --raw') to diffcore_std(). Unfortunately this is very slow. We can optimize a lot if we throw out files that we know cannot possibly be interesting, in the same spirit that the pathspec filtering reduces the number of files. However, in this case, we have to be more careful. Because we want to look out for renames, we need to keep all filepairs where something was deleted. This is a bit hacky and should really be replaced by equivalent support in --follow, and just using that. However, in the meantime it speeds up 'log -M -L' by an order of magnitude. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* log -L: :pattern:file syntax to find by funcnameThomas Rast2013-03-28
| | | | | | | | | | | | | This new syntax finds a funcname matching /pattern/, and then takes from there up to (but not including) the next funcname. So you can say git log -L:main:main.c and it will dig up the main() function and show its line-log, provided there are no other funcnames matching 'main'. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Implement line-history search (git log -L)Thomas Rast2013-03-28
This is a rewrite of much of Bo's work, mainly in an effort to split it into smaller, easier to understand routines. The algorithm is built around the struct range_set, which encodes a series of line ranges as intervals [a,b). This is used in two contexts: * A set of lines we are tracking (which will change as we dig through history). * To encode diffs, as pairs of ranges. The main routine is range_set_map_across_diff(). It processes the diff between a commit C and some parent P. It determines which diff hunks are relevant to the ranges tracked in C, and computes the new ranges for P. The algorithm is then simply to process history in topological order from newest to oldest, computing ranges and (partial) diffs. At branch points, we need to merge the ranges we are watching. We will find that many commits do not affect the chosen ranges, and mark them TREESAME (in addition to those already filtered by pathspec limiting). Another pass of history simplification then gets rid of such commits. This is wired as an extra filtering pass in the log machinery. This currently only reduces code duplication, but should allow for other simplifications and options to be used. Finally, we hook a diff printer into the output chain. Ideally we would wire directly into the diff logic, to optionally use features like word diff. However, that will require some major reworking of the diff chain, so we completely replace the output with our own diff for now. As this was a GSoC project, and has quite some history by now, many people have helped. In no particular order, thanks go to Jakub Narebski <jnareb@gmail.com> Jens Lehmann <Jens.Lehmann@web.de> Jonathan Nieder <jrnieder@gmail.com> Junio C Hamano <gitster@pobox.com> Ramsay Jones <ramsay@ramsay1.demon.co.uk> Will Palmer <wmpalmer@gmail.com> Apologies to everyone I forgot. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>