aboutsummaryrefslogtreecommitdiff
path: root/diff.c
Commit message (Collapse)AuthorAge
* diff: do not chomp hunk-header in the middle of a characterJunio C Hamano2008-01-06
| | | | | | | | | | | | | We truncate hunk-header line at 80 bytes, but that 80th byte could be in the middle of a character, which is bad. This uses pick_one_utf8_char() function to make sure we do not cut a character in the middle. This assumes that the internal representation of the text is UTF-8. This needs to be extended in the future but the optimal direction has not been decided yet. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff: remove lazy config loadingJeff King2008-01-04
| | | | | | | | | | | | There is no point to this. Either: 1. The program has already loaded git_diff_ui_config, in which case this is a noop. 2. The program didn't, which means it is plumbing that does not _want_ git_diff_ui_config to be loaded. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff: load funcname patterns in "basic" configJeff King2008-01-04
| | | | | | | | | | | | | | The funcname patterns influence the "comment" on @@ lines of the diff. They are safe to use with plumbing since they don't fundamentally change the meaning of the diff in any way. Since all diff users call either diff_ui_config or diff_basic_config, we can get rid of the lazy reading of the config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add a "basic" diff config callbackJeff King2008-01-04
| | | | | | | | | | | | | | | | | | | | | | The diff porcelain uses git_diff_ui_config to set porcelain-ish config options, like automatically turning on color. The plumbing specifically avoids calling this function, since it doesn't want things like automatic color or rename detection. However, some diff options should be set for both plumbing and porcelain. For example, one can still turn on color in git-diff-files using the --color command line option. This means we want the color config from color.diff.* (so that once color is on, we use the user's preferred scheme), but _not_ the color.diff variable. We split the diff config into "ui" and "basic", where "basic" is suitable for use by plumbing (so _most_ things affecting the output should still go into the "ui" part). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix rewrite_diff() name quoting.Junio C Hamano2007-12-26
| | | | | | | | | This moves the logic to quote two paths (prefix + path) in C-style introduced in the previous commit from the dump_quoted_path() in combine-diff.c to quote.c, and uses it to fix rewrite_diff() that never C-quoted the pathnames correctly. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Teach diff machinery to display other prefixes than "a/" and "b/"Johannes Schindelin2007-12-20
| | | | | | | | | | | | | With the new options "--src-prefix=<prefix>", "--dst-prefix=<prefix>" and "--no-prefix", you can now control the path prefixes of the diff machinery. These used to by hardwired to "a/" for the source prefix and "b/" for the destination prefix. Initial patch by Pascal Obry. Sane option names suggested by Linus. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Support config variable diff.externalJohannes Schindelin2007-12-17
| | | | | | | | We had the diff.external variable in the documentation of the config file since its conception, but failed to respect it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Make "diff --check" output match "git apply"Wincent Colaiuta2007-12-13
| | | | | | | | | | | | | | | For consistency, make the two tools report whitespace errors in the same way (the output of "diff --check" has been tweaked to match that of "git apply"). Note that although the textual content is basically the same only "git diff --check" provides a colorized version of the problematic lines; making "git apply" do colorization will require more extensive changes (figuring out the diff colorization preferences of the user) and so that will be a subject for another commit. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Unify whitespace checkingWincent Colaiuta2007-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit unifies three separate places where whitespace checking was performed: - the whitespace checking previously done in builtin-apply.c is extracted into a function in ws.c - the equivalent logic in "git diff" is removed - the emit_line_with_ws() function is also removed because that also rechecks the whitespace, and its functionality is rolled into ws.c The new function is called check_and_emit_line() and it does two things: checks a line for whitespace errors and optionally emits it. The checking is based on lines of content rather than patch lines (in other words, the caller must strip the leading "+" or "-"); this was suggested by Junio on the mailing list to allow for a future extension to "git show" to display whitespace errors in blobs. At the same time we teach it to report all classes of whitespace errors found for a given line rather than reporting only the first found error. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff --check: minor fixupsJunio C Hamano2007-12-13
| | | | | | | | | | | | | | There is no reason --exit-code and --check-diff must be mutually exclusive, so assign different bits to different results and allow them to be returned from the command. Introduce diff_result_code() to factor out the common code to decide final status code based on diffopt settings and use it everywhere. Update tests to match the above fix. Turning pager off when "diff --check" is used is a regression. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* "diff --check" should affect exit statusWincent Colaiuta2007-12-13
| | | | | | | | | | | | "git diff" has a --check option that can be used to check for whitespace problems but it only reported by printing warnings to the console. Now when the --check option is used we give a non-zero exit status, making "git diff --check" nicer to use in scripts and hooks. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* xdl_diff: identify call sites.Junio C Hamano2007-12-13
| | | | | | | | This inserts a new function xdi_diff() that currently does not do anything other than calling the underlying xdl_diff() to the callchain of current callers of xdl_diff() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix "diff --check" whitespace detectionWincent Colaiuta2007-12-12
| | | | | | | | | "diff --check" would only detect spaces before tabs if a tab was the last character in the leading indent. Fix that and add a test case to make sure the bug doesn't regress in the future. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git-diff --numstat -z: make it machine readableJunio C Hamano2007-12-12
| | | | | | | | | | | | | | | | | | | | | | | | The "-z" format is all about machine parsability, but showing renamed paths as "common/{a => b}/suffix" makes it impossible. The scripts would never have successfully parsed "--numstat -z -M" in the old format. This fixes the output format in a (hopefully minimally) backward incompatible way. * The output without -z is not changed. This has given a good way for humans to view added and deleted lines separately, and showing the path in combined, shorter way would preserve readability. * The output with -z is unchanged for paths that do not involve renames. Existing scripts that do not pass -M/-C are not affected at all. * The output with -z for a renamed path is shown in a format that can easily be distinguished from an unrenamed path. This is based on Jakub Narebski's patch. Bugs and documentation typos are mine. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Use "whitespace" consistentlyWincent Colaiuta2007-12-12
| | | | | | | | | For consistency, change "white space" and "whitespaces" to "whitespace", fixing a couple of adjacent grammar problems in the docs. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jc/spht'Junio C Hamano2007-12-09
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jc/spht: Use gitattributes to define per-path whitespace rule core.whitespace: documentation updates. builtin-apply: teach whitespace_rules builtin-apply: rename "whitespace" variables and fix styles core.whitespace: add test for diff whitespace error highlighting git-diff: complain about >=8 consecutive spaces in initial indent War on whitespace: first, a bit of retreat. Conflicts: cache.h config.c diff.c
| * Use gitattributes to define per-path whitespace ruleJunio C Hamano2007-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `core.whitespace` configuration variable allows you to define what `diff` and `apply` should consider whitespace errors for all paths in the project (See gitlink:git-config[1]). This attribute gives you finer control per path. For example, if you have these in the .gitattributes: frotz whitespace nitfol -whitespace xyzzy whitespace=-trailing all types of whitespace problems known to git are noticed in path 'frotz' (i.e. diff shows them in diff.whitespace color, and apply warns about them), no whitespace problem is noticed in path 'nitfol', and the default types of whitespace problems except "trailing whitespace" are noticed for path 'xyzzy'. A project with mixed Python and C might want to have: *.c whitespace *.py whitespace=-indent-with-non-tab in its toplevel .gitattributes file. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * git-diff: complain about >=8 consecutive spaces in initial indentJunio C Hamano2007-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a new whitespace error type, "indent-with-non-tab". The error is about starting a line with 8 or more SP, instead of indenting it with a HT. This is not enabled by default, as some projects employ an indenting policy to use only SPs and no HTs. The kernel folks and git contributors may want to enable this detection with: [core] whitespace = indent-with-non-tab Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * War on whitespace: first, a bit of retreat.Junio C Hamano2007-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces core.whitespace configuration variable that lets you specify the definition of "whitespace error". Currently there are two kinds of whitespace errors defined: * trailing-space: trailing whitespaces at the end of the line. * space-before-tab: a SP appears immediately before HT in the indent part of the line. You can specify the desired types of errors to be detected by listing their names (unique abbreviations are accepted) separated by comma. By default, these two errors are always detected, as that is the traditional behaviour. You can disable detection of a particular type of error by prefixing a '-' in front of the name of the error, like this: [core] whitespace = -trailing-space This patch teaches the code to output colored diff with DIFF_WHITESPACE color to highlight the detected whitespace errors to honor the new configuration. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | git config --get-colorboolJunio C Hamano2007-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds an option to help scripts find out color settings from the configuration file. git config --get-colorbool color.diff inspects color.diff variable, and exits with status 0 (i.e. success) if color is to be used. It exits with status 1 otherwise. If a script wants "true"/"false" answer to the standard output of the command, it can pass an additional boolean parameter to its command line, telling if its standard output is a terminal, like this: git config --get-colorbool color.diff true When called like this, the command outputs "true" to its standard output if color is to be used (i.e. "color.diff" says "always", "auto", or "true"), and "false" otherwise. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Fix "quote" misconversion for rewrite diff output.Junio C Hamano2007-11-21
| | | | | | | | | | | | | | | | | | 663af3422a648e87945e4d8c0cc3e13671f2bbde (Full rework of quote_c_style and write_name_quoted.) mistakenly used puts() when writing out a fixed string when it did not want to add a terminating LF. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Reorder diff_opt_parse options more logically per topics.Pierre Habouzit2007-11-11
| | | | | | | | | | | | | | This is a line reordering patch _only_. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Make the diff_options bitfields be an unsigned with explicit masks.Pierre Habouzit2007-11-11
|/ | | | | | | reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'js/forkexec'Junio C Hamano2007-11-01
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * js/forkexec: Use the asyncronous function infrastructure to run the content filter. Avoid a dup2(2) in apply_filter() - start_command() can do it for us. t0021-conversion.sh: Test that the clean filter really cleans content. upload-pack: Run rev-list in an asynchronous function. upload-pack: Move the revision walker into a separate function. Use the asyncronous function infrastructure in builtin-fetch-pack.c. Add infrastructure to run a function asynchronously. upload-pack: Use start_command() to run pack-objects in create_pack_file(). Have start_command() create a pipe to read the stderr of the child. Use start_comand() in builtin-fetch-pack.c instead of explicit fork/exec. Use run_command() to spawn external diff programs instead of fork/exec. Use start_command() to run content filters instead of explicit fork/exec. Use start_command() in git_connect() instead of explicit fork/exec. Change git_connect() to return a struct child_process instead of a pid_t. Conflicts: builtin-fetch-pack.c
| * Use run_command() to spawn external diff programs instead of fork/exec.Johannes Sixt2007-10-21
| | | | | | | | | | Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | copy vs rename detection: avoid unnecessary O(n*m) loopsLinus Torvalds2007-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core rename detection had some rather stupid code to check if a pathname was used by a later modification or rename, which basically walked the whole pathname space for all renames for each rename, in order to tell whether it was a pure rename (no remaining users) or should be considered a copy (other users of the source file remaining). That's really silly, since we can just keep a count of users around, and replace all those complex and expensive loops with just testing that simple counter (but this all depends on the previous commit that shared the diff_filespec data structure by using a separate reference count). Note that the reference count is not the same as the rename count: they behave otherwise rather similarly, but the reference count is tied to the allocation (and decremented at de-allocation, so that when it turns zero we can get rid of the memory), while the rename count is tied to the renames and is decremented when we find a rename (so that when it turns zero we know that it was a rename, not a copy). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Ref-count the filespecs used by diffcoreLinus Torvalds2007-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than copy the filespecs when introducing new versions of them (for rename or copy detection), use a refcount and increment the count when reusing the diff_filespec. This avoids unnecessary allocations, but the real reason behind this is a future enhancement: we will want to track shared data across the copy/rename detection. In order to efficiently notice when a filespec is used by a rename, the rename machinery wants to keep track of a rename usage count which is shared across all different users of the filespec. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Correct some sizeof(size_t) != sizeof(unsigned long) typing errorsRené Scharfe2007-10-22
|/ | | | | | | | Fix size_t vs. unsigned long pointer mismatch warnings introduced with the addition of strbuf_detach(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Merge branch 'ph/strbuf'Junio C Hamano2007-10-03
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * ph/strbuf: (44 commits) Make read_patch_file work on a strbuf. strbuf_read_file enhancement, and use it. strbuf change: be sure ->buf is never ever NULL. double free in builtin-update-index.c Clean up stripspace a bit, use strbuf even more. Add strbuf_read_file(). rerere: Fix use of an empty strbuf.buf Small cache_tree_write refactor. Make builtin-rerere use of strbuf nicer and more efficient. Add strbuf_cmp. strbuf_setlen(): do not barf on setting length of an empty buffer to 0 sq_quote_argv and add_to_string rework with strbuf's. Full rework of quote_c_style and write_name_quoted. Rework unquote_c_style to work on a strbuf. strbuf API additions and enhancements. nfv?asprintf are broken without va_copy, workaround them. Fix the expansion pattern of the pseudo-static path buffer. builtin-for-each-ref.c::copy_name() - do not overstep the buffer. builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion. Use xmemdupz() in many places. ...
| * strbuf change: be sure ->buf is never ever NULL.Pierre Habouzit2007-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For that purpose, the ->buf is always initialized with a char * buf living in the strbuf module. It is made a char * so that we can sloppily accept things that perform: sb->buf[0] = '\0', and because you can't pass "" as an initializer for ->buf without making gcc unhappy for very good reasons. strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf anymore. as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying ->buf isn't an option anymore, if ->buf is going to escape from the scope, and eventually be free'd. API changes: * strbuf_setlen now always works, so just make strbuf_reset a convenience macro. * strbuf_detatch takes a size_t* optional argument (meaning it can be NULL) to copy the buffer's len, as it was needed for this refactor to make the code more readable, and working like the callers. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Full rework of quote_c_style and write_name_quoted.Pierre Habouzit2007-09-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * quote_c_style works on a strbuf instead of a wild buffer. * quote_c_style is now clever enough to not add double quotes if not needed. * write_name_quoted inherits those advantages, but also take a different set of arguments. Now instead of asking for quotes or not, you pass a "terminator". If it's \0 then we assume you don't want to escape, else C escaping is performed. In any case, the terminator is also appended to the stream. It also no longer takes the prefix/prefix_len arguments, as it's seldomly used, and makes some optimizations harder. * write_name_quotedpfx is created to work like write_name_quoted and take the prefix/prefix_len arguments. Thanks to those API changes, diff.c has somehow lost weight, thanks to the removal of functions that were wrappers around the old write_name_quoted trying to give it a semantics like the new one, but performing a lot of allocations for this goal. Now we always write directly to the stream, no intermediate allocation is performed. As a side effect of the refactor in builtin-apply.c, the length of the bar graphs in diffstats are not affected anymore by the fact that the path was clipped. Signed-off-by: Pierre Habouzit <madcoder@debian.org>
| * Use xmemdupz() in many places.Pierre Habouzit2007-09-18
| | | | | | | | | | Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Merge branch 'master' into ph/strbufJunio C Hamano2007-09-18
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master: (94 commits) Fixed update-hook example allow-users format. Documentation/git-svn: updated design philosophy notes t/t4014: test "am -3" with mode-only change. git-commit.sh: Shell script cleanup preserve executable bits in zip archives Fix lapsus in builtin-apply.c git-push: documentation and tests for pushing only branches git-svnimport: Use separate arguments in the pipe for git-rev-parse contrib/fast-import: add perl version of simple example contrib/fast-import: add simple shell example rev-list --bisect: Bisection "distance" clean up. rev-list --bisect: Move some bisection code into best_bisection. rev-list --bisect: Move finding bisection into do_find_bisection. Document ls-files --with-tree=<tree-ish> git-commit: partial commit of paths only removed from the index git-commit: Allow partial commit of file removal. send-email: make message-id generation a bit more robust git-apply: fix whitespace stripping git-gui: Disable native platform text selection in "lists" apply --index-info: fall back to current index for mode changes ...
| * | Now that cache.h needs strbuf.h, remove useless includes.Pierre Habouzit2007-09-16
| | | | | | | | | | | | | | | | | | Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Rewrite convert_to_{git,working_tree} to use strbuf's.Pierre Habouzit2007-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Now, those functions take an "out" strbuf argument, where they store their result if any. In that case, it also returns 1, else it returns 0. * those functions support "in place" editing, in the sense that it's OK to call them this way: convert_to_git(path, sb->buf, sb->len, sb); When doable, conversions are done in place for real, else the strbuf content is just replaced with the new one, transparentely for the caller. If you want to create a new filter working this way, being the accumulation of filter1, filter2, ... filtern, then your meta_filter would be: int meta_filter(..., const char *src, size_t len, struct strbuf *sb) { int ret = 0; ret |= filter1(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } ret |= filter2(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } .... return ret | filtern(..., src, len, sb); } That's why subfilters the convert_to_* functions called were also rewritten to work this way. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Strbuf API extensions and fixes.Pierre Habouzit2007-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Merge branch 'master' into ph/strbufJunio C Hamano2007-09-10
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master: archive - leakfix for format_subst() Make --no-thin the default in git-push to save server resources fix doc for --compression argument to pack-objects git-tag -s must fail if gpg cannot sign the tag. git-svn: understand grafts when doing dcommit git-diff: don't squelch the new SHA1 in submodule diffs Define NO_MEMMEM on Darwin as it lacks the function git-svn: fix "Malformed network data" with svn:// servers (cvs|svn)import: Ask git-tag to overwrite old tags. git-rebase: fix -C option git-rebase: support --whitespace=<option> Documentation / grammer nit archive: rename attribute specfile to export-subst archive: specfile syntax change: "$Format:%PLCHLDR$" instead of just "%PLCHLDR" (take 2) add memmem() Remove unused function convert_sha1_file() archive: specfile support (--pretty=format: in archive files) Export format_commit_message()
| * | | Use strbuf API in apply, blame, commit-tree and diffPierre Habouzit2007-09-06
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | rename diff_free_filespec_data_large() to diff_free_filespec_blob()Junio C Hamano2007-10-02
| | | | | | | | | | | | | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | diffcore-rename: cache file deltasJeff King2007-10-02
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We find rename candidates by computing a fingerprint hash of each file, and then comparing those fingerprints. There are inherently O(n^2) comparisons, so it pays in CPU time to hoist the (rather expensive) computation of the fingerprint out of that loop (or to cache it once we have computed it once). Previously, we didn't keep the filespec information around because then we had the potential to consume a great deal of memory. However, instead of keeping all of the filespec data, we can instead just keep the fingerprint. This patch implements and uses diff_free_filespec_data_large to accomplish that goal. We also have to change estimate_similarity not to needlessly repopulate the filespec data when we already have the hash. Practical tests showed 4.5x speedup for a 10% memory usage increase. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Fix the rename detection limit checkingLinus Torvalds2007-09-14
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds more proper rename detection limits. Instead of just checking the limit against the number of potential rename destinations, we verify that the rename matrix (which is what really matters) doesn't grow ridiculously large, and we also make sure that we don't overflow when doing the matrix size calculation. This also changes the default limits from unlimited, to a rename matrix that is limited to 100 entries on a side. You can raise it with the config entry, or by using the "-l<n>" command line flag, but at least the default is now a sane number that avoids spending lots of time (and memory) in situations that likely don't merit it. The choice of default value is of course very debatable. Limiting the rename matrix to a 100x100 size will mean that even if you have just one obvious rename, but you also create (or delete) 10,000 files, the rename matrix will be so big that we disable the heuristics. Sounds reasonable to me, but let's see if people hit this (and, perhaps more importantly, actually *care*) in real life. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | git-diff: don't squelch the new SHA1 in submodule diffsSven Verdoolaege2007-09-09
|/ | | | | | | | | | | | | The code to squelch empty diffs introduced by commit fb13227e089f22dc31a3b1624559153821056848 would inadvertently populate filespec "two" of a submodule change using the uninitialized (null) SHA1, thereby replacing the submodule SHA1 by 0{40} in the output. This change teaches diffcore_skip_stat_unmatch to handle submodule changes correctly. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git-diff: resurrect the traditional empty "diff --git" behaviourJunio C Hamano2007-08-31
| | | | | | | | | | | | | | | | | | | | | | | | | The warning message to suggest "Consider running git-status" from "git-diff" that we experimented with during the 1.5.3 cycle turns out to be a bad idea. It robbed cache-dirty information from people who valued it, while still asking users to run "update-index --refresh". It was hoped that the new behaviour would at least have some educational value, but not showing the cache-dirty paths like before meant that the user would not even know easily which paths were cache-dirty, and it made the need to refresh the index look like even more unnecessary chore. This commit reinstates the traditional behaviour, but with a twist. By default, the empty "diff --git" output is totally squelched out from "git diff" output. At the end of the command, it automatically runs "update-index --refresh" as needed, without even bothering the user. In other words, people who do not care about the cache-dirtyness do not even have to see the warning. The traditional behaviour to see the stat-dirty output and to bypassing the overhead of content comparison can be specified by setting the configuration variable diff.autorefreshindex to false. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Take binary diffs into account for "git rebase"Linus Torvalds2007-08-19
| | | | | | | | | | | | | | | | We used to not generate a patch ID for binary diffs, but that means that some commits may be skipped as being identical to already-applied diffs when doing a rebase. So just delete the code that skips the binary diff. At the very least, we'd want the filenames to be part of the patch ID, but we might also want to generate some hash for the binary diff itself too. This fixes an issue noticed by Torgil Svensson. Tested-by: Torgil Svensson <torgil.svensson@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff: squelch empty diffs even moreRené Scharfe2007-08-14
| | | | | | | | | | | | | | | | | | | | | | | | When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git-diff: squelch "empty" diffsJunio C Hamano2007-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | After starting to edit a working tree file but later when your edit ends up identical to the original (this can also happen when you ran a wholesale regexp replace with something like "perl -i" that does not actually modify many of the paths), "git diff" between the index and the working tree outputs many "empty" diffs that show "diff --git" headers and nothing else, because these paths are stat-dirty. While it was a way to warn the user that the earlier action of the user made the index ineffective as an optimization mechanism, it was felt too loud for the purpose of warning even to experienced users, and also resulted in confusing people new to git. This replaces the "empty" diffs with a single warning message at the end. Having many such paths hurts performance, and you can run "git-update-index --refresh" to update the lstat(2) information recorded in the index in such a case. "git-status" does so as a side effect, and that is more familiar to the end-user, so we recommend it to them. The change affects only "git diff" that outputs patch text, because that is where the annoyance of too many "empty" diff is most strongly felt, and because the warning message can be safely ignored by downstream tools without getting mistaken as part of the patch. For the low-level "git diff-files" and "git diff-index", the traditional behaviour is retained. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git_mkstemp(): be careful not to overflow the path buffer.Junio C Hamano2007-07-25
| | | | | | | | If user's TMPDIR is insanely long, return negative after setting errno to ENAMETOOLONG, pretending that the underlying mkstemp() choked on a temporary file path that is too long. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff.c: make built-in hunk header pattern a separate tableJunio C Hamano2007-07-08
| | | | | | | | This would hopefully make it easier to maintain. Initially we would have "java" and "tex" defined, as they are the only ones we already have. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff: honor binariness specified in attributesJunio C Hamano2007-07-07
| | | | | | | | | The code shuffling mistakenly lost binariness specified with the attribute mecahnism and made it always guess from the data. Noticed by Johannes, with two test cases to t4020. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix configuration syntax to specify customized hunk header patterns.Junio C Hamano2007-07-07
| | | | | | | | | | | | | | | | | | | | | | | | This updates the hunk header customization syntax. The special case 'funcname' attribute is gone. You assign the name of the type of contents to path's "diff" attribute as a string value in .gitattributes like this: *.java diff=java *.perl diff=perl *.doc diff=doc If you supply "diff.<name>.funcname" variable via the configuration mechanism (e.g. in $HOME/.gitconfig), the value is used as the regexp set to find the line to use for the hunk header (the variable is called "funcname" because such a line typically is the one that has the name of the function in programming language source text). If there is no such configuration, built-in default is used, if any. Currently there are two default patterns: default and java. Signed-off-by: Junio C Hamano <gitster@pobox.com>