aboutsummaryrefslogtreecommitdiff
path: root/git.c
Commit message (Collapse)AuthorAge
* Merge branch 'bw/recurse-submodules-relative-fix'Junio C Hamano2017-03-30
|\ | | | | | | | | | | | | | | | | | | | | | | | | A few commands that recently learned the "--recurse-submodule" option misbehaved when started from a subdirectory of the superproject. * bw/recurse-submodules-relative-fix: ls-files: fix bug when recursing with relative pathspec ls-files: fix typo in variable name grep: fix bug when recursing with relative pathspec setup: allow for prefix to be passed to git commands grep: fix help text typo
| * setup: allow for prefix to be passed to git commandsBrandon Williams2017-03-17
| | | | | | | | | | | | | | | | | | | | | | In a future patch child processes which act on submodules need a little more context about the original command that was invoked. This patch teaches git to use the prefix stored in `GIT_INTERNAL_TOPLEVEL_PREFIX` instead of the prefix that was potentally found during the git directory setup process. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | rebase--helper: add a builtin helper for interactive rebasesJohannes Schindelin2017-02-09
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Git's interactive rebase is still implemented as a shell script, despite its complexity. This implies that it suffers from the portability point of view, from lack of expressibility, and of course also from performance. The latter issue is particularly serious on Windows, where we pay a hefty price for relying so much on POSIX. Unfortunately, being such a huge shell script also means that we missed the train when it would have been relatively easy to port it to C, and instead piled feature upon feature onto that poor script that originally never intended to be more than a slightly pimped cherry-pick in a loop. To open the road toward better performance (in addition to all the other benefits of C over shell scripts), let's just start *somewhere*. The approach taken here is to add a builtin helper that at first intends to take care of the parts of the interactive rebase that are most affected by the performance penalties mentioned above. In particular, after we spent all those efforts on preparing the sequencer to process rebase -i's git-rebase-todo scripts, we implement the `git rebase -i --continue` functionality as a new builtin, git-rebase--helper. Once that is in place, we can work gradually on tackling the rest of the technical debt. Note that the rebase--helper needs to learn about the transient --ff/--no-ff options of git-rebase, as the corresponding flag is not persisted to, and re-read from, the state directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'sb/unpack-trees-super-prefix'Junio C Hamano2017-02-03
|\ | | | | | | | | | | | | | | | | | | | | "git read-tree" and its underlying unpack_trees() machinery learned to report problematic paths prefixed with the --super-prefix option. * sb/unpack-trees-super-prefix: unpack-trees: support super-prefix option t1001: modernize style t1000: modernize style read-tree: use OPT_BOOL instead of OPT_SET_INT
| * unpack-trees: support super-prefix optionStefan Beller2017-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the future we want to support working tree operations within submodules, e.g. "git checkout --recurse-submodules", which will update the submodule to the commit as recorded in its superproject. In the submodule the unpack-tree operation is carried out as usual, but the reporting to the user needs to prefix any path with the superproject. The mechanism for this is the super-prefix. (see 74866d757, git: make super-prefix option) Add support for the super-prefix option for commands that unpack trees by wrapping any path output in unpacking trees in the newly introduced super_prefixed function. This new function prefixes any path with the super-prefix if there is one. Assuming the submodule case doesn't happen in the majority of the cases, we'd want to have a fast behavior for no super prefix, i.e. no reallocation/copying, but just returning path. Another aspect of introducing the `super_prefixed` function is to consider who owns the memory and if this is the right place where the path gets modified. As the super prefix ought to change the output behavior only and not the actual unpack tree part, it is fine to be that late in the line. As we get passed in 'const char *path', we cannot change the path itself, which means in case of a super prefix we have to copy over the path. We need two static buffers in that function as the error messages contain at most two paths. For testing purposes enable it in read-tree, which has no output of paths other than an unpack-trees.c. These are all converted in this patch. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'js/difftool-builtin'Junio C Hamano2017-01-31
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite a scripted porcelain "git difftool" in C. * js/difftool-builtin: difftool: hack around -Wzero-length-format warning difftool: retire the scripted version difftool: implement the functionality in the builtin difftool: add a skeleton for the upcoming builtin
| * | difftool: retire the scripted versionJohannes Schindelin2017-01-19
| | | | | | | | | | | | | | | | | | | | | | | | It served its purpose, but now we have a builtin difftool. Time for the Perl script to enjoy Florida. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | difftool: add a skeleton for the upcoming builtinJohannes Schindelin2017-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a builtin difftool that still falls back to the legacy Perl version, which has been renamed to `legacy-difftool`. The idea is that the new, experimental, builtin difftool immediately hands off to the legacy difftool for now, unless the config variable difftool.useBuiltin is set to true. This feature flag will be used in the upcoming Git for Windows v2.11.0 release, to allow early testers to opt-in to use the builtin difftool and flesh out any bugs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/execv-dashed-external'Junio C Hamano2017-01-18
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Typing ^C to pager, which usually does not kill it, killed Git and took the pager down as a collateral damage in certain process-tree structure. This has been fixed. * jk/execv-dashed-external: execv_dashed_external: wait for child on signal death execv_dashed_external: stop exiting with negative code execv_dashed_external: use child_process struct
| * | | execv_dashed_external: wait for child on signal deathJeff King2017-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When you hit ^C to interrupt a git command going to a pager, this usually leaves the pager running. But when a dashed external is in use, the pager ends up in a funny state and quits (but only after eating one more character from the terminal!). This fixes it. Explaining the reason will require a little background. When git runs a pager, it's important for the git process to hang around and wait for the pager to finish, even though it has no more data to feed it. This is because git spawns the pager as a child, and thus the git process is the session leader on the terminal. After it dies, the pager will finish its current read from the terminal (eating the one character), and then get EIO trying to read again. When you hit ^C, that sends SIGINT to git and to the pager, and it's a similar situation. The pager ignores it, but the git process needs to hang around until the pager is done. We addressed that long ago in a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22). But when you have a dashed external (or an alias pointing to a builtin, which will re-exec git for the builtin), there's an extra process in the mix. For instance, running: $ git -c alias.l=log l will end up with a process tree like: git (parent) \ git-log (child) \ less (pager) If you hit ^C, SIGINT goes to all of them. The pager ignores it, and the child git process will end up in wait_for_pager(). But the parent git process will die, and the usual EIO trouble happens. So we really want the parent git process to wait_for_pager(), but of course it doesn't know anything about the pager at all, since it was started by the child. However, we can have it wait on the git-log child, which in turn is waiting on the pager. And that's what this patch does. There are a few design decisions here worth explaining: 1. The new feature is attached to run-command's clean_on_exit feature. Partly this is convenience, since that feature already has a signal handler that deals with child cleanup. But it's also a meaningful connection. The main reason that dashed externals use clean_on_exit is to bind the two processes together. If somebody kills the parent with a signal, we propagate that to the child (in this instance with SIGINT, we do propagate but it doesn't matter because the original signal went to the whole process group). Likewise, we do not want the parent to go away until the child has done so. In a traditional Unix world, we'd probably accomplish this binding by just having the parent execve() the child directly. But since that doesn't work on Windows, everything goes through run_command's more spawn-like interface. 2. We do _not_ automatically waitpid() on any clean_on_exit children. For dashed externals this makes sense; we know that the parent is doing nothing but waiting for the child to exit anyway. But with other children, it's possible that the child, after getting the signal, could be waiting on the parent to do something (like closing a descriptor). If we were to wait on such a child, we'd end up in a deadlock. So this errs on the side of caution, and lets callers enable the feature explicitly. 3. When we send children the cleanup signal, we send all the signals first, before waiting on any children. This is to avoid the case where one child might be waiting on another one to exit, causing a deadlock. We inform all of them that it's time to die before reaping any. In practice, there is only ever one dashed external run from a given process, so this doesn't matter much now. But it future-proofs us if other callers start using the wait_after_clean mechanism. There's no automated test here, because it would end up racy and unportable. But it's easy to reproduce the situation by running the log command given above and hitting ^C. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | execv_dashed_external: stop exiting with negative codeJeff King2017-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we try to exec a git sub-command, we pass along the status code from run_command(). But that may return -1 if we ran into an error with pipe() or execve(). This tends to work (and end up as 255 due to twos-complement wraparound and truncation), but in general it's probably a good idea to avoid negative exit codes for portability. We can easily translate to the normal generic "128" code we get when syscalls cause us to die. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | execv_dashed_external: use child_process structJeff King2017-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we run a dashed external, we use the one-liner run_command_v_opt() to do so. Let's switch to using a child_process struct, which has two advantages: 1. We can drop all of the allocation and cleanup code for building our custom argv array, and just rely on the builtin argv_array (at the minor cost of doing a few extra mallocs). 2. We have access to the complete range of child_process options, not just the ones that the "_opt()" form can forward. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'bw/grep-recurse-submodules'Junio C Hamano2017-01-18
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git grep" has been taught to optionally recurse into submodules. * bw/grep-recurse-submodules: grep: search history of moved submodules grep: enable recurse-submodules to work on <tree> objects grep: optionally recurse into submodules grep: add submodules as a grep source type submodules: load gitmodules file from commit sha1 submodules: add helper to determine if a submodule is initialized submodules: add helper to determine if a submodule is populated real_path: canonicalize directory separators in root parts real_path: have callers use real_pathdup and strbuf_realpath real_path: create real_pathdup real_path: convert real_path_internal to strbuf_realpath real_path: resolve symlinks by hand
| * | | grep: optionally recurse into submodulesBrandon Williams2016-12-22
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow grep to recognize submodules and recursively search for patterns in each submodule. This is done by forking off a process to recursively call grep on each submodule. The top level --super-prefix option is used to pass a path to the submodule which can in turn be used to prepend to output or in pathspec matching logic. Recursion only occurs for submodules which have been initialized and checked out by the parent project. If a submodule hasn't been initialized and checked out it is simply skipped. In order to support the existing multi-threading infrastructure in grep, output from each child process is captured in a strbuf so that it can be later printed to the console in an ordered fashion. To limit the number of theads that are created, each child process has half the number of threads as its parents (minimum of 1), otherwise we potentailly have a fork-bomb. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'sb/submodule-embed-gitdir'Junio C Hamano2017-01-10
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new submodule helper "git submodule embedgitdirs" to make it easier to move embedded .git/ directory for submodules in a superproject to .git/modules/ (and point the latter with the former that is turned into a "gitdir:" file) has been added. * sb/submodule-embed-gitdir: worktree: initialize return value for submodule_uses_worktrees submodule: add absorb-git-dir function move connect_work_tree_and_git_dir to dir.h worktree: check if a submodule uses worktrees test-lib-functions.sh: teach test_commit -C <dir> submodule helper: support super prefix submodule: use absolute path for computing relative path connecting
| * | submodule helper: support super prefixStefan Beller2016-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like main commands in Git, the submodule helper needs access to the superproject prefix. Enable this in the git.c but have its own fuse in the helper code by having a flag to turn on the super prefix. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/common-main'Junio C Hamano2016-11-29
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | Fix for a small regression in a topic already in 'master'. * jk/common-main: common-main: stop munging argv[0] path
| * | common-main: stop munging argv[0] pathJeff King2016-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 650c44925 (common-main: call git_extract_argv0_path(), 2016-07-01), the argv[0] that is seen in cmd_main() of individual programs is always the basename of the executable, as common-main strips off the full path. This can produce confusing results for git-daemon, which wants to re-exec itself. For instance, if the program was originally run as "/usr/lib/git/git-daemon", it will try just re-execing "git-daemon", which will find the first instance in $PATH. If git's exec-path has not been prepended to $PATH, we may find the git-daemon from a different version (or no git-daemon at all). Normally this isn't a problem. Git commands are run as "git daemon", the git wrapper puts the exec-path at the front of $PATH, and argv[0] is already "daemon" anyway. But running git-daemon via its full exec-path, while not really a recommended method, did work prior to 650c44925. Let's make it work again. The real goal of 650c44925 was not to munge argv[0], but to reliably set the argv0_path global. The only reason it munges at all is that one caller, the git.c wrapper, piggy-backed on that computation to find the command basename. Instead, let's leave argv[0] untouched in common-main, and have git.c do its own basename computation. While we're at it, let's drop the return value from git_extract_argv0_path(). It was only ever used in this one callsite, and its dual purposes is what led to this confusion in the first place. Note that by changing the interface, the compiler can confirm for us that there are no other callers storing the return value. But the compiler can't tell us whether any of the cmd_main() functions (besides git.c) were relying on the basename munging. However, we can observe that prior to 650c44925, no other cmd_main() functions did that munging, and no new cmd_main() functions have been introduced since then. So we can't be regressing any of those cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | archive: read local configurationJunio C Hamano2016-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since b9605bc4f2 ("config: only read .git/config from configured repos", 2016-09-12), we do not read from ".git/config" unless we know we are in a repository. "git archive" however didn't do the repository discovery and instead relied on the old behaviour. Teach the command to run a "gentle" version of repository discovery so that local configuration variables are honoured. [jc: stole tests from peff] Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | mailinfo: read local configurationJunio C Hamano2016-11-22
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since b9605bc4f2 ("config: only read .git/config from configured repos", 2016-09-12), we do not read from ".git/config" unless we know we are in a repository. "git mailinfo" however didn't do the repository discovery and instead relied on the old behaviour. This was mostly OK because it was merely run as a helper program by other porcelain scripts that first chdir's up to the root of the working tree. Teach the command to run a "gentle" version of repository discovery so that local configuration variables like mailinfo.scissors are honoured. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jc/cocci-xstrdup-or-null'Junio C Hamano2016-10-26
|\ \ | | | | | | | | | | | | | | | | | | Code cleanup. * jc/cocci-xstrdup-or-null: cocci: refactor common patterns to use xstrdup_or_null()
| * | cocci: refactor common patterns to use xstrdup_or_null()Junio C Hamano2016-10-12
| |/ | | | | | | | | | | | | | | | | d64ea0f83b ("git-compat-util: add xstrdup_or_null helper", 2015-01-12) added a handy wrapper that allows us to get a duplicate of a string or NULL if the original is NULL, but a handful of codepath predate its introduction or just weren't aware of it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'bw/ls-files-recurse-submodules'Junio C Hamano2016-10-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files). This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject. * bw/ls-files-recurse-submodules: ls-files: add pathspec matching for submodules ls-files: pass through safe options for --recurse-submodules ls-files: optionally recurse into submodules git: make super-prefix option
| * | ls-files: optionally recurse into submodulesBrandon Williams2016-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow ls-files to recognize submodules in order to retrieve a list of files from a repository's submodules. This is done by forking off a process to recursively call ls-files on all submodules. Use top-level --super-prefix option to pass a path to the submodule which it can use to prepend to output or pathspec matching logic. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | git: make super-prefix optionBrandon Williams2016-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' which can be used to specify a path from above a repository down to its root. When such a super-prefix is specified, the paths reported by Git are prefixed with it to make them relative to that directory "above". The paths given by the user on the command line (e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken relative to the directory "above" to match. The immediate use of this option is by commands which have a --recurse-submodule option in order to give context to submodules about how they were invoked. This option is currently only allowed for builtins which support a super-prefix. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jk/setup-sequence-update'Junio C Hamano2016-09-21
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were numerous corner cases in which the configuration files are read and used or not read at all depending on the directory a Git command was run, leading to inconsistent behaviour. The code to set-up repository access at the beginning of a Git process has been updated to fix them. * jk/setup-sequence-update: t1007: factor out repeated setup init: reset cached config when entering new repo init: expand comments explaining config trickery config: only read .git/config from configured repos test-config: setup git directory t1302: use "git -C" pager: handle early config pager: use callbacks instead of configset pager: make pager_program a file-local static pager: stop loading git_default_config() pager: remove obsolete comment diff: always try to set up the repository diff: handle --no-index prefixes consistently diff: skip implicit no-index check when given --no-index patch-id: use RUN_SETUP_GENTLY hash-object: always try to set up the git repository
| * | patch-id: use RUN_SETUP_GENTLYJeff King2016-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch-id does not require a repository because it is just processing the incoming diff on stdin, but it may look at git config for keys like patchid.stable. Even though we do not setup_git_directory(), this works from the top-level of a repository because we blindly look at ".git/config" in this case. But as the included test demonstrates, it does not work from a subdirectory. We can fix it by using RUN_SETUP_GENTLY. We do not take any filenames from the user on the command line, so there's no need to adjust them via prefix_filename(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'rt/help-unknown'Junio C Hamano2016-09-08
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | "git nosuchcommand --help" said "No manual entry for gitnosuchcommand", which was not intuitive, given that "git nosuchcommand" said "git: 'nosuchcommand' is not a git command". * rt/help-unknown: help: make option --help open man pages only for Git commands help: introduce option --exclude-guides
| * | help: make option --help open man pages only for Git commandsRalf Thielow2016-08-30
| |/ | | | | | | | | | | | | | | | | | | | | | | If option --help is passed to a Git command, we try to open the man page of that command. However, we do it for both commands and concepts. Make sure it is an actual command. This makes "git <concept> --help" not working anymore, while "git help <concept>" still works. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | common-main: call git_setup_gettext()Jeff King2016-07-01
| | | | | | | | | | | | | | | | | | | | | | This should be part of every program, as otherwise users do not get translated error messages. However, some external commands forgot to do so (e.g., git-credential-store). This fixes them, and eliminates the repeated code in programs that did remember to use it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | common-main: call restore_sigpipe_to_default()Jeff King2016-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | This is another safety/sanity setup that should be in force everywhere, but which we only applied in git.c. This did catch most cases, since even external commands are typically run via "git ..." (and the restoration applies to sub-processes, too). But there were cases we missed, such as somebody calling git-upload-pack directly via ssh, or scripts which use dashed external commands directly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | common-main: call sanitize_stdfds()Jeff King2016-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is setup that should be done in every program for safety, but we never got around to adding it everywhere (so builtins benefited from the call in git.c, but any external commands did not). Putting it in the common main() gives us this safety everywhere. Note that the case in daemon.c is a little funny. We wait until we know whether we want to daemonize, and then either: - call daemonize(), which will close stdio and reopen it to /dev/null under the hood - sanitize_stdfds(), to fix up any odd cases But that is way too late; the point of sanitizing is to give us reliable descriptors on 0/1/2, and we will already have executed code, possibly called die(), etc. The sanitizing should be the very first thing that happens. With this patch, git-daemon will sanitize first, and can remove the call in the non-daemonize case. It does mean that daemonize() may just end up closing the descriptors we opened, but that's not a big deal (it's not wrong to do so, nor is it really less optimal than the case where our parent process redirected us from /dev/null ahead of time). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | common-main: call git_extract_argv0_path()Jeff King2016-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every program which links against libgit.a must call this function, or risk hitting an assert() in system_path() that checks whether we have configured argv0_path (though only when RUNTIME_PREFIX is defined, so essentially only on Windows). Looking at the diff, you can see that putting it into the common main() saves us having to do it individually in each of the external commands. But what you can't see are the cases where we _should_ have been doing so, but weren't (e.g., git-credential-store, and all of the t/helper test programs). This has been an accident-waiting-to-happen for a long time, but wasn't triggered until recently because it involves one of those programs actually calling system_path(). That happened with git-credential-store in v2.8.0 with ae5f677 (lazily load core.sharedrepository, 2016-03-11). The program: - takes a lock file, which... - opens a tempfile, which... - calls adjust_shared_perm to fix permissions, which... - lazy-loads the config (as of ae5f677), which... - calls system_path() to find the location of /etc/gitconfig On systems with RUNTIME_PREFIX, this means credential-store reliably hits that assert() and cannot be used. We never noticed in the test suite, because we set GIT_CONFIG_NOSYSTEM there, which skips the system_path() lookup entirely. But if we were to tweak git_config() to find /etc/gitconfig even when we aren't going to open it, then the test suite shows multiple failures (for credential-store, and for some other test helpers). I didn't include that tweak here because it's way too specific to this particular call to be worth carrying around what is essentially dead code. The implementation is fairly straightforward, with one exception: there is exactly one caller (git.c) that actually cares about the result of the function, and not the side-effect of setting up argv0_path. We can accommodate that by simply replacing the value of argv[0] in the array we hand down to cmd_main(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | add an extra level of indirection to main()Jeff King2016-07-01
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* setup: make startup_info available everywhereJeff King2016-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a60645f (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/tighten-alloc'Junio C Hamano2016-02-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...
| * convert manual allocations to argv_arrayJeff King2016-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are many manual argv allocations that predate the argv_array API. Switching to that API brings a few advantages: 1. We no longer have to manually compute the correct final array size (so it's one less thing we can screw up). 2. In many cases we had to make a separate pass to count, then allocate, then fill in the array. Now we can do it in one pass, making the code shorter and easier to follow. 3. argv_array handles memory ownership for us, making it more obvious when things should be free()d and and when not. Most of these cases are pretty straightforward. In some, we switch from "run_command_v" to "run_command" which lets us directly use the argv_array embedded in "struct child_process". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'ak/git-strip-extension-from-dashed-command'Junio C Hamano2016-02-26
|\ \ | | | | | | | | | | | | | | | | | | Code simplification. * ak/git-strip-extension-from-dashed-command: git.c: simplify stripping extension of a file in handle_builtin()
| * | git.c: simplify stripping extension of a file in handle_builtin()Alexander Kuleshov2016-02-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The handle_builtin() starts from stripping of command extension if STRIP_EXTENSION is enabled. Actually STRIP_EXTENSION does not used anywhere else. This patch introduces strip_extension() helper to strip STRIP_EXTENSION extension from argv[0] with the strip_suffix() instead of manually stripping. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'nd/clear-gitenv-upon-use-of-alias'Junio C Hamano2016-02-17
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The automatic typo correction applied to an alias was broken with a recent change already in 'master'. * nd/clear-gitenv-upon-use-of-alias: restore_env(): free the saved environment variable once we are done git: simplify environment save/restore logic git: protect against unbalanced calls to {save,restore}_env() git: remove an early return from save_env_before_alias()
| * | | restore_env(): free the saved environment variable once we are doneJunio C Hamano2016-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like we free orig_cwd, which is the value of the original working directory saved in save_env_before_alias(), once we are done with it, the contents of orig_env[] array, saved in the save_env_before_alias() function should be freed; otherwise, the second and subsequent calls to save/restore pair will leak the memory allocated in save_env_before_alias(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | git: simplify environment save/restore logicJunio C Hamano2016-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only code that cares about the value of the global variable saved_env_before_alias after the previous fix is handle_builtin() that turns into a glorified no-op when the variable is true, so the logic could safely be lifted to its caller, i.e. the caller can refrain from calling it when the variable is set. This variable tells us if save_env_before_alias() was called (with or without matching restore_env()), but the sole caller of the function, handle_alias(), always calls it as the first thing, so we can consider that the variable essentially keeps track of the fact that handle_alias() has ever been called. It turns out that handle_builtin() and handle_alias() are called only from one function in a way that the value of the variable matters, which is run_argv(), and it already keeps track of the fact that it already called handle_alias(). So we can simplify the whole thing by: - Change handle_builtin() to always make a direct call to the builtin implementation it finds, and make sure the caller refrains from calling it if handle_alias() has ever been called; - Remove saved_env_before_alias variable, and instead use the local "done_alias" variable maintained inside run_argv() to make the same decision. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | git: protect against unbalanced calls to {save,restore}_env()Junio C Hamano2016-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We made sure that save_env_before_alias() does not skip saving the environment when asked to (which led to use-after-free of orig_cwd in restore_env() in the buggy version) with the previous step. Protect against future breakage where somebody adds new callers of these functions in an unbalanced fashion. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | git: remove an early return from save_env_before_alias()Junio C Hamano2016-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When help.autocorrect is in effect, an attempt to auto-execute an uniquely corrected result of a misspelt alias will result in an irrelevant error message. The codepath that causes this calls save_env_before_alias() and restore_env() in handle_alias(), and that happens twice. A global variable orig_cwd is allocated to hold the return value of getcwd() in save_env_before_alias(), which is then used in restore_env() to go back to that directory and finally free(3)'d there. However, save_env_before_alias() is not prepared to be called twice. It returns early when it knows it has already been called, leaving orig_cwd undefined, which is then checked in the second call to restore_env(), and by that time, the memory that used to hold the contents of orig_cwd is either freed or reused to hold something else, and this is fed to chdir(2), causing it to fail. Even if it did not fail (i.e. reading of the already free'd piece of memory yielded a directory path that we can chdir(2) to), it then gets free(3)'d. Fix this by making sure save_env() does do the saving when called. While at it, add a minimal test for help.autocorrect facility. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'nd/clear-gitenv-upon-use-of-alias'Junio C Hamano2016-01-20
|\ \ \ \ | |/ / / | | | / | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | d95138e6 (setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR, 2015-06-26) attempted to work around a glitch in alias handling by overwriting GIT_WORK_TREE environment variable to affect subprocesses when set_git_work_tree() gets called, which resulted in a rather unpleasant regression to "clone" and "init". Try to address the same issue by always restoring the environment and respawning the real underlying command when handling alias. * nd/clear-gitenv-upon-use-of-alias: run-command: don't warn on SIGPIPE deaths git.c: make sure we do not leak GIT_* to alias scripts setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when .. git.c: make it clear save_env() is for alias handling only
| * | git.c: make sure we do not leak GIT_* to alias scriptsNguyễn Thái Ngọc Duy2015-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The unfortunate commit d95138e (setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR - 2015-06-26) exposes another problem, besides git-clone that's described in the previous commit. If GIT_WORK_TREE (or even GIT_DIR) is exported to an alias script, it may mislead git commands in the script where the repo is. Granted, most scripts work on the repo where the alias is summoned from. But nowhere do we forbid the script to visit another repository. The revert of d95138e in the previous commit is sufficient as a fix. However, to protect us from accidentally leaking GIT_* environment variables again, we restore certain sensitive env before calling the external script. GIT_PREFIX is let through because there's another setup side effect that we simply accepted so far: current working directory is moved. Maybe in future we can introduce a new alias format that guarantees no cwd move, then we can unexport GIT_PREFIX. Reported-by: Gabriel Ganne <gabriel.ganne@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when ..Nguyễn Thái Ngọc Duy2015-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d95138e [1] attempted to fix a .git file problem by setting GIT_WORK_TREE whenever GIT_DIR is set. It sounded harmless because we handle GIT_DIR and GIT_WORK_TREE side by side for most commands, with two exceptions: git-init and git-clone. "git clone" is not happy with d95138e. This command ignores GIT_DIR but respects GIT_WORK_TREE [2] [3] which means it used to run fine from a hook, where GIT_DIR was set but GIT_WORK_TREE was not (*). With d95138e, GIT_WORK_TREE is set all the time and git-clone interprets that as "I give you order to put the worktree here", usually against the user's intention. The solution in d95138e is reverted earlier, and instead we reuse the solution from c056261 [4]. It fixed another setup-messed- up-by-alias by saving and restoring env and spawning a new process, but for git-clone and git-init only. Now we conclude that setup-messed-up-by-alias is always evil. So the env restoration is done for _all_ commands, including external ones, whenever aliases are involved. It fixes what d95138e tried to fix, without upsetting git-clone-inside-hooks. The test from d95138e remains to verify it's not broken by this. A new test is added to make sure git-clone-inside-hooks remains happy. (*) GIT_WORK_TREE was not set _most of the time_. In some cases GIT_WORK_TREE is set and git-clone will behave differently. The use of GIT_WORK_TREE to direct git-clone to put work tree elsewhere looks like a mistake because it causes surprises this way. But that's a separate story. [1] d95138e (setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR - 2015-06-26) [2] 2beebd2 (clone: create intermediate directories of destination repo - 2008-06-25) [3] 20ccef4 (make git-clone GIT_WORK_TREE aware - 2007-07-06) [4] c056261 (git potty: restore environments after alias expansion - 2014-06-08) Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | git.c: make it clear save_env() is for alias handling onlyNguyễn Thái Ngọc Duy2015-12-22
| | | | | | | | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'sb/submodule-helper'Junio C Hamano2015-10-05
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The infrastructure to rewrite "git submodule" in C is being built incrementally. Let's polish these early parts well enough and make them graduate to 'next' and 'master', so that the more involved follow-up can start cooking on a solid ground. * sb/submodule-helper: submodule: rewrite `module_clone` shell function in C submodule: rewrite `module_name` shell function in C submodule: rewrite `module_list` shell function in C
| * | | submodule: rewrite `module_list` shell function in CStefan Beller2015-09-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the submodule operations work on a set of submodules. Calculating and using this set is usually done via: module_list "$@" | { while read mode sha1 stage sm_path do # the actual operation done } Currently the function `module_list` is implemented in the git-submodule.sh as a shell script wrapping a perl script. The rewrite is in C, such that it is faster and can later be easily adapted when other functions are rewritten in C. git-submodule.sh, similar to the builtin commands, will navigate to the top-most directory of the repository and keep the subdirectory as a variable. As the helper is called from within the git-submodule.sh script, we are already navigated to the root level, but the path arguments are still relative to the subdirectory we were in when calling git-submodule.sh. That's why there is a `--prefix` option pointing to an alternative path which to anchor relative path arguments. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>