| Commit message | Author | Age |
This function was used for comparing local and remote ref
names during fetch (which makes it a candidate for "most
confusingly named function of the year").
It no longer has any callers, so let's get rid of it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_remote_heads function reads the list of remote refs
during a git protocol session. It dates all the way back to
def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04).
At that time, the idea was to come up with a list of refs we
were interested in, and then filter the list as we got it
from the remote side.
Later, 1baaae5 (Make maximal use of the remote refs,
2005-10-28) stopped filtering at the get_remote_heads layer,
letting us use the non-matching refs to find common history.
As a result, all callers now simply pass an empty match
list (and any future callers will want to do the same). So
let's drop these now-useless parameters.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
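
A hedged sketch of the resulting interface; the prototype below is an
approximation of the header after this change, not a verbatim quote:

    /* Approximate post-change prototype: the nr_match/match pair that
     * every remaining caller passed as 0/NULL is simply gone. */
    struct ref;
    struct extra_have_objects;

    struct ref **get_remote_heads(int in, struct ref **list,
                                  unsigned int flags,
                                  struct extra_have_objects *extra_have);
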
* jk/maint-pack-objects-compete-with-delete:
downgrade "packfile cannot be accessed" errors to warnings
pack-objects: protect against disappearing packs
It's possible that while pack-objects is running, a
simultaneously running prune process might delete a pack
that we are interested in. Because we load the pack indices
early on, we know that the pack contains our item, but by
the time we try to open and map it, it is gone.
Since c715f78, we already protect against this in the normal
object access code path, but pack-objects accesses the packs
at a lower level. In the normal access path, we call
find_pack_entry, which will call find_pack_entry_one on each
pack index, which does the actual lookup. If it gets a hit,
we will actually open and verify the validity of the
matching packfile (using c715f78's is_pack_valid). If we
can't open it, we'll issue a warning and pretend that we
didn't find it, causing us to go on to the next pack (or on
to loose objects).
Furthermore, we will cache the descriptor to the opened
packfile. Which means that later, when we actually try to
access the object, we are likely to still have that packfile
opened, and won't care if it has been unlinked from the
filesystem.
Notice the "likely" above. If there is another pack access
in the interim, and we run out of descriptors, we could
close the pack. And then a later attempt to access the
closed pack could fail (we'll try to re-open it, of course,
but it may have been deleted). In practice, this doesn't
happen because we tend to look up items and then access them
immediately.
Pack-objects does not follow this code path. Instead, it
accesses the packs at a much lower level, using
find_pack_entry_one directly. This means we skip the
is_pack_valid check, and may end up with the name of a
packfile, but no open descriptor.
We can add the same is_pack_valid check here. Unfortunately,
the access patterns of pack-objects are not quite as nice
for keeping lookup and object access together. We look up
each object as we find out about it, and only later, when
writing the packfile, do we necessarily access it. Which
means that the opened packfile may be closed in the interim.
In practice, however, adding this check still has value, for
three reasons.
1. If you have a reasonable number of packs and/or a
reasonable file descriptor limit, you can keep all of
your packs open simultaneously. If this is the case,
then the race is impossible to trigger.
2. Even if you can't keep all packs open at once, you
may end up keeping the deleted one open (i.e., you may
get lucky).
3. The race window is shortened. You may notice early that
the pack is gone, and not try to access it. Triggering
the problem without this check means deleting the pack
any time after we read the list of index files, but
before we access the looked-up objects. Triggering it
with this check means deleting the pack after we do a
lookup (and successfully access
the packfile), but before we access the object. Which
is a smaller window.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ph/transport-with-gitfile:
Fix is_gitfile() for files too small or larger than PATH_MAX to be a gitfile
Add test showing git-fetch groks gitfiles
Teach transport about the gitfile mechanism
Learn to handle gitfiles in enter_repo
enter_repo: do not modify input
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
enter_repo(..., 0) currently modifies the input to strip away
trailing slashes. This means that we sometimes need to copy the
input to keep the original.
Change it to unconditionally copy it into the used_path buffer so
we can safely use the input without having to copy it. Also store
a working copy in validated_path up-front before we start
resolving anything.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Phil Hord <hordp@cisco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
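
A simplified, hypothetical sketch of the non-strict case (the buffer names
follow the description above; the real function goes on to try ".git"
suffixes, chdir around, and validate the result):

    #include <limits.h>
    #include <string.h>

    const char *enter_repo_sketch(const char *path)
    {
        static char used_path[PATH_MAX];
        static char validated_path[PATH_MAX];
        size_t len = strlen(path);

        if (len >= PATH_MAX)
            return NULL;
        strcpy(used_path, path);              /* never modify the caller's string */
        while (len > 1 && used_path[len - 1] == '/')
            used_path[--len] = '\0';          /* strip trailing slashes on the copy */
        strcpy(validated_path, used_path);    /* keep the pre-resolution form */
        /* ... resolution of used_path happens here in the real code ... */
        return used_path;
    }
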
* bc/attr-ignore-case:
attr.c: respect core.ignorecase when matching attribute patterns
attr: read core.attributesfile from git_default_core_config
builtin/mv.c: plug miniscule memory leak
cleanup: use internal memory allocation wrapper functions everywhere
attr.c: avoid inappropriate access to strbuf "buf" member
Conflicts:
transport-helper.c
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This code calls git_config from a helper function to parse the config entry
it is interested in. Calling git_config in this way may cause a problem if
the helper function can be called after a previous call to git_config by
another function since the second call to git_config may reset some
variable to the value in the config file which was previously overridden.
The above is not a problem in this case since the function passed to
git_config only parses one config entry and the variable it sets is not
assigned outside of the parsing function. But a programmer who desires
all of the standard config options to be parsed may be tempted to modify
git_attr_config() so that it falls back to git_default_config() and then it
_would_ be vulnerable to the above described behavior.
So, move the call to git_config up into the top-level cmd_* function and
move the responsibility for parsing core.attributesfile into the main
config file parser.
Which is only the logical thing to do ;-)
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
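
A sketch of the recommended calling pattern; git_config() and
git_default_config() are the real hooks, while the command and callback
names here are illustrative:

    /* Illustrative; not the actual builtin code. */
    static int check_attr_config(const char *var, const char *value, void *cb)
    {
        /* command-specific keys would be handled here, then ... */
        return git_default_config(var, value, cb);   /* standard options */
    }

    int cmd_check_attr(int argc, const char **argv, const char *prefix)
    {
        git_config(check_attr_config, NULL);   /* one call, at the top level */
        /* parse options and do the actual work afterwards */
        return 0;
    }
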
* jk/name-hash-dirent:
fix phantom untracked files when core.ignorecase is set
When core.ignorecase is turned on and there are stale index
entries, "git commit" can sometimes report directories as
untracked, even though they contain tracked files.
You can see an example of this with:
# make a case-insensitive repo
git init repo && cd repo &&
git config core.ignorecase true &&
# with some tracked files in a subdir
mkdir subdir &&
> subdir/one &&
> subdir/two &&
git add . &&
git commit -m base &&
# now make the index entries stale
touch subdir/* &&
# and then ask commit to update those entries and show
# us the status template
git commit -a
which will report "subdir/" as untracked, even though it
clearly contains two tracked files. What is happening in the
commit program is this:
1. We load the index, and for each entry, insert it into the index's
name_hash. In addition, if ignorecase is turned on, we make an
entry in the name_hash for the directory (e.g., "contrib/"), which
uses the following code from 5102c61's hash_index_entry_directories:
hash = hash_name(ce->name, ptr - ce->name);
if (!lookup_hash(hash, &istate->name_hash)) {
pos = insert_hash(hash, &istate->name_hash);
if (pos) {
ce->next = *pos;
*pos = ce;
}
}
Note that we only add the directory entry if there is not already an
entry.
2. We run add_files_to_cache, which gets updated information for each
cache entry. It helpfully inserts this information into the cache,
which calls replace_index_entry. This in turn calls
remove_name_hash() on the old entry, and add_name_hash() on the new
one. But remove_name_hash doesn't actually remove from the hash, it
only marks it as "no longer interesting" (from cache.h):
/*
* We don't actually *remove* it, we can just mark it invalid so that
* we won't find it in lookups.
*
* Not only would we have to search the lists (simple enough), but
* we'd also have to rehash other hash buckets in case this makes the
* hash bucket empty (common). So it's much better to just mark
* it.
*/
static inline void remove_name_hash(struct cache_entry *ce)
{
ce->ce_flags |= CE_UNHASHED;
}
This is OK in the specific-file case, since the entries in the hash
form a linked list, and we can just skip the "not here anymore"
entries during lookup.
But for the directory hash entry, we will _not_ write a new entry,
because there is already one there: the old one that is actually no
longer interesting!
3. While traversing the directories, we end up in the
directory_exists_in_index_icase function to see if a directory is
interesting. This in turn checks index_name_exists, which will
look up the directory in the index's name_hash. We see the old,
deleted record, and assume there is nothing interesting. The
directory gets marked as untracked, even though there are index
entries in it.
The problem is in the code I showed above:
hash = hash_name(ce->name, ptr - ce->name);
if (!lookup_hash(hash, &istate->name_hash)) {
pos = insert_hash(hash, &istate->name_hash);
if (pos) {
ce->next = *pos;
*pos = ce;
}
}
Having a single cache entry that represents the directory is
not enough; that entry may go away if the index is changed.
It may be tempting to say that the problem is in our removal
method; if we removed the entry entirely instead of simply
marking it as "not here anymore", then we would know we need
to insert a new entry. But that only covers this particular
case of remove-replace. In the more general case, consider
something like this:
1. We add "foo/bar" and "foo/baz" to the index. Each gets
its own entry in name_hash, plus we make a "foo/"
entry that points to "foo/bar".
2. We remove the "foo/bar" entry from the index, and from
the name_hash.
3. We ask if "foo/" exists, and see no entry, even though
"foo/baz" exists.
So we need that directory entry to have the list of _all_
cache entries that indicate that the directory is tracked.
So that implies making a linked list as we do for other
entries, like:
hash = hash_name(ce->name, ptr - ce->name);
pos = insert_hash(hash, &istate->name_hash);
if (pos) {
ce->next = *pos;
*pos = ce;
}
But that's not right either. In fact, it shows a second bug
in the current code, which is that the "ce->next" pointer is
supposed to be linking entries for a specific filename
entry, but here we are overwriting it for the directory
entry. So the same cache entry ends up in two linked lists,
but they share the same "next" pointer.
As it turns out, this second bug can't be triggered in the
current code. The "if (pos)" conditional is totally dead
code; pos will only be non-NULL if there was an existing
hash entry, and we already checked that there wasn't one
through our call to lookup_hash.
But fixing the first bug means taking out that call to
lookup_hash, which is going to activate the buggy dead code,
and we'll end up splicing the two linked lists together.
So we need to have a separate next pointer for the list in
the directory bucket, and we need to traverse that list in
index_name_exists when we are looking up a directory.
This bloats "struct cache_entry" by a few bytes. Which is
annoying, because it's only necessary when core.ignorecase
is enabled. There's not an easy way around it, short of
separating out the "next" pointers from cache_entry entirely
(i.e., having a separate "cache_entry_list" struct that gets
stored in the name_hash). In practice, it probably doesn't
matter; we have thousands of cache entries, compared to the
millions of objects (where adding 4 bytes to the struct
actually does impact performance).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
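
A hedged sketch of the shape of the fix: give the directory bucket its own
chain pointer (called dir_next here) and walk that chain on lookup,
skipping invalidated entries; the field and helper names approximate the
actual patch:

    /* Sketch; bodies are simplified. */
    struct cache_entry {
        /* ... existing fields ... */
        struct cache_entry *next;       /* same-name chain */
        struct cache_entry *dir_next;   /* per-directory chain (ignorecase only) */
    };

    /* insertion: always link into the directory bucket */
    hash = hash_name(ce->name, ptr - ce->name);
    pos = insert_hash(hash, &istate->name_hash);
    if (pos) {
        ce->dir_next = *pos;    /* do not clobber ce->next */
        *pos = ce;
    }

    /* lookup: the directory is tracked if any entry on the chain is live */
    for (ce = lookup_hash(hash, &istate->name_hash); ce; ce = ce->dir_next)
        if (!(ce->ce_flags & CE_UNHASHED))
            return ce;
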
* mh/check-ref-format-3: (23 commits)
add_ref(): verify that the refname is formatted correctly
resolve_ref(): expand documentation
resolve_ref(): also treat a too-long SHA1 as invalid
resolve_ref(): emit warnings for improperly-formatted references
resolve_ref(): verify that the input refname has the right format
remote: avoid passing NULL to read_ref()
remote: use xstrdup() instead of strdup()
resolve_ref(): do not follow incorrectly-formatted symbolic refs
resolve_ref(): extract a function get_packed_ref()
resolve_ref(): turn buffer into a proper string as soon as possible
resolve_ref(): only follow a symlink that contains a valid, normalized refname
resolve_ref(): use prefixcmp()
resolve_ref(): explicitly fail if a symlink is not readable
Change check_refname_format() to reject unnormalized refnames
Inline function refname_format_print()
Make collapse_slashes() allocate memory for its result
Do not allow ".lock" at the end of any refname component
Refactor check_refname_format()
Change check_ref_format() to take a flags argument
Change bad_ref_char() to return a boolean value
...
Record information about resolve_ref(), hard-won via reverse
engineering, in a comment for future spelunkers.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Previously, get_sha1_hex() would read one character past the end of a
null-terminated string whose strlen was an even number less than 40.
Although the function correctly returned -1 in these cases, the extra
memory access might have been to uninitialized (or even, conceivably,
unallocated) memory.
Add a check to avoid reading past the end of a string.
This problem was discovered by Thomas Rast <trast@student.ethz.ch>
using valgrind.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
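
A self-contained sketch of the guarded loop; hexval() stands in for git's
hex-digit helper (it yields a value with high bits set for a non-hex byte,
so the range check below rejects it), and the early return on a NUL first
nibble is the point of the fix:

    int get_sha1_hex_sketch(const char *hex, unsigned char *sha1)
    {
        int i;
        for (i = 0; i < 20; i++) {
            unsigned int val;
            /*
             * hex[1] == '\0' is caught by the range check below, but if
             * hex[0] is NUL we must not read hex[1] at all, since it is
             * past the end of the string.
             */
            if (!hex[0])
                return -1;
            val = (hexval(hex[0]) << 4) | hexval(hex[1]);
            if (val & ~0xff)
                return -1;
            *sha1++ = val;
            hex += 2;
        }
        return 0;
    }
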
* cb/common-prefix-unification:
rename pathspec_prefix() to common_prefix() and move to dir.[ch]
consolidate pathspec_prefix and common_prefix
remove prefix argument from pathspec_prefix
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Also make common_prefix_len() static as this refactoring makes dir.c
itself the only caller of this helper function.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Passing a prefix to a function that is supposed to find the prefix is
strange. And it's really only used if the pathspec is NULL. Make the
callers handle this case instead.
As we are always returning a fresh copy of a string (or NULL), change the
type of the returned value to non-const "char *".
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
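
A hedged sketch of the resulting shape, as seen from a caller (the
prototype reflects the description above; the fallback to the command-line
prefix is paraphrased):

    /* dir.h, approximately: returns a freshly allocated copy, or NULL */
    char *common_prefix(const char **pathspec);

    /* caller: only fall back to the prefix when there is no pathspec at all */
    char *base = pathspec ? common_prefix(pathspec)
                          : (prefix ? xstrdup(prefix) : NULL);
    /* ... use base, then free(base) ... */
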
* fg/submodule-git-file-git-dir:
Move git-dir for submodules
rev-parse: add option --resolve-git-dir <path>
Conflicts:
cache.h
git-submodule.sh
Check if <path> is a valid git-dir or a valid git-file that points
to a valid git-dir.
We want tests to be independent of the fact that a git-dir may
be a git-file, so the tests are changed to use this feature.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Mentored-by: Jens Lehmann <Jens.Lehmann@web.de>
Mentored-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
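
For illustration only, a hypothetical helper for the git-file half of that
check; the "gitdir: <path>" line is the documented gitfile format, but this
is not the code behind --resolve-git-dir:

    #include <stdio.h>
    #include <string.h>

    /* Return the path a gitfile points at, or NULL if it is not a gitfile. */
    static const char *gitfile_target(const char *path, char *buf, size_t len)
    {
        FILE *fp = fopen(path, "r");
        if (!fp)
            return NULL;
        if (!fgets(buf, (int)len, fp) || strncmp(buf, "gitdir: ", 8)) {
            fclose(fp);
            return NULL;
        }
        fclose(fp);
        buf[strcspn(buf, "\r\n")] = '\0';
        return buf + 8;   /* the caller still has to validate this git-dir */
    }
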
* rr/revert-cherry-pick-continue:
builtin/revert.c: make commit_list_append() static
revert: Propagate errors upwards from do_pick_commit
revert: Introduce --continue to continue the operation
revert: Don't implicitly stomp pending sequencer operation
revert: Remove sequencer state when no commits are pending
reset: Make reset remove the sequencer state
revert: Introduce --reset to remove sequencer state
revert: Make pick_commits functionally act on a commit list
revert: Save command-line options for continuing operation
revert: Save data for continuing after conflict resolution
revert: Don't create invalid replay_opts in parse_args
revert: Separate cmdline parsing from functional code
revert: Introduce struct to keep command-line options
revert: Eliminate global "commit" variable
revert: Rename no_replay to record_origin
revert: Don't check lone argument in get_encoding
revert: Simplify and inline add_message_to_msg
config: Introduce functions to write non-standard file
advice: Introduce error_resolve_conflict
Introduce two new functions corresponding to "git_config_set" and
"git_config_set_multivar" to write a non-standard configuration file.
Expose these new functions in cache.h for other git programs to use.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
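
A hedged usage sketch; the "_in_file" names are a guess based on the
description above (functions corresponding to git_config_set and
git_config_set_multivar that take the target filename), and the file path
is purely illustrative:

    /* Illustrative call only; error handling condensed. */
    const char *opts_file = git_path("sequencer/opts");   /* hypothetical location */

    if (git_config_set_in_file(opts_file, "options.signoff", "true"))
        die("could not write option to %s", opts_file);
    /* git_config_set_multivar_in_file() would be the multi-valued counterpart */
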
* nd/maint-clone-gitdir:
clone: allow to clone from .git file
read_gitfile_gently(): rename misnamed function to read_gitfile()
The function was not gentle at all to the callers and died without giving
them a chance to deal with possible errors. Rename it to read_gitfile(),
and update all the callers.
As no existing caller needs a true "gently" variant, we do not bother
adding one at this point.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cb/maint-ls-files-error-report:
ls-files: fix pathspec display on error
The following sequence of commands reveals an issue with error
reporting of relative paths:
$ mkdir sub
$ cd sub
$ git ls-files --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
$ git commit --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
This bug is visible only if the normalized path (i.e., the relative
path from the repository root) is longer than the prefix.
Otherwise, the code skips over the normalized path and reads from
an unused memory location which still contains a leftover of the
original command line argument.
So instead, use the existing facilities to deal with relative paths
correctly.
Also fix inconsistency between "checkout" and "commit", e.g.
$ cd Documentation
$ git checkout nosuch.txt
error: pathspec 'Documentation/nosuch.txt' did not match...
$ git commit nosuch.txt
error: pathspec 'nosuch.txt' did not match...
by propagating the prefix down the codepath that reports the error.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/ref-namespaces:
ref namespaces: tests
ref namespaces: documentation
ref namespaces: Support remote repositories via upload-pack and receive-pack
ref namespaces: infrastructure
Fix prefix handling in ref iteration functions
Add support for dividing the refs of a single repository into multiple
namespaces, each of which can have its own branches, tags, and HEAD.
Git can expose each namespace as an independent repository to pull from
and push to, while sharing the object store, and exposing all the refs
to operations such as git-gc.
Storing multiple repositories as namespaces of a single repository
avoids storing duplicate copies of the same objects, such as when
storing multiple branches of the same source. The alternates mechanism
provides similar support for avoiding duplicates, but alternates do not
prevent duplication between new objects added to the repositories
without ongoing maintenance, while namespaces do.
To specify a namespace, set the GIT_NAMESPACE environment variable to
the namespace. For each ref namespace, git stores the corresponding
refs in a directory under refs/namespaces/. For example,
GIT_NAMESPACE=foo will store refs under refs/namespaces/foo/. You can
also specify namespaces via the --namespace option to git.
Note that namespaces which include a / will expand to a hierarchy of
namespaces; for example, GIT_NAMESPACE=foo/bar will store refs under
refs/namespaces/foo/refs/namespaces/bar/. This makes paths in
GIT_NAMESPACE behave hierarchically, so that cloning with
GIT_NAMESPACE=foo/bar produces the same result as cloning with
GIT_NAMESPACE=foo and cloning from that repo with GIT_NAMESPACE=bar. It
also avoids ambiguity with strange namespace paths such as
foo/refs/heads/, which could otherwise generate directory/file conflicts
within the refs directory.
Add the infrastructure for ref namespaces: handle the GIT_NAMESPACE
environment variable and --namespace option, and support iterating over
refs in a namespace.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
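
A self-contained sketch of the hierarchical expansion rule described above
(an illustration of the rule only, not git's environment handling):

    #include <stdio.h>
    #include <string.h>

    /* "foo/bar" -> "refs/namespaces/foo/refs/namespaces/bar/" */
    static void expand_namespace(const char *ns, char *out, size_t outlen)
    {
        size_t used = 0;
        out[0] = '\0';
        while (*ns) {
            size_t n = strcspn(ns, "/");    /* length of the next component */
            if (n) {
                int w = snprintf(out + used, outlen - used,
                                 "refs/namespaces/%.*s/", (int)n, ns);
                if (w < 0 || (size_t)w >= outlen - used)
                    break;                  /* out of room; the sketch gives up */
                used += w;
            }
            ns += n;
            if (*ns)
                ns++;                       /* skip the '/' separator */
        }
    }
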
* cb/partial-commit-relative-pathspec:
commit: allow partial commits with relative paths
In order to do partial commits, git-commit overlays a tree on the
cache and checks pathspecs against the result. Currently, the
overlaying is done using "prefix", which prevents relative pathspecs
with ".." and absolute pathspecs from matching when they refer to
files not under "prefix" and absent from the index, but still in
the tree (i.e. files staged for removal).
The point of providing a prefix at all is performance optimization.
If we say there is no common prefix for the files of interest, then
we have to read the entire tree into the index.
But even if we cannot use the working directory as a prefix, we can
still figure out if there is a common prefix for all given paths,
and use that instead. The pathspec_prefix() routine from ls-files.c
does exactly that.
Any use of global variables is removed from pathspec_prefix() so
that it can be called from commit.c.
Reported-by: Reuben Thomas <rrt@sc3d.org>
Analyzed-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/pack-order-tweak:
pack-objects: optimize "recency order"
core: log offset pack data accesses happened
In a workload other than "git log" (without a pathspec or any option that
causes us to inspect trees and blobs), the recency pack order is said to
cause the access to jump around quite a bit. Add a hook to allow us to
observe how bad it is.
"git config core.logpackaccess /var/tmp/pal.txt" will give you the log
in the specified file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/index-pack:
verify-pack: use index-pack --verify
index-pack: show histogram when emulating "verify-pack -v"
index-pack: start learning to emulate "verify-pack -v"
index-pack: a miniscule refactor
index-pack --verify: read anomalous offsets from v2 idx file
write_idx_file: need_large_offset() helper function
index-pack: --verify
write_idx_file: introduce a struct to hold idx customization options
index-pack: group the delta-base array entries also by type
Conflicts:
builtin/verify-pack.c
cache.h
sha1_file.c
This finally gets rid of the inefficient verify-pack implementation that
walks objects in the packfile in their object name order and replaces it
with a call to index-pack --verify. As a side effect, it also removes
packed_object_info_detail() API which is rather expensive.
As this changes the way errors are reported (verify-pack used to rely on
the usual runtime error detection routine unpack_entry() to diagnose the
CRC errors in an entry in the *.idx file; index-pack --verify checks the
whole *.idx file in one go), update a test that expected the string "CRC"
to appear in the error message.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/clone-cmdline-config:
clone: accept config options on the command line
config: make git_config_parse_parameter a public function
remote: use new OPT_STRING_LIST
parse-options: add OPT_STRING_LIST helper
We use this internally to parse "git -c core.foo=bar", but
the general format of "key=value" is useful for other
places.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/zlib-wrap:
zlib: allow feeding more than 4GB in one go
zlib: zlib can only process 4GB at a time
zlib: wrap deflateBound() too
zlib: wrap deflate side of the API
zlib: wrap inflateInit2 used to accept only for gzip format
zlib: wrap remaining calls to direct inflate/inflateEnd
zlib wrapper: refactor error message formatter
Conflicts:
sha1_file.c
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.
But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.
In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.
Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
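
A hedged sketch of the wrapper idea: keep full-width counts in our own
struct and clamp what a single zlib call sees (the layout and helper below
approximate the series, they are not verbatim):

    #include <limits.h>
    #include <zlib.h>

    struct git_zstream {
        z_stream z;                 /* the real zlib state */
        unsigned long avail_in;     /* full sizes, not limited to uInt */
        unsigned long avail_out;
        unsigned char *next_in;
        unsigned char *next_out;
    };

    /* hand zlib at most what a uInt can express in one call */
    static unsigned int zlib_buf_cap(unsigned long len)
    {
        return (len > UINT_MAX) ? UINT_MAX : (unsigned int)len;
    }
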
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().
There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
http-backend.c uses inflateInit2() to tell the library that it wants to
accept only gzip format. Wrap it in a helper function so that readers do
not have to wonder what the magic numbers 15 and 16 are for.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
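
For reference, the zlib convention that such a helper hides: a windowBits
value of 15 selects the largest window, and adding 16 restricts decoding to
the gzip format (adding 32 instead would auto-detect zlib vs. gzip). A
minimal sketch, not necessarily git's exact wrapper:

    #include <zlib.h>

    /* 15 = maximal window size; +16 = accept the gzip format only */
    static int inflate_init_gzip_only(z_stream *strm)
    {
        return inflateInit2(strm, 15 + 16);   /* Z_OK on success */
    }
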
* jk/maint-config-alias-fix:
handle_options(): do not miscount how many arguments were used
config: always parse GIT_CONFIG_PARAMETERS during git_config
git_config: don't peek at global config_parameters
config: make environment parsing routines static
* jk/git-connection-deadlock-fix:
test core.gitproxy configuration
send-pack: avoid deadlock on git:// push with failed pack-objects
connect: let callers know if connection is a socket
connect: treat generic proxy processes like ssh processes
Conflicts:
connect.c
* jc/pack-objects-bigfile:
Teach core.bigfilethreashold to pack-objects
* jc/streaming-filter:
t0021: test application of both crlf and ident
t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test
streaming: filter cascading
streaming filter: ident filter
Add LF-to-CRLF streaming conversion
stream filter: add "no more input" to the filters
Add streaming filter API
convert.h: move declarations for conversion from cache.h
Before adding the streaming filter API to the conversion layer,
move the existing declarations related to the conversion to its
own header file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/streaming:
sha1_file: use the correct type (ssize_t, not size_t) for read-style function
streaming: read loose objects incrementally
sha1_file.c: expose helpers to read loose objects
streaming: read non-delta incrementally from a pack
streaming_write_entry(): support files with holes
convert: CRLF_INPUT is a no-op in the output codepath
streaming_write_entry(): use streaming API in write_entry()
streaming: a new API to read from the object store
write_entry(): separate two helper functions out
unpack_object_header(): make it public
sha1_object_info_extended(): hint about objects in delta-base cache
sha1_object_info_extended(): expose a bit more info
packed_object_info_detail(): do not return a string
Make map_sha1_file(), parse_sha1_header() and unpack_sha1_header()
available to the streaming read API by exporting them via cache.h header
file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the output to a path does not have to be converted, we can read from
the object database using the streaming API and write to the file in the
working tree, without having to hold everything in memory.
The ident, auto- and safe- crlf conversions inherently require you to read
the whole thing before deciding what to do, so while it is technically
possible to support them by using a buffer of an unbound size or rewinding
and reading the stream twice, it is less practical than the traditional
"read the whole thing in core and convert" approach.
Adding streaming filters for the other conversions on top of this should
be doable by tweaking the can_bypass_conversion() function (it should be
renamed to can_filter_stream() when it happens). Then the streaming API
can be extended to wrap the git_istream streaming_write_entry() opens on
the underlying object in another git_istream that reads from it, filters
what is read, and let the streaming_write_entry() read the filtered
result. But that is outside the scope of this series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
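
A hedged sketch of the consuming loop such a streaming checkout uses;
open_istream/read_istream/close_istream follow the API names this series
introduces, while fd and sha1 stand in for the caller's context and the
exact signatures are an approximation:

    /* Illustrative loop; error handling condensed. */
    struct git_istream *st;
    enum object_type type;
    unsigned long size;
    char buf[16384];
    ssize_t readlen;

    st = open_istream(sha1, &type, &size, NULL);   /* NULL: no stream filter */
    if (!st)
        return error("cannot stream blob %s", sha1_to_hex(sha1));
    while ((readlen = read_istream(st, buf, sizeof(buf))) > 0) {
        if (write_in_full(fd, buf, readlen) != readlen)
            break;                                 /* short write; bail out */
    }
    close_istream(st);
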
This function is used to read and skip over the per-object header
in a packfile.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An object found in the delta-base cache is not guaranteed to
stay there, but we know it came from a pack and it is likely
to give us a quick access if we read_sha1_file() it right now,
which is a piece of useful information.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original interface for sha1_object_info() takes an object name and
gives back a type and its size (the latter is given only when it was
asked). The new interface wraps its implementation and exposes a bit
more pieces of information that the interface used to discard, namely:
- where the object is stored (loose? cached? packed?)
- if packed, in which packfile, and where in it?
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* In the earlier round, this used u.pack.delta to record the length of
the delta chain, but the caller is not necessarily interested in the
length of the delta chain per se, but may only want to know if it is a
delta against another object or is stored as deflated data. Calling
packed_object_info_detail() involves walking the reverse index chain to
compute the stored size of the object and is unnecessarily expensive.
We could resurrect the code if a new caller wants to know, but I doubt
it.
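
A hedged sketch of the extended query structure being described; the field
names approximate the series and should be read as an aid to the prose
above rather than the exact header:

    /* Approximate shape of the extended interface. */
    struct object_info {
        /* request */
        unsigned long *sizep;           /* ask for the size, or NULL */

        /* response */
        enum {
            OI_CACHED,                  /* in the in-core cached-object store */
            OI_LOOSE,                   /* a loose object file */
            OI_PACKED,                  /* stored in a packfile */
            OI_DBCACHED                 /* in the delta-base cache (from a pack) */
        } whence;
        union {
            struct {
                struct packed_git *pack;
                off_t offset;
                unsigned int is_delta;  /* delta against another object? */
            } packed;
        } u;
    };

    int sha1_object_info_extended(const unsigned char *sha1,
                                  struct object_info *oi);
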