author | Nicolas Pitre <nico@cam.org> | 2007-04-16 12:32:13 -0400 |
---|---|---|
committer | Junio C Hamano <junkio@cox.net> | 2007-04-16 17:43:31 -0700 |
commit | 5c49c11686df9d1c27a194349d0b2092e6446f42 (patch) | |
tree | 756ebac6660db74f88ff30c83abcfa9e2ea60aeb /README | |
parent | 54dab52ae8518da67e271b5b3a1f91af1fd5e314 (diff) | |
pack-objects: better check_object() performances
With a large number of objects, check_object() really thrashes the pack
sliding map and the filesystem cache. Its access pattern is completely
random, especially with old objects, where delta replay jumps back and
forth all over the pack.
This patch improves things by:
1) sorting objects by their offset in pack before calling check_object()
so the pack access pattern is linear;
2) recording the object type at add_object_entry() time since it is
already known in most cases;
3) recording the pack offset even for preferred_base objects;
4) avoiding calls to sha1_object_info() if at all possible.
This limits pack accesses to the bare minimum and makes them perfectly
linear.
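
As a rough illustration of point 1, the sketch below sorts an array of pointers to object entries by pack and then by offset, so that a later pass can visit each pack front to back. It is not the code from this patch: `struct entry`, its `pack_id` and `offset` fields, `cmp_by_pack_offset()` and `sort_by_offset()` are hypothetical names chosen for the example, and the real object entry structure carries more state than this.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an object entry. */
struct entry {
	unsigned int pack_id;   /* assumed: identifies the pack holding the object */
	unsigned long offset;   /* assumed: byte offset of the object in that pack */
};

/* qsort() comparator: order by pack first, then by offset within the pack. */
static int cmp_by_pack_offset(const void *a_, const void *b_)
{
	const struct entry *a = *(const struct entry *const *)a_;
	const struct entry *b = *(const struct entry *const *)b_;

	if (a->pack_id != b->pack_id)
		return a->pack_id < b->pack_id ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Build an array of pointers into objects[], sorted so that walking it
 * touches each pack in increasing offset order. The caller frees the
 * returned array. Returns NULL if the allocation fails.
 */
static struct entry **sort_by_offset(struct entry *objects, size_t nr)
{
	struct entry **sorted = malloc(nr * sizeof(*sorted));
	size_t i;

	if (!sorted)
		return NULL;
	for (i = 0; i < nr; i++)
		sorted[i] = &objects[i];
	qsort(sorted, nr, sizeof(*sorted), cmp_by_pack_offset);
	return sorted;
}
```

Sorting pointers rather than the entries themselves leaves the original array order untouched, which matters if other code indexes into it; a per-object check can then simply iterate over the sorted pointer array.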
In the process, check_object() was made clearer (to me at least).
Note: I thought about walking the sorted_by_offset list backward in
get_object_details() so if a pack happens to be larger than the available
file cache, then the cache would have been populated with useful data from
the beginning of the pack already when find_deltas() is called. Strangely,
testing (on Linux) showed absolutely no performance difference.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>