aboutsummaryrefslogtreecommitdiff
path: root/notes.c
Commit message (Collapse)AuthorAge
* Notes API: prune_notes(): Prune notes that belong to non-existing objectsJohan Herland2010-02-13
| | | | | | | | | | | | | | | | | | | When an object is made unreachable by Git, any notes that annotate that object are not automagically made unreachable, since all notes are always trivially reachable from a notes ref. In order to remove notes for non-existing objects, we therefore need to add functionality for traversing the notes tree and explicitly removing references to notes that annotate non-reachable objects. Thus the notes objects themselves also become unreachable, and are removed by a later garbage collect. prune_notes() performs this traversal (by using for_each_note() internally), and removes the notes in question from the notes tree. Note that the effect of prune_notes() is not persistent unless a subsequent call to write_notes_tree() is made. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Teach notes code to properly preserve non-notes in the notes treeJohan Herland2010-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | The note tree structure allows for non-note entries to coexist with note entries in a notes tree. Although we certainly expect there to be very few non-notes in a notes tree, we should still support them to a certain degree. This patch teaches the notes code to preserve non-notes when updating the notes tree with write_notes_tree(). Non-notes are not affected by fanout restructuring. For non-notes to be handled correctly, we can no longer allow subtree entries that do not match the fanout structure produced by the notes code itself. This means that fanouts like 4/36, 6/34, 8/32, 4/4/32, etc. are no longer recognized as note subtrees; only 2-based fanouts are allowed (2/38, 2/2/36, 2/2/2/34, etc.). Since the notes code has never at any point _produced_ non-2-based fanouts, it is highly unlikely that this change will cause problems for anyone. The patch also adds some tests verifying the correct handling of non-notes in a notes tree. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Refactor notes concatenation into a flexible interface for combining notesJohan Herland2010-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a note to an object that already has an existing note, the current solution is to concatenate the contents of the two notes. However, the caller may instead wish to _overwrite_ the existing note with the new note, or maybe even _ignore_ the new note, and keep the existing one. There might also be other ways of combining notes that are only known to the caller. Therefore, instead of unconditionally concatenating notes, we let the caller specify how to combine notes, by passing in a pointer to a function for combining notes. The caller may choose to implement its own function for notes combining, but normally one of the following three conveniently supplied notes combination functions will be sufficient: - combine_notes_concatenate() combines the two notes by appending the contents of the new note to the contents of the existing note. - combine_notes_overwrite() replaces the existing note with the new note. - combine_notes_ignore() keeps the existing note, and ignores the new note. A combine_notes function can be passed to init_notes() to choose a default combine_notes function for that notes tree. If NULL is given, the notes tree falls back to combine_notes_concatenate() as the ultimate default. A combine_notes function can also be passed directly to add_note(), to control the notes combining behaviour for a note addition in particular. If NULL is passed, the combine_notes function registered for the given notes tree is used. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: Allow multiple concurrent notes trees with new struct notes_treeJohan Herland2010-02-13
| | | | | | | | | | | | | The new struct notes_tree encapsulates access to a specific notes tree. It is provided to allow callers to make use of several different notes trees simultaneously. A struct notes_tree * parameter is added to every function in the notes API. In all cases, NULL can be passed, in which case the fallback "default" notes tree (default_notes_tree) is used. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: write_notes_tree(): Store the notes tree in the databaseJohan Herland2010-02-13
| | | | | | | | | Uses for_each_note() to traverse the notes tree, and produces tree objects on the fly representing the "on-disk" version of the notes tree with appropriate fanout. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: for_each_note(): Traverse the entire notes tree with a callbackJohan Herland2010-02-13
| | | | | | | | This includes a first attempt at creating an optimal fanout scheme (which is calculated on-the-fly, while traversing). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: get_note(): Return the note annotating the given objectJohan Herland2010-02-13
| | | | | | | Created by a simple cleanup and rename of lookup_notes(). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: remove_note(): Remove note objects from the notes tree structureJohan Herland2010-02-13
| | | | | | | | This includes adding internal functions for maintaining a healthy notes tree structure after removing individual notes. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: add_note(): Add note objects to the internal notes tree structureJohan Herland2010-02-13
| | | | | Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: init_notes(): Initialize the notes tree from the given notes refJohan Herland2010-02-13
| | | | | | | | | | | | | | Created by a simple refactoring of initialize_notes(). Also add a new 'flags' parameter, which is a bitwise combination of notes initialization flags. For now, there is only one flag - NOTES_INIT_EMPTY - which indicates that the notes tree should not auto-load the contents of the given (or default) notes ref, but rather should leave the notes tree initialized to an empty state. This will become useful in the future when manipulating the notes tree through the notes API. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Notes API: get_commit_notes() -> format_note() + remove the commit restrictionJohan Herland2010-02-13
| | | | | | | | | | | | | There is really no reason why only commit objects can be annotated. By changing the struct commit parameter to get_commit_notes() into a sha1 we gain the ability to annotate any object type. To reflect this in the function naming as well, we rename get_commit_notes() to format_note(). This patch also fixes comments and variable names throughout notes.c as a consequence of the removal of the unnecessary 'commit' restriction. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Minor cosmetic fixes to notes.cJohan Herland2010-02-13
| | | | | Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix crasher on encountering SHA1-like non-note in notes treeJohan Herland2009-12-03
| | | | | | | | | | | | | | | | | | | | When loading a notes tree, the code primarily looks for SHA1-like paths whose total length (discounting directory separators) are 40 chars (interpreted as valid note entries) or less (interpreted as subtree entries that may in turn contain note entries when unpacked). However, there is an additional condition that must hold for valid subtree entries: They must be _tree_ objects (duh). This patch adds an appropriate test for this condition, thereby fixing the crash that occured when passing a non-tree object to the tree-walk API. The patch also adds another selftest verifying correct behaviour of non-notes in note trees. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Refactor notes code to concatenate multiple notes annotating the same objectJohan Herland2009-10-19
| | | | | | | | | | | | | | | | | | | | | | Currently, having multiple notes referring to the same commit from various locations in the notes tree is strongly discouraged, since only one of those notes will be parsed and shown. This patch teaches the notes code to _concatenate_ multiple notes that annotate the same commit. Notes are concatenated by creating a new blob object containing the concatenation of the notes in question, and replacing them with the concatenated note in the internal notes tree structure. Getting the concatenation right requires being more proactive in unpacking subtree entries in the internal notes tree structure, so that we don't return a note prematurely (i.e. before having found all other notes that annotate the same object). As such, this patch may incur a small performance penalty. Suggested-by: Sam Vilain <sam@vilain.net> Re-suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Teach the notes lookup code to parse notes trees with various fanout schemesJohan Herland2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The semantics used when parsing notes trees (with regards to fanout subtrees) follow Dscho's proposal fairly closely: - No concatenation/merging of notes is performed. If there are several notes objects referencing a given commit, only one of those objects are used. - If a notes object for a given commit is present in the "root" notes tree, no subtrees are consulted; the object in the root tree is used directly. - If there are more than one subtree that prefix-matches the given commit, only the subtree with the longest matching prefix is consulted. This means that if the given commit is e.g. "deadbeef", and the notes tree have subtrees "de" and "dead", then the following paths in the notes tree are searched: "deadbeef", "dead/beef". Note that "de/adbeef" is NOT searched. - Fanout directories (subtrees) must references a whole number of bytes from the SHA1 sum they subdivide. E.g. subtrees "dead" and "de" are acceptable; "d" and "dea" are not. - Multiple levels of fanout are allowed. All the above rules apply recursively. E.g. "de/adbeef" is preferred over "de/adbe/ef", etc. This patch changes the in-memory datastructure for holding parsed notes: Instead of holding all note (and subtree) entries in a hash table, a simple 16-tree structure is used instead. The tree structure consists of 16-arrays as internal nodes, and note/subtree entries as leaf nodes. The tree is traversed by indexing subsequent nibbles of the search key until a leaf node is encountered. If a subtree entry is encountered while searching for a note, the subtree is unpacked into the 16-tree structure, and the search continues into that subtree. The new algorithm performs significantly better in the cases where only a fraction of the notes need to be looked up (this is assumed to be the common case for notes lookup). The new code even performs marginally better in the worst case (where _all_ the notes are looked up). In addition to this, comes the massive performance win associated with organizing the notes tree according to some fanout scheme. Even a simple 2/38 fanout scheme is dramatically quicker to traverse (going from tens of seconds to sub-second runtimes). As for memory usage, the new code is marginally better than the old code in the worst case, but in the case of looking up only some notes from a notes tree with proper fanout, the new code uses only a small fraction of the memory needed to hold the entire notes tree. However, there is one casualty of this patch. The old notes lookup code was able to parse notes that were associated with non-SHA1s (e.g. refs). The new code requires the referenced object to be named by a SHA1 sum. Still, this is not considered a major setback, since the notes infrastructure was not originally intended to annotate objects outside the Git object database. Cc: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Teach notes code to free its internal data structures on requestJohan Herland2009-10-19
| | | | | | | | | | | There's no need to be rude to memory-concious callers... This patch has been improved by the following contributions: - Junio C Hamano: avoid old-style declaration Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add flags to get_commit_notes() to control the format of the note stringJohan Herland2009-10-19
| | | | | | | | | | | This patch adds the following flags to get_commit_notes() for adjusting the format of the produced note string: - NOTES_SHOW_HEADER: Print "Notes:" line before the notes contents - NOTES_INDENT: Indent notes contents by 4 spaces Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Speed up git notes lookupJohannes Schindelin2009-10-19
| | | | | | | | | | | | | | | | | | | | To avoid looking up each and every commit in the notes ref's tree object, which is very expensive, speed things up by slurping the tree object's contents into a hash_map. The idea for the hashmap singleton is from David Reiss, initial benchmarking by Jeff King. Note: the implementation allows for arbitrary entries in the notes tree object, ignoring those that do not reference a valid object. This allows you to annotate arbitrary branches, or objects. This patch has been improved by the following contributions: - Junio C Hamano: fixed an obvious error in initialize_hash_map() Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Introduce commit notesJohannes Schindelin2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit notes are blobs which are shown together with the commit message. These blobs are taken from the notes ref, which you can configure by the config variable core.notesRef, which in turn can be overridden by the environment variable GIT_NOTES_REF. The notes ref is a branch which contains "files" whose names are the names of the corresponding commits (i.e. the SHA-1). The rationale for putting this information into a ref is this: we want to be able to fetch and possibly union-merge the notes, maybe even look at the date when a note was introduced, and we want to store them efficiently together with the other objects. This patch has been improved by the following contributions: - Thomas Rast: fix core.notesRef documentation - Tor Arne Vestbø: fix printing of multi-line notes - Alex Riesen: Using char array instead of char pointer costs less BSS - Johan Herland: Plug leak when msg is good, but msglen or type causes return Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> get_commit_notes(): Plug memory leak when 'if' triggers, but not because of read_sha1_file() failure
* Revert "Merge branch 'js/notes'"Junio C Hamano2009-02-10
| | | | | This reverts commit 7b75b331f6744fbf953fe8913703378ef86a2189, reversing changes made to 5d680a67d7909c89af96eba4a2d77abed606292b.
* git-notes: fix printing of multi-line notesTor Arne Vestbø2009-01-14
| | | | | | | | | | | | | The line length was read from the same position every time, causing mangled output when printing notes with multiple lines. Also, adding new-line manually for each line ensures that we get a new-line between commits, matching git-log for commits without notes. Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Speed up git notes lookupJohannes Schindelin2009-01-11
| | | | | | | | | | | | | | | | | | To avoid looking up each and every commit in the notes ref's tree object, which is very expensive, speed things up by slurping the tree object's contents into a hash_map. The idea fo the hashmap singleton is from David Reiss, initial benchmarking by Jeff King. Note: the implementation allows for arbitrary entries in the notes tree object, ignoring those that do not reference a valid object. This allows you to annotate arbitrary branches, or objects. [jc: fixed an obvious error in initialize_hash_map()] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Introduce commit notesJohannes Schindelin2008-12-21
Commit notes are blobs which are shown together with the commit message. These blobs are taken from the notes ref, which you can configure by the config variable core.notesRef, which in turn can be overridden by the environment variable GIT_NOTES_REF. The notes ref is a branch which contains "files" whose names are the names of the corresponding commits (i.e. the SHA-1). The rationale for putting this information into a ref is this: we want to be able to fetch and possibly union-merge the notes, maybe even look at the date when a note was introduced, and we want to store them efficiently together with the other objects. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>