author | René Scharfe <rene.scharfe@lsrfire.ath.cx> | 2010-02-19 23:20:44 +0100 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2010-02-20 09:22:44 -0800 |
commit | 462749b728f72079a67202d4d0d1ef19ef993f61 (patch) | |
tree | d88473ea817cf1c86f8cd3f51c906ef3cccd34f5 | |
parent | 68ad5e1e9c10e8a640703aadbdf8b8366014373b (diff) | |
utf8.c: speculatively assume utf-8 in strbuf_add_wrapped_text()
is_utf8() works by calling utf8_width() for each character at the
supplied location. In strbuf_add_wrapped_text(), we do that anyway
while wrapping the lines. So instead of checking the encoding
beforehand, optimistically assume that it's utf-8 and wrap along
until an invalid character is hit, and when that happens start over.
This pays off if the text consists only of valid utf-8 characters.
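The speculate-then-retry idea described above can be sketched in isolation. This is a minimal illustration, not the code from utf8.c: `toy_utf8_advance()` and `toy_count_columns()` are hypothetical names, the decoder only checks lead and continuation bytes (it does not reject overlong forms or surrogates), and every valid character is assumed to be one column wide. Like utf8_width(), the helper signals an invalid sequence by setting the caller's pointer to NULL.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for utf8_width(): step over one UTF-8 sequence
 * at *s and return its width (always 1 in this sketch). On an invalid
 * sequence, set *s to NULL and return 0.
 */
static int toy_utf8_advance(const char **s)
{
	const unsigned char *p = (const unsigned char *)*s;
	int extra, i;

	if (p[0] < 0x80)
		extra = 0;		/* ASCII */
	else if ((p[0] & 0xe0) == 0xc0)
		extra = 1;		/* 2-byte sequence */
	else if ((p[0] & 0xf0) == 0xe0)
		extra = 2;		/* 3-byte sequence */
	else if ((p[0] & 0xf8) == 0xf0)
		extra = 3;		/* 4-byte sequence */
	else {
		*s = NULL;		/* stray continuation or bad lead byte */
		return 0;
	}
	/* A NUL terminator fails this check, so we never read past it. */
	for (i = 1; i <= extra; i++) {
		if ((p[i] & 0xc0) != 0x80) {
			*s = NULL;
			return 0;
		}
	}
	*s += extra + 1;
	return 1;
}

/*
 * Count columns in text, optimistically assuming UTF-8. If an invalid
 * sequence is hit, discard the partial result and start over, counting
 * one column per byte. *used_utf8 reports which mode produced the result.
 */
static size_t toy_count_columns(const char *text, int *used_utf8)
{
	const char *start = text;
	int assume_utf8 = 1;
	size_t w;

retry:
	w = 0;
	while (*text) {
		if (assume_utf8) {
			w += toy_utf8_advance(&text);
			if (!text) {
				/* Speculation failed: rewind and retry bytewise. */
				assume_utf8 = 0;
				text = start;
				goto retry;
			}
		} else {
			w++;
			text++;
		}
	}
	*used_utf8 = assume_utf8;
	return w;
}
```

Valid input is scanned only once, which is where the saving comes from; invalid input pays for the aborted first pass plus a full second pass. The sketch only has a counter to reset. The actual patch must also undo any output already appended to the strbuf, which it does by remembering the original length and truncating back to it before retrying.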
The following command was run against the Linux kernel repo with
git 1.7.0:

    $ time git log --format='%b' v2.6.32 >/dev/null

    real    0m2.679s
    user    0m2.580s
    sys     0m0.100s

    $ time git log --format='%w(60,4,8)%b' >/dev/null

    real    0m4.342s
    user    0m4.230s
    sys     0m0.110s

And with this patch series:

    $ time git log --format='%w(60,4,8)%b' >/dev/null

    real    0m3.741s
    user    0m3.630s
    sys     0m0.110s

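So the cost of wrapping is reduced to roughly 64% in this case: taking
the first run as the unwrapped baseline, the overhead in real time drops
from about 1.66s to about 1.06s.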
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
-rw-r--r-- | utf8.c | 23 |
1 files changed, 17 insertions, 6 deletions
@@ -324,16 +324,21 @@ static size_t display_mode_esc_sequence_len(const char *s)
  * consumed (and no extra indent is necessary for the first line).
  */
 int strbuf_add_wrapped_text(struct strbuf *buf,
-		const char *text, int indent, int indent2, int width)
+		const char *text, int indent1, int indent2, int width)
 {
-	int w = indent, assume_utf8 = is_utf8(text);
-	const char *bol = text, *space = NULL;
+	int indent, w, assume_utf8 = 1;
+	const char *bol, *space, *start = text;
+	size_t orig_len = buf->len;
 
 	if (width <= 0) {
-		strbuf_add_indented_text(buf, text, indent, indent2);
+		strbuf_add_indented_text(buf, text, indent1, indent2);
 		return 1;
 	}
 
+retry:
+	bol = text;
+	w = indent = indent1;
+	space = NULL;
 	if (indent < 0) {
 		w = -indent;
 		space = text;
@@ -385,9 +390,15 @@ new_line:
 			}
 			continue;
 		}
-		if (assume_utf8)
+		if (assume_utf8) {
 			w += utf8_width(&text, NULL);
-		else {
+			if (!text) {
+				assume_utf8 = 0;
+				text = start;
+				strbuf_setlen(buf, orig_len);
+				goto retry;
+			}
+		} else {
 			w++;
 			text++;
 		}