dev-python/icalendar/files/2.1_p20100409/05_all_utf8-multi-octet-fix.patch


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

From http://codespeak.net/pipermail/icalendar-dev/2010-April/000152.html:

Lines get folded in the middle of multi-octet sequences (checked out
code from svn today). Consider this case:
import icalendar
ical = icalendar.Calendar()
ical.add('summary', u'a' + u'ą'*100)
ical.as_string().decode('utf-8')
...
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 90-91:
invalid data

I have attached a diff of a simple one-line fix.

As I see in the code you actually try not to split a multi-octet
character but you don't recalculate the slice after finding the new
end position. Could you confirm this?

Submitted by Rimvydas Naktinis.
===================================================================
--- src/icalendar/parser.py (revision 73587)
+++ src/icalendar/parser.py (working copy)
@@ -456,6 +456,7 @@
                     else:
                         end -= 1
 
+            slice = self[start:end]
             new_lines.append(slice)
             if end == l_line:
                 # Done