Commits (1)
Andrew Hodgkinson authored
Fix inconsistency in handling illegal byte sequences.

Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD. Also, a few extra mappings.

Detail:

enc_utf8.c: 0x80 is a continuation byte. Map stray ones to U+FFFD. Reset the count of expected continuation bytes to 0 when encountering illegal byte sequences. Previously, if the character callback returned non-zero, this count would not be reset, thus leaving the codec in an inconsistent state. Additionally, we no longer consume the illegal continuation byte: instead, we process it as a start byte next time round.

encoding.c: Do not load extension tables for ISO-8859-{1,2,9,10,15,16}. If these are needed, it's probably best that different charset names are used rather than overloading 8859-n.

iso2022.c: Permit SS2/3 escape sequences for EUC encode/decode. Disable C1 chara...
69a25816
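
The enc_utf8.c behaviour described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: the names utf8_state, utf8_decode, utf8_sink and utf8_collect are hypothetical, and the overlong-form and above-U+10FFFF checks are an extra assumption that the message does not mention. It shows the three points the message makes: stray continuation bytes (0x80 included) map to U+FFFD, the expected-continuation count is reset before the callback runs, and the byte that broke a sequence is left unconsumed so it is retried as a start byte.

```c
#include <stddef.h>
#include <stdint.h>

#define UTF8_REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; not the structure used in enc_utf8.c. */
typedef struct {
    uint32_t cp;     /* code point accumulated so far */
    uint32_t min;    /* smallest value legal for this sequence length */
    int      expect; /* continuation bytes still expected */
} utf8_state;

/* Character callback: a non-zero return asks the decoder to stop. */
typedef int (*utf8_emit_fn)(void *handle, uint32_t cp);

/* Decode up to len bytes and return how many were consumed. */
size_t utf8_decode(utf8_state *s, const unsigned char *buf, size_t len,
                   utf8_emit_fn emit, void *handle)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        uint32_t out;

        if (s->expect > 0) {
            if ((c & 0xC0) == 0x80) {
                s->cp = (s->cp << 6) | (uint32_t)(c & 0x3F);
                i++;
                if (--s->expect > 0)
                    continue;
                out = s->cp;
                /* Surrogates, U+FFFE and U+FFFF become U+FFFD (per the
                 * commit message); the overlong and range checks are an
                 * assumption added for this sketch. */
                if (out < s->min || out > 0x10FFFFu ||
                    (out >= 0xD800u && out <= 0xDFFFu) ||
                    out == 0xFFFEu || out == 0xFFFFu)
                    out = UTF8_REPLACEMENT;
            } else {
                /* Illegal sequence: reset the count BEFORE the callback
                 * runs and do not consume c, so it is processed as a
                 * start byte on the next iteration or the next call. */
                s->expect = 0;
                out = UTF8_REPLACEMENT;
            }
        } else if (c < 0x80) {
            out = c;                 /* ASCII */
            i++;
        } else if (c < 0xC0) {
            out = UTF8_REPLACEMENT;  /* stray continuation byte, 0x80..0xBF */
            i++;
        } else if (c < 0xE0) {
            s->cp = c & 0x1Fu; s->min = 0x80u;    s->expect = 1; i++;
            continue;
        } else if (c < 0xF0) {
            s->cp = c & 0x0Fu; s->min = 0x800u;   s->expect = 2; i++;
            continue;
        } else if (c < 0xF8) {
            s->cp = c & 0x07u; s->min = 0x10000u; s->expect = 3; i++;
            continue;
        } else {
            out = UTF8_REPLACEMENT;  /* 0xF8..0xFF never appear in UTF-8 */
            i++;
        }

        if (emit(handle, out))
            break;  /* state is already consistent at this point */
    }
    return i;
}

/* Hypothetical callback that collects code points into a small array. */
typedef struct { uint32_t cps[16]; size_t n; } utf8_sink;

int utf8_collect(void *handle, uint32_t cp)
{
    utf8_sink *sink = handle;
    if (sink->n >= 16)
        return 1;  /* full: ask the decoder to stop */
    sink->cps[sink->n++] = cp;
    return 0;
}
```

Because the count is cleared before the callback is invoked, a callback that returns non-zero on the replacement character leaves the state ready for the next call, which then starts at the unconsumed byte.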