Fix bugs and inconsistencies in encoding handlers.
Fix inconsistency in handling illegal byte sequences.
Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD.
Also, a few extra mappings.

Detail:
  enc_utf8.c:
    0x80 is a continuation byte. Map stray ones to U+FFFD.
    Reset the count of expected continuation bytes to 0 when encountering
    illegal byte sequences. Previously, if the character callback returned
    non-zero, this count would not be reset, thus leaving the codec in an
    inconsistent state.
    Additionally, we no longer consume the illegal continuation byte:
    instead, we process it as a start byte next time round.
  encoding.c:
    Do not load extension tables for ISO-8859-{1,2,9,10,15,16}.
    If these are needed, it's probably best that different charset names
    are used rather than overloading 8859-n.
  iso2022.c:
    Permit SS2/3 escape sequences for EUC encode/decode.
    Disable C1 characters for EUC encode/decode.
    Fix G94x94 read function to handle GR 0xA0/0xFF correctly.
    Fix writing of C1 controls for 8859-n.
    Prevent dereference of NULL pointer when scanning tables.
  iso6937.c:
    Replace C99 loop iterators with C89 friendly versions.
  johab.c:
    Fix final_only lookup table to have entries in the right place.
    Map 0x5C to the Won sign.
    Actually pay attention to encoding_WRITE_STRICT.
  shiftjis.c:
    Map 0x7E to overbar rather than tilde.
  textconv.c:
    Fix static assignment of stdin/stdout.
  unix.c:
    Perform wildcard lookup of mapping tables.
  ccsolaris/Makefile:
    Modify for use with GCCSDK.

Admin:
  Tested with the Iconv module testsuite.
  Author: John-Mark Bell

Version 0.56. Tagged as 'Unicode-0_56'
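
The enc_utf8.c behaviour described above could look roughly like the following. This is a minimal sketch only, not the code from enc_utf8.c: the names utf8_sketch, sanitise and utf8_sketch_decode are hypothetical, the real codec reports characters through a callback rather than an output array, and checks this message does not mention (overlong forms, values above U+10FFFF) are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; not the structure used in enc_utf8.c. */
typedef struct {
    uint32_t current;   /* code point being assembled */
    int remaining;      /* continuation bytes still expected */
} utf8_sketch;

/* Map surrogates and U+FFFE/U+FFFF to U+FFFD. */
static uint32_t sanitise(uint32_t c)
{
    if ((c >= 0xD800 && c <= 0xDFFF) || c == 0xFFFE || c == 0xFFFF)
        return REPLACEMENT;
    return c;
}

/* Decode n bytes from in, writing at most max code points to out.
 * Returns the number of code points written. Overlong forms and values
 * above U+10FFFF are not checked in this sketch. */
static size_t utf8_sketch_decode(utf8_sketch *s, const unsigned char *in,
                                 size_t n, uint32_t *out, size_t max)
{
    size_t i = 0, w = 0;

    while (i < n && w < max) {
        unsigned char b = in[i];

        if (s->remaining > 0) {
            if ((b & 0xC0) == 0x80) {
                /* Valid continuation byte: consume it. */
                s->current = (s->current << 6) | (uint32_t)(b & 0x3F);
                i++;
                if (--s->remaining == 0)
                    out[w++] = sanitise(s->current);
            } else {
                /* Illegal sequence: reset the count before emitting, and
                 * leave the byte unconsumed so the next iteration treats
                 * it as a start byte. */
                s->remaining = 0;
                out[w++] = REPLACEMENT;
            }
            continue;
        }

        i++;
        if (b < 0x80) {
            out[w++] = b;                      /* ASCII */
        } else if (b < 0xC0) {
            out[w++] = REPLACEMENT;            /* stray continuation, 0x80..0xBF */
        } else if (b < 0xE0) {
            s->current = b & 0x1Fu; s->remaining = 1;
        } else if (b < 0xF0) {
            s->current = b & 0x0Fu; s->remaining = 2;
        } else if (b < 0xF8) {
            s->current = b & 0x07u; s->remaining = 3;
        } else {
            out[w++] = REPLACEMENT;            /* 0xF8..0xFF never start a sequence */
        }
    }
    return w;
}
```

In this sketch the count is reset before the replacement character is emitted, so the state stays consistent whatever happens to the emitted character afterwards. Feeding it the bytes 0xE2 0x41 gives U+FFFD followed by U+0041, because the 0x41 is not swallowed along with the broken sequence.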