- 05 Dec, 2008 1 commit
-
-
Andrew Hodgkinson authored
Fix inconsistency in handling illegal byte sequences. Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD. Also, a few extra mappings. Detail: enc_utf8.c: 0x80 is a continuation byte. Map stray ones to U+FFFD. Reset the count of expected continuation bytes to 0 when encountering illegal byte sequences. Previously, if the character callback returned non-zero, this count would not be reset, thus leaving the codec in an inconsistent state. Additionally, we no longer consume the illegal continuation byte: instead, we process it as a start byte next time round. encoding.c: Do not load extension tables for ISO-8859-{1,2,9,10,15,16}. If these are needed, it's probably best that different charset names are used rather than overloading 8859-n. iso2022.c: Permit SS2/3 escape sequences for EUC encode/decode. Disable C1 chara...
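The enc_utf8.c behaviour described above can be illustrated with a minimal sketch. This is not the library's actual code: the names `utf8_sketch`, `utf8_sketch_read`, `ucs_callback`, `collector` and `collect` are hypothetical, and the sketch assumes a streaming decoder that reports each character through a callback. It does not check for overlong forms or values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical decoder state, not the library's own structure. */
typedef struct {
    uint32_t acc;      /* code point being assembled */
    int      expected; /* continuation bytes still expected */
} utf8_sketch;

/* Hypothetical character callback; a non-zero return stops decoding. */
typedef int (*ucs_callback)(void *handle, uint32_t ucs);

/* Surrogates, U+FFFE and U+FFFF are converted to U+FFFD. */
static uint32_t sanitise(uint32_t c)
{
    if ((c >= 0xD800u && c <= 0xDFFFu) || c == 0xFFFEu || c == 0xFFFFu)
        return REPLACEMENT;
    return c;
}

/* Decodes up to n bytes and returns the number of bytes consumed. */
size_t utf8_sketch_read(utf8_sketch *d, const unsigned char *s, size_t n,
                        ucs_callback cb, void *handle)
{
    size_t i = 0;
    assert(d != NULL && cb != NULL);
    while (i < n) {
        unsigned char b = s[i];
        uint32_t out;
        if (d->expected > 0) {
            if ((b & 0xC0u) == 0x80u) {
                d->acc = (d->acc << 6) | (b & 0x3Fu);
                i++;
                d->expected--;
                if (d->expected > 0)
                    continue;
                out = sanitise(d->acc);
            } else {
                /* Illegal sequence: reset the count BEFORE the callback,
                 * and do not consume b, so that it is processed as a
                 * start byte next time round. */
                d->expected = 0;
                out = REPLACEMENT;
            }
        } else {
            i++;
            if (b < 0x80u) {
                out = b;
            } else if ((b & 0xE0u) == 0xC0u) {
                d->acc = b & 0x1Fu; d->expected = 1; continue;
            } else if ((b & 0xF0u) == 0xE0u) {
                d->acc = b & 0x0Fu; d->expected = 2; continue;
            } else if ((b & 0xF8u) == 0xF0u) {
                d->acc = b & 0x07u; d->expected = 3; continue;
            } else {
                /* Stray continuation byte (0x80..0xBF) or 0xF8..0xFF. */
                out = REPLACEMENT;
            }
        }
        if (cb(handle, out))
            break; /* state is already consistent at this point */
    }
    return i;
}

/* Example callback that stores the decoded characters. */
typedef struct { uint32_t out[16]; size_t len; } collector;

int collect(void *handle, uint32_t ucs)
{
    collector *c = handle;
    if (c->len == 16)
        return 1; /* buffer full: ask the decoder to stop */
    c->out[c->len++] = ucs;
    return 0;
}
```

Because the expected-byte count is cleared before the callback runs, a callback that stops decoding part-way through cannot leave the state half-way through a sequence, which is the inconsistency the commit message describes.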
-
- 10 Jun, 2002 1 commit
-
-
Stewart Brodie authored
Detail: Buffer overrun fixed; some buffer counting problems fixed too. There are now helpful initialisation and tidyup routines you can call too (called encoding_initialise and encoding_tidyup). Admin: I've built this with cc 5.45 in basic build environment - it built OK. This source code now matches that in NCBrowser 5.28. Version 0.47. Tagged as 'Unicode-0_47'
-
- 13 Oct, 2000 1 commit
-
-
John Beranek authored
Detail: Added some changes from Unicode lib in branched tree. All basically type changes. This appears to be because other compilers are more picky about types than armcc. Admin: Will add 0.46 VersionNum file into branched tree, and all will be synchronised fully. Version 0.46. Tagged as 'Unicode-0_46'
-
- 05 Oct, 2000 1 commit
-
-
John Beranek authored
Detail: Copyright messages changed from E-14 to Pace throughout, filename placed at top of file throughout, instead of in just some files. Merged branch's fixes into our code base, plus made it possible to get nice debug output in branched tree, and vfprintf() to stderr in RISC OS tree. Exactly the same source is used in branched tree now (apart from OS specific files riscos.c and unix.c moving into layers directory structure). Admin: Built for branched, both Unix and RISC OS. Built in RISC OS tree, and compiled into TextConv. Version 0.45. Tagged as 'Unicode-0_45'
-
- 12 Mar, 1999 1 commit
-
-
Kevin Bracey authored
x-Current encoding didn't work if International 1.50 wasn't loaded. Adjusted various ISO 2022 escape sequence tables to change prioritisation. ISO 2022 writer won't shift character set until required. Version 0.35. Tagged as 'Unicode-0_35'
-
- 11 Mar, 1999 1 commit
-
-
Kevin Bracey authored
Added encoding_set_flags(). Proper handling of byte order marks in UTF-16 and UCS-4. Fixed UTF-16 surrogate writing. Adjusted various MIME charset identifiers. Incorporated latest Unicode Character Database (2.1.8). Added "current system alphabet" encoding. Created "TextConv" command line character set conversion utility. Version 0.34. Tagged as 'Unicode-0_34'
-
- 24 Feb, 1999 1 commit
-
-
Simon Middleton authored
Version 0.33. Not tagged
-
- 16 Nov, 1998 1 commit
-
-
Simon Middleton authored
Updated all the writers to ignore the NULL_UCS4 character (as had been previously added to the iso2022_escapes case). Any new writers should flush any pending characters they may have at this point. Also updated enc_UCS4.c and utf8.c to turn all illegal characters (top bit set) into FFFD. Version 0.28. Tagged as 'Unicode-0_28'
-
- 21 Nov, 1997 1 commit
-
-
Simon Middleton authored
Added a default language field to each encoding (using above codes). Added a max char size field to each encoding. Tidied up some of the reencoders' behaviour when the output ptr is NULL. Fixed a load of charset numbers which were wrong. New UTF8 function to skip multiple characters in a string. Fixed RISC OS build which was out of date. Version 0.04. Tagged as 'Unicode-0_04'
-
- 11 Nov, 1997 1 commit
-
-
Simon Middleton authored
Version 0.01. Not tagged
-