- 05 Dec, 2008 1 commit
-
-
Andrew Hodgkinson authored
Fix inconsistency in handling illegal byte sequences. Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD. Also, a few extra mappings. Detail: enc_utf8.c: 0x80 is a continuation byte. Map stray ones to U+FFFD. Reset the count of expected continuation bytes to 0 when encountering illegal byte sequences. Previously, if the character callback returned non-zero, this count would not be reset, thus leaving the codec in an inconsistent state. Additionally, we no longer consume the illegal continuation byte: instead, we process it as a start byte next time round. encoding.c: Do not load extension tables for ISO-8859-{1,2,9,10,15,16}. If these are needed, it's probably best that different charset names are used rather than overloading 8859-n. iso2022.c: Permit SS2/3 escape sequences for EUC encode/decode. Disable C1 chara...
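The enc_utf8.c behaviour described in this entry can be illustrated roughly as follows. This is a minimal sketch, not the library's code: `utf8_state`, `utf8_callback`, `utf8_decode`, `sanitise` and the `collector` helper are all hypothetical names, and the overlong-form and above-U+10FFFF checks are extra assumptions that the commit message does not mention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; not the library's actual structure. */
typedef struct {
    uint32_t acc;      /* code point accumulated so far */
    uint32_t min;      /* smallest legal value for this sequence length */
    int      expected; /* continuation bytes still expected */
} utf8_state;

/* Hypothetical character callback: non-zero asks the decoder to stop. */
typedef int (*utf8_callback)(void *handle, uint32_t ucs);

/* Map surrogates, U+FFFE and U+FFFF to U+FFFD (as the commit describes).
 * Rejecting values above U+10FFFF is an assumption of this sketch. */
static uint32_t sanitise(uint32_t c)
{
    if (c >= 0xD800 && c <= 0xDFFF) return REPLACEMENT;
    if (c == 0xFFFE || c == 0xFFFF) return REPLACEMENT;
    if (c > 0x10FFFF) return REPLACEMENT;
    return c;
}

/* Decode len bytes from src; returns the number of bytes consumed. */
size_t utf8_decode(utf8_state *st, const unsigned char *src, size_t len,
                   utf8_callback cb, void *handle)
{
    size_t i = 0;
    assert(st != NULL && cb != NULL);

    while (i < len) {
        unsigned char b = src[i];
        uint32_t out = 0;
        int emit = 0;

        if (st->expected > 0) {
            if ((b & 0xC0) == 0x80) {
                st->acc = (st->acc << 6) | (b & 0x3Fu);
                i++;
                if (--st->expected == 0) {
                    /* Overlong forms are rejected (assumption). */
                    out = st->acc < st->min ? REPLACEMENT : sanitise(st->acc);
                    emit = 1;
                }
            } else {
                /* Illegal sequence: reset the count BEFORE calling back,
                 * and do not consume b, so that the next iteration
                 * processes it as a start byte. */
                st->expected = 0;
                out = REPLACEMENT;
                emit = 1;
            }
        } else if (b < 0x80) {
            out = b; emit = 1; i++;
        } else if ((b & 0xC0) == 0x80) {
            /* Stray continuation byte, 0x80 included. */
            out = REPLACEMENT; emit = 1; i++;
        } else if ((b & 0xE0) == 0xC0) {
            st->acc = b & 0x1Fu; st->min = 0x80; st->expected = 1; i++;
        } else if ((b & 0xF0) == 0xE0) {
            st->acc = b & 0x0Fu; st->min = 0x800; st->expected = 2; i++;
        } else if ((b & 0xF8) == 0xF0) {
            st->acc = b & 0x07u; st->min = 0x10000; st->expected = 3; i++;
        } else {
            /* 0xF8..0xFF can never start a sequence in this sketch. */
            out = REPLACEMENT; emit = 1; i++;
        }

        if (emit && cb(handle, out) != 0)
            break; /* state is already consistent at this point */
    }
    return i;
}

/* Example callback: append each code point to a small fixed buffer. */
typedef struct { uint32_t cp[16]; size_t n; } collector;

static int collect(void *handle, uint32_t ucs)
{
    collector *c = handle;
    if (c->n >= 16) return 1;
    c->cp[c->n++] = ucs;
    return 0;
}
```

Because `expected` is cleared before the callback runs, a callback that returns non-zero cannot leave the decoder waiting for continuation bytes that belong to an abandoned sequence.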
-
- 26 Aug, 2005 1 commit
-
-
Kevin Bracey authored
Version 0.55. Tagged as 'Unicode-0_55'
-
- 25 Aug, 2005 1 commit
-
-
Kevin Bracey authored
(This isn't integrated with ISO 2022 processing though - it's standalone). * Added a Dstroke -> Eth second-attempt conversion in various write routines, primarily for ISO 6937 -> Latin1 conversion (ISO 6937 unifies them). Version 0.54. Tagged as 'Unicode-0_54'
-
- 01 Jul, 2004 1 commit
-
-
Steve Revill authored
Detail: Builds on 32-bit machine even with 26-bit environment. Fixed c.encoding so that it builds with newer tools. Admin: Works in Baseline 500 build. Version 0.53. Tagged as 'Unicode-0_53'
-
- 05 Mar, 2004 1 commit
-
-
Steve Revill authored
Summary: Merged changes from branch tree. Reversed previous change. Detail: * Merged a few changes/fixes from the Unicode library in branch's tree. * Reversed Steve's change from version 0.50. The change wasn't necessary, and with the changed definition of NOT_USED in this version, it compiles fine with cc 5.45. * Small comment change in unix.c. It now states that the file isn't equivalent to any in the branch tree. Admin: Built and briefly tested using TextConv utility on Risc PC. Version 0.52. Tagged as 'Unicode-0_52'
-
- 23 Jul, 2002 1 commit
-
-
Steve Revill authored
Detail: This version now builds with cc-5_45. Note: it has not been verified as actually functioning correctly. Admin: Tested in DSL Baseline build. Version 0.50. Tagged as 'Unicode-0_50'
-
- 10 Jun, 2002 2 commits
-
-
Stewart Brodie authored
Fixed a comparison of a plain char (signedness issue). Admin: These were from NCBrowser 5.28 too, but were forgotten in the last check-in :-( I've not tried using this library. Version 0.48. Tagged as 'Unicode-0_48'
-
Stewart Brodie authored
Detail: Buffer overrun fixed; some buffer counting problems fixed too. There are now helpful initialisation and tidy-up routines you can call too (encoding_initialise and encoding_tidyup). Admin: I've built this with cc 5.45 in basic build environment - it built OK. This source code now matches that in NCBrowser 5.28. Version 0.47. Tagged as 'Unicode-0_47'
-
- 13 Oct, 2000 1 commit
-
-
John Beranek authored
Detail: Added some changes from Unicode lib in branched tree. All basically type changes. This appears to be because other compilers are more picky about types than armcc. Admin: Will add 0.46 VersionNum file into branched tree, and all will be synchronised fully. Version 0.46. Tagged as 'Unicode-0_46'
-
- 05 Oct, 2000 1 commit
-
-
John Beranek authored
Detail: Copyright messages changed from E-14 to Pace throughout, filename placed at top of file throughout, instead of in just some files. Merged branch's fixes into our code base, plus made it possible to get nice debug output in branched tree, and vfprintf() to stderr in RISC OS tree. Exactly the same source is used in branched tree now (apart from OS specific files riscos.c and unix.c moving into layers directory structure). Admin: Built for branched, both Unix and RISC OS. Built in RISC OS tree, and compiled into TextConv. Version 0.45. Tagged as 'Unicode-0_45'
-
- 16 Sep, 1999 1 commit
-
-
Kevin Bracey authored
ISO 8859-8 is now ISO-IR 198 (05/14). Version 0.43. Tagged as 'Unicode-0_43'
-
- 14 Sep, 1999 1 commit
-
-
Kevin Bracey authored
Improved handling of SIP ideographs. Added ISO-8859-11 (csISOLatinThai). Renamed Latin13 to Latin7. Version 0.42. Tagged as 'Unicode-0_42'
-
- 13 Sep, 1999 1 commit
-
-
Kevin Bracey authored
UTF-8 encoder handles out-of-space conditions correctly. ISO 2022 encoder/decoder doesn't try to load table 7E (the null table). encoding_new() does identify a null MIME string with auto-detect Japanese. UnicodeData 3.0.0 imported. Version 0.41. Tagged as 'Unicode-0_41'
-
- 04 Aug, 1999 1 commit
-
-
Kevin Bracey authored
Changed default language of Latin-5 (ISO 8859-9) from English to Turkish. Version 0.40. Tagged as 'Unicode-0_40'
-
- 26 Mar, 1999 1 commit
-
-
Simon Middleton authored
Modified encoding.c so that Chinese encodings use the correct country code as a secondary tag to the language code so that we can distinguish Chinese Simplified and Traditional. Version 0.39. Tagged as 'Unicode-0_39'
-
- 23 Mar, 1999 1 commit
-
-
Simon Middleton authored
Fixed encoding_table_remove_unused(), which totally failed to work correctly and would most likely crash as soon as it tried to free any tables. Verified that fixed version does work within branched. Version 0.38. Tagged as 'Unicode-0_38'
-
- 18 Mar, 1999 1 commit
-
-
Simon Middleton authored
Fixed encoding_new() so that it returns NULL if an encoding is chosen that does not have an encoding structure with it, e.g. encoding 0 or AutoDetectJP. Version 0.37. Tagged as 'Unicode-0_37'
-
- 12 Mar, 1999 2 commits
-
-
Simon Middleton authored
Changed encoding_table_remove_unused() so that it takes a parameter giving the depth from which to start purging. Fixed ISO2022 write code to free search tables. Added unix.c for unix-targeted builds. Updated cross-compile build. Added unix-targeted build of library and textconv tool in ccsolaris directory. Version 0.36. Tagged as 'Unicode-0_36'
-
Kevin Bracey authored
x-Current encoding didn't work if International 1.50 wasn't loaded. Adjusted various ISO 2022 escape sequence tables to change prioritisation. ISO 2022 writer won't shift character set until required. Version 0.35. Tagged as 'Unicode-0_35'
-
- 11 Mar, 1999 1 commit
-
-
Kevin Bracey authored
Added encoding_set_flags(). Proper handling of byte order marks in UTF-16 and UCS-4. Fixed UTF-16 surrogate writing. Adjusted various MIME charset identifiers. Incorporated latest Unicode Character Database (2.1.8). Added "current system alphabet" encoding. Created "TextConv" command line character set conversion utility. Version 0.34. Tagged as 'Unicode-0_34'
-
- 24 Feb, 1999 2 commits
-
-
Simon Middleton authored
Version 0.33. Not tagged
-
Simon Middleton authored
Created new file riscos.c for RISC OS specific functions. Rest of library should remain portable. Moved function to load a map file into that new file. Added #defines for directory separator and wild card characters and updated the various file names. Version 0.33. Tagged as 'Unicode-0_33'
-
- 23 Feb, 1999 2 commits
-
-
Kevin Bracey authored
Reinstated use of data->data relocations. Version 0.32. Not tagged
-
Kevin Bracey authored
DOS code page 866 (Russian) added. Version 0.32. Tagged as 'Unicode-0_32'
-
- 05 Jan, 1999 1 commit
-
-
Simon Middleton authored
Version 0.30. Tagged as 'Unicode-0_30'
-
- 16 Nov, 1998 1 commit
-
-
Simon Middleton authored
Updated all the writers to ignore the NULL_UCS4 character (as had previously been added to the iso2022_escapes case). Any new writers should flush any pending characters they may have at this point. Also updated enc_UCS4.c and utf8.c to turn all illegal characters (top bit set) into FFFD. Version 0.28. Tagged as 'Unicode-0_28'
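For the UCS-4 side of this change, the "top bit set" rule might look something like the sketch below. The function name `ucs4_read_be` is hypothetical, and the big-endian byte order is an assumption made only for illustration; the commit message says nothing about how enc_UCS4.c actually reads its input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian UCS-4 value from four bytes (byte order is an
 * assumption of this sketch). Any value with bit 31 set is treated as
 * illegal and replaced with U+FFFD. */
uint32_t ucs4_read_be(const unsigned char *p)
{
    uint32_t c;
    assert(p != NULL);

    c = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];

    return (c & 0x80000000u) ? 0xFFFDu : c;
}
```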
-
- 06 Nov, 1998 1 commit
-
-
Simon Middleton authored
Added new function encoding_default_mime_type() which given an encoding number returns the first mime type from the matching entry in the table. Version 0.27. Tagged as 'Unicode-0_27'
-
- 15 Sep, 1998 1 commit
-
-
Kevin Bracey authored
Version 0.19. Tagged as 'Unicode-0_19'
-
- 10 Sep, 1998 1 commit
-
-
Andrew Hodgkinson authored
Greek and Hebrew. ISO-8859-.. added for Celtic, which is renamed to csISOLatin8 in the header file from csCeltic; csISOLatin9 added (ISO-IR-203); csSami ISO-8859-15 MIME type form removed to not clash with csISOLatin9 (added to the header, defined as 4007 to follow on from csISOLatin8).
-
- 04 Sep, 1998 1 commit
-
-
Kevin Bracey authored
Version 0.15. Tagged as 'Unicode-0_15'
-
- 06 Mar, 1998 1 commit
-
-
Kevin Bracey authored
Version 0.14. Tagged as 'Unicode-0_14'
-
- 05 Jan, 1998 1 commit
-
-
Simon Middleton authored
Fixed autojp state machine. It wasn't resetting 'state' to HAD_NONE after changing whatcode, so basically it was lucky it ever worked. Also rewrote the various range tests to only use one compare per case. Changed the 'for_encoding' parameter to encoding_write() to an enumeration. Added a new type of writing where, if the character cannot be encoded, the function returns -1 rather than writing a default character. Added the pseudo-charsets csAutodetectJP and csEUCorShiftJIS to the encoding table so that they return the correct default language (ja). Added function to remove unused encoding tables (must be called explicitly). Fixed usage counting in iso2022 (I think). When looking up an encoding name, try stripping 'x-' and 'X-' off the front if it can't be found on the first pass. Version 0.12. Tagged as 'Unicode-0_12'
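The two-pass name lookup mentioned at the end of this entry could be sketched as below. Everything here is hypothetical: the `charset_entry` type, the table contents, the numeric identifiers and the function names are illustrative only, and the case-insensitive comparison is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical table entry; the identifiers are illustrative values. */
typedef struct { const char *name; int id; } charset_entry;

static const charset_entry charset_table[] = {
    { "euc-jp",    1 },
    { "shift_jis", 2 },
    { "x-sjis",    2 },  /* a name that really is registered with "x-" */
};

/* Case-insensitive comparison (an assumption of this sketch). */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings ended together */
}

static int find_exact(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof charset_table / sizeof charset_table[0]; i++)
        if (names_equal(name, charset_table[i].name))
            return charset_table[i].id;
    return -1;
}

/* First pass: look the name up as given. Second pass: if that failed
 * and the name starts with "x-" or "X-", retry without the prefix. */
int lookup_charset(const char *name)
{
    int id;
    assert(name != NULL);

    id = find_exact(name);
    if (id < 0 && (name[0] == 'x' || name[0] == 'X') && name[1] == '-')
        id = find_exact(name + 2);
    return id;
}
```

Trying the full name first means that entries which genuinely carry an "x-" prefix still match directly, and the prefix is only stripped as a fallback.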
-
- 18 Dec, 1997 1 commit
-
-
Simon Middleton authored
It also now assumes that the first write encoding is already set up. Version 0.10. Tagged as 'Unicode-0_10'
-
- 10 Dec, 1997 1 commit
-
-
Simon Middleton authored
Version 0.09. Tagged as 'Unicode-0_09'
-
- 08 Dec, 1997 1 commit
-
-
Simon Middleton authored
Fixed the case where SS1 or SS2 is followed by a set change, by disallowing control characters after single shifts. Made encoding_table_ptr and encoding_n_table_entries check for null tables. Moved 'Lm' type characters from marks to letters in mkunictype. Version 0.08. Tagged as 'Unicode-0_08'
-
- 02 Dec, 1997 1 commit
-
-
Simon Middleton authored
Acorn Latin1 encoding using fuzzy mapping to get the greatest number of displayable characters. Reads as Acorn.Latin1. Version 0.07. Tagged as 'Unicode-0_07'
-
- 21 Nov, 1997 1 commit
-
-
Simon Middleton authored
Added a default language field to each encoding (using above codes). Added a max char size field to each encoding. Tidied up some of the reencoders' behaviour when the output pointer is NULL. Fixed a load of charset numbers which were wrong. New UTF8 function to skip multiple characters in a string. Fixed RISC OS build which was out of date. Version 0.04. Tagged as 'Unicode-0_04'
-
- 12 Nov, 1997 1 commit
-
-
Simon Middleton authored
Made all tables be on linked list to avoid static copies of pointers. Removed redundant 8bit files. Version 0.03. Tagged as 'Unicode-0_03'
-
- 11 Nov, 1997 2 commits
-
-
Simon Middleton authored
Unicode:Encodings directly. Version 0.02. Tagged as 'Unicode-0_02'
-
Simon Middleton authored
Version 0.01. Not tagged
-