Commits · 69a258163d4d80511e7febf43ebeba39b48f4ae8 · RiscOS / Sources / Lib / UnicodeLib

05 Dec, 2008 1 commit

Fix bugs and inconsistencies in encoding handlers. · 69a25816

  Fix inconsistency in handling illegal byte sequences.
  Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD.
  Also, a few extra mappings.
Detail:
  enc_utf8.c: 0x80 is a continuation byte. Map stray ones to U+FFFD.
              Reset the count of expected continuation bytes to 0 when
              encountering illegal byte sequences. Previously, if the character
              callback returned non-zero, this count would not be reset, thus
              leaving the codec in an inconsistent state. Additionally, we no
              longer consume the illegal continuation byte: instead, we process
              it as a start byte next time round.
  encoding.c: Do not load extension tables for ISO-8859-{1,2,9,10,15,16}
              If these are needed, it's probably best that different charset
              names are used rather than overloading 8859-n.
  iso2022.c:  Permit SS2/3 escape sequences for EUC encode/decode.
              Disable C1 chara...

69a25816
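
The enc_utf8.c changes described above can be illustrated with a minimal sketch. This is not the library's code: the names `utf8_sketch_state`, `utf8_sketch_decode` and `sanitise` are hypothetical, and overlong forms and values above U+10FFFF are deliberately not checked. It shows only the behaviours the commit message names: stray continuation bytes become U+FFFD, the expected-continuation count is reset before anything else happens on an illegal sequence, the offending byte is reprocessed as a start byte, and surrogates, U+FFFE and U+FFFF are converted to U+FFFD.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; not the structure used in enc_utf8.c. */
typedef struct {
    uint32_t current;   /* code point being assembled */
    int      remaining; /* continuation bytes still expected */
} utf8_sketch_state;

/* Convert surrogates, U+FFFE and U+FFFF to U+FFFD. */
static uint32_t sanitise(uint32_t c)
{
    if (c >= 0xD800u && c <= 0xDFFFu) return REPLACEMENT;
    if (c == 0xFFFEu || c == 0xFFFFu) return REPLACEMENT;
    return c;
}

/* Decode n bytes into out (capacity cap); returns code points written. */
static size_t utf8_sketch_decode(utf8_sketch_state *st,
                                 const unsigned char *src, size_t n,
                                 uint32_t *out, size_t cap)
{
    size_t i = 0, w = 0;

    while (i < n && w < cap) {
        unsigned char b = src[i];

        if (st->remaining > 0) {
            if ((b & 0xC0u) == 0x80u) {
                st->current = (st->current << 6) | (b & 0x3Fu);
                i++;
                if (--st->remaining == 0)
                    out[w++] = sanitise(st->current);
            } else {
                /* Illegal sequence: reset the count first, emit U+FFFD,
                   and leave b unconsumed so it is treated as a start
                   byte on the next iteration. */
                st->remaining = 0;
                out[w++] = REPLACEMENT;
            }
            continue;
        }

        i++;
        if (b < 0x80u) {
            out[w++] = b;
        } else if ((b & 0xC0u) == 0x80u) {
            out[w++] = REPLACEMENT;   /* stray continuation, 0x80 included */
        } else if ((b & 0xE0u) == 0xC0u) {
            st->current = b & 0x1Fu; st->remaining = 1;
        } else if ((b & 0xF0u) == 0xE0u) {
            st->current = b & 0x0Fu; st->remaining = 2;
        } else if ((b & 0xF8u) == 0xF0u) {
            st->current = b & 0x07u; st->remaining = 3;
        } else {
            out[w++] = REPLACEMENT;   /* 0xF8-0xFF never start a sequence */
        }
    }
    return w;
}
```

Resetting `remaining` before emitting the replacement character means the state stays consistent even if a real callback were to abort at that point, which is the inconsistency the commit message describes.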

10 Jun, 2002 1 commit

Merge of bug fixes from NCBrowser tree. · 0524cabb

Stewart Brodie authored 22 years ago

Detail:
  Buffer overrun fixed; some buffer counting problems fixed too.  There are
    now helpful initialisation and tidy-up routines you can call too (called
    encoding_initialise and encoding_tidyup).
Admin:
  I've built this with cc 5.45 in basic build environment - it built OK.
  This source code now matches that in NCBrowser 5.28.


Version 0.47. Tagged as 'Unicode-0_47'

0524cabb

13 Oct, 2000 1 commit

More synchronisation with Unicode lib in branched tree · 4e5abb29

John Beranek authored 24 years ago

Detail:
  Added some changes from Unicode lib in branched tree.  All basically
   type changes.  This appears to be because other compilers are
   more picky about types than armcc.

Admin:
  Will add 0.46 VersionNum file into branched tree, and all will be
   synchronised fully.


Version 0.46. Tagged as 'Unicode-0_46'

4e5abb29

05 Oct, 2000 1 commit

John Beranek authored 24 years ago

Detail:
  Copyright messages changed from E-14 to Pace throughout, filename
   placed at top of file throughout, instead of in just some files.

  Merged branch's fixes into our code base, plus made it possible to
   get nice debug output in branched tree, and vfprintf() to stderr in
   RISC OS tree.  Exactly same source used in branched tree now (apart
   from OS specific files riscos.c and unix.c moving into layers
   directory structure).

Admin:
  Built for branched, both Unix and RISC OS.
  Built in RISC OS tree, and compiled into TextConv.


Version 0.45. Tagged as 'Unicode-0_45'

b5fafb8f

24 Feb, 1999 2 commits

Added copyright messages to all source files and unified the header #define's. · a2254cad

Simon Middleton authored 26 years ago

Version 0.33. Not tagged

a2254cad

Created new file riscos.c for RISC OS specific functions. Rest of library... · ff925330

Simon Middleton authored 26 years ago

Created new file riscos.c for RISC OS specific functions. Rest of library should remain portable. Moved function to load a map file into that new file. Added #defines for directory separator and wild card characters and updated the various file names.

Version 0.33. Tagged as 'Unicode-0_33'

ff925330

16 Nov, 1998 2 commits

Range check the second byte of a double-byte Shift-JIS character. · 25d70444

Kevin Bracey authored 26 years ago

Version 0.29. Tagged as 'Unicode-0_29'

25d70444
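
A range check of this kind might look like the sketch below. The function names are hypothetical and the byte ranges are the usual Shift-JIS / CP932 ones (lead bytes 0x81-0x9F and 0xE0-0xFC, trail bytes 0x40-0x7E and 0x80-0xFC); the commit does not state which bounds the library actually uses.

```c
#include <stdbool.h>

/* First byte of a double-byte character (assumed CP932 ranges). */
static bool sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Second byte of a double-byte character; 0x7F is not a valid trail byte. */
static bool sjis_is_trail(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Returns 2 for a valid double-byte pair, 1 for a single-byte character,
   and 0 when a lead byte is followed by an out-of-range second byte. */
static int sjis_sketch_length(unsigned char first, unsigned char second)
{
    if (!sjis_is_lead(first))
        return 1;
    return sjis_is_trail(second) ? 2 : 0;
}
```

A caller that receives 0 can emit a replacement character and decide whether to re-examine the second byte as the start of a new character.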

Updated all the writers to ignore the NULL_UCS4 character (as had been... · 103112be

Simon Middleton authored 26 years ago

Updated all the writers to ignore the NULL_UCS4 character (as had been previously added to the iso2022_escapes case). Any new writers should flush any pending characters they may have at this point.

Also updated enc_UCS4.c and utf8.c to turn all illegal characters
(top bit set) into FFFD.

Version 0.28. Tagged as 'Unicode-0_28'

103112be

21 Oct, 1998 1 commit

Fixed bug in ShiftJIS:lookup_table replace 'i' with 'u'. · a095c642

Simon Middleton authored 26 years ago

Version 0.25. Tagged as 'Unicode-0_25'

a095c642

16 Oct, 1998 1 commit

Changed Shift-JIS 7-bit set to match MS CP932. It is now ASCII, except that... · d2bd5b44

Kevin Bracey authored 26 years ago

Changed Shift-JIS 7-bit set to match MS CP932. It is now ASCII, except that character &5C is yen instead of backslash.

Version 0.23. Tagged as 'Unicode-0_23'

d2bd5b44
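
As a rough illustration of the mapping this commit message describes, the 7-bit half of the table can be thought of as the identity on ASCII with a single exception at &5C. The function name below is hypothetical and the real library uses lookup tables rather than a function like this.

```c
#include <stdint.h>

/* Map a single-byte value in the 7-bit range to a Unicode code point,
   following the commit message: ASCII except that 0x5C is YEN SIGN.
   Bytes with the top bit set are outside this sketch's scope and are
   returned as U+FFFD. */
static uint32_t sjis_7bit_to_ucs_sketch(unsigned char b)
{
    if (b >= 0x80)
        return 0xFFFDu;
    if (b == 0x5C)
        return 0x00A5u;   /* U+00A5 YEN SIGN instead of backslash */
    return b;
}
```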

25 Sep, 1998 1 commit
Extended Shift-JIS to cover all of Microsoft Code Page 932. · e3435918

Kevin Bracey authored 26 years ago

Version 0.22. Tagged as 'Unicode-0_22'

e3435918

05 Jan, 1998 1 commit

Fixed autojp state machine. It wasn't resetting 'state' to HAD_NONE after... · 407bccff

Simon Middleton authored 27 years ago

Fixed autojp state machine. It wasn't resetting 'state' to HAD_NONE after changing whatcode. So basically it was lucky it ever worked. Also rewrote the various range tests to only use one compare per case.
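
The commit does not say how the range tests were reduced to one compare each. One common C idiom that achieves this, shown here purely as an assumption, relies on unsigned wrap-around: subtracting the lower bound makes any value below the range wrap to a very large number, so a single comparison against the range width covers both ends. The helper name is hypothetical.

```c
#include <stdbool.h>

/* True when lo <= c <= hi, using one comparison.
   Requires lo <= hi; relies on well-defined unsigned wrap-around. */
static bool in_range_sketch(unsigned c, unsigned lo, unsigned hi)
{
    return c - lo <= hi - lo;
}
```

For example, `in_range_sketch(b, 0xA1, 0xDF)` tests whether a byte falls in the Shift-JIS half-width katakana range without a separate lower-bound check.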

Changed the 'for_encoding' parameter to encoding_write() to an enumeration.
Added a new type of writing where if the character cannot be encoded then
the function returns -1 rather than writing a default character.
Added the pseudo-charsets csAutodetectJP and csEUCorShiftJIS to the encoding
table so that they return the correct default language (ja).
Added function to remove unused encoding tables (must be called explicitly).
Fixed usage counting in iso2022 (I think).
When looking up an encoding name, try stripping 'x-' or 'X-' off the front if
it can't be found on the first pass.
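
The two-pass name lookup in the last item could be sketched as follows. The table contents and the names `known_names`, `find_exact` and `find_encoding_sketch` are hypothetical, and the comparison is a plain `strcmp`; the library's real table and matching rules are not shown in this log.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name table; "x-sjis" shows that a name which really
   starts with "x-" is still found on the first pass. */
static const char *const known_names[] = { "shift_jis", "euc-jp", "x-sjis" };

/* Exact match; returns the table index or -1. */
static int find_exact(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof known_names / sizeof known_names[0]; i++)
        if (strcmp(known_names[i], name) == 0)
            return (int)i;
    return -1;
}

/* First pass: the name as given. Second pass, only on failure:
   the name with a leading "x-" or "X-" removed. */
static int find_encoding_sketch(const char *name)
{
    int i = find_exact(name);
    if (i < 0 && (name[0] == 'x' || name[0] == 'X') && name[1] == '-')
        i = find_exact(name + 2);
    return i;
}
```

Trying the unmodified name first keeps registered names that begin with "x-" working, and the fallback only applies to names that would otherwise fail.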

Version 0.12. Tagged as 'Unicode-0_12'

407bccff

12 Nov, 1997 1 commit

Fixed encoding table so that module builds will work. · 1c323496

Simon Middleton authored 27 years ago

Made all tables be on linked list to avoid static copies of pointers.
Removed redundant 8bit files.

Version 0.03. Tagged as 'Unicode-0_03'

1c323496

11 Nov, 1997 1 commit
Initial version checked in · 36e3c744

Simon Middleton authored 27 years ago

Version 0.01. Not tagged

36e3c744