1. 05 Dec, 2008 1 commit
      Fix bugs and inconsistencies in encoding handlers. · 69a25816
      Andrew Hodgkinson authored
        Fix inconsistency in handling illegal byte sequences.
        Convert surrogate codepoints and U+FFFE, U+FFFF to U+FFFD.
        Also, a few extra mappings.
      Detail:
        enc_utf8.c: 0x80 is a continuation byte. Map stray ones to U+FFFD.
                    Reset the count of expected continuation bytes to 0 when
                    encountering illegal byte sequences. Previously, if the character
                    callback returned non-zero, this count would not be reset, thus
                    leaving the codec in an inconsistent state. Additionally, we no
                    longer consume the illegal continuation byte: instead, we process
                    it as a start byte next time round.
        encoding.c: Do not load extension tables for ISO-8859-{1,2,9,10,15,16}.
                    If these are needed, it's probably best that different charset
                    names are used rather than overloading 8859-n.
        iso2022.c:  Permit SS2/3 escape sequences for EUC encode/decode.
                    Disable C1 chara...
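The enc_utf8.c behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in enc_utf8.c: the names `utf8_sketch`, `utf8_emit_fn`, `utf8_sketch_read`, `sanitise`, `collector` and `collect` are all hypothetical, and the overlong-sequence and above-U+10FFFF checks are assumptions added so that the sketch is a reasonable UTF-8 decoder in its own right.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; not the structure used in enc_utf8.c. */
typedef struct {
    uint32_t current;   /* code point being assembled */
    uint32_t min;       /* smallest value legal for this sequence length */
    int      remaining; /* continuation bytes still expected */
} utf8_sketch;

/* Hypothetical callback type: a non-zero return asks the decoder to stop. */
typedef int (*utf8_emit_fn)(void *handle, uint32_t ucs);

/* Map code points that should never be passed on to U+FFFD. */
static uint32_t sanitise(uint32_t c)
{
    if (c - 0xD800u < 0x800u) return REPLACEMENT;         /* surrogates */
    if (c == 0xFFFEu || c == 0xFFFFu) return REPLACEMENT; /* noncharacters */
    if (c > 0x10FFFFu) return REPLACEMENT;                /* assumption: out of range */
    return c;
}

/* Decode up to len bytes; returns the number of bytes consumed. */
static size_t utf8_sketch_read(utf8_sketch *dec, const unsigned char *buf,
                               size_t len, utf8_emit_fn emit, void *handle)
{
    size_t i = 0;
    assert(dec != NULL && emit != NULL);

    while (i < len) {
        unsigned char b = buf[i];
        uint32_t out;

        if (dec->remaining > 0) {
            if ((b & 0xC0u) == 0x80u) {
                /* Valid continuation byte: fold in six more bits. */
                dec->current = (dec->current << 6) | (b & 0x3Fu);
                i++;
                if (--dec->remaining > 0)
                    continue;
                /* Assumption: overlong forms are also replaced. */
                out = dec->current < dec->min ? REPLACEMENT
                                              : sanitise(dec->current);
            } else {
                /* Illegal sequence: reset the count BEFORE calling the
                 * callback, and do not consume b, so that it is treated
                 * as a start byte on the next pass. */
                dec->remaining = 0;
                out = REPLACEMENT;
            }
        } else if (b < 0x80u) {
            out = b;
            i++;
        } else if ((b & 0xE0u) == 0xC0u) {
            dec->current = b & 0x1Fu; dec->min = 0x80u; dec->remaining = 1;
            i++;
            continue;
        } else if ((b & 0xF0u) == 0xE0u) {
            dec->current = b & 0x0Fu; dec->min = 0x800u; dec->remaining = 2;
            i++;
            continue;
        } else if ((b & 0xF8u) == 0xF0u) {
            dec->current = b & 0x07u; dec->min = 0x10000u; dec->remaining = 3;
            i++;
            continue;
        } else {
            /* Stray continuation byte (0x80..0xBF) or invalid start byte. */
            out = REPLACEMENT;
            i++;
        }

        if (emit(handle, out) != 0)
            break;  /* state is already consistent at this point */
    }
    return i;
}

/* Example callback: stores code points and asks to stop at 'limit'. */
typedef struct { uint32_t out[8]; size_t n; size_t limit; } collector;

static int collect(void *handle, uint32_t ucs)
{
    collector *c = handle;
    if (c->n < 8)
        c->out[c->n++] = ucs;
    return c->n >= c->limit;
}
```

Because the expected-continuation count is cleared before the callback runs, a callback that returns non-zero part-way through an illegal sequence leaves the decoder ready to resume: the caller can pass the unconsumed tail of the buffer back in and the offending byte is then decoded as a fresh start byte.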
  2. 10 Jun, 2002 1 commit
      Merge of bug fixes from NCBrowser tree. · 0524cabb
      Stewart Brodie authored
      Detail:
        Buffer overrun fixed; some buffer counting problems fixed too.  There are
          now helpful initialisation and tidy-up routines you can call too (called
          encoding_initialise and encoding_tidyup).
      Admin:
        I've built this with cc 5.45 in basic build environment - it built OK.
        This source code now matches that in NCBrowser 5.28.
      
      
      Version 0.47. Tagged as 'Unicode-0_47'
  3. 13 Oct, 2000 1 commit
      More synchronisation with Unicode lib in branched tree · 4e5abb29
      John Beranek authored
      Detail:
        Added some changes from Unicode lib in branched tree.  All basically
         type changes.  This appears to be because other compilers are
         more picky about types than armcc.
      
      Admin:
        Will add 0.46 VersionNum file into branched tree, and all will be
         synchronised fully.
      
      
      Version 0.46. Tagged as 'Unicode-0_46'
  4. 05 Oct, 2000 1 commit
      Copyright message changes + changes from branch + Unified branched/non-branched builds · b5fafb8f
      John Beranek authored
      Detail:
        Copyright messages changed from E-14 to Pace throughout, filename
         placed at top of file throughout, instead of in just some files.
      
        Merged branch's fixes into our code base, plus made it possible to
         get nice debug output in branched tree, and vfprintf() to stderr in
         RISC OS tree.  Exactly same source used in branched tree now (apart
         from OS specific files riscos.c and unix.c moving into layers
         directory structure).
      
      Admin:
        Built for branched, both Unix and RISC OS.
        Built in RISC OS tree, and compiled into TextConv.
      
      
      Version 0.45. Tagged as 'Unicode-0_45'
  5. 24 Feb, 1999 2 commits
  6. 16 Nov, 1998 2 commits
  7. 21 Oct, 1998 1 commit
  8. 16 Oct, 1998 1 commit
  9. 25 Sep, 1998 1 commit
  10. 05 Jan, 1998 1 commit
      Fixed autojp state machine. It wasn't resetting 'state' to HAD_NONE after... · 407bccff
      Simon Middleton authored
      Fixed autojp state machine. It wasn't resetting 'state' to HAD_NONE after changing whatcode. So basically it was lucky it ever worked. Also rewrote the various range tests to only use one compare per case.
      
      Changed the 'for_encoding' parameter to encoding_write() to an enumeration.
      Added a new type of writing where, if the character cannot be encoded,
      the function returns -1 rather than writing a default character.
      Added the pseudo-charsets csAutodetectJP and csEUCorShiftJIS to the encoding
      table so that they return the correct default language (ja).
      Added function to remove unused encoding tables (must be called explicitly).
      Fixed usage counting in iso2022 (I think).
      When looking up an encoding name, try stripping 'x-' or 'X-' off the front
      if it can't be found on the first pass.
      
      Version 0.12. Tagged as 'Unicode-0_12'
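Two of the changes in this commit, the single-compare range tests and the retry of a name lookup without its 'x-' prefix, might look something like the sketch below. Everything here is hypothetical: `in_range`, `known_names`, `find_exact` and `find_encoding` are illustrative names, the table holds only three sample entries, and exact `strcmp` matching is an assumption made to keep the example portable (the real lookup may well compare names differently).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Range test using one unsigned compare instead of (c >= lo && c <= hi).
 * If c < lo, the subtraction wraps to a very large value and the test
 * fails. Assumes lo <= hi. */
static int in_range(unsigned c, unsigned lo, unsigned hi)
{
    return c - lo <= hi - lo;
}

/* Hypothetical name table; the real encoding table is not shown here. */
static const char *const known_names[] = {
    "euc-jp", "shift_jis", "iso-2022-jp"
};

/* Exact-match lookup: returns the table index, or -1 if not found. */
static int find_exact(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof known_names / sizeof known_names[0]; i++)
        if (strcmp(known_names[i], name) == 0)
            return (int)i;
    return -1;
}

/* First pass uses the name as given; only if that fails is a leading
 * "x-" or "X-" stripped and the lookup retried. */
static int find_encoding(const char *name)
{
    int idx;
    assert(name != NULL);
    idx = find_exact(name);
    if (idx < 0 && (name[0] == 'x' || name[0] == 'X') && name[1] == '-')
        idx = find_exact(name + 2);
    return idx;
}
```

Trying the unmodified name first means a name that is registered with its 'x-' prefix still matches on the first pass, and the stripped form is used only as a fallback.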
  11. 12 Nov, 1997 1 commit
  12. 11 Nov, 1997 1 commit