  1. MariaDB Server
  2. MDEV-6776

ujis and eucjpms erroneously accept 0x8EA0 as a valid byte sequence

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 5.5.39, 10.0.13
    • Fix Version/s: 10.0.14
    • Component/s: Character Sets
    • Labels:
      None

      Description

      Byte sequence 0x8EA0 is erroneously accepted as a valid ujis/eucjpms code:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ujis);
      INSERT INTO t1 VALUES (0x8EA0);
      SELECT HEX(a), CHAR_LENGTH(a) FROM t1;
      

      returns:

      +--------+----------------+
      | HEX(a) | CHAR_LENGTH(a) |
      +--------+----------------+
      | 8EA0   |              2 |
      +--------+----------------+
      

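      This is wrong: in ujis, the lead byte 0x8E must be followed by a byte in the range 0xA1-0xDF, so 0x8EA0 is not a well-formed character. The correct code ranges for ujis are: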

        [x00-x7F]                     # ASCII/JIS-Roman (one-byte/character)  
        [x8E][xA1-xDF]                # half-width katakana (two bytes/char)  
        [x8F][xA1-xFE][xA1-xFE]       # JIS X 0212-1990 (three bytes/char)  
        [xA1-xFE][xA1-xFE]            # JIS X 0208:1997 (two bytes/char)
      

      The same problem is observed with eucjpms.
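
As an illustration of the ranges listed above for ujis, the following is a minimal Python sketch of a well-formedness check. It is not MariaDB's implementation; the function names `ujis_char_length` and `is_well_formed_ujis` are hypothetical and exist only for this example.

```python
def ujis_char_length(data: bytes, pos: int = 0) -> int:
    """Return the byte length of the well-formed ujis character starting
    at data[pos], or 0 if no well-formed character starts there.
    (Hypothetical helper; ranges taken from the table in this report.)"""
    if pos >= len(data):
        return 0
    b0 = data[pos]
    # [x00-x7F]: ASCII/JIS-Roman, one byte
    if b0 <= 0x7F:
        return 1
    # [x8E][xA1-xDF]: half-width katakana, two bytes
    if b0 == 0x8E:
        if pos + 1 < len(data) and 0xA1 <= data[pos + 1] <= 0xDF:
            return 2
        return 0
    # [x8F][xA1-xFE][xA1-xFE]: JIS X 0212-1990, three bytes
    if b0 == 0x8F:
        if pos + 2 < len(data) and all(
            0xA1 <= b <= 0xFE for b in data[pos + 1:pos + 3]
        ):
            return 3
        return 0
    # [xA1-xFE][xA1-xFE]: JIS X 0208:1997, two bytes
    if 0xA1 <= b0 <= 0xFE:
        if pos + 1 < len(data) and 0xA1 <= data[pos + 1] <= 0xFE:
            return 2
        return 0
    # Any other lead byte (0x80-0x8D, 0x90-0xA0, 0xFF) is invalid
    return 0


def is_well_formed_ujis(data: bytes) -> bool:
    """Return True if data consists only of well-formed ujis characters."""
    pos = 0
    while pos < len(data):
        n = ujis_char_length(data, pos)
        if n == 0:
            return False
        pos += n
    return True
```

Under this sketch, `is_well_formed_ujis(bytes.fromhex("8EA0"))` is false, because 0xA0 falls just below the 0xA1-0xDF range allowed after 0x8E, while `bytes.fromhex("8EA1")` passes.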


              People

              • Assignee:
                bar Alexander Barkov
                Reporter:
                bar Alexander Barkov