  MariaDB Server / MDEV-6596

Unassigned characters are not fully supported in ENUM and SET

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.3.12, 5.5.39, 10.0.13
    • Fix Version/s: 10.0
    • Component/s: None
    • Labels: None

      Description

      Start a terminal session using character set big5.
      In gnome-terminal:
      Terminal -> Character Coding -> Traditional Chinese (big5)

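      Make sure everything works fine. The first literal below is the
      character with the Big5 code C840, which has no Unicode mapping and
      may not be displayed here: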

      LANG=zh_TW.big5 mysql --default-character-set=big5 --table << END
      SET NAMES big5;
      SELECT HEX(''),HEX('乂');
      END
      

      It should return:

      +----------+-----------+
      | HEX('?') | HEX('乂') |
      +----------+-----------+
      | C840     | C940      |
      +----------+-----------+
      

      If you get a different output, then something is wrong with the terminal
      character set settings.

      Note that the character with the Big5 code C840 is unassigned
      (it does not have a Unicode mapping), while the character with
      the Big5 code C940 is assigned.

      Now create an ENUM with unassigned and assigned characters:

      LANG=zh_TW.big5 mysql --default-character-set=big5 --table test << END
      SET NAMES big5;
      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a ENUM('','乂') CHARACTER SET big5);
      SHOW CREATE TABLE t1;
      INSERT INTO t1 VALUES (''),('乂');
      SELECT HEX(a),a FROM t1;
      END
      

      The output will be:

      +-------+-----------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                    |
      +-------+-----------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `a` enum('?','乂') CHARACTER SET big5 DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
      +-------+-----------------------------------------------------------------------------------------------------------------+
      +--------+------+
      | HEX(a) | a    |
      +--------+------+
      | C840   |    |
      | C940   | 乂   |
      +--------+------+
      

      Note that the unassigned character was converted to a question mark
      in the SHOW CREATE TABLE output, but INSERT and SELECT work correctly.

      Now dump and restore:

      mysqldump --socket=/tmp/mysql.sock test >t1.sql
      mysql -e "drop table t1" test
      mysql test <t1.sql
      mysql -e "select hex(a),a from t1" test
      

      The output will be:

      +--------+------+
      | hex(a) | a    |
      +--------+------+
      | 3F     | ?    |
      | C940   | 乂   |
      +--------+------+
      

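      The unassigned character was lost: the restored value is 3F ('?')
      instead of C840. mysqldump takes the table definition from
      SHOW CREATE TABLE, so the question mark shown there ends up in the dump.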
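
      One way to spot this before restoring is to look for a bare '?' member
      in the ENUM and SET definitions of the dump file. The sketch below is
      only an illustration: the function name find_question_mark_members is
      hypothetical, and the pattern assumes each definition sits on a single
      line, as in the SHOW CREATE TABLE output above.

```shell
# Hypothetical helper (not from the original report): print the numbered
# lines of a dump file whose ENUM or SET definition has a bare '?' member.
# A member that really is '?' matches too, so treat hits as hints only.
find_question_mark_members() {
    grep -n -i -E "(enum|set)\(([^)]*,)?'\?'(,[^)]*)?\)" "$1"
}
```

      For the dump above, find_question_mark_members t1.sql would be expected
      to flag the line that defines column a.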


              People

              • Assignee: Alexander Barkov (bar)
              • Reporter: Alexander Barkov (bar)