Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 10.0.4, 5.5.33a
- Fix Version/s: 10.0.6
- Component/s: None
- Labels: None
Description
There are incompatibilities between some MariaDB and MySQL collations
which we need to solve somehow.
Problems
1.
The utf8_croatian_ci and ucs2_croatian_ci collations appeared in MariaDB-5.1 at the end of 2009, based on Alexander Barkov's patch from: http://collation-charts.org/articles/croatian.htm
Later, the Croatian collations were added into MySQL-5.6.
Still, the MariaDB Croatian collation uses the latest version of the rules from http://unicode.org/cldr/trac/browser/trunk/common/collation/hr.xml while MySQL implements the older version.
The difference is in only 3 letters, but that is enough to make the indexes incompatible.
As a result:
- utf8_croatian_ci (ID 213) is different in MariaDB and MySQL
- ucs2_croatian_ci (ID 149) is different in MariaDB and MySQL
2.
Later, MySQL-5.5 added support for utf8mb4, utf16, utf32. When merging the new character sets (MySQL-5.5 -> MariaDB-5.5) the MariaDB team added the following corresponding collations, for symmetry with utf8 and ucs2:
- utf8mb4_croatian_ci (ID=245)
- utf16_croatian_ci (ID=215)
- utf32_croatian_ci (ID=214)
But when the collations with the same names finally appeared in MySQL-5.6, they were not given the same IDs. So the IDs 214 and 215 are assigned in MySQL-5.6 to other collations, while ID 245 keeps the name utf8mb4_croatian_ci but uses different rules.
This is what we have in MariaDB:
mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
    -> WHERE COLLATION_NAME LIKE 'u%croat%';
+---------------------+-----+
| COLLATION_NAME      | ID  |
+---------------------+-----+
| ucs2_croatian_ci    | 149 |
| utf8_croatian_ci    | 213 |
| utf32_croatian_ci   | 214 |
| utf16_croatian_ci   | 215 |
| utf8mb4_croatian_ci | 245 |
+---------------------+-----+
5 rows in set (0.01 sec)
This is what we have in MySQL-5.6:
mysql> SELECT COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS
    -> WHERE ID IN (149,213,214,215,245);
+---------------------+-----+
| COLLATION_NAME      | ID  | Problem:
+---------------------+-----+
| ucs2_croatian_ci    | 149 | MySQL rules differ from MariaDB rules
| utf8_croatian_ci    | 213 | MySQL rules differ from MariaDB rules
| utf8_unicode_520_ci | 214 | MariaDB utf32_croatian_ci
| utf8_vietnamese_ci  | 215 | MariaDB utf16_croatian_ci
| utf8mb4_croatian_ci | 245 | MySQL rules differ from MariaDB rules
+---------------------+-----+
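The two result sets above can be compared mechanically. The following is only a minimal sketch in Python: the dictionaries copy the IDs and names from the two queries, while the function name `classify_ids` and the wording of its labels are made up for illustration. Note that a name comparison cannot detect rule differences; per this report, the same-named Croatian collations still sort differently.

```python
# Collation IDs and names copied from the two query results in this report.
MARIADB_IDS = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf32_croatian_ci",
    215: "utf16_croatian_ci",
    245: "utf8mb4_croatian_ci",
}
MYSQL56_IDS = {
    149: "ucs2_croatian_ci",
    213: "utf8_croatian_ci",
    214: "utf8_unicode_520_ci",
    215: "utf8_vietnamese_ci",
    245: "utf8mb4_croatian_ci",
}


def classify_ids(mariadb, mysql):
    """Label each MariaDB collation ID by how it clashes with MySQL (hypothetical helper)."""
    result = {}
    for cid in sorted(mariadb):
        other = mysql.get(cid)
        if other is None:
            result[cid] = "ID unused in MySQL"
        elif other == mariadb[cid]:
            # Same name on both sides; the rules may still differ.
            result[cid] = "same name"
        else:
            result[cid] = "ID reused by " + other
    return result


for cid, label in classify_ids(MARIADB_IDS, MYSQL56_IDS).items():
    print(cid, MARIADB_IDS[cid], "->", label)
```

IDs 214 and 215 come out as reused by unrelated MySQL collations, and 149, 213 and 245 as same-named, which matches the "Problem:" column above.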
Solution
Collation changes
- Bar moves MariaDB-5.5 xxx_croatian_ci collations to new IDs (preferably, outside of the 0..255 range), without changing the collation name.
- Bar merges MySQL-5.6 xxx_croatian_ci using MySQL-5.6 IDs, but changing the names to xxx_croatian_mysql56_ci.
Detect attempts to open tables with the old MariaDB collations.
Bar fixes TABLE_SHARE::init_from_binary_frm_image() and adds an error message for any table created by a MariaDB version prior to 10.0.5 that has indexes using collation IDs 213, 149, 245, 215, 214:
+---------------------+---------+-----+---------+----------+---------+
| Collation           | Charset | Id  | Default | Compiled | Sortlen |
+---------------------+---------+-----+---------+----------+---------+
| utf8_croatian_ci    | utf8    | 213 |         | Yes      |       8 |
| ucs2_croatian_ci    | ucs2    | 149 |         | Yes      |       8 |
| utf8mb4_croatian_ci | utf8mb4 | 245 |         | Yes      |       8 |
| utf16_croatian_ci   | utf16   | 215 |         | Yes      |       8 |
| utf32_croatian_ci   | utf32   | 214 |         | Yes      |       8 |
+---------------------+---------+-----+---------+----------+---------+
ER_TABLE_NEEDS_UPGRADE looks suitable for this purpose:
"Table upgrade required. Please do \"REPAIR TABLE `%-.32s`\" or dump/reload to fix it!"
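The detection rule described above can be illustrated with a small Python sketch. This is not the server code in TABLE_SHARE::init_from_binary_frm_image(); the function names, the version-as-tuple representation, and the idea of passing in a list of index collation IDs are all assumptions made for illustration. Only the ID list, the 10.0.5 cut-off and the message text come from this report.

```python
# Collation IDs named in this report as conflicting with MySQL-5.6.
CONFLICTING_IDS = frozenset({213, 149, 245, 215, 214})

# Tables created by versions before this one are affected (from the report).
FIRST_SAFE_VERSION = (10, 0, 5)

# Message text of ER_TABLE_NEEDS_UPGRADE as quoted in the report.
UPGRADE_MESSAGE = (
    'Table upgrade required. Please do "REPAIR TABLE `%-.32s`" '
    "or dump/reload to fix it!"
)


def needs_upgrade(created_by_version, index_collation_ids):
    """Return True if a table should be rejected with an upgrade request.

    created_by_version: hypothetical (major, minor, patch) tuple of the
    server that wrote the table definition.
    index_collation_ids: hypothetical iterable of the collation IDs used
    by the table's indexed columns.
    """
    if created_by_version >= FIRST_SAFE_VERSION:
        return False
    return any(cid in CONFLICTING_IDS for cid in index_collation_ids)


def upgrade_message(table_name):
    """Format the message; the .32 precision truncates long names."""
    return UPGRADE_MESSAGE % table_name


if needs_upgrade((5, 5, 33), [33, 213]):
    print(upgrade_message("t1"))
```

Tables without indexes on the affected collations pass the check regardless of the version that created them, which follows the wording of the report ("that has indexes using collation IDs ...").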
mysql_upgrade
Monty will try to fix REPAIR to solve the conflicting IDs problem.
quick REPAIR
In the long term we can add a quick REPAIR to replace collation IDs in table definitions in FRM files and in engine-specific structure definitions (e.g. in MYI files for MyISAM) without having to do the full repair for the table.
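The core of such a quick REPAIR is an ID substitution that leaves all other collations untouched. A minimal Python sketch of that step follows. It works on a plain list of collation IDs rather than on real FRM or MYI structures, and the new IDs in `HYPOTHETICAL_ID_MAP` are placeholders invented for the example, since this report does not state the IDs finally chosen.

```python
# Old MariaDB ID -> new ID. The new values are placeholders for illustration
# only; the real replacement IDs are not given in this report.
HYPOTHETICAL_ID_MAP = {
    149: 1149,
    213: 1213,
    214: 1214,
    215: 1215,
    245: 1245,
}


def remap_collation_ids(collation_ids, id_map):
    """Return a new list with conflicting IDs replaced, others kept as-is."""
    return [id_map.get(cid, cid) for cid in collation_ids]


# A hypothetical table definition using collations 33, 213 and 149.
print(remap_collation_ids([33, 213, 149], HYPOTHETICAL_ID_MAP))
```

Because only the stored IDs change and the sort order of the old MariaDB collations stays the same under the new IDs, no index rebuild would be needed in this scheme.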
Activity
Pushed into MariaDB-10.0.6