Details
- Type: Bug
- Status: Closed
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 5.5.39, 10.0.13
- Fix Version/s: 10.0.14
- Component/s: Character Sets
- Labels: None
Description
Byte sequence 0x8EA0 is erroneously accepted as a valid ujis/eucjpms code:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET ujis);
INSERT INTO t1 VALUES (0x8EA0);
SELECT HEX(a), CHAR_LENGTH(a) FROM t1;
returns:
+--------+----------------+
| HEX(a) | CHAR_LENGTH(a) |
+--------+----------------+
| 8EA0   |              2 |
+--------+----------------+
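This is wrong: after the lead byte 0x8E, the second byte must be in the range 0xA1-0xDF, so 0x8EA0 is not a valid character. The correct code ranges for ujis are: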
[x00-x7F]                # ASCII/JIS-Roman (one byte/char)
[x8E][xA1-xDF]           # half-width katakana (two bytes/char)
[x8F][xA1-xFE][xA1-xFE]  # JIS X 0212-1990 (three bytes/char)
[xA1-xFE][xA1-xFE]       # JIS X 0208:1997 (two bytes/char)
The same problem is observed with eucjpms.
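The ujis code ranges listed above can be turned into a small standalone checker. The following is a minimal Python sketch for illustration only, not the server's implementation; the function names `ujis_char_length` and `is_well_formed_ujis` are hypothetical, and it covers only the four ujis ranges given in this report.

```python
def ujis_char_length(buf: bytes, pos: int = 0) -> int:
    """Return the byte length of a valid ujis character at buf[pos], or 0 if invalid.

    Hypothetical helper; it encodes only the four ranges listed in this report.
    """
    remaining = len(buf) - pos
    if remaining <= 0:
        return 0
    b0 = buf[pos]
    if b0 <= 0x7F:
        return 1  # ASCII/JIS-Roman
    if remaining >= 2:
        b1 = buf[pos + 1]
        if b0 == 0x8E:
            # Half-width katakana: second byte must be in xA1-xDF
            return 2 if 0xA1 <= b1 <= 0xDF else 0
        if b0 == 0x8F:
            # JIS X 0212-1990: two trailing bytes, each in xA1-xFE
            if remaining >= 3 and 0xA1 <= b1 <= 0xFE and 0xA1 <= buf[pos + 2] <= 0xFE:
                return 3
            return 0
        if 0xA1 <= b0 <= 0xFE and 0xA1 <= b1 <= 0xFE:
            return 2  # JIS X 0208:1997
    return 0


def is_well_formed_ujis(buf: bytes) -> bool:
    """Return True if every byte of buf belongs to a valid ujis character."""
    pos = 0
    while pos < len(buf):
        length = ujis_char_length(buf, pos)
        if length == 0:
            return False
        pos += length
    return True
```

With this checker, `is_well_formed_ujis(bytes.fromhex("8EA0"))` is False, while `is_well_formed_ujis(bytes.fromhex("8EA1"))` is True, which matches the ranges given above.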