Introduce the concept of data type handlers

Description

This is a pre-requisite task for "MDEV-4912 Add a plugin to field types (column types)".

While reviewing the prototype patch for MDEV-4912, Sergei suggested
modifying the code to identify data types in memory using Type_handler pointers
instead of "enum_field_types" values. This is a large change in itself,
so we'll do it as a separate task and then add the main patch on top of it.

This task includes a few steps to introduce the concept of data type handlers.
A few further tasks will be done later (but before adding the main patch).

This task will include the following subtasks:

1. Remove IMPOSSIBLE_RESULT from Item_result. It's an internal server thing
and is needed neither on the client side nor for the UDF API.

2. Introduce a new "enum Item_cmp", which will have values similar
to those of Item_result:

    enum Item_cmp
    {
      STRING_CMP= STRING_RESULT,
      REAL_CMP= REAL_RESULT,
      INT_CMP= INT_RESULT,
      ROW_CMP= ROW_RESULT,
      DECIMAL_CMP= DECIMAL_RESULT,
      TIME_CMP= TIME_RESULT,
      IMPOSSIBLE_CMP
    };

3. Change Item::cmp_type() to return
Item_cmp instead of Item_result.

This is needed for stricter type control, to avoid
erroneous confusion between cmp_type() and result_type().
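
A minimal sketch of how the separate enum gives stricter type control. The Item_result definition below is a simplified stand-in (its values are illustrative, not copied from the server headers), and needs_string_comparison() is a hypothetical helper. Because Item_cmp and Item_result are distinct enum types, C++ will not convert one into the other implicitly, so a result_type() value passed where a cmp_type() value is expected is rejected at compile time:

```cpp
#include <type_traits>

// Simplified stand-in for the server's Item_result (values are illustrative).
enum Item_result
{
  STRING_RESULT, REAL_RESULT, INT_RESULT, ROW_RESULT,
  DECIMAL_RESULT, TIME_RESULT
};

// The comparison enum proposed in step 2.
enum Item_cmp
{
  STRING_CMP= STRING_RESULT,
  REAL_CMP= REAL_RESULT,
  INT_CMP= INT_RESULT,
  ROW_CMP= ROW_RESULT,
  DECIMAL_CMP= DECIMAL_RESULT,
  TIME_CMP= TIME_RESULT,
  IMPOSSIBLE_CMP
};

// Hypothetical helper that accepts only a comparison type.
inline bool needs_string_comparison(Item_cmp cmp)
{
  return cmp == STRING_CMP;
}

// An enum converts implicitly to an integer, but never to another enum type.
static_assert(!std::is_convertible<Item_result, Item_cmp>::value,
              "Item_result must not silently become Item_cmp");
static_assert(!std::is_convertible<Item_cmp, Item_result>::value,
              "Item_cmp must not silently become Item_result");

// needs_string_comparison(STRING_RESULT);  // would not compile: wrong enum type
```

With cmp_type() returning Item_result, the commented-out call would have compiled silently.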

4. Introduce a new base class Type_handler without members, with only three methods at this point:

    class Type_handler
    {
    public:
      virtual enum_field_types field_type() const = 0;
      virtual Item_result result_type() const = 0;
      virtual Item_cmp cmp_type() const = 0;
    };

More methods will be added in later tasks.

5. Create classes for all MYSQL_TYPE_XXX data types, for example:

    class Type_handler_longlong: public virtual Type_handler
    {
    public:
      virtual enum_field_types field_type() const { return MYSQL_TYPE_LONGLONG; }
      virtual Item_result result_type() const { return INT_RESULT; }
      virtual Item_cmp cmp_type() const { return INT_CMP; }
    };

6. Derive Item virtually from Type_handler:

    class Item: public virtual Type_handler
    {
      ...
    };

Notice that both Type_handler_longlong and Item use "public virtual Type_handler".
This introduces so-called diamond inheritance.
The idea is to define the methods once in Type_handler_xxx,
and make all Items of type "xxx" reuse Type_handler_xxx.

For example, deriving from Type_handler_longlong
will automatically provide the proper implementations of the three
mentioned methods (field_type, result_type and cmp_type),
without having to duplicate them every time we need an INT_RESULT
item:

    class Item_int: public Item_num,
                    public Type_handler_longlong
    {
      ...
    };

    class Item_int_func: public Item_func,
                         public Type_handler_longlong
    {
      ...
    };

    class Item_func_udf_int :public Item_udf_func,
                             public Type_handler_longlong
    {
      ...
    };

    class Item_exists_subselect :public Item_subselect,
                                 public Type_handler_longlong
    {
      ...
    };
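
The diamond described in steps 4 to 6 can be sketched as a self-contained program. The enums are reduced stand-ins for the server types, Item_int derives directly from Item rather than Item_num to keep the example short, and the virtual destructor and is_integer_item() are additions for the sketch only:

```cpp
// Reduced stand-ins for the server enums (values are illustrative).
enum enum_field_types { MYSQL_TYPE_LONG, MYSQL_TYPE_LONGLONG };
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
enum Item_cmp { STRING_CMP= STRING_RESULT, REAL_CMP= REAL_RESULT,
                INT_CMP= INT_RESULT };

class Type_handler
{
public:
  virtual ~Type_handler() {}  // added for the sketch only
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

// Defines the three type methods once for all BIGINT-like items.
class Type_handler_longlong: public virtual Type_handler
{
public:
  enum_field_types field_type() const override { return MYSQL_TYPE_LONGLONG; }
  Item_result result_type() const override { return INT_RESULT; }
  Item_cmp cmp_type() const override { return INT_CMP; }
};

// Item declares no type methods of its own; they stay pure virtual here.
class Item: public virtual Type_handler
{
public:
  virtual long long val_int() = 0;
};

// Both bases share one Type_handler subobject, so the overrides from
// Type_handler_longlong are the only final overriders in Item_int.
class Item_int: public Item, public Type_handler_longlong
{
  long long value;
public:
  explicit Item_int(long long v): value(v) {}
  long long val_int() override { return value; }
};

// Hypothetical caller that only knows about the Item interface.
inline bool is_integer_item(const Item &item)
{
  return item.result_type() == INT_RESULT && item.cmp_type() == INT_CMP;
}
```

Calling is_integer_item() on an Item_int reaches Type_handler_longlong's methods through the shared virtual base, even though Item_int itself defines none of them.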

7. Remove the default implementations for:

  • Item::cmp_type()

  • Item::field_type()

  • Item::result_type()

With diamond inheritance they must stay pure virtual ("= 0") in Item until a Type_handler_xxx
is added as a base of the Item_xxx class. Otherwise Item and Type_handler_xxx would both override the same method, and the compiler will report an error.

So we'll have to temporarily rename Item::cmp_type() to Item::cmp_type_from_field_type()
and use the latter for the items not yet covered by this task.
Note that cmp_type_from_field_type() will be gone as soon as
we switch ALL items to use Type_handlers instead of defining
type-related methods directly. In the end, all items will use Type_handler_xxx::cmp_type() instead.
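
The transitional arrangement might look roughly like the sketch below. The enums are reduced stand-ins, the mapping inside cmp_type_from_field_type() is illustrative, and Item_unconverted_real is a hypothetical item that has not been switched to a Type_handler yet:

```cpp
// Reduced stand-ins for the server enums (values are illustrative).
enum enum_field_types { MYSQL_TYPE_LONGLONG, MYSQL_TYPE_DOUBLE,
                        MYSQL_TYPE_VARCHAR };
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
enum Item_cmp { STRING_CMP= STRING_RESULT, REAL_CMP= REAL_RESULT,
                INT_CMP= INT_RESULT };

class Type_handler
{
public:
  virtual ~Type_handler() {}  // added for the sketch only
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

class Type_handler_longlong: public virtual Type_handler
{
public:
  enum_field_types field_type() const override { return MYSQL_TYPE_LONGLONG; }
  Item_result result_type() const override { return INT_RESULT; }
  Item_cmp cmp_type() const override { return INT_CMP; }
};

class Item: public virtual Type_handler
{
public:
  // No cmp_type(), field_type() or result_type() here: if Item defined them,
  // Item_int below would inherit two competing overriders and fail to compile.

  // Temporary helper, the renamed former Item::cmp_type().
  Item_cmp cmp_type_from_field_type() const
  {
    switch (field_type()) {
    case MYSQL_TYPE_LONGLONG: return INT_CMP;
    case MYSQL_TYPE_DOUBLE:   return REAL_CMP;
    default:                  return STRING_CMP;
    }
  }
};

// Already converted: all three methods come from the handler.
class Item_int: public Item, public Type_handler_longlong {};

// Hypothetical item not converted yet: defines the methods directly and
// falls back to the temporary helper for cmp_type().
class Item_unconverted_real: public Item
{
public:
  enum_field_types field_type() const override { return MYSQL_TYPE_DOUBLE; }
  Item_result result_type() const override { return REAL_RESULT; }
  Item_cmp cmp_type() const override { return cmp_type_from_field_type(); }
};
```

Once Item_unconverted_real gains a handler base of its own, its three overrides and the helper can be deleted.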

Things that are not covered by this task, to be done soon separately:

  • Within the scope of this task we'll define only the three aforementioned methods
    in Type_handler. In a separate task later, we'll move all other data type
    specific methods from Item to Type_handler:

    - val_int
    - val_decimal
    - val_real
    - val_str
    - get_date
    - save_in_field
    - make_field(Send_field*)
    - hash_sort()

    and some others.
    This will remove A LOT of duplicate code (e.g. the implementation of the val_str()
    method looks much the same for all INT_RESULT Items).

  • Within the scope of this task we'll switch only some of the Items to use Type_handler.
    Complex cases (when an Item has parallel independent result_type() and field_type()
    methods) will be changed in a separate patch, to avoid too many changes in a single patch.
    These complex items include at least:
    These complex items include at least:

    • Item_func_hybrid_result_type

    • Item_copy

    • Item_sum_hybrid

    • All Items that have references to the actual data type containers (Items or Fields) e.g. Item_func_rollup_const, Item_field.
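
To illustrate the first item in the list above (moving value methods such as val_str() into Type_handler), here is a hedged sketch of how one shared implementation could serve every integer item. It is not the planned server code: the signatures are simplified (the real val_str() works with the server's String class, not std::string), and Item_func_neg_example is a hypothetical item invented for the illustration.

```cpp
#include <string>

class Type_handler
{
public:
  virtual ~Type_handler() {}  // added for the sketch only
  virtual long long val_int() = 0;
  virtual std::string val_str() = 0;  // simplified signature (assumption)
};

class Type_handler_longlong: public virtual Type_handler
{
public:
  // Written once: every integer item formats its value the same way.
  std::string val_str() override { return std::to_string(val_int()); }
};

class Item: public virtual Type_handler {};

// Each item supplies only val_int(); val_str() comes from the handler.
class Item_int: public Item, public Type_handler_longlong
{
  long long value;
public:
  explicit Item_int(long long v): value(v) {}
  long long val_int() override { return value; }
};

// Hypothetical integer function item that negates a constant argument.
class Item_func_neg_example: public Item, public Type_handler_longlong
{
  long long arg;
public:
  explicit Item_func_neg_example(long long a): arg(a) {}
  long long val_int() override { return -arg; }
};
```

Neither item class repeats the integer-to-string conversion, which is the kind of duplication the later task aims to remove.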

At the end, when all of the ongoing tasks are done, we'll have:

  • either field_type() depend on result_type()

  • or result_type() depend on field_type()

  • or all type specific methods depend on an Item or Field reference.

All cases with parallel independent result_type() and field_type() should be removed.

Further tasks:

  • Remove as much direct use of field_type(), cmp_type(), result_type() as possible.
    Move this code inside methods in Type_handler.

  • Change enum_field_types members to "Type_handler*" in all structures and classes,
    e.g. Create_field, Send_field, CAST related classes, sql_yacc.yy tokens, etc.
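
A rough sketch of what replacing an enum_field_types member with a "Type_handler*" could look like. Create_field_sketch and handler_for_field_type() are hypothetical names for illustration; the real structures carry many more members, and the enums are reduced stand-ins:

```cpp
// Reduced stand-ins for the server enums (values are illustrative).
enum enum_field_types { MYSQL_TYPE_LONGLONG, MYSQL_TYPE_DOUBLE };
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
enum Item_cmp { STRING_CMP= STRING_RESULT, REAL_CMP= REAL_RESULT,
                INT_CMP= INT_RESULT };

class Type_handler
{
public:
  virtual ~Type_handler() {}  // added for the sketch only
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

class Type_handler_longlong: public virtual Type_handler
{
public:
  enum_field_types field_type() const override { return MYSQL_TYPE_LONGLONG; }
  Item_result result_type() const override { return INT_RESULT; }
  Item_cmp cmp_type() const override { return INT_CMP; }
};

class Type_handler_double: public virtual Type_handler
{
public:
  enum_field_types field_type() const override { return MYSQL_TYPE_DOUBLE; }
  Item_result result_type() const override { return REAL_RESULT; }
  Item_cmp cmp_type() const override { return REAL_CMP; }
};

// Hypothetical lookup: one shared handler object per data type.
inline const Type_handler *handler_for_field_type(enum_field_types type)
{
  static const Type_handler_longlong handler_longlong{};
  static const Type_handler_double handler_double{};
  switch (type) {
  case MYSQL_TYPE_LONGLONG: return &handler_longlong;
  case MYSQL_TYPE_DOUBLE:   return &handler_double;
  }
  return nullptr;  // unknown type
}

// Hypothetical structure: stores the handler instead of the enum value.
struct Create_field_sketch
{
  const Type_handler *type_handler;
};
```

Code holding such a structure can ask the handler for field_type(), result_type() or cmp_type() directly instead of switching on an enum value.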

Environment

None

Status

Assignee

Alexander Barkov

Reporter

Alexander Barkov

Labels

None

External issue ID

None

Fix versions

Priority

Major