Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6445

UUID column type addition for distributed systems

    Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Fix Version/s: 10.2
    • Component/s: None
    • Labels:
      None

      Description

      Feature request:
      The AUTO_INCREMENT field attribute has some attributes that are true in some cases, and not in others:

      • The value is unique
      • The value in incremented in sequence
      • The value is incremented without "gaps" in the sequence

      The two last of these attributes are no longer always true. In addition, to keep the value unique one has to resort to using auto_increment_increment and auto_increment_offset, which will break the last of the assumptions above and when using a distributed system, classic Replication or Galera, then even the second one breaks. But these settings are still necessary to ensure uniqueness in a distributed system.

      So the idea then is to, in addition to AUTO_INCREMENT I suggest we add a UUID column attribute. As for an INSERT, this would be the same as using UUID() for the column in the insert statement, but having this as a column attribute has several advantages. Compared to AUTO_INCREMENT there are also many advantages, such as avoiding at AUTO_INCREMENT mutex locks. In addition, having this as a column attribute would be a declarative, so the schema documents the column attribute properly. To make this work, we need to add a mysql_last_uuid() function and a LAST_UUID SQL functioon added.

      I could see this having some additional features, such as:

      • Having this attribute assume a specific datatype, so that the type need not be explicitly specified..
      • Not allowing the value in a column with this attribute to be updated or inserted with anything but NULL, which will generate a new UUID.
      • Allowing a UUID_SHORT in addition to a UUID column attribute.
      • Automatic compacting of the UUID value (remove the dashes etc).

      A drawback of this schema compared to AUTO_INCREMENT is that this will consume some more space in the specified column, but 16 bytes (8 bytes of we use UUID_SHORT) when the dashes are removed isn't so bad. The advantages not having to deal with the auto_increment_xxx settings, in particular when new nodes are added to the cluster, should outweigh this,

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              karlsson Anders Karlsson added a comment -

              An added benefit of using a UUID instead of some AUTO_INCREMENT is that in a distributed system, using Galera or Replication, data can be migrated between nodes and there is no data that ties or identifies a row to a specific server.

              Show
              karlsson Anders Karlsson added a comment - An added benefit of using a UUID instead of some AUTO_INCREMENT is that in a distributed system, using Galera or Replication, data can be migrated between nodes and there is no data that ties or identifies a row to a specific server.
              Hide
              serg Sergei Golubchik added a comment -

              This looks like a duplicate of MDEV-4958.

              Show
              serg Sergei Golubchik added a comment - This looks like a duplicate of MDEV-4958 .
              Hide
              karlsson Anders Karlsson added a comment -

              There is an overlap, but not a duplicate. What I suggest though is that automatic generation of a UUID is added and that semantics are such that only a proper UUID can be inserted and that it is always generated, so in the case of an INSERT and UPDATE, NULL always has to be passed if the value is to be updated. To support this, the proposed SQL and API functions for retrieving the last generated UUID has to be added.

              Show
              karlsson Anders Karlsson added a comment - There is an overlap, but not a duplicate. What I suggest though is that automatic generation of a UUID is added and that semantics are such that only a proper UUID can be inserted and that it is always generated, so in the case of an INSERT and UPDATE, NULL always has to be passed if the value is to be updated. To support this, the proposed SQL and API functions for retrieving the last generated UUID has to be added.
              Hide
              serg Sergei Golubchik added a comment -

              Indeed, thanks for the clarification

              Show
              serg Sergei Golubchik added a comment - Indeed, thanks for the clarification

                People

                • Assignee:
                  Unassigned
                  Reporter:
                  karlsson Anders Karlsson
                • Votes:
                  1 Vote for this issue
                  Watchers:
                  3 Start watching this issue

                  Dates

                  • Created:
                    Updated: