  1. MariaDB Server
  2. MDEV-5174

Filesort excessively uses disk space for TINYBLOB and TINYTEXT columns

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.0.4, 5.5.33, 5.1.67, 5.2.14, 5.3.12
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Filesort uses Field::sort_length() to calculate space needed to sort the field.

      For BLOB and TEXT fields and all their TINY, MEDIUM, and LONG
      variations, the length is calculated as follows:

      uint32 Field_blob::sort_length() const
      {
        return (uint32) (current_thd->variables.max_sort_length +
                         (field_charset == &my_charset_bin ? 0 : packlength));
      }

      The default value of max_sort_length is 1024.
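
      To make the numbers concrete, the following is a minimal standalone model of the formula above, not MariaDB source code. The names modeled_sort_length, kDefaultMaxSortLength, and check_modeled_sort_lengths are hypothetical; the sketch assumes the usual length-prefix sizes of 1, 2, 3, and 4 bytes for the TINY, plain, MEDIUM, and LONG variants.

```cpp
#include <cassert>
#include <cstdint>

// Standalone model of the formula quoted above; not MariaDB source code.
// pack_length is the number of bytes that store the value's length:
// 1 for TINY*, 2 for BLOB/TEXT, 3 for MEDIUM*, 4 for LONG* (assumed here).
constexpr uint32_t modeled_sort_length(uint32_t max_sort_length,
                                       bool binary_charset,
                                       uint32_t pack_length)
{
  return max_sort_length + (binary_charset ? 0u : pack_length);
}

// Default max_sort_length, as stated in the report.
constexpr uint32_t kDefaultMaxSortLength = 1024;

// Runtime self-check of the values the model produces.
inline void check_modeled_sort_lengths()
{
  // TINYBLOB (binary charset): the column type's own limit plays no role.
  assert(modeled_sort_length(kDefaultMaxSortLength, true, 1) == 1024);
  // TINYTEXT with a non-binary charset: the length prefix is added on top.
  assert(modeled_sort_length(kDefaultMaxSortLength, false, 1) == 1025);
  // LONGBLOB and LONGTEXT get essentially the same result as the TINY types.
  assert(modeled_sort_length(kDefaultMaxSortLength, true, 4) == 1024);
  assert(modeled_sort_length(kDefaultMaxSortLength, false, 4) == 1028);
}
```

      In this model the result depends only on max_sort_length and the length prefix, so the smallest and the largest blob types reserve practically the same amount of sort space.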

      This is wasteful for TINYBLOB and TINYTEXT, whose values are much shorter than 1024 bytes.

      • It should be enough to use 256 bytes to sort TINYBLOB,
      • It should be enough to use 256*strxfrm_multiply bytes to sort TINYTEXT.
        (where strxfrm_multiply is 1 for many collations, which makes 256 bytes again)

      So TINYBLOB (and TINYTEXT in most cases) uses about 4 times more space
      for sorting than is actually needed: 1024 bytes per value with the
      default setting instead of 256. That is likely to hurt filesort
      performance considerably.
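
      One possible shape for a fix is sketched below. It is an illustration under stated assumptions, not a patch: proposed_tiny_sort_length and kTinyLimit are hypothetical names, the length prefix is ignored for simplicity, and the sketch assumes that a max_sort_length smaller than the type bound should still be honored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bound suggested in the report for TINYBLOB/TINYTEXT sort keys.
constexpr uint32_t kTinyLimit = 256;

// Hypothetical helper, not MariaDB code.
// strxfrm_multiply is the collation's sort-key expansion factor;
// pass 1 for TINYBLOB and for collations that do not expand keys.
constexpr uint32_t proposed_tiny_sort_length(uint32_t max_sort_length,
                                             uint32_t strxfrm_multiply)
{
  // Never reserve more than the type can need, and never more than
  // the user-configured max_sort_length (an assumption of this sketch).
  return std::min<uint32_t>(max_sort_length, kTinyLimit * strxfrm_multiply);
}

// Runtime self-check of the sketch against the numbers in the report.
inline void check_proposed_tiny_sort_length()
{
  const uint32_t default_max_sort_length = 1024;
  const uint32_t proposed = proposed_tiny_sort_length(default_max_sort_length, 1);
  assert(proposed == 256);
  // The current behavior reserves 4 times as much for the same column.
  assert(default_max_sort_length / proposed == 4);
}
```

      With a collation whose strxfrm_multiply is 3, the same helper would return 768, which is still below the 1024 bytes reserved today.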


              People

              • Assignee:
                Alexander Barkov (bar)
              • Reporter:
                Alexander Barkov (bar)
              • Votes:
                0
              • Watchers:
                2
