Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6568

Performance improvement: Order by column in a left joined table

    Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Fix Version/s: 10.2
    • Component/s: None

      Description

      Originally asked in https://answers.launchpad.net/maria/+question/252779

      In a query like:

      SELECT a.c1, b.c2
      FROM a
      LEFT JOIN b ON a.id = b.a_id
      ORDER BY b.c3;
      

      No index is used since b.c3 is a column of a left join table, so the optimizer ignores it.

      We have implemented an optimization in our client code that split the query in two: first part with WHERE b.c3 IS NULL, and a second part with WHERE b.c3 IS NOT NULL (depending on the order by ascending clause). It works great and most of the times the second part is not needed (when using limit). But it would be great if the database engine can figure it out by itself.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            serg Sergei Golubchik added a comment -

            Thanks for the optimization idea.
            While we cannot implement this right now, we'll consider it for the future versions.

            As for now — the optimizer needs to know when to use this optimization. You suggest to split the join, it means MariaDB will do two joins, not one. This is slower. But, on the other hand, MariaDB won't need to do the filesort, and could retrieve in the index order. So, the optimizer needs to compare the cost of the additional join with the cost of the filesort. But, unfortunately, at the moment optimizer cannot properly estimate the cost of the filesort — this needs to be fixed first, before we could implement your suggestion.

            Show
            serg Sergei Golubchik added a comment - Thanks for the optimization idea. While we cannot implement this right now, we'll consider it for the future versions. As for now — the optimizer needs to know when to use this optimization. You suggest to split the join, it means MariaDB will do two joins, not one. This is slower. But, on the other hand, MariaDB won't need to do the filesort, and could retrieve in the index order. So, the optimizer needs to compare the cost of the additional join with the cost of the filesort. But, unfortunately, at the moment optimizer cannot properly estimate the cost of the filesort — this needs to be fixed first, before we could implement your suggestion.

              People

              • Assignee:
                Unassigned
                Reporter:
                jose.canciani Jose Canciani
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: