• Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Spec draft

      Place in the optimizer

      Window functions are evaluated on the result set after the WHERE, GROUP BY and HAVING have been applied.

      • If we are operating after a temptable-based grouping operation, we can read its output (and apply HAVING on the fly)
      • if we are operating after a Join operation, or non-temptable based grouping, we will need to store the output of previous operation in a temptable and then work from there.

      Basic idea about operation

      Let's start with one window function:

      func_name(arg) OVER (PARTITION BY part_expr  ORDER BY order_expr  $window_spec)

      Window function is evaluated for each row of the result set. It is a function of row's partition, we need to look at the rows of the resultset that are ordered according to order_expr. $window_spec specifies at which rows we need to look. Sometimes it's certain preceding rows, sometimes it's certain following rows (and sometimes it's both?)

      Execution can be done as follows:

      • sort resultset rows by (part_expr, order_expr).
      • scan each partition (either in reverse or forward order) and compute the value of window function on the fly.

      Unresolved questions:

      • How to handle the context. If a window function depends on preceding and following rows, how do we keep track of those?
      • Whether we should use files or temp.tables for storing the sorted resultset (we will always have to store it, because filesort doesn't support pull-based reading of its resultset)

      Multiple window functions

      A query may use multiple window functions which use different PARTITION BY / ORDER BY expressions:

         window_func1 OVER (PARTITION BY part_expr1 ORDER BY order_expr1)
         window_func2 OVER (PARTITION BY part_expr2 ORDER BY order_expr2)
      FROM ... WHERE ... etc

      The query should be evaluated as follows:

      • sort the result according to part_expr1/order_expr1, compute values of window_func1
      • sort the result according to part_expr2/order_expr2, compute values of window_func2



          Issue links



              • Assignee:
                igor Igor Babaev
                serg Sergei Golubchik
              • Votes:
                16 Vote for this issue
                19 Start watching this issue


                • Created: