Fix Version/s: None
Window functions are evaluated on the result set after the WHERE, GROUP BY and HAVING have been applied.
- If we are operating after a temptable-based grouping operation, we can read its output (and apply HAVING on the fly)
- if we are operating after a Join operation, or non-temptable based grouping, we will need to store the output of previous operation in a temptable and then work from there.
Let's start with one window function:
Window function is evaluated for each row of the result set. It is a function of row's partition, we need to look at the rows of the resultset that are ordered according to order_expr. $window_spec specifies at which rows we need to look. Sometimes it's certain preceding rows, sometimes it's certain following rows (and sometimes it's both?)
Execution can be done as follows:
- sort resultset rows by (part_expr, order_expr).
- scan each partition (either in reverse or forward order) and compute the value of window function on the fly.
- How to handle the context. If a window function depends on preceding and following rows, how do we keep track of those?
- Whether we should use files or temp.tables for storing the sorted resultset (we will always have to store it, because filesort doesn't support pull-based reading of its resultset)
A query may use multiple window functions which use different PARTITION BY / ORDER BY expressions:
The query should be evaluated as follows:
- sort the result according to part_expr1/order_expr1, compute values of window_func1
- sort the result according to part_expr2/order_expr2, compute values of window_func2
- a tutorial http://sqlschool.modeanalytics.com/advanced/window-functions.html
- SQL standard draft: see 5CD2-02-Foundation-2006-01.pdf
- pg. 62 - 4.15.3 window functions
- pg. 204 - <window function>