[backport] Implement UNION ALL without usage of a temporary table (MWL#44)


Currently when any union operation is executed the rows received from its
operands are always sent to a temporary table. Meanwhile for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause it
is not necessary. In this case the rows could be sent directly to the client.
The goal of this task is to provide such an implementation of UNION ALL
operation that would not use temporary table at all in certain, most usable cases.


1. Handling union operations in MySQL Server

1.1. Specifics of MySQL union operations

UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:

It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:

A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
A union unit can be executed as a query. It also can be used as a subquery.
A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.

1.2 Validation of union units

When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
The function first validate the unit in the method SELECT_LEX_UNIT:repare.
The method first validates each of the select constructs of the unit and then it
checks that all select are compatible. The method checks that the selects return
the same number of columns and for each set of columns with the same number k
there is a type to which the types of the columns can be coerced. This type is
considered as the type of column k of the result set returned by the union unit.
For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively then the second column of the union unit will be
of the type double. If the types of the columns c1,c2,c3 are specified as
varchar(10), varchar(20), varchar(10) then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations
then a collation from which all these collations can be derived is looked for
and it is assigned as the
collation of the third column in the result set.
After compatibility of the corresponding select columns has been checked and the
types of the columns from of the result set have been determined the method
SELECT_LEX_UNIT:repare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT:repare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.

1.3 Execution of union units

After SELECT_LEX_UNIT:repare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select first is optimized with JOIN::optimize(), then it's executed with
JOIN::exec().The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added, for UNION ALL operations all
records are added. It is achieved by enabling and disabling usage of the unique
index defined on all fields of the temporary table. The index is never used if
only UINION ALL operation occurs in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For a row it receives from the currently executed select the method first stores
the fields of the row in in the fields of the record buffer of the temporary
table. To do this the method calls function fill_record. All needed type
conversions of the field values are performed when they are stored the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible error on duplicate key that occurs with an attempt to write a duplicate
row is ignored.
After all rows received from all selects have been placed into the temporary
table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to result of the union unit then the
rows read from the temporary table have to be sorted first.

2. Optimizations improving performance of UNION ALL operations

The following three optimizations are proposed to be implemented in the
framework of this task.

2.1 Execution of UNION ALL without temporary table

If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. It would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In the cases when the result set is
big enough and the temporary table cannot be allocated in the main memory the
performance gains would be significant. Besides, the client could get the first
result rows at once as it would not have to wait until all selects have been
To make an UNION ALL operation not to send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object of the class that
we'll call select_union_send (by analogy with the class select_send) shall
inherit from the select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select , while the method send_eof shall
signal about the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build internal
structures needed for select_unionsend::send_data. So, the definition of the
class select_union_send should look like this:

2.2. Avoiding unnecessary copying

If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says what fields require conversions, and what don't . Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data what fields should be sent to
the record buffer for type conversion and what can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.

2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL

If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in more efficient way than it's done in the
current implementation. More exactly, the rows from any select that follows
after the second operand of the last UNION operations could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed, another, of the type select_union_send, is shared by the the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
undergo a serious re-work.

3. Other possible optimizations for union units

The following optimizations are not supposed to be implemented in the framework
this task.
1. For a union unit containing only UNION ALL with an ORDER BY send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without ORDER BY clause send
any row received from an operand of a UNION operation directly to the output
stream as soon as it has been checked by a lookup in the temporary table that
it's not a duplicate.
3. Not to use temporary table for any union unit used in EXIST or IN subquery.


Igor Babaev


Rasmus Johansson


Time tracking



Fix versions