Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5539

Empty results in UNION with Sphinx engine

    Details

      Description

      Hello,

      We have a strange behaviour with a query with 2 UNION.

      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.branchid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.criteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
      FROM
      (
      SELECT si.subcriteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id;
      

      The first UNION returns 439 rows.
      The last query returns an empty sets.

      As a result of this 2 UNION we have an empty set instead of 439 rows.

      If we move the last query in first position, it's ok, we have our 439 rows.

      SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
      FROM
      (
      SELECT si.subcriteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.branchid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.criteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id;
      

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              serg Sergei Golubchik added a comment -

              Can you try to create a test case for this? I suppose, your database is too large, but, perhaps, you could repeat the bug on a smaller dataset? Or without SphinxSE at all?

              Show
              serg Sergei Golubchik added a comment - Can you try to create a test case for this? I suppose, your database is too large, but, perhaps, you could repeat the bug on a smaller dataset? Or without SphinxSE at all?
              Hide
              llafage Laurent Lafage added a comment - - edited

              Hello,

              Here is a generic test case if you use sphinxsearch:

              This query returns around 500 results:

              -- empty set
              SELECT a.*
              FROM
              (
              SELECT *
              FROM sphinxindexad si
              WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
              ) AS a
              UNION
              -- 500 rows in set 
              SELECT b.*
              FROM
              (
              SELECT *
              FROM sphinxindexad si
              WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
              ) AS b
              ;
              
              -> 500 rows in set 
              

              if we invert the 2 WHERE clauses, we have an empty result, the following query will return an empty result:

              -- 500 rows in set 
              SELECT a.*
              FROM
              (
              SELECT *
              FROM sphinxindexad si
              WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
              ) AS a
              UNION
              -- Empty set
              SELECT b.*
              FROM
              (
              SELECT *
              FROM sphinxindexad si
              WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
              ) AS b
              ;
              
              -> Empty set
              

              Hope this will help.

              With best regards

              Show
              llafage Laurent Lafage added a comment - - edited Hello, Here is a generic test case if you use sphinxsearch: This query returns around 500 results: -- empty set SELECT a.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION -- 500 rows in set SELECT b.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS b ; -> 500 rows in set if we invert the 2 WHERE clauses, we have an empty result, the following query will return an empty result: -- 500 rows in set SELECT a.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION -- Empty set SELECT b.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS b ; -> Empty set Hope this will help. With best regards
              Hide
              elenst Elena Stepanova added a comment -

              To reproduce in MTR, put the following test case into suite/sphinx folder:

              eval create table ts ( id bigint unsigned not null, w int not null, query varchar(255) not null, index(query) ) engine=sphinx connection="sphinx://127.0.0.1:$SPHINXSEARCH_PORT/*";
              
              SET optimizer_switch = REPLACE( @@optimizer_switch, '=on', '=off' );
              
              SELECT a.*
              FROM
              (
              SELECT *
              FROM ts si
              WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
              ) AS a
              UNION
              SELECT b.*
              FROM
              (
              SELECT *
              FROM ts si
              WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
              ) AS b
              ;
              
              SELECT a.*
              FROM
              (
              SELECT *
              FROM ts si
              WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
              ) AS a
              UNION
              SELECT b.*
              FROM
              (
              SELECT *
              FROM ts si
              WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
              ) AS b
              ;
              
              drop table ts;
              

              The first query returns an empty set, and the second one returns 4 rows.
              Same happens with views instead of subqueries.
              In 5.2, both queries return 4 rows, so I assume the problem is not on sphinx side.

              EXPLAINs:

              id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
              1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
              2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
              3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
              4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
              NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
              Warnings:
              Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `b`
              
              id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
              1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
              2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
              3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
              4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
              NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
              Warnings:
              Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `b`
              
              Show
              elenst Elena Stepanova added a comment - To reproduce in MTR, put the following test case into suite/sphinx folder: eval create table ts ( id bigint unsigned not null, w int not null, query varchar(255) not null, index(query) ) engine=sphinx connection= "sphinx://127.0.0.1:$SPHINXSEARCH_PORT/*" ; SET optimizer_switch = REPLACE( @@optimizer_switch, '=on', '=off' ); SELECT a.* FROM ( SELECT * FROM ts si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION SELECT b.* FROM ( SELECT * FROM ts si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS b ; SELECT a.* FROM ( SELECT * FROM ts si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION SELECT b.* FROM ( SELECT * FROM ts si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS b ; drop table ts; The first query returns an empty set, and the second one returns 4 rows. Same happens with views instead of subqueries. In 5.2, both queries return 4 rows, so I assume the problem is not on sphinx side. EXPLAINs: id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 3 100.00 2 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500') 3 UNION <derived4> ALL NULL NULL NULL NULL 3 100.00 4 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500') NULL UNION RESULT <union1,3> ALL NULL NULL NULL NULL NULL NULL Warnings: Note 1003 select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `b` id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 3 100.00 2 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500') 3 UNION <derived4> ALL NULL NULL NULL NULL 3 100.00 4 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500') NULL UNION RESULT <union1,3> ALL NULL NULL NULL NULL NULL NULL Warnings: Note 1003 select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `b`
              Hide
              serg Sergei Golubchik added a comment -

              It's a bug in Sphinx. It can only handle one sphinx-query per connection at any given point in time. But in this UNION query you have two. So MariaDB

              1. pushes the first query (say, @* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500) to SphinxSE. It remembers it.
              2. pushes the second query (;mode=extended2;limit=1000000;maxmatches=500) into SphinxSE. It remembers it in the same slot, overwriting the first query.

              So, instead of performing two different sphinx searches and merging the results, it only performs one ­— whichever happens to be the last.

              Why it worked in 5.2? Perhaps in 5.2 the execution sequence was different, and the server 1) prepared the first part of the UNION, 2) executed it, 3) prepared the second part of the UNION 4) executed it, etc. That would've worked in SphinxSE. And in 5.3 MariaDB first prepares all parts in a UNION, and then executes them all. That doesn't work very well in SphinxSE.

              Anyway, this fix seem to require big changes in SphinxSE, we cannot do it. Please, report this bug to Sphinx developers — when they fix it, we will merge a fixed version into MariaDB.

              Show
              serg Sergei Golubchik added a comment - It's a bug in Sphinx. It can only handle one sphinx-query per connection at any given point in time. But in this UNION query you have two. So MariaDB pushes the first query (say, @* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500 ) to SphinxSE. It remembers it. pushes the second query ( ;mode=extended2;limit=1000000;maxmatches=500 ) into SphinxSE. It remembers it in the same slot, overwriting the first query. So, instead of performing two different sphinx searches and merging the results, it only performs one ­— whichever happens to be the last. Why it worked in 5.2? Perhaps in 5.2 the execution sequence was different, and the server 1) prepared the first part of the UNION, 2) executed it, 3) prepared the second part of the UNION 4) executed it, etc. That would've worked in SphinxSE. And in 5.3 MariaDB first prepares all parts in a UNION, and then executes them all. That doesn't work very well in SphinxSE. Anyway, this fix seem to require big changes in SphinxSE, we cannot do it. Please, report this bug to Sphinx developers — when they fix it, we will merge a fixed version into MariaDB.
              Hide
              elenst Elena Stepanova added a comment -

              Both upstream (Sphinx) issues were fixed in 2.1.10. We currently have 2.1.9, the latest downloadable release at sphinxsearch is 2.2.6.

              Show
              elenst Elena Stepanova added a comment - Both upstream (Sphinx) issues were fixed in 2.1.10. We currently have 2.1.9, the latest downloadable release at sphinxsearch is 2.2.6.
              Hide
              monty Michael Widenius added a comment -

              Serg upgraded sphinx to 2.2.6 in MariaDB 10.0, which fixed the issue.

              Show
              monty Michael Widenius added a comment - Serg upgraded sphinx to 2.2.6 in MariaDB 10.0, which fixed the issue.

                People

                • Assignee:
                  Unassigned
                  Reporter:
                  llafage Laurent Lafage
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  4 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved:

                    Time Tracking

                    Estimated:
                    Original Estimate - 0 minutes
                    0m
                    Remaining:
                    Remaining Estimate - 0 minutes
                    0m
                    Logged:
                    Time Spent - 30 minutes
                    30m