Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: 10.2
    • Component/s: None
    • Labels: None

Description

Derived from "EXPLAIN query": a "TOKENIZE query" statement would print a result set like the following (for the query select c1 from t1 where c2 = 3):

TOKEN  | TOKEN_TYPE
-------+-----------
select | T_SQL
c1     | T_COLUMN
from   | T_SQL
t1     | T_TABLE
where  | T_SQL
c2     | T_COLUMN
=      | T_SQL
3      | T_CONST

Activity

              psergey Sergei Petrunia added a comment -

              The question is whether the server should be used to do it. MaxScale has a similar feature where it uses MariaDB's parser to parse the query and then replaces constants with '?'.

              psergey Sergei Petrunia added a comment - edited

              More details about how MaxScale does it:

              See skygw_get_canonical(). It walks through thd->free_list and replaces Item::STRING_ITEM, Item::INT_ITEM, Item::DECIMAL_ITEM, etc. with '?'.

              I have a doubt about how it does this, though. It calls replace_literal(). Is there a guarantee that it replaces the right occurrence of the literal?
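
              To make the concern concrete, here is a minimal standalone sketch of the
              walk-and-replace approach described above. Item, free_list and
              replace_literal() are simplified stand-ins modelled on this description,
              not the actual MaxScale or server code:

                  // Minimal model of the free_list walk described above; the types
                  // and functions are simplified stand-ins, not real server code.
                  #include <iostream>
                  #include <string>

                  enum Item_type { STRING_ITEM, INT_ITEM, DECIMAL_ITEM, OTHER_ITEM };

                  struct Item {
                    Item_type type;
                    std::string text;   // the literal's text, e.g. "3" or "'abc'"
                    Item *next;         // free_list is a singly-linked list
                  };

                  // Replace the first occurrence of `literal` in `query` with '?'.
                  // This is where the doubt lies: the first occurrence is not
                  // necessarily the occurrence this Item was parsed from.
                  static void replace_literal(std::string &query,
                                              const std::string &literal) {
                    std::string::size_type pos = query.find(literal);
                    if (pos != std::string::npos)
                      query.replace(pos, literal.size(), "?");
                  }

                  static std::string get_canonical(std::string query, Item *free_list) {
                    for (Item *it = free_list; it; it = it->next)
                      if (it->type == STRING_ITEM || it->type == INT_ITEM ||
                          it->type == DECIMAL_ITEM)
                        replace_literal(query, it->text);
                    return query;
                  }

                  int main() {
                    Item lit = {INT_ITEM, "1", nullptr};
                    // The literal "1" first matches inside the identifier "c1",
                    // so the wrong bytes get replaced:
                    std::cout << get_canonical("select c1 from t1 where c1 = 1", &lit)
                              << "\n";
                    // prints: select c? from t1 where c1 = 1
                  }

              Under this model the answer to the question above is "no": a plain
              find-and-replace can hit a substring of an identifier rather than the
              literal itself.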

              psergey Sergei Petrunia added a comment - edited

              The technique used by MaxScale to catch constants is not applicable to
              table/column names. The problem is that Item_field's db_name, table_name
              and field_name point to data in temporary buffers. They do not point into
              the query string.

              The copying is done in sql_lex.cc, in get_token() and get_quoted_token().

              So, if we want to have info about where "table.column" was located in the
              original query, it needs to be saved here.

              One way to save it would be to add another element into %union and then the
              lexer, instead of just doing assignments like

                    yylval->lex_str=get_token(lip, 0, length);
              

              should also save the data about the source's location.
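
              A minimal standalone sketch of that idea follows. LEX_STRING_POS and
              get_token_with_pos() are hypothetical names used for illustration; the
              real change would live in the %union in sql_yacc.yy and in
              get_token()/get_quoted_token() in sql_lex.cc:

                  // Hypothetical sketch: copy the token as get_token() does today,
                  // but also record the span it occupied in the original query.
                  #include <cstddef>
                  #include <cstring>

                  struct LEX_STRING {   // simplified stand-in for the server's type
                    char *str;
                    size_t length;
                  };

                  // The extra %union member this comment proposes: the copied
                  // token text plus its location in the query text.
                  struct LEX_STRING_POS {
                    LEX_STRING lex_str;
                    size_t start;       // byte offset of the token in the query
                    size_t end;         // one past the token's last byte
                  };

                  // Model of get_token(): the token text is copied into a separate
                  // buffer (so it no longer points into the query), and the source
                  // span is saved alongside it.
                  LEX_STRING_POS get_token_with_pos(const char *query,
                                                    size_t tok_start, size_t length) {
                    LEX_STRING_POS out;
                    out.lex_str.str = new char[length + 1];
                    memcpy(out.lex_str.str, query + tok_start, length);
                    out.lex_str.str[length] = '\0';
                    out.lex_str.length = length;
                    out.start = tok_start;
                    out.end = tok_start + length;
                    return out;
                  }

                  int main() {
                    const char *query = "select c1 from t1 where c2 = 3";
                    LEX_STRING_POS tok = get_token_with_pos(query, 7, 2);  // "c1"
                    // tok.lex_str.str == "c1", tok.start == 7, tok.end == 9
                    delete[] tok.lex_str.str;
                  }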

              psergey Sergei Petrunia added a comment -

              The above says how to get info about token locations from the lexer.

              The lexer itself doesn't know whether the tokens are table names, column
              names, or something else. So, we need to pass this info to the parser
              (sql_yacc.yy).

              In the parser, when we use a token as e.g. a table name, we could record that
              somewhere in THD. Then, after the parsing is complete, we would know which
              bytes in the original query text were table names or column names, or something
              else.
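
              A minimal standalone sketch of that recording step, assuming hypothetical
              names (Token_span, record_token(); THD here is a bare stand-in for the
              server's class):

                  // Hypothetical sketch: grammar actions report each classified
                  // token's byte range and role; TOKENIZE can later print the list.
                  #include <cstddef>
                  #include <vector>

                  enum Token_role { T_SQL, T_TABLE, T_COLUMN, T_CONST };

                  struct Token_span {
                    size_t start, end;  // byte range in the original query text
                    Token_role role;
                  };

                  struct THD {          // bare stand-in for the server's THD
                    std::vector<Token_span> token_spans;
                  };

                  // Called from a parser rule once a token's role is known, e.g.
                  // from the rule that reduces an identifier to a table name.
                  void record_token(THD *thd, size_t start, size_t end,
                                    Token_role role) {
                    thd->token_spans.push_back({start, end, role});
                  }

                  int main() {
                    THD thd;
                    // "select c1 from t1 where c2 = 3": "t1" occupies bytes 15..17
                    record_token(&thd, 15, 17, T_TABLE);
                    // After parsing, thd.token_spans holds the byte ranges that
                    // TOKENIZE would print, one row per token.
                  }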


People

    • Assignee: Unassigned
    • Reporter: stephane@skysql.com VAROQUI Stephane
    • Votes: 6
    • Watchers: 4

Dates

    • Created:
    • Updated: