MariaDB Server / MDEV-7144

Warnings "bytes lost" during server shutdown after running connect.part_file test in buildbot


      Description

      Buildbot example: http://buildbot.askmonty.org/buildbot/builders/kvm-dgcov-jaunty-i386/builds/4449/steps/test_2/logs/stdio

      I can reproduce the problem by running the test with valgrind on a debug build. The valgrind report comes out OK, but the warnings are generated on shutdown. It happens almost every time I run the test:

      perl ./mtr connect.part_file  --valgrind-mysqld
      Logging: ./mtr  connect.part_file --valgrind-mysqld
      ...
      
      worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
      connect.part_file                        [ pass ]   4482
      ***Warnings generated in error logs during shutdown after running tests: connect.part_file
      
      Warning: 65536 bytes lost at 0xfaea070, allocated by T@0 at 
      valgrind_report                          [ pass ]       
      --------------------------------------------------------------------------
      The servers were restarted 0 times
      Spent 4.482 of 31 seconds executing testcases
      
      Warnings in log: All 2 tests were successful.
      
      Errors/warnings were found in logfiles during server shutdown after running the
      following sequence(s) of tests:
          connect.part_file
      mysql-test-run: *** ERROR: There where errors/warnings in server logs after running test cases.
      


              Activity

              bertrandop Olivier Bertrand added a comment -

              Elena,

              "If you mean that this test should be run with the mentioned change in TDBCSV::PrepareWriting"

              I didn't mean that. Running the small test without the change is just to check whether the warning is raised because of that particular UPDATE command. What you could do is to split the original test in parts to check which command(s) are causing the warning.

              It would also be worth checking with the partitioned table created using another engine.

              Thanks and regards,
              Olivier

              elenst Elena Stepanova added a comment -

              Hi Olivier,

              No, I am not getting the warnings on your test case. Here is the minimal one which triggers the "bytes lost" warnings for me (they don't look quite the same as the initial one, but I suppose that's just because there are more of them; the first of the three warnings could be the same as the initial, truncated one):

              
              --source include/have_partition.inc
              CREATE TABLE t1 (
                id INT NOT NULL,
                msg VARCHAR(32)
              ) ENGINE=CONNECT TABLE_TYPE=CSV AVG_ROW_LENGTH=10
              PARTITION BY RANGE(id) (
              PARTITION first VALUES LESS THAN(10),
              PARTITION middle VALUES LESS THAN(50),
              PARTITION last VALUES LESS THAN(MAXVALUE));
              --error ER_GET_ERRMSG
              UPDATE t1 set id = 41 WHERE msg = 'four';
              
              perl ./mtr part_file2  --valgrind-mysqld --mysqld=--plugin-load-add=ha_connect
              
              ...
              CREATE TABLE t1 (
              id INT NOT NULL,
              msg VARCHAR(32)
              ) ENGINE=CONNECT TABLE_TYPE=CSV AVG_ROW_LENGTH=10
              PARTITION BY RANGE(id) (
              PARTITION first VALUES LESS THAN(10),
              PARTITION middle VALUES LESS THAN(50),
              PARTITION last VALUES LESS THAN(MAXVALUE));
              UPDATE t1 set id = 41 WHERE msg = 'four';
              ERROR HY000: Got error 174 'Cannot update column id because it is used for partitioning' from CONNECT
              ***Warnings generated in error logs during shutdown after running tests: main.part_file2
              
              Warning: 65536 bytes lost at 0xfaa3070, allocated by T@0 at mysys/mf_iocache.c:231, mysys/mf_cache.c:69, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
              Warning:    8 bytes lost at 0xf850b70, allocated by T@0 at mysys/my_malloc.c:239, mysys/mf_cache.c:65, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
              Warning:   48 bytes lost at 0xf84ffb0, allocated by T@0 at mysys/my_malloc.c:239, mysys/mf_cache.c:64, sql/sql_update.cc:580, sql/sql_parse.cc:3295, sql/sql_parse.cc:6407, sql/sql_parse.cc:1301, sql/sql_parse.cc:996, sql/sql_connect.cc:1379
              
              mysql-test-run: *** ERROR: Test suite aborted
              

              I tried the same script with the MyISAM, InnoDB, and Archive engines (the latter is a dynamic library, like CONNECT); no warnings there.
              Of course, the other difference is that none of them returns an error on the UPDATE; that is CONNECT-specific, so maybe the warnings are somehow related to it.

              bertrandop Olivier Bertrand added a comment -

              Elena,

              Great! Thanks to this information I have a better idea of what is happening.
              It begins in mysql_update (in sql_update.cc):

              int mysql_update:
              ...
                /*
                  We are doing a search on a key that is updated. In this case
                  we go through the matching rows, save a pointer to them and
                  update these in a separate loop based on the pointer.
                */
                    IO_CACHE tempfile;
                    if (open_cached_file(&tempfile, mysql_tmpdir,TEMP_PREFIX,
                       DISK_BUFFER_SIZE, MYF(MY_WME)))
                goto err;
              
                    /* If quick select is used, initialize it before retrieving rows. */
                    if (select && select->quick && select->quick->reset())
                    {
                      close_cached_file(&tempfile);
                      goto err;
                    }
                    table->file->try_semi_consistent_read(1);
              
                    /*
                      When we get here, we have one of the following options:
                      A. query_plan.index == MAX_KEY
                         This means we should use full table scan, and start it with
                         init_read_record call
                      B. query_plan.index != MAX_KEY
                         B.1 quick select is used, start the scan with init_read_record
                         B.2 quick select is not used, this is full index scan (with LIMIT)
                             Full index scan must be started with init_read_record_idx
                    */
              
                    if (query_plan.index == MAX_KEY || (select && select->quick))
                    {
                      if (init_read_record(&info, thd, table, select, 0, 1, FALSE))
                        goto err;
                    }
                    else
                      init_read_record_idx(&info, thd, table, 1, query_plan.index, reverse);
              

              It calls open_cached_file that calls init_io_cache

              int init_io_cache:
              ...
                    if ((info->buffer= (uchar*) my_malloc(buffer_block, flags)) != 0)
              ...
              

              This is where the buffer of 65536 bytes is allocated. When updating is done,
              for instance for a MyISAM table, there is a call to close_cached_file

              void close_cached_file(IO_CACHE *cache)
              {
                DBUG_ENTER("close_cached_file");
                if (my_b_inited(cache))
                {
                  File file=cache->file;
                  cache->file= -1;        /* Don't flush data */
                  (void) end_io_cache(cache);
                  if (file >= 0)
                  {
                    (void) my_close(file,MYF(0));
              #ifdef CANT_DELETE_OPEN_FILES
                    if (cache->file_name)
                    {
                (void) my_delete(cache->file_name,MYF(MY_WME | ME_NOINPUT));
                my_free(cache->file_name);
                    }
              #endif
                  }
                  my_free(cache->dir);
                  my_free(cache->prefix);
                }
                DBUG_VOID_RETURN;
              }
              

              and in end_io_cache

              int end_io_cache(IO_CACHE *info)
              ...
                  my_free(info->buffer);
              ...
              

              the allocated memory is freed.

              However, in the case of a CONNECT table, calling init_read_record goes through
              several functions until it reaches ha_connect::rnd_init

              int ha_connect::rnd_init(bool scan)
              ...
                if (OpenTable(g, xmod == MODE_DELETE))
                  DBUG_RETURN(HA_ERR_INITIALIZATION);
              

              and in ha_connect::OpenTable

              int ha_connect::OpenTable(PGLOBAL g, bool del)
              ...
                        if (part_id && bitmap_is_set(part_id, fp->field_index)) {
                          // Trying to update a column used for partitioning
                          // This cannot be currently done because it may require
                          // a row to be moved in another partition.
                          sprintf(g->Message,
                            "Cannot update column %s because it is used for partitioning",
                            p);
                          return HA_ERR_INTERNAL_ERROR;
                          } // endif part_id
              

              In mysql_update, this causes the statement:

                      if (init_read_record(&info, thd, table, select, 0, 1, FALSE))
                        goto err;
              ...
              err:
                delete select;
                free_underlaid_joins(thd, select_lex);
                table->disable_keyread();
                thd->abort_on_warning= 0;
                DBUG_RETURN(1);
              

              to take the error path. In delete select, close_cached_file is indeed called,
              but the cache passed to it is not the one that was allocated earlier, so the
              function does nothing (my_b_inited(cache) is false).
              This explains the warnings: the memory really is lost.

              Is this a CONNECT bug or a MariaDB bug? I can't tell, but it could happen with
              any engine that returns an error at this point.

              Perhaps adding the HA_NOT_DELETE_WITH_CACHE flag to CONNECT would make the
              problem go away, but that would be masking it rather than really fixing it.

              elenst Elena Stepanova added a comment -

              Is this a CONNECT bug or a MariaDB bug?

              It's probably best to ask Sergei Golubchik about it.

              serg Sergei Golubchik added a comment (edited)

              Looks like a MariaDB bug to me. Another goto err a few lines above is correctly accompanied by close_cached_file(&tempfile).

              I'll fix it. Thanks for pointing out exactly what line to fix!


                People

                • Assignee:
                  serg Sergei Golubchik
                • Reporter:
                  elenst Elena Stepanova