  1. MariaDB Server
  2. MDEV-7030

CONNECT Engine: Errors in Path Substitution, when using file_name option

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.14, 10.1.1
    • Fix Version/s: 10.0.15
    • Labels:
      None
    • Environment:
      Windows 7 X64 German

      Description

      I tried MariaDB 10.0.14 and 10.1.1 on Windows 7.
      I want to read a CSV file via the CONNECT engine, but I could not specify a correct path for the file.
      The output of my tests:

      D:\Maria\TEST>dir
      
      23.10.2014  15:42    <DIR>          .
      23.10.2014  15:42    <DIR>          ..
      22.10.2014  11:56             3.484 F80_wochenverlauf.csv
      23.10.2014  15:42               616 testcsv.sql
      
      create table shu.csvtest ( 
        KALWOCHE char(8) NOT NULL,
        SUMME    int(7) NOT NULL,
        AVGDAY    int(6) NOT NULL,
        AVGHOUR   int(6) NOT NULL,    
        MAXPERHOUR int(6) NOT NULL,
        AVGTIME    double  NOT NULL,
        DAYS       int NOT NULL             
      ) engine=CONNECT table_type=CSV file_name="D:\Maria\TEST\F80_wochenverlauf.csv" header=0 sep_char=';';
      
      select * from shu.csvtest;
      -> Open(rt) error 2 on D:MariaTESTF80_wochenverlauf.csv: No such file or directory
      
      ) engine=CONNECT table_type=CSV file_name='D:\\Maria\\TEST\\F80_wochenverlauf.csv' header=0 sep_char=';';
      select * from shu.csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: Bad file descriptor' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name='D:/Maria/TEST/F80_wochenverlauf.csv' header=0 sep_char=';';
      select * from shu.csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: Bad file descriptor' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name="\\Maria\\TEST\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: No such file or directory' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name="\Maria\\TEST\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from shu.csvtest;
      Open(rt) error 2 on D:\Maria\data\shu\Maria\TEST\F80_wochenverlauf.csv: No such file or directory
      
      ) engine=CONNECT table_type=CSV file_name="\Maria\\..\\..\\TEST\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from shu.csvtest;
      Open(rt) error 2 on D:\Maria\data\TEST\F80_wochenverlauf.csv: No such file or directory
      
      ) engine=CONNECT table_type=CSV file_name="\Maria\\..\\..\\..\\TEST\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: Bad file descriptor' from CONNECT */
      

      Next try: copying the file to DATA_DIR:

      D:\Maria\data\shu>dir
      
      06.11.2014  08:54    <DIR>          .
      06.11.2014  08:54    <DIR>          ..
      23.10.2014  07:56                61 db.opt
      24.10.2014  11:56             3.484 F80_wochenverlauf.csv
      
      ) engine=CONNECT table_type=CSV file_name='F80_wochenverlauf.csv' header=0 sep_char=';';
      select * from shu.csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: No such file or directory' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name='.\F80_wochenverlauf.csv' header=0 sep_char=';';
      select * from shu.csvtest;
      Open(rt) error 2 on D:\Maria\data\shu\.F80_wochenverlauf.csv: No such file or directory
      
      ) engine=CONNECT table_type=CSV file_name="..\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: Bad file descriptor' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name="\Maria\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from shu.csvtest;
      Open(rt) error 2 on D:\Maria\data\shu\Maria\F80_wochenverlauf.csv: No such file or directory
      
      ) engine=CONNECT table_type=CSV file_name="\Maria\\..\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from csvtest;
      /* SQL Error (1296): Got error 122 'ftell error for recd=0: No such file or directory' from CONNECT */
      
      ) engine=CONNECT table_type=CSV file_name="\\F80_wochenverlauf.csv" header=0 sep_char=';';
      select * from csvtest;
      Open(rt) error 2 on D:\F80_wochenverlauf.csv: No such file or directory
      

            Activity

            serg Sergei Golubchik added a comment - - edited

            What exactly do you think is wrong here?

            "No such file or directory" error looks correct — you have to use slashes or double every backslash. When you do that the file is, indeed, found. And then the next error "Got error 122 'ftell error for recd=0: Bad file descriptor'" starts appearing. This second error looks like a bug to me.
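The effect of single versus doubled backslashes can be reproduced outside the server. The following is a minimal sketch, assuming the default SQL mode (NO_BACKSLASH_ESCAPES not set) and a simplified version of the string-literal escape rules; `unescape_sql_literal` is a hypothetical helper written for illustration, not a MariaDB or CONNECT function.

```cpp
#include <cstddef>
#include <string>

// Simplified model of backslash handling in a MariaDB string literal
// (assumes the default sql_mode, without NO_BACKSLASH_ESCAPES).
// Hypothetical helper for illustration only; not part of MariaDB.
std::string unescape_sql_literal(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        // Ordinary characters, and a lone backslash at the very end,
        // are copied unchanged.
        if (in[i] != '\\' || i + 1 == in.size()) {
            out += in[i];
            continue;
        }
        const char next = in[++i];
        switch (next) {
            case '0': out += '\0';   break;  // NUL
            case 'b': out += '\b';   break;  // backspace
            case 'n': out += '\n';   break;  // newline
            case 'r': out += '\r';   break;  // carriage return
            case 't': out += '\t';   break;  // tab
            case 'Z': out += '\x1a'; break;  // Ctrl-Z
            case '%':
            case '_':
                // Kept together with the backslash (used by LIKE patterns).
                out += '\\';
                out += next;
                break;
            default:
                // Any other character loses its backslash, so "\M" becomes
                // "M" and a doubled backslash becomes a single one.
                out += next;
                break;
        }
    }
    return out;
}
```

Fed the literal from the first CREATE TABLE in the description, this sketch yields `D:MariaTESTF80_wochenverlauf.csv`, the same path that appears in the reported "Open(rt) error 2" message. The doubled-backslash and forward-slash variants come through as usable Windows paths.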

            bertrandop Olivier Bertrand added a comment - - edited

            Not finding the file is not a bug; it is clearly due to a bad file_name option value, such as using single backslashes instead of escaping them. The real problem is the ftell error. By chance, I was hit by the same problem a few days ago and it took me quite long to understand what happened. (By the way, I tried to avoid it by specifying mapped=1, and in my case the result was a server crash.)

            The problem comes from Windows ending lines with CRLF while Unix uses just LF. When a file is opened in text mode, Windows replaces these \r\n endings with \n on reading. However, the ftell/fseek process becomes more complicated because Windows tries to compensate for the difference between the length of the data in the read buffers and that of the actual file. The result is that ftell becomes a kludge that analyses the whole file to find the number of existing \n and subtracts 1 from the result each time for the missing \r.

            This is fine for Windows files, but if the file was imported from Unix and has no \r, ftell returns a wrong value, which is regarded as an error if negative. In that case errno is not set (as far as Windows is concerned there was no error), so the message can say anything from "no error" to "Bad file descriptor" or "No such file or directory".

            If this is the case (your file is a Unix file), the correct ENDING value must be specified:

            ) engine=CONNECT table_type=CSV file_name='D:/Maria/TEST/F80_wochenverlauf.csv' header=0 sep_char=';' ending=1;
            

            Doing so, CONNECT does not open the file in text mode but in bin mode.
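To decide which ENDING value a given file needs, its line terminator can be inspected in binary mode. This is a rough sketch, not CONNECT code: `read_binary` and `detect_ending` are hypothetical helpers, and the mapping of CRLF to the value 2 is an assumption based on ENDING being the length of the line terminator (only ending=1 for Unix files is confirmed above).

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Reads a whole file without any text-mode newline translation.
// Hypothetical helper; returns an empty string if the file cannot be opened.
std::string read_binary(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Looks at the first line break only:
//   returns 1 for a bare LF (Unix style),
//   returns 2 for CRLF (Windows style; assumed ENDING value),
//   returns 0 if the data contains no line break at all.
int detect_ending(const std::string& bytes) {
    const std::size_t pos = bytes.find('\n');
    if (pos == std::string::npos) {
        return 0;
    }
    return (pos > 0 && bytes[pos - 1] == '\r') ? 2 : 1;
}
```

Calling `detect_ending(read_binary(path))` on the CSV file before creating the table would show whether ending=1 is the right choice. Because only the first line break is examined, a file with mixed endings would need a full scan.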

            The fix I intend to include in the next version of CONNECT will just add the warning " (possible wrong ENDING option value)" to the returned message, and fix the way line endings are handled when using file mapping.

            I shall investigate to see if a better fix is possible.

            bertrandop Olivier Bertrand added a comment - - edited

            Finally I found that not using text mode at all seems to be the best solution. In addition, it also handles the case of files that could have mixed line endings. CONNECT just takes care of the end of lines in reading without relying on the ENDING option setting, which is still used to know how to terminate lines when writing.

            If you can compile your version, you can test that fix by changing line 540 in filamtxt.cpp from:

              Bin = (Blocked || Ending != CRLF);
            

            to:

              Bin = true;             // To avoid ftell problems
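The idea of reading in binary mode and handling line ends in the reader itself can be sketched as follows. This is only an illustration of the approach, assuming the file contents are already in memory; `split_lines` is a hypothetical function and does not reproduce the actual CONNECT implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits raw bytes (as read in binary mode) into lines, accepting both
// LF and CRLF terminators, even when they are mixed in the same data.
// Hypothetical helper for illustration; not the code used in filamtxt.cpp.
std::vector<std::string> split_lines(const std::string& bytes) {
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < bytes.size()) {
        const std::size_t nl = bytes.find('\n', start);
        const bool last = (nl == std::string::npos);
        const std::size_t end = last ? bytes.size() : nl;
        std::size_t len = end - start;
        // Drop the CR of a CRLF pair so both styles give the same line.
        if (len > 0 && bytes[end - 1] == '\r') {
            --len;
        }
        lines.emplace_back(bytes, start, len);
        start = last ? bytes.size() : nl + 1;
    }
    return lines;
}
```

Since no text-mode translation is involved, byte offsets into the data always match offsets in the file, which is the property that ftell lost in the failing case.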
            

            Let me know if this also works for you so I can close this issue.

            matthias Schumacher added a comment -

            ) engine=CONNECT table_type=CSV file_name="D:\\Maria\\TEST\\F80_wochenverlauf.csv" header=0 sep_char=';' ending=1;

            works, thanks.
            And yes, my test file is a Unix-style file. I use the same file for tests with MariaDB on Linux.
            I forgot this, mea culpa.

            Thanks

            bertrandop Olivier Bertrand added a comment -

            The fix is to always use bin mode.


              People

              • Assignee: bertrandop Olivier Bertrand
              • Reporter: matthias Schumacher

              Time Tracking

              • Estimated: Not Specified
              • Remaining: 0m
              • Logged: 4h