SitePoint Sponsor

User Tag List

Results 1 to 13 of 13
  1. #1
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)

    Troubleshooting LOAD DATA INFILE

    Hi,

    I post this to the Database forum as I suspect that it has something to do with MySQL or the SQL written?

    I am using PDO as a driver and have been trying to troubleshoot why I canīt get "LOAD DATA INFILE" working.

    I have paired it down into a very simple example and can even get that working. I try output errors using
    PHP Code:
    echo "\nPDOStatement::errorCode(): ";
    print 
    $this->o_Db->errorCode(); 
    which generates a rather generic
    PDOStatement::errorCode(): 42000
    Error 42000 seems most often to be associated with invalid SQL but I have been very careful with the paths and the sql syntax, plus I escape (using ticks) all db field names.

    It is setup like this:
    PHP Code:
    $csv_path_and_filename '/var/www/eprs/libs/page_queries/test_data.csv';
    $temp_table_name $o_RandomString->getRandomString('__temp_'); // Generates a temporary table name like __temp_rjxrrhh

    $sql=<<<EOL
             LOAD DATA INFILE "$csv_path_and_filename"
             REPLACE INTO TABLE 
    $temp_table_name
             FIELDS TERMINATED BY ';'
             IGNORE LINES 1
             ( `date`, `name`)
    EOL;
    $result $this->o_Db->exec($sql);
    echo 
    "\nPDOStatement::errorCode(): ";
    print 
    $o_Db->errorCode(); 
    You may notice that I escaped the database field names using ticks as 'date' is a reserved word in SQL. Also I did try the SQL using OPTIONALLY ENCLOSED BY "\"" LINES TERMINATED BY "\\r\\n" and ENCLOSED BY "\"" LINES TERMINATED BY "\\n" as well as the absence of these lines as shown above.

    The database create function:
    PHP Code:
    function createTempTable($temp_table_name){ 
            
    $sql "
                CREATE TABLE 
    $temp_table_name (
                  date DATE
                  , name VARCHAR(150)
                 ) TYPE=innodb;
          
    $stmt = prepare($sql);
          
    $results = $o_Db->exec(($sql);
          return 
    $results;

    My CSV contains two data rows plus a 'title' row
    Code:
    Date;Name
    05/05/07;Bill
    05/05/09;Mike
    I have verified that the '/var/www/eprs/libs/page_queries/test_data.csv' file is located in that specified path on the server. I have also set the file and folder permissions to 0755 so that permissions should not be causing the problem

    It is a MySQL 5.0.51a-24 version and you can see in the create statement that it is a INNODB table. The server is Centos Linux

    Do you have any suggestions on what else to try?

    I would appreciate your help on this.

    Regards,
    Steve
    ictus==""

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    i think you will want to debug the LOAD DATA statement in isolation first, before placing it inside your php code

    you will want to ensure that all the options and features, such as LINES TERMINATED BY, are properly set up

    one thing i noticed is that you might have trouble with your date values if they are not in year-month-day sequence
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    LOAD DATA INFILE should have IGNORE 1 LINES not IGNORE LINES 1.

  4. #4
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by r937 View Post
    i think you will want to debug the LOAD DATA statement in isolation first, before placing it inside your php code

    you will want to ensure that all the options and features, such as LINES TERMINATED BY, are properly set up

    one thing i noticed is that you might have trouble with your date values if they are not in year-month-day sequence
    When inside Excel the date was set as 2012-11-31 but on save as it must of switched this; thank you for noticing I can force it back using date() in php. I did try to run this using mysql workbench and via the command line; however I am not sure if I was specifying the paths correctly in each of these as I was using and absolute path?

    Quote Originally Posted by rcashell View Post
    LOAD DATA INFILE should have IGNORE 1 LINES not IGNORE LINES 1.
    Thanks!

    I'll post all working code once I get there.

    Regards,
    Steve
    ictus==""

  5. #5
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    Ok,

    Both suggestions were needed to get the files importing. Interestingly in MySQL LOAD DATA INFILE documentation they sow examples using 'IGNORE LINES 1' but it does not work when I use that. The dates however, are still not working correctly.

    Rudy, I did as you suggested and isolated the SQL out of PHP, here is what it is now:
    Code:
      LOAD DATA INFILE "/var/www/ex/libs/page_queries/test_data2.csv"
      REPLACE INTO TABLE `__temp_ovgoptjint`
      FIELDS TERMINATED BY ';'
      IGNORE 1 LINES
      (@date, `name`)
      SET Date = date_format(STR_TO_DATE(@date, '%b %m, %Y'), '%Y-%m-%-d');
    The actual dates that need to be working are formatted as in the CSV file contents below:
    Code:
    Date;Name
    Sep 14, 2009;Bill
    Sep 18, 2009;Mike
    When I do this it inserts the dates correctly:
    Code:
      LOAD DATA INFILE "/var/www/ex/libs/page_queries/test_data2.csv"
      REPLACE INTO TABLE `__temp_ovgoptjint`
      FIELDS TERMINATED BY ';'
      IGNORE 1 LINES
      (`date`, `name`)
    While having this data
    Code:
    Date;Name
    2009-09-14;Bill
    2009-09-18;Mike
    When using the SET command my date column contains only NULL
    Code:
    SET Date = date_format(STR_TO_DATE(@date, '%b %m, %Y'), '%Y-%m-%-d');
    The way I read the line above is:
    1. We are setting the Date column with the variable @date as the variable value being operated upon
    2. The str_to_date() method should use the value of the date variable that is formatted the same way as the CSV files 'abbreviated month + space + day (as digits) + full month (as digits)
    3. the date_format() method should return a valid date value formatted in 2012-05-30.


    I also tried just str_to_date() like:
    Code:
    SET Date = STR_TO_DATE(@date, %Y-%m-%-d');
    This, however would not allow the statement to execute.

    Any further ideas on how I might get this date formatted correctly?

    Thanks,
    Steve
    ictus==""

  6. #6
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Are you sure about this:

    SET Date = STR_TO_DATE(@date, %Y-%m-%-d');

    You have a - before d and you left out the starting ' for the format string.

    SET Date = STR_TO_DATE(@date, '%Y-%m-%d');

  7. #7
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    Yes, sorry that was a typo... I know longer had it in the code to copy so I re-wrote it but I am sure that it does not work.
    Quote Originally Posted by rcashell View Post
    Are you sure about this:

    SET Date = STR_TO_DATE(@date, %Y-%m-%-d');

    You have a - before d and you left out the starting ' for the format string.

    SET Date = STR_TO_DATE(@date, '%Y-%m-%d');
    ictus==""

  8. #8
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    I can't mess around any longer with this so I decided to set the DATE field to VARCHAR instead. Before you have a Kiniption Fit, I'm importing this data into a temporary db. When I need to insert the DATE into permanent tables I will process the string date using PHP's date formatting and then insert it into a date field. This has the added benefit; if a CSV file that is being imported contains different date formats it would be very difficult to get the SQL to adapt, so In this case I can have more control on how dates are processed and still import a lot of records very quickly.

    Regards,
    Steve
    ictus==""

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    that's an awesome post, steve

    that's a strategy i've advocated before, a "landing" table for uploads using all VARCHARs as necessary, and reformating to permanent tables once the data is completely analyzed

    it's step 2 in ETL (extract, transform, load), assuming that "load" means loading cleansed data into permanent tables

    no reason why the transform step shouldn't happen within the database
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    Thanks Rudy

    I can see the value of this strategy, although this is the first time I've used such a thing.

    I am however a little disappointed that I couldn't get it working with the date filtering in the SQL, but the shear nature that I have to be so specific means that it can break easily if a different date format is thrown at it. So, your advocation for this strategy makes a lot of sense... just wish I had thought of it earlier
    ictus==""

  11. #11
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    not sure why this is not working for you. I have run a test here and I post the code below:


    $ cat /tmp/demo.txt
    Date;Name
    2009-09-14;Bill
    2009-09-18;Mike

    $ cat demo.sql
    LOAD DATA INFILE "/tmp/demo.txt"
    REPLACE INTO TABLE datetest
    FIELDS TERMINATED BY ';'
    IGNORE 1 LINES
    (@date, `name`)
    SET Date = STR_TO_DATE(@date, '%Y-%m-%d');


    mysql> create table datetest (`date` date, name varchar(45));
    Query OK, 0 rows affected (0.03 sec)

    mysql> source demo.sql
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

    mysql> select * from datetest;
    +------------+------+
    | date | name |
    +------------+------+
    | 2009-09-14 | Bill |
    | 2009-09-18 | Mike |
    +------------+------+
    2 rows in set (0.00 sec)

  12. #12
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    2 Thread(s)
    Hi,

    Try this CSV instead. I could import the csv/demo.txt you have it is when I tried this:

    Code:
    Date;Name
    Sep 14, 2009;Bill 
    Sep 18, 2009;Mike
    
    that it didn't work. The 2009-09-14 in your code is already in a format MySQL will accept so there is no need to convert it.

    Thanks for trying though

    Regards,
    Steve

    Quote Originally Posted by rcashell View Post
    Hi,

    not sure why this is not working for you. I have run a test here and I post the code below:


    $ cat /tmp/demo.txt
    Date;Name
    2009-09-14;Bill
    2009-09-18;Mike

    $ cat demo.sql
    LOAD DATA INFILE "/tmp/demo.txt"
    REPLACE INTO TABLE datetest
    FIELDS TERMINATED BY ';'
    IGNORE 1 LINES
    (@date, `name`)
    SET Date = STR_TO_DATE(@date, '%Y-%m-%d');


    mysql> create table datetest (`date` date, name varchar(45));
    Query OK, 0 rows affected (0.03 sec)

    mysql> source demo.sql
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

    mysql> select * from datetest;
    +------------+------+
    | date | name |
    +------------+------+
    | 2009-09-14 | Bill |
    | 2009-09-18 | Mike |
    +------------+------+
    2 rows in set (0.00 sec)
    ictus==""

  13. #13
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Hi Steve,

    this should now work. I thought you were looking at the 2009-09-14 date format.


    $ cat /tmp/demo.txt
    Date;Name
    Sep 14, 2009;Bill
    Sep 18, 2009;Mike

    $ cat demo.sql
    LOAD DATA INFILE "/tmp/demo.txt"
    REPLACE INTO TABLE datetest
    FIELDS TERMINATED BY ';'
    IGNORE 1 LINES
    (@date, `name`)
    SET Date = STR_TO_DATE(@date, '%b %d, %Y');


    mysql> truncate table datetest;
    Query OK, 0 rows affected (0.00 sec)

    mysql> source demo.sql
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

    mysql> select * from datetest;
    +------------+-------+
    | date | name |
    +------------+-------+
    | 2009-09-14 | Bill |
    | 2009-09-18 | Mike |
    +------------+-------+
    2 rows in set (0.00 sec)


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •