SitePoint Sponsor

User Tag List

Results 1 to 13 of 13
  1. #1
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Using stored procedures to move data between tables

    Hi guys,

    I've got to create a data import routine for a MySQL database. We're getting the data in using the CSV engine (tested and working great) but when the file is imported, I need to be able to then transfer that data in to a MyISAM table. I figure that we'd do something like:

    1. Get the first line from the CSV table
    2. Check whether this already exists in the MyISAM table
    3. If not, import in to the new table, else copy to a third table for investigation
    4. Check that the data imported OK
    5. Delete the original line from the CSV file

    Now, I've never used stored procedures, but it seems like an obvious solution to me. Alternatively, I'll need to create a command-line PHP script, which I don't really want to do because it's so much slower than just getting the DB server to do it.

    So, can anyone give me any tips? Once I'm up and started I should be able to finish it myself, but I've really got no idea how to start.

    Thanks all

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    writing "raat" code (row-at-a-time) like you're suggesting will, of course, work

    but if it's me, i would simply load the csv into a "landing" table (a table built to hold the data as is from the csv)

    then i would use a series of simple SQL statements to manipulate the data from there into the target tables
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    When you say a "landing" table, is that not what the CSV table will be anyway? I love that by dropping the file in to the data directory that we make either 10 or 10,000,000 records available to the database just as quickly. It's an excellent way to make the data available, but because it can't be indexed we need to move it in to the correct tables ASAP. Especially because there will be another file following shortly behind, and we don't want to overwrite the original data until it's all imported.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Antnee View Post
    When you say a "landing" table, is that not what the CSV table will be anyway?
    um, yes

    sorry, i didn't understand that the first time

    so you import the csv into a table, but then why do you write a "raat" procedure? why not handle all the rows in one SQL statement at a time?

    indexing wouldn't be useful because by definition you will want to handle all rows anyway
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    I don't do a RAAT, I just thought it might be the best way to go. The table looks like this:

    Code MySQL:
    CREATE TABLE `csv_in` (
      `sc` int(2) unsigned NOT NULL,
      `job` int(6) unsigned NOT NULL,
      `customer` varchar(25) NOT NULL,
      `ptype` varchar(3) NOT NULL,
      `make` varchar(4) NOT NULL,
      `model` varchar(16) NOT NULL,
      `sku` int(6) unsigned NOT NULL,
      `serial` varchar(20) NOT NULL,
      `sa` int(2) NOT NULL,
      `symptoms` varchar(30) NOT NULL,
      `cp` varchar(20) NOT NULL,
      `booked` datetime NOT NULL
    ) ENGINE=CSV

    All of that data needs to go in to the jobs table, with the exception of the surname, which goes in to a customer table (don't ask!). The job table has a lot more data in it than just this though, this data is just to prime the system when we get jobs in to the building, using data from another system when it was booked (the other system is a flat-file system running on an old AS/400 system and completely incompatible with MySQL).

    I was sent some example data and found that some times a job appeared more than once. Although I'd like to ensure that this doesn't happen at the source, I can't rely on that, so I need to make sure that I don't automatically overwrite the original job if it does in fact exist. There's no easy way to know whether the new data is usable, so I need someone to manually check it.

    Does that help?

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Antnee View Post
    ... so I need someone to manually check it.
    can't automate that, eh
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Not easily, no

  8. #8
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    So, are you able to give me a hint about the best way to create that SP?

  9. #9
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    I'm trying to work this out, being completely new to SPs, this is as far as I've got so far:

    Code MySQL:
    BEGIN
        DECLARE var_sc INT(2);
        DECLARE var_job INT(6);
        DECLARE var_ex CHAR(1);
        SELECT `sc`, `job` INTO var_sc, var_job FROM csv_in LIMIT 0,1;
        IF EXISTS (SELECT job FROM jobs WHERE job = var_job AND sc = var_sc)THEN
            SELECT 'Y' INTO var_ex;
        ELSE 
            SELECT 'N' INTO var_ex;
        END IF;
        SELECT var_ex AS `exists`;
    END

    Obviously I'll replace the SELECT queries in the IF EXISTS switch to actually put the data in one table or another, but I have a problem that stops me going any further already; I can only return one result. If I remove the LIMIT part of the original query then I get an error that I'm returning too many results. Is there a "WHILE" loop I can put it in like I would in PHP?

  10. #10
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    D'oh! What a mess of code! I think I've worked it out, I'll post in a mo. New functions available to me now!

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    LIMIT 1 sure looks like row-at-a-time logic to me
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    No, you're right, I could only get the SP to save if I had that limit in it, because I'm selecting in to a variable and that forces a 1 result limit anyway. I think I've sorted it now with a while loop, though I need to tidy it up still. I'm sure there's a neater way to do it anyway, but I'll post the code in a mo

  13. #13
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    England
    Posts
    702
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    OK, I've got it working as I want it to, but it's a bit messy and could really do with cleaning up:

    Code MySQL:
    BEGIN
        DECLARE c BIGINT(10);  
        DECLARE x BIGINT(10);
        DECLARE passed BIGINT(10);
        DECLARE failed BIGINT(10);
        SET x = 0;
        SET passed = 0;
        SET failed = 0;
        SELECT count(*) INTO c FROM csv_in;
        WHILE (x < c) DO
            SET @sc = 40;
            SET @job = 1;
            SET @customer = 'Test';
            SET @ptype = 'B01';
            SET @make = 'XXXX';
            SET @model = '*UNKNOWN';
            SET @sku = 963282;
            SET @sn = 'TO OBTAIN';
            SET @sa = 0;
            SET @symptoms = 'UNKNOWN';
            SET @cp = 0;
            SET @booked = now();
            SET @result = 1;
            SET @user    = 'WWW';
            SELECT
                `sc`, `job`, `customer`, `ptype`, `make`, `model`, `sku`, `serial`, `sa`, `symptoms`, `cp`, `booked`
            INTO
                @sc, @job, @customer, @ptype, @make, @model, @sku, @sn, @sa, @symptoms, @cp, @booked
            FROM csv_in ORDER BY booked ASC LIMIT 1;
            IF EXISTS (SELECT job FROM jobs WHERE job = @job) OR (date(@booked) > curdate()) THEN
                INSERT
                INTO csv_in_fail
                        (`sc`, `job`, `customer`, `ptype`, `make`, `model`, `sku`, `serial`, `sa`, `symptoms`, `cp`, `booked`, `failwhen`)
                VALUES
                        (@sc, @jrs, @customer, @ptype, @make, @model, @sku, @sn, @sa, @symptoms, @cp, @booked, now());
                SET failed = failed + 1;
            ELSE 
                INSERT
                INTO jobs
                        (`sc`, `job`, `cust_id`, `pt`, `make`, `model`, `sku`, `serial`, `sa`, `symptoms`, `cover`, `created`, `result`, `user`)
                VALUES
                        (@sc, @job, @job, @ptype, @make, @model, @sku, @sn, @sa, @symptoms,    @cp, @booked, @result, @user);
                INSERT
                INTO customer
                        (`sc`, `job`, `surname`)
                VALUES
                        (@sc, @job, @customer);
                SET passed = passed + 1;
            END IF;
            DELETE FROM csv_in ORDER BY booked ASC LIMIT 1;
            SET x = x + 1;
        END WHILE;
        SELECT c AS `total`, (passed + failed) AS `check` , passed, failed;
    END

    As I say, it works, though there's still some validation etc that I need to add. Lets just call this an early alpha to see if I could get it to work. Not too bad for a first ever effort at a SP is it? Honestly


Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •