SitePoint Sponsor

User Tag List

Results 1 to 9 of 9
  1. #1
    SitePoint Zealot
    Join Date
    May 2007
    Posts
    109
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Tricky select query

    Hi,

    I have a table containing infos about bands (table 1), for example the ids of the releases the band has done. This table is not normalised, and there is some reason for it.
    Now I have to create a normalised table out of this, and because it's a very big table, I want to get do it with only 1 query, so there should not be any php code or so involved. I will insert the result into another table, but that's no problem. I just need the select statement to get table 2 out of table 1.
    I tried out a long time but didn't get how to do this.

    Thanks for help!

    table 1
    Code MySQL:
    id | band_id    | release_ids
    1  | 1          | 1,2,3
    2  | 2          | 4,5,6

    table 2
    Code MySQL:
    band_id    | release_id
    1          | 1
    1          | 2
    1          | 3
    2          | 4
    2          | 5
    2          | 6

  2. #2
    SitePoint Evangelist
    Join Date
    Aug 2007
    Posts
    566
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm afraid you cannot do that with mysql alone.
    In ms sql server, you would have created a table based function that would have split and pivoted the last field.

    But I cannot find equivalent in mysql.
    There seems to be no pivot in mysql (or there is no mention of it in the docs, except in the user comments, can this be true !?), and I haven't found any mentions of table based functions, nor functions that can return a table like set result.

  3. #3
    SitePoint Zealot
    Join Date
    May 2007
    Posts
    109
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    there is a possibility to use something like

    "where id in(select release_ids from table1 where foo)" for the comma separated ids

    but then I would have to do a kind of complicated join/subselect for getting the results for every band entry and I couldn't find out how to do this...

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    no, you can do it another way, but the real answer is to redesign the table

    any time you store a list of ids in a VARCHAR column, you violate first normal form

    just don't do it, you're asking for insanely complex and miserably performing queries
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Zealot
    Join Date
    May 2007
    Posts
    109
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    yes I know you're right and the table is absolutely not normalised.
    But I did it this way only for a table that stores multiple versions of an profile that can be edited, not to have 10 different tables for this. Usually I don't do things like this. Now the project is almost finished and I don't have time to change it any more.

    I'm very interest in a insanely complex and miserably performing solution if you could help me out

  6. #6
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    in a single query? not possible. you can do it in a couple queries.

    first you have to find the maximum number of elements in the release_ids column:
    Code:
    select char_length(release_ids) -
           char_length(replace(release_ids, ',', '')) + 1
      from table1
    then you create an integer table with at least that number of rows:
    Code:
    create table i (i int unsigned);
    insert i values (1), (2), (3), etc.;
    and finally, do this:
    Code:
    select band_id
         , substring_index(substring_index(release_ids, ',', i), ',', -1)
      from table1
      join i
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast

  7. #7
    SitePoint Zealot
    Join Date
    May 2007
    Posts
    109
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    thanks. I tried your solution, but it's floating memory on very big tables

    Greetings

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    longneck described the general approach but you really have to test this stuff to have any confidence in it
    Code:
    CREATE TABLE table1
    ( id INTEGER NOT NULL 
    , band_id INTEGER
    , release_ids VARCHAR(99)
    );
    INSERT INTO table1 VALUES
     ( 1, 1, '1,2,3' )
    ,( 2, 2, '4,5,6' )
    ,( 3, 3, '9,3,7' )
    ,( 4, 4, '1' )
    ,( 5, 5, '2,2,2,2' )
    ,( 6, 6, '8' )
    ,( 7, 7, '' )
    ;
    notice a few more varied samples...

    now let me explain the query...
    Code:
    SELECT band_id
         , SUBSTRING_INDEX(
           SUBSTRING_INDEX(release_ids,',',i)
                           , ',', -1) AS release_id
      FROM table1
    CROSS
      JOIN integers
     WHERE i BETWEEN 1 AND
           ( SELECT MAX(CHAR_LENGTH(release_ids) -
                        CHAR_LENGTH(REPLACE(release_ids,',','')) )
               FROM table1 )
    my numbers tables is called the integers table and i can't remember how many rows it has... 0 through some large number... which is why i use the WHERE condition on the cross join

    the nested SUBSTRING_INDEX technique is clever, but pretty hard to make general enough to handle cases like band_ids 4, 5, and 7, each of which has different problems
    Code:
    band_id	release_id
    1	1
    1	2
    1	3
    2	4
    2	5
    2	6
    3	9
    3	3
    3	7
    4	1
    4	1
    4	1
    5	2
    5	2
    5	2
    6	8
    6	8
    6	8
    7	
    7	
    7
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Zealot
    Join Date
    May 2007
    Posts
    109
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Finally got it !

    With a small modification it also works on the special cases:

    Code MySQL:
    SELECT
        DISTINCT
        band_id,
        SUBSTRING_INDEX(
            SUBSTRING_INDEX( release_ids, ',', i )
            , ','
            , -1
        ) AS release_id
    FROM table1
    CROSS JOIN i
    having release_id != ""

    If the number of comma-separated alements vary a lot, I will use the where part of r937
    Pretty cool, and very fast! The result can be inserted into another table using a subquery, so I can do everything in one query what will take around 1 second. Thanks a lot !!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •