SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 29
  1. #1
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    MySQL join by field with comma seperated list

    Hi Guys,

    Is it possible to do a join on a table where the contents of the join field are comma separated list. So for example I want to do a join on the table below:

    Code:
    SELECT jobs.job_title, 
    companies.company_name, 
    companies.company_url, 
    companies.logo_thumbnail, 
    jobs.job_url, 
    jobs.job_salary, 
    jobs.job_ote, 
    jobs.job_perks, 
    jobs.job_description, 
    jobs.location, 
    jobs.date_posted, 
    industries.industry, 
    count(featured_jobs.id) AS featured
    FROM jobs
    INNER JOIN companies ON companies.id = jobs.company_id
    INNER JOIN industries ON industries.id = jobs.industry
    LEFT OUTER JOIN featured_jobs ON featured_jobs.job_id = jobs.id
    WHERE jobs.status = 'on'
    GROUP BY jobs.id
    ORDER BY featured DESC , date_posted DESC
    LIMIT 0 , 30
    I want to get the industry name and the industry is stored by id (but there can be more that one and the IDs are separated by comma).. For example the contents of the jobs.industry might be "1011,1016,1090"

    Can I do a join in this way?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    two problems

    yes, you can do the join on a comma-delimited column, but it will always require a table scan (and therefore, be slow as cold glue)

    secondly, if there is more than one industry per job, you won't be able to just "get the industry name" because there will be more than one industry, and how did you want to display them? you're doing a GROUP BY on the job, so you're only going to see one row per job in the results
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    two problems

    yes, you can do the join on a comma-delimited column, but it will always require a table scan (and therefore, be slow as cold glue)

    secondly, if there is more than one industry per job, you won't be able to just "get the industry name" because there will be more than one industry, and how did you want to display them? you're doing a GROUP BY on the job, so you're only going to see one row per job in the results
    Hi r937,

    I think it would probably be better for me in the longer term to recreate the database structure so that the industries are stored in a separate table. That would be much easier/faster to join then.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Zaggs View Post
    That would be much easier/faster to join then.
    totally agree

    but you will still ahve to decide which of the industries you want to show for each job
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    totally agree

    but you will still ahve to decide which of the industries you want to show for each job
    Hmmm yes, how would I get a list of industries associated with each job? What is the ideal scenario here?

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    you might have to do a GROUP_CONCAT on the industry names, just to keep the results to one row per job
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you might have to do a GROUP_CONCAT on the industry names, just to keep the results to one row per job
    At the moment the industries are stored in the jobs table in the following format:

    1024, 1025, 1089

    Would it be better to store them in their own table (i.e a row for each) and then do a group_concat?

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Zaggs View Post
    Would it be better to store them in their own table (i.e a row for each) and then do a group_concat?
    absolutely, yes

    as i said before, performing a join on a comma-delimited list will be slow as cold glue
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    absolutely, yes

    as i said before, performing a join on a comma-delimited list will be slow as cold glue
    Can I just ask what you would see as the best possible structure for the new table? Would I need an ID field (primary key) or would it just be best to create an index on the job_id field?

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Code:
    CREATE TABLE job_industries 
    ( job_id INTEGER NOT NULL
    , industry_id INTEGER NOT NULL 
    , PRIMARY KEY ( job_id, industry_id )
    , INDEX industry_jobs ( industry_id, job_id )
    , FOREIGN KEY ( job_id ) REFERENCES jobs ( id )
    , FOREIGN KEY ( industry_id ) REFERENCES industries ( id )
    );
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    Code:
    CREATE TABLE job_industries 
    ( job_id INTEGER NOT NULL
    , industry_id INTEGER NOT NULL 
    , PRIMARY KEY ( job_id, industry_id )
    , INDEX industry_jobs ( industry_id, job_id )
    , FOREIGN KEY ( job_id ) REFERENCES jobs ( id )
    , FOREIGN KEY ( industry_id ) REFERENCES industries ( id )
    );
    Perfect, thank you! And... on that basis how would I run the original query to include my group_concat?

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    you'd need to join to a subquery that did the aggregation

    give it a try, come on...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you'd need to join to a subquery that did the aggregation

    give it a try, come on...
    Im trying this, but it returns the industry as a BLOB field and I can't see the data within it. Any idea why?

    Code:
    SELECT jobs.job_title, companies.company_name, companies.company_url, companies.logo_thumbnail, jobs.job_url, jobs.job_salary, jobs.job_ote, jobs.job_perks, jobs.job_description, jobs.date_posted, 
    GROUP_CONCAT( jobs_sectors.sector, ', ' ) AS industry, 
    count( featured_jobs.id ) AS featured
    FROM jobs
    INNER JOIN companies ON companies.id = jobs.company_id
    INNER JOIN jobs_sectors ON jobs_sectors.job_id = jobs.id
    LEFT OUTER JOIN featured_jobs ON featured_jobs.job_id = jobs.id
    WHERE jobs.status = 'on'
    GROUP BY jobs.id
    ORDER BY featured DESC , date_posted DESC

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    no, you need to replace the join to jobs_sectors with a join to a subquery, which does the GROUP_CONCAT and has a GROUP BY jobs_sectors.job_id

    you've never seen a join to a subquery before?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    no, you need to replace the join to jobs_sectors with a join to a subquery, which does the GROUP_CONCAT and has a GROUP BY jobs_sectors.job_id

    you've never seen a join to a subquery before?
    No do you have a link where I can see how to join a subquery?

  16. #16
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I tried this, but no luck..Am I getting close ;-)

    Code:
    SELECT jobs.job_title, companies.company_name, companies.company_url, companies.logo_thumbnail, jobs.job_url, jobs.job_salary, jobs.job_ote, jobs.job_perks, jobs.job_description, jobs.date_posted, count( featured_jobs.id ) AS featured
    FROM jobs
    INNER JOIN companies ON companies.id = jobs.company_id
    INNER JOIN (
    SELECT GROUP_CONCAT( jobs_sectors.sector, ', ' )
    FROM jobs_sectors
    GROUP BY jobs_sectors.job_id
    ) AS industry
    LEFT OUTER JOIN featured_jobs ON featured_jobs.job_id = jobs.id
    WHERE jobs.status = 'on'
    GROUP BY jobs.id
    ORDER BY featured DESC , date_posted DESC

  17. #17
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Code:
    SELECT jobs.job_title
         , companies.company_name
         , companies.company_url
         , companies.logo_thumbnail
         , jobs.job_url
         , jobs.job_salary
         , jobs.job_ote
         , jobs.job_perks
         , jobs.job_description
         , jobs.date_posted
         , i.sectors
         , f.featured
      FROM jobs
    INNER 
      JOIN companies 
        ON companies.id = jobs.company_id
    INNER 
      JOIN ( SELECT job_id
                  , GROUP_CONCAT(sector) AS sectors
               FROM jobs_sectors
             GROUP 
                 BY job_id ) AS i
        ON i.job_id = jobs.id
    LEFT OUTER 
      JOIN ( SELECT job_id
                  , COUNT(*) AS featured
               FROM featured_jobs
             GROUP 
                 BY job_id ) AS f
        ON f.job_id = jobs.id
     WHERE jobs.status = 'on'
    ORDER 
        BY featured DESC 
         , date_posted DESC
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  18. #18
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks, although the derived sectors field is still being shown as a BLOB field - is this right?

  19. #19
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    use LEFT(GROUP_CONCAT(...),255)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  20. #20
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    use LEFT(GROUP_CONCAT(...),255)
    Thanks r937!

    That brings me to my final question...

    If I wanted to perform a search on the sectors and find all rows (jobs) which are listed in more than one sector - how would I do that?

    For example the jobs_sectors table might contain the following:

    job_id sector
    1 22
    1 28
    2 22
    3 25

    Now if I did a search for sector 22, it should return both job_id 1 and job_id 2.

    Thanks in advance, you've been a great help!

  21. #21
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Zaggs View Post
    That brings me to my final question...
    are you sure?

    Quote Originally Posted by Zaggs View Post
    If I wanted to perform a search on the sectors and find all rows (jobs) which are listed in more than one sector - how would I do that?
    by joining your jobs table to the following subquery before joining to all the other tables you gots already:
    Code:
    SELECT job_id   
      FROM jobs_sectors
     WHERE sector in (22,23)
    GROUP
        BY job_id
    HAVING COUNT(*) = 2
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  22. #22
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi guys!

    I am using the following SQL query which is perfectly valid. However, it's not returning the desired results. The location, job_type, industry and occupation columns are all being returned blank even though a value exists in all of the join tables (i.e. jobs_locations, jobs_types, jobs_sectors and jobs_occupations).

    I can't understand what's wrong. Does anyone have any ideas?

    Code:
    SELECT jobs . * , l.location, j.job_type, i.industry, o.occupation, count( featured_jobs.id ) AS featured, count( job_of_week.id ) AS jobofweek
    FROM jobs
    LEFT OUTER JOIN featured_jobs ON featured_jobs.job_id = jobs.id
    LEFT OUTER JOIN job_of_week ON job_of_week.job_id = jobs.id
    LEFT OUTER JOIN (
    
    SELECT job_id, group_concat( sector
    SEPARATOR ', ' ) AS industry
    FROM jobs_sectors
    ) AS i ON i.job_id = jobs.id
    LEFT OUTER JOIN (
    
    SELECT job_id, group_concat( location
    SEPARATOR ', ' ) AS location
    FROM jobs_locations
    ) AS l ON l.job_id = jobs.id
    LEFT OUTER JOIN (
    
    SELECT job_id, group_concat( job_type
    SEPARATOR ', ' ) AS job_type
    FROM jobs_types
    ) AS j ON j.job_id = jobs.id
    LEFT OUTER JOIN (
    
    SELECT job_id, group_concat( occupation
    SEPARATOR ', ' ) AS occupation
    FROM jobs_occupations
    ) AS o ON o.job_id = jobs.id
    WHERE jobs.id = '1001'
    GROUP BY jobs.id

  23. #23
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    instead of GROUP_CONCAT(...), try LEFT(GROUP_CONCAT(...),255)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  24. #24
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    instead of GROUP_CONCAT(...), try LEFT(GROUP_CONCAT(...),255)
    Does that slow the whole query down though? I need this query to be very fast as it will be searching a lot of results.

  25. #25
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,334
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Zaggs View Post
    Does that slow the whole query down though?
    i have two things to say:

    1. what happened when you tested it?

    2. i don't think so
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •