I have two tables, ‘post’ and ‘postmeta’. The post table basically just contains the author, title and body of the post. The ‘postmeta’ table contains additional metadata about the post, such as the type of post, category, tags, etc. The ‘postmeta’ table looks like this…
id post_id name value
-- ------- ---- -----
1 1 type blog
2 1 category technology
How would I, for example, get all posts with the ‘type’ of ‘blog’, and the ‘category’ of ‘technology’? Getting all posts which match just a single bit of metadata is easy using this…
SELECT * FROM post
JOIN postmeta ON postmeta.post_id = post.id
WHERE postmeta.name = 'type' AND postmeta.value = 'blog'
But I’m am stumped as to how to the same thing, but matching on multiple rows of metadata. I do know however that this will almost certainly require a sub-query of some sort.
SELECT
something
, anything
, just_not_the_dreaded_star
FROM
post p
INNER JOIN
postmeta pm1
ON
pm1.post_id = post.id
INNER JOIN
postmeta pm2
ON
pm2.post_id = post.id
WHERE
pm1.name = 'type'
AND
pm1.value = 'blog'
AND
pm2.name='category'
AND
pm2.value='technology'
Ah yes, that method rings a big bell. It’s been a while since I’ve touched MySQL, but I’ve definitely done that sort of query before. The old multiple joins on the same table trick
I assume this is the only, or at least the most efficient, way?
there is another wy, and it is ultimately a lot more flexible than adding an additional join for every additional attribute/value condition
SELECT something
, anything
, just_not_the_dreaded_evil_select_star
FROM ( SELECT post_id
FROM postmeta
WHERE name = 'type' AND value = 'blog'
OR name = 'category' AND value='technology'
GROUP
BY post_id
HAVING COUNT(*) = 2 ) AS meta
INNER
JOIN post
ON post.id = meta.post_id
the WHERE clause in the subquery finds all postmeta rows that are required – and the conditions have to be combined with OR, because each row can satisfy only one of them
then the GROUP BY and HAVING clauses make sure all the required rows were found for a given post_id
the subquery returns the post_id of all posts that satisfy all requirements, and then the INNER JOIN in the main query retrieves those posts
this method is much more flexible because in the subquery you can do things like “must have at least 3 of these 5 conditions” which is pretty damned tricky if you have to have 5 joins, which would have to be LEFT OUTER JOINS and then how do you check that there were at least 3 non-null results? … ugly!
Yeah nice one r937. I actually thought of your suggestion as a possible solution before posting this (using GROUP BY and COUNT in a sub-query), but didn’t know if there were any easier or better ways.
I can’t help but think however, that scenario’s like this highlight a weakness in SQL as a query language. I imagine this would be easily solvable in an object-orientated type query language where you would join the ‘postmeta’ table as a set of attributes against the ‘post’, so you could just do something like: WHERE post.postmeta.type = ‘blog’ AND post.postmeta.category = ‘technology’. I guess that it all comes back to performance though?