I have two tables, ‘post’ and ‘postmeta’. The post table basically just contains the author, title and body of the post. The ‘postmeta’ table contains additional metadata about the post, such as the type of post, category, tags, etc. The ‘postmeta’ table looks like this…
id post_id name value
-- ------- ---- -----
1 1 type blog
2 1 category technology
How would I, for example, get all posts with the ‘type’ of ‘blog’, and the ‘category’ of ‘technology’? Getting all posts which match just a single bit of metadata is easy using this…
SELECT * FROM post
JOIN postmeta ON postmeta.post_id = post.id
WHERE postmeta.name = 'type' AND postmeta.value = 'blog'
But I’m am stumped as to how to the same thing, but matching on multiple rows of metadata. I do know however that this will almost certainly require a sub-query of some sort.
Any help is much appreciated as always.
pm1.post_id = post.id
pm2.post_id = post.id
pm1.name = 'type'
pm1.value = 'blog'
Ah yes, that method rings a big bell. It’s been a while since I’ve touched MySQL, but I’ve definitely done that sort of query before. The old multiple joins on the same table trick
I assume this is the only, or at least the most efficient, way?
I don’t know any other way, so I guess so too. But I’d be happy if someone could correct me on that
there is another wy, and it is ultimately a lot more flexible than adding an additional join for every additional attribute/value condition
FROM ( SELECT post_id
WHERE name = 'type' AND value = 'blog'
OR name = 'category' AND value='technology'
HAVING COUNT(*) = 2 ) AS meta
ON post.id = meta.post_id
the WHERE clause in the subquery finds all postmeta rows that are required – and the conditions have to be combined with OR, because each row can satisfy only one of them
then the GROUP BY and HAVING clauses make sure all the required rows were found for a given post_id
the subquery returns the post_id of all posts that satisfy all requirements, and then the INNER JOIN in the main query retrieves those posts
this method is much more flexible because in the subquery you can do things like “must have at least 3 of these 5 conditions” which is pretty damned tricky if you have to have 5 joins, which would have to be LEFT OUTER JOINS and then how do you check that there were at least 3 non-null results? … ugly!
Yeah nice one r937. I actually thought of your suggestion as a possible solution before posting this (using GROUP BY and COUNT in a sub-query), but didn’t know if there were any easier or better ways.
I can’t help but think however, that scenario’s like this highlight a weakness in SQL as a query language. I imagine this would be easily solvable in an object-orientated type query language where you would join the ‘postmeta’ table as a set of attributes against the ‘post’, so you could just do something like: WHERE post.postmeta.type = ‘blog’ AND post.postmeta.category = ‘technology’. I guess that it all comes back to performance though?