Need help with complex MySQL SELECT

Hi,

Here is what I want to do.
We have 3 Tables:
Users
Articles
Replies

I want a select command that will get the list of articles which match a given Keyword but also tells me how
many replies each of these articles have gotten and what the name of the author of that article is (which comes from Users table). And of course if an article has 0 replies then it should say
that it list it in the results with 0 for the replies of that article.

I tried this:

select users.nick_name, articles.article_id, articles.user_id,  articles.title AS subject, num_views, articles.date_posted, COUNT(replies.article_id) AS num_replies 
FROM articles, replies, users 
WHERE articles.body LIKE '%Fly rods%' 
AND articles.article_id = replies.article_id 
AND articles.user_id = users.user_id 
GROUP by replies.article_id 
ORDER BY articles.date_posted DESC LIMIT 10 

However, there is a problem with above command and the problem is that if the article has 0 replies, then it is not listed.

Regards,

Weird. MySQL on my PC doesn’t give that warning when I run the query (similar database, only table names and field names are different).

Ah well, here ya go:


SELECT
   users.nick_name
 , articles.article_id
 , articles.user_id
 , articles.title AS subject
 , articles.num_views
 , articles.date_posted
 , COUNT(r2.article_id) AS num_replies
FROM
  articles
  INNER JOIN
  users
    ON
    users.user_id = articles.user_id
  LEFT OUTER JOIN
     replies AS r1
     ON
     r1.article_id = articles.article_id
  LEFT OUTER JOIN
     replies AS r2
     ON
     r2.article_id = articles.article_id
WHERE
   r1.body LIKE '%Canada\\'s population%'
   AND
   display_stat = 'news'
GROUP BY
   articles.article_id
ORDER BY
   replies.date_posted DESC
LIMIT 10

:slight_smile:

Yup, that’s because of the WHERE clause: WHERE replies.body LIKE '%Canada\‘s population%’
Because that is there MySQL throws away all rows from the “replies” table from the result that do not adhere to this condition.

In order to overcome this you need to JOIN replies twice:


SELECT
   users.nick_name
 , articles.article_id
 , articles.user_id
 , articles.title AS subject
 , articles.num_views
 , articles.date_posted
 , COUNT(r2.article_id) AS num_replies
FROM
  articles
  INNER JOIN
  users
    ON
    users.user_id = articles.user_id
  LEFT OUTER JOIN
     replies AS r1
     ON
     r1.article_id = articles.article_id
  LEFT OUTER JOIN
     replies AS r2
     ON
     r2.article_id = articles.article_id
WHERE
   r1.body LIKE '%Canada\\'s population%'
   AND
   display_stat = 'news'
ORDER BY
   replies.date_posted DESC
LIMIT 10

:slight_smile:

If you want a book on SQL, r937 has written one: Simply SQL

Hi,

Actually one more question, actually a different iteration of that problem.
That is I need to generate a list which this time finds matches for Keywords in replies to articles, not the articles themselves, and then tells us who the Author of that Article is, how many replies/comments that article has gotten, etc. The SELECT command that I have generated as per your suggestion for searching in articles works OK in all respects except it is reporting tho wrong number of replies for a given article. For example for Articles that have 35 replies it is reporting 2 replies Only!

So what is wrong?

Here is the modified command:


SELECT users.nick_name , articles.article_id , articles.user_id , articles.title AS subject , articles.num_views ,
articles.date_posted , COUNT(replies.article_id) AS num_replies
FROM articles
INNER JOIN users ON users.user_id = articles.user_id
LEFT OUTER JOIN replies ON replies.article_id = articles.article_id
WHERE replies.body LIKE '%Canada\\'s population%' AND display_stat = 'news'
GROUP BY replies.article_id ORDER BY replies.date_posted DESC LIMIT 10;

Nope :slight_smile:

yes this is the majority of actual inner joins that you will find “in the wild” but the ON condition does not need to be an equality condition, nor does it need to involve a column from either table

inner joins are symmetric only if they involve equality join conditions

well, yeah, but i thought you were flipping over the As and Bs in the ON condition too

:slight_smile:

1st, ThanX for the prompt info.
And your SELECT does work just fine.

2nd, You are right. I need to get a much better command of that darn INNER and OUTER Joins :slight_smile:
Dont know why I am having such a HARD time to quite grasp it :frowning:
I guess I will try reading your book about this subject again. Although to be frank I have read that section few times and still dont quite get it!!

ThanX again.

ThanX for your lucid description.
I read it once. I need to read it a few more time, since as of now I still do not quite grasp the power of these Inner & Outer joins. Obviously they are very powerful and I am not quite putting this power to use and as a search engine and social networking service we have so many Tables and Users that I better have a solid grasp of these powers.

BTW, have you written a book on MySQL?
Or is there a book about MySQL that you would recommend that would not be a huge read? I mean something about 100 pages gem of read.

ThanX again.

I fail to see why A INNER JOIN B ON A.some_field > B.some_other_field would produce different results than B INNER JOIN A ON A.some_field > B.some_other_field …

if you use aggregate functions like COUNT or SUM, then the only time you can get away without coding a GROUP BY clause is if the aggregates are the only expressions in the SELECT clause – any other non-aggregate in the SELECT clause, and it has to be in the GROUP BY clause

that’s the general rule, but mysql lets you bend the rule

HOWEVER, you had better understand exactly how to bend the rule if you want reliable results

for anyone other than a very advanced sql developer, just stick to the rule that any non-aggregate in the SELECT clause must also be in the GROUP BY clause

Hi,

I am getting an Error message trying your code. here:
Mixing of GROUP columns (MIN(),MAX(),COUNT(),…) with no GROUP columns is illegal if there is no GROUP BY clause

Which makes sense, since where is the Group By clause!

i made two changes here –

SELECT users.nick_name
     , articles.article_id
     , articles.user_id
     , articles.title AS subject
     , articles.num_views
     , articles.date_posted
     , COUNT(replies.article_id) AS num_replies 
  FROM articles
INNER
  JOIN users
    ON users.user_id = articles.user_id
LEFT OUTER
  JOIN replies
    ON replies.article_id = articles.article_id
 WHERE articles.body LIKE '%Fly rods%' 
GROUP 
    BY articles.article_id
ORDER 
    BY articles.date_posted DESC LIMIT 10

first, i’m using explicit JOIN syntax, and i guess it is time for you to learn it

secondly, a change in the GROUP BY clause

let me know if you don’t understand either change

Suppose you are joining tables A and B.
An INNER JOIN only joins rows from table A and B together if there is a field in B that has the value as a field in A (specified by the ON clause: A.somefield=B.somefield).
An OUTER JOIN on the other hand selects all rows from table A, regardless if there is a row in B that has the same value as the field in A, and adds the info from B if there is a matching row, or NULLs if there isn’t.

To make it more practical, consider an example database of member info for a website. Every member can optionally fill in a profile that contains details about themselves and that info is stored in a table separate from their primary user info (username,password,etc). Now, if you wanted to list all the info for all users in a table (including all profile info) and you do an INNER JOIN on both tables the members who haven’t filled in a user profile won’t show. If you use an OUTER JOIN they do :slight_smile:

Beware that INNER JOINs are symmetric, i.e., A INNER JOIN B produces the same result as B INNER JOIN A, whereas OUTER JOINS are not.