Ordering within IN clause


It seems that ordering within an IN clause does not work.

select user_id
from table1
where user_id in (select user_id from table2 order by created_date desc) #desc so that I get the newest entries only
limit 25

Running this query will always give me the same results. It seems that the user_id is not being sorted correctly.

I have been googling and it seems that the field() function may work for me.
But based on the documentation and examples I’ve seen so far, it seems that I have to run the select sub query twice? I’m not sure if I’m understanding it correctly.

select user_id
from table1
where user_id in (select user_id from table2)
order by field(user_id, select user_id from table2) #I need desc here
limit 25

But I’m afraid this would also be slow (I’m moving the query from a join to IN instead) because the indexed column would be inside the function. :-/

I may result in using a stored procedure in order to sort the query and then pass the result set within the IN clause.

If anyone can shed some light please and thanks.

here you go :slight_smile:

SELECT table1.user_id
  FROM table1
  JOIN table2
    ON table2.user_id = table1.user_id 
    BY table1.created_date DESC LIMIT 25

If you want to order the output of the query you should put ORDER BY on the outer query, not on the inner query.

Putting ORDER BY on the inner query does nothing but make the query slower.

r937 that was the original query I had. Unfortunately it takes about 3 seconds for the whole operation. There are over 100k rows in each table. That is why I was avoiding a join to begin with.

rewriting a join as a IN (subquery) construction is often worse performance

are your join columns properly indexed? and is all you really want to return just the userid?

Wait, are you saying IN is faster than an INNER JOIN? It really shouldn’t be!

Do you have indexes on table1.user_id and table2.user_id ? If not, create them and try again with the INNER JOIN.


You beat me to it again
That’s twice in one thread (:

I actually got better performance out of using the IN clause. I brought it down from 3 seconds to ~ .01 seconds. The only issue I had was getting the sorting right.

All I need is the user id and both columns are indexed.

I think the problem my be how mysql was compiled. It seem’s that the cache is not working right neither. Running the same query multiple times from my machine works fine (although slow, the second+ times are much faster due to caching) and terrible on production.


Oops, it seems that I made a mistake when posting the query. I actually need their profile_link column as it contains important information.

It should be like this:

SELECT t1.profile_link
FROM table t1
WHERE t1.user_id IN (SELECT t2.user_id FROM table2 t2 ORDER BY t2.created_date desc) #desc so that I get the newest entries only

table2 contains data that is inserted all the time and I would like the newest entries only.

So why did you put the LIMIT on the outer query and not on the inner query? :shifty:

Because you can’t put a limit on the sub query using the IN clause.

I also forgot to mention that whatever the limit is, it’s also the maximum amount of IDs within the IN clause.

I may have to solve this programmatically if I can’t find a valid solution. Maybe query it through the application code and pass the results into this query. Or somewhere around those lines.

I’m nearly home and will be testing out the same tables but under inno db instead (currently running under myisam).

I just joined 500,000 from each table and it ran extremely fast (at home). I think the slow speeds are due to the way mysql was configured. I’ll have to do more research on the production server.

Do you guys have any advice on any flags I should be looking for? I’m going to be checking the cnf file and see if there is anything there weird.


Maybe it’s just hardware issue. Hm.

maybe it’s just your database structure being wrong :slight_smile:

Seeing you mention that you were going to try setting it up with innodb instead of myIsam, I had a feeling it would be quicker.

Joining Innodb tables is more efficient than with MyISAM tables and, you have the added benefit of being able to enforce referential integrity.

more efficient as in, afaik, you can’t join MyIsam tables.

Ok, so I figured out why it was slow.

It’s using a group by clause.

SELECT table1.user_id
  FROM table1
    ON table2.user_id = table1.user_id
GROUP BY table2.user_id 
    BY table1.created_date DESC LIMIT 25

Looks like when I issue a group by it creates a temporary table. I need to filter out duplicates from this query. :-/

No problems with JOINs on MyISAM table types, you can mix them with INNODB tables as well.

it could also be caused by your ORDER BY clause

please note, whenever you use a GROUP BY clause, then the ORDER BY clause can name only columns that are in the SELECT clause

you’re trying to sort by a column that isn’t included in the SELECT clause

not only is this problematic (you can’t sort by a column that isn’t included in the rows being sorted), but it also doesn’t make sense, since if you don’t display it, you can’t really discern the sequence of the results properly

what’s your actual query, i.e. with real not fake column and table names?

SELECT p.profile_link
  FROM profile p
    ON n.user_id = p.user_id
GROUP BY n.user_id 
    BY n.created_date DESC LIMIT 25

yeah, it is as i thought

you can’t sort by created_date because it isn’t in the SELECT clause

and the GROUP BY clause makes no sense to me at all, you will get unique profile_link values for each separate user, but no way to know which user each profile_link belongs to

in the news table the same user can insert new rows. In fact, new rows are inserted by the same user around 5 times a day on a average. If the latest rows are by the same user I would like to avoid repeating the same user multiple times.

could you please do a SHOW CREATE TABLE for both tables

and perhaps some sample rows that illustrate your problem

Kinda like pulling teeth isn’t it?