How would you do a twitter timeline?

I’m making an application where the users can follow each other, and I want to make like a twitter/facebook home page where there is a timeline of the users you follow.
Problem is, if a user is following 1000 users, how you would design the system (DB, Queries) to do that?

Simple way: SELECT * FROM posts WHERE followed_id = $id OR followed_id = $id2 OR …
Abbreviated: SELECT * FROM posts WHERE followed_id in ($id,$id2,$id3…)

But if you are facing perhaps more than 1000 id, is this a good way?
And you can’t really reduce the ids previously with other parameters, because to reduce the list with the more liked posts you have to test them all at some point anyways.

The only concern that I have is that it seems to me very inefficient, but I don’t see any other way.

you show only one table – posts

you need a table for who-follows-whom – note this is a directed relationship, i.e. john follows mary, but mary doesn`t follow john

your query should start with this table, then join the posts table

Well that’s already clear, I just wanted to simplify the question.

Here is what I have, not showing all but enough:

CREATE TABLE follow (
  user_id int(9) unsigned NOT NULL,
  board_id int(10) unsigned NOT NULL,
  UNIQUE KEY id_user (user_id,board_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE posts (
  post_id int(11) unsigned NOT NULL AUTO_INCREMENT,
  post_board_id int(11) unsigned NOT NULL,
  post_title varchar(50) COLLATE utf8_bin NOT NULL,
  post_cont text COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (post_id,post_b_id)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

So if I already made a SELECT and I have an array if id, I will do what I was saying on the first post:

SELECT * FROM posts WHERE post_board_id in ($id[1],$id[2],$id[3]…)

And the join you said will be:

SELECT posts.post_id, posts.post_board_id, posts.post_title, posts.post_cont FROM posts INNER JOIN follow ON posts.post_board_id=follow.board_id WHERE follow.user_id = '{$user_id}'

I showed the first way because you can more easily see how much the DB is working, that’s my only concern, the standard way of doing it is clear (I believe).
But I keep thinking, if I want 30 results, and the thirtieth post of the boards that is following (1000) is 100.000 posts away, that means testing an average of 999.9 id in 100k posts, seems a potentially pretty heavy query to me, I don’t see any way around it, but I just wanted to ask if someone has a more efficient method.

PD: Oh, to clarify, users follow boards, not other users themselves.

i’m sorry, i still don’t know what you’re asking

where does this list of ids come from?

also, if you’re going to join to the follow table on board_id, there should really be an index on it

i’m sorry, i still don’t know what you’re asking

Make these queries and following system as efficient as possible.

where does this list of ids come from?

Are just the ids of the boards that the user is following.

SELECT board_id FROM follow WHERE user_id = '{$user_id}'

But it was only an exemple, I’m not using any of these, I’m still thinking how I will do this.

also, if you’re going to join to the follow table on board_id, there should really be an index on it

I tested the join query, and it worked, is that what should have happened without the index?
You mean that I have to add an index on post_board_id from table posts, so it’s not enough to be a PRIMARY KEY?

PD: Sorry, PRIMARY KEY (post_id,post_b_id) would be PRIMARY KEY (post_id,post_board_id), I changed some names to make it easier to read, but I forgot to change this :frowning:

Ok sorry an index on board_id from follow table.

okay, when you have an actual question, we’ll be here

:slight_smile: