Performance related to MySQL GROUP_CONCAT & PHP parsing

From what I understand, you want to display a news feed of the users you follow.

If this is the case, I would suggest changing your query to this:
    SELECT
      stream.*
    FROM
      stream
      INNER JOIN follows
        ON (stream.user_id = follows.following_user)
    WHERE
      follows.user_id = '0'
    ORDER BY
      stream.user_id
      , stream.stream_date DESC

This will give you all of the data, sorted by user and date. You then parse this on the PHP side before displaying it: group the rows by user_id and date, combining all the items that match, and so on.
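Something like this, as a minimal sketch (PDO, $currentUserId and the exact column names are my assumptions, not your actual code):

    <?php
    // A minimal sketch, not your actual code: assumes PDO, the
    // stream/follows tables from the query above, and that
    // stream_date is a DATETIME string.
    $sql = "SELECT stream.*
            FROM stream
            INNER JOIN follows ON (stream.user_id = follows.following_user)
            WHERE follows.user_id = :me
            ORDER BY stream.user_id, stream.stream_date DESC";

    $stmt = $pdo->prepare($sql);
    $stmt->execute([':me' => $currentUserId]);

    $feed = [];
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        // Group by user, then by calendar day, so consecutive
        // activity from the same user can be rendered as one block.
        $day = substr($row['stream_date'], 0, 10); // 'YYYY-MM-DD'
        $feed[$row['user_id']][$day][] = $row;
    }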

The Stack Exchange issue is slightly different from what you describe now. If I understand your requirements correctly, the suggestion above will by far be the most performant one. Then of course add a cache layer on top of it as well, so you do not recalculate this on every page load.
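As a hedged sketch of such a cache layer (APCu is just one option; the key name, TTL and helper function are arbitrary choices of mine):

    <?php
    // A sketch only; Memcached, Redis or file caching follow the
    // same pattern. buildFeedFromDatabase() is a hypothetical helper.
    $cacheKey = 'feed_' . $currentUserId;
    $feed = apcu_fetch($cacheKey, $hit);

    if (!$hit) {
        $feed = buildFeedFromDatabase($currentUserId);
        apcu_store($cacheKey, $feed, 300); // rebuild at most every 5 minutes
    }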


I don’t see how it is different; please explain. AFAIK, I want to group things, e.g.
a user posted two posts on the same day, with timestamps differing only slightly.

As for your query, I already have one that is very similar; however, it joins more tables.

On the Stack Exchange topic, they join users who have purchased the same products. In other words, if user A and user B bought the same product, their news feed would say “User A and User B bought product A and product D”, and then below that “User A bought product B”, since User B did not purchase this.
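Presumably something along these lines (just my sketch; the purchases/users/products tables and columns are made up, not taken from that topic):

    SELECT
      products.product_name
      , GROUP_CONCAT(users.username) AS buyers
    FROM
      purchases
      INNER JOIN users ON (users.id = purchases.user_id)
      INNER JOIN products ON (products.id = purchases.product_id)
    GROUP BY
      products.product_id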

In the query you have on SQL Fiddle, you only group the records per user you follow. This makes the entire query useless, and expensive for what it does, since you need to pull those records anyway to find out what they did.

If the GROUP_CONCAT value alone had been enough for you to know what they did, then it would be useful; but since it is not, it is more efficient to run a query similar to the one I mention, joining the tables you need to get the result set, since you would have to do that afterwards with the SQL Fiddle query anyway.


What about grouping a user’s activities, e.g. UserA posted two posts, and then within that <li></li>
listing two divs that represent those posts in DESC order? Should this be done with PHP, or is there a faster way?

Again, it comes down to traffic.

In terms of human-readable code, it will make more sense to process the people individually and query the activity for each person individually. But on a site with a lot of traffic/activity, it might make sense to get all the data in one large array, then parse the array accordingly - it’s just going to be “less clean” when reviewing the code and harder to maintain.
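For instance, a rough sketch of the “one large array” route, reusing the grouped array from the earlier sketch ($usernames as a user_id => name lookup and the 'stream_title' column are made up):

    <?php
    // A rough sketch; $feed is the [user_id][day][] array built earlier,
    // $usernames an assumed user_id => display-name lookup, and
    // 'stream_title' a made-up column for the item's text.
    echo '<ul>';
    foreach ($feed as $userId => $days) {
        foreach ($days as $day => $items) {
            echo '<li>' . htmlspecialchars($usernames[$userId]) . ' (' . $day . ')';
            foreach ($items as $item) {
                // Rows are already in DESC date order from the query.
                echo '<div>' . htmlspecialchars($item['stream_title']) . '</div>';
            }
            echo '</li>';
        }
    }
    echo '</ul>';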


What tips do you have?

I personally would leave it as separate queries until you absolutely have to.

Why exactly? Querying in a for loop or foreach loop is generally considered bad, because it causes performance issues. Maybe they are only at the micro level.

Says who?

See this discussion, https://wordpress.org/support/topic/list-all-posts-on-a-page-split-them-by-year

That’s a little different - if you look, both of those queries are against the same table, so the one query makes sense, and the indexing of the array isn’t too bad. It has nothing to do with querying within a foreach loop…

I see, I see. So you suggest that it is pretty good practice to query something in a loop?

From an entity/class standpoint, yes. From a human readable standpoint, yes.

Is it good practice? Depends on the situation. Like the WP link, if you CAN do it in one, then it’s appropriate. But if you’re dealing with disparate data, then splitting is appropriate.
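A quick sketch of the difference (table names made up):

    <?php
    // Same table, same shape of data: one query, like the WP link.
    $posts = $pdo->query("SELECT * FROM posts ORDER BY YEAR(post_date) DESC, post_date DESC")
                 ->fetchAll(PDO::FETCH_ASSOC);

    // Disparate data: two simple queries beat one contorted join.
    $comments = $pdo->query("SELECT * FROM comments WHERE approved = 1")
                    ->fetchAll(PDO::FETCH_ASSOC);
    $settings = $pdo->query("SELECT * FROM user_settings")
                    ->fetchAll(PDO::FETCH_ASSOC);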


Do you have any good example with PHP & MySQL?

I might not be understanding something here: how are they actually using the data they get from GROUP_CONCAT in PHP to display things?

I would have to disagree; running additional queries inside a loop is always a bad idea. Queries like this, if they are not needed, cost a lot when you consider the resources required to process the request.

Sometimes it can’t be avoided, though. In those cases it is important to set things up in the most efficient way (see the example below).

In your case you need to run additional queries because, from my understanding, the stream table will contain many different actions a member could take; getting everything in one query would therefore require a lot of LEFT JOINs.

To reduce the number of additional queries, in the initial while loop that receives the data from the query, keep extra arrays where you collect the keys for the other tables you need information from. Into these you insert the record IDs for all records, across all users.

Then you run those queries afterwards and tie the information received back to the information you have about each user, before finally outputting it. As mentioned above, this is a rather expensive process, so I recommend caching the result for 5-10 minutes.

By doing this, you can limit the extra work to perhaps 3 additional queries, instead of the 50 or so you would need if you queried per user.
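To make it concrete, a sketch of this two-pass approach (every table and column name here is an assumption, not your schema):

    <?php
    // First pass: read the stream once and collect foreign keys per type.
    // $stmt is the executed statement for the main stream query.
    $rows = $postIds = $photoIds = [];
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        $rows[] = $row;
        if ($row['action_type'] === 'post')  { $postIds[]  = $row['item_id']; }
        if ($row['action_type'] === 'photo') { $photoIds[] = $row['item_id']; }
    }

    // Second pass: one extra query per table, across ALL users at once,
    // instead of one query per user inside the display loop.
    $posts = [];
    if ($postIds) {
        $in = implode(',', array_fill(0, count($postIds), '?'));
        $q  = $pdo->prepare("SELECT * FROM posts WHERE id IN ($in)");
        $q->execute($postIds);
        foreach ($q->fetchAll(PDO::FETCH_ASSOC) as $p) {
            $posts[$p['id']] = $p; // keyed by id so rows can be tied back
        }
    }
    // ...repeat the same pattern for $photoIds, then merge into $rows
    // before output, and cache the final structure for 5-10 minutes.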

In that case, the result set already contains the information, so they don’t need to do anything other than display it; i.e., they have the usernames and the products they bought.

Your problem is that you need more information from this table than GROUP_CONCAT can give you. This is why it is better to pull all the records and sort them in PHP instead, since that allows you to get the information you need on the first pass.


I get this in my SQL Fiddle:

@TheRedDevil, where do you see the usernames and products they bought or hearted in this result?

Not in your query, but in the Stack Exchange topic you linked to above.

That is why the solution presented there is not an optimal one for your use case.

Which is why I said it depends on the situation and the traffic. Sometimes the less optimal way is easier to understand and maintain, and sometimes the data requirements are too high for that to matter.
