Performance related with MySQL GROUP_CONCAT & PHP parsing

So I have a query, which returns:

[complete_collection] => 0,1

Now, both 0 and 1 here are two IDs, from a Database table. I use a foreach-loop in PHP,
to render something, a certain amount of items. Now, my question is more related to performance e.g.
is it a good practice to run another query and parse additional results using the IDs I get from my first result?

I know I would do like this to query for more data using the IDs in the group concatenated result:

"SELECT column 
, column 
, column FROM table WHERE id = IN ($result)"

I am not certain I understand the question.

Do you ask if it is more efficient to use GROUP_CONCAT to create a list of items (string), or to create the list using a while loop in PHP? If, then GROUP_CONCAT will be the most efficient solution, but note that if you pull thousands or records, it is an idea to run multiple queries, just pulling for example 1000 records in each. Limiting the size of the GROUP_CONCAT string. The exact number of records will depend on the size of the column you put into the string, so run some tests to find your sweet spot for best performance.

Or are you asking if it is more efficient to instead use a JOIN on the first query instead of doing additional queries in the loop from the result you received from the GROUP_CONCAT? If then using a JOIN on the first query and using a while loop to process the results will give a significantly better performance in most cases.

Though note in rare cases, if you have many joins on a table or the joins are very inefficient, it can be more efficient to divide it into several queries. However this is something you will notice when looking into the performance of the query.

1 Like

Thanks for your reply! Please take a moment and see this: http://dba.stackexchange.com/questions/31515/group-activities-in-activity-feed-by-users-and-products

It’s very (almost) similar to what I am trying to achieve here.

The answer depends on how much traffic the website will have and how many records that has to be processed, and how it will be displayed.

The SQL solution in the thread you linked to, will be faster than a php solution in most cases.

When you also add some caching, it should not be a performance issue caused by traffic. And when you have too many records that matches, it has to be changed, but again then that show that the site has become successful, so ironically it will be a good problem to fix at that time (With that said, I recommend putting this feature in its own class or method, so its easy to refactor how it work down the road).

1 Like

I have already done that part, everything is OOP here. See this example in my sqlfiddle, http://sqlfiddle.com/#!9/9ee94/1, so far I get the ID’s id, id. Do you get it?

Now, using those IDs, that appear in complete_collection column, should I re-run a new query and parse the output of those.

The part in bold is my biggest dilemma right now, any advice or help would be appreciated.

From what I understand, you want to display a news feed from any users I follow.

If this is the case, I would suggest changing your query to this:
> SELECT

      stream.*
    FROM 
      stream
      INNER JOIN follows
        ON (stream.user_id=follows.following_user)
    WHERE 
      follows.user_id = '0'
    ORDER BY
      stream.user_id
      , stream.stream_date DESC

This will give you all of the data, sorted by user and date. You then parse this on the PHP side before displaying it. You would do this by user_id and date, combining all the items matching this etc.

The stockexchange issue, is slightly different than what you mention now. If I understand your requirements correctly, the suggestion above will by far be the one with most performance. Then of course add a cache layer on it as well, so you do not recalculate this on every page load.

1 Like

I don’t see how different it is. Explain. AFAIK, I want to group things, e.g.
a user posted two posts (same day, with time stamp differentiating very little).

As for your query, I have already one that is very similar, however, joins more tables.

On the stockexchange topic, they join users who has purchased the same products. With other words, if user A and B bought the same product, their news feed would say “User A and User B bought product A and product D”, then under “User A bought product B” since User B did not purchase this.

On the query you have on sqlfiddle, you only group on the records per user I follow. This means the entire query is useless, and expensive for what it does. Since you anyway need to pull these records to find out what they did.

If it had been enough for you with the specific value that was “group_concat” to know what they did, then it would be useful, but since you cant, it is more efficient to do a similar query that the one I mention, joining the tables you need to get the result set. I.e. since you need to do that afterwards with the sqlfiddle query anyway.

1 Like

What about grouping a users’ activities, e.g. UserA posted two posts, and then within that <li></li>
list two divs, that represents those posts in DESC order. Should this be done with PHP? Or is there a faster way?

Again, it comes down to traffic.

It will make more sense, in a human reading code sense, to parse the people individually and search for activity for each person individually. But in a large traffic/activity site, it might make sense to get all the data in one large array, then parse the array accordingly - it’s just going to be “less clean” when reviewing the code and harder to maintain cleanly.

1 Like

What tips do you have?

I personally would leave it as separate queries until you absolutely have to.

Why exactly? E.g. querying in a for-loop or foreach-loop is considered bad, e.g. performance issues are caused. Maybe they are at micro-level.

Says who?

See this discussion, https://wordpress.org/support/topic/list-all-posts-on-a-page-split-them-by-year

That’s a little different - if you look, both of those queries are against the same table, so the one query makes sense, and the indexing of the array isn’t too bad. It has nothing to do with querying within a foreach loop…

I seee I seee. So you suggest, that it is a pretty good practice to query something in a loop?

From an entity/class standpoint, yes. From a human readable standpoint, yes.

Is it good practice? Depends on the situation. Like the WP link, if you CAN do it in one, then it’s appropriate. But if you’re dealing with disparate data, then splitting is appropriate.

1 Like

Do you have any good example with PHP & mysql?