SitePoint Sponsor

User Tag List

Results 1 to 13 of 13
  1. #1
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Counting products in each category - php loop getting slow...

    Greetings,

    I have a category page that lists all the products in a category. In addition, there is a side-navigation menu with a list of say 100 sub-categories along with an item count, kind of like this example:

    Art:
    • Paintings (20 items)
    • Sculptures (2 items)
    • Carvings (4 items)
    • ...


    On my navigation script, I go through a PHP loop of each sub-category within Art and count all of the products within those sub-categories. I have it set up where it records the timestamp so it doesn't count every time the page is loaded, only updates once per hour. Otherwise it will display the count stored in the database instead (which is quicker). The problem now is I have 50,000 items and this counting method is starting to get a bit slow.

    My count script is currently like this within the sub-category loop:

    Category "Art" is parent categorylevel1, while a sub-category like "Paintings" is categorylevel2. I also allow members to list a product in 2 different categories (category1 and category2):

    PHP Code:
    if ($timestamp 1 hour...) { //loops through 100 sub-categories to count products in each subcategory and store count in database.
          
    $sqlcount "SELECT COUNT(*) as num FROM products WHERE (category1level2 = '$categorylevel2id' OR category2level2 = '$categorylevel2id')";
          
    $result mysql_fetch_array(mysql_query($sqlcount));
          
    $count $result['num'];
          
    mysql_query("UPDATE categories SET count = '$count', timestamp = '$onehour' WHERE id = '$categorylevel2id'") or die(mysql_error());

    Let me know if there is a better system. More efficient MySQL query? Cron Jobs?

    Thanks

  2. #2
    SitePoint Enthusiast rajeev13's Avatar
    Join Date
    Nov 2012
    Location
    New Delhi,India
    Posts
    35
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by peppy View Post
    Greetings,

    I have a category page that lists all the products in a category. In addition, there is a side-navigation menu with a list of say 100 sub-categories along with an item count, kind of like this example:

    Art:
    • Paintings (20 items)
    • Sculptures (2 items)
    • Carvings (4 items)
    • ...

    Let me know if there is a better system. More efficient MySQL query? Cron Jobs?

    Thanks
    i use a parent_id for each subcategory and in the product table i store the category_id for each product .
    now i have only to make a join query to retrieve the related record.
    its far better to perform a sql query instead of for loop .

  3. #3
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the tip. I also have the parent category ids for each sub category and the global category id of the entire category itself stored in my products table, so I'll try experimenting with this a bit.

    Although, I'm not certain on how to retrieve a count of all of the products listed in each subcategory without having to do the loop?

    Kind regards

  4. #4
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    UPDATE:

    I've tried doing something like this to get out of php looping mysql counting queries, but it still takes a long time:

    PHP Code:
    $sql "
    SELECT categories.*,
            (
                SELECT    count(*)
                FROM    products
                WHERE    (products.category1level2 = categories.level2id OR products.category2level2 = categories.level2id)
            ) AS productcount
    FROM categories WHERE level1id = '
    $level1id'"

    It works, looks simple, but it's taking just as long as it was before. Let me know if there is a way to improve this.

    Thanks

  5. #5
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, I've come up with another method using JOIN with the categories and items table:

    PHP Code:
    $sql "
    SELECT categories.*, COUNT(products.id) AS productcount 
    FROM categories
    LEFT OUTER JOIN products ON (categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2)
    WHERE level1id = '
    $level1id'
    GROUP BY categories.level2id"

    This is still taking just as long, up to 20-30 seconds for 100+ subcategories. Is this the way I'm supposed to be doing it or is there another way?

    Thanks



    EDIT:

    In addition, even if I do something like this to JOIN categories/products tables without counting (and without grouping), it still takes just as long. Some parent categories might hold 100 sub-categories but only have 2 products total within all sub-categories, and it STILL takes 20-30 seconds:

    PHP Code:
    $sql "
    SELECT categories.*, products.id
    FROM categories
    LEFT OUTER JOIN products ON (categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2)
    WHERE level1id = '
    $level1id

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,338
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    please do a SHOW CREATE TABLE for each table, so that we can see the columns as well as the indexes
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Here is my categories table with relevant columns:

    CREATE TABLE `categories` (
    `id` smallint(5) NOT NULL AUTO_INCREMENT,
    `level1id` smallint(5) unsigned NOT NULL,
    `level2id` smallint(5) unsigned NOT NULL,
    `level3id` smallint(5) unsigned NOT NULL,
    `level4id` smallint(5) unsigned NOT NULL,
    `level5id` smallint(5) unsigned NOT NULL,
    `level6id` smallint(5) unsigned NOT NULL,
    `parent` tinyint(1) NOT NULL,
    `count` int(8) NOT NULL,
    `timestamp` datetime NOT NULL,
    PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=29048 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT


    Here is my products table with relevant columns:

    CREATE TABLE `items` (
    `id` int(8) NOT NULL AUTO_INCREMENT,
    `global_cat_id` smallint(5) NOT NULL,
    `cat1level1` smallint(5) NOT NULL,
    `cat1level2` smallint(5) NOT NULL,
    `cat1level3` smallint(5) NOT NULL,
    `cat1level4` smallint(5) NOT NULL,
    `cat1level5` smallint(5) NOT NULL,
    `cat1level6` smallint(5) NOT NULL,
    `global_cat_id2` smallint(5) NOT NULL,
    `cat2level1` smallint(5) NOT NULL,
    `cat2level2` smallint(5) NOT NULL,
    `cat2level3` smallint(5) NOT NULL,
    `cat2level4` smallint(5) NOT NULL,
    `cat2level5` smallint(5) NOT NULL,
    `cat2level6` smallint(5) NOT NULL,
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=20870 DEFAULT CHARSET=utf8

    In categories table, the level1id is the parent category id of level2id, and level2id is the parent category id of level3id and so on up to 6 levels.

    In the items table, cat1level1 is the parent category id of cat1level2 and so on. We also allow people to select a second category, which are 6 levels of "cat2level#". global_cat_id corresponds to the category "id" column, the actual id number of the category.

    I hope this helps.

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,338
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    the immediate problem, poor execution time, is due to the absence of indexes on the join columns --

    categories.level2id = products.cat1level2 OR categories.level2id = products.cat2level2

    as well as the absence of an index on the filter column --

    WHERE level1id = '$level1id'

    but a deeper, more pernicious problem, is the structure of the tables in having a built-in hierarchy

    please do some research on first normal form and then ask if you need further help
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the help, I've added the 18 relevant columns (the 6 in categories table and 12 from items table) into indexes, and it appears to be working much faster now. Hopefully it's the real thing and not a query from the cache.

    Do you know how well the performance speed will be after there are millions of items in the item table?

    Thanks
    Kind regards

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,338
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by peppy View Post
    Do you know how well the performance speed will be after there are millions of items in the item table?
    approximately the same
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    One last thing... Let's say my items table has a "live" column that is either 0 for not live (disabled) and 1 for live. Is there a way to count the items only when items.live = 1?

    Here is what I got so far:

    PHP Code:
    SELECT categories.*, COUNT(items.id) AS count1 FROM categories
    LEFT OUTER JOIN items ON 
    (categories.level2id items.cat1level2 OR categories.level2id items.cat2level2)
    WHERE level1id '$level1id'
    GROUP BY categories.level2id 
    Like instead of "COUNT(items.id) AS count1" , do something like "COUNT(items.live = 1) AS count1" ?

    Thanks

  12. #12
    SitePoint Addict
    Join Date
    Jul 2007
    Posts
    233
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    UPDATE:

    I replaced "COUNT(items.id) AS count1" with: SUM(CASE WHEN items.live = '1' THEN 1 ELSE 0 END) AS count1 . In addition, I added items.live column to the index.

    This seems to be working well, but I'm not sure if this is the proper way or if it will degrade performance later on when millions of items are added - since this is a SUM function instead of COUNT. I was going to use "COUNT(CASE WHEN items.live = '1' THEN 1 ELSE 0 END)", but all empty sub-categories were showing "1" item by default instead of "0".

    Let me know what you think.

    Thanks
    Kind regards

  13. #13
    SitePoint Member
    Join Date
    Feb 2009
    Posts
    3
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Seems like a lot of code bloat.

    Enter the number of items in the DB once and then do +1 or -1 each time you add or remove an item.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •