SitePoint Sponsor |
|
User Tag List
Results 1 to 11 of 11
-
May 5, 2006, 16:42 #1
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Looping through millions of records taking forever
Hello,
I have a page that builds rows for a table based on mysql data. However, there are millions of records in the table I am querying and I notice that it takes 20 or so seconds to build 7 table rows.
Summary:
Each table row queries millions of website visits to determine that days total for web traffic (I use a basic select count for overall day traffic AND select count(distinct field) for unique.
This happens 7 times in the loop
Should I create a function instead that builds the rows/queries and just put the function in the loop? I am using LInux/Mysql , not shared, and I confirmed the table structure earlier is fine for querying.
Any suggestions for performance based on your experience? I can show code, but my question is mostly general as what I learn would also possibly help later on
Thank you
-
May 5, 2006, 17:29 #2
- Join Date
- Aug 2000
- Location
- Philadephia, PA
- Posts
- 20,578
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
Performance this poor means that either you're performing multiple queries where you only need one, your queries are poorly written, or you're not making appropriate use of indexes on your tables.
You can select all of your websites, the day's total traffic, and the day's unique traffic counts all in a single query.
If you're already using a single query to do this, put the word EXPLAIN in front of the query and run it (through the mysql command-line or your favorite interface). MySQL will tell you what it's doing to build your result set, so you can spot potential keys for indexing or portions of the query that should be rewritten.Try Improvely, your online marketing dashboard.
→ Conversion tracking, click fraud detection, A/B testing and more
-
May 5, 2006, 17:35 #3
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Hi Dan,
The reason I don't use one query is because I am looping through 7 days of traffic. 7 queries for each specific day (timestamped)
I'm having a mental block as to how to avoid this. I keep thinking I had to do a separate query per day, where I loop 7 times for 7 days
Each loop is something like this:
select count(ip) from traffic table where day = '$timestamp'
That's my dilema. Getting those results in a loop to show with *one* query so I can build 7 rows in a table for all 7 days (as in, a summary of the last 7 days)
-
May 5, 2006, 17:43 #4
- Join Date
- Aug 2000
- Location
- Philadephia, PA
- Posts
- 20,578
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
Code:SELECT ID, website, (SELECT COUNT(IP) FROM traffic WHERE TO_DAYS(NOW()) - TO_DAYS(day) < 7) AS hitCount, (SELECT COUNT(DISTINCT(IP)) FROM traffic WHERE TO_DAYS(NOW()) - TO_DAYS(day) < 7) AS uniqueCount FROM traffic
Off Topic:
You should probably use DATE_SUB instead of TO_DAYS but I'm not comfortable enough with that function to write an untested example.Try Improvely, your online marketing dashboard.
→ Conversion tracking, click fraud detection, A/B testing and more
-
May 5, 2006, 17:52 #5
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Interesting. I will test this. I didn't realize I could loop through 7 rows with a unique date like this. This opened my eyes to TO_DAYS
Thanks. I'll let you know if it burps ,but I trust it should work.
-
May 5, 2006, 18:01 #6
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Seem to always get query errors. Do I not use a timestamp or single quote inside the TO_DATE?
Error:
#1064 - You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT COUNT(IP) FROM alltraffic WHERE TO_DAYS('1146801600') -
I'll keeep testing, of course...just fyi
-
May 5, 2006, 18:06 #7
- Join Date
- Aug 2000
- Location
- Philadephia, PA
- Posts
- 20,578
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
Do you have MySQL 4.1 or newer?
Try Improvely, your online marketing dashboard.
→ Conversion tracking, click fraud detection, A/B testing and more
-
May 5, 2006, 18:21 #8
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Damnit, I thought I did. I was sure I did, but it's 4.0+. I could use the concept of subtracting dates, just can't use that function I guess. Sorry...
I'm going to use my vars for weekago/today and see if I can get that to work
-
May 5, 2006, 18:22 #9
- Join Date
- Aug 2000
- Location
- Philadephia, PA
- Posts
- 20,578
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
It's not the TO_DAYS() you can't use, it's all subqueries. You can probably rewrite that query using only JOINs, but you're missing out on a lot by not upgrading.
4.1 went into production release October, 2004. Current version is 5.1.
As a side note... you still could've eliminated 6 queries per loop by computing two timestamps representing a 7 day period and selecting rows where the date is between those values.Try Improvely, your online marketing dashboard.
→ Conversion tracking, click fraud detection, A/B testing and more
-
May 5, 2006, 18:34 #10
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Six queries per loop? I want to make sure I explained that each loop only queries
disctinct and non disctinct. 7 looped rows, 2 queries per row (distinct/non distinct where date = that day). Problem was I did between for every DAY I loop through. Yeah...
regardless, I'll look to upgrade...
Thanks
-
May 6, 2006, 12:24 #11
- Join Date
- Mar 2002
- Posts
- 608
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
We upgraded to 4.1+
I put in the query and I get
"Client ran out of memory"
Bookmarks