Go Back   SitePoint Forums > Forum Index > Program Your Site > Databases > MySQL
Newsletter FAQ Members List Calendar Mark Forums Read

New to SitePoint Forums? Register here for free!

SitePoint Sponsor
 
Reply
 
Thread Tools Display Modes
Old Nov 1, 2005, 19:20   #1
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
Will Clustering or Increasing My Table Cache Increase My SELECT Speed?

I have a program, which is the only program quering my database at a time.

It performs over 3,000,000 cross checking queries on 4 tables consisting of 350,000 rows total and 94 MB. My table cache is set to 64 MB. Program takes about 12 hours to complete.

1. If I'm opening a database and using tables that are greater than 64 MB, should I be increasing my table cache?

2. If I only open 1 connection to the DB and start rapid firing SELECT queries, would a MYSQL cluster enhance the speed of my application? Or does clustering only help you out with multiple connections?

When I look at my processes, mysqld appears to only be uisng about 24MB (System has 768MB of memory never going over 40% usage) and very low processor percentage, under 2%. Plus the script (PERL Script) is using about 100 MB of Memory since I load 300,000 rows of data into memory at the start and under 5% of processor.

Are there any mysql commands that will show you the bottlenecks?
awohld is offline   Reply With Quote
Old Nov 1, 2005, 20:13   #2
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
first off, here's something you may not realize about the connectors for mysql. when you do a query and retrieve the results, the ENTIRE result set is copied from the mysql server to a variable in your application. so if your app and mysql run on the same server, that results set has to be held in memory TWICE.

so, based ont ath, you might be tempted to separate your app server from your mysql server. i have found that it's faster to keep them running ont he same server because the network connection between the servers gives you a hit more painful than buying more memory.

if your mysql server is on the same box as your app server, then you can also decrease the amount of time it takes to copy the records by switching to pipes or shared memory instead of using tcp connections.
Quote:
Originally Posted by awohld
1. If I'm opening a database and using tables that are greater than 64 MB, should I be increasing my table cache?
not necessarily. the first thing to check is that your key cache has enough memory to hold the keys for all of your tables. see http://dev.mysql.com/doc/refman/5.0/...variables.html and pay attention to the parts about the key_% variables, specifically:
Quote:
Key_read_requests

The number of requests to read a key block from the cache.

Key_reads

The number of physical reads of a key block from disk. If Key_reads is big, then your key_buffer_size value is probably too small. The cache miss rate can be calculated as Key_reads/Key_read_requests.
basically, if your key_reads keeps going up during your application running, then your key cache is not big enough to hold all of your keys for all of your tables.
Quote:
2. If I only open 1 connection to the DB and start rapid firing SELECT queries, would a MYSQL cluster enhance the speed of my application? Or does clustering only help you out with multiple connections?
clustering would only help with multiple connections.
Quote:
When I look at my processes, mysqld appears to only be uisng about 24MB (System has 768MB of memory never going over 40% usage) and very low processor percentage, under 2%. Plus the script (PERL Script) is using about 100 MB of Memory since I load 300,000 rows of data into memory at the start and under 5% of processor.
sounds like your problem is disk access and not processor power. as an expreriment, try setting mysql to use 64mb for key cache, 128mb for table cache, and 32mb for the sort cache.
Quote:
Are there any mysql commands that will show you the bottlenecks?
read the second half of this page: http://dev.mysql.com/doc/refman/5.0/...arameters.html
longneck is offline   Reply With Quote
Old Nov 2, 2005, 13:10   #3
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
Quote:
Originally Posted by longneck
basically, if your key_reads keeps going up during your application running, then your key cache is not big enough to hold all of your keys for all of your tables.
While running my app here is I got for: Key_read_requests - 3445531911 and Key_reads - 210229, then a few minutes later it's Key_read_requests 3464518463 and Key_reads - 210229.

So by this it looks like my key cache is good right?


I never thought about doing it in SQL, I'll have to look at that!
awohld is offline   Reply With Quote
Old Nov 8, 2005, 08:46   #4
psyon
SitePoint Addict
 
psyon's Avatar
 
Join Date: May 2004
Location: Cedar Rapids
Posts: 298
Quote:
Originally Posted by longneck
first off, here's something you may not realize about the connectors for mysql. when you do a query and retrieve the results, the ENTIRE result set is copied from the mysql server to a variable in your application. so if your app and mysql run on the same server, that results set has to be held in memory TWICE.
That actually depends on the language you use, and the function you call. In the MySQL API, mysql_use_result does NOT copy the whole result set. This is ok to use for single threaded processes. mysql_store_result does copy the whole result set. If you are using multi-threaded applications, this is what you would want to use.

I would say the first thing I would look at if queries are taking a long time to return, is to check you indexes. I have a table with over 4 million rows, that I can get a result from in less than a second. Also remember to only select the rows you need. That eliminates some of the overhead of transfering the result set.
psyon is offline   Reply With Quote
Old Nov 27, 2005, 11:11   #5
kgun
SitePoint Addict
 
Join Date: Nov 2005
Location: Moss, Norway.
Posts: 280
You have to know the following:

1. Everything digital is a bitstream (dynamic or static).
2. In large projects you have to figure out bottlenecks.
- Memory.
- Connections
- Processor speed.
- Lenguage speed ((Assembler), C is is faster than C++). But some say
that processor speed is much cheaper than programming time. There
is a tradeoff.
- Disk capacity.
3. Generally, minimalise connections.
4. Use distributed processing (on multiple servers) only in really large
projects.
5. I agree, operate on the the data(base) structure where possible.
6. Do you use 1 --> n relations or k --> n relations.
7 Use (memory) references (pointers) in stead of copies where possible.
9. Last but not least, the algorithm and datastructures you use are
important. E.g. some sorting algorithmes run in n*log(n) time, while
others run in linear time, e.g. counting sort and radix sort in some cases.
A lot of databaseoperations are sorting or sorting related, so be sure that
you use the most effective algorithm.
kgun is offline   Reply With Quote
Old Nov 1, 2005, 20:17   #6
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
btw, why are you pulling 300,000 rows in to your app?
longneck is offline   Reply With Quote
Old Nov 2, 2005, 02:30   #7
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
Quote:
Originally Posted by longneck
btw, why are you pulling 300,000 rows in to your app?
One of the 4 tables has a little over 300,000 rows that have fields that relate to the other 3 tables. This isn't a true DB, but configuration data from a comunications network.

So I pull all the 300,000 rows once and store it in an associative array. I'm using perl and the other option would be to pull a new row 300,000 times. I've read in Perl that it's faster to fetch all your results once and store it in a data structure.

So for each of the 300,000 rows I cross check each of the three other tables to make sure each row is properly populated.

Thanks for your help, I'll try your suggestions. Maybe I should try a SCSI disk also. I'll keep you posted.

Thanks!
awohld is offline   Reply With Quote
Old Nov 2, 2005, 07:15   #8
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
ok, what are you doing with those 300,000 rows once you get them in that array? if it's something you can be doing in SQL instead, do it there as mysql can probably do it faster.
longneck is offline   Reply With Quote
Old Nov 2, 2005, 13:44   #9
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
Quote:
Originally Posted by longneck
ok, what are you doing with those 300,000 rows once you get them in that array? if it's something you can be doing in SQL instead, do it there as mysql can probably do it faster.
I base my queries on the three other tables based on the values row by row in the 300,000 row table. Some values I have to convert manually in my program, like 800 should be converted to 0 before I crosscheck. And I have to find how many rows are returned for duplicates and nonexistent rows. I may be able to do it in SQL, I'll have to up my knowing of SQL a few levels.
awohld is offline   Reply With Quote
Old Nov 2, 2005, 14:39   #10
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
Quote:
Originally Posted by awohld
I base my queries on the three other tables based on the values row by row in the 300,000 row table.
that is exactly why your program takes so long. post the sequence of queries here and we might be able to tell you how to do it in one query.
Quote:
Originally Posted by awohld
Created_tmp_tables is staying at 860, so I'll trying increasing the table cache tonight when I can reboot the mysql server.
based on your previous statement about how you are querying the database, this probably won't help you.
longneck is offline   Reply With Quote
Old Nov 3, 2005, 00:51   #11
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
All I have to say is that my server is burning away!!!

I changed my.cnf to :
Code:
[mysqld]
set-variable = table_cache=128
query_cache_type=1
query_cache_size=16M
Thus enabling query cache and increasing my table cache.

My proc usage for mysql used to be 2%, now it's 80.9% and rising. And the proc usage for the perl script went from 5% to 12.5%.

What took me 9 hours before is now taking about 20 minutes.

Check out my Query Cache!
Code:
 Qcache free blocks   1  
 Qcache free memory   11557312  
 Qcache hits   308167  
 Qcache inserts   5088  
 Qcache lowmem prunes   0  
 Qcache not cached   23  
 Qcache queries in cache   5088  
 Qcache total blocks   10179
I'll have to experiment to find out if it's the query cache or table cache that did this! I've never seen my mysql processor usage so high!

Here's the staus of everything, I'm getting Qcache Low Memory Prunes, looks like I should add some more Query Cache!
Code:
 Runtime Information 
This MySQL server has been running for 0 days, 0 hours, 46 minutes and 52 seconds. It started up on Nov 03, 2005 at 01:30 AM. 
Server traffic: These tables show the network traffic statistics of this MySQL server since its startup.
 Traffic   ø per hour  
 Received   123,464 KB   158,062 KB  
 Sent   255,892 KB   327,600 KB  
 Total   379,356 KB   485,662 KB  
  Connections   ø per hour   %  
 Failed attempts   0   0.00   0.00 %  
 Aborted   0   0.00   0.00 %  
 Total   65   83.21   100.00 %  
 

Query statistics: Since its startup, 924,457 queries have been sent to the server.  Total   ø per hour   ø per minute   ø per second  
 924,457   1,183,515.36   19,725.26   328.75  
 
 Query type   ø per hour   %  
 admin commands   0   0.00   0.00 %  
 alter db   0   0.00   0.00 %  
 alter table   0   0.00   0.00 %  
 analyze   0   0.00   0.00 %  
 backup table   0   0.00   0.00 %  
 begin   0   0.00   0.00 %  
 change db   22   28.17   0.00 %  
 change master   0   0.00   0.00 %  
 check   0   0.00   0.00 %  
 checksum   0   0.00   0.00 %  
 commit   0   0.00   0.00 %  
 create db   0   0.00   0.00 %  
 create function   0   0.00   0.00 %  
 create index   0   0.00   0.00 %  
 create table   0   0.00   0.00 %  
 dealloc sql   0   0.00   0.00 %  
 delete   0   0.00   0.00 %  
 delete multi   0   0.00   0.00 %  
 do   0   0.00   0.00 %  
 drop db   0   0.00   0.00 %  
 drop function   0   0.00   0.00 %  
 drop index   0   0.00   0.00 %  
 drop table   0   0.00   0.00 %  
 drop user   0   0.00   0.00 %  
 execute sql   0   0.00   0.00 %  
 flush   0   0.00   0.00 %  
 grant   0   0.00   0.00 %  
 ha close   0   0.00   0.00 %  
 ha open   0   0.00   0.00 %  
 ha read   0   0.00   0.00 %  
 help   0   0.00   0.00 %  
 insert   0   0.00   0.00 %  
 insert select   0   0.00   0.00 %  
 kill   0   0.00   0.00 %  
 load   0   0.00   0.00 %  
 load master data   0   0.00   0.00 %  
 load master table   0   0.00   0.00 %  
 lock tables   0   0.00   0.00 %  
 optimize   0   0.00   0.00 %  
 preload keys   0   0.00   0.00 %  
 prepare sql   0   0.00   0.00 %  
 purge   0   0.00   0.00 %  
 purge before date   0   0.00   0.00 %  
 rename table   0   0.00   0.00 %  
  Query type   ø per hour   %  
 repair   0   0.00   0.00 %  
 replace   0   0.00   0.00 %  
 replace select   0   0.00   0.00 %  
 reset   0   0.00   0.00 %  
 restore table   0   0.00   0.00 %  
 revoke   0   0.00   0.00 %  
 revoke all   0   0.00   0.00 %  
 rollback   0   0.00   0.00 %  
 savepoint   0   0.00   0.00 %  
 select   18,497   23,680.37   2.00 %  
 set option   126   161.31   0.01 %  
 show binlog events   0   0.00   0.00 %  
 show binlogs   25   32.01   0.00 %  
 show charsets   31   39.69   0.00 %  
 show collations   31   39.69   0.00 %  
 show column types   0   0.00   0.00 %  
 show create db   0   0.00   0.00 %  
 show create table   0   0.00   0.00 %  
 show databases   6   7.68   0.00 %  
 show errors   0   0.00   0.00 %  
 show fields   0   0.00   0.00 %  
 show grants   3   3.84   0.00 %  
 show innodb status   0   0.00   0.00 %  
 show keys   0   0.00   0.00 %  
 show logs   0   0.00   0.00 %  
 show master status   0   0.00   0.00 %  
 show new master   0   0.00   0.00 %  
 show open tables   0   0.00   0.00 %  
 show privileges   0   0.00   0.00 %  
 show processlist   9   11.52   0.00 %  
 show slave hosts   0   0.00   0.00 %  
 show slave status   0   0.00   0.00 %  
 show status   12   15.36   0.00 %  
 show storage engines   0   0.00   0.00 %  
 show tables   40   51.21   0.00 %  
 show variables   64   81.93   0.01 %  
 show warnings   0   0.00   0.00 %  
 slave start   0   0.00   0.00 %  
 slave stop   0   0.00   0.00 %  
 truncate   0   0.00   0.00 %  
 unlock tables   0   0.00   0.00 %  
 update   0   0.00   0.00 %  
 update multi   0   0.00   0.00 %  
 

More status variables
 Variable   Value  
 Binlog cache disk use   0  
 Binlog cache use   0  
 Created tmp disk tables   3096  
 Created tmp files   3  
 Created tmp tables   3096  
 Delayed errors   0  
 Delayed insert threads   0  
 Delayed writes   0  
 Flush commands   1  
 Handler commit   0  
 Handler delete   0  
 Handler discover   0  
 Handler read first   11241  
 Handler read key   2  
 Handler read next   220202776  
 Handler read prev   0  
 Handler read rnd   4724  
 Handler read rnd next   77785956  
 Handler rollback   0  
 Handler update   0  
 Handler write   41842  
  Variable   Value  
 Key blocks not flushed   1  
 Key blocks unused   4729  
 Key blocks used   2533  
 Key read requests   85286952  
 Key reads   5547  
 Key write requests   6337  
 Key writes   0  
 Max used connections   6  
 Not flushed delayed rows   0  
 Open files   16  
 Open streams   0  
 Open tables   7  
 Opened tables   18  
 Qcache free blocks   526  
 Qcache free memory   1676336  
 Qcache hits   905531  
 Qcache inserts   18426  
 Qcache lowmem prunes   3743  
 Qcache not cached   71  
 Qcache queries in cache   14683  
 Qcache total blocks   29896  
  Variable   Value  
 Rpl status   NULL  
 Select full join   0  
 Select full range join   0  
 Select range   0  
 Select range check   0  
 Select scan   18427  
 Slave open temp tables   0  
 Slave running   OFF  
 Slave retried transactions   0  
 Slow launch threads   0  
 Slow queries   0  
 Sort merge passes   0  
 Sort range   0  
 Sort rows   4724  
 Sort scan   3094  
 Table locks immediate   18463  
 Table locks waited   0  
 Threads cached   0  
 Threads connected   4  
 Threads created   64  
 Threads running   2

Last edited by awohld; Nov 3, 2005 at 01:21.
awohld is offline   Reply With Quote
Old Nov 2, 2005, 13:26   #12
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
ok, sounds like your keycache is a good size.

next, i would look at this:
Quote:
Created_tmp_tables

The number of in-memory temporary tables created automatically by the server while executing statements. If Created_tmp_disk_tables is big, you may want to increase the tmp_table_size value to cause temporary tables to be memory-based instead of disk-based.
look at the value, run the query a couple times, then look at the value again and see if it goes up. if it does, do as the manual suggests.

if it doesn't increasing your table cache to hold the entire table in memory may help.

but still, the best possible thing you can do is see if you can get mysql to do whatever you are doing with the data instead of perl.
longneck is offline   Reply With Quote
Old Nov 2, 2005, 13:55   #13
awohld
SitePoint Enthusiast
 
Join Date: Feb 2004
Location: Glen Ellyn
Posts: 56
Quote:
Originally Posted by longneck
if it doesn't increasing your table cache to hold the entire table in memory may help.
Created_tmp_tables is staying at 860, so I'll trying increasing the table cache tonight when I can reboot the mysql server.
awohld is offline   Reply With Quote
Old Nov 3, 2005, 12:23   #14
cyberlot
SitePoint Zealot
 
Join Date: May 2003
Location: Midwest
Posts: 108
The default settings for mysql are very light to say the least, designed to ensure mysql will run on systems will very little memory.

You might want to try replacing my.cnf with my-large.cnf or my-medium.cnf you should have thse on your server somewhere.

These will give you larger sort and key buffers and other settings which should give you better overall improvement.

MySQL caching is nice but its not a solution for a poorly configured server.. Get your server running the best you can FIRST, then turn on query caching.
cyberlot is offline   Reply With Quote
Old Nov 3, 2005, 12:25   #15
longneck
reads the Community Crier
silver trophybronze trophy
SitePoint Award Recipient
 
longneck's Avatar
 
Join Date: Feb 2004
Location: Tampa, FL (US)
Posts: 9,896
i agree completely with cyberlot, especially the bit about "MySQL caching is nice but its not a solution for a poorly configured server.. Get your server running the best you can FIRST, then turn on query caching."
longneck is offline   Reply With Quote
Old Nov 4, 2005, 16:10   #16
sowhat
SitePoint Zealot
 
sowhat's Avatar
 
Join Date: Jun 2005
Posts: 165
I am glad things worked better for you

I have a similar case where I have to process 4+ million entries from one table joining another table. Calibrating MySQL and apache makes that work in 5 minutes!
sowhat is offline   Reply With Quote
Old Oct 8, 2006, 16:34   #17
Marapets.com
SitePoint Enthusiast
 
Join Date: Apr 2006
Posts: 43
Quote:
Originally Posted by awohld
I have a program, which is the only program quering my database at a time.

It performs over 3,000,000 cross checking queries on 4 tables consisting of 350,000 rows total and 94 MB. My table cache is set to 64 MB. Program takes about 12 hours to complete.

1. If I'm opening a database and using tables that are greater than 64 MB, should I be increasing my table cache?

2. If I only open 1 connection to the DB and start rapid firing SELECT queries, would a MYSQL cluster enhance the speed of my application? Or does clustering only help you out with multiple connections?

When I look at my processes, mysqld appears to only be uisng about 24MB (System has 768MB of memory never going over 40% usage) and very low processor percentage, under 2%. Plus the script (PERL Script) is using about 100 MB of Memory since I load 300,000 rows of data into memory at the start and under 5% of processor.

Are there any mysql commands that will show you the bottlenecks?

You should definately use Mysql Cluster if you use simple select with a primary key.. its amazingly fast and boasts something crazy like 100,000 processes a second.
Marapets.com is offline   Reply With Quote
Reply

Bookmarks

« Previous Thread | Next Thread »

Thread Tools
Display Modes

 
Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Sponsored Links
 
Forum Jump


All times are GMT -7. The time now is 17:59.


Powered by vBulletin® Version 3.7.1
Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.
Copyright 1998-2009, SitePoint Pty Ltd. All Rights Reserved