MySQL keeps restarting

Dear All,
I have a dedicated machine with 32 GB of RAM and 16 CPUs that runs only MySQL. Below are the output of top during the day, which is our peak time, and the MySQL config file. I notice that MySQL sometimes restarts quite a number of times. I don’t understand why MySQL goes down when it has so many dedicated resources. What could the reason be? I am also looking into the slow queries, but I do have enough resources, right?

Top.

top - 12:54:21 up 1 day, 23:31, 2 users, load average: 2.36, 2.00, 2.34
Tasks: 278 total, 3 running, 275 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.0%us, 0.8%sy, 0.0%ni, 94.6%id, 0.2%wa, 0.6%hi, 1.8%si, 0.0%st
Mem: 33009800k total, 22447692k used, 10562108k free, 200920k buffers
Swap: 35061752k total, 0k used, 35061752k free, 18498676k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3011 mysql 20 0 9899m 1.8g 4148 R 127.3 5.8 105:13.53 mysqld
8763 root 20 0 14876 1176 776 R 2.0 0.0 0:00.01 top
1 root 20 0 4080 856 608 S 0.0 0.0 0:01.58 init

My.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
skip-innodb
skip-bdb
max_connections = 1000
key_buffer = 8192M
myisam_sort_buffer_size = 64M
join_buffer_size = 32M
read_buffer_size = 2M
sort_buffer_size = 4M
table_cache = 2048
thread_cache_size = 32
wait_timeout = 200
connect_timeout = 10
max_allowed_packet = 16M
max_connect_errors = 10
query_cache_limit = 4096M
query_cache_size = 1G
query_cache_type = 1
server-id=1283835628
log-bin=mysql-bin
log-error=mysql-bin.err
binlog_do_db=fms,sms
log-slow-queries = /var/log/mysql/mysql-slow.log
long_query_time = 10
log-queries-not-using-indexes
log_warnings = 2

[mysqld_safe]
err-log=/var/log/mysqld.log
open_files_limit = 10000

[mysqldump]
quick
max_allowed_packet = 16M

[isamchk]
key_buffer = 64M
sort_buffer = 64M
read_buffer = 16M
write_buffer = 16M

[myisamchk]
key_buffer = 64M
sort_buffer = 64M
read_buffer = 16M
write_buffer = 16M

[mysql.server]
#user=mysql

Do you have any single particular query running each time MySQL is restarting?

Dear Space,
The problem is that I have one web application and around 3000 GPS devices all using this DB. I know from the .err file the times at which the DB gets restarted, but unfortunately the slow query log does not have timestamps that would let me analyse further and find which query is causing it. Any idea how to get to the query? Thank you.
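
On the timestamp question: as far as I know, the 5.0 slow query log writes a "# Time:" header only when the time differs from the previous entry, so entries without one share the time of the last header above them. Here is a minimal Python sketch, assuming that layout, which carries the last header forward and lists the statements logged shortly before a restart. The sample log text, table names, function names and the five-minute window are all made up for illustration.

```python
from datetime import datetime, timedelta

TIME_PREFIX = "# Time: "

def parse_slow_log(text):
    """Return (timestamp, statement) pairs. Entries without their own
    '# Time:' header inherit the most recent header seen above them."""
    entries = []
    current_time = None
    statement = []
    for line in text.splitlines():
        stripped = line.strip()
        if line.startswith(TIME_PREFIX):
            stamp = line[len(TIME_PREFIX):].strip()
            current_time = datetime.strptime(stamp, "%y%m%d %H:%M:%S")
            statement = []
        elif line.startswith("#"):
            statement = []  # User@Host / Query_time headers; drop any banner text
        elif stripped:
            # Skip bookkeeping statements that precede the real query.
            if not statement and stripped.lower().startswith(("use ", "set timestamp=")):
                continue
            statement.append(stripped)
            if stripped.endswith(";"):
                entries.append((current_time, " ".join(statement)))
                statement = []
    return entries

def queries_before(entries, restart, window=timedelta(minutes=5)):
    """Statements whose inherited timestamp falls in the window before a restart."""
    return [stmt for ts, stmt in entries
            if ts is not None and restart - window <= ts <= restart]

# Hypothetical slow-log excerpt (not taken from the real server).
SAMPLE = """\
# Time: 110216  9:00:00
# User@Host: app[app] @ localhost []
# Query_time: 12  Lock_time: 0  Rows_sent: 10  Rows_examined: 500000
SELECT * FROM trips WHERE driver = 'x';
# Time: 110216 11:31:40
# User@Host: app[app] @ localhost []
# Query_time: 14  Lock_time: 0  Rows_sent: 1  Rows_examined: 900000
use fms;
SELECT COUNT(*) FROM positions WHERE device_id = 42;
# User@Host: app[app] @ localhost []
# Query_time: 11  Lock_time: 9  Rows_sent: 0  Rows_examined: 0
UPDATE positions SET seen = 1 WHERE id = 7;
"""

if __name__ == "__main__":
    restart = datetime(2011, 2, 16, 11, 32, 5)  # one "mysqld ended" time from the .err file
    for stmt in queries_before(parse_slow_log(SAMPLE), restart):
        print(stmt)
```

Feeding it the real slow log and each "mysqld ended" time from the .err file should narrow down which statements were slow just before each restart.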

Do you have any cron jobs (scheduled tasks) due to run at the times that MySQL has been restarting?

Dear Space,
No, I have already checked and there is no cron job. So how do I get to the root cause of this behaviour?

Look into the system logs (/var/log/messages or /var/log/syslog). The main logs in there may give more of a clue if something specific is restarting MySQL.

It could be crashing, but you’d expect an error to be thrown in its error log (does /var/log/mysqld/ contain anything useful?)

top shows you aren’t using all the RAM by a long shot, so it’s not that you are running out of memory. CPU load is low too.

Dear Tim,
Below is the latest content from /var/log/messages and /var/lib/mysql. I can’t find syslog, only messages. As you can see, on 16/2/11 alone there were a few restarts. I can’t find anything in the messages log to guide us further. How can I make the MySQL error log give more information rather than just recording the restart? Any idea?

/var/log/messages
Feb 15 04:02:06 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 15 04:02:06 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 15 04:02:06 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 15 04:02:06 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 16 04:02:03 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 16 04:02:03 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 16 04:02:03 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found
Feb 16 04:02:03 dbserver pcscd: winscard.c:302:SCardConnect() Reader E-Gate 0 0 Not Found

/var/lib/mysql

110215 04:25:01 mysqld started
110215 04:40:06 mysqld ended

110215 04:40:07 mysqld started
110216 00:47:54 mysqld ended

110216 00:47:55 mysqld started
110216 03:02:18 mysqld ended

110216 03:02:19 mysqld started
110216 03:31:49 mysqld ended

110216 03:31:50 mysqld started
110216 08:56:22 mysqld ended

110216 08:56:22 mysqld started
110216 11:32:05 mysqld ended

110216 11:32:05 mysqld started
110216 11:37:01 mysqld ended

110216 11:37:02 mysqld started
110216 17:45:20 mysqld ended

110216 17:45:21 mysqld started

The pcscd errors are something else and, assuming you aren’t using it, can be ignored.

The MySQL log shows the service is being restarted, not crashing. Do you have any system monitoring or similar that might be killing MySQL for you? Or a cron job (scheduled event) that runs and restarts MySQL when it shouldn’t?
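
To line the restarts up against other events (deploys, monitoring checks, manual restarts), it can help to turn the started/ended pairs into uptimes. Here is a minimal Python sketch, assuming the log keeps the "YYMMDD HH:MM:SS mysqld started" / "mysqld ended" layout pasted above. The function name is invented for illustration.

```python
from datetime import datetime

def uptimes(log_text):
    """Pair each 'mysqld started' line with the next 'mysqld ended' line
    and return (start, end, uptime) tuples."""
    runs = []
    started = None
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 4 or parts[2] != "mysqld":
            continue  # blank lines and anything that is not a start/end marker
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1], "%y%m%d %H:%M:%S")
        except ValueError:
            continue
        if parts[3] == "started":
            started = stamp
        elif parts[3] == "ended" and started is not None:
            runs.append((started, stamp, stamp - started))
            started = None
    return runs

# Two of the runs from the log pasted in this thread.
LOG = """\
110216 11:32:05 mysqld started
110216 11:37:01 mysqld ended

110216 11:37:02 mysqld started
110216 17:45:20 mysqld ended
"""

if __name__ == "__main__":
    for start, end, up in uptimes(LOG):
        print(start, "->", end, "up for", up)
```

Very short runs, like the one that lasted under five minutes here, are the ones worth checking first against whatever else was happening on the box.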

Dear Tom,
Nope, I don’t have any cron job or monitoring utility running, so I would like to know why these restarts are happening. Thx.

You really need to find out what is actually restarting the server. To me, from the logs pasted, it looks like a genuine restart and not a crash restart.

Dear Tim,
Just to share with you: at times the web application shows “too many connections”, and only in those instances do we restart the DB. How can I tell whether the “too many connections” is coming from the DB or from Apache itself? I don’t see this error in my MySQL log file either.

Too many connections would imply that the server is under load and not processing the queries quickly enough.

Is the database stored on a RAID array or a single drive? Could this be a bottleneck?

Have you checked the slow query log to find which queries need work to improve their performance? Have you tried tuning MySQL to help with any queries, checking indexes on tables, etc.?

Dear Tim,
I am not too sure about RAID, but I know it is stored on a single drive only. Yes, I am also working through the slow query log; there are still many more queries that need to be optimised. But the question is, with so many resources, why can’t MySQL take my load? For instance, look at today’s snippet of the mysqld log file below. The time between each end and restart is very short, and we did not restart the DB server today either. So is it a bug in MySQL itself?

110216 17:45:21 mysqld started
110217 12:12:41 mysqld ended

110217 12:12:41 mysqld started
110217 14:03:15 mysqld ended

110217 14:03:15 mysqld started
110217 14:23:09 mysqld ended

110217 14:23:10 mysqld started
110217 15:06:58 mysqld ended

110217 15:06:59 mysqld started

What version of MySQL are you on? Have you made sure you are running the latest stable version in your line (5.0 / 5.1 / 5.5)?

Dear Tim,
I am using the default that came with Fedora, which is 5.0.67. Moving to a new DB version would be quite a big challenge for us because one of our tables consists of a few hundred million rows. Before this, MySQL behaved well. Of course resources should not be a problem in my case, right? When you say RAID array, that means more than one hard disk, right?

You have more than enough RAM. If anything, you could increase MySQL’s memory or connection limits without too many issues there. Try pushing it up to 1250 connections and see how that behaves.
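
Before raising max_connections, it is worth doing the arithmetic on the posted my.cnf. Here is a rough Python sketch of a worst-case estimate, assuming every connection allocates its join, read and sort buffers once, all at the same time. In practice these buffers are allocated on demand, so real usage is normally far lower. The formula is a common rule of thumb, not something MySQL itself reports, and it ignores smaller per-thread allocations.

```python
# Values in megabytes, copied from the my.cnf posted at the top of the thread.
GLOBAL_BUFFERS_MB = {
    "key_buffer": 8192,
    "query_cache_size": 1024,
}
PER_CONNECTION_MB = {
    "join_buffer_size": 32,
    "read_buffer_size": 2,
    "sort_buffer_size": 4,
}

def worst_case_mb(max_connections):
    """Global buffers plus one full set of per-connection buffers for
    every allowed connection (a deliberately pessimistic upper bound)."""
    return (sum(GLOBAL_BUFFERS_MB.values())
            + max_connections * sum(PER_CONNECTION_MB.values()))

if __name__ == "__main__":
    for conns in (1000, 1250):
        print(conns, "connections:", worst_case_mb(conns), "MB")
```

With the posted values the ceiling comes to 47216 MB at 1000 connections and 56716 MB at 1250, against roughly 32 GB of physical RAM. The top output shows mysqld nowhere near that today, but it is a reason to keep an eye on memory after raising the limit, and possibly to trim join_buffer_size.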

RAID is indeed an array of multiple hard drives (RAID 0 is striped, for speed; RAID 1 is mirrored, for redundancy; other levels combine both).

If MySQL was behaving OK until recently, what have you done that may have increased the load? Collecting more data? An increase in traffic to the front end?

Dear Tim,
What else in the .cnf file do you think I can tune to overcome this problem? Yes, the load became higher because we added more programs that do complex selects on the DB. How can I tell whether my server is indeed on RAID or not? Thank you.

You’ll know from what you purchased and how it’s been upgraded.

If you’ve added a complex select recently, you should be looking into that and trying to work out whether you can optimise it at all. Use the EXPLAIN syntax on the query; this should help you.

Dear Tim,
The problem is that the server is now at a remote location. Is there any Linux command that tells me exactly how many physical hard disks there are? Is RAID just a matter of settings, or are there special hard disks just for RAID? We have added quite a number of new programs, together with a lot of new queries. What puzzles me is that even with that additional load MySQL shows it is using very few resources, so why does it fail?

RAID is about how the drives are set up. It could be software RAID (Linux md, possibly with LVM on top) or hardware RAID, in which case Linux would see the RAID volume presented rather than the pile of drives that make it up.
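
For checking this remotely: with Linux software RAID (md), the kernel lists its arrays in /proc/mdstat. A hardware RAID controller usually presents the whole array as a single disk, so this file will not reveal the physical drives behind it, and the controller vendor's own tool would be needed. Here is a minimal Python sketch that parses /proc/mdstat-style text. The sample content and function names are illustrative, and only the common "active" array line is handled.

```python
import re

# Matches lines such as: "md0 : active raid1 sdb1[1] sda1[0]"
ARRAY_LINE = re.compile(r"^(md\d+) : (\w+) (\S+) (.+)$")

def md_arrays(mdstat_text):
    """Return {array: {"state", "level", "members"}} for active md arrays."""
    arrays = {}
    for line in mdstat_text.splitlines():
        match = ARRAY_LINE.match(line)
        if not match:
            continue
        name, state, level, members = match.groups()
        arrays[name] = {
            "state": state,
            "level": level,
            # Strip the "[n]" role index (and any "(F)" failure flag) from each member.
            "members": [re.sub(r"\[\d+\].*$", "", dev) for dev in members.split()],
        }
    return arrays

def read_mdstat(path="/proc/mdstat"):
    """Read the kernel's md status; an empty string means the md driver is not loaded."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""

# Illustrative content for a two-disk mirror (not from the server in this thread).
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(md_arrays(read_mdstat()) or "no md software RAID arrays found")
```

An empty result only rules out md software RAID; a single drive and a hardware RAID volume look the same from here.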

MySQL is using resources. However, it’s possible you’ve got ONE query that’s locking tables; in that case connections build up until you run out of spare connections, which causes the server to appear to fail. You really need to be looking at what the server is doing when it starts to struggle, rather than just resetting it.
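
One way to see what the server is doing when it starts to struggle is to capture SHOW FULL PROCESSLIST at that moment, before any restart. Here is a minimal Python sketch that summarises the tab-separated output of `mysql -B -e "SHOW FULL PROCESSLIST"`. It assumes the 5.0 column order (Id, User, Host, db, Command, Time, State, Info) and that MyISAM table-lock waits show the state "Locked". The sample rows, table names and function name are invented.

```python
import csv
import io

def summarise(processlist_tsv):
    """Count connections and lock waiters, and pick out the longest-running
    queries that are not themselves waiting on a lock (likely lock holders)."""
    reader = csv.DictReader(io.StringIO(processlist_tsv),
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)

    def seconds(row):
        return int(row["Time"]) if row["Time"].isdigit() else 0

    waiting = [r for r in rows if r["State"] == "Locked"]
    running = [r for r in rows
               if r["Command"] == "Query" and r["State"] != "Locked"]
    running.sort(key=seconds, reverse=True)
    return {
        "connections": len(rows),
        "waiting_on_lock": len(waiting),
        "longest_running": running[:3],
    }

# Hypothetical capture taken while the application reports "too many connections".
HEADER = ["Id", "User", "Host", "db", "Command", "Time", "State", "Info"]
ROWS = [
    ["101", "app", "localhost", "fms", "Query", "95", "Sending data",
     "SELECT * FROM positions"],
    ["102", "app", "localhost", "fms", "Query", "90", "Locked",
     "UPDATE positions SET seen = 1"],
    ["103", "app", "localhost", "fms", "Query", "88", "Locked",
     "INSERT INTO positions VALUES (1)"],
    ["104", "app", "localhost", "fms", "Sleep", "3", "", "NULL"],
]
SAMPLE = "\n".join("\t".join(row) for row in [HEADER] + ROWS) + "\n"

if __name__ == "__main__":
    summary = summarise(SAMPLE)
    print(summary["connections"], "connections,",
          summary["waiting_on_lock"], "waiting on a table lock")
    for row in summary["longest_running"]:
        print(row["Time"], "s:", row["Info"])
```

If many rows are in the Locked state behind one long-running statement, that statement is the first candidate for EXPLAIN and index work.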