Hi I am currently planning to run PHP and MySql to store large amounts of personal data.
I am doing tests on my local machine, with a database with these fields
id age sex postcode mortgage married prop_val employed car_own head_house children
I currently have 2025150 rows in my database to execute a query to return all these rows takes 17.76 seconds
In the final app there will be 30million plus records and many more fields. I would expect people pulling results of around 10000 to 150000, with up to 20 simultaneous conections. Ideally I would like this as an ajax style app where the results are filtered on the fly altering the count.
Does anyone have any suggestions to maximise the speed.
I have already considered having the post code limiter on a separate page non ajax to limit the results.
Yes ha ha this morning I thought of that, realised I only need to select id, I now gat 2208226 in 7.69 seconds. Instead of 18 seconds. I had always been told to only select the fields you need but in most applications the time saving is minimal in this it is clear to see a huge performance boost.
Slightly off-topic, but I wonder which is more performant, SELECT * or SELECT fields when you actually do need all the fields in the table…
I guess what I’m pondering is whether or not there is any overhead incurred by the RDBMS when asked to pluck out the fields rather than just getting 'em all.
R927 the reason for returning all the ids is that I really only want a count of the data, as the idea is your create a count of your required data, limiting on various data types like age, sex, etc, etc. You then pay for your list and you will receive a link to download the data.
All I am doing is refining the search to determine the final count of data.
ScallioXTX this is my test database, I have noticed there is no index, I thought I setup an index on id with auto increment obviously not.
CREATE TABLE IF NOT EXISTS data ( id int(10) NOT NULL, age int(3) NOT NULL, sex int(1) NOT NULL, postcode varchar(25) NOT NULL, mortgage int(10) NOT NULL, married int(1) NOT NULL, prop_val int(10) NOT NULL, employed int(1) NOT NULL, car_own int(1) NOT NULL, head_house int(1) NOT NULL, children int(2) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Can you tell me more about Shards?
I also think that it is PHP slowing the process down, as on php my admin the query is always a small amount but the page refresh is very slow.
This is how bad habits learned when working with tiny data sets (SELECT *, returning a long result set and counting in PHP, not designing indexes, using bad data types, returning too many rows, etc.) come back to bite you.
None of that stuff makes a difference with 100 rows, but needs to be paid attention to now.
Using EXPLAIN to ensure your queries use indexes, and caching where appropriate will probably give you the biggest performance gains.
@AnthonySterling - it’s my understanding (never tested it) that even when you do need to select all fields it is faster to specify them all in the query.