-
Oct 9, 2007, 02:45 #1
Inserting massive amount of records with PHP
I'm building a mailing tool for a client of ours and need to send an email to all their client contacts (over 130,000 emails). This is the first time I need to insert such a huge number of records into the mailqueue table.
Previously I inserted them one by one with an INSERT query. After a while that wasn't performant enough when inserting 1000+ records, so I modified the query to insert multiple records at once: INSERT INTO table (field, ...) VALUES (val, ...), (val, ...), (val, ...), ...
I was already running into PHP memory limit problems when inserting the 1000+ records one by one. That is now solved by the query above, but I'm afraid I'm going to run into memory limit problems again when inserting the 130,000 records.
Does anyone have another solution for adding 130,000 records to a database? Perhaps with sessions and a page that reloads itself?
Any ideas are welcome and will be considered.
Thanks!
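For anyone hitting the same ceiling, here is a minimal sketch of the chunked multi-row INSERT approach described above. The mailqueue column names, the $contacts array and the chunk size are assumptions (not the poster's actual schema), the old mysql_* functions are used only to match the rest of the thread, and an open connection is assumed.
PHP Code:
<?php
// Sketch: build multi-row INSERTs in fixed-size chunks so neither the SQL
// string nor the buffered values grow with the full 130,000-record set.
$chunkSize = 500; // rows per INSERT statement (placeholder value)
$values    = array();

foreach ($contacts as $contact) {
    $values[] = sprintf(
        "('%s', '%s', '%s')",
        mysql_real_escape_string($contact['email']),
        mysql_real_escape_string($contact['subject']),
        mysql_real_escape_string($contact['body'])
    );

    if (count($values) >= $chunkSize) {
        mysql_query('INSERT INTO mailqueue (receiver, subject, body) VALUES ' . implode(',', $values))
            or die('Insert error: ' . mysql_error());
        $values = array(); // release the chunk before building the next one
    }
}

// Flush the final partial chunk, if any.
if (count($values) > 0) {
    mysql_query('INSERT INTO mailqueue (receiver, subject, body) VALUES ' . implode(',', $values))
        or die('Insert error: ' . mysql_error());
}
?>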
-
Oct 9, 2007, 03:06 #2
I think your best option is to have a cron job that runs the mail queue at a time of night when you have the fewest visitors. That way the code you have above would still work, while running faster and affecting users less.
-
Oct 9, 2007, 03:15 #3
LOAD DATA INFILE
PHP Code:
<?php
define ( 'DB_HOST', 'localhost:3306' ); // should not need to change
define ( 'DB_USER', 'user' ); // enter the user name for this database
define ( 'DB_PASS', 'pass' ); // enter the password for this database
define ( 'DB_NAME', 'my_db' ); // enter the database name you are connecting to
define ( 'DB_TABLE', 'my_table' ); // enter the NAME of the database TABLE to INSERT data into
define ( 'DBF_CSVP', './path_to/csv.txt' ); // enter the PATH and NAME of the CSV FILE to IMPORT

mysql_connect ( DB_HOST, DB_USER, DB_PASS ) or die ( 'Connection Error: ' . mysql_error () );
mysql_select_db ( DB_NAME ) or die ( 'Select DB (' . DB_NAME . ') Error: ' . mysql_error () );

/* change or add any (terminated, enclosed or escaped) option that you need */
$query = "LOAD DATA INFILE '" . DBF_CSVP . "' INTO TABLE " . DB_TABLE . " FIELDS TERMINATED BY '\\t\\t' OPTIONALLY ENCLOSED BY '' ESCAPED BY '\\\\' LINES TERMINATED BY '\\r\\n'";
mysql_query ( $query ) or die ( 'LOAD DATA Error: ' . mysql_error () );
?>
-
Oct 9, 2007, 03:23 #4
This is a nice solution, but doesn't it just move the problem to creating the file to load with the LOAD DATA INFILE query?
The problem is not the sending of the mails but the moment the mails (with their body and headers) get inserted into the mailqueue table. I already have a cronjob in place that sends out 50 mails from the mailqueue table each minute.
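If the worry is that building the import file would itself exhaust memory, the file for LOAD DATA INFILE can be written one row at a time, so only a single record is ever held in PHP. A minimal sketch, where the file path, separators and the $contacts array are assumptions (the separators must match the FIELDS/LINES TERMINATED BY options in the LOAD DATA INFILE statement):
PHP Code:
<?php
// Sketch: stream the import file row by row instead of building it in memory.
$fh = fopen('/tmp/mailqueue_import.txt', 'w') or die('Cannot open import file');

foreach ($contacts as $contact) {
    // One line per record; tabs/newlines inside the data would need escaping
    // according to the ESCAPED BY option used in the LOAD DATA INFILE query.
    fwrite($fh, $contact['email'] . "\t" . $contact['subject'] . "\t" . $contact['body'] . "\n");
}

fclose($fh);
// The LOAD DATA INFILE query shown earlier would then import this file in one statement.
?>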
-
Oct 9, 2007, 03:35 #5
Quote: "This is a nice solution but doesn't this just move the problem to creating the file to load in the LOAD DATA INFILE query?"
-
Oct 9, 2007, 12:00 #6
Can anyone come up with another solution?
-
Oct 9, 2007, 12:01 #7
Is it only a one-time insert?
-
Oct 9, 2007, 12:05 #8
What are the records you have to insert?
Isn't the e-mail supposed to be the same for everyone?
-
Oct 9, 2007, 12:08 #9
-
Oct 9, 2007, 17:10 #10
130,000 records is nothing; your database should have no problem handling that and much more if it's set up properly and, of course, if your database layout is good.
Since you mention that you're afraid of running into memory problems, are you talking about PHP or MySQL specifically?
I assume you mean PHP. In that case you could of course just increase the memory limit, but a better solution would be to limit the number of rows inserted at a time and instead do several inserts. That way you make sure the PHP memory limit is never exceeded.
Another thing to consider is how long the script will take to execute.
Even if you set the PHP settings to keep running until the script is finished, the browser window could time out before your script finishes. So if the code could run for a long time before it's done, it could be a good idea to start the script with an asynchronous JavaScript request (popularly called Ajax these days), and then update the status of the "run program" when you get a reply back. That way you can also prevent the user from restarting the process before it's finished.
-
Oct 9, 2007, 17:36 #11
Need more info, post some code.
-
Oct 15, 2007, 13:05 #12
Well, the database layout is simple... I've got a table containing HTML templates that contain wildcards. The wildcards get replaced by the corresponding columns from the user table (name, email, etc.) so we can personalize the mails. This means no two mails are the same.
When a user clicks "send mailing", I run over all the users in the user table, replace all the wildcards in the mailing template, and save the result in the mailQueue table along with the subject, sender address and receiver address.
The mailqueue table gets processed 100 records at a time by a cronjob that connects to an SMTP server we set up specifically for this purpose.
The bottleneck, like TheRedDevil said, is the moment the template gets 'parsed' and 130,000 records get inserted into the mailQueue table in one go.
Would an async request solve this? The script runs with ignore_user_abort() and a time limit of 0. I also use output buffering to output a message to the browser before the insertion starts, to prevent the user from staring at a blank window until the script finishes.
I hope this is enough info :-)
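For reference, a minimal sketch of the ignore_user_abort() / time-limit / output-buffer combination mentioned above; the message text is a placeholder:
PHP Code:
<?php
// Sketch: keep a long-running script alive and show the user something
// before the heavy template parsing and insertion start.
ignore_user_abort(true); // keep running even if the browser disconnects
set_time_limit(0);       // remove PHP's execution time limit

echo 'Your mailing is being queued; this can take a while.';

// Flush any output buffers so the message reaches the browser immediately.
while (ob_get_level() > 0) {
    ob_end_flush();
}
flush();

// The template parsing and mailqueue inserts would run after this point.
?>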
-
Oct 15, 2007, 13:11 #13
I've come up with an idea: store only the IDs of the users that should get the mail in a temp table, along with the template_id, and then run a cronjob every X seconds that does the template parsing and inserts 100 records at a time. Good idea?
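A minimal sketch of what that cron worker could look like; the temp-table name and layout, the wildcard names and the batch size are assumptions, the mysql_* functions again just follow the thread, and an open connection is assumed:
PHP Code:
<?php
// Sketch of one cron run: take up to 100 queued user IDs, build the
// personalised mails and move them into mailQueue.
$batch = mysql_query(
    'SELECT b.user_id, u.name, u.email, t.subject, t.body
       FROM mail_batch b
       JOIN users u ON u.id = b.user_id
       JOIN templates t ON t.id = b.template_id
      LIMIT 100'
) or die(mysql_error());

while ($row = mysql_fetch_assoc($batch)) {
    // Replace the wildcards with this user's data.
    $body = str_replace(array('{NAME}', '{EMAIL}'), array($row['name'], $row['email']), $row['body']);

    mysql_query(sprintf(
        "INSERT INTO mailQueue (receiver, subject, body) VALUES ('%s', '%s', '%s')",
        mysql_real_escape_string($row['email']),
        mysql_real_escape_string($row['subject']),
        mysql_real_escape_string($body)
    )) or die(mysql_error());

    // Remove the processed entry so the next cron run picks up fresh rows.
    mysql_query('DELETE FROM mail_batch WHERE user_id = ' . (int) $row['user_id']) or die(mysql_error());
}
?>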
-
Oct 15, 2007, 13:25 #14
I fail to see how that method would be different from the other method; you would only split the work into several segments.
What you're trying to do should be possible to achieve in one go. I've processed much more data than that before in cron jobs that only run for 4-5 seconds.
Have you looked into where the possible resource/memory hog is located?
An asynchronous request to start the script will not solve a problem in the script itself, but it would improve the user experience.
The questions you need to look over are:
How do you handle the user information?
How many emails do you process at a time?
Are you using the best methods/functions for the task? (I.e. the fastest and most resource/memory-friendly ones.)
It is very difficult for us to help you; after all, we are just describing how to proceed in similar situations. If you post the actual code, it would be easier to see where you could optimize it.
-
Oct 15, 2007, 18:40 #15
ArcanE, I think you are approaching it the wrong way.
There is no need to add all the records to the mail queue table; in fact, there is no need for a mail queue table at all.
I have written a mailing list manager, and what the cron does is search for records of e-mail messages that are flagged "on hold".
If it finds any, it marks them "processing" and then retrieves some addresses from the subscribers table to send to. All I hold in the db after each pass is a count of how many have been mailed, which works in conjunction with "... LIMIT $next, 50", and that's all.
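A rough sketch of that flag-and-offset pattern; the "on hold"/"processing" statuses, the batch of 50 and the LIMIT $next, 50 query come from the description above, while the table and column names, the "done" status and the sending step are assumptions:
PHP Code:
<?php
// Sketch of one cron run for the flag-based approach (no mailqueue table).
$result  = mysql_query("SELECT id, subject, body, sent_count FROM mailings WHERE status = 'on hold' LIMIT 1")
    or die(mysql_error());
$mailing = mysql_fetch_assoc($result);

if ($mailing) {
    // Flag the mailing as busy so an overlapping cron run skips it.
    mysql_query("UPDATE mailings SET status = 'processing' WHERE id = " . (int) $mailing['id']);

    // Fetch the next slice of 50 subscribers based on how many were already mailed.
    $next        = (int) $mailing['sent_count'];
    $subscribers = mysql_query('SELECT name, email FROM subscribers LIMIT ' . $next . ', 50')
        or die(mysql_error());

    $sent = 0;
    while ($row = mysql_fetch_assoc($subscribers)) {
        // Personalise $mailing['body'] for $row and hand it to the SMTP server here.
        $sent++;
    }

    // Store the new count and either release the flag or mark the mailing finished.
    mysql_query("UPDATE mailings SET sent_count = " . ($next + $sent)
        . ", status = '" . ($sent < 50 ? 'done' : 'on hold') . "'"
        . " WHERE id = " . (int) $mailing['id']);
}
?>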
-
Oct 15, 2007, 18:58 #16
Can you post the actual code you're currently using to retrieve and insert the data?