Slower than molasses in January

Hello!

I’m running some PHP code that does what I want, but it’s insanely slow and I’m not sure if there’s anything that can be done about it. If you go to:

Create Algebra Assignment and enter a date, and Create Class Schedule, the routine will run. I call the following functions, located in an external file:

function convertLocalTimetoUnixTimestamp ($local_time_zone,$formatted_time) {
// create the DateTimeZone object for later
$dtzone = new DateTimeZone($local_time_zone);
 
// now create the DateTime object for this time and user time zone
$dtime = new DateTime($formatted_time, $dtzone);
 
// print the timestamp
$timestamp = $dtime->format('U');
return $timestamp;
}

function convertUnixTimestamptoLocalTime ($local_time_zone,$unix_time,$time_format) {

// create the DateTimeZone object for later
$dtzone = new DateTimeZone($local_time_zone);
 
// first convert the timestamp into a string representing the local time
$time = date('r', $unix_time);
 
// now create the DateTime object for this time
$dtime = new DateTime($time);
 
// convert this to the user's timezone using the DateTimeZone object
$dtime->setTimeZone($dtzone);
 
// print the time using your preferred format
// print the time using your preferred format
$local_time_formatted = $dtime->format($time_format);
return $local_time_formatted;
	
	
	
	
}

function getDueHomePageHomeSchoolAssignments($read,$user_id) {
	$sql = "SELECT homepage_homeschool_schedule.assignment_id, homepage_homeschool_schedule.assignment_start_date, homepage_homeschool_schedule.time_zone,     			                homepage_homeschool_schedule.assignment_due_date, assignments_instructors.assignment_name
			FROM homepage_homeschool_schedule
			INNER JOIN
			assignments_instructors
			WHERE assignments_instructors.assignment_id = homepage_homeschool_schedule.assignment_id
			AND user_id = $user_id 
			ORDER BY assignment_due_date ASC";
	return $read->fetchAll($sql); 
	
}

function getAssignmentTopics($read,$assignment_id) {
	$sql = "SELECT distinct(section_page_title), book_order, assignments_topics.section_name, assignments_topics.chapter_name
		    FROM assignments_topics
			INNER JOIN questions
			ON assignments_topics.section_name=questions.section_name
			WHERE questions.question_id 
			IN
			(SELECT question_id
               FROM assignments_questions
			WHERE assignment_id=$assignment_id)
            ORDER BY (book_order);";
	return $read->fetchAll($sql);
}

The actual code to run the process is:

<table id="users_table">
  <?php $due_assignments = getDueHomePageHomeSchoolAssignments($dbRead,$user_id);
  if ($due_assignments) { ?>
  <tr>
    <th>Assignment</th>
    <th>Topics</th>
    <th>Start Date</th>
    <th>Due Date</th>
  </tr> <?php ;} ?>
  <?php foreach($due_assignments as $row) { //repeat region 

  $assignment_topics = getAssignmentTopics($dbRead,$row['assignment_id']);
 ?>
  <tr>
    <td><?php echo $row['assignment_name'] ?></td>
    
    <td>
    <?php foreach($assignment_topics as $assignment_topic){
		echo $assignment_topic['section_page_title']."<br />";}?></td>
        <td>
    <?php 
    echo convertUnixTimestamptoLocalTime ($row['time_zone'],$row['assignment_start_date'],"m/j/y") . " at " . convertUnixTimestamptoLocalTime ($row['time_zone'],$row['assignment_start_date'],"g:i a");
	
	?></td>
    <td><?php 
     echo convertUnixTimestamptoLocalTime ($row['time_zone'],$row['assignment_due_date'],"m/j/y") . " at " . convertUnixTimestamptoLocalTime ($row['time_zone'],$row['assignment_due_date'],"g:i a");
	  ?>
    </td>
  </tr>
  <?php } // end of repeat region ?>
</table>

Are there any thoughts as to why it should take upwards of 45 seconds for the full menu to print out would be appreciated (I have Comcast, by the way).

Thank you…

-Eric

Almost certainly it’s (one of) your queries that are causing the delay.

Try using an ON clause in the first (don’t know if it makes a difference, but it’s worth a try):


SELECT 
    homepage_homeschool_schedule.assignment_id
  , homepage_homeschool_schedule.assignment_start_date
  , homepage_homeschool_schedule.time_zone
  , homepage_homeschool_schedule.assignment_due_date
  , assignments_instructors.assignment_name             
FROM homepage_homeschool_schedule             
INNER JOIN assignments_instructors             
ON assignments_instructors.assignment_id = homepage_homeschool_schedule.assignment_id             
WHERE user_id = $user_id              
ORDER BY assignment_due_date ASC

And you could do an EXPLAIN of the two queries to get more info about the bottleneck.
It might be a question of adding the right indexes.

You have nested queries. If getDueHomePageHomeSchoolAssignments returns 10 assignments, and getAssignmentTopics returns 5 topics per assignment, you will be executing 50 queries.

Another possible cause could be memory thrashing and garbage collection if the server you’re hosting on is a bit… taxed in that department. You’re passing the entire array set as the result, ESPECIALLY if you have a very large list of results, is basically making THREE copies in RAM that have to be allocated and released on every query. While this may not be a big deal on the outermost one, the one inside the loop can cause memory fragmentation and make PHP’s garbage collection routines go off to la-la land.

It’s also why sometimes joined queries can in fact end up SLOWER than separate queries if the result set it is large enough. There have been several times I’ve debugged code where the obvious place to put the blame was the query, when it was the result set blowing out RAM as the issue.

Basically the query creates the result set in RAM, which you then copy with fetchAll, which gets copied by the return, that gets copied into your variable – can you see how that might be an issue?

There are other minor optimizations that might help – those function calls for example are extra overhead that may be contributing to your problems – or the opening and closing PHP for nothing over and over… the use of double quotes being slower than single quotes (and the leading contributor to garbage HTML output from PHP).

Also I’m not familiar with any object that would let you pass a query to fetchall… sounds like some made-up fairytale land SQL system that could be grossly inefficient – I’d suggest swapping that out for PDO or mySQLi – probably PDO, it’s the better of the two.

In fact… if that object is handling that query internally that could be ANOTHER giant memory move/copy inside the object.

It’s really why you see code that iterates through fetch() instead of calling fetchAll… the RAM usage of fetchAll (much less it’s cleanup) can gut you quick, especially inside a loop.

I’d really need to see a lot more of your code (like whatever you’re using for a database object) to weigh in more… but if I was doing that with PDO I’d be swinging an axe at those extrernal functions… and probably use prepared queries stored in an array instead.

… and your date function routines seem needlessly complex too, since you could just use date_default_timezone_set.

My approach to that would probably go something like this:

<?php
// assumes $db is a PDO object connected to the database

$queryStrings=array(
	'getDueAssignments' => '
		SELECT homepage_homeschool_schedule.assignment_id, homepage_homeschool_schedule.assignment_start_date, homepage_homeschool_schedule.time_zone,																 homepage_homeschool_schedule.assignment_due_date, assignments_instructors.assignment_name 
		FROM homepage_homeschool_schedule 
		INNER JOIN 
		assignments_instructors 
		WHERE assignments_instructors.assignment_id = homepage_homeschool_schedule.assignment_id 
		AND user_id = :userId  
		ORDER BY assignment_due_date ASC
	',
	'getAssignmentTopics' => '
		SELECT distinct(section_page_title), book_order, assignments_topics.section_name, assignments_topics.chapter_name 
		FROM assignments_topics 
		INNER JOIN questions 
		ON assignments_topics.section_name=questions.section_name 
		WHERE questions.question_id  
		IN (
			SELECT question_id 
			FROM assignments_questions 
			WHERE assignment_id = :assignmentId
		) 
		ORDER BY (book_order)
	'
);

$sDueAssignments=$db->prepare($queryStrings['getDueAssignments']);
if ($sDueAssignments->execute(array('userId' => $user_id))) {
	if ($assignment=$sDueAssignments->fetch()) {
		$oldTz=date_default_timezone_get();
		echo '
		<table id="users_table"> 
			<thead>
				<tr> 
					<th>Assignment</th> 
					<th>Topics</th> 
					<th>Start Date</th> 
					<th>Due Date</th> 
				</tr>
			</thead><tbody>';
		$sAssignmentTopics=$db->prepare($queryStrings['getAssignmentTopics');
		do {
			echo '
				<tr>
					<th>',$assignment['assignment_name'],'</th>
					<td>';
			if ($sAssignmentTopics->execute(array($assignment['assignment_id'])) {
				while ($topic=$sAssignmentTopics->fetch()) {
					echo '
						',$topic['sectionPageTitle'],'<br />';
				}
			}
			date_default_timezone_set($assignment['time_zone']);
			echo '
					</td><td>
						',date('m/j/y \\a\	 d:i a',$assignment['assignment_start_date']),'
					</td><td>
						',date('m/j/y \\a\	 d:i a',$assignment['assignment_due_date']),'
					</td>
				</tr>';
		} while ($assignment=$sDueAssignments->fetch());
		
		echo '
			</tbody>
		</table>';
		date_default_timezone_set($oldTz);
	}
}

?>

Which I believe would run faster. Of course that’s set up to use PDO – not even sure what you’re using there.

Oh, on your second query, you don’t seem to be actually using MOST of the values returned – in fact that’s true of both queries. Getting rid of return values you aren’t using can speed things up a lot too…

It MAY be worth trying to get those into a single query from a handshake to SQL/request point of view as some others have mentioned – though doing so could flush performance by chewing on RAM like it was candy as the results set would grow and grow and grow. It really depends on how many results and how big the data is.

Just curious – just how many results are these queries returning? If you’re pulling more than a hundred at a time you should probably look at applying LIMIT and pagination!

It’s only a minor point compared to optimising the query, but it may be worth putting an index on each field you use in the WHERE/ORDER BY sections of the query, ie user_id, assignment_id, etc.

Alex

I checked for that and missed it completely :frowning:

That inner one being a particularly nasty piece of work – I’ve got the feeling a bunch of stuff is broken up into separate tables that shouldn’t be – or some other part of the database structure is needlessly convoluted given what a … complex query that inner one is.

Guys,

Thanks all for such detailed responses. As usual, I’m learning a bunch from this site from asking lots of questions. I took DeathShadow’s advice (who by the way, helped me clean up some non-validating html just a few days ago — going from 46 validation errors down to 1 after he lit the torch for me) and used his script as presented in this post(after a minor change: I think that you meant to write

if ($sAssignmentTopics->execute(array('assignmentId' => $assignment['assignment_id'])))

). I haven’t seen this syntax of getting into my database as I’ve been using the Zend_framework (why, you ask? Because I’m a newbie…what else can I say!) for almost all of my database calls. And, I kind of wish that I knew about it before. In addition, I didn’t know about the built in time-zone feature :goof:

Regardless, I fear that even with DeathShadow’s new code, it still runs incredibly slowly (though again, I’m happy to have learned something new from the experience).

Any other thoughts at this point would be appreciated. And, if it helps, the way that it prints out is waiting, waiting, waiting then it spits out 10 or so rows, with the process repeating — in other words, it’s not slow one column at a time. Also, altogether, maybe 150 records are pulled which to me doesn’t seem like that many (although am I naive in this case? Are nested queries just slow by nature?)

Finally, regarding my 2nd query being a bit convoluted (as suggested by DeathShadow), there are 3 tables with very distinct data: course topics, questions, and assignments. I tried to adhere to the principle of “one data concept per table” in my design and I do think that in this particular case, it was the way to go.

So…if there might be some other concept that I’m not aware of, I’m all ears. If not, I’ll break the table into “master and detail” pages without too much problem. But, it would be super cool if I could make things happen this way.

Thanks!
-Eric

As mentioned, your biggest hurdle now is the nested query. Running a query inside a loop, much less one as complex as that is bound to take a good long while.

The thing about your second query is really, you’re only pulling one field; if you need three tables for one field something’s awry.

Zend Framework, like a lot of frameworks, is a memory hog. The really sad part of it is that frankly, it’s no more or less complex than PDO or mySQLi from the little bit I tried to use it a while back – at which point just use PDO or mySQLi. That it’s returning fetchAll style results is proof enough of that.

THOUGH – I think that if you could reduce it to just two queries, the first one a fetchAll where you use a quick loop to build and replace it’s keys with your lookup, and then a second one that pulls up ALL the possible results of your second query – and iterate through that result normally – you could use a lookup in the first one’s array for the bits from that keeping the query size under control.

But to even try to put together something like that I’d have to see the full table structures and understand what’s what. Can you post your structures with a brief explanation of what’s pointing at what?

Deathshadow,

Thanks for the continued ideas. I MAY have actually found the real cause…but before jumping up and down for joy, I want to see if it the query goes quickly over the next couple of days (I just now have an unrelated issue where I can’t get into my site and have sent a note to my hosting provider). If in fact I did find the real culprit, I’m going to still post back a question because it will be something that I still need to address (but nothing having to do with the inner query!) – as it may have to do with indexing.

Until then…Happy Friday!

-Eric

(I guess you can ignore the email that I just wrote!)

Deathshadow,

I think that I’ve come to the same conclusion about the Zend_framework being more than what I actually needed. I guess that I started using it since I had bought a book trying to learn PHP/MySql (David Powers: PHP Training From the Source), and though it was a great help for me to get started, in retrospect, I would have been much better off learning basic PDO or mySQLi syntax (random question then: any good online/book resources for PDO?)

For starters, as per your request, below are the table structures as produced by my nightly CRON database dump. I wasn’t quite sure what you meant in your post as “a brief explanation of what’s pointing at what”.
Regardless, here are the structures of my tables:

CREATE TABLE homepage_homeschool_schedule (
user_id int(11) NOT NULL,
assignment_id text NOT NULL,
assignment_start_date text NOT NULL,
assignment_due_date text NOT NULL,
time_zone text NOT NULL,
dbTime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE assignments_instructors (
assignment_id int(11) NOT NULL AUTO_INCREMENT,
assignment_name varchar(50) NOT NULL,
assignment_type text NOT NULL,
due_date text NOT NULL,
course_id int(30) NOT NULL,
length int(11) NOT NULL,
dbTime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (assignment_id)
) ENGINE=MyISAM AUTO_INCREMENT=157 DEFAULT CHARSET=latin1;

CREATE TABLE assignments_topics (
book_order int(11) NOT NULL,
menu_chapter_name text NOT NULL,
chapter_name text NOT NULL,
topic text NOT NULL,
section_name text NOT NULL,
Issues varchar(1000) NOT NULL,
section_page_title text NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE assignments_questions (
assignment_id int(11) NOT NULL,
question_id int(11) NOT NULL,
dbTime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE questions (
question_id int(10) unsigned NOT NULL AUTO_INCREMENT,
author_id int(11) NOT NULL,
chapter_name varchar(50) NOT NULL,
section_name varchar(50) NOT NULL,
text text NOT NULL,
solution text NOT NULL,
incorrect_solution1 text NOT NULL,
incorrect_solution2 text NOT NULL,
incorrect_solution3 text NOT NULL,
solution_order varchar(8) NOT NULL,
filename varchar(30) DEFAULT NULL,
question_type varchar(30) NOT NULL,
public_access varchar(10) NOT NULL,
Explanation text,
dbTime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (question_id)
) ENGINE=MyISAM AUTO_INCREMENT=930 DEFAULT CHARSET=latin1;

However, before diving into this approach, I’d like your opinion on yet another possible solution that I’ve been mulling over that will definitely be fast BUT take up database space. Basically, my website is an online interactive math book. The whole point of this piece is to generate a dynamic “syllabus” for each student, based on questions and assignments input by the instructors (if a course has 50 students, then there would be 1 syllabus per course). Perhaps instead of accessing the syllabii dynamically, it would make more sense for me to just create 1 syllabus per course? Then, my query would be “Hey, get this syllabus that’s already set up in the database from JoeShmoe instructor”.
If you go to Create Algebra Assignment, you’ll see how much data we’re talking about per course. If there were 1000 or 10,000 different syllabii, is this “a lot” of data for a database?

-Eric

Just a quick thanks to everyone who responded to this particular post. After thinking about things for a day, I’m going to in the direction of using Pagination (one of DeathShadows suggestions) in order to minimize the loading time. And, considering that most people will look at these schedules on a week to week basis, that probably makes more sense regardless.

Also, I’m glad to have learned about “PDO” and potential pitfalls with nested queries.

Anyway, enjoy your collective weekends. :slight_smile:

-Eric