Key Takeaways
- Background processing can significantly enhance page load times by allowing heavy tasks to be done in the background, freeing up the main thread to focus on loading the page. This results in a smoother and faster browsing experience for the user.
- Background processing involves two key components: a job queue and worker(s). The application creates jobs to be handled while workers wait and take one job at a time from the queue. Multiple worker instances can be created to speed up processing.
- The Beanstalkd job queue can be used to store jobs, with the Symfony Console component implementing workers as console commands, and Supervisor managing worker processes. This technology stack can be utilized to organize and manage background processing.
- In the context of an image gallery, instead of resizing images on the first request which adds overhead to the initial load, an optimized approach would be to render thumbnails after a gallery is created. This task can be moved to the background, improving user experience and scalability.
- It’s crucial to be aware of potential challenges with background processing, such as ensuring tasks are completed successfully and in the correct order, and handling errors in background tasks. These can be mitigated by using a task queue and implementing robust error handling and logging mechanisms.
This article is part of a series on building a sample application — a multi-image gallery blog — for performance benchmarking and optimizations. (View the repo here.)
In a previous article, we added on-demand image resizing. Images are resized on the first request and cached for later use. By doing this, we’ve added some overhead to the first load: the system has to render thumbnails on the fly, which blocks the first user’s page render until image rendering is done.
The optimized approach would be to render thumbnails after a gallery is created. You may be thinking, “Okay, but we’ll then block the user who is creating the gallery?” Not only would it be a bad user experience, but it also isn’t a scalable solution. The user would get confused about long loading times or, even worse, encounter timeouts and/or errors if images are too heavy to be processed. The best solution is to move these heavy tasks into the background.
Background Jobs
Background jobs are the best way of doing any heavy processing. We can immediately notify our user that we’ve received their request and scheduled it for processing. YouTube does the same with uploaded videos: they aren’t accessible immediately after the upload. The user needs to wait until the video is processed completely to preview or share it.
Processing or generating files, sending emails or any other non-critical tasks should be done in the background.
How Does Background Processing Work?
There are two key components in the background processing approach: job queue and worker(s). The application creates jobs that should be handled while workers are waiting and taking from the queue one job at a time.
You can create multiple worker instances (processes) to speed up processing, chop a big job up into smaller chunks and process them simultaneously. It’s up to you how you want to organize and manage background processing, but note that parallel processing isn’t a trivial task: you should take care of potential race conditions and handle failed tasks gracefully.
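The queue-and-worker idea can be sketched without any infrastructure. The snippet below is only an illustration: InMemoryJobQueue and runWorkerOnce are hypothetical names, not part of the sample app, and the in-memory queue stands in for a real job queue. It also shows one big job (five image IDs) being chopped into smaller chunks.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a job queue; a real setup would use a queue server.
final class InMemoryJobQueue
{
    private SplQueue $jobs;

    public function __construct()
    {
        $this->jobs = new SplQueue();
    }

    // Application side: schedule a payload for later processing.
    public function put(string $payload): void
    {
        $this->jobs->enqueue($payload);
    }

    // Worker side: take one job at a time, or null when the queue is empty.
    public function reserve(): ?string
    {
        return $this->jobs->isEmpty() ? null : $this->jobs->dequeue();
    }
}

// A worker handles exactly one job per call and reports whether it did any work.
function runWorkerOnce(InMemoryJobQueue $queue, callable $handler): bool
{
    $payload = $queue->reserve();
    if ($payload === null) {
        return false;
    }
    $handler($payload);

    return true;
}

// Split one big job (five image IDs) into chunks of at most two IDs each.
$queue = new InMemoryJobQueue();
foreach (array_chunk(['img-1', 'img-2', 'img-3', 'img-4', 'img-5'], 2) as $chunk) {
    $queue->put(json_encode($chunk));
}

$processed = [];
$handler = function (string $payload) use (&$processed): void {
    foreach (json_decode($payload, true) as $imageId) {
        $processed[] = $imageId;
    }
};

// Drain the queue; with real workers these calls would run in separate processes.
while (runWorkerOnce($queue, $handler)) {
}
```

Here everything runs in one process, so the order is deterministic. With several real worker processes the chunks would be handled concurrently, which is exactly where race conditions and failed jobs need attention.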
Our tech stack
We’re using the Beanstalkd job queue to store jobs, the Symfony Console component to implement workers as console commands and Supervisor to take care of worker processes.
If you’re using Homestead Improved, Beanstalkd and Supervisor are already installed so you can skip the installation instructions below.
Installing Beanstalkd
Beanstalkd is
a fast work queue with a generic interface originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.
There are many client libraries available that you can use. In our project, we’re using Pheanstalk.
To install Beanstalkd on your Ubuntu or Debian server, simply run sudo apt-get install beanstalkd. Take a look at the official download page to learn how to install Beanstalkd on other OSes.
Once installed, Beanstalkd is started as a daemon, waiting for clients to connect and create (or process) jobs:
/etc/init.d/beanstalkd
Usage: /etc/init.d/beanstalkd {start|stop|force-stop|restart|force-reload|status}
Install Pheanstalk as a dependency by running composer require pda/pheanstalk.
The queue will be used for both creating and fetching jobs, so we’ll centralize queue creation in a factory service, JobQueueFactory:
<?php
namespace App\Service;

use Pheanstalk\Pheanstalk;

class JobQueueFactory
{
    private $host = 'localhost';
    private $port = '11300';

    const QUEUE_IMAGE_RESIZE = 'resize';

    public function createQueue(): Pheanstalk
    {
        return new Pheanstalk($this->host, $this->port);
    }
}
Now we can inject the factory service wherever we need to interact with Beanstalkd queues. We are defining the queue name as a constant and referring to it when putting the job into the queue or watching the queue in workers.
Installing Supervisor
According to the official page, Supervisor is a
client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
We’ll be using it to start, restart, scale and monitor worker processes.
Install Supervisor on your Ubuntu/Debian server by running sudo apt-get install supervisor. Once installed, Supervisor will be running in the background as a daemon. Use supervisorctl to control supervisor processes:
$ sudo supervisorctl help
default commands (type help <topic>):
=====================================
add exit open reload restart start tail
avail fg pid remove shutdown status update
clear maintail quit reread signal stop version
To control processes with Supervisor, we first have to write a configuration file and describe how we want our processes to be controlled. Configurations are stored in /etc/supervisor/conf.d/. A simple Supervisor configuration for resize workers would look like this:
[program:resize-worker]
process_name=%(program_name)s_%(process_num)02d
command=php PATH-TO-YOUR-APP/bin/console app:resize-image-worker
autostart=true
autorestart=true
numprocs=5
stderr_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stderr.log
stdout_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stdout.log
We’re telling Supervisor how to name spawned processes, the path to the command that should be run, to automatically start and restart the processes, how many processes we want to have and where to log output. Learn more about Supervisor configurations here.
Resizing images in the background
Once we have our infrastructure set up (i.e., Beanstalkd and Supervisor installed), we can modify our app to resize images in the background after the gallery is created. To do so, we need to:
- update image serving logic in the ImageController
- implement resize workers as console commands
- create Supervisor configuration for our workers
- update fixtures and resize images in the fixture class.
Updating Image Serving Logic
So far we’ve been resizing images on the first request: if the image file for a requested size doesn’t exist, it’s created on the fly.
We’ll now modify ImageController to return image responses for requested size only if the resized image file exists (i.e., only if the image has already been resized).
If not, the app will return a generic placeholder image response saying that the image is being resized at the moment. Note that the placeholder image response has different cache control headers, since we don’t want to cache placeholder images; we want to have the image rendered as soon as the resize process is finished.
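The decision described above can be sketched as a small pure function. This is not the actual ImageController code: chooseImageResponse is a hypothetical helper, and the Cache-Control values are assumed for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: picks the file to serve and the Cache-Control header to send with it.
function chooseImageResponse(string $resizedPath, string $placeholderPath): array
{
    if (is_file($resizedPath)) {
        // The resized image exists, so it can be cached (the max-age value is an assumption).
        return ['path' => $resizedPath, 'cacheControl' => 'public, max-age=31536000'];
    }

    // Not resized yet: serve the placeholder and forbid caching,
    // so the real image appears as soon as the resize job is done.
    return ['path' => $placeholderPath, 'cacheControl' => 'no-store'];
}

// A path that does not exist falls back to the placeholder.
$missing = chooseImageResponse('/path/that/does/not/exist.jpg', 'placeholder.jpg');
```

Keeping the choice in one function makes it easy to verify that a placeholder never goes out with a long-lived cache header.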
We’ll create a simple event called GalleryCreatedEvent with Gallery ID as a payload. This event will be dispatched within the UploadController after Gallery is successfully created:
...
$this->em->persist($gallery);
$this->em->flush();

$this->eventDispatcher->dispatch(
    GalleryCreatedEvent::class,
    new GalleryCreatedEvent($gallery->getId())
);

$this->flashBag->add('success', 'Gallery created! Images are now being processed.');
...
Additionally, we’ll update the flash message with “Images are now being processed.” so the user knows we still have some work to do with their images before they’re ready.
We’ll create a GalleryEventSubscriber event subscriber that will react to the GalleryCreatedEvent and request a resize job for every Image in the newly created Gallery:
public function onGalleryCreated(GalleryCreatedEvent $event)
{
    $queue = $this->jobQueueFactory
        ->createQueue()
        ->useTube(JobQueueFactory::QUEUE_IMAGE_RESIZE);

    $gallery = $this->entityManager
        ->getRepository(Gallery::class)
        ->find($event->getGalleryId());

    if (empty($gallery)) {
        return;
    }

    /** @var Image $image */
    foreach ($gallery->getImages() as $image) {
        $queue->put($image->getId());
    }
}
Now, when a user successfully creates a gallery, the app will render the gallery page, but some images will be shown as placeholders because their thumbnails aren’t ready yet.
Once workers are finished with resizing, the next refresh should render the full Gallery page.
Implement resize workers as console commands
The worker is a simple process doing the same work for every job it gets from the queue. Worker execution is blocked at the $queue->reserve() call until either a job is reserved for that worker or a timeout happens.
A job can be reserved and processed by only one worker at a time. The job usually contains a payload, e.g., a string or a serialized array/object. In our case, it’ll be the ID of an Image from the newly created Gallery, since that’s what the subscriber puts into the queue.
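Our job carries a single ID, but a payload can hold more. As a rough sketch of a serialized payload, the snippet below encodes a job as JSON and validates it on the worker side. The function names and the imageId and widths fields are hypothetical and not part of the sample app.

```php
<?php
declare(strict_types=1);

// Encode a job payload as a JSON string before putting it on the queue.
function encodeResizeJob(string $imageId, array $widths): string
{
    return json_encode(['imageId' => $imageId, 'widths' => $widths], JSON_THROW_ON_ERROR);
}

// Decode the payload in the worker and validate the fields it relies on.
function decodeResizeJob(string $payload): array
{
    $data = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data) || !isset($data['imageId'], $data['widths']) || !is_array($data['widths'])) {
        throw new InvalidArgumentException('Malformed resize job payload');
    }

    return $data;
}

// Round trip: what the application puts in is what the worker gets out.
$payload = encodeResizeJob('3f2a', [320, 640]);
$job = decodeResizeJob($payload);
```

Validating the payload in the worker means a malformed job fails loudly instead of producing a half-processed image.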
A simple worker looks like this:
// Construct a Pheanstalk queue and define which queue to watch.
$queue = $this->getContainer()
    ->get(JobQueueFactory::class)
    ->createQueue()
    ->watch(JobQueueFactory::QUEUE_IMAGE_RESIZE);

// Block execution of this code until job is added to the queue
// Optional argument is timeout in seconds
$job = $queue->reserve(60 * 5);

// On timeout
if (false === $job) {
    $this->output->writeln('Timed out');

    return;
}

try {
    // Do the actual work here, but make sure you're catching exceptions
    // and bury job so it doesn't get back to the queue
    $this->resizeImage($job->getData());

    // Deleting a job from the queue will mark it as processed
    $queue->delete($job);
} catch (\Exception $e) {
    $queue->bury($job);
    throw $e;
}
You may have noticed that workers will exit after a defined timeout or when a job is processed. We could wrap worker logic in an infinite loop and have it repeat its job indefinitely, but that could cause some issues such as database connection timeouts after a long time of inactivity and make deploys harder. To prevent that, our worker lifecycle will be finished after it completes a single task. Supervisor will then restart a worker as a new process.
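The single-job lifecycle can be tried out without Beanstalkd. The following is a minimal sketch in which FakeQueue and runSingleJob are hypothetical stand-ins: one job is reserved, it is deleted on success or buried on failure, and the function returns the exit code a one-shot worker process could use.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory queue that records what happened to each job.
final class FakeQueue
{
    public array $ready = [];
    public array $deleted = [];
    public array $buried = [];

    public function reserve(): ?string
    {
        // array_shift() returns null when there is nothing left to reserve.
        return array_shift($this->ready);
    }

    public function delete(string $job): void
    {
        $this->deleted[] = $job;
    }

    public function bury(string $job): void
    {
        $this->buried[] = $job;
    }
}

// Process at most one job, then return an exit code, as a one-shot worker would.
function runSingleJob(FakeQueue $queue, callable $work): int
{
    $job = $queue->reserve();
    if ($job === null) {
        return 0; // Nothing to do (comparable to a reserve timeout).
    }

    try {
        $work($job);
        $queue->delete($job); // Success: the job is done for good.

        return 0;
    } catch (Throwable $e) {
        $queue->bury($job); // Failure: park the job so it is not retried automatically.

        return 1;
    }
}

$queue = new FakeQueue();
$queue->ready = ['good-image', 'broken-image'];

$work = function (string $imageId): void {
    if ($imageId === 'broken-image') {
        throw new RuntimeException('Cannot resize ' . $imageId);
    }
};

$first = runSingleJob($queue, $work);  // processes good-image
$second = runSingleJob($queue, $work); // fails on broken-image
$third = runSingleJob($queue, $work);  // queue is empty
```

Each call corresponds to one worker process lifetime. With autorestart=true, Supervisor starts a fresh process after every exit, whatever the exit code.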
Take a look at ResizeImageWorkerCommand to get a clear picture of how the Worker command is structured. The worker implemented in this way can also be started manually as a Symfony console command: ./bin/console app:resize-image-worker.
Create Supervisor configuration
We want our workers to start automatically, so we’ll set an autostart=true directive in the config.
Since the worker has to be restarted after a timeout or a successful processing task, we’ll also set an autorestart=true directive.
The best part about background processing is the ease of parallel processing. We can set a numprocs=5 directive and Supervisor will spawn five instances of our workers. They will wait for jobs and process them independently, allowing us to scale our system easily. As your system grows, you’ll probably need to increase the number of processes. Since we’ll have multiple processes running, we need to define the structure of a process name, so we’re setting a process_name=%(program_name)s_%(process_num)02d directive.
Last but not least, we want to store workers’ outputs so we can analyze and debug them if something goes wrong. We’ll define stderr_logfile and stdout_logfile paths.
The complete Supervisor configuration for our resize workers looks like this:
[program:resize-worker]
process_name=%(program_name)s_%(process_num)02d
command=php PATH-TO-YOUR-APP/bin/console app:resize-image-worker
autostart=true
autorestart=true
numprocs=5
stderr_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stderr.log
stdout_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stdout.log
After creating (or updating) the configuration file in the /etc/supervisor/conf.d/ directory, you have to tell Supervisor to re-read and update its configuration by executing the following commands:
supervisorctl reread
supervisorctl update
If you’re using Homestead Improved (and you should be!) you can use scripts/setup-supervisor.sh to generate the Supervisor configuration for this project: sudo ./scripts/setup-supervisor.sh.
Update Fixtures
Image thumbnails won’t be rendered on the first request anymore, so we need to request rendering for every Image explicitly when we’re loading our fixtures in the LoadGalleriesData fixture class:
$imageResizer = $this->container->get(ImageResizer::class);
$fileManager = $this->container->get(FileManager::class);
...
$gallery->addImage($image);
$manager->persist($image);

$fullPath = $fileManager->getFilePath($image->getFilename());
if (false === empty($fullPath)) {
    foreach ($imageResizer->getSupportedWidths() as $width) {
        $imageResizer->getResizedPath($fullPath, $width, true);
    }
}
You’ll notice that loading fixtures is now considerably slower. That’s exactly why we’ve moved resizing into the background instead of forcing our users to wait until it’s done!
Tips and Tricks
Workers are running in the background, so even after you deploy a new version of your app, you’ll have outdated workers running until they’re restarted for the first time.
In our case, we’d have to wait for all our workers to finish their tasks or time out (5 minutes) before we can be sure all our workers are updated. Be aware of this when creating deploy procedures!
Frequently Asked Questions (FAQs) on Background Processing to Speed Up Page Load Times
What is the role of background processing in speeding up page load times?
Background processing plays a crucial role in enhancing the speed of page load times. It allows certain tasks to be performed in the background, freeing up the main thread to focus on loading the page. This means that the user doesn’t have to wait for these tasks to complete before the page loads, resulting in a faster, smoother browsing experience.
How does Symfony Process Component aid in background processing?
The Symfony Process Component is a powerful tool that allows you to execute commands in sub-processes. It provides a simple, object-oriented API for running system commands and managing their output. This can be particularly useful for background processing, as it allows you to run tasks in separate processes without blocking the main thread.
What are some common use cases for background processing?
Background processing is commonly used in situations where tasks can be performed independently of the main thread. This includes tasks such as sending emails, processing images, running complex calculations, and more. By running these tasks in the background, you can improve the performance of your application and provide a better user experience.
How can I execute a background process in PHP?
Executing a background process in PHP can be achieved using the exec()
function. This function allows you to run a command in a sub-process and then continue with the rest of your script without waiting for the command to complete. Here’s a simple example:exec("php background_task.php > /dev/null &");
In this example, background_task.php
is the script you want to run in the background.
What is the Symfony Messenger Component and how does it relate to background processing?
The Symfony Messenger Component is a message bus system that can be used to dispatch messages to handlers asynchronously. This means that you can send a message to the bus and then continue with your script without waiting for the message to be handled. This is a form of background processing, as the handling of the message can be done in a separate process.
How can I use background processing to improve the performance of my website?
By offloading tasks to the background, you can free up the main thread to focus on loading the page. This can significantly improve the performance of your website, especially if you have tasks that are time-consuming or resource-intensive. Some common tasks that can be offloaded to the background include sending emails, processing images, and running complex calculations.
What are some potential challenges with background processing and how can they be mitigated?
One of the main challenges with background processing is ensuring that tasks are completed successfully and in the correct order. A task queue helps by handing out tasks in the order they were added, but with several workers running in parallel, the order in which tasks finish isn’t guaranteed, so tasks that depend on each other need extra care. Another challenge is handling errors in background tasks. This can be addressed by implementing robust error handling and logging mechanisms.
Can background processing be used in conjunction with other performance optimization techniques?
Yes, background processing can be used in conjunction with other performance optimization techniques. For example, you can use caching to store the results of expensive operations, and then use background processing to update the cache at regular intervals. This allows you to serve up-to-date data without slowing down your application.
How can I monitor the progress of my background tasks?
Monitoring the progress of background tasks can be achieved using various tools and techniques. One common approach is to use logging to record the status of each task. You can also use tools like Symfony’s Messenger Component, which provides built-in support for monitoring and debugging background tasks.
Are there any security considerations when using background processing?
Yes, there are several security considerations when using background processing. For example, you need to ensure that your background tasks do not expose sensitive information, and that they are not vulnerable to injection attacks. You should also ensure that your tasks are run in a secure environment, and that they do not have more permissions than they need to complete their job.
Zoran Antolović is an engineer and problem solver from Croatia. CTO, Startup Owner, writer, motivational speaker and usually the good guy. He adores productivity, hates distractions and sometimes overengineers things.