Building an Image Gallery Blog with Symfony Flex: Data Testing

This article is part of a series on building a sample application — a multi-image gallery blog — for performance benchmarking and optimizations. (View the repo here.)

In the previous article, we demonstrated how to set up a Symfony project from scratch with Flex, and how to create a simple set of fixtures and get the project up and running.

The next step on our journey is to populate the database with a somewhat realistic amount of data to test application performance.

Note: if you did the “Getting started with the app” step in the previous post, you’ve already followed the steps outlined in this post. If that’s the case, use this post as an explainer on how it was done.

As a bonus, we’ll demonstrate how to set up a simple PHPUnit test suite with basic smoke tests.

Key Takeaways

Utilize Symfony Flex to efficiently build and optimize a multi-image gallery blog, focusing on performance testing and data management.
Implement batch processing in Doctrine to manage memory usage effectively, ensuring stable memory consumption across data fixture loads.
Optimize the image handling process by pre-selecting a set of images and reusing them, which significantly reduces the time and resources required for generating large datasets.
Conduct comprehensive performance testing using tools like Siege, facilitated by Docker, to simulate user interactions and measure the application’s responsiveness and scalability.
Establish a robust testing framework with PHPUnit to ensure the reliability of the application through functional and smoke tests, verifying that all critical components perform correctly under various conditions.

More Fake Data

Once your entities are polished, and you’ve had your “That’s it! I’m done!” moment, it’s a perfect time to create a more significant dataset that can be used for further testing and preparing the app for production.

Simple fixtures like the ones we created in the previous article are great for the development phase, where loading ~30 entities is done quickly, and it can often be repeated while changing the DB schema.

Testing app performance, simulating real-world traffic and detecting bottlenecks requires bigger datasets (i.e. a larger amount of database entries and image files for this project). Generating thousands of entries takes some time (and computer resources), so we want to do it only once.

We could try increasing the COUNT constant in our fixture classes and see what happens:

// src/DataFixtures/ORM/LoadUsersData.php
class LoadUsersData extends AbstractFixture implements ContainerAwareInterface, OrderedFixtureInterface
{
    const COUNT = 500;
    ...
}

// src/DataFixtures/ORM/LoadGalleriesData.php
class LoadGalleriesData extends AbstractFixture implements ContainerAwareInterface, OrderedFixtureInterface
{
    const COUNT = 1000;
    ...
}

Now, if we run bin/refreshDb.sh, after some time we’ll probably get a not-so-nice message like PHP Fatal error: Allowed memory size of N bytes exhausted.

Apart from slow execution, every error would result in an empty database because EntityManager is flushed only at the very end of the fixture class. Additionally, Faker is downloading a random image for every gallery entry. For 1,000 galleries with 5 to 10 images per gallery that would be 5,000 – 10,000 downloads, which is really slow.

There are excellent resources on optimizing Doctrine and Symfony for batch processing, and we’re going to use some of these tips to optimize fixtures loading.

First, we’ll define a batch size of 100 galleries. After every batch, we’ll flush and clear the EntityManager (i.e., detach persisted entities) and tell the garbage collector to do its job.

To track progress, let’s print out some meta information (batch identifier and memory usage).

Note: After calling $manager->clear(), all previously persisted entities become detached. The entity manager no longer tracks them, so reusing one of them in a new association will probably produce an “entity-not-persisted” error.

The key is to merge the entity back into the manager: $entity = $manager->merge($entity);

Without the optimization, memory usage is increasing while running a LoadGalleriesData fixture class:

> loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 24MB
200 Memory usage (currently) 26MB / (max) 26MB
300 Memory usage (currently) 28MB / (max) 28MB
400 Memory usage (currently) 30MB / (max) 30MB
500 Memory usage (currently) 32MB / (max) 32MB
600 Memory usage (currently) 34MB / (max) 34MB
700 Memory usage (currently) 36MB / (max) 36MB
800 Memory usage (currently) 38MB / (max) 38MB
900 Memory usage (currently) 40MB / (max) 40MB
1000 Memory usage (currently) 42MB / (max) 42MB

Memory usage starts at 24 MB and increases by 2 MB with every batch (100 galleries). If we tried to load 100,000 galleries (1,000 batches), we’d need 24 MB + 999 × 2 MB ≈ 2 GB of memory, since each of the 999 batches after the first adds another 2 MB.
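
The extrapolation can be written as a small calculation. This is only a sketch: projectedMemoryMb() is a hypothetical helper, and it assumes the linear growth seen in the log above (24 MB after the first batch, 2 MB more for each later batch) would continue unchanged.

```php
<?php
// Hypothetical helper: linear extrapolation of the unoptimized run above.
function projectedMemoryMb(int $galleries, int $batchSize = 100, int $baseMb = 24, int $growthMb = 2): int
{
    $batches = (int) ceil($galleries / $batchSize);

    // The first batch already sits at the base level; every later batch adds the growth.
    return $baseMb + max(0, $batches - 1) * $growthMb;
}

echo projectedMemoryMb(1000), " MB\n";   // 42 MB, the value in the last log line
echo projectedMemoryMb(100000), " MB\n"; // 2022 MB, roughly 2 GB
```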

After calling $manager->flush() and $manager->clear() followed by gc_collect_cycles() for every batch, disabling SQL logging with $manager->getConnection()->getConfiguration()->setSQLLogger(null), and removing entity references by commenting out $this->addReference('gallery' . $i, $gallery);, memory usage stays roughly constant from batch to batch.

// Define batch size outside of the for loop
$batchSize = 100;

...

for ($i = 1; $i <= self::COUNT; $i++) {
    ...

    // Save the batch at the end of the for loop
    if (($i % $batchSize) == 0 || $i == self::COUNT) {
        // Convert bytes to megabytes so the output matches the logs shown in this article
        $currentMemoryUsage = round(memory_get_usage(true) / 1024 / 1024);
        $maxMemoryUsage = round(memory_get_peak_usage(true) / 1024 / 1024);
        echo sprintf("%s Memory usage (currently) %dMB / (max) %dMB\n", $i, $currentMemoryUsage, $maxMemoryUsage);

        $manager->flush();
        $manager->clear();

        // here you should merge entities you're re-using with the $manager
        // because they aren't managed anymore after calling $manager->clear();
        // e.g. if you've already loaded category or tag entities
        // $category = $manager->merge($category);

        gc_collect_cycles();
    }
}

As expected, memory usage is now stable:

> loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 24MB
200 Memory usage (currently) 26MB / (max) 28MB
300 Memory usage (currently) 26MB / (max) 28MB
400 Memory usage (currently) 26MB / (max) 28MB
500 Memory usage (currently) 26MB / (max) 28MB
600 Memory usage (currently) 26MB / (max) 28MB
700 Memory usage (currently) 26MB / (max) 28MB
800 Memory usage (currently) 26MB / (max) 28MB
900 Memory usage (currently) 26MB / (max) 28MB
1000 Memory usage (currently) 26MB / (max) 28MB
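
The batch condition used in the loop above saves on every full batch and once more on the last iteration, so a final partial batch is not lost. A minimal sketch of when the save is triggered follows. flushPoints() is a hypothetical helper written only for illustration and is not part of the fixture class.

```php
<?php
// Hypothetical helper: returns the loop indexes at which the batch would be saved.
function flushPoints(int $count, int $batchSize): array
{
    $points = [];

    for ($i = 1; $i <= $count; $i++) {
        // Same condition as in the fixture loop: a full batch, or the very last item.
        if (($i % $batchSize) === 0 || $i === $count) {
            $points[] = $i;
        }
    }

    return $points;
}

// With 250 items and a batch size of 100, the last 50 items are saved by the "$i === $count" part.
print_r(flushPoints(250, 100));
```

When the total is an exact multiple of the batch size, both parts of the condition are true on the last iteration, but the save still runs only once.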

Instead of downloading random images every time, we can prepare 15 random images and update the fixture script to randomly choose one of them instead of using Faker’s $faker->image() method.

Let’s take 15 images from Unsplash and save them in var/demo-data/sample-images.

Then, update the LoadGalleriesData::generateRandomImage method:

private function generateRandomImage($imageName)
{
    // Sample images stored in var/demo-data/sample-images
    $images = [
        'image1.jpeg',
        'image10.jpeg',
        'image11.jpeg',
        'image12.jpg',
        'image13.jpeg',
        'image14.jpeg',
        'image15.jpeg',
        'image2.jpeg',
        'image3.jpeg',
        'image4.jpeg',
        'image5.jpeg',
        'image6.jpeg',
        'image7.jpeg',
        'image8.jpeg',
        'image9.jpeg',
    ];

    $sourceDirectory = $this->container->getParameter('kernel.project_dir') . '/var/demo-data/sample-images/';
    $targetDirectory = $this->container->getParameter('kernel.project_dir') . '/var/uploads/';

    // Pick one of the prepared images at random
    $randomImage = $images[rand(0, count($images) - 1)];
    $randomImageSourceFilePath = $sourceDirectory . $randomImage;
    $randomImageExtension = explode('.', $randomImage)[1];

    // Copy it to the uploads directory under a unique file name
    $targetImageFilename = sha1(microtime() . rand()) . '.' . $randomImageExtension;
    copy($randomImageSourceFilePath, $targetDirectory . $targetImageFilename);

    $image = new Image(
        Uuid::getFactory()->uuid4(),
        $randomImage,
        $targetImageFilename
    );

    return $image;
}
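
The sample file names all contain exactly one dot, so splitting on the dot is enough to get the extension. If the sample set could ever include names with several dots, pathinfo() is a more forgiving option. The sketch below assumes a hypothetical helper, buildTargetFilename(), that is not part of the fixture class.

```php
<?php
// Hypothetical helper: builds a unique target file name that keeps the source extension.
function buildTargetFilename(string $sourceFilename): string
{
    // pathinfo() returns the part after the last dot, so "my.photo.jpeg" yields "jpeg".
    $extension = pathinfo($sourceFilename, PATHINFO_EXTENSION);

    return sha1(microtime() . random_int(0, PHP_INT_MAX)) . '.' . $extension;
}

echo buildTargetFilename('image12.jpg'), "\n";   // 40 hex characters followed by ".jpg"
echo buildTargetFilename('my.photo.jpeg'), "\n"; // 40 hex characters followed by ".jpeg"
```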

It’s a good idea to remove old files in var/uploads when reloading fixtures, so I’m adding the rm var/uploads/* command to the bin/refreshDb.sh script, immediately after dropping the DB schema.

Loading 500 users and 1,000 galleries now takes ~7 minutes and ~28 MB of memory (peak usage).

Dropping database schema...
Database schema dropped successfully!
ATTENTION: This operation should not be executed in a production environment.

Creating database schema...
Database schema created successfully!
  > purging database
  > loading [100] App\DataFixtures\ORM\LoadUsersData
300 Memory usage (currently) 10MB / (max) 10MB
500 Memory usage (currently) 12MB / (max) 12MB
  > loading [200] App\DataFixtures\ORM\LoadGalleriesData
100 Memory usage (currently) 24MB / (max) 26MB
200 Memory usage (currently) 26MB / (max) 28MB
300 Memory usage (currently) 26MB / (max) 28MB
400 Memory usage (currently) 26MB / (max) 28MB
500 Memory usage (currently) 26MB / (max) 28MB
600 Memory usage (currently) 26MB / (max) 28MB
700 Memory usage (currently) 26MB / (max) 28MB
800 Memory usage (currently) 26MB / (max) 28MB
900 Memory usage (currently) 26MB / (max) 28MB
1000 Memory usage (currently) 26MB / (max) 28MB

Take a look at the fixture classes source: LoadUsersData.php and LoadGalleriesData.php.

Performance

At this point, the homepage rendering is very slow — way too slow for production.

A user can feel that the app is struggling to deliver the page, probably because the app is rendering all the galleries instead of a limited number.

Instead of rendering all galleries at once, we could update the app to render only the first 12 galleries immediately and introduce lazy loading. When the user scrolls to the end of the screen, the app will fetch the next 12 galleries and present them to the user.
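
The paging arithmetic behind such a lazy load is simple. The sketch below is an assumption about how a page number could be mapped to a slice of galleries, not the app’s actual controller code. galleryWindow() is a hypothetical helper, and the page size of 12 comes from the paragraph above.

```php
<?php
// Hypothetical helper: maps a 1-based page number to the slice of galleries to fetch.
function galleryWindow(int $page, int $perPage = 12): array
{
    // Treat anything below 1 as the first page.
    $page = max(1, $page);

    return [
        'offset' => ($page - 1) * $perPage,
        'limit' => $perPage,
    ];
}

print_r(galleryWindow(1)); // offset 0: the galleries rendered with the page itself
print_r(galleryWindow(2)); // offset 12: the first lazy-loaded batch
```

The resulting limit and offset could then be passed to a Doctrine repository call such as findBy($criteria, $orderBy, $limit, $offset).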

Performance tests

To track performance optimizations, we need a fixed set of tests that we can rerun after each change, so that results can be compared against the same baseline.

We will use Siege for load testing. Here you can find more about Siege and performance testing. Instead of installing Siege locally, we can utilize Docker, a powerful container platform.

In simple terms, Docker containers are similar to virtual machines (but they aren’t the same thing). Besides building and deploying apps, Docker can be used to experiment with applications without actually installing them on your local machine. You can build your own images or use images available on Docker Hub, a public registry of Docker images.

It’s especially useful when you want to experiment with different versions of the same software (for example, different versions of PHP).

We’ll use the yokogawa/siege image to test the app.

Testing the home page

Testing the home page is not trivial, since there are Ajax requests executed only when the user scrolls to the end of the page.

We could expect all users to land on the home page (i.e., 100%). We could also estimate that 50% of them would scroll down to the end and therefore request the second page of galleries. We could also guess that 30% of them would load the third page, 15% would request the fourth page, and 5% would request the fifth page.

These numbers are based on predictions, and it would be much better if we could use an analytics tool to get actual insight into user behavior. But that’s impossible for a brand new app. Still, it’s a good idea to take a look at analytics data now and then and adjust your test suite after the initial deploy.

We’ll test the home page (and lazy load URLs) with two tests running in parallel. The first one will be testing the home page URL only, while another one will test lazy load endpoint URLs.

The lazy-load-urls.txt file contains a randomized list of lazy-load page URLs in the predicted ratios:

10 URLs for the second page (50%)
6 URLs for the third page (30%)
3 URLs for the fourth page (15%)
1 URL for the fifth page (5%)

http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=4
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=3
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=4
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=4
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=3
http://blog.app/galleries-lazy-load?page=3
http://blog.app/galleries-lazy-load?page=3
http://blog.app/galleries-lazy-load?page=5
http://blog.app/galleries-lazy-load?page=3
http://blog.app/galleries-lazy-load?page=2
http://blog.app/galleries-lazy-load?page=3
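
A list like this does not have to be written by hand. The following is a minimal sketch of one way to generate it. buildLazyLoadUrls() and the script name used in the usage note are hypothetical, while the base URL and the 10/6/3/1 weights are taken from the list above.

```php
<?php
// Hypothetical helper: repeats each page URL according to its weight and shuffles the result.
function buildLazyLoadUrls(array $weights, string $baseUrl): array
{
    $urls = [];

    foreach ($weights as $page => $count) {
        for ($i = 0; $i < $count; $i++) {
            $urls[] = $baseUrl . '?page=' . $page;
        }
    }

    // Randomize the order so the pages are not listed in blocks.
    shuffle($urls);

    return $urls;
}

// Page number => number of URLs, following the predicted 50/30/15/5 split.
$weights = [2 => 10, 3 => 6, 4 => 3, 5 => 1];
$urls = buildLazyLoadUrls($weights, 'http://blog.app/galleries-lazy-load');

echo implode("\n", $urls), "\n";
```

Saved as, say, generate-urls.php, the output can be redirected into the file: php generate-urls.php > lazy-load-urls.txt.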

The script for testing home page performance will run 2 Siege processes in parallel, one against the home page and the other against the generated list of URLs.

To execute a single HTTP request with Siege (in Docker), run:

docker run --rm -t yokogawa/siege -c1 -r1 blog.app

Note: if you aren’t using Docker, you can omit the docker run --rm -t yokogawa/siege part and run Siege with the same arguments.

To run a 1-minute test with 50 concurrent users against the home page, with a random delay of up to 1 second between each user’s requests (-d1), execute:

docker run --rm -t yokogawa/siege -d1 -c50 -t1M http://blog.app

To run a 1-minute test with 50 concurrent users against URLs in lazy-load-urls.txt, execute:

docker run --rm -v `pwd`:/var/siege:ro -t yokogawa/siege -i --file=/var/siege/lazy-load-urls.txt -d1 -c50 -t1M

Do this from the directory where your lazy-load-urls.txt is located (that directory will be mounted as a read-only volume in Docker).

Running the test-homepage.sh script will start 2 Siege processes (in a way suggested by this Stack Overflow answer) and output the results.

Assume we’ve deployed the app on a server with Nginx and with PHP-FPM 7.1 and loaded 25,000 users and 30,000 galleries. The results from load testing the app home page are:

./test-homepage.sh

Transactions:               499 hits
Availability:               100.00 %
Elapsed time:               59.10 secs
Data transferred:           1.49 MB
Response time:              4.75 secs
Transaction rate:           8.44 trans/sec
Throughput:                 0.03 MB/sec
Concurrency:                40.09
Successful transactions:    499
Failed transactions:        0
Longest transaction:        16.47
Shortest transaction:       0.17

Transactions:               482 hits
Availability:               100.00 %
Elapsed time:               59.08 secs
Data transferred:           6.01 MB
Response time:              4.72 secs
Transaction rate:           8.16 trans/sec
Throughput:                 0.10 MB/sec
Concurrency:                38.49
Successful transactions:    482
Failed transactions:        0
Longest transaction:        15.36
Shortest transaction:       0.15

Even though app availability is 100% for both home page and lazy-load tests, response time is ~5 seconds, which is not something we’d expect from a high-performance app.
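
The figures in a Siege report are related to each other: the reported concurrency is approximately the transaction rate multiplied by the average response time (Little’s law). The sketch below checks this against the numbers above. estimatedConcurrency() is a hypothetical helper, and small differences from the reported values are expected because Siege rounds the figures it prints.

```php
<?php
// Hypothetical helper: average number of in-flight requests, per Little's law.
function estimatedConcurrency(float $transactionsPerSecond, float $responseTimeSeconds): float
{
    return round($transactionsPerSecond * $responseTimeSeconds, 2);
}

echo estimatedConcurrency(8.44, 4.75), "\n"; // home page run, reported concurrency: 40.09
echo estimatedConcurrency(8.16, 4.72), "\n"; // lazy-load run, reported concurrency: 38.49
```

With 50 simulated users and a concurrency of about 40, most users are waiting for a response at any given moment, which fits the slow response times.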

Testing a single gallery page

Testing a single gallery page is a little bit simpler: we’ll run Siege against the galleries.txt file, where we have a list of single gallery page URLs to test.

From the directory where the galleries.txt file is located (that directory will be mounted as a read-only volume in Docker), run this command:

docker run --rm -v `pwd`:/var/siege:ro -t yokogawa/siege -i --file=/var/siege/galleries.txt -d1 -c50 -t1M

Load test results for single gallery pages are somewhat better than for the home page:

./test-single-gallery.sh
** SIEGE 3.0.5
** Preparing 50 concurrent users for battle.
The server is now under siege...
Lifting the server siege...      done.

Transactions:               3589 hits
Availability:               100.00 %
Elapsed time:               59.64 secs
Data transferred:           11.15 MB
Response time:              0.33 secs
Transaction rate:           60.18 trans/sec
Throughput:                 0.19 MB/sec
Concurrency:                19.62
Successful transactions:    3589
Failed transactions:        0
Longest transaction:        1.25
Shortest transaction:       0.10

Tests, Tests, Tests

To make sure we’re not breaking anything with improvements we implement in the future, we need at least some tests.

First, we require PHPUnit as a dev dependency:

composer req --dev phpunit

Then we’ll create a simple PHPUnit configuration by copying the phpunit.xml.dist file created by Flex to phpunit.xml and updating the environment variables (e.g., the DATABASE_URL variable for the test environment). Also, I’m adding phpunit.xml to .gitignore.

Next, we create basic functional/smoke tests for the blog home page and single gallery pages. Smoke testing is “preliminary testing to reveal simple failures severe enough to reject a prospective software release”. Since it’s quite easy to implement smoke tests, there’s no valid reason why you should avoid them!

These tests only assert that the URLs you provide in the urlProvider() method result in a successful HTTP response. Note that isSuccessful() returns true only for 2xx status codes, so a URL that answers with a redirect (3xx) would fail this assertion.

Simple smoke testing the home page and five single gallery pages could look like this:

namespace App\Tests;

use App\Entity\Gallery;
use Psr\Container\ContainerInterface;
use Symfony\Bundle\FrameworkBundle\Test\WebTestCase;
use Symfony\Component\Routing\RouterInterface;

class SmokeTest extends WebTestCase
{
    /** @var  ContainerInterface */
    private $container;

    /**
     * @dataProvider urlProvider
     */
    public function testPageIsSuccessful($url)
    {
        $client = self::createClient();
        $client->request('GET', $url);

        $this->assertTrue($client->getResponse()->isSuccessful());
    }

    public function urlProvider()
    {
        $client = self::createClient();
        $this->container = $client->getContainer();

        $urls = [
            ['/'],
        ];

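        // array_merge() appends every gallery URL; the + operator would skip
        // the first one because both arrays start at key 0
        $urls = array_merge($urls, $this->getGalleriesUrls());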

        return $urls;
    }

    private function getGalleriesUrls()
    {
        $router = $this->container->get('router');
        $doctrine = $this->container->get('doctrine');
        $galleries = $doctrine->getRepository(Gallery::class)->findBy([], null, 5);

        $urls = [];

        /** @var Gallery $gallery */
        foreach ($galleries as $gallery) {
            $urls[] = [
                '/' . $router->generate('gallery.single-gallery', ['id' => $gallery->getId()],
                    RouterInterface::RELATIVE_PATH),
            ];
        }

        return $urls;
    }

}
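
One PHP detail is worth keeping in mind when combining numerically indexed arrays such as data-provider rows: the + operator and array_merge() behave differently. The following sketch shows the difference. The gallery paths are hypothetical placeholders.

```php
<?php
$home = [['/']];
$galleries = [['/gallery/1'], ['/gallery/2']]; // hypothetical paths

// Union keeps the left-hand value for duplicate keys, so key 0 of $galleries is skipped.
$union = $home + $galleries;

// array_merge() renumbers integer keys and appends every element.
$merged = array_merge($home, $galleries);

echo count($union), "\n";  // 2
echo count($merged), "\n"; // 3
```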

Run ./vendor/bin/phpunit and see if tests are passing (the number of tests reported equals the number of URLs returned by urlProvider()):

./vendor/bin/phpunit
PHPUnit 6.5-dev by Sebastian Bergmann and contributors.

...

5 / 5 (100%)

Time: 4.06 seconds, Memory: 16.00MB

OK (5 tests, 5 assertions)

Note that it’s better to hardcode important URLs (e.g., for static pages or some well-known URLs) than to generate them within the test. Learn more about PHPUnit and TDD here.

Stay Tuned

Upcoming articles in this series will cover details about PHP and MySQL performance optimization, improving overall performance perception and other tips and tricks for better app performance.

Frequently Asked Questions (FAQs) about Building an Image Gallery Blog with Symfony Flex and Data Testing

How can I install Symfony Flex for my project?

To install Symfony Flex, you need to use Composer, a tool for dependency management in PHP. You can install it by running the command composer require symfony/flex. This command will add Symfony Flex as a dependency to your project and it will be automatically installed. Remember, Symfony Flex is not a Symfony component, but a tool that manages Symfony applications.

What is the purpose of the .env file in Symfony Flex?

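The .env file in Symfony Flex is used to define environment variables for your application. These variables can be used to configure different aspects of your application, such as database connections, mail servers, and API keys. In the Flex setup used at the time of this series, .env is not committed to version control (a .env.dist template is committed instead), which keeps sensitive values out of the repository. Newer Symfony versions commit .env with safe defaults and keep machine-specific secrets in an uncommitted .env.local file.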

How can I use the Image constraint in Symfony?

The Image constraint in Symfony is used to validate that a file uploaded through a form is an image and meets certain criteria. You can use it by adding the @Assert\Image annotation to the relevant property of your entity class. You can also specify additional options such as minWidth, maxWidth, minHeight, and maxHeight to further restrict the size of the image.

How can I set up a database for my Symfony Flex project?

To set up a database for your Symfony Flex project, you need to configure the DATABASE_URL environment variable in your .env file. This variable should contain the connection string for your database. Once this is done, you can create the database by running the command php bin/console doctrine:database:create.

How can I use the LazilyRefreshDatabase trait in Laravel?

The LazilyRefreshDatabase trait belongs to Laravel rather than Symfony. It resets the database for tests, but only when a test actually uses the database, so tests that never touch it skip that cost. To use it, add the use LazilyRefreshDatabase; statement to your test class.

How can I add images to my Symfony Flex blog?

To add images to your Symfony Flex blog, you need to create a form that allows users to upload images. You can use the FileType form field type for this. Once the form is submitted, you can handle the uploaded file in your controller and save it to your desired location.

How can I test my Symfony Flex application?

Symfony supports several tools for testing your application, and Flex makes them easy to install. You can use PHPUnit for unit testing and Behat, a third-party tool, for behavior-driven development (BDD). Symfony also provides a WebTestCase class that allows you to simulate HTTP requests and assert responses in your functional tests.

How can I deploy my Symfony Flex application?

There are several ways to deploy your Symfony Flex application. You can use traditional methods such as FTP or SSH, or you can use modern deployment tools such as Docker or Kubernetes. Symfony also provides a symfony command-line tool that includes commands for deploying your application.

How can I optimize the performance of my Symfony Flex application?

Symfony provides several tools for optimizing the performance of your application. You can use the profiler to identify performance bottlenecks, and you can use the cache component to cache data and reduce database queries. You can also optimize your code by following best practices for PHP and Symfony development.

How can I secure my Symfony Flex application?

Symfony Flex provides several tools for securing your application. You can use the security component to handle authentication and authorization, and you can use the csrf component to protect against cross-site request forgery attacks. Symfony also provides a security-checker command that checks your project dependencies for known security vulnerabilities.