Came across a post today with yet another framework benchmark of the most popular ones, comparing requests per second, memory usage, execution time and more. Check it out here.
Phalcon still looks very appealing, but its future is unknown, given the extension API changes coming in PHP 7…
Those results are pretty much in line with what I would expect, particularly at either end, although I am quite surprised about Silex’s position. I wonder why he didn’t benchmark Zend though - I’m sure that would have knocked Laravel out of the worst performing spot.
Though it’s questionable how well configured those frameworks are: https://twitter.com/taylorotwell/status/582990945366319107
When benchmarking Phalcon, is phalcon.so turned off for the other benchmarks? When I was testing this I found that loading phalcon.so had some overhead.
Do ALL those tests pull fresh data where something would otherwise be cached in the framework? Several of those frameworks have added “caching” that would probably skew the results - especially if one used cached resources while the others were rebuilding them.
I’m disappointed that this is yet another Hello World test. A swath of features that matter tremendously in real world applications never gets tested. Page caching. Partial caching. Template rendering. Query pre-fetching or query lazy loading. Nor do we compare features. How easy or hard is it to manage security in each framework? Logging, profiling and debugging in each framework? Combining and minifying frontend assets? And many more.
If we want to truly compare frameworks, then we would need to define requirements for a real world application and have a knowledgeable person per framework build it. Then we can compare real world application vs real world application.
I would completely agree and go as far as to say these types of tests are useless except for pushing one system over another to the uneducated.
Why is Laravel 5 so slow and memory-hungry? I thought it was a medium-sized framework, or at least more lightweight compared to Symfony and Zend 2.
I’m not surprised by Symfony because Symfony offers very little out of the box. All the power is in adding dependencies and configuring them as you need them. Laravel pretty much takes that and adds a bunch of things by default that are normally required in a dev workflow for medium to large sized projects. The consequence of adding a bunch of dependencies by default is an increase in bootstrap time. Take all the default dependencies of Laravel, add them to Symfony as bundles, and you would probably get a closer bootstrap time. Also, I think the DI container in Symfony is cached, whereas it is rebuilt each time in Laravel by default. I’m sure those are settings that can be tuned.
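To illustrate why caching the DI container matters, here is a conceptual sketch (not Symfony’s or Laravel’s actual API - Symfony actually dumps a compiled container to disk, and these names are made up for illustration): the expensive wiring step runs once instead of on every request.

```python
import functools

def build_container():
    # Stand-in for discovering service providers and wiring their
    # dependencies together - the expensive part of each bootstrap.
    return {"db": object(), "cache": object(), "router": object()}

@functools.lru_cache(maxsize=1)
def cached_container():
    # After the first call, subsequent "requests" reuse the
    # already-built container instead of rebuilding it.
    return build_container()
```

A process-level memo like `lru_cache` only illustrates the idea; a real framework persists the compiled container between processes.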
This however, does raise an important question about the design of the underlying framework being tested. If these things aren’t used on a given request, why are they being loaded and having a performance impact? I’d argue these benchmarks (phalcon aside) do give an overview of the framework performance out of the box. Saying “well you can make laravel faster” is like comparing laptops where one is manufacturer-spec and the other is one where you replaced the HDD with an SSD, it’s simply not a fair test any more.
Whether performance is really an indicator of anything useful is another question of course, but it does give an overview of the overhead of the framework, from which you can infer some idea of the underlying complexity.
What’s really needed is a second metric to evaluate against. It would be interesting to see whether SLOC or number of classes correlates with performance. How about class size, or inheritance tree depth? It’s also interesting to look at code reuse by comparing total SLOC to the SLOC actually executed. If SLOC and lines executed are roughly equivalent, then we know code reuse is low; if lines executed is ten times SLOC, then we know code reuse is high. Then it’s possible to ask questions like “How does code reuse impact performance?”, which would be rather more interesting to explore. And if we answered that question and discovered that reuse and performance strongly correlate (my hunch is they would, if nothing else because fewer files are likely loaded), then we could reasonably claim that higher performance generally implies better-written software in terms of reusability.
That would be such an awesome benchmark to data mine for correlations…
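The reuse metric described above could be sketched like this - all the numbers are invented purely for illustration, since no such dataset exists in the benchmark:

```python
def reuse_ratio(sloc, lines_executed):
    """Lines executed per line of source for one request.

    A ratio well above 1 suggests the same code paths are being
    reused heavily; a ratio near 1 suggests little reuse.
    """
    return lines_executed / sloc

# Hypothetical frameworks: (total SLOC, lines executed per request)
frameworks = {
    "A": (20_000, 120_000),
    "B": (60_000, 150_000),
    "C": (120_000, 180_000),
}

for name, (sloc, executed) in frameworks.items():
    print(f"{name}: reuse ratio {reuse_ratio(sloc, executed):.1f}")
```

With real coverage data per request, these ratios could then be correlated against the requests-per-second figures from the benchmark.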
The Hello World test is a fair test for comparing what we’re getting with the “off the shelf” editions.
How many home users buy a factory spec machine and then proceed to tune it up for max performance, or even know that they might be able to tweak their clock settings in the bios? Obviously we, as the “customers” should have a better idea of what we need to do to get the most out of each request but it doesn’t automatically follow that every downloader does just that.
If Taylor Otwell can “fix the crappy setup” (his words) to boost the performance of “off the shelf” Laravel, well, I’d love to see the download offering split into two editions. The current one would be labelled the “Learning Edition”, leaving room for a “Pro Edition” where the developer has to add back in just the dependencies that his or her application requires.
From my small amount of time with Laravel I would tend to believe that the thing that takes the longest is discovering providers and initialising them. If that could somehow be cached then the system would naturally become much faster. Drupal 8 does this with the Symfony DIC for obvious performance reasons.
It’s not about “off the shelf”. It’s about how many features get exercised. Caching? Database? Templates? Real applications use them all; Hello World applications use none.
Let’s try this another way. Say fictitious framework A is really good at scaling and has a fictitious big-O of O(log n + 100). That is, as complexity and load go up, it still maintains good and steady performance, but it comes with an overhead of 100. Then say fictitious framework B has a fictitious big-O of O(n^2 + 2). That is, it doesn’t scale very well at all, but it has almost no overhead. The log n vs n^2 is the most important part of that comparison, yet this is exactly what Hello World doesn’t test. Hello World tests only the overhead, which doesn’t tell you squat about how a real application will perform.
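The crossover in that hypothetical can be made concrete. Treating the two big-O expressions as literal cost functions (purely for illustration), B looks far better at trivial load - which is all Hello World measures - while A wins once load grows:

```python
import math

def cost_a(n):
    # Scales well, but carries a large constant overhead of 100.
    return math.log(n) + 100

def cost_b(n):
    # Almost no overhead, but scales quadratically.
    return n ** 2 + 2

# At n = 1 (roughly a Hello World request), B crushes A;
# by n = 100, A is dramatically cheaper.
for n in (1, 10, 100):
    print(f"n={n}: A={cost_a(n):.1f}, B={cost_b(n):.1f}")
```

The ranking flips somewhere between n = 10 and n = 11, which is exactly the region a single-point Hello World benchmark never explores.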
I don’t think it was claimed that it does? But if you did want to work out how well something scaled, you’d need to do the hello world tests plus tests at various levels of “scaled”, depending on how you want to implement that… which could then be extrapolated to any point on the “scaled” scale you like. The problem is, how do you define those data points?
The inverse of what you say is also true of course, just because something is slow in the hello world test does not mean it will scale better than something that was fast.
Also, given the following scenario:
- Framework A scores 10 seconds in the hello world test and 100 seconds in a more complex test
- Framework B scores 1 second in the hello world test and 100 seconds in the more complex test
The better choice is framework B because it scales in a more linear fashion and requests to the site will not all be the top level of complexity. You’re right that we cannot infer any of this from the hello world test, but it’s important to remember that we do need it as a data point before drawing any further conclusions.
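The scenario above can be worked through with a traffic mix. Assuming (purely for illustration) that 90% of requests are simple and 10% are complex, B’s lower per-request overhead dominates the total:

```python
# Per-request times from the scenario (seconds)
simple_a, complex_a = 10, 100   # Framework A
simple_b, complex_b = 1, 100    # Framework B

# Assumed traffic mix: most real requests are simple
simple_requests, complex_requests = 90, 10

total_a = simple_a * simple_requests + complex_a * complex_requests
total_b = simple_b * simple_requests + complex_b * complex_requests
print(total_a, total_b)  # A: 1900, B: 1090
```

Both data points - the hello world figure and the complex figure - are needed before this comparison can be made at all.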
I’m not saying it does either. But because it doesn’t, I’m saying the benchmark doesn’t tell us anything useful. It doesn’t tell us how each framework will perform with real applications, where things like queries and caching are far more significant than the framework’s overhead.
The problem with doing a more complex test is that you’re introducing multiple variables, which makes the test unfair. If you’re benchmarking the framework with caching then you’re actually benchmarking the caching layer, which is a worthwhile test on its own, but once you’re testing caching + framework the results become meaningless because you don’t know which of the two variables is affecting the performance.
It does tell us something useful, it tells us the overhead of each framework relative to one another. Considering I should be able to use any template system or ORM with any framework, bundling these into the test becomes pointless unless you use the same template system and ORM for each test… which will give the same overhead and all you’re really testing is the underlying framework relative performance… which is exactly what the hello world test does.
edit: You could use the above proposed test with the hello world test to work out the overhead of using a given ORM with a given framework (Framework with ORM performance - Naked ORM performance - Hello world benchmark = Overhead of using a given ORM with a given framework), which might be interesting, but a lot of work.
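That subtraction is simple to apply once the three measurements exist. With hypothetical timings (the numbers below are invented to show the arithmetic):

```python
# Hypothetical per-request timings in milliseconds
framework_with_orm = 45.0   # full framework + ORM benchmark
naked_orm = 20.0            # ORM benchmarked standalone
hello_world = 15.0          # framework overhead alone

# Whatever remains is the cost of wiring the ORM into the framework
integration_overhead = framework_with_orm - naked_orm - hello_world
print(f"integration overhead: {integration_overhead} ms")
```

A negative result would suggest measurement noise or that the framework short-circuits work the standalone ORM benchmark performs.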
I built a static site generator in Slim last week. It’s a real application, it uses framework+caching in its operation.
But it’d be totally unfair to try benchmarking its performance against any other framework based application, since my app only really gets involved in delivering /short/urls (or building out pages that don’t yet exist on the filesystem)
Which is important to do. Some frameworks have better caching layers than others, and the degree to which we can leverage cache is a huge factor in how well a real world application will perform.
TL;DR: But but but… Then you’re putting yourself back to square one within the context of achieving a fair comparison.
You need a team of pro coders who had never touched any of these frameworks before, give them a spec (waterfall! Ugh!), give them all the time that they’d require to learn, build, tune and optimise and then you could benchmark.
But that’s the same as the hello world test in the first place.
- Uniform coder ability
  - there’s only one guy doing it, so there’s no variance in ability across each test project
- Given the same spec
  - it’s just “get the framework to display hello world”
- Learn and build
  - optimisation is excluded from this test so that it’s uniform across all test projects and can therefore be discounted
- Check and compare the results for each of the test projects.
Whether a comparison is even relevant is questionable, as you quite rightly say; a real world application is a different beast from a controlled framework comparison study, which this is intended to be.