If you are a web developer, or a programmer in general, you have most likely written algorithms for various tasks: searching through a table, sorting an array of numbers by descending value, calculating the shortest path to a location… But what makes an algorithm a good algorithm?

One quality of an algorithm is its correctness. You will probably assume that your algorithm works after testing it out a few times. However, being able to mathematically prove that your algorithm will work as expected for every possible input value is a huge bonus. I will not go further into that subject in this article.

Another specification is its efficiency: how does the computing time relate to the amount of input? Is it a linear relation? Does computing time rise exponentially for the doubling of input? That’s what this article will be about.

## Time Complexity

Time complexity is, as mentioned above, the relation between computing time and the amount of input, usually the size of an array or an object. Time complexity isn’t very useful for simple operations like fetching usernames from a database, concatenating strings or encrypting passwords. It matters more for sorting functions, recursive calculations and other operations that take more computing time.

This is not because we don’t care about that function’s execution time, but because the difference is negligible. We don’t care if it takes 10ms instead of 3ms to fetch a username. However, if we have a recursive sorting algorithm which takes 400ms and we can reduce that to 50ms, that would be an interesting thing to do.

As you might guess, the lower the computing time, the more efficient the algorithm is. The question that follows is: ‘how can we define time complexity in a universal way?’ That’s where the ‘Big O notation’ comes in.

## Big O notation

The Big O notation describes the time complexity of an algorithm: it is a mathematical expression of an upper bound on how the running time scales with the input size. For example, if we double the size of an input array, by how much does the computing time increase? This might become clear with two examples:

```php
$numbers = array(14,82,4,0,24,28);
foreach ($numbers as $number)
{
    echo $number;
}
```

Imagine that the `\$numbers` array is the argument of the function. We have a foreach loop running through its items. If we calculate the time that the code takes to run, what happens if we double the size of the array? We can easily see in this example that it will double the time to run. We see that there is a linear relationship between the size of the array and the computing time. So if we write the size of the array as n, we can write the time complexity as O(n).
Another example:

```php
$numbers = array(14,82,4,0,24,28);
foreach ($numbers as $number1)
{
    foreach ($numbers as $number2)
    {
        if ($number1 >= $number2)
        {
            echo $number1 . " is greater than or equal to " . $number2;
        }
        else
        {
            echo $number1 . " is smaller than " . $number2;
        }
    }
}
```

In this piece of code, there is a foreach loop located inside another foreach loop. Let’s say ‘n’ is the size of `\$numbers`. Then we loop ‘n’ times through ‘n’. That makes the total amount of loops n². As you might guess, we write the time complexity as O(n²).

The Big O notation expresses how the computing time scales as the input grows large, keeping only the dominant term. For example:

```php
$numbers = array(14,82,4,0,24,28);
foreach ($numbers as $number1)
{
    foreach ($numbers as $number2)
    {
        if ($number1 >= $number2)
        {
            echo $number1 . " is greater than or equal to " . $number2;
        }
        else
        {
            echo $number1 . " is smaller than " . $number2;
        }
    }
}

foreach ($numbers as $number)
{
    echo $number;
}
```

You might feel the urge to write that time complexity as O(n²+n). While technically not wrong, it is rather meaningless: time complexity describes the behaviour as the input grows towards infinity, and in the limit of a polynomial only the term with the highest exponent matters, so we write O(n²). Since Big O describes scaling rather than absolute time, constant factors are dropped as well: O(5n), for example, is simply written as O(n).
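A quick numerical check (just an illustrative sketch, not part of any algorithm) shows why the lower-order term stops mattering: the ratio (n² + n) / n² tends to 1 as n grows.

```php
// As n grows, the n term becomes negligible next to n²,
// so O(n² + n) collapses to O(n²): the ratio tends to 1.
foreach (array(10, 100, 1000, 10000) as $n) {
    $ratio = ($n * $n + $n) / ($n * $n);
    echo "n = $n: (n^2 + n) / n^2 = $ratio\n";
}
```

For n = 10 the ratio is 1.1; for n = 10000 it is already 1.0001.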

## Efficient algorithms

Now that we know how to express time complexity, we can take a look at some examples of efficient algorithms.

For the first one, I want to introduce another special notation: O(log(n)), which denotes a logarithmic relationship. An example of an algorithm with this complexity is binary search. For those too lazy to read the full article: you want to find a name in an alphabetically ordered list, so you go to the center. If the name you are searching for comes before that, you go to the center between that page and the beginning (the one-quarter mark). You continue halving the range like that until you find the right name. The time complexity of that algorithm is O(log(n)).
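The halving steps above can be sketched in code. This is a minimal binary search over a sorted, zero-indexed array of numbers; the function and variable names are my own, for illustration.

```php
// Minimal binary search sketch: returns the index of $needle in the
// sorted array $sorted, or -1 if it is absent. Each step halves the
// search range, hence the O(log(n)) time complexity.
function binarySearch($sorted, $needle)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high)
    {
        $mid = intdiv($low + $high, 2);

        if ($sorted[$mid] === $needle)
        {
            return $mid;
        }
        elseif ($sorted[$mid] < $needle)
        {
            $low = $mid + 1;  // the needle can only be in the right half
        }
        else
        {
            $high = $mid - 1; // the needle can only be in the left half
        }
    }

    return -1; // not found
}

echo binarySearch(array(2, 5, 8, 12, 23, 38, 56), 23); // prints 4
```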

If you were to find the name by looping through the list entry after entry, the time complexity would be O(n). While that isn’t bad, O(log(n)) is many times better. It can be qualified as an efficient algorithm.

## Inefficient algorithms

Just as there are efficient algorithms, we have inefficient algorithms as well. One of them is the bogosort algorithm. While (fortunately) nobody actually uses it, it serves as a demonstration of how not to do it. When using it to sort a list of numbers, it will choose an order for the list at random. It will then check whether the list is in the correct order; if it is not, it will randomize it again. As you can see, that algorithm isn’t very efficient: it has an expected time complexity of O(n × n!) (with n! the factorial of n). If you want to sort arrays in a time-efficient manner, look for another algorithm, Heapsort for example.
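For illustration, here is a minimal bogosort sketch (sorting in ascending order for simplicity). Given the O(n × n!) expected running time, keep the input tiny; the helper names are my own.

```php
// Checks whether an array is in ascending order.
function isSorted($array)
{
    for ($i = 1; $i < count($array); $i++)
    {
        if ($array[$i - 1] > $array[$i])
        {
            return false;
        }
    }
    return true;
}

// Bogosort, for demonstration only: shuffle at random until sorted.
function bogosort($array)
{
    while (!isSorted($array))
    {
        shuffle($array); // pick a random permutation and hope for the best
    }
    return $array;
}

print_r(bogosort(array(3, 1, 2))); // eventually: 1, 2, 3
```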

## Writing an algorithm and optimizing it

I will now demonstrate how we can apply time complexity by first writing an algorithm, and then writing a better one. You will see why the latter is better by looking at its complexity. I want to write a function with an array as argument. The array will consist of a number of positive integers. The function then will return a new array, containing these integers, sorted by increasing size.

The first algorithm I will use is called insertion sort. In short: it walks through the array and inserts each element into its correct position among the already-sorted elements to its left, shifting greater elements one place to the right. A more detailed description can be read here.

I implemented the algorithm this way:

```php
function insertionSort($array)
{
    for ($i = 1; $i < count($array); $i++)
    {
        $currentNumber = $array[$i];
        $j = $i;

        while (($j - 1 >= 0) && ($currentNumber < $array[$j - 1])) // while the number to the left is greater
        {
            $array[$j] = $array[$j - 1]; // shift that number one place to the right
            $j--;
        }

        // there are no greater numbers to the left anymore
        $array[$j] = $currentNumber; // insert the current number at its final position
    }

    return $array;
}

$array = array(4,29,9,2,9);
print_r(insertionSort($array));
```

You see that there is a while loop inside a for loop. The worst case scenario is a time complexity of O(n²). While the algorithm does a good job at what it’s designed for, O(n²) is not good if you’re dealing with bigger arrays. I will now demonstrate a better algorithm for the job, known as counting sort: it first finds the maximum of the array that is passed as argument. It then creates an array named `$counting`, which counts the number of times each value appears in the original array. Finally it loops through the counting array and adds every index to a new array as many times as it was counted. For example, if the value of `$counting[23]` is 3, it will add 23 three times to the new array.

```php
function findMax($array)
{
    $maximum = $array[0];

    for ($i = 1; $i < count($array); $i++)
    {
        $maximum = ($array[$i] > $maximum ? $array[$i] : $maximum);
    }

    return $maximum;
}

function increasingSort($array)
{
    $size = findMax($array);
    $counting = array();

    for ($i = 0; $i <= $size; $i++)
    {
        $counting[$i] = 0;
    }

    for ($i = 0; $i < count($array); $i++)
    {
        $counting[$array[$i]]++;
    }

    $ordered = array();

    for ($i = 0; $i < count($counting); $i++)
    {
        for ($j = 0; $j < $counting[$i]; $j++)
        {
            $ordered[] = $i;
        }
    }

    return $ordered;
}

$array = array(29,1,2,2,2,28,98);
print_r(increasingSort($array));
```

The time complexity of this algorithm is O(n + k), where k is the maximum value in the array; when k is of the same order as n, that is effectively linear, a lot better than the insertion sort algorithm. However, note that this algorithm is not suitable for arrays with large, widely varying values, as the `$counting` array will grow huge. Always make sure that the algorithm fits the situation.

## Time complexity is not everything

Now that I have shown you what time complexity is, note that computing time should never be your only focus. While you should always check whether your algorithm is time-efficient enough, there are other aspects to consider too. The computing time doesn’t matter if you need to sort ten items, so don’t waste time on that. Also, for most tasks like sorting, searching entries, etc. there are already various efficient and tested algorithms available, waiting for you to Google them.
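For example, PHP ships with a built-in `sort()` function, which runs in O(n log(n)) on average and is well tested; in practice it is usually the right choice over hand-rolled sorting code.

```php
// Sorting with PHP's built-in sort(): O(n log(n)) on average,
// and no restriction on the range of values, unlike counting sort.
$numbers = array(29, 1, 2, 2, 2, 28, 98);
sort($numbers); // sorts in place, ascending
print_r($numbers); // 1, 2, 2, 2, 28, 29, 98
```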

• Lito

Sorry about my previous comment, it wasn’t constructive at all.

I think you can make the code more optimized by replacing:

for ($i = 1; $i < count($array); $i++)

with (for example):

for ($i = 1, $max = count($array); $i < $max; $i++)

Best regards,
Lito.

• wanted

Your second implementation is still O(n²) btw…

“You see that there is a while loop inside a for loop.” in your second implementation there is a for loop inside a for loop (SSDD).

for ($i = 0; $i < count($counting); $i++)
{
    for ($j = 0; $j < $counting[$i]; $j++)
    {
        $ordered[] = $i;
    }
}

No complexity gain over the first piece of code.

• Alexander Cogneau

Hey wanted,

You are correct that it is indeed a for-loop inside a for-loop. Thank you for pointing that out, I will edit the text.

But your second statement is not true: the function loops over the number of ‘different’ values with index $i, and with index $j it loops over the number of times that a certain number appears in the original array.

Greetings,

Alexander

• wanted

Yes, but to quantify with time complexity you should always assume the worst, and it would be possible to have billions of indexes in $counting, therefore there is no time complexity gain.

With this concept the second algorithm has no complexity gain. It’s a concept, it’s not up for debate, that is the whole point.

Regards.

• Matt Parker

I think Alexander is right, but I may have misunderstood. If $i runs over the number of different values of $array, and $j is the number of times each value appears, then at the extremes:

– if every value of $array is different from all the others, then $i is O(n) and $j === 1 for all $i (every value is different, so by definition it can only appear once), which gives O(n).

– if every value of $array is identical to all the others, then $i === 1 and $j is O(n), which gives O(n).

So at the extremes (input array is uniform, or every element of the input array is unique) we have O(n).

In between, let’s say the number of different values is half the size of the array (i.e. n/2) – something like [1,2,3,1,2,3] – then $j === 2 and I think we’re at O(n/2 × 2), i.e. back at O(n).

I might be wrong on this: I’d be interested to see why, if so.

• JD

I see the new algorithm as O(n) as well.

The total number of $j iterations will always be fixed at n, no matter how many iterations of $i. The $j iterations will just be spread out differently across each $i iteration, depending on the number of unique values.

• Marian Bîrlădeanu

Great article! Congratulations!

• Saurabh Kumar Pandey

Good explanation

• Taylor Ren

@acogneau would be good if you can also share the profiling data for the two methods.

• Artur Dzida

I think ‘wanted’ is 100% right, because you didn’t change the complexity of the algorithm. Maybe it’s a little bit faster sometimes, but the complexity remains the same, O(n²).
The size of the array that needs to be sorted is irrelevant, because complexity is calculated independently of the array size. AFAIR, the best complexity you can achieve for comparison-based sorting of a vector is O(n log n). Otherwise you would be given a nice prize for finding an O(n)-class algorithm for sorting a vector :D
Also, please don’t use function calls in the loop conditional. They need to be executed every time the conditional is evaluated, which obviously takes more time than a simple comparison of two variables, or of a variable and a constant.

• pinoniq

You talk about complexity and the Big O notation, but you don’t mention ‘problem instances’. A for loop isn’t always considered O(n): it depends on the complexity of the code performed inside the loop. A simple echo statement versus code that creates a hash using an external service will give completely different O notations.

Then, you state that increasingSort is O(n); this is not true, or you would of course have bested all the other sorting algos out there: http://bigocheatsheet.com/

Before you start talking about complexity, one should talk about the problem instances. Often comparisons are a good place to start (decision problems).

Another important instance is the function instance. Consider the following code:

foreach ($big_array as $item => $number) {
    $result[$item] = in_array($number, $big_array);
}

and:

foreach ($big_array as $item => $number) {
    $result[$item] = array_key_exists($item, $big_array);
}

These look very similar. But in_array is O(n) and array_key_exists is O(1) because it uses a hash lookup.

So, in comes the ‘function problem’.

Then there is memory usage. If you use a lot of memory, the system has to allocate it, adding extra complexity to your application. This means you will also have to define when memory usage becomes O(something) and when it is simply O(1). The same goes for passing in pointers versus a copy, …

• http://www.bitfalls.com/ Bruno Skvorc

Excellent feedback, thank you. If you’d like to write a renewed look at this with more PHP examples, please do get in touch.

• pinoniq

Well, I’d love to, but then you guys should answer your mails ;)

• http://www.bitfalls.com/ Bruno Skvorc

Gah! Missed it, sorry :)
