A Comprehensive Guide to Using Cronjobs

By Reza Lavaryan

There are times when there’s a need for running a group of tasks automatically at certain times in the future. These tasks are usually administrative, but could be anything – from making database backups to downloading emails when everyone is asleep.

Cron is a time-based job scheduler in Unix-like operating systems, which triggers certain tasks at a point in the future. The name originates from the Greek word χρόνος (chronos), which means time.

The most commonly used version of Cron is known as Vixie Cron, originally developed by Paul Vixie in 1987.

This article is an in-depth walkthrough of this program, and a reboot of this ancient, but still surprisingly relevant post.

Terminology

  • Job: a unit of work, a series of steps to do something. For example, sending an email to a group of users. In this article, we’ll use task, job, cron job or event interchangeably.

  • Daemon: (/ˈdiːmən/ or /ˈdeɪmən/) is a computer program which runs in the background, serving different purposes. Daemons are often started at boot time. A web server is a daemon serving HTTP requests. Cron is a daemon for running scheduled tasks.

  • Cron Job: a cron job is a scheduled job, being run by Cron when it’s due.

  • Webcron: a time-based job scheduler which runs within the web server environment. It’s used as an alternative to the standard Cron, often on shared web hosts that do not provide shell access.

Getting Started

This tutorial assumes you’re running a Unix-based operating system like Ubuntu. If you aren’t, we recommend setting up Homestead Improved – it’s a 5 minute process which will save you years down the line.

If we take a look inside the /etc directory, we can see directories like cron.hourly, cron.daily, cron.weekly and cron.monthly, each corresponding to a certain frequency of execution. One way to schedule our tasks is to place our scripts in the proper directory. For example, to run db_backup.php on a daily basis, we put it inside cron.daily. If the folder for a given frequency is missing, we would need to create it first.

Note: This approach uses the run-parts script, a command which runs every executable it finds within the specified directory.
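One caveat worth checking before relying on these directories: on Debian-based systems such as Ubuntu, run-parts by default only runs files whose names consist of ASCII letters, digits, underscores and hyphens. A file named db_backup.php would be skipped there unless it’s renamed (for example, to db_backup). The following is a minimal sketch of that naming rule as a shell function. Note that valid_runparts_name is a hypothetical helper written for illustration, not part of run-parts, and other distributions apply different rules.

```shell
# Hypothetical helper: succeeds only if the file name would be accepted by
# Debian's run-parts default naming rule (letters, digits, "_" and "-").
valid_runparts_name() {
    case $1 in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;  # empty, or contains another character (such as ".")
        *) return 0 ;;
    esac
}

for name in db_backup db_backup.php; do
    if valid_runparts_name "$name"; then
        echo "$name: accepted"
    else
        echo "$name: skipped"
    fi
done
```

On Debian-based systems, run-parts --test /etc/cron.daily lists the scripts that would be executed without actually running them.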

This is the simplest way to schedule a task. However, if we need more flexibility, we should use Crontab.

Crontab Files

Cron uses special configuration files called crontab files, which contain a list of jobs to be done. Crontab stands for Cron Table. Each line in the crontab file is called a cron job, which resembles a set of columns separated by a space character. Each row specifies when and how often a certain command or script should be executed.

In a crontab file, blank lines are ignored, as are leading spaces and tabs. Lines whose first non-space character is # are considered comments.

Active lines in a crontab are either the declaration of an environment variable or a cron job, and comments are not allowed on the active lines.

Below is an example of a crontab file with just one entry:

0 0 * * *  /var/www/sites/db_backup.sh

The first part, 0 0 * * *, is the cron expression, which specifies when and how often the job is executed. The above cron job will run once a day, at midnight.

Users can have their own crontab files named after their username as registered in the /etc/passwd file. All user-level crontab files reside in Cron’s spool area. These files should not be edited directly. Instead, we should edit them using the crontab command-line utility.

Note: The spool directory varies across different distributions of Linux. On Ubuntu it’s /var/spool/cron/crontabs while in CentOS it’s /var/spool/cron.

To edit our own crontab file:

crontab -e

The above command will automatically open up the crontab file which belongs to our user. If a default system editor for the crontab hasn’t been selected before, a choice will be presented listing the installed ones. We can also explicitly choose or change our desired editor for editing the crontab file:

export VISUAL=nano; crontab -e

After we save the file and exit the editor, the crontab will be checked for accuracy. If everything is set properly, the file will be saved to the spool directory.

Note: Each command in the crontab file is executed as the user who owns the crontab. If a command needs root privileges, it doesn’t belong in our own user’s crontab. It should go in the root user’s crontab instead, which we can only edit as root (or through sudo).

To list the installed cron jobs belonging to our own user:

crontab -l

We can also write our cron jobs in a file and send its contents to the crontab file like so:

crontab /path/to/the/file/containing/cronjobs.txt

The preceding command will overwrite the existing crontab file with /path/to/the/file/containing/cronjobs.txt.
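Because installing a file replaces the whole crontab, a common idiom for adding a single job is to print the current crontab, append the new line, and feed the result back to crontab through standard input. The sketch below keeps the merging step in a small function so it can be tried without touching a real crontab. The function name add_job is hypothetical, and reading from standard input with crontab - is an assumption that holds for Vixie-style implementations.

```shell
# Hypothetical helper: reads an existing crontab on stdin and prints it with
# the job in $1 appended, unless an identical line is already present.
add_job() {
    job=$1
    existing=$(cat)
    if printf '%s\n' "$existing" | grep -Fxq -- "$job"; then
        printf '%s\n' "$existing"            # already there: leave unchanged
    elif [ -n "$existing" ]; then
        printf '%s\n%s\n' "$existing" "$job"
    else
        printf '%s\n' "$job"                 # the crontab was empty
    fi
}

# Dry run on sample text:
printf '0 0 * * * /var/www/sites/db_backup.sh\n' | add_job '*/5 * * * * /path/to/command'

# Against the real crontab it would look like this (not executed here):
#   crontab -l 2>/dev/null | add_job '*/5 * * * * /path/to/command' | crontab -
```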

To remove the crontab, we use the -r option:

crontab -r

Anatomy of a Crontab Entry

The anatomy of a user-level crontab entry looks like the following:

 # ┌───────────── min (0 - 59) 
 # │ ┌────────────── hour (0 - 23)
 # │ │ ┌─────────────── day of month (1 - 31)
 # │ │ │ ┌──────────────── month (1 - 12)
 # │ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names; 7 is Sunday, the same as 0)
 # │ │ │ │ │
 # │ │ │ │ │
 # * * * * *  command to execute

The first two fields specify the time (minute and hour) at which the task will run. The next two fields specify the day of the month and the month. The fifth field specifies the day of the week.

The command will be executed when the minute, hour, month and either day of month or day of week match the current time.

If both day of week and day of month have certain values, the event will be run when either field matches the current time. Consider the following expression:

0 0 5-20/5 Feb 2 /path/to/command

The preceding cron job will run at midnight on the 5th, 10th, 15th and 20th of February, plus at midnight on every Tuesday in February.

Important: When both day of month and day of week have certain values (not an asterisk), it will create an OR condition, meaning both days will be matched.
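Since the two day fields are combined with OR, a job that should run only when both conditions hold (for example, only on a Tuesday that falls between the 5th and the 20th) needs one of the checks moved into the command itself. The following is a minimal sketch of such a guard. The helper name is_weekday is hypothetical, and passing an explicit date with -d assumes GNU date.

```shell
# is_weekday N [DATE]: succeeds only when DATE (default: today) is ISO weekday N,
# where 1 = Monday ... 7 = Sunday. Parsing DATE with -d assumes GNU date.
is_weekday() {
    if [ -n "${2:-}" ]; then
        [ "$(date -d "$2" +%u)" = "$1" ]
    else
        [ "$(date +%u)" = "$1" ]
    fi
}

if is_weekday 2; then
    echo "Today is a Tuesday"
else
    echo "Today is not a Tuesday"
fi

# The same guard written inline in a crontab entry: the day of week field is
# left as * and the weekday is tested by the command. A % sign has a special
# meaning in crontab commands, so it must be escaped as \%:
#   0 0 5-20 Feb * [ "$(date +\%u)" = 2 ] && /path/to/command
```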

The syntax in system crontabs (/etc/crontab) is slightly different than user-level crontabs. The difference is that the sixth field is not the command, but it is the user we want to run the job as.

* * * * * testuser /path/to/command

It’s not recommended to put system-wide cron jobs in /etc/crontab, as this file might be modified in future system updates. Instead, we put these cron jobs in the /etc/cron.d directory.

Editing Other Users’ Crontab

We might need to edit other users’ crontab files. To do this, we use the -u option as below:

crontab -u username -e

Note: We can only edit other users’ crontab files as the root user, or as a user with administrative privileges.

Some tasks require super admin privileges, thus, they should be added to the root user’s crontab file:

sudo crontab -e

Note: Using sudo with crontab -e will edit the root user’s crontab file. If we need to edit another user’s crontab while using sudo, we should use the -u option to specify the crontab owner.

To learn more about the crontab command:

man crontab

Standard and Non-Standard Values

Crontab fields accept numbers as values. However, we can put other data structures in these fields, as well.

Ranges

We can pass in ranges of numbers:

0 6-18 1-15 * * /path/to/command

The above cron job will be executed every hour, on the hour, from 6 am to 6 pm, from the 1st to the 15th of each month in the year. Note that the specified range is inclusive, so 1-5 means 1,2,3,4,5.

Lists

A list is a group of values separated by commas. We can have lists as field values:

0 1,4,5,7 * * * /path/to/command

The above syntax will run the cron job at 1 am, 4 am, 5 am and 7 am every day.

Steps

Steps can be used with ranges or the asterisk character (*). When they are used with ranges they specify the number of values to skip through the end of the range. They are defined with a / character after the range, followed by a number. Consider the following syntax:

0 6-18/2 * * * /path/to/command

The above cron job will be executed every two hours from 6 am to 6 pm.

When steps are used with an asterisk, they simply specify the frequency of that particular field. As an example if we set the minute to */5, it simply means every five minutes.

We can combine lists, ranges, and steps together to have more flexible event scheduling:

0 0-10/5,14,15,18-23/3 1 1 * /path/to/command

The above event will run on January 1st at midnight, 5 am, 10 am, 2 pm, 3 pm, 6 pm and 9 pm. In other words: every five hours from midnight to 10 am, at 2 pm and 3 pm, and every three hours from 6 pm to 11 pm.
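To check how a combined field expands, it can help to list the values explicitly. The function below is a rough sketch written for illustration (expand_field is a hypothetical name, not a Cron utility). It only handles plain numbers, ranges, lists and steps on ranges, and it does not handle the asterisk or names.

```shell
# Hypothetical helper: prints the values matched by a numeric cron field.
expand_field() {
    out=""
    old_ifs=$IFS
    IFS=,                      # split the list on commas
    for part in $1; do
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%/*} ;;   # "18-23/3" -> step 3, range 18-23
        esac
        case $part in
            *-*) lo=${part%-*}; hi=${part#*-} ;;       # a range such as 18-23
            *)   lo=$part; hi=$part ;;                 # a single value
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out $i"
            i=$((i + step))
        done
    done
    IFS=$old_ifs
    echo "${out# }"
}

expand_field '0-10/5,14,15,18-23/3'   # prints: 0 5 10 14 15 18 21
```

The step is counted from the start of the range, which is why 18-23/3 yields 18 and 21 but not 23.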

Names

For the fields month and day of week we can use the first three letters of a particular day or month, like Sat, sun, Feb, Sep, etc.

* * * Feb,mar sat,sun /path/to/command

The preceding cron job will be run every minute, but only on Saturdays and Sundays of February and March.

Please note that the names are not case-sensitive. Ranges are not allowed when using names.

Predefined Definitions

Some cron implementations may support some special strings. These strings are used instead of the first five fields, each specifying a certain frequency:

  • @yearly, @annually Run once a year at midnight of January 1 (0 0 1 1 *)
  • @monthly Run once a month, at midnight of the first day of the month (0 0 1 * *)
  • @weekly Run once a week at midnight of Sunday (0 0 * * 0)
  • @daily Run once a day at midnight (0 0 * * *)
  • @hourly Run at the beginning of every hour (0 * * * *)
  • @reboot Run once at startup

Multiple Commands in the Same Cron Job

We can run several commands in the same cron job by separating them with a semi-colon (;).

* * * * * /path/to/command-1; /path/to/command-2

If the running commands depend on each other, we can use double ampersand (&&) between them. As a result, the second command will not be executed if the first one fails.

* * * * * /path/to/command-1 && /path/to/command-2

Environment Variables

Environment variables in crontab files are in the form of VARIABLE_NAME = VALUE (The white spaces around the equal sign are optional). Cron does not source any startup files from the user’s home directory (when it’s running user-level crons). This means we should manually set any user-specific settings required by our tasks.

Cron daemon automatically sets some environmental variables when it starts. HOME and LOGNAME are set from the crontab owner’s information in /etc/passwd. However, we can override these values in our crontab file if there’s a need for this.

There are also a few more variables like SHELL, specifying the shell which runs the commands. It is /bin/sh by default. We can also set the PATH in which to look for programs.

PATH = /usr/bin:/usr/local/bin

Important: We should wrap the value in quotation marks when there’s a space in the value. Please note that values are considered as ordinary strings and are not interpreted or parsed in any way.

Different Time Zones

Cron uses the system’s time zone setting when evaluating crontab entries. This might cause problems for multiuser systems with users based in different time zones. To work around this problem, we can add an environment variable named CRON_TZ in our crontab file. As a result, all crontab entries will be parsed based on the specified timezone.

How Cron Interprets Crontab Files

After Cron starts, it searches its spool area to find and load crontab files into memory. It additionally checks the /etc/crontab file and the /etc/cron.d directory for system crontabs.

After loading the crontabs into memory, Cron checks the loaded crontabs on a minute-by-minute basis, running the events which are due.

In addition to this, Cron regularly (every minute) checks if the spool directory’s modtime (modification time) has changed. If so, it checks the modtime of all the loaded crontabs and reloads those which have changed. That’s why we don’t have to restart the daemon when installing a new cron job.

Cron Permissions

We can specify which user should be able to use Cron and which user should not. There are two files which play an important role when it comes to cron permissions: /etc/cron.allow and /etc/cron.deny.

If /etc/cron.allow exists, then our username must be listed in this file in order to use crontab. If /etc/cron.allow does not exist but /etc/cron.deny does, then /etc/cron.deny must not contain our username. If neither of these files exists, then based on the site-dependent configuration parameters, either only the super user or all users will be able to use the crontab command. For example, in Ubuntu, if neither file exists, all users can use crontab by default.

We can put ALL in /etc/cron.deny file to prevent all users from using cron:

echo ALL > /etc/cron.deny

Note: If we create an /etc/cron.allow file, there’s no need to create an /etc/cron.deny file as well. Only the users listed in /etc/cron.allow will be able to use crontab, which has the same effect as an /etc/cron.deny file with ALL in it.
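The order in which the two files are consulted can be summarized as a small decision function. This is only a sketch of the rules described above, not how Cron itself is implemented. The name can_use_crontab is hypothetical, the file paths are passed in so it can be tried on sample files, the fallback assumes the Ubuntu default (everyone allowed), and the ALL keyword is not handled.

```shell
# can_use_crontab USER ALLOW_FILE DENY_FILE
can_use_crontab() {
    user=$1 allow=$2 deny=$3
    if [ -f "$allow" ]; then
        grep -Fxq -- "$user" "$allow"        # allow file exists: user must be listed
    elif [ -f "$deny" ]; then
        ! grep -Fxq -- "$user" "$deny"       # only deny file exists: user must not be listed
    else
        return 0                             # assumption: Ubuntu-style default, all users allowed
    fi
}

# Check the current user against the real files:
if can_use_crontab "$(id -un)" /etc/cron.allow /etc/cron.deny; then
    echo "crontab should be available"
else
    echo "crontab is restricted for this user"
fi
```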

Redirecting Output

We can redirect the output of our cron job to a file, if the command (or script) has any output:

* * * * * /path/to/php /path/to/the/command >> /var/log/cron.log

We can redirect the standard output to /dev/null to avoid getting emails for normal output (more on emails below), while still allowing the standard error to be sent as email:

* * * * * /path/to/php /path/to/the/command > /dev/null

To prevent Cron from sending any emails to us, we change the respective crontab entry as below:

* * * * * /path/to/php /path/to/the/command > /dev/null 2>&1

This means “send both the standard output, and the error output into oblivion”.
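The effect of these redirections can be tried in a shell before putting them in a crontab. The sketch below uses a hypothetical function named noisy that writes one line to each stream, and captures whatever is left after redirecting. It also shows that the order matters: 2>&1 has to come after > /dev/null to silence both streams.

```shell
# Hypothetical command that writes to both output streams
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Standard output discarded, standard error still visible
only_err=$( { noisy > /dev/null; } 2>&1 )

# Both streams discarded
nothing=$( { noisy > /dev/null 2>&1; } 2>&1 )

# Wrong order: stderr is pointed at the original stdout before stdout is
# redirected, so the error output is NOT discarded
swapped=$( noisy 2>&1 > /dev/null )

echo "only_err=[$only_err] nothing=[$nothing] swapped=[$swapped]"
```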

Email the Output

The output is mailed to the owner of the crontab or the email(s) specified in the MAILTO environment variable (if the standard output or standard error are not redirected as above).

If MAILTO is set to empty (MAILTO=""), no email will be sent out as the result of the cron job.

We can set several emails by separating them with commas:

MAILTO=admin@example.com,dev@example.com
* * * * * /path/to/command

Cron and PHP

We usually run our PHP command line scripts using the PHP executable.

php script.php

Alternatively, we can use shebang at the beginning of the script, and point to the PHP executable:

#! /usr/bin/php

<?php

// PHP code here

As a result, we can execute the file by calling it by name. However, we need to make sure we have the permission to execute it.

To have more robust PHP command line scripts, we can use third-party components for creating console applications like Symfony Console Component or Laravel Artisan. This article is a good start for using Symfony’s Console Component.
Creating console commands using Laravel Artisan has been also covered here. If you’d rather use another command line tool for PHP, we have a comparison here.

Task Overlaps

There are times when scheduled tasks take much longer than expected. This will cause overlaps, meaning some tasks might be running at the same time. This might not cause a problem in some cases, but when they are modifying the same data in a database, we’ll have a problem. We can reduce the risk by decreasing the execution frequency of the tasks (leaving more time between runs), but it’s still not guaranteed that these overlaps won’t happen again.

We have several options to prevent cron jobs from overlapping.

Using Flock

Flock is a nice tool to manage lock files from within shell scripts or the command line. These lock files are useful for knowing whether or not a script is running.

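When used in conjunction with Cron, the respective cron jobs do not start if the lock is still held by a running instance. Flock is part of the util-linux package, which is installed by default on most Linux distributions. If it’s missing, we can install util-linux using apt-get or yum, depending on the Linux distribution.

apt-get install util-linux

Or

yum install util-linux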

Consider the following crontab entry:

* * * * * /usr/bin/flock --timeout=1 /path/to/cron.lock /usr/bin/php /path/to/scripts.php

In the preceding example, flock opens /path/to/cron.lock (creating it if necessary) and tries to lock it. If the lock is acquired within one second, it will run the script; otherwise, it will fail with an exit code of 1.
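The behaviour can be observed from a shell with two competing commands. The sketch below is only a demonstration under a few assumptions: flock is installed, a temporary file stands in for the lock file, and the -n (non-blocking) option is used instead of a timeout so that the second attempt fails immediately.

```shell
lock=$(mktemp)

# First instance: holds the lock for 3 seconds in the background
flock -n "$lock" sleep 3 &
sleep 1   # give the first instance time to acquire the lock

# Second instance: the lock is busy, so flock exits with 1 without running the command
second=0
flock -n "$lock" true || second=$?

wait      # let the first instance finish and release the lock

# Third attempt: the lock is free again, so the command runs
third=0
flock -n "$lock" true || third=$?

echo "second attempt: $second, third attempt: $third"
rm -f "$lock"
```

Note that the lock file itself is not removed between runs. What matters is whether a running process holds a lock on it, not whether the file exists.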

Using a Locking Mechanism in the Scripts

If the cron job executes a script, we can implement a locking mechanism in the script. Consider the following PHP script:

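<?php
// The lock file is named after the MD5 hash of this script's path
$lockfile = sys_get_temp_dir() . '/' . md5(__FILE__) . '.lock';
$pid      = file_exists($lockfile) ? (int) trim(file_get_contents($lockfile)) : null;

// Proceed if there is no usable pid, or the recorded process is no longer running
if (empty($pid) || posix_getsid($pid) === false) {

    // Create/update the lock file first, so later runs can see this instance
    file_put_contents($lockfile, getmypid());

    // Do something here

} else {
    exit('Another instance of the script is already running.');
}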

In the preceding code, we keep pid of the current PHP process in a file, which is located in the system’s temp directory. Each PHP script has its own lock file, which is the MD5 hash of the script’s filename.

First, we check if the lock file exists, and then we get its content, which is the process ID of the last running instance of the script. Then we pass the pid to posix_getsid PHP function, which returns the session ID of the process. If posix_getsid returns false it means the process is not running anymore and we can safely start a new instance.

Anacron

One of the problems with Cron is that it assumes the system is running continuously (24 hours a day). This causes problems for machines which are not running all day long (like personal computers). If the system goes off during the time a task is scheduled to run, Cron will not run that task retroactively.

Anacron is not a replacement for Cron, but it has been developed to solve this problem. It runs the commands once a day, week or month but not on a minute-by-minute or hourly basis as Cron does. It is, however, guaranteed that the task will run even if the system goes off for an unanticipated period of time.

Only root or a user with administrative privileges can manage Anacron tasks. Anacron does not run in the background like a daemon, but only once, executing the tasks which are due.

Anacron uses a configuration file (just like crontab) named anacrontab. This file is located in the /etc directory.

The content of this file looks like this:

# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22

#period in days   delay in minutes   job-identifier   command
1       5       cron.daily              nice run-parts /etc/cron.daily
7       25      cron.weekly             nice run-parts /etc/cron.weekly
@monthly 45     cron.monthly            nice run-parts /etc/cron.monthly

In an anacrontab file, we can only set the frequencies with a period of n days (or with a period name such as @monthly), followed by the delay time in minutes. This delay time is just to make sure the tasks do not all run at the same time.

The third column is a unique name, which is used to identify the task in the Anacron log files.

The fourth column is the actual command to be run.

Consider the following entry:

1       5       cron.daily              nice run-parts /etc/cron.daily

The above task runs daily, 5 minutes after Anacron is run. It uses run-parts to execute all the scripts within /etc/cron.daily.

The second entry in the list above runs every 7 days (weekly), with a 25-minute delay.

Collision Between Cron and Anacron

As you have probably noticed, Cron is also set to execute the scripts inside /etc/cron.* directories. This sort of possible collision with Anacron is handled differently in different flavors of Linux. In Ubuntu, Cron checks if Anacron is present in the system, and if so, it won’t execute the scripts within /etc/cron.* directories.

In other flavors of Linux, Cron updates the Anacron timestamps when it runs the tasks, so Anacron won’t execute them if they have been already run by Cron.

Quick Troubleshooting

Absolute Path to the commands

It’s a good habit to use the absolute paths to all the executables we use in a crontab file.

* * * * * /usr/local/bin/php /absolute/path/to/the/command

Make Sure Cron Daemon Is Running

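If our tasks are not running at all, first we need to make sure the Cron daemon is running. The daemon process is named crond on distributions such as CentOS, and cron on Debian and Ubuntu, so we search for cron to match both:

ps aux | grep cron

The output should be similar to this: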

root      7481  0.0  0.0 116860  1180 ?        Ss    2015   0:49 crond

Check /etc/cron.allow and /etc/cron.deny Files

If the cron jobs are not running, then we need to check if /etc/cron.allow exists. If it does, we need to make sure our username is listed in this file. If /etc/cron.deny exists, we need to make sure our username is not listed in this file.

If we edited a user’s crontab file while that user was not yet listed in /etc/cron.allow, adding the user to /etc/cron.allow afterwards is not enough: the cron jobs won’t run until we re-edit the crontab file.

Execute Permission

We need to make sure that the owner of the crontab has the execute permissions for all the commands and scripts in the crontab file. Otherwise, the cron job will not run. Execute permissions can be added to any folder or file with chmod +x /some/file.php

New Line Character

Every entry in the crontab should end with a newline character. This means the last line of the file must be terminated by a newline as well (most editors show this as an empty line after the last entry), or the last cron job may never run.
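When a crontab is generated by a script rather than typed in an editor, a missing final newline is easy to overlook. The following is a small sketch of a check for it. The function name ends_with_newline is hypothetical, and the file is a temporary one created only for the demonstration.

```shell
# Succeeds when the last byte of the file is a newline. Command substitution
# strips a trailing newline, so the result is empty exactly in that case
# (an empty file also passes).
ends_with_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}

jobs_file=$(mktemp)
printf '* * * * * /path/to/command' > "$jobs_file"   # no newline at the end

if ! ends_with_newline "$jobs_file"; then
    echo "missing final newline, adding one"
    printf '\n' >> "$jobs_file"
fi

ends_with_newline "$jobs_file" && echo "file is safe to install"
rm -f "$jobs_file"
```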

Wrapping Up

Cron is a daemon, running a list of events scheduled to take place in the future. These jobs are listed in special configuration files called crontab files. Users can have their own crontab file, if they are allowed to use Cron, based on /etc/cron.allow or /etc/cron.deny files. In addition to user-level cron jobs, Cron also loads the system-wide cron jobs which are slightly different in syntax.

Our tasks are commonly PHP scripts or command-line utilities. In systems which are not running all the time, we can use Anacron to run the events which happen in the period of n days.
When working with Cron, we should also be aware of the tasks overlapping each other, to prevent data loss. After a cron job is finished, the output will be sent to the owner of the crontab or to the email(s) specified in the MAILTO environment variable.

Did you learn anything new from this post? Have we missed anything? Or did you just like this post and want to tell us how awesomely comprehensive it was? Let us know in the comments below!
