
Cron Jobs: A Comprehensive Guide

Reza Lavarian

Do you need to run a script regularly but don’t want to remember to launch it manually? Or maybe you need to execute a command at a specific time or interval but don’t want the process to monopolize your CPU or memory. In either case, cron jobs are perfect for the task. Let’s look at what they are, how to set them up, and some of the things you can do with them.

There are times when there’s a need to run a group of tasks automatically at certain times in the future. These tasks are usually administrative but could be anything – from making database backups to downloading emails when everyone is asleep.

Cron is a time-based job scheduler in Unix-like operating systems, which triggers certain tasks in the future. The name originates from the Greek word χρόνος (chronos), which means time.

The most commonly used version of Cron is known as Vixie Cron. Paul Vixie originally developed it in 1987.


Cron Job Terminology

  • Job: a unit of work, a series of steps to do something. For example, sending an email to a group of users. This article will use task, job, cron job, or event interchangeably.
  • Daemon: a computer program that runs in the background, serving different purposes. Daemons often start at boot time. A web server is a daemon serving HTTP requests. Cron is a daemon for running scheduled tasks.
  • Cron Job: a cron job is a scheduled job. The daemon runs the job when it’s due.
  • Webcron: a time-based job scheduler that runs within the server environment. It’s an alternative to the standard Cron, often on shared web hosts that do not provide shell access.

Getting Started with Cron Jobs

If we take a look inside the /etc directory, we can see directories like cron.hourly, cron.daily, cron.weekly and cron.monthly, each corresponding to a certain frequency of execution.

One way to schedule our tasks is to place our scripts in the proper directory. For example, to run db_backup.php on a daily basis, we put it inside cron.daily. If the folder for a given frequency is missing, we would need to create it first.

Note: This approach uses the run-parts script, a command which runs every executable it finds within the specified directory.
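One caveat is worth checking before relying on this approach. On Debian and Ubuntu, run-parts by default skips any file whose name contains characters other than letters, digits, underscores and hyphens, so a file named db_backup.php would be silently ignored there. The following is a minimal sketch of that naming rule. The function name runparts_name_ok is hypothetical, and other distributions may apply different rules.

```sh
# runparts_name_ok NAME
# Succeeds when NAME uses only the characters that Debian's run-parts
# accepts by default (letters, digits, underscore, hyphen).
runparts_name_ok() {
    case $1 in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;  # empty, or contains another character
        *) return 0 ;;
    esac
}

# Example: warn about a script name before copying it into /etc/cron.daily
for name in db_backup db_backup.php; do
    if runparts_name_ok "$name"; then
        echo "$name: would be picked up"
    else
        echo "$name: would be skipped, consider renaming it"
    fi
done
```

Renaming the script to db_backup (and keeping the shebang line inside it) is usually enough to get it picked up.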

This is the simplest way to schedule a task. However, if we need more flexibility, we should use Crontab.

Crontab Files

Cron uses special configuration files called crontab files, which contain a list of jobs to be done. Crontab stands for Cron Table. Each line in the crontab file is called a cron job, which resembles a set of columns separated by a space character. Each row specifies when and how often Cron should execute a certain command or script.

In a crontab file, blank lines and leading spaces and tabs are ignored. Lines whose first non-whitespace character is # are considered comments.

Active lines in a crontab are either the declaration of an environment variable or a cron job. Crontab does not allow comments on active lines.

Below is an example of a crontab file with just one entry:

0 0 * * *  /var/www/sites/db_backup.sh

The first part 0 0 * * * is the cron expression, which specifies the frequency of execution. The above cron job will run once a day.

Users can have their own crontab files named after their username as registered in the /etc/passwd file. All user-level crontab files reside in Cron’s spool area. We should not edit these files directly. Instead, we edit them using the crontab command-line utility.

Note: The spool directory varies across different distributions of Linux. On Ubuntu it’s /var/spool/cron/crontabs while in CentOS it’s /var/spool/cron.

To edit our own crontab file:

crontab -e

The above command will automatically open up the crontab file which belongs to our user. If you haven’t chosen a default editor for the crontab before, you’ll see a selection of installed editors to pick from. We can also explicitly choose or change our desired editor for editing the crontab file:

export VISUAL=nano; crontab -e

After we save the file and exit the editor, the crontab will be checked for accuracy. If everything is set properly, the file will be saved to the spool directory.

Note: Each command in the crontab file executes with the privileges of the user who owns the crontab. If a command needs to run as root, it must go in the root user’s crontab, which we cannot edit from our own account without root privileges (for example, via sudo).

To list the installed cron jobs belonging to our own user:

crontab -l

We can also write our cron jobs in a file and send its contents to the crontab file like so:

crontab /path/to/the/file/containing/cronjobs.txt

The preceding command will overwrite the existing crontab file with /path/to/the/file/containing/cronjobs.txt.
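Because installing a file replaces the whole crontab, adding a single job this way can wipe out the existing ones. A common workaround is to dump the current crontab, append the new line, and feed the result back through standard input (crontab - reads from stdin in Vixie Cron and cronie). The sketch below shows only the text-merging step. The helper name add_job_once is hypothetical.

```sh
# add_job_once JOB
# Reads a crontab on stdin and writes it to stdout, appending JOB
# unless an identical line is already present.
add_job_once() {
    aj_current=$(cat)
    if [ -n "$aj_current" ]; then
        printf '%s\n' "$aj_current"
    fi
    # -x: match whole lines only, -F: treat JOB as a fixed string (it contains *)
    if ! printf '%s\n' "$aj_current" | grep -qxF -- "$1"; then
        printf '%s\n' "$1"
    fi
}

# Usage (left as a comment because it would modify the real crontab):
#   crontab -l 2>/dev/null \
#     | add_job_once '0 0 * * * /var/www/sites/db_backup.sh' \
#     | crontab -
```

Running the pipeline twice leaves the crontab unchanged the second time, which makes it safe to use in provisioning scripts.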

To remove the crontab, we use the -r option:

crontab -r

Anatomy of a Crontab Entry

The anatomy of a user-level crontab entry looks like the following:

 # ┌───────────── min (0 - 59) 
 # │ ┌────────────── hour (0 - 23)
 # │ │ ┌─────────────── day of month (1 - 31)
 # │ │ │ ┌──────────────── month (1 - 12)
 # │ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names; 7 is Sunday, the same as 0)
 # │ │ │ │ │
 # │ │ │ │ │
 # * * * * *  command to execute

The first two fields specify the time (minute and hour) at which the task will run. The next two fields specify the day of the month and the month. The fifth field specifies the day of the week.

Cron will execute the command when the minute, hour, month, and either day of month or day of week match the current time.

If both day of week and day of month have certain values, the event will run when either field matches the current time. Consider the following expression:

0 0 5-20/5 Feb 2 /path/to/command

The preceding cron job will run at midnight on the 5th, 10th, 15th and 20th of February, plus every Tuesday in February.

Important: When both day of month and day of week have specific values (not an asterisk), they form an OR condition: the job runs on any day that matches either field.
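Since the two day fields are combined with OR, a crontab entry cannot directly say “Tuesdays that fall between the 5th and the 20th.” One common workaround is to schedule on one field only and test the other inside the command. Here is a minimal sketch of such a test. The helper name day_in_range and the wrapper script path are hypothetical.

```sh
# day_in_range DAY LO HI
# Succeeds when DAY (as printed by `date +%d`, possibly zero-padded)
# lies within LO..HI inclusive.
day_in_range() {
    dr_day=${1#0}   # strip one leading zero so "08" is compared as 8
    [ "$dr_day" -ge "$2" ] && [ "$dr_day" -le "$3" ]
}

# Body of a hypothetical wrapper script, scheduled with:
#   0 0 * Feb 2 /path/to/wrapper.sh
# so that cron handles "Tuesdays in February" and the script handles the rest:
#   if day_in_range "$(date +%d)" 5 20; then /path/to/command; fi
```

Putting the test in a wrapper script also avoids having to escape the % character, which has a special meaning inside crontab command fields.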

The syntax in system crontabs (/etc/crontab) is slightly different from that of user-level crontabs. The difference is that the sixth field is not the command but the user we want to run the job as; the command follows in the seventh field.

* * * * * testuser /path/to/command

It’s not recommended to put system-wide cron jobs in /etc/crontab, as this file might be modified in future system updates. Instead, we put these cron jobs in the /etc/cron.d directory.

Editing Other Users’ Crontab

We might need to edit other users’ crontab files. To do this, we use the -u option as below:

crontab -u username -e

Note: We can only edit other users’ crontab files as the root user, or as a user with administrative privileges.

Some tasks require super admin privileges. You should add them to the root user’s crontab file:

sudo crontab -e

Note: Using sudo with crontab -e edits the root user’s crontab file. If we need to edit another user’s crontab while using sudo, we should use the -u option to specify the crontab owner.

To learn more about the crontab command:

man crontab

Standard and Non-Standard Crontab Values

Crontab fields accept numbers as values. However, we can put other data structures in these fields, as well.

Ranges

We can pass in ranges of numbers:

0 6-18 1-15 * * /path/to/command

The above cron job will run at the top of every hour from 6 am to 6 pm, from the 1st to the 15th of each month. Note that ranges are inclusive, so 1-5 means 1,2,3,4,5.

Lists

A list is a group of comma-separated values. We can have lists as field values:

0 1,4,5,7 * * * /path/to/command

The above syntax will run the cron job at 1 am, 4 am, 5 am and 7 am every day.

Steps

Steps can be used with ranges or the asterisk character (*). When they are used with ranges they specify the number of values to skip through the end of the range. They are defined with a / character after the range, followed by a number. Consider the following syntax:

0 6-18/2 * * * /path/to/command

The above cron job will run every two hours from 6 am to 6 pm.

When steps are used with an asterisk, they simply specify the frequency of that particular field. As an example, if we set the minute to */5, it simply means every five minutes.

We can combine lists, ranges, and steps together to have more flexible event scheduling:

0 0-10/5,14,15,18-23/3 1 1 * /path/to/command

The above event will run on January 1st at midnight, 5 am and 10 am (every five hours from 0 to 10), at 2 pm and 3 pm, and at 6 pm and 9 pm (every three hours from 18 to 23).
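To check how a combined field expands, it can help to list the values it matches. The following is a small sketch, not how any cron implementation actually parses fields: the function name expand_field is hypothetical, it only understands numbers (no names), and it does no validation beyond rejecting a step smaller than 1.

```sh
# expand_field FIELD MIN MAX
# Prints the values matched by a numeric cron field, e.g. the hour field
# "0-10/5,14,15,18-23/3" with MIN=0 and MAX=23.
# The body runs in a subshell, so the IFS and globbing changes do not leak.
expand_field() (
    field=$1 min=$2 max=$3 out=""
    set -f          # keep "*" literal while the field is split
    IFS=,
    set -- $field   # split the list on commas
    for part in "$@"; do
        step=1
        range=$part
        case $part in
            */*) step=${part##*/}; range=${part%%/*} ;;
        esac
        [ "$step" -ge 1 ] || exit 1
        case $range in
            '*') lo=$min; hi=$max ;;
            *-*) lo=${range%%-*}; hi=${range##*-} ;;
            *)   lo=$range; hi=$range ;;
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            out="$out$v "
            v=$((v + step))
        done
    done
    echo "${out% }"
)

expand_field '0-10/5,14,15,18-23/3' 0 23   # prints: 0 5 10 14 15 18 21
```

The same helper can be pointed at a minute field, for instance expand_field '*/15' 0 59, to confirm that a step on an asterisk starts from the lowest value of the field.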

Names

For the fields month and day of week we can use the first three letters of a particular day or month, like Sat, sun, Feb, Sep, etc.

* * * Feb,mar sat,sun /path/to/command

The preceding cron job will run only on Saturdays and Sundays of February and March.

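Please note that the names are not case-sensitive. The crontab(5) manual for Vixie Cron states that ranges or lists of names are not allowed. Many implementations accept lists like the one above anyway, but numeric values are the safer choice when portability matters.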

Predefined Definitions

Some cron implementations may support some special strings. These strings are used instead of the first five fields, each specifying a certain frequency:

  • @yearly, @annually Run once a year at midnight of January 1 (0 0 1 1 *)
  • @monthly Run once a month, at midnight of the first day of the month (0 0 1 * *)
  • @weekly Run once a week at midnight of Sunday (0 0 * * 0)
  • @daily Run once a day at midnight (0 0 * * *)
  • @hourly Run at the beginning of every hour (0 * * * *)
  • @reboot Run once at startup

Multiple Commands in the Same Cron Job

We can run several commands in the same cron job by separating them with a semi-colon (;).

* * * * * /path/to/command-1; /path/to/command-2

If the running commands depend on each other, we can use double ampersand (&&) between them. As a result, the second command will not run if the first one fails.

* * * * * /path/to/command-1 && /path/to/command-2
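The difference between the two separators is easy to verify in a shell before committing it to a crontab. In this small sketch, false stands in for a failing first command. The helper names are hypothetical.

```sh
# Both helpers run a failing first command (false) followed by an echo.
run_with_semicolon() { sh -c 'false; echo "second command ran"'; }
run_with_and()       { sh -c 'false && echo "second command ran"'; }

run_with_semicolon   # prints: second command ran
run_with_and         # prints nothing, because the first command failed
```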

Environment Variables

Environment variables in crontab files are in the form of VARIABLE_NAME = VALUE (The white spaces around the equal sign are optional). Cron does not source any startup files from the user’s home directory (when it’s running user-level crons). This means we should manually set any user-specific settings required by our tasks.

Cron daemon automatically sets some environmental variables when it starts. HOME and LOGNAME are set from the crontab owner’s information in /etc/passwd. However, we can override these values in our crontab file if there’s a need for this.

There are also a few more variables like SHELL, specifying the shell which runs the commands. It is /bin/sh by default. We can also set the PATH in which to look for programs.

PATH = /usr/bin:/usr/local/bin

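Important: We should wrap the value in quotation marks when it contains spaces (quotes preserve leading and trailing blanks). Please note that values are ordinary strings. Cron does not expand variables or otherwise interpret them, so a line such as PATH = $HOME/bin:$PATH will not work as it would in a shell.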
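A frequent source of “works in my terminal, fails in cron” problems is this stripped-down environment. One way to reproduce it by hand is to run the command through env -i with only a handful of variables set. The sketch below is an approximation: the helper name run_like_cron is hypothetical, and the PATH value /usr/bin:/bin is an assumption about the default, which varies between implementations.

```sh
# run_like_cron 'COMMAND STRING'
# Runs the command with /bin/sh in a nearly empty environment,
# similar to what a user-level cron job sees.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="$(id -un)" \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        /bin/sh -c "$1"
}

# A variable exported in our login shell is not visible to the job:
MY_SETTING=from_login_shell
export MY_SETTING
run_like_cron 'echo "MY_SETTING is: [$MY_SETTING]"'   # prints: MY_SETTING is: []
```

If a script only works outside this wrapper, it most likely depends on a variable or PATH entry that needs to be declared in the crontab itself.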

Different Time Zones

Cron uses the system’s time zone setting when evaluating crontab entries. This might cause problems for multiuser systems with users based in different time zones. To work around this problem, in implementations that support it (such as cronie), we can add an environment variable named CRON_TZ in our crontab file. As a result, all entries in that crontab will be evaluated in the specified time zone.

How Cron Interprets Crontab Files

After Cron starts, it searches its spool area to find and load crontab files into memory. It additionally checks the /etc/crontab file and the /etc/cron.d directory for system crontabs.

After loading the crontabs into memory, Cron checks the loaded crontabs on a minute-by-minute basis, running the events which are due.

In addition to this, Cron regularly (every minute) checks if the spool directory’s modtime (modification time) has changed. If so, it checks the modtime of all the loaded crontabs and reloads those which have changed. That’s why we don’t have to restart the daemon when installing a new cron job.

Cron Permissions

We can specify which user should be able to use Cron and which user should not. There are two files that play an important role when it comes to cron permissions: /etc/cron.allow and /etc/cron.deny.

If /etc/cron.allow exists, then our username must be listed in this file in order to use crontab. If /etc/cron.deny exists, it shouldn’t contain our username. If neither of these files exists, then based on the site-dependent configuration parameters, either the superuser or all users will be able to use crontab command. For example, in Ubuntu, if neither file exists, all users can use crontab by default.

We can put ALL in /etc/cron.deny file to prevent all users from using cron:

echo ALL > /etc/cron.deny

Note: If we create an /etc/cron.allow file, there’s no need to create a /etc/cron.deny file as it has the same effect as creating a /etc/cron.deny file with ALL in it.
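The precedence between the two files can be summarized in a few lines of shell. This is only a sketch of the rules described above, not the daemon’s actual code: the function name cron_allowed is hypothetical, it assumes one username per line, it does not handle the ALL keyword, and it assumes the “everyone is allowed” default when neither file exists. Reading the real files may also require root privileges.

```sh
# cron_allowed USER [ALLOW_FILE] [DENY_FILE]
# Succeeds when USER would be permitted to use crontab.
cron_allowed() {
    ca_user=$1
    ca_allow=${2:-/etc/cron.allow}
    ca_deny=${3:-/etc/cron.deny}
    if [ -f "$ca_allow" ]; then
        # cron.allow exists: the user must be listed, cron.deny is ignored
        grep -qxF -- "$ca_user" "$ca_allow"
    elif [ -f "$ca_deny" ]; then
        # only cron.deny exists: the user must not be listed
        ! grep -qxF -- "$ca_user" "$ca_deny"
    else
        return 0   # assumption: neither file exists, so everyone is allowed
    fi
}

# Usage: cron_allowed "$(id -un)" && echo "crontab should be usable"
```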

Redirecting Output

We can redirect the output of our cron job to a file if the command (or script) has any output:

* * * * * /path/to/php /path/to/the/command >> /var/log/cron.log

We can redirect the standard output to /dev/null so that normal output triggers no email, while anything written to standard error is still emailed:

* * * * * /path/to/php /path/to/the/command > /dev/null

To prevent Cron from sending any emails to us, we change the respective crontab entry as below:

* * * * * /path/to/php /path/to/the/command > /dev/null 2>&1

This means “send both the standard output and the error output into oblivion.”
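The effect of the two redirections can be tried out in a shell with a stand-in job that writes to both streams. In this sketch, whatever is still printed corresponds to what Cron would have left to email. The function names are hypothetical.

```sh
# A stand-in for a cron job that writes to both output streams.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

# Same redirections as in the crontab entries above.
stdout_discarded()     { noisy_job > /dev/null; }
everything_discarded() { noisy_job > /dev/null 2>&1; }

stdout_discarded       # still prints "error output" (on standard error)
everything_discarded   # prints nothing at all
```

The order matters: 2>&1 must come after > /dev/null, because it points standard error at wherever standard output is going at that moment.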

Email the Output

The output is mailed to the owner of the crontab or the email(s) specified in the MAILTO environment variable (if the standard output or standard error are not redirected as above).

If MAILTO is set to empty, no email will be sent out as the result of the cron job.

We can set several emails by separating them with commas:

MAILTO=admin@example.com,dev@example.com
* * * * * /path/to/command

Cron and PHP

We usually run our PHP command line scripts using the PHP executable.

php script.php

Alternatively, we can use shebang at the beginning of the script, and point to the PHP executable:

#! /usr/bin/php

<?php

// PHP code here

As a result, we can execute the file by calling it by name. However, we need to make sure we have permission to execute it.

To have more robust PHP command-line scripts, we can use third-party components for creating console applications like Symfony Console Component or Laravel Artisan.

Task Overlaps

There are times when scheduled tasks take much longer than expected. This will cause overlaps, meaning some tasks might be running at the same time. This might not cause a problem in some cases, but when they are modifying the same data in a database, we’ll have a problem. We can reduce the likelihood of this by decreasing the execution frequency of the tasks (that is, leaving more time between runs). Still, there’s no guarantee that these overlaps won’t happen again.

We have several options to prevent cron jobs from overlapping.

Using Flock

Flock is a nice tool to manage lock files from within shell scripts or the command line. These lock files are useful for knowing whether or not a script is running.

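When used in conjunction with Cron, the respective cron job does not start while a previous run still holds the lock. On most Linux distributions, flock is part of the util-linux package and is usually installed already. If it’s missing, we can install that package using apt-get or yum, depending on the Linux distribution.

apt-get install util-linux

Or:

yum install util-linux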

Consider the following crontab entry:

* * * * * /usr/bin/flock --timeout=1 /path/to/cron.lock /usr/bin/php /path/to/scripts.php

In the preceding example, flock opens /path/to/cron.lock (creating it if necessary) and tries to lock it. If the lock is acquired within one second, it will run the script. Otherwise, it will fail with an exit code of 1.
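The behavior can be tried out without Cron by holding the lock in one process and attempting to take it in another. The sketch below wraps flock in a small helper. The name run_exclusive and the lock path in the usage comment are hypothetical, and -n (give up immediately) is used instead of a timeout to keep the demonstration short.

```sh
# run_exclusive LOCKFILE COMMAND [ARGS...]
# Runs COMMAND only if no other process currently holds a lock on LOCKFILE.
run_exclusive() {
    re_lock=$1
    shift
    # -n (--nonblock): fail right away instead of waiting for the lock
    flock -n "$re_lock" "$@"
}

# Usage, assuming a writable lock path:
#   run_exclusive /tmp/demo.lock sleep 30 &    # first instance holds the lock
#   run_exclusive /tmp/demo.lock echo hello    # fails while the first one runs
```

Note that the lock lives on the open file, not in its existence: a leftover lock file from a crashed run does not block the next one.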

Using a Locking Mechanism in the Scripts

If the cron job executes a script, we can implement a locking mechanism in the script. Consider the following PHP script:

<?php
// Each script gets its own lock file, named after the MD5 hash of its path.
$lockfile = sys_get_temp_dir() . '/' . md5(__FILE__) . '.lock';
$pid      = file_exists($lockfile) ? (int) trim(file_get_contents($lockfile)) : null;

// posix_getsid() returns false when no process with that PID exists.
if (!$pid || posix_getsid($pid) === false) {

    // Record our PID first, so that later runs can see this instance
    file_put_contents($lockfile, getmypid());

    // Do something here

} else {
    exit('Another instance of the script is already running.');
}

In the preceding code, we keep the PID of the current PHP process in a file located in the system’s temp directory. Each PHP script has its own lock file, named after the MD5 hash of the script’s path.

First, we check if the lock file exists, and then we get its content, which is the process ID of the last running instance of the script. Then we pass the pid to posix_getsid PHP function, which returns the session ID of the process. If posix_getsid returns false it means the process is not running anymore and we can safely start a new instance.
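For cron jobs written as shell scripts, the same idea can be sketched with kill -0, which sends no signal and only reports whether a process exists. This is a simplified illustration with a hypothetical helper name, acquire_pid_lock. The check and the write are not atomic, so two instances starting at the same instant could both pass, and kill -0 also fails for processes owned by another user, which is why flock is the more robust choice where it is available.

```sh
# acquire_pid_lock LOCKFILE
# Succeeds, and records our PID, if no live process holds the lock.
acquire_pid_lock() {
    if [ -f "$1" ]; then
        pl_pid=$(cat "$1")
        # kill -0 delivers nothing; it just tests that the PID is alive
        if [ -n "$pl_pid" ] && kill -0 "$pl_pid" 2>/dev/null; then
            return 1   # previous instance is still running
        fi
    fi
    echo "$$" > "$1"
}

# Usage at the top of a script (lock path is a placeholder):
#   acquire_pid_lock "${TMPDIR:-/tmp}/myjob.lock" || exit 0
```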

Anacron

One of the problems with Cron is that it assumes the system is running continuously (24 hours a day). This causes problems for machines that are not running all day long (like personal computers). If the system goes offline during a scheduled task time, Cron will not run that task retroactively.

Anacron is not a replacement for Cron, but it solves this problem. It runs commands once a day, week, or month, but not on a minute-by-minute or hourly basis as Cron does. In return, it guarantees that a task will run even if the system was off for an unanticipated period of time.

Only root or a user with administrative privileges can manage Anacron tasks. Anacron does not run in the background like a daemon, but only once, executing the tasks which are due.

Anacron uses a configuration file (just like crontab) named anacrontab. This file is located in the /etc directory.

The content of this file looks like this:

# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22

#period in days   delay in minutes   job-identifier   command
1       5       cron.daily              nice run-parts /etc/cron.daily
7       25      cron.weekly             nice run-parts /etc/cron.weekly
@monthly 45     cron.monthly            nice run-parts /etc/cron.monthly

In an anacrontab file, we can only set the frequencies with a period of n days, followed by the delay time in minutes. This delay time is just to make sure the tasks do not run at the same time.

The third column is a unique name, which identifies the task in the Anacron log files.

The fourth column is the actual command to run.

Consider the following entry:

1       5       cron.daily              nice run-parts /etc/cron.daily

These tasks run daily, five minutes after Anacron runs. It uses run-parts to execute all the scripts within /etc/cron.daily.

The second entry in the list above runs every 7 days (weekly), with a 25-minute delay.

Collision Between Cron and Anacron

As you have probably noticed, Cron is also set to execute the scripts inside /etc/cron.* directories. Different flavors of Linux handle this sort of possible collision with Anacron differently. In Ubuntu, Cron checks if Anacron is present in the system and if it is so, it won’t execute the scripts within /etc/cron.* directories.

In other flavors of Linux, Cron updates the Anacron timestamps when it runs the tasks. Anacron won’t execute them if Cron has already run them.

Quick Troubleshooting

Absolute Path to the commands

It’s a good habit to use the absolute paths to all the executables we use in a crontab file.

* * * * * /usr/local/bin/php /absolute/path/to/the/command

Make Sure Cron Daemon Is Running

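If our tasks are not running at all, first we need to make sure the Cron daemon is running. The daemon is called crond on Red Hat-based systems and cron on Debian-based ones, so we search for both (the brackets keep grep from matching its own process):

ps aux | grep '[c]ron'

The output should be similar to this: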

root      7481  0.0  0.0 116860  1180 ?        Ss    2015   0:49 crond

Check /etc/cron.allow and /etc/cron.deny Files

When cron jobs are not running, we need to check if /etc/cron.allow exists. If it does, we need to make sure we list our username in this file. And if /etc/cron.deny exists, we need to make sure our username is not listed in this file.

If we created a user’s crontab while the user was not listed in /etc/cron.allow, adding the user to /etc/cron.allow afterwards is not enough: the jobs won’t run until we re-edit the crontab file.

Execute Permission

We need to make sure that the owner of the crontab has execute permissions for all the commands and scripts in the crontab file. Otherwise, the cron job will not run. We can add execute permissions to a file with:

chmod +x /some/file.php

New Line Character

Every entry in the crontab must end with a newline character, including the last one; otherwise the last cron job will never run. The easiest way to guarantee this is to leave a blank line after the last crontab entry.
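When crontabs are generated by scripts rather than typed in an editor, a missing final newline is easy to overlook. A quick check can be sketched as follows. The function name ends_with_newline is hypothetical.

```sh
# ends_with_newline FILE
# Succeeds when FILE is non-empty and its last byte is a newline.
ends_with_newline() {
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the last byte of the file is a newline.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Usage before installing a generated file:
#   ends_with_newline cronjobs.txt || echo >> cronjobs.txt
```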

Wrapping Up

Cron is a daemon, running a list of events scheduled to take place in the future. We define these jobs in special configuration files called crontab files. Users can have their own crontab file if they are allowed to use Cron, based on /etc/cron.allow or /etc/cron.deny files. In addition to user-level cron jobs, Cron also loads the system-wide cron jobs which are slightly different in syntax.

Our tasks are commonly PHP scripts or command-line utilities. In systems that are not running all the time, we can use Anacron to run the events which happen in the period of n days.

When working with Cron, we should also be aware of tasks overlapping each other, to prevent data loss. After a cron job is finished, its output will be sent to the owner of the crontab or to the email(s) specified in the MAILTO environment variable.

Did you learn anything new from this post? Have we missed anything? Or did you just like this post and want to tell us how awesomely comprehensive it was? Let us know in the comments below!