Sanitizing, Escaping and Validating Data in WordPress

WordPress plugins and themes may run on thousands of websites, so you need to be careful about how you handle both the data coming into WordPress and the data presented to the user.

In this tutorial, we'll look at the native functions that secure, clean and check data coming into or going out of WordPress. You need them when creating a settings page, building an HTML form, handling shortcodes, and so on.

What is Sanitizing?

In a nutshell, sanitizing is cleaning user input. It is the process of removing text, characters or code from input that is not allowed.

Gmail Example: Gmail removes <style> tags and their contents from HTML email messages before they are displayed in the Gmail browser client. This is done to prevent email CSS from overriding Gmail styles.

WordPress Example: Widget titles cannot have HTML tags in them. If you put HTML tags in them, then they are automatically removed before the title is saved.

WordPress provides various functions to sanitize different kinds of data. Here are some of them:

sanitize_email()

This function strips out all characters that are not allowed in an email address. Code example:

<?php 

echo sanitize_email("narayan prusty@sitepoint.com"); //Output "narayanprusty@sitepoint.com"

Email addresses don’t allow whitespace characters. Therefore, the whitespace was removed from my email address.

sanitize_file_name()

This function strips characters from a filename that can cause issues while referencing the file in the command line. This function is used by WordPress Media Uploader to sanitize media file names. Code example:

<?php

echo sanitize_file_name("_profile pic--1_.png"); //Output "profile-pic-1_.png"

Here, the underscore at the beginning of the name was removed and double dashes were replaced by one single dash. And, finally, whitespace was replaced by a single dash.

sanitize_key()

Option, metadata and transient keys can only contain lowercase alphanumeric characters, dashes and underscores. This function is used to sanitize such keys. Code example:

<?php

echo sanitize_key("http://SitePoint.com"); //Output "httpsitepointcom"

Here, uppercase characters were converted to lowercase characters and other invalid characters were removed.

sanitize_text_field()

This function checks for invalid UTF-8, converts single < characters to entities, strips all tags, removes line breaks, tabs and extra whitespace, and strips percent-encoded octets.

WordPress uses this to sanitize widget titles.

<?php

echo sanitize_text_field("<b>Bold<</b>"); //Output "Bold&lt;"
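As a rough sketch of where this fits in a settings form handler, the snippet below cleans one submitted value before saving it. The function name myplugin_save_site_tagline, the field name myplugin_tagline and the option key are hypothetical, and the nonce and capability checks a real handler needs are left out for brevity.

```php
<?php

// Hypothetical helper: cleans one submitted text field and stores it as an option.
// Assumes it runs inside WordPress; nonce and capability checks are omitted here.
function myplugin_save_site_tagline( array $form_data ) {
	if ( ! isset( $form_data['myplugin_tagline'] ) ) {
		return false;
	}

	// WordPress adds slashes to request data, so remove them before sanitizing.
	$tagline = sanitize_text_field( wp_unslash( $form_data['myplugin_tagline'] ) );

	return update_option( 'myplugin_tagline', $tagline );
}
```

A form handler could then call myplugin_save_site_tagline( $_POST ) once it has verified the request.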

sanitize_title()

This function removes PHP and HTML tags from a string, as well as removing accents. Whitespace characters are converted to dashes.

Note: This function is not used to sanitize titles. For sanitizing titles, you need to use sanitize_text_field. This function is used by WordPress to generate the slug for the posts/pages from the post/page title. Code example:

<?php

echo sanitize_title("Sanítizing, Escaping and Validating Data in WordPress"); //Output "sanitizing-escaping-and-validating-data-in-wordpress"

Here, the í character was converted to i, the comma was removed, and whitespace was replaced with the - character. Finally, uppercase characters were converted to lowercase.

What is Escaping?

In a nutshell, escaping is securing output. This is done to prevent XSS attacks and to make sure that the data is displayed the way the user expects it to be.

Escaping converts the special HTML characters to HTML entities so that they are displayed, instead of being executed.

Example: Facebook escapes chat messages when displaying them, to make sure that users can’t run code on each other’s computers.

WordPress provides some functions to escape different varieties of data.

esc_html()

This function escapes HTML-specific characters. Example code:

<?php

echo esc_html("<html>HTML</html>"); //Output "&lt;html&gt;HTML&lt;/html&gt;"

esc_textarea()

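Use esc_textarea() instead of esc_html() when displaying text inside a textarea element. Unlike esc_html(), esc_textarea() double encodes existing entities, so text that already contains an entity such as &amp; appears in the textarea exactly as it was entered.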

esc_attr()

This function encodes the <, >, &, " and ' characters. It will never double encode entities. It is used to escape the values of HTML tag attributes.

<?php

echo esc_attr("<html>HTML</html>"); //Output "&lt;html&gt;HTML&lt;/html&gt;"
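To show how the three functions above divide the work, here is a minimal sketch that prints a small form. The function name myplugin_render_profile_fields, its parameters and the field names are made up for illustration; the point is only which escaping function goes with which output context.

```php
<?php

// Hypothetical render helper; $label, $name and $bio are illustrative parameters.
function myplugin_render_profile_fields( $label, $name, $bio ) {
	// Text between tags: esc_html().
	printf( '<label>%s</label>', esc_html( $label ) );

	// Attribute values: esc_attr().
	printf( '<input type="text" name="display_name" value="%s">', esc_attr( $name ) );

	// Textarea contents: esc_textarea().
	printf( '<textarea name="bio">%s</textarea>', esc_textarea( $bio ) );
}
```

Escaping at the moment of output, rather than earlier, keeps the stored value unchanged and makes it easy to see that each printed value has been handled.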

esc_url()

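URLs can also contain JavaScript code. So, if you want to display a URL or a complete <a> tag, you should escape the href attribute, or else it can lead to an XSS attack. esc_url() rejects any URL whose protocol is not on its list of allowed protocols, so the javascript: URL in the example below is output as an empty string.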

<?php
	$url = "javascript:alert('Hello')";
?>
<a href="<?php echo esc_url($url);?>">Text</a>

esc_url_raw()

Use this if you want to store a URL in the database or use it in a redirect. The difference between esc_url() and esc_url_raw() is that esc_url_raw() doesn’t replace ampersands and single quotes with HTML entities.
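The split between the two functions might look like this in practice. This is only a sketch: the function names and the option key myplugin_docs_url are assumptions made for the example.

```php
<?php

// Hypothetical helpers; the option key 'myplugin_docs_url' is an assumption.
function myplugin_store_docs_url( $submitted_url ) {
	// esc_url_raw() cleans the URL without HTML-encoding it, which suits storage.
	return update_option( 'myplugin_docs_url', esc_url_raw( $submitted_url ) );
}

function myplugin_print_docs_link() {
	$url = get_option( 'myplugin_docs_url', '' );
	if ( '' === $url ) {
		return;
	}

	// esc_url() encodes ampersands and quotes for use inside an href attribute.
	printf( '<a href="%s">Documentation</a>', esc_url( $url ) );
}
```

Storing the raw form and encoding only on output avoids saving HTML entities in the database.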

antispambot()

There are lots of email bots that constantly look for email addresses. We may want to display an email address to users without it being recognised by those bots. antispambot() does exactly that.

antispambot() converts some of the characters in an email address to HTML entities to block spam bots. The characters to encode are chosen at random, so the output differs from call to call. Example code:

<?php

echo antispambot("narayanprusty@sitepoint.com"); //Example output "&#110;&#97;&#114;&#97;y&#97;npr&#117;sty&#64;s&#105;&#116;e&#112;oi&#110;&#116;.&#99;om"

What is Validating?

In a nutshell, validating is checking user input. This is done to check if the user has entered a valid value.

If data is not valid, then it is not processed or stored. The user is asked to enter the value again.

Example: While creating an account on a site, we are asked to enter the password twice. Both passwords are validated: they are checked to confirm that they match.

You shouldn’t rely on HTML5 validation, as it can be easily bypassed. Server-side validation is required before processing or storing data.

WordPress provides functions to validate only a few types of data, so developers usually define their own functions to validate data. Let’s look at some of the validation functions WordPress provides:
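A custom validation function can be quite small. The sketch below uses PHP's own filter_var() rather than a WordPress function; the function name myplugin_is_valid_quantity and the 1 to 100 range are invented for the example.

```php
<?php

// Hypothetical validator: accepts whole numbers from 1 to 100 only.
function myplugin_is_valid_quantity( $value ) {
	$options = array(
		'options' => array(
			'min_range' => 1,
			'max_range' => 100,
		),
	);

	// filter_var() returns false when the value is not an integer in range.
	return false !== filter_var( $value, FILTER_VALIDATE_INT, $options );
}

var_dump( myplugin_is_valid_quantity( '25' ) );  // bool(true)
var_dump( myplugin_is_valid_quantity( '250' ) ); // bool(false)
var_dump( myplugin_is_valid_quantity( 'abc' ) ); // bool(false)
```

If the function returns false, the value should not be processed or stored, and the user can be asked to enter it again.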

is_email()

Email validation is required when submitting comments and contact forms, and when creating an account. WordPress provides the is_email() function to check whether a given string is a valid email address. It returns the address if it is valid, and false otherwise. Code example:

<?php

if(is_email("narayanprusty@sitepoint.com"))
{
	echo "Valid Email";
}
else
{
	echo "Invalid Email";
}
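Sanitizing and validating are often used together. As a minimal sketch, assuming a hypothetical helper named myplugin_clean_email, a form handler could first strip disallowed characters and then confirm that what remains is a usable address:

```php
<?php

// Hypothetical helper: returns a cleaned email address, or false if it is not valid.
function myplugin_clean_email( $raw_email ) {
	$email = sanitize_email( $raw_email );

	// is_email() returns the address when it is valid and false otherwise.
	return is_email( $email );
}
```

The caller only needs to compare the return value with false to decide whether to store the address or ask the user to enter it again.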

is_serialized()

is_serialized() checks whether the passed value is a serialized string. WordPress uses this function while storing options, metadata and transients. If the value associated with a key is an array or an object, WordPress serializes it before storing it in the database.

Here is example code on how you can use it:

<?php

$data = array("a", "b", "c");

//while storing

if(!is_serialized($data)) 
{ 
  //serialize it 
  $data = maybe_serialize($data); 

  //or else ask user to re-input the data
}

//while displaying

//maybe_unserialize() returns the original array, so use print_r() instead of echo
print_r(maybe_unserialize($data));

Conclusion

We saw what sanitizing, validating and escaping are, and why it is important for every developer to know the functions associated with them. You can find more reading on the topic at the Data Validation Codex page on WordPress.org. It is always a good idea to include these functions when developing a WordPress theme or plugin. Unfortunately, quite a lot of plugins are poorly developed, and do not escape the output. The result is that they make the website open to potential XSS attacks. Please feel free to include any comments or helpful tips in the section below.

Frequently Asked Questions (FAQs) on Sanitizing, Escaping, and Validating Data in WordPress

What is the importance of data sanitization in WordPress?

Data sanitization is a crucial aspect of WordPress security. It involves cleaning or filtering your input data to ensure it’s safe and secure. This process helps to prevent potential security threats like SQL injection and cross-site scripting (XSS) attacks, which can compromise your website’s integrity and the user’s information. By sanitizing your data, you’re ensuring that any harmful or unnecessary characters are stripped off before the data is stored in the database or displayed on your site.

How does data escaping work in WordPress?

Data escaping in WordPress is a security measure that ensures output data is safe to be displayed on your website. It involves securing data that is output to the browser to prevent security vulnerabilities. WordPress provides several functions for escaping data, including esc_html(), esc_url(), and esc_js(). These functions ensure that any potentially harmful code is rendered harmless, thus protecting your site from attacks.

What is data validation in WordPress and why is it necessary?

Data validation is the process of checking if the data complies with the specific rules, standards, or values before it’s processed. In WordPress, it’s necessary to validate data to ensure it’s in the correct format and type before it’s stored in the database or used in a script. This helps to maintain the integrity of your data and prevent potential errors or security vulnerabilities.

How can I sanitize text fields in WordPress?

WordPress provides a function called sanitize_text_field() for sanitizing text fields. This function checks the text field for invalid UTF-8, converts single < characters to entities, and strips all tags. It’s a useful function for ensuring that text fields are safe and secure.

What are some best practices for securing input in WordPress?

Securing input in WordPress involves validating, sanitizing, and escaping data. Always validate data to ensure it meets specific criteria before processing it. Sanitize data to remove any unwanted or potentially harmful characters. Escape data before output to ensure it’s safe to be displayed. Also, use WordPress’s built-in functions for these processes whenever possible.

How can I sanitize URLs in WordPress?

WordPress provides a function called esc_url() for cleaning URLs before they are output. This function removes invalid characters and rejects disallowed protocols, then returns the cleaned URL. To clean a URL before storing it in the database or using it in a redirect, use esc_url_raw() instead.

What is the difference between sanitizing and escaping data?

While both sanitizing and escaping data are security measures, they serve different purposes. Sanitizing data involves cleaning or filtering the data to remove any unwanted or potentially harmful characters before it’s stored in the database or used in a script. On the other hand, escaping data involves securing data that is output to the browser to prevent potential security vulnerabilities.

How can I validate data in WordPress?

WordPress provides a few functions for validating data, such as is_email(), and you can combine them with PHP’s own checks, such as is_int() or filter_var(). These functions check if the data is in the correct format or type. Always validate data before processing it to ensure it meets specific criteria.

What are some common security threats that data sanitization, escaping, and validation can prevent?

Data sanitization, escaping, and validation can prevent several common security threats, including SQL injection, cross-site scripting (XSS) attacks, and data corruption. These security measures ensure that your data is safe, secure, and free from potential vulnerabilities.

Are there any plugins that can help with data sanitization, escaping, and validation in WordPress?

Yes, there are several plugins available that can help with data sanitization, escaping, and validation in WordPress. These plugins provide additional functions and features that can enhance your site’s security. However, it’s important to remember that plugins should be used as a supplement to, not a replacement for, good security practices.

Narayan Prusty

Narayan is a web astronaut. He is the founder of QNimate. He loves teaching. He loves to share ideas. When not coding he enjoys playing football. You will often find him at QScutter classes.
