Language Fluency in WordPress: Understanding the Basics of I18n


With the worldwide success and geographical reach of WordPress, it’s pretty easy to understand why it’s useful to be able to seamlessly translate plugins, themes and WordPress itself into other languages.

However, that doesn’t necessarily make it easily adopted, and for good reason.

Localization (or internationalization; the two terms are often used interchangeably, although they mean slightly different things) is the source of a lot of confusion and general angst for many a WordPress developer. After all, we’re web developers, right? We’re not linguistic experts, and it’s a lot easier to just apply labels to fields in our programs in plain English.

Plus, the notion of trying to figure out how to translate text strings into other languages on your own (much less having somebody else do it for you) might induce mild nausea in even seasoned developers who’ve never really had the need or inclination to do so before.

But trust me… it’s really not all that bad, and it can significantly increase the usefulness as well as the user-base of your custom WordPress plugins and themes. Let’s break it down into bite-sized pieces, starting with the foundations.

The Basics of Internationalization & Localization

Let’s examine that aforementioned development nausea just a bit. The way many of us learned to code, the notion of keeping language considerations in mind when coding a PHP program was, at best, rather academic.

The majority of commercial coding projects have targeted scopes with defined audiences, and it’s often been a reasonable assumption to proceed without regard to translation.

However, since WordPress plugins and themes ultimately become much more useful when users can use them in their own language, we need a systematic way to handle translation.

Enter i18n – another fun acronym to tuck under your belt. Named for the fact that there are 18 letters between the i and the n in the word “internationalization”, I18n describes the notion of creating software systems that are designed to be translated into other languages.

The process of actually translating that software into a specific language is referred to as localization or l10n (can you guess how many letters are between the “l” and the “n” in the word “localization”?). WordPress utilizes a specific framework to handle i18n called GNU gettext – the de facto standard system that is used in nearly all open-source software. As far as your theme or plugin code is concerned, it shows up as a handful of PHP helper functions.

Since it is coded right into core, WordPress has the hooks necessary to allow you as a developer to define text string variables in your themes and plugins, as well as a standardized system to provide translations for each string in an unlimited number of languages.

Anatomy of a Localization Process

In general terms, the localization process is pretty straightforward. Conceptually, there are three key components to the process for either a theme or a plugin:

  1. GNU gettext markers in your theme or plugin that let WordPress know which strings to translate,
  2. a function linking the markers in your theme or plugin to a file that provides a translation key, and
  3. a file that provides a translation key, essentially creating a one-to-one relationship between translatable strings and what the translation should be for a given string.

Let’s talk in a bit more detail about what each of these three components does.

Component #1: GNU gettext markers that let WordPress know which strings to translate

First of all, we need to let WordPress know which strings we want to translate. This is done directly in the output code of your theme or plugin by wrapping each string in a PHP function that identifies the type of localization you want to do. That function runs your original string through a filter and returns the correct version.

While there is an array of functions that allow you to define or output localized strings in different ways, there are really only two localization functions you will use the vast majority of the time:

__( 'string', $domain )

This is a double underscore, and it returns a localized string.

_e( 'string', $domain )

This is an underscore “e”, and it prints out a localized string directly to the browser.

Note that both __ and _e take two parameters: a string and a domain. In this context, a domain is strictly a unique identifier and is nothing more than the label that is attached to a specific translation file. This relationship is defined in Component #2.
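To make the difference between the two concrete, here is a minimal sketch of both markers in use. The text domain 'my-theme' and the strings are made up for illustration. The two stand-in definitions at the top are only there so the snippet also runs outside WordPress; inside WordPress the real functions already exist and the stand-ins are skipped.

```php
<?php
// Stand-ins for running this sketch outside WordPress. They are NOT the
// real implementations: with no translation loaded they simply hand back
// the original string.
if ( ! function_exists( '__' ) ) {
    function __( $text, $domain = 'default' ) {
        return $text;
    }
}
if ( ! function_exists( '_e' ) ) {
    function _e( $text, $domain = 'default' ) {
        echo __( $text, $domain );
    }
}

// 'my-theme' is a hypothetical text domain.
// __() returns the string, so it can be stored or concatenated first.
$label = __( 'Read more', 'my-theme' );
echo '<a href="#more">' . $label . '</a>';

// _e() echoes the string right away.
_e( 'Posted in', 'my-theme' );
```

As a rule of thumb, reach for __() when the string needs to be passed to something else, and _e() when it goes straight into the page.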

Component #2: A function linking the markers in your theme or plugin to a file that provides a translation key

Within your theme or plugin, you’ll need to create a relationship between the strings you want to translate and a translation file that provides a key for the string translation.

This is done using one of two PHP functions: load_theme_textdomain() for themes, or load_plugin_textdomain() for plugins.

In the case of theme localization, you’ll use load_theme_textdomain() in your functions.php file. The function takes two parameters as described below:

load_theme_textdomain( $domain, $path )

$domain: A unique identifier assigned to your custom translatable strings

$path: The path to your translation key file within the theme

Theme localization keys off the WPLANG constant in wp-config.php, but we’ll discuss that in more detail later.
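As a rough sketch of what this might look like in a theme’s functions.php: the text domain 'my-theme', the function name and the languages folder are assumptions chosen for illustration, and the snippet only runs inside WordPress. Hooking into after_setup_theme is a common convention rather than something this article prescribes.

```php
<?php
// functions.php of a hypothetical theme with the text domain 'my-theme'.
function my_theme_load_textdomain() {
    // Second argument: the folder inside the theme that holds the
    // translation files, e.g. languages/fr_FR.mo for French.
    load_theme_textdomain( 'my-theme', get_template_directory() . '/languages' );
}
add_action( 'after_setup_theme', 'my_theme_load_textdomain' );

// The locale itself is set elsewhere. At the time of writing that meant a
// line like this in wp-config.php:
//   define( 'WPLANG', 'fr_FR' );
```

Note that WPLANG was deprecated in WordPress 4.0; on newer installs the locale comes from the Site Language option under Settings > General instead.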

Plugin localization works in a very similar way to theme localization, but with a few differences. Called from within the PHP files of the i18n-enabled plugin, load_plugin_textdomain() takes three parameters as described below:

load_plugin_textdomain( $domain, $abs_rel_path, $plugin_rel_path )

$domain: A unique identifier assigned to your custom translatable strings

$abs_rel_path: An optional parameter, deprecated as of WP 2.7. Just pass false for it; it’s nothing to worry about

$plugin_rel_path: The relative path to your translation key file. If you fail to define this path, it’ll default to the root directory that the file is in. While this is by definition an optional parameter, it’s best practice to keep your language translation files separate from your logic files and so you’ll usually want to specify a value here.

In both of these instances, $domain is the unique identifier that we referred to in Component #1, and serves only to define a relationship between the strings in the code that require translation and Component #3, the translation key.
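The plugin version can be sketched along the same lines. Again, the text domain 'my-plugin', the function name and the languages folder are hypothetical, the choice of the init hook is an assumption, and the code is meant to live in the plugin’s main file inside WordPress.

```php
<?php
/*
 * Plugin Name: My Plugin (hypothetical example)
 */
function my_plugin_load_textdomain() {
    load_plugin_textdomain(
        'my-plugin',                                          // $domain
        false,                                                // deprecated $abs_rel_path
        dirname( plugin_basename( __FILE__ ) ) . '/languages' // $plugin_rel_path
    );
}
add_action( 'init', 'my_plugin_load_textdomain' );
```

With this in place, WordPress would look for a file such as languages/my-plugin-fr_FR.mo inside the plugin folder. Unlike theme files, plugin translation files carry the text domain as a prefix.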

Component #3: A file that provides a translation key

GNU gettext offers us a systematic way of both providing a mechanism to create one-to-one string translation relationships between individual default strings and their respective translations, and then feeding those various string translations to WordPress in an efficient manner. This is done through .PO and .MO files.

A .PO file is a file that provides a human-readable and editable translation key for a specific language. For instance, if your theme or plugin was written in English and you had translations readily available for French, German, and Pirate English, you would have three corresponding .PO files: one for each of the languages. When a specific string translation is made, it happens in this file.

While .PO files are human friendly and easily editable, they are not ideal for WordPress to use while processing translations in practice. Instead, WordPress will use a .MO file for its actual translation. .MO files are compiled files that can be automatically generated for you when you use helper tools like Poedit. Each .PO file has a corresponding .MO file, which is regenerated each time the .PO file is updated.

The final pertinent file type is the .POT file, or .PO Template file. A .POT file is a special kind of .PO file that is an exact copy of any of the .PO files in a localization instance with the exception that it is void of any translations. Making .POT files available to translators allows anybody to easily create translations for your themes and plugins into their own native language using helper tools.
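To give a feel for what a translation key contains, the sketch below shows one .PO entry in a comment and then models the same one-to-one relationship as a plain PHP array. The strings, the French translations and the variable name are invented for illustration; only the msgid/msgstr keywords are standard gettext .PO syntax.

```php
<?php
// One entry in a hypothetical fr_FR .PO file looks like this:
//
//   msgid "Read more"
//   msgstr "Lire la suite"
//
// A .POT template holds the same msgid lines with every msgstr left empty.
// Conceptually, the compiled .MO file gives WordPress a lookup table:
$fr_fr_translations = array(
    'Read more' => 'Lire la suite',
    'Posted in' => 'Publié dans',
);

echo $fr_fr_translations['Read more']; // prints: Lire la suite
```

This is only a mental model. Real .MO files are a binary format, which is why you leave generating them to a tool.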

Putting the pieces together

So now that we know the main components in a localization process, let’s have a visual look at how they work together in WordPress. Consider the following diagram:

Figure: l10n processing in WordPress

We’ll start with the program’s source code.

When a browser requests a specific page, the translation process is initiated as soon as WordPress runs into the __() and _e() functions wrapping text strings in the source code.

WordPress recognizes these functions because gettext support is built into core, and automatically seeks to translate them. Your theme’s functions.php file (or your plugin’s main file) has already been loaded at this time, and WordPress is able to use the load_theme_textdomain() or load_plugin_textdomain() call it found there to connect the strings to a library location (depicted in Step 1 in the diagram above).

load_plugin_textdomain() or load_theme_textdomain() then retrieves the locale information from the WPLANG constant set in wp-config.php (Step 2 in the diagram above) and retrieves the proper .MO file associated with the locale (Step 3 in the diagram above).

If the properly named .MO file exists, the functions output the translated strings to the browser (Step 4 in the diagram above).

If a properly named .MO file does not exist, or for any strings that do not have translated entries within the .MO file, the default verbiage is instead output to the browser.
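The fallback rules described above can be modeled in a few lines of plain PHP. This is a simplified sketch, not WordPress’s actual implementation: sketch_translate() and $catalogs are hypothetical names, and the nested array stands in for the .MO files found on disk, keyed by locale.

```php
<?php
// Simplified model of the lookup, NOT the real WordPress code.
function sketch_translate( $text, $locale, array $catalogs ) {
    // No .MO file for this locale: fall back to the original string.
    if ( ! isset( $catalogs[ $locale ] ) ) {
        return $text;
    }
    $table = $catalogs[ $locale ];
    // Missing or empty entry: fall back to the original string as well.
    if ( ! isset( $table[ $text ] ) || '' === $table[ $text ] ) {
        return $text;
    }
    return $table[ $text ];
}

$catalogs = array(
    'fr_FR' => array( 'Read more' => 'Lire la suite' ),
);

echo sketch_translate( 'Read more', 'fr_FR', $catalogs ), "\n"; // Lire la suite
echo sketch_translate( 'Posted in', 'fr_FR', $catalogs ), "\n"; // Posted in (no entry)
echo sketch_translate( 'Read more', 'de_DE', $catalogs ), "\n"; // Read more (no file)
```

The practical upshot is that a partial or missing translation never breaks a page; untranslated strings simply show up in the default language.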

Finally, the .MO file is a compiled version of a .PO file, which is human editable. Likewise, .PO files have template files that allow translators to easily modify strings in your program into other languages.

Looks like I don’t have enough room in this article to illustrate all this with some examples that really bring the whole conceptual process to life in WordPress, but I’ll follow up with another article that digs into the actual code involved next week.

