Localizing PHP Applications “The Right Way”, Part 3

Abdullah Abouzekry

In Part 2 you gained more insight into using the gettext library by learning the most important functions of the extension. In this part you’ll learn how to best use a fallback locale, switch between locales, and override the currently selected message domain. So without further ado, let’s dive in!

Directory Structure and Fallback Locales

If you’ve been following along in the previous articles, you should have the following directory structure for testing the gettext library. While it is typical, it is not designed to achieve the best performance.

[Image: directory recap]

Generally speaking, you should define your application’s default language/locale before getting too far into the localization process because this decision will affect how you create your core domain files. For example, if you decide to use American English as the default locale – as most people do – then you probably won’t create the en_US directory at all! It is preferable to create each target domain with msgids that are actual strings that would have been found in the default locale. So instead of using the identifier “HELLO_WORLD”, you should be using the actual English string.

#Test token 1
msgid "Hello World!"
msgstr "Bonjour tout le monde!"

Remember, gettext will display the msgid if it can’t find a translation for it. So it’s not about the en_US directory itself; it’s about avoiding unnecessary translations when the user requests the default locale. Using real text strings like this saves execution time and memory by eliminating the need to translate English to English, that is, “HELLO_WORLD” to “Hello World!”.

While some developers prefer to keep the English-to-English translation so there is a clear distinction between application strings (“HELLO_WORLD”) and interface text (“Hello World!”), I much prefer this fallback approach. Of course you are free to choose whichever best aligns with your needs and personal style.

After deleting the en_US directory, I’ll add two more locales the application can target: Spanish (es_ES) and Egyptian Arabic (ar_EG). This is how the Locale directory should look now:

[Image: ar_EG and es_ES directories]

Don’t forget to create the necessary translation domains for each using the same procedures outlined in Part 1.

The French domain contains:

#Test token 1
msgid "Hello World!"
msgstr "Bonjour tout le monde!"

#Test token 2
msgid "Testing Translation..."
msgstr "Test de traduction..."

The Spanish domain contains:

#Test token 1
msgid "Hello World!"
msgstr "¡Hola mundo!"

#Test token 2
msgid "Testing Translation..."
msgstr "Prueba de traducción..."

And the Arabic domain contains:

#Test token 1
msgid "Hello World!"
msgstr "!أهلا بالعالم"

#Test token 2
msgid "Testing Translation..."
msgstr "...اختبار الترجمة"

Notice that the msgids in all the target domains are actual strings from the en_US locale, which was removed because you are now using it as the default. Now you have a real-world directory structure with translation domains!

Switching Locales

Switching between various locales is as easy as telling gettext to use another locale. Typically you will do this at the top of your application files or by putting it in a common file to be included in each script that sends output to the browser.

Create a new file named locale.php with the following contents:

<?php
// use sessions
session_start();

// get language preference
if (isset($_GET["lang"])) {
    $language = $_GET["lang"];
}
else if (isset($_SESSION["lang"])) {
    $language = $_SESSION["lang"];
}
else {
    $language = "en_US";
}

// save language preference for future page requests
$_SESSION["lang"] = $language;

$folder = "Locale";
$domain = "messages";
$encoding = "UTF-8";

putenv("LANG=" . $language);
setlocale(LC_ALL, $language);

// bind the domain and make it the default for gettext() and _()
bindtextdomain($domain, $folder);
bind_textdomain_codeset($domain, $encoding);
textdomain($domain);
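
Because locale.php passes the lang parameter on to putenv() and setlocale(), it may be worth checking the value against the locales you actually ship before using it. The following is only a minimal sketch of that idea, not part of the original script: the helper name resolveLocale() and the $supported list are assumptions made for illustration.

```php
<?php
/**
 * Pick a locale from the query string, then the session, then the default.
 * Hypothetical helper: the name and signature are illustrative only.
 */
function resolveLocale(array $query, array $session, array $supported, string $default): string
{
    foreach ([$query, $session] as $source) {
        $candidate = $source["lang"] ?? null;
        // Accept only exact matches against the list of shipped locales.
        if (is_string($candidate) && in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    return $default;
}

// Assumed list: the locales with a directory under Locale/, plus the default.
$supported = ["en_US", "fr_FR", "es_ES", "ar_EG"];

$language = resolveLocale($_GET, $_SESSION ?? [], $supported, "en_US");
```

With a check like this, an unknown or malformed lang value simply falls back to the session value or the default locale.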


Then open the test-locale.php script; remove the domain setup code and include the new locale.php file instead. Your code should now look like this:

<?php
// Include I18N support
require_once "locale.php";

echo _("Hello World!"), "<br>";
echo _("Testing Translation...");

Now you’re ready for the fun! Go to your browser and you’ll get the following output:

Hello World!
Testing Translation...

Change the URL to pass in one of the locales created earlier, for example: test-locale.php?lang=fr_FR. gettext will display the output in French, Spanish, or Arabic, depending on what you provided for the parameter. Switching languages is as simple as that!

Bonjour tout le monde!
Test de traduction...

The locale.php file makes use of sessions so you only have to pass the lang parameter once and it will be used for subsequent requests by the user. If you are performing URL-rewriting, another possibility is to make the language part of the URL and extract it from there, as in www.example.com/en_US/test-locale.php.
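
As a rough sketch of the URL-based approach, the locale can be read from the first path segment. The helper name localeFromPath() is hypothetical, and the pattern assumes locale names of the form xx_YY (such as en_US); adjust it if your locale names differ.

```php
<?php
/**
 * Return the locale found in the first path segment, or null if absent.
 * Hypothetical helper; assumes locales look like "en_US" (xx_YY).
 */
function localeFromPath(string $path): ?string
{
    // Match a leading segment such as /en_US/ or /fr_FR/.
    if (preg_match('#^/([a-z]{2}_[A-Z]{2})(?:/|$)#', $path, $matches)) {
        return $matches[1];
    }
    return null;
}

// REQUEST_URI may carry a query string, so keep only the path part.
$path = parse_url($_SERVER["REQUEST_URI"] ?? "/", PHP_URL_PATH) ?: "/";
$language = localeFromPath($path) ?? "en_US";
```

For a request to /fr_FR/test-locale.php this yields fr_FR, while a path without a locale segment falls back to en_US.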

Overriding the Current Domain

Let’s suppose for a moment you want to switch from one text domain to another. This might be needed if you’re using separate domains to store the core translations and system error messages; for example, messages would be the main application domain and errors would be the domain for error strings. You’ll find the function dgettext() very useful for this type of setup.

dgettext() is essentially the same as gettext() and _(), except it accepts the name of a domain as the first argument in which to look up the translation. This doesn’t affect the default domain that was set by textdomain(); subsequent calls to gettext() will still use the default.

Create a new translation domain by the name of errors for the French locale (TestI18N/Locale/fr_FR/LC_MESSAGES/errors.po). Make sure you create the file using Poedit, then open it using a text editor and add the following translations:

#Test Error 1
msgid "Error getting content"
msgstr "Erreur de l'obtention du contenu"

#Test Error 2
msgid "Error saving data"
msgstr "Erreur de sauvegarde des données"

Save errors.po, re-open it in Poedit and compile it to errors.mo. Then add the following lines at the end of test-locale.php:

echo "<br>";
echo _("Error getting content"), "<br>";
echo _("Error saving data");

When you run the test-locale.php script you’ll see the English strings, even if you pass lang=fr_FR as the parameter. This is because you added these messages to a different domain (_() is using the messages domain to look up translations since this is what was set with textdomain()). To inform gettext where to find the new translation strings, update the code to use dgettext() instead:

echo dgettext("errors", "Error getting content"), "<br>";
echo dgettext("errors", "Error saving data");

Now when you run the script… you still see the English messages! Hrm… why aren’t they being replaced?

Actually, we’ve just made a very common mistake that people make when using gettext. If you recall, in Part 1 I talked about two very important functions, bindtextdomain() and bind_textdomain_codeset(), and mentioned that you can call them several times to bind as many domains as you want. Whenever you need to use a domain, you must explicitly bind it using bindtextdomain() first. gettext only allows a single “master” domain, which is the one you specify using textdomain() and is the one used by gettext() and _(). You can look up messages in other domains with dgettext() only if they’re bound first.

Update the locale.php include file to bind the errors domain as well:

bindtextdomain($domain, $folder); 
bind_textdomain_codeset($domain, $encoding);

bindtextdomain("errors", "Locale"); 
bind_textdomain_codeset("errors", "UTF-8");


Now when you run your script you’ll see the error messages correctly translated to French.
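
If the application grows beyond two domains, the repeated bind calls could be folded into a small helper. This is only a sketch under stated assumptions: bindDomains() is a hypothetical name, and treating the first listed domain as the default is a choice made here for illustration. The gettext functions it calls are the same ones used in locale.php.

```php
<?php
/**
 * Bind every listed domain to the same folder and encoding.
 * Hypothetical helper: the first domain becomes the default
 * used by gettext() and _(); the rest are reachable via dgettext().
 */
function bindDomains(array $domains, string $folder, string $encoding = "UTF-8"): void
{
    foreach ($domains as $name) {
        bindtextdomain($name, $folder);
        bind_textdomain_codeset($name, $encoding);
    }

    // Only one domain can be the default lookup domain.
    textdomain($domains[0]);
}
```

In locale.php this might be called as bindDomains(["messages", "errors"], "Locale"), after which dgettext("errors", ...) can find the error strings.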


In this part you learned how using the default locale’s strings as msgids in target domains can improve performance and organization, and how switching between locales based on the user’s preference can be accomplished. You also learned that while gettext allows only one default lookup domain, you can use multiple domains with dgettext() provided you’ve bound them first.

In the next part I’ll show you how powerful gettext is when it comes to handling one of language’s most demanding aspects – plural forms.
