Easy Multi-Language Twig Apps with Gettext

By Bruno Skvorc

There are many approaches for adding new languages to your application’s UI. Though some userland solutions like symfony/translation are arguably simpler to use, they’re slower than good old native gettext by several orders of magnitude.

In this tutorial, we’ll modify an English-only application to use gettext. Through this, we’ll demonstrate that getting internationalization up and running in an already existing app is not only possible, but relatively easy.

The application in question will be our own nofw – a ready-to-use skeleton app.

Bootstrapping and Basics

As always, we’ll use our trusty Homestead Improved as the environment. If you’d like to follow along, please fire it up. Our box already has gettext installed and activated. We’ll see how to install it manually for deployment purposes at the end of this tutorial.

Since nofw uses Twig, we’ll need the i18n extension. To start the project off right, here’s the full process:

git clone https://github.com/swader/nofw
cd nofw
git checkout tags/2.93 -b 2.93
composer require twig/extensions

Note: the above commands clone an older version of nofw – one without the internationalization features built in – so that readers can follow along with the tutorial.

This will install both Twig’s extensions, and all the project’s dependencies. Follow the procedure from the README to set up the rest of the nofw app (the database end), then return to this post.

The app should be up and running now.

[Screenshot: nofw working]

The syntax for getting a translatable string is gettext("string") or its alias: _("string") – that is, _() is the function we call and "string" is the string we’re translating. If a translation for "string" isn’t found, then the original (which is considered a placeholder) value is returned. Placeholders are usually full strings in the most popular language for the site’s audience, so that if translation fails for some reason, readable text is still rendered.

Let’s try and make this work on a bogus PHP file, one that isn’t being powered by Twig, just to make sure everything is in working order. We’ll use the example from the old gettext post series. In the project’s public folder, we’ll make a file called i18n.php (the commands later on refer to it as public/i18n.php and are run from the project root) and give it the contents:

<?php

$language = "en_US.UTF-8";
putenv("LANGUAGE=" . $language); 
setlocale(LC_ALL, $language);

$domain = "messages"; // which language file to use
bindtextdomain($domain, "Locale"); 
bind_textdomain_codeset($domain, 'UTF-8');

textdomain($domain);

echo _("HELLO_WORLD");

In the root of the project, let’s create a folder structure like this one:

Locale
└── en_US
    └── LC_MESSAGES

Describing the code above: we first pick US English as the language and store it in the LANGUAGE environment variable. PHP’s setlocale function then uses the LC_ALL constant to switch all contexts to the given locale, so PHP will try to convert dates, numeric formatting, and even currency to the locale we give it. Naturally, LC_ALL includes the LC_MESSAGES category that governs our custom translated messages, too.

The $domain variable is there to tell PHP which language file to use. The language file will be called messages.po in its raw, editable form, and messages.mo in its compiled, machine readable form. bindtextdomain merely sets the path of the language file, which as we know is inside the Locale folder, and bind_textdomain_codeset will set the language character set. UTF-8 is a pretty universally safe bet here.

Finally, textdomain sets the active domain to be used.
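To make the lookup concrete, here is a rough shell sketch of where the compiled catalog is expected to live, given the values used above. The function name mo_path is made up for illustration, and the sketch is a simplification: it only strips the codeset suffix, while gettext's real lookup also tries other fallbacks (such as the language without the territory).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of gettext): print the catalog path that
# corresponds to a base directory, a locale and a domain.
# Simplification: only the codeset suffix (".UTF-8") is stripped.
mo_path() {
    local base_dir="$1" locale="$2" domain="$3"
    local lang="${locale%%.*}"   # "en_US.UTF-8" -> "en_US"
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$base_dir" "$lang" "$domain"
}

# Same values as in i18n.php
mo_path "Locale" "en_US.UTF-8" "messages"
```

With the values from i18n.php this points at Locale/en_US/LC_MESSAGES/messages.mo, a file that does not exist yet at this stage.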

Running this test script in the command line would echo the placeholder: HELLO_WORLD. Obviously, it’s missing the actual language file. It’s time to create it.

Extraction

Gettext comes with a handy tool for extracting placeholder strings from files. In the root of the project, we’ll execute:

xgettext --from-code=UTF-8 -o Locale/messages.pot public/i18n.php

Above, --from-code=UTF-8 tells xgettext that the source file is UTF-8 encoded, and -o names the file that the strings harvested from public/i18n.php are written to. Inspecting the resulting messages.pot file now gives:

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2016-04-10 10:44+0000\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: public/i18n.php:13
msgid "HELLO_WORLD"
msgstr ""

.pot stands for portable object template. These template files are used to build other language files. If we decide to add Japanese to our app later on, the .pot file will be used to generate a Locale/ja_JP/LC_MESSAGES/messages.po which will, in turn, be used to generate the respective messages.mo file. Let’s use this approach to generate the en_US messages file now:

msginit --locale=en_US --output-file=Locale/en_US/LC_MESSAGES/messages.po --input=Locale/messages.pot

This process needs to be repeated for every new language we want added to the app.
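Since the same msginit call has to be repeated per language, a small dry-run helper can print the commands for a list of locales so they can be reviewed before running. This is only a sketch: the function name is hypothetical, the locales are examples, and the --no-translator flag (which stops msginit from asking for an e-mail address) is an addition that the commands in this tutorial do not use.

```shell
#!/usr/bin/env bash

# Hypothetical dry-run helper: prints one mkdir and one msginit command per
# locale. Nothing is executed here; review the output first.
print_init_commands() {
    local locale_dir="$1"
    shift
    local locale target
    for locale in "$@"; do
        target="${locale_dir}/${locale}/LC_MESSAGES"
        printf 'mkdir -p %s\n' "$target"
        printf 'msginit --no-translator --locale=%s --output-file=%s/messages.po --input=%s/messages.pot\n' \
            "$locale" "$target" "$locale_dir"
    done
}

# Example locales, chosen only for illustration
print_init_commands "Locale" ja_JP de_DE
```

Once the printed commands look right, they can be pasted into the terminal or piped to bash.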

The .po file is very similar to the .pot file from before, only it contains actual translation strings we can edit:

# English translations for PACKAGE package.
# Copyright (C) 2016 THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# vagrant <vagrant@homestead>, 2016.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2016-04-10 10:44+0000\n"
"PO-Revision-Date: 2016-04-10 10:58+0000\n"
"Last-Translator: vagrant <vagrant@homestead>\n"
"Language-Team: English\n"
"Language: en_US\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=ASCII\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

#: public/i18n.php:13
msgid "HELLO_WORLD"
msgstr "HELLO_WORLD"

After replacing the msgstr value of HELLO_WORLD with Howdy, we should compile the .po file into a .mo file Gettext can read:

msgfmt -c -o Locale/en_US/LC_MESSAGES/messages.mo Locale/en_US/LC_MESSAGES/messages.po

Adding a new language

To be sure things work, let’s add a new language – hr_HR (Croatian).

  1. First, we install the new locale onto the OS with:

    sudo locale-gen hr_HR hr_HR.UTF-8
    sudo update-locale
    sudo dpkg-reconfigure locales
    
  2. We then generate new .po files from the .pot files:

    mkdir -p Locale/hr_HR/LC_MESSAGES
    msginit --locale=hr_HR --output-file=Locale/hr_HR/LC_MESSAGES/messages.po --input=Locale/messages.pot
    
  3. Next, we change the HELLO_WORLD value into Zdravo, then generate the .mo file:

    msgfmt -c -o Locale/hr_HR/LC_MESSAGES/messages.mo Locale/hr_HR/LC_MESSAGES/messages.po
    
  4. Finally, we change the locale setting in the PHP file to hr_HR.UTF-8 and test.

Everything should be working fine.

Note: a restart of the web server and / or PHP-FPM might be necessary to clear the gettext cache.
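If the Croatian string still does not show up, one thing worth ruling out is that the locale was never generated on the OS. The sketch below checks a list of installed locales for the one we want. The function name is made up, and the demo feeds it a canned list; on a real machine the list would come from locale -a.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the wanted locale appears in a list of
# installed locales read from standard input (one per line, the way
# `locale -a` prints them). Names are compared lowercased and without
# dashes, because `locale -a` typically shows "hr_HR.utf8" for "hr_HR.UTF-8".
locale_available() {
    local wanted normalized line
    wanted="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')"
    while IFS= read -r line; do
        normalized="$(printf '%s' "$line" | tr '[:upper:]' '[:lower:]' | tr -d '-')"
        [ "$normalized" = "$wanted" ] && return 0
    done
    return 1
}

# Demo with a canned list. Real usage: locale -a | locale_available hr_HR.UTF-8
if printf 'C\nen_US.utf8\nhr_HR.utf8\n' | locale_available "hr_HR.UTF-8"; then
    echo "hr_HR.UTF-8 is installed"
else
    echo "hr_HR.UTF-8 is missing, run locale-gen first"
fi
```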

Twig

Now that we know that gettext works fine and we’re able to add new languages on a whim, let’s see how it behaves in conjunction with Twig. First, let’s add the following into app/config/config_web.php, at the very top:

$language = "hr_HR.UTF-8";
putenv("LANGUAGE=" . $language);
setlocale(LC_ALL, $language);

$domain = "messages"; // which language file to use
bindtextdomain($domain, __DIR__."/../../Locale");
bind_textdomain_codeset($domain, 'UTF-8');

textdomain($domain);

For Twig to work with translatable strings, it needs the i18n extension we installed during the bootstrapping section. Then, in the templates, we use the trans block:

{% trans %}
    Hello {{ name }}!
{% endtrans %}

Of course, gettext has no idea what {{ name }} is supposed to mean, so Twig’s extension automatically compiles this into the gettext-friendly Hello %name%!. One caveat is that xgettext isn’t equipped to extract strings from Twig templates, so we need an alternative, as per the docs.

We’ll compile our view templates into the system’s temporary folder, and then xgettext those, like regular PHP files!

First, let’s add a translatable message to one of the files. For example, somewhere into Standard/Views/home.twig, we can put:

    {% trans %}
        This is translatable
    {% endtrans %}

Then, in app/bin, we’ll create a new file: twigcache.php:

<?php

require __DIR__.'/../../vendor/autoload.php';
$shared = require __DIR__.'/../config/shared/root.php';

$tplDir = dirname(__FILE__) . '/templates';
$tmpDir = '/tmp/cache/';
$loader = new Twig_Loader_Filesystem($shared['site']['viewsFolders']);

// force auto-reload to always have the latest version of the template
$twig = new Twig_Environment(
    $loader, [
    'cache' => $tmpDir,
    'auto_reload' => true,
]
);
$twig->addExtension(new Twig_Extensions_Extension_I18n());
// configure Twig the way you want

// iterate over all your templates
foreach ($shared['site']['viewsFolders'] as $tplDir) {
    foreach (new RecursiveIteratorIterator(
                 new RecursiveDirectoryIterator($tplDir),
                 RecursiveIteratorIterator::LEAVES_ONLY
             ) as $file) {
        // force compilation
        if ($file->isFile()) {
            $twig->loadTemplate(str_replace($tplDir . '/', '', $file));
        }
    }
}

This file pulls in the common root.php configuration file in which view folders are defined, and as such we only need to update them in one place. Executing the script with php app/bin/twigcache.php now produces a directory tree with PHP cache files:

/tmp/cache
├── 1a
│   └── 1ad38dfd106734cda72279c3bbd83dd4c64d93ff9c713afb1e74904144018347.php
├── 1c
│   └── 1ca70331199383cea2ce308ab09447cebd7e5e81f2a7f5caa319d577f3a66682.php
...
├── df
│   └── df75e14ad2cb55315ab205872c8b8590ffde333912ec5c89e44c365479bfe457.php
└── f4
    └── f444ff725954cd5a9ec29ceb56a9cbf7eda8a273cea96c542c35a271e0f57c7e.php

We can unleash xgettext on this collection now:

xgettext -o Locale/messages.pot --from-code=UTF-8 -n --omit-header /tmp/cache/*/*.php

Inspecting Locale/messages.pot now reveals entirely new contents:

#: /tmp/cache/d0/d006e63c5a4c4e6a700d9273d4523dd0cf419105fa4b00cf6b89918c67df4b2b.php:56
msgid "This is translatable"
msgstr ""

Next, we can bring the .po files for our two existing languages up to date.

msgmerge -U Locale/en_US/LC_MESSAGES/messages.po Locale/messages.pot
msgmerge -U Locale/hr_HR/LC_MESSAGES/messages.po Locale/messages.pot

The msgmerge command merges the changes from messages.pot into the defined messages.po file. We use msgmerge instead of msginit here for convenience, but we could have also used msginit to start a new language file. Merge has an added bonus, though: seeing as xgettext no longer looked for translatable strings in the i18n.php from the example above, the newly updated .po files actually have the previously used string-value pair commented out:

#: /tmp/cache/d0/d006e63c5a4c4e6a700d9273d4523dd0cf419105fa4b00cf6b89918c67df4b2b.php:56
msgid "This is translatable"
msgstr "Yes, this is totally translatable"

#~ msgid "HELLO_WORLD"
#~ msgstr "Howdy"

This makes it easy to track deprecated translations without actually losing the effort it took to make them.
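Because obsolete entries are kept as #~ comments, they are easy to count, which can help decide when a .po file is due for a clean-up. A minimal sketch, using a made-up function name and a throwaway sample file shaped like the merged .po above:

```shell
#!/usr/bin/env bash

# Hypothetical helper: count obsolete (commented-out) entries in a .po file.
count_obsolete() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's status clean.
    grep -c '^#~ msgid ' "$1" || true
}

# Demo on a temporary file
sample="$(mktemp)"
cat > "$sample" <<'EOF'
msgid "This is translatable"
msgstr "Yes, this is totally translatable"

#~ msgid "HELLO_WORLD"
#~ msgstr "Howdy"
EOF

echo "Obsolete entries: $(count_obsolete "$sample")"
rm -f "$sample"
```

When the old translations are no longer needed, gettext's msgattrib tool with its --no-obsolete option should be able to drop them from the file.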

Assuming we changed some translation values, let’s compile to .mo now and test:

msgfmt -c -o Locale/hr_HR/LC_MESSAGES/messages.mo Locale/hr_HR/LC_MESSAGES/messages.po
msgfmt -c -o Locale/en_US/LC_MESSAGES/messages.mo Locale/en_US/LC_MESSAGES/messages.po

[Screenshot: working Croatian translation]

Notice our translated string at the bottom there – everything works as expected!

Granted, the configuration we pasted to the top of config_web.php could use some work – like detecting the desired language through routes etc, but for brevity, this works fine.

Now all that’s left is hunting down all the strings in all the views and turning them into {% trans %} blocks!

Bonus: Scripts!

While the process above isn’t exactly complicated, it’d be simpler not to have to type out those long commands for every little thing. With more languages, things get more convoluted and confusing rather quickly, and it becomes ever easier to make a typo when punching in those shell commands. That’s why we can put together some shortcut bash scripts to help us out.

Note: if you’re not using nofw and don’t intend to, feel free to skip this section and/or just harvest what you think is useful from it. Likewise, note that all these scripts are meant to be run from the root folder of the project.


We’ll put all these scripts into app/bin/i18n/ and make them executable on the command line:

mkdir -p app/bin/i18n
touch app/bin/i18n/{addlang.sh,update-pot.sh,update-mo.sh,config.sh}
chmod +x app/bin/i18n/*.sh

Config

LOCALE_FOLDER="Locale"
REGULAR_USER="forge"

[[ -f app/bin/i18n/config_local.sh ]] && source app/bin/i18n/config_local.sh

This script will be included by other scripts, which allows users to change the desired folder for the locales. Likewise, it contains the name of the non-sudo user. As it’s generally a bad idea to have many sudo commands inside a bash script, and we’ll certainly need to execute a lot of them with root privileges, we’ll opt to execute the whole script with sudo and then just drop privileges to the regular user on those commands that don’t need sudo. The user defaults to “forge” because that’s the user Laravel Forge sets up.

This script also includes another config script if it exists (because it’s in .gitignore and won’t exist on live servers) in which the username can be overridden. This is useful for local development. For example, when using Homestead Improved, everything will be run from the vagrant user’s perspective and the forge user doesn’t exist.
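For example, a local override file for Homestead Improved might contain nothing more than the user name. The contents below are an assumption based on the description above, not a file shipped with nofw:

```shell
# app/bin/i18n/config_local.sh
# Local-only overrides; keep this file out of version control.
REGULAR_USER="vagrant"
```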

New language / refresh languages script

#!/usr/bin/env bash

# addlang.sh

source app/bin/i18n/config.sh

if [[ $EUID -ne 0 ]]; then
   echo "This script must be run as root" 1>&2
   exit 1
fi

if [ -z "$1" ]; then
    for folder in $(find "${LOCALE_FOLDER}" -maxdepth 1 -type d | awk -F/ '{print $NF}')
    do
        if [ "${folder}" != "${LOCALE_FOLDER}" ]; then
            echo "Executing locale-gen ${folder} ${folder}.UTF-8"
            locale-gen "${folder}" "${folder}.UTF-8"
        fi
    done
    echo "Executing updates..."
    update-locale
    dpkg-reconfigure locales
fi

if [ -n "$1" ]; then
    echo "Executing locale-gen $1 $1.UTF-8"
    locale-gen "$1" "$1.UTF-8"

    echo "Executing updates..."
    update-locale
    dpkg-reconfigure locales

    echo "Creating folder: ${LOCALE_FOLDER}/$1/LC_MESSAGES"
    sudo -u "${REGULAR_USER}" mkdir -p "${LOCALE_FOLDER}/$1/LC_MESSAGES"
fi

This will immediately install any locale passed in as the first argument, like so:

sudo app/bin/i18n/addlang.sh ja_JP

It will also create the appropriate language folder in the Locale folder.

If no parameter was passed in, then this script will look for expected locales by traversing the Locale folder, and auto-installing each of the locales as per the subfolder name.

I.e., if there are folders Locale/en_US and Locale/hr_HR, it will be as if we had run sudo app/bin/i18n/addlang.sh en_US and sudo app/bin/i18n/addlang.sh hr_HR. This helps auto-install locales during deployment.

This script needs to be run as root because the locale-related commands require elevated privileges.
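The folder traversal used when no argument is passed could also be written with a shell glob instead of find and awk. The sketch below is one possible alternative, not the script's actual code; the function name is hypothetical. A glob ending in a slash only matches directories, so files like messages.pot are skipped and the locale folder itself never shows up in the results.

```shell
#!/usr/bin/env bash

# Hypothetical alternative to the find | awk pipeline: print the name of
# every direct subfolder of the locale folder, one per line.
list_locales() {
    local locale_dir="$1" path
    for path in "$locale_dir"/*/; do
        [ -d "$path" ] || continue      # the glob matched nothing
        path="${path%/}"                # drop the trailing slash
        printf '%s\n' "${path##*/}"     # keep only the last path component
    done
}

# Demo in a throwaway directory
demo="$(mktemp -d)"
mkdir -p "$demo/Locale/en_US/LC_MESSAGES" "$demo/Locale/hr_HR/LC_MESSAGES"
touch "$demo/Locale/messages.pot"
list_locales "$demo/Locale"
rm -rf "$demo"
```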

Refresh pot script

#!/usr/bin/env bash

# update-pot.sh

source app/bin/i18n/config.sh

echo "Regenerating cache"
php app/bin/twigcache.php

echo "Running xgettext on the cached files"
xgettext -o ${LOCALE_FOLDER}/messages.pot --from-code=UTF-8 -n --omit-header /tmp/cache/*/*.php

for folder in $(find ${LOCALE_FOLDER} -maxdepth 1 -type d | awk -F/ '{print $NF}')
do
    if [ "${folder}" != ${LOCALE_FOLDER} ]; then
        if [[ -f ${LOCALE_FOLDER}/${folder}/LC_MESSAGES/messages.po ]]; then
            echo "Merging for ${folder}"
            msgmerge -U ${LOCALE_FOLDER}/${folder}/LC_MESSAGES/messages.po ${LOCALE_FOLDER}/messages.pot
        else
            echo "Initializing for ${folder}"
            msginit --locale=${folder} --output-file=${LOCALE_FOLDER}/${folder}/LC_MESSAGES/messages.po --input=${LOCALE_FOLDER}/messages.pot
        fi
    fi
done

This regenerates the view cache, unleashes xgettext on it, and writes the result to the .pot file, replacing any previous version. It then uses the refreshed .pot file to update the .po files. Notice it uses msginit if the language hasn’t been initialized yet, and msgmerge otherwise.

Recompile script

#!/usr/bin/env bash

# update-mo.sh

source app/bin/i18n/config.sh

for folder in $(find ${LOCALE_FOLDER} -maxdepth 1 -type d | awk -F/ '{print $NF}')
do
    if [ "${folder}" != ${LOCALE_FOLDER} ]; then
        echo "Compiling .mo for ${folder}"
        msgfmt -c -o ${LOCALE_FOLDER}/${folder}/LC_MESSAGES/messages.mo ${LOCALE_FOLDER}/${folder}/LC_MESSAGES/messages.po
    fi
done

The recompilation script is supposed to be run after edits to .po files have been made. It makes the edits ready for use, and allows the translations to appear on the site.
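With many languages, it may be worth recompiling only the catalogs whose .po file actually changed. The sketch below shows one way to decide that by comparing modification times. The function name is made up, and the demo uses temporary files with fixed timestamps rather than real catalogs.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the .mo file is missing or older than
# its .po source, i.e. when msgfmt needs to run again.
needs_compile() {
    local po="$1" mo="$2"
    [ ! -f "$mo" ] || [ "$po" -nt "$mo" ]
}

# Demo with temporary files and fixed timestamps
demo="$(mktemp -d)"
touch -t 202001010000 "$demo/messages.po"
needs_compile "$demo/messages.po" "$demo/messages.mo" && echo "no .mo yet: compile"
touch -t 202001020000 "$demo/messages.mo"
needs_compile "$demo/messages.po" "$demo/messages.mo" || echo ".mo is up to date: skip"
rm -rf "$demo"
```

Inside the loop of update-mo.sh, the msgfmt call could then be wrapped in an if needs_compile check.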

Deploying

Deployment of these language-specific upgrades will depend on the deployment approach applied to the app. We could be using Deployer, we could be using Forge, or something else entirely. Whatever the case, before we even try out our scripts above, we’ll need to make sure that:

  1. gettext is installed and activated
  2. the necessary locales have been generated on the OS

On Ubuntu, this is easily handled by making sure the following commands run at the end of the deployment process:

sudo apt-get install gettext
sudo app/bin/i18n/addlang.sh

The rest is automatic, seeing as .pot, .po and .mo files are meant to be committed alongside the application’s source code.

Note that you’ll need to modify both the installation command and the shell scripts above if you’re using something other than Ubuntu.
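Whatever the platform, a quick pre-flight check at deploy time can confirm that the command line tools the scripts rely on are present. This is a sketch with a hypothetical function name; it only looks for executables on the PATH and says nothing about whether PHP's own gettext extension is enabled.

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check: print every command from the argument list
# that cannot be found on the PATH. No output means everything is present.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools xgettext msginit msgmerge msgfmt locale-gen)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing tools:" $missing
else
    echo "All i18n tools found"
fi
```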

Conclusion

In this tutorial, we looked at adding internationalization features to an existing application powered by Twig. We demonstrated the use of gettext on a mock no-Twig file, made sure everything worked, and then went through a step by step integration with Twig. Finally, we wrote some shortcut scripts that can help tremendously when sharing the project or deploying it to production.

Do you use gettext? Or do your apps take a different approach? Let us know in the comments!

  • http://www.alejandrocelaya.com/ Alejandro Celaya Alastrué

    Very complete article!
    I have tried to use this approach several times, but it has a big problem in my opinion.
    When twig templates get compiled so that gettext is able to extract strings from it, the pot file has this aspect:

    #: /tmp/cache/d0/d006e63c5a4c4e6a700d9273d4523dd0cf419105fa4b00cf6b89918c67df4b2b.php:56
    msgid "This is translatable"
    msgstr "Yes, this is totally translatable"

    The problem with this is that the generated file has a “random” name. If any other developer in the team has also edited any translation and regenerated the pot file, absolutely all the paths change, even if he has changed just one translation and I have changed another one.
    When dealing with VCS systems like GIT, that produces huge conflicts that are hard to deal with.
    I have not found a solution to this problem so far.

    What I’ve seen is that paths don’t usually change on the same machine. I’m not sure what Twig’s strategy for generating cached file names is.

    • http://www.bitfalls.com/ Bruno Skvorc

      Good question – one approach is to run `xgettext` with the `--no-location` flag, which will remove these location comment strings altogether, thus helping VCS.

      • http://www.alejandrocelaya.com/ Alejandro Celaya Alastrué

        I didn’t know that was even possible, and yet it seems very logical. I was approaching it the wrong way, trying to make Twig always generate the same name, but that isn’t necessary once location comments are disabled.
        Thank you, that completely solves the problem :-)

  • http://wesam.ly Wesam Alalem

    Working perfectly. thank you.
