Multi language website word storage


#1

Hi

I am looking at adding multi language capability to my website. The example I am looking at uses a MySQL database to store and cross reference the words on the webpage.

My stack is a LEMP stack.

I understand, however, that querying a database is “expensive” in terms of memory and other server resources.

I am wondering, therefore, if it would be better to store the language words in some form of XML file and use JavaScript (like Ajax) to pull the data in?

Or is that overkill and querying the database for each page is fine?

Thoughts?
Thanks!
Karen


#2

I'm curious to know why not just add Google Translate? My old host limited my databases and content so I had to cut corners where I could.

https://translate.google.com/manager/website/?hl=yi


#3

Yes, and there's also Microsoft Translator


#4

I guess it depends on whether the OP needs a good translation or if a rough translation will do. It probably depends on the language too. I have yet to find a half-decent automatic translation for Italian.


#5

There are several options available for translations. The most common formats are PO files and XLIFF.

A database could work too, depending on how many translations there are. In my opinion it would certainly be better to fetch all translations from the database once and keep re-using them, instead of running a separate query each time you need a translation, which results in a lot more queries.
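A minimal sketch of the "fetch once, re-use" idea, in case it helps. The table and column names (translations, lang, msg_key, msg_text) and the function name loadWords() are made up for illustration, and an in-memory SQLite database stands in for MySQL only so the example runs on its own:

```php
<?php
// Sketch: load every string for one language with a single query.
// Assumed (hypothetical) schema: translations(lang, msg_key, msg_text).

function loadWords(PDO $pdo, string $lang): array
{
    $stmt = $pdo->prepare(
        'SELECT msg_key, msg_text FROM translations WHERE lang = :lang'
    );
    $stmt->execute([':lang' => $lang]);

    // FETCH_KEY_PAIR turns the two selected columns into key => value pairs.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Demo with an in-memory SQLite database so the sketch is self-contained;
// a real site would pass a MySQL DSN to PDO instead.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE translations (lang TEXT, msg_key TEXT, msg_text TEXT)');
$pdo->exec("INSERT INTO translations VALUES
    ('en', 'log_on', 'Log on'), ('en', 'last_name', 'Last name'),
    ('fr', 'log_on', 'Connexion'), ('fr', 'last_name', 'Nom')");

// One query per page load, then plain array lookups everywhere else.
$words = loadWords($pdo, 'fr');
echo $words['log_on'], "\n";
```

After that single call, every label on the page is an array lookup rather than another query.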


#6

Yes I am generally not happy with translation engines. The quality is subpar.

As for the database, a single query at the start is OK. I can launch with that.

But how do I make the result readily available across all pages? I would rather not use sessions. (I am just learning OOP and am mainly procedural.)

So would I query once, dump the result into local files, and then read from those throughout the session?

My thought would be to load the string data as an associative array (as returned by a MySQL fetch) in an include file pulled in by header.html and populated by a SELECT query. I would then echo values throughout the site like $words['log_on'], placed in between the HTML code, where each value is in the correct language.

Otherwise, if there is a best-practices approach I would love to hear it.

Thanks!
Karen
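One way the "query once, dump to a local file" idea could look, as a rough sketch only. The function names and the cache file location are assumptions, and the sample array stands in for whatever the database query returns:

```php
<?php
// Sketch: write the words for one language to a PHP file once, then include
// that file on every page instead of querying the database again.

function saveWordsCache(string $file, array $words): void
{
    // var_export() produces valid PHP source for the array.
    $code = "<?php\nreturn " . var_export($words, true) . ";\n";
    file_put_contents($file, $code, LOCK_EX);
}

function loadWordsCache(string $file): ?array
{
    if (!is_file($file)) {
        return null; // caller should fall back to the database
    }
    // "include" evaluates to whatever the included file returns.
    return include $file;
}

// Hypothetical cache location and sample data.
$cacheFile = sys_get_temp_dir() . '/words_fr.php';
saveWordsCache($cacheFile, ['log_on' => 'Connexion', 'go' => 'Aller']);

$words = loadWordsCache($cacheFile);
echo '<button>' . htmlspecialchars($words['log_on']) . '</button>', "\n";
```

No session is needed with this approach: each page includes the cached file for the chosen language, and the file only has to be regenerated when the translations change.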


#7

Because this topic is in the Database category I was thinking of database solutions, e.g. the user enters post content, the content is stored in a table, a third party translates the text, the translation is stored in a table, and the visitor is served one or another of the translations.

If what is to be translated is a set of relatively stable text strings, AFAIK the typical approach is to have multiple files, e.g. WordPress uses Gettext and Discourse uses Transifex.

https://codex.wordpress.org/I18n_for_WordPress_Developers


#8

It is quite common for sites to make a query (or a few) for each page, though I think there are other ways.

One thing I have come across (but not really actively worked with) is MVCL. That is like MVC (Model, View, Controller) with the additional "L" being Language.
So you have a folder for language, then a sub-folder for each language: english, french, german, etc.
Those contain include files for various pages/page elements, which are just associative arrays of words and phrases (such as "Log In") in the given language, where the array keys are common to all languages and the values are translated.
So I guess the include path will contain a variable to set the language and the view will print out the values.

I've never done a multi lingual site, so this may not be the best way, I don't know, it's just something I have seen.
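A small sketch of the "same keys, translated values" idea described above. The arrays are inline here only to keep the example self-contained (in the folder layout each one would sit in its own include file), and the function name wordsFor() and the fallback to a default language are assumptions rather than part of any particular framework:

```php
<?php
// Sketch: each language is an array with the same keys; keys missing from a
// translation fall back to the default language.

$languages = [
    'english' => ['log_in' => 'Log In', 'search' => 'Search', 'go' => 'Go'],
    'french'  => ['log_in' => 'Connexion', 'search' => 'Rechercher'],
];

function wordsFor(array $languages, string $language, string $default = 'english'): array
{
    $chosen = $languages[$language] ?? [];
    // array_replace() keeps the default's keys and overwrites translated ones.
    return array_replace($languages[$default], $chosen);
}

$words = wordsFor($languages, 'french');
echo $words['log_in'], ' / ', $words['go'], "\n";
```

Because 'go' has no French entry in this sample, the view still gets the English value instead of an undefined index.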


#9

It's not really clear what the OP really wants.

Why do you want to use a database? What kind of content do you even want to translate? Is it static content, small sentences like Log In etc, or entire articles?


#10

Hi, sorry if I was not specific enough.

I need to translate the relatively static content that the website presents. For example: login, first name, last name, go, etc.

These will be buttons, field names, titles, navigation titles etc.

I will NOT be translating what the users submit into the website.

So, static content is the best description, consisting mainly of single or double words and the occasional sentence.

So, I just wanted to populate the website with the correct language's static words/sentences when the user chooses their language. The idea of the database was to store and cross reference the correct language word, retrieve it all when the user makes their language choice, and print the correct words in between the HTML tags.

My query was whether this was standard practice or even an optimal approach, given my understanding that querying a database via PHP on a LEMP stack is relatively resource intensive.
Thanks!
Karen
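For the "user chooses their language" step, a possible sketch that avoids sessions by reading the choice from the query string or a cookie. The parameter name 'lang', the language codes and the function name chooseLanguage() are all assumptions for illustration:

```php
<?php
// Sketch: decide which language to use for a request.

function chooseLanguage(array $supported, array $query, array $cookies, string $default): string
{
    // An explicit choice in the URL wins over a remembered cookie value.
    foreach ([$query['lang'] ?? null, $cookies['lang'] ?? null] as $candidate) {
        // Only accept values from the fixed list, never raw user input.
        if (is_string($candidate) && in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    return $default;
}

$supported = ['en', 'fr', 'it'];

// On a real page the call would pass $_GET and $_COOKIE, followed by
// setcookie() to remember the choice for later requests.
$language = chooseLanguage($supported, ['lang' => 'it'], [], 'en');
echo $language, "\n";
```

The returned code can then be used to pick which set of words to load for the page.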


#11

You can see an example of what Discourse uses
https://github.com/discourse/discourse/tree/master/config/locales


#12

Hi

The code is a bit outside my realm of knowledge, but I think it appears to be just a form of array / JSON that is cross referenced and pulled into other files for display.

So they do static files and not database queries.


#13

As for Discourse, I know it's YAML format and that it's used when assets are compiled. I know that pages poll for JSON.

What I don't know for certain is how text from the token: value pairs gets from the yml file to the browser. Because they are compiled, I'm guessing they're put into a JSON object in memory for Ember to use. I could be wrong and the token: value pairs could be entered into a table, with the table used to get the JSON, but I don't think so.

What I do know is saved in a table is the plugin's third-party translations of post content, so they can be reused without involving the third party again for the same content.


#14

That may be in a way similar to the MVCL I have seen.
This is taken from an Opencart site which is PHP based:-

<?php
// Text
$_['text_home']          = 'Home';
$_['text_wishlist']      = 'Wish List (%s)';
$_['text_shopping_cart'] = 'Shopping Cart';
$_['text_category']      = 'Categories';
$_['text_account']       = 'My Account';
$_['text_register']      = 'Register';
$_['text_login']         = 'Login';
$_['text_order']         = 'Order History';
$_['text_transaction']   = 'Transactions';
$_['text_download']      = 'Downloads';
$_['text_logout']        = 'Logout';
$_['text_checkout']      = 'Checkout';
$_['text_search']        = 'Search';
$_['text_all']           = 'Show All';

This file is:-
httpdocs/catalog/language/en-gb/common/header.php

So I think the language include path may be defined with a variable in it, perhaps like:-

$languagePath = "httpdocs/catalog/language/$language/" ;

Where $language is set to whatever the user's language is.
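A hedged sketch of how such a path variable might be used safely. This is not OpenCart's actual loader; the function name loadLanguageFile(), the allowed-language list and the throwaway demo folder are assumptions. The main point is checking $language against a fixed list before it goes into an include path:

```php
<?php
// Sketch: load a language file that fills an array named $_, in the style
// of the header.php file shown above.

function loadLanguageFile(string $baseDir, string $language, string $file, array $allowed): array
{
    // Whitelist check so user input can never steer the include path.
    if (!in_array($language, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported language: $language");
    }

    $path = "$baseDir/$language/$file.php";
    if (!is_file($path)) {
        throw new RuntimeException("Missing language file: $path");
    }

    $_ = [];
    include $path; // the included file adds entries to $_ in this scope
    return $_;
}

// Demo: build a throwaway language folder in the temp directory.
$baseDir = sys_get_temp_dir() . '/language_demo';
if (!is_dir("$baseDir/en-gb/common")) {
    mkdir("$baseDir/en-gb/common", 0777, true);
}
file_put_contents(
    "$baseDir/en-gb/common/header.php",
    "<?php\n\$_['text_home'] = 'Home';\n\$_['text_login'] = 'Login';\n"
);

$text = loadLanguageFile($baseDir, 'en-gb', 'common/header', ['en-gb', 'fr-fr']);
echo $text['text_login'], "\n";
```

Here $file is expected to come from the site's own code, not from the visitor; only $language is treated as untrusted.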


#15

Seeing that reminds me of the phpBB I worked with several years ago (writing mods for ver 2)

There is a language folder that has a lang_english folder. That folder has a few PHP "lang_" files. i.e. admin, bbcode, faq, main. Each with lengthy arrays eg.

//
// Index page
//
$lang['Index'] = 'Index';
$lang['No_Posts'] = 'No Posts';
$lang['No_forums'] = 'This board has no forums';

$lang['Private_Message'] = 'Private Message';
$lang['Private_Messages'] = 'Private Messages';
$lang['Who_is_Online'] = 'Who is Online';

$lang['Mark_all_forums'] = 'Mark all forums read';
$lang['Forums_marked_read'] = 'All forums have been marked read';

Then in the template files

$message = $lang['Forums_marked_read'] 
 . '<br /><br />' 
 . sprintf($lang['Click_return_index'], '<a href="' 
 . append_sid("index.$phpEx") 
 . '">', '</a> ');

The important thing is that no text string that is to have translations can be hard-coded. Each one needs to be a variable, a function call, or something the templating engine will replace with the translated string.

For example

phpBB
<span>' . $lang['sometext'] . '</span>

WordPress (the second argument is the text domain, not the locale)
<span>' . __('sometext', 'my-text-domain') . '</span>

Handlebars
<span>{{en.sometext}}</span>
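For a plain PHP site without a framework, the same idea could be wrapped in a small helper. This is only a sketch: the name t() is made up, and escaping the output with htmlspecialchars() is an added assumption that suits plain-text labels (it would not suit strings that deliberately inject HTML through sprintf, as the phpBB example does):

```php
<?php
// Sketch: one helper that looks up a key, fills sprintf-style placeholders
// and escapes the result for HTML.

function t(array $words, string $key, ...$args): string
{
    // Fall back to the key itself so a missing translation is easy to spot.
    $text = $words[$key] ?? $key;
    if ($args !== []) {
        $text = vsprintf($text, $args);
    }
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$words = [
    'text_login'    => 'Login',
    'text_wishlist' => 'Wish List (%s)',
];

echo '<span>' . t($words, 'text_login') . '</span>', "\n";
echo '<span>' . t($words, 'text_wishlist', 3) . '</span>', "\n";
```

Templates then only ever mention keys, so switching language means swapping the $words array and nothing else.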


#16

Thanks, yes, I see that elsewhere. It is simple but it works. That looks like a good option. Tx!