PHP - How to Translate a Website into Multiple Languages

Best practice multi language website

Topic's premise

There are three distinct aspects in a multilingual site:

  • interface translation
  • content
  • url routing

While they are all interconnected in different ways, from a CMS point of view they are managed using different UI elements and stored differently. You seem to be confident in your implementation and understanding of the first two. The question was about the last aspect: "URL Translation? Should we do this or not? and in what way?"

What can the URL be made of?

One very important thing: don't get fancy with IDN. Instead, favor transliteration (also: transcription and romanization). While at first glance IDN seems a viable option for international URLs, it does not actually work as advertised, for two reasons:

  • some browsers will turn non-ASCII characters like 'ч' or 'ž' into '%D1%87' and '%C5%BE'
  • if the user has custom themes, the theme's font is very likely not to have glyphs for those letters

I actually tried the IDN approach a few years ago in a Yii-based project (horrible framework, IMHO). I ran into both of the above-mentioned problems before scrapping that solution. Also, I suspect that it might be an attack vector.
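To make the transliteration idea concrete, here is a minimal sketch of a slug helper. The function name slugify and the mapping table are assumptions for illustration only; the table is deliberately partial and covers lowercase letters only, so a real project would need a complete table for every alphabet it supports.

```php
<?php
// Partial, illustrative table (lowercase only). Any letter that is missing
// here ends up as a "-" separator in the result.
const TRANSLITERATION = [
    'а' => 'a', 'б' => 'b', 'в' => 'v', 'г' => 'g', 'и' => 'i',
    'к' => 'k', 'л' => 'l', 'н' => 'n', 'о' => 'o', 'ч' => 'ch',
    'ž' => 'z', 'š' => 's', 'č' => 'c',
];

function slugify(string $text): string
{
    // Replace each known non-ASCII letter with its ASCII counterpart
    $ascii = strtolower(strtr($text, TRANSLITERATION));

    // Collapse everything that is not a-z or 0-9 into a single dash
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);

    return trim((string) $slug, '-');
}
```

With this table, slugify('новинка') gives 'novinka', which is the kind of segment used in the URL examples further down.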

Available options ... as I see them.

Basically, you have two choices, which can be abstracted as:

  • http://site.tld/[:query]: where [:query] determines both language and content choice

  • http://site.tld/[:language]/[:query]: where [:language] part of URL defines the choice of language and [:query] is used only to identify the content

Query is Α and Ω ..

Let's say you pick http://site.tld/[:query].

In that case, you have one primary source for the language, the content of the [:query] segment, and two additional sources:

  • the value of $_COOKIE['lang'] for that particular browser
  • the list of languages in the HTTP Accept-Language header
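For the header, a minimal sketch of turning Accept-Language into an ordered list of candidate languages might look like this. The function name parseAcceptLanguage is made up for illustration, region subtags such as "-RU" are simply dropped, and in a real request the input would come from $_SERVER['HTTP_ACCEPT_LANGUAGE'].

```php
<?php
function parseAcceptLanguage(string $header): array
{
    $weights = [];

    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }

        // The weight defaults to 1.0 when no "q=" parameter is given
        $q = 1.0;
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (strncasecmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }

        // Keep only the primary subtag ("ru-RU" becomes "ru") and its best weight
        $primary = explode('-', $tag)[0];
        if (!isset($weights[$primary]) || $q > $weights[$primary]) {
            $weights[$primary] = $q;
        }
    }

    // Highest weight first
    arsort($weights);

    return array_keys($weights);
}
```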

First, you need to match the query to one of the defined routing patterns (if your pick is Laravel, see its routing documentation). On a successful match, you then need to find the language.

You would have to go through all the segments of the pattern, find the potential translations for each of them, and determine which language was used. The two additional sources (cookie and header) would be used to resolve routing conflicts, when (not "if") they arise.

Take for example: http://site.tld/blog/novinka.

That's a transliteration of "блог, новинка", which in English means approximately "blog", "latest".

As you may have already noticed, the Russian "блог" is transliterated as "blog". This means that for the first part of [:query] you will (in the best-case scenario) end up with a list of possible languages: ['en', 'ru']. Then you take the next segment, "novinka". That might have only one language on the list of possibilities: ['ru'].

When the list has one item, you have successfully found the language.

But if you end up with 2 or more possibilities (example: Russian and Ukrainian), or with 0 possibilities, as the case may be, you will have to use the cookie and/or the header to find the correct option.

And if all else fails, you pick the site's default language.
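The narrowing process described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the $segmentLanguages lookup table would really come from your translation storage, and $hints is expected to hold already validated language codes, with the cookie value first and the Accept-Language entries after it.

```php
<?php
// Hypothetical lookup: URL segment => languages in which that segment is valid
$segmentLanguages = [
    'blog'    => ['en', 'ru'],
    'latest'  => ['en'],
    'novinka' => ['ru'],
];

function detectLanguage(array $segments, array $segmentLanguages, array $hints, string $default): string
{
    $candidates = null;

    // Intersect the possible languages of every segment
    foreach ($segments as $segment) {
        $languages = $segmentLanguages[$segment] ?? [];
        $candidates = $candidates === null
            ? $languages
            : array_values(array_intersect($candidates, $languages));
    }
    $candidates = $candidates ?? [];

    // Exactly one language left: the URL alone decided it
    if (count($candidates) === 1) {
        return $candidates[0];
    }

    // Zero or several candidates: let cookie and header break the tie
    foreach ($hints as $hint) {
        if ($candidates === [] || in_array($hint, $candidates, true)) {
            return $hint;
        }
    }

    // If all else fails, use the site's default language
    return $default;
}
```

For ['blog', 'novinka'] the intersection is ['ru'], so no hint is needed. For ['blog'] alone the result depends on the cookie and header.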

Language as parameter

The alternative is to use a URL that can be defined as http://site.tld/[:language]/[:query]. In this case, when translating the query, you do not need to guess the language, because at that point you already know which one to use.

There is also a secondary source of language: the cookie value. But here there is no point in messing with the Accept-Language header, because you are not dealing with an unknown number of possible languages in the case of a "cold start" (when a user opens the site for the first time with a custom query).

Instead you have 3 simple, prioritized options:

  1. if [:language] segment is set, use it
  2. if $_COOKIE['lang'] is set, use it
  3. use default language

When you have the language, you simply attempt to translate the query, and if translation fails, use the "default value" for that particular segment (based on routing results).
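The three prioritized options could be sketched like this. The function name resolveLanguage and the $supported list are assumptions for illustration; the cookie array would normally be $_COOKIE.

```php
<?php
function resolveLanguage(string $path, array $cookies, array $supported, string $default): array
{
    // Split "/ru/blog/novinka" into ['ru', 'blog', 'novinka'], dropping empty parts
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    if ($segments !== [] && in_array($segments[0], $supported, true)) {
        // 1. the [:language] segment is set: use it and remove it from the query
        $language = array_shift($segments);
    } elseif (isset($cookies['lang']) && in_array($cookies['lang'], $supported, true)) {
        // 2. otherwise fall back to the cookie, if it holds a supported language
        $language = $cookies['lang'];
    } else {
        // 3. otherwise use the default language
        $language = $default;
    }

    // The remaining segments are the [:query] part, still to be translated
    return [$language, $segments];
}
```

Checking both sources against the list of supported languages keeps arbitrary user input out of the rest of the routing code.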

Isn't there a third option?

Yes, technically you can combine both approaches, but that would complicate the process and only accommodate people who want to manually change the URL from http://site.tld/en/news to http://site.tld/de/news and expect the news page to change to German.

But even this case could probably be mitigated using the cookie value (which would contain information about the previous choice of language), implemented with less magic and hope.

Which approach to use?

As you might have already guessed, I would recommend http://site.tld/[:language]/[:query] as the more sensible option.

Also, in a real-world situation you would have a third major part in the URL: the "title". As in the name of a product in an online shop or the headline of an article on a news site.

Example: http://site.tld/en/news/article/121415/EU-as-global-reserve-currency

In this case '/news/article/121415' would be the query, and 'EU-as-global-reserve-currency' is the title, which is there purely for SEO purposes.
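As a rough sketch of how such a URL could be taken apart, assuming a single hard-coded pattern for news articles (a real router would hold many patterns), the title can be captured but left out of the lookup entirely. The function name parseArticleUrl is hypothetical.

```php
<?php
function parseArticleUrl(string $path): ?array
{
    // language / query (with numeric id) / optional title
    $pattern = '#^/(?<language>[a-z]{2})/(?<query>news/article/(?<id>\d+))(?:/(?<title>[^/]+))?$#';

    if (!preg_match($pattern, $path, $matches)) {
        return null;
    }

    return [
        'language' => $matches['language'],
        'query'    => $matches['query'],
        'id'       => (int) $matches['id'],
        // The title is optional and never used to find the content
        'title'    => $matches['title'] ?? null,
    ];
}
```

Because only the id identifies the article, a URL with a missing or outdated title still resolves to the same content.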

Can it be done in Laravel?

Kinda, but not by default.

I am not too familiar with it, but from what I have seen, Laravel uses a simple pattern-based routing mechanism. To implement multilingual URLs you will probably have to extend core class(es), because multilingual routing needs access to different forms of storage (database, cache and/or configuration files).

It's routed. What now?

As a result of all this, you would end up with two valuable pieces of information: the current language and the translated segments of the query. These values can then be used to dispatch to the class(es) which will produce the result.

Basically, the following URL: http://site.tld/ru/blog/novinka (or the version without '/ru') gets turned into something like

$parameters = [
    'language'  => 'ru',
    'classname' => 'blog',
    'method'    => 'latest',
];

Which you just use for dispatching:

// "new" needs the class name in a plain variable
$class = $parameters['classname'];
$instance = new $class();
$instance->{'get' . $parameters['method']}($parameters);

.. or some variation of it, depending on the particular implementation.
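One such variation, as a self-contained sketch: the Blog class, the dispatch function and the $allowedClasses map are all hypothetical names. The map is an addition of mine, so that values taken from the URL can never instantiate an arbitrary class.

```php
<?php
final class Blog
{
    public function getLatest(array $parameters): string
    {
        return "latest posts ({$parameters['language']})";
    }
}

function dispatch(array $parameters, array $allowedClasses): string
{
    // Only classes listed in the map may be instantiated
    $class = $allowedClasses[$parameters['classname']] ?? null;
    if ($class === null) {
        throw new RuntimeException('Unknown class: ' . $parameters['classname']);
    }

    $instance = new $class();
    $method = 'get' . ucfirst($parameters['method']);

    if (!is_callable([$instance, $method])) {
        throw new RuntimeException('Unknown method: ' . $method);
    }

    return $instance->{$method}($parameters);
}

$result = dispatch(
    ['language' => 'ru', 'classname' => 'blog', 'method' => 'latest'],
    ['blog' => Blog::class]
);
```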

How can I set up my PHP website as a multilingual site?

The gettext extension in PHP is a great way to work with multiple languages.

First the difference between .po, .mo and .pot:

.POT Portable Object Template. This is the file that you get when you extract texts from the application. Normally, you send this file to your translators.

.PO Portable Object. This is the file that you receive back from the translators. It’s a text file that includes the original texts and the translations.

.MO Machine Object. The MO file includes exactly the same contents as the PO file. The two files differ in their format: a PO file is a text file that is easy for humans to read, while an MO file is compiled and easy for computers to read. Your web server will use the MO file to display the translations.


Usage in PHP

<?php
// Set language to German
putenv('LC_ALL=de_DE');
setlocale(LC_ALL, 'de_DE');

// Specify location of translation tables
bindtextdomain("myPHPApp", "./locale");

// Choose domain
textdomain("myPHPApp");

// Translations are now looked up in ./locale/de_DE/LC_MESSAGES/myPHPApp.mo

// Print a test message
echo gettext("Welcome to My PHP Application");

// Or use the alias _() for gettext()
echo _("Have a nice day");
?>
  • See PHP Documentation
  • See GNU GetText Documentation

Tools

The tool I use is Poedit. It lets you merge new texts from a POT file into your existing translations and then produces the PO and MO files.

Easy way to translate a website

A little late for you, I suppose, but in case someone like me stumbles across this thread... I currently have the same problem you do.
Unfortunately, there doesn't appear to be a "non-cumbersome way" to do this with PHP. Everything seems to involve lots of function calls (if you have a lot of text).

Well... there is ONE convenient way. Not exactly safe though. Manipulating the output buffer before it's sent to the user:
=> http://dev-tips.com/featured/output-buffering-for-web-developers-a-beginners-guide

So, depending on the language chosen, you could just define an array filled with "from->to" data and replace all the readable text in your buffer by looping through it.

But of course... if you replace, for example, "send" (English) with "senden" (German) and you link to a "send.html", it would break that link.
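A small sketch that shows both the idea and the problem. The dictionary and the translateBuffer name are made up for illustration, and strtr() stands in for the replacement loop.

```php
<?php
// Hypothetical "from => to" dictionary for the chosen language
$dictionary = ['send' => 'senden', 'Welcome' => 'Willkommen'];

function translateBuffer(string $buffer, array $dictionary): string
{
    // strtr() replaces every occurrence of every key, markup included
    return strtr($buffer, $dictionary);
}

// In a real page this would be registered as the output callback:
// ob_start(function (string $buffer) use ($dictionary): string {
//     return translateBuffer($buffer, $dictionary);
// });

$html = '<a href="send.html">send</a>';
$translated = translateBuffer($html, $dictionary);
// $translated is '<a href="senden.html">senden</a>': the link target changed too
```

The visible text is translated, but so is the href attribute, which now points to a file that does not exist.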

So if one has to translate not only long, definitely unique strings but also shorter ones, one would have to manipulate only the text that is readable to the user. There is a solution for that too - however, that is JavaScript based:
=> http://www.isogenicengine.com/documentation/jquery-multi-language-site-plugin/

Web translating - single or separate files for different languages?

As with the database approach, you can use a static file for each language and its translations.

en.php

<?php
return [
    "somekey" => "English Translation"
];

lt.php

<?php
return [
    "somekey" => "Lithuanian Translation"
];

You can then use mod_rewrite to get the language from the URL if you want a directory structure, or use a simple query parameter or cookies (as suggested by others). If you are using a RESTful service, it is also possible to set the language in an HTTP header. Many frameworks can also parse such data from the URL out of the box.

$langCode is the language code fetched from the query parameter, URL path, header or cookie.

You can also use file_exists (http://php.net/file_exists) to check whether a translation file is available before you use require to pull in the translation resource.

Once you have the language code, you can just use

// require rather than require_once: on a repeated include require_once returns true, not the array
$stringResource = require "lang/{$langCode}.php";

Then you can fetch every resource by its key from $stringResource.

<?php $stringResource = require "lang/{$langCode}.php"; ?>

<input type="text" id="first-name" placeholder="" required
    data-validation="length alphanumeric"
    data-validation-length="3-12"
    data-validation-error-msg="<?php echo htmlspecialchars($stringResource['somekey'], ENT_QUOTES); ?>"/>
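Since $langCode usually comes from user input, it is worth validating it before it becomes part of a file path. A minimal sketch, assuming two-letter language codes and a fallback file that always exists (loadTranslations is a hypothetical helper, not part of any library):

```php
<?php
function loadTranslations(string $langCode, string $directory, string $fallback = 'en'): array
{
    // Accept only plain two-letter codes, so input like "../secret" never reaches the filesystem
    if (!preg_match('/^[a-z]{2}$/', $langCode)) {
        $langCode = $fallback;
    }

    $file = "{$directory}/{$langCode}.php";

    // Unknown language: use the fallback file (assumed to exist)
    if (!file_exists($file)) {
        $file = "{$directory}/{$fallback}.php";
    }

    // Plain require returns the array on every call; require_once would not
    return require $file;
}
```

A call such as loadTranslations($_GET['lang'] ?? 'en', __DIR__ . '/lang') would then always return an array, whatever the visitor puts in the URL.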

You can just edit the translations in an editor. There is no need to connect to a database, and as it is just an associative array, it will be much faster.

Query:

www.mysite.com/jobs.php?lang=en
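As already mentioned, this is the weakest option in terms of SEO. Crawlers do not literally ignore query parameters, but Google's guidelines (linked at the end of this answer) advise against using them to separate language versions.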

URL Path

www.mysite.com/en/jobs.php

Here you need mod_rewrite (http://httpd.apache.org/docs/2.0/misc/rewriteguide.html), which basically catches the URL, extracts the en part and rewrites the request to something like www.mysite.com/jobs.php?lang=en

In both cases the language can be read from $_GET['lang'], but the URL path has an SEO benefit.

Cookies

Will not be shown in the address bar. But that also means that if the link is shared with another user, they will not see the page in the original language; they will see the default language.

As per the Google documentation (https://support.google.com/webmasters/answer/182192?hl=en#1), I believe the URL path is the nicer way to do it.

Make translations in multiple languages

I know the pains of using gettext, but its performance is what keeps me with it!

In your case, you might want to look at this little project; I'm pretty sure it will help you.

It simply uses .ini files with translations; you can freely switch between files and echo the same word in different languages.

https://github.com/Philipp15b/php-i18n
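To illustrate the .ini idea itself, here is a sketch that uses only PHP's built-in parse_ini_string(). It is not the API of the linked project, and the file contents and the parseTranslations name are invented for the example.

```php
<?php
// Hypothetical contents of a lang/de.ini file, inlined to keep the example self-contained
$germanIni = <<<'INI'
greeting = "Guten Tag"

[form]
send = "Senden"
INI;

function parseTranslations(string $ini): array
{
    // Passing true as the second argument keeps [sections] as nested arrays
    $parsed = parse_ini_string($ini, true);

    return $parsed === false ? [] : $parsed;
}

$de = parseTranslations($germanIni);
// $de['greeting'] and $de['form']['send'] now hold the German strings
```

Switching language then means parsing a different file, for example with parse_ini_file(), and reading the same keys.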


