How to Put a Translation System in PHP Website

What is the best way to put a translation system in php website?

There are some factors you should consider.

Will the website be updated frequenytly? if yes, by whom? you or the owner? how much data / information are you dealing with? and also... are you doing this frequently (for many clients) ?

I can hardly think that using a relational database can couse any serious speed impacts unless you are having VERY high traffic (several hundreds of thousands of pageviews per day).

Should you be doing this frequently (for lots of clients) think no further: build up a CMS (or use an existing one). If you really need to consider speed impact, you can customize it so that when you are done with the website you can export static HTML pages where possible.

If you are updating frequently, the same as above applies.
If the client has to update (and not you), again, you need a CMS.
If you are dealing with lots of infomration (big and lots of articles), you need a CMS.

All in all, a CMS will help you build up your website structure fast, add content fast and not worry that much about code since it will be reusable.

Now, if you just need to create a small website fast, you can easily do this with hardcoded arrays and datafiles.

Most efficient approach for multilingual PHP website

A few considerations:

1. Translations
Who will be doing the translations? People that are also connected to the site? A translation agency? When using Gettext you'll be working with 'pot' (.po) files. These files contain the message ID and the message string (the translation). Example:

msgid "A string to be translated would go here"  
msgstr ""

Now, this looks just fine and understandable for anyone who needs to translate this. But what happens when you use keywords, like Mike suggests, instead of full sentences? If someone needs to translate a msgid called "address_home", he or she has no clue if this is should be a header "Home address" or that it's a full sentence. In this case, make sure to add comments to the file right before you call on the gettext function, like so:

/// This is a comment that will be included in the pot file for the translators
gettext("ready_for_lost_episode");

Using xgettext --add-comments=/// when creating the .po files will add these comments. However, I don't think Gettext is ment to be used this way. Also, if you need to add comments with every text you want to display you'll a) probably make an error at some point, b) you're whole script will be filled with the texts anyway, only in comment form, c) the comments needs to be placed directly above the Gettext function, which isn't always convient, depending on the position of the function in your code.

2. Maintenance
Once your site grows (even further) and your language files along with it, it might get pretty hard to maintain all the different translations this way. Every time you add a text, you need to create new files, send out the files to translators, receive the files back, make sure the structure is still intact (eager translators are always happy to translate the syntax as well, making the whole file unusable :)), and finish with importing the new translations. It's doable, sure, but be aware with possible problems on this end with large sites and many different languages.



Another option: combine your 2nd and 3rd alternative:

Personally, I find it more useful to manage the translation using a (simple) CMS, keeping the variables and translations in a database and export the relevent texts to language files yourself:

  1. add variables to the database (e.g.: id, page, variable);
  2. add translations to these variables (e.g.: id, varId, language, translation);
  3. select relevant variables and translations, write them to a file;
  4. include the relevant language file in your site;
  5. create your own function to display a variables text:

text('var'); or maybe something like __('faq','register','lost_password_text');

Point 3 can be as simple as selecting all the relevant variables and translations from the database, putting them in an array and writing the serlialized array to a file.

Advantages:

  1. Maintenance. Maintaining the texts can be a lot easier for big projects. You can group variables by page, sections or other parts within your site, by simply adding a column to your database that defines to which part of the site this variable belongs. That way you can quickly pull up a list of all the variables used in e.g. the FAQ page.

  2. Translating. You can display the variable with all the translations of all the different languages on a single page. This might be useful for people who can translate texts into multiple languages at the same time. And it might be useful to see other translations to get a feel for the context so that the translation is as good as possible. You can also query the database to find out what has been translated and what hasn't. Maybe add timestamps to keep track of possible outdated translations.

  3. Access. This depends on who will be translating. You can wrap the CMS with a simple login to grant access to people from a translation agency if need be, and only allow them to change certain languages or even certain parts of the site. If this isn't an option you can still output the data to a file that can be manually translated and import it later (although this might come with the same problems as mentioned before.). You can add one of the translations that's already there (English or another main language) as context for the translator.

All in all I think you'll find that you'll have a lot more control over the translations this way, especially in the long run. I can't tell you anything about speed or efficiency of this approach compared to the native gettext function. But, depending on the size of the language files, I don't think it'll be a big difference. If you group the variables by page or section, you can alway include only the required parts.

Language translation using PHP

AS per me you can try this method.This method is already implemented in our system and it is working properly.

Make php file of each language and define all the variables and use those variables in pages.

for e.g
For english

english.php

$hello="Hello";

persian.php

$hello=html_entity_decode(htmlentities("سلام"));

Now use this variable to page like this.

your_page.php

<label><?php echo $hello; ?></label>

You have load specific language file as per get language variable from URL.

It is better that you have define this language variable into config file.

config.php

if(isset($_GET['lang']) && $_GET['lang']=='persian')
{
require_once('persian.php');
}
else
{
require_once('english.php');
}

Easy way to translate a website

A little late for you, I suppose but in case someone like me stumbles across this thread... Because I currently have the same problem you do.
Unfortunately, there doesn't appear to be a "non-cumbersome way" to do this with PHP. Everything seems to involve lots of function-calls (if you have a lot of text).

Well... there is ONE convenient way. Not exactly safe though. Manipulating the output buffer before it's sent to the user:
=> http://dev-tips.com/featured/output-buffering-for-web-developers-a-beginners-guide

So you could depending on the language chosen just define an array filled with "from->to"-data and replace all the readable text in your buffer by looping through that.

But of course... if you e.g. replace "send" (English) with "senden" (German) and you link to a "send.html", it would break that link.

So if one has to translate not only long, definitely unique strings but also shorter ones, one would have to manipulate only the text that is readable to the user. There is a solution for that too - however, that is JavaScript based:
=> http://www.isogenicengine.com/documentation/jquery-multi-language-site-plugin/

Best practice multi language website

Topic's premise

There are three distinct aspects in a multilingual site:

  • interface translation
  • content
  • url routing

While they all interconnected in different ways, from CMS point of view they are managed using different UI elements and stored differently. You seem to be confident in your implementation and understanding of the first two. The question was about the latter aspect - "URL Translation? Should we do this or not? and in what way?"

What the URL can be made of?

A very important thing is, don't get fancy with IDN. Instead favor transliteration (also: transcription and romanization). While at first glance IDN seems viable option for international URLs, it actually does not work as advertised for two reasons:

  • some browsers will turn the non-ASCII chars like 'ч' or 'ž' into '%D1%87' and '%C5%BE'
  • if user has custom themes, the theme's font is very likely to not have symbols for those letters

I actually tried to IDN approach few years ago in a Yii based project (horrible framework, IMHO). I encountered both of the above mentioned problems before scraping that solution. Also, I suspect that it might be an attack vector.

Available options ... as I see them.

Basically you have two choices, that could be abstracted as:

  • http://site.tld/[:query]: where [:query] determines both language and content choice

  • http://site.tld/[:language]/[:query]: where [:language] part of URL defines the choice of language and [:query] is used only to identify the content

Query is Α and Ω ..

Lets say you pick http://site.tld/[:query].

In that case you have one primary source of language: the content of [:query] segment; and two additional sources:

  • value $_COOKIE['lang'] for that particular browser
  • list of languages in HTTP Accept-Language (1), (2) header

First, you need to match the query to one of defined routing patterns (if your pick is Laravel, then read here). On successful match of pattern you then need to find the language.

You would have to go through all the segments of the pattern. Find the potential translations for all of those segments and determine which language was used. The two additional sources (cookie and header) would be used to resolve routing conflicts, when (not "if") they arise.

Take for example: http://site.tld/blog/novinka.

That's transliteration of "блог, новинка", that in English means approximately "blog", "latest".

As you can already notice, in Russian "блог" will be transliterated as "blog". Which means that for the first part of [:query] you (in the best case scenario) will end up with ['en', 'ru'] list of possible languages. Then you take next segment - "novinka". That might have only one language on the list of possibilities: ['ru'].

When the list has one item, you have successfully found the language.

But if you end up with 2 (example: Russian and Ukrainian) or more possibilities .. or 0 possibilities, as a case might be. You will have to use cookie and/or header to find the correct option.

And if all else fails, you pick the site's default language.

Language as parameter

The alternative is to use URL, that can be defined as http://site.tld/[:language]/[:query]. In this case, when translating query, you do not need to guess the language, because at that point you already know which to use.

There is also a secondary source of language: the cookie value. But here there is no point in messing with Accept-Language header, because you are not dealing with unknown amount of possible languages in case of "cold start" (when user first time opens site with custom query).

Instead you have 3 simple, prioritized options:

  1. if [:language] segment is set, use it
  2. if $_COOKIE['lang'] is set, use it
  3. use default language

When you have the language, you simply attempt to translate the query, and if translation fails, use the "default value" for that particular segment (based on routing results).

Isn't here a third option?

Yes, technically you can combine both approaches, but that would complicate the process and only accommodate people who want to manually change URL of http://site.tld/en/news to http://site.tld/de/news and expect the news page to change to German.

But even this case could probable be mitigated using cookie value (which would contain information about previous choice of language), to implement with less magic and hope.

Which approach to use?

As you might already guessed, I would recommend http://site.tld/[:language]/[:query] as the more sensible option.

Also in real word situation you would have 3rd major part in URL: "title". As in name of the product in online shop or headline of article in news site.

Example: http://site.tld/en/news/article/121415/EU-as-global-reserve-currency

In this case '/news/article/121415' would be the query, and the 'EU-as-global-reserve-currency' is title. Purely for SEO purposes.

Can it be done in Laravel?

Kinda, but not by default.

I am not too familiar with it, but from what I have seen, Laravel uses simple pattern-based routing mechanism. To implement multilingual URLs you will probably have to extend core class(es), because multilingual routing need access to different forms of storage (database, cache and/or configuration files).

It's routed. What now?

As a result of all you would end up with two valuable pieces of information: current language and translated segments of query. These values then can be used to dispatch to the class(es) which will produce the result.

Basically, the following URL: http://site.tld/ru/blog/novinka (or the version without '/ru') gets turned into something like

$parameters = [
'language' => 'ru',
'classname' => 'blog',
'method' => 'latest',
];

Which you just use for dispatching:

$instance = new {$parameter['classname']};
$instance->{'get'.$parameters['method']}( $parameters );

.. or some variation of it, depending on the particular implementation.

translating a website in php or mysql?

You can use gettext, this function is proposed for this feature, not a "standard" but fast enough.

The second options in the use of a PHP file with a big array (really big, for each string), this is the most common solution.

To the database content (the big problem here, don't forget), if all your content must have the translation, one column for each language, otherwise use a flag of language for each line on database.



Related Topics



Leave a reply



Submit