PHP Localization Best Practices? Gettext

PHP Localization Best Practices? gettext?

You essentially asked and answered your own question. What is probably missing is a slightly better understanding of how PO files work.

Within the PO file, each entry has a msgid and a msgstr. The msgid is the string that appears in the PHP source; gettext replaces it with the matching msgstr for the active locale.

You can make those msgids anything you like. You could very well write:

<?php echo _("web.home.featured.HelloWorld"); ?>

You would then never touch this string in the source again; you only edit the text through the PO files.

So the answer to your question is that you use the gettext msgids as identifiers for what the string should say. Note, however, that translators typically use the default-language file's text as the basis for translation, not the identifier itself.
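To make the msgid/msgstr relationship concrete, here is a minimal sketch that reads simple one-line entries from PO text into a PHP array. parse_simple_po() is a hypothetical helper written for illustration, and the English msgstr is an assumed example value. Real catalogs are compiled to .mo files and read by the gettext extension; this sketch ignores plural forms, contexts, and multi-line strings.

```php
<?php
// Hypothetical helper for illustration only: reads one-line msgid/msgstr
// pairs from PO text. Plural forms, msgctxt, and multi-line strings are ignored.
function parse_simple_po(string $po): array
{
    $catalog = [];
    $currentId = null;
    foreach (preg_split('/\R/', $po) as $line) {
        if (preg_match('/^msgid\s+"(.*)"\s*$/', $line, $m)) {
            $currentId = stripcslashes($m[1]);
        } elseif ($currentId !== null && preg_match('/^msgstr\s+"(.*)"\s*$/', $line, $m)) {
            if ($currentId !== '') { // skip the header entry (empty msgid)
                $catalog[$currentId] = stripcslashes($m[1]);
            }
            $currentId = null;
        }
    }
    return $catalog;
}

// Assumed example catalog for the default language.
$po = <<<'PO'
msgid "web.home.featured.HelloWorld"
msgstr "Hello, world!"
PO;

$catalog = parse_simple_po($po);
$key = 'web.home.featured.HelloWorld';
echo $catalog[$key] ?? $key, "\n"; // falls back to the identifier if untranslated
```

The fallback on the last line mirrors what gettext itself does: when no translation is found, the msgid is returned unchanged, which is why identifier-style msgids need a catalog even for the default language.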

I hope this is clear.

Is gettext the best way to localise a website in php? (only mild localisation needed)

You can just use a lang_XX.php and include it in your application.

$lang = array(
"welcome" => "Welcome",
"bye" => "Bye"
);

For other languages, say lang_fr.php, just have something like:

$lang = array(
"welcome" => "Accueil",
"bye" => "Au revoir"
);

For a small use case this should be fine, and there is no need for .po files. You can also define a lookup function like this:

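function _($l)
{
    // Only valid when the gettext extension is not loaded, since it already defines _().
    global $lang; // the array from the included lang_XX.php
    return isset($lang[$l]) ? $lang[$l] : $l; // fall back to the key itself
}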
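One way to choose which lang_XX.php to include is to check the requested code against a fixed list first, so a request parameter never reaches include unchecked. This is only a sketch: lang_file_for(), the supported-language list, and the `lang` query parameter are assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: map a requested language code to a lang_XX.php file
// name, accepting only codes from a fixed list.
function lang_file_for(string $requested, array $supported, string $default = 'en'): string
{
    $code = strtolower(trim($requested));
    if (!in_array($code, $supported, true)) {
        $code = $default; // unknown or malicious input falls back to the default
    }
    return "lang_{$code}.php";
}

$file = lang_file_for($_GET['lang'] ?? '', ['en', 'fr']);
// include __DIR__ . '/' . $file; // would define $lang as shown above
echo $file, "\n";
```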

Gettext: Is it a good idea for the message ID to be the english text?

I use meaningful IDs such as "welcome_back_1", which would map to "welcome back, %1", and so on. I always keep English as my "base" language, so in the worst case, when a specific language doesn't have a message ID, I fall back on English.

I don't like using actual English phrases as message IDs because if the English changes, so does the ID. This might not affect you much if you use automated tools, but it bothers me. I also don't like simple codes (like msg3975) because they don't mean anything, so reading the code is more difficult unless you litter comments everywhere.
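The lookup order described here (active language first, then the English base) can be sketched as follows. msg() and both catalogs are hypothetical names and values made up for illustration; the %1 placeholder style follows the answer above.

```php
<?php
// Hypothetical helper: look up a message ID in the active language, fall back
// to the English base catalog, then fill %1, %2, ... from the arguments.
function msg(string $id, array $active, array $english, string ...$args): string
{
    $text = $active[$id] ?? $english[$id] ?? $id;
    $replacements = [];
    foreach ($args as $i => $value) {
        $replacements['%' . ($i + 1)] = $value;
    }
    return $replacements ? strtr($text, $replacements) : $text;
}

// Assumed example catalogs.
$english = ['welcome_back_1' => 'Welcome back, %1', 'bye' => 'Goodbye'];
$french  = ['bye' => 'Au revoir']; // welcome_back_1 is not translated yet

echo msg('welcome_back_1', $french, $english, 'Alice'), "\n";
echo msg('bye', $french, $english), "\n";
```

Because the ID never changes when the English wording does, editing the English catalog does not invalidate the other languages' entries.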

Variables in PHP gettext

Poedit recognizes the placeholders, so you can keep them inside the translated string:

msgid "Are you sure you want to block %s?"
msgstr "Sind Sie sicher, dass Sie %s blockieren?"

and in PHP

sprintf(_('Are you sure you want to block %s?'),'Alice');
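If a translation needs the arguments in a different order, sprintf() also accepts numbered placeholders such as %1$s. The sketch below uses a plain array in place of a real catalog so that it runs without the gettext extension; lookup(), the message, and the German wording are assumed examples, not taken from the answer above.

```php
<?php
// Stand-in for a compiled catalog so the example runs without gettext.
$catalog = [
    '%1$s blocked %2$s.' => '%2$s wurde von %1$s blockiert.',
];

// Hypothetical lookup helper; with gettext you would call _() instead.
function lookup(array $catalog, string $msgid): string
{
    return $catalog[$msgid] ?? $msgid;
}

// The translation swaps the order of the two names without any code change.
echo sprintf(lookup($catalog, '%1$s blocked %2$s.'), 'Bob', 'Alice'), "\n";
```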

Chinese localization not worked with PHP gettext extension as it works with English

Summary

I was able to make this work without changing the <meta charset="..."> value away from utf-8. You should also be able to remove the AddDefaultCharset rule from your .htaccess and the &charset=GBK from your RewriteRule. You need to make sure that your .po file is formatted and compiled correctly, and that the server can find it.

Explanation/Example

Setting the <meta charset="..."> tag only tells the browser what character encoding is being used on the page. PHP still needs to know which file to select to replace strings. And in any case, although this documentation suggests otherwise, I think you can still use UTF-8 to do Chinese localization. Here is a simple working example I set up on my system:

<?php
// initialize locale-related variables
$locale = !empty( $_GET['locale'] ) ? $_GET['locale'] : 'en_US'; // no notice when the parameter is absent
$domain = 'bridges';
$locale_dir = dirname( __FILE__ ) . '/locale'; // using absolute path!

// set up locale
putenv( "LC_ALL=$locale" );
setlocale( LC_ALL, $locale );
bindtextdomain( $domain, $locale_dir );
bind_textdomain_codeset( $domain, 'UTF-8' );
textdomain($domain);
?><!doctype html>
<html>
<head>
<meta charset="utf-8">
<title><?= _( 'Localization Test' ) ?></title>
</head>
<body>
<p><?= _( 'Hello' ) ?>!</p>
</body>
</html>

My .po file which is located at ./locale/zh_CN/LC_MESSAGES/bridges.po looks like:

msgid ""
msgstr ""
"Project-Id-Version: 1.0\n"
"PO-Revision-Date: 2015-07-20\n"
"Last-Translator: Morgan Benton\n"
"Language-Team: Chinese\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: zh_CN\n"

msgid "Localization Test"
msgstr "本土化试"

msgid "Hello"
msgstr "您好"

According to a comment on the gettext() documentation, you should put the character encoding and other relevant headers inside your .po file, e.g.

"Content-Type: text/plain; charset=UTF-8\n"

You can check the syntax of your .po file by running the command msgfmt -c bridges.po -o bridges.mo from your terminal. It will warn you if it thinks anything is wrong with your .po file. As the commenter suggested, I think you do NOT need to have the Chinese system libraries installed.
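To check that the server can find the compiled catalog, you can build the path where the .mo file is expected and test whether it exists. A small sketch: mo_path() is a hypothetical helper, and it assumes the directory layout used in the example above (locale/zh_CN/LC_MESSAGES/bridges.mo).

```php
<?php
// Hypothetical helper: path where the compiled catalog should live for a
// given base directory, locale, and text domain.
function mo_path(string $localeDir, string $locale, string $domain): string
{
    return rtrim($localeDir, '/') . "/{$locale}/LC_MESSAGES/{$domain}.mo";
}

$path = mo_path(__DIR__ . '/locale', 'zh_CN', 'bridges');
echo $path, is_readable($path) ? " is readable\n" : " is missing or unreadable\n";
```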

P.S. I don't know if these Chinese translations are correct or not. This is just what Google Translate gave me! :)

What are the reasons to use or to not use PHP's native gettext versus a self-build?

The gettext extension has some quirks.

  • It keeps translation strings in memory, and thus can necessitate a restart (under the mod_php runtime, that is) when catalogs are updated.
  • The gettext API wasn't really designed for web apps. (It looks for environment variables and system settings. You have to spoon feed the Accept-Language header.)
  • Many people run into problems setting it up.
  • On the other hand there is more tool support for gettext.
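As an illustration of what "spoon feeding" the Accept-Language header can involve, here is a rough sketch that picks a language from the header value before any locale is set. pick_language() is a hypothetical helper, the supported list is assumed to hold lowercase primary language codes, and region subtags are simply dropped.

```php
<?php
// Hypothetical helper: pick the best supported language from an
// Accept-Language header value such as "fr-CH, fr;q=0.9, en;q=0.8".
function pick_language(string $header, array $supported, string $default): string
{
    $candidates = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $q = 1.0; // quality defaults to 1 when no q parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $candidates[] = [$tag, $q, $position];
    }
    // Highest quality first; keep header order for equal qualities.
    usort($candidates, function (array $a, array $b): int {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });
    foreach ($candidates as [$tag, $q]) {
        $primary = explode('-', $tag)[0]; // "fr-ch" -> "fr"
        if ($q > 0 && in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    return $default;
}

echo pick_language($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', ['en', 'fr', 'de'], 'en'), "\n";
```

The result still has to be mapped to a full locale name before it is passed to setlocale(), since the header carries language tags rather than system locale names.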

You will almost always have less trouble with a handcrafted solution. That being said, the gettext API is unbeatable in conciseness: _("orig text") is more or less the optimal interface for translating text.

If you want to code something up yourself, I recommend you concentrate on that.

  • Use a simple function name. In lieu of _() a few php apps use the double underscore __(). Don't adopt any library that makes it cumbersome to actually use translated strings. (E.g. if using Zend Framework, always write a wrapper function.)
  • Accept raw English text as input. Avoid mnemonic translation keys (e.g. BTN_SUBMT).
  • Do not under any circumstances use the database for translation catalogues. Those texts are runtime data, not application data. (For a bad example see osCommerce.)

You can often get away with PHP array scripts lang/nl.php containing nothing but $text["orig english"] = "dutch here";, which are easy to utilize from whatever access method you use.
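A minimal sketch of that array approach with a __() wrapper might look like this. The Catalog class and the Dutch entries are assumptions made for illustration; in a real setup the $text array would come from lang/nl.php.

```php
<?php
// Hypothetical holder for the loaded translations, so that __() does not
// depend on variable scope.
final class Catalog
{
    /** @var array<string,string> */
    public static $text = [];
}

// Stand-in for the contents of lang/nl.php (assumed example entries).
$text = [];
$text["Submit"] = "Verzenden";
$text["Cancel"] = "Annuleren";
Catalog::$text = $text;

// Raw English in, translation out; the English text is the fallback.
function __(string $english): string
{
    return Catalog::$text[$english] ?? $english;
}

echo __("Submit"), "\n";
echo __("Log out"), "\n"; // not in the catalog, so the English text is returned
```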

Also avoid pressing everything into that system. Sometimes it's unavoidable to adopt a second mechanism for longer texts. I, for example, used template/mail.EN.txt for bigger blobs.

Combining keys and full text when working with gettext and .po files

I just answered a similar (much older) question here.

Short version:

The PO file format is very simple, so it is possible to generate PO/MO files from another workflow that allows the flexibility you're asking for. (your devs want identifiers, your translators want words)

You could roll this solution yourself, or use a cloud-based app like Loco to manage your translations and export a Gettext file with identifiers when your devs need them.
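If you roll it yourself, the generation step could look roughly like the sketch below: identifiers become the msgid, and the source-language wording travels along as an extracted comment ("#. ") so translators still see words. build_po(), po_quote(), and the sample entries are hypothetical, and a real file would also need the header entry with the charset declaration.

```php
<?php
// Escape a string for use inside a double-quoted PO string.
function po_quote(string $s): string
{
    return '"' . strtr($s, ['\\' => '\\\\', '"' => '\\"', "\n" => '\\n']) . '"';
}

// Hypothetical generator: $entries maps identifier => source-language text,
// $translations maps identifier => translated text (may be incomplete).
function build_po(array $entries, array $translations): string
{
    $out = '';
    foreach ($entries as $id => $sourceText) {
        $out .= '#. ' . str_replace("\n", ' ', $sourceText) . "\n";
        $out .= 'msgid ' . po_quote($id) . "\n";
        $out .= 'msgstr ' . po_quote($translations[$id] ?? '') . "\n\n";
    }
    return $out;
}

echo build_po(
    ['home.greeting' => 'Hello'],
    ['home.greeting' => 'Bonjour']
);
```

The resulting text can then be compiled with msgfmt like any other .po file.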
