Set Locale to System Default UTF-8

Answering my own question: On Ubuntu the default LANG is defined in /etc/default/locale:

jeroen@dev:~⟫ cat /etc/default/locale
# Created by cloud-init v. 0.7.7 on Wed, 29 Jun 2016 11:02:51 +0000
LANG="en_US.UTF-8"

So in R we could do something like:

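# Load the system-wide defaults (LANG etc.) into this R session's environment
readRenviron("/etc/default/locale")
LANG <- Sys.getenv("LANG")
# Only switch the locale if LANG was actually defined
if (nzchar(LANG)) {
  Sys.setlocale("LC_ALL", LANG)
}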

On Debian/Ubuntu, Apache also has a commented-out line in /etc/apache2/envvars (". /etc/default/locale") that can be uncommented to make it use the system default locale.
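For use outside R, the same idea can be sketched in POSIX shell. This is only an illustration under the assumption that the defaults file contains plain VAR="value" assignments, as in the Ubuntu file shown above; the function name read_default_lang is made up for this example.

```shell
# Hypothetical helper: print the LANG value defined in a locale defaults file.
# The path defaults to /etc/default/locale, the Ubuntu location named above.
read_default_lang() {
    file="${1:-/etc/default/locale}"
    # Fail quietly if the file is missing or unreadable.
    [ -r "$file" ] || return 1
    # Source the file in a subshell so the caller's environment stays untouched.
    ( unset LANG; . "$file" >/dev/null 2>&1; printf '%s\n' "${LANG:-}" )
}

# Possible usage (kept as a comment so nothing changes just by loading this):
#   lang=$(read_default_lang) && [ -n "$lang" ] && export LANG="$lang"
```

Sourcing the file executes it as shell code, so this is only appropriate for a trusted, root-owned file such as the one above.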

I've set the system locale on Windows 10 to use the beta UTF-8 support, but RStudio does not recognize it

As it turns out, the problem was in how I was reading the data. read.csv() decodes the file using the encoding implied by the current locale, whereas readr::read_csv() assumes UTF-8 by default. Switching to readr::read_csv() meant the file was read as UTF-8, which is the encoding it was actually saved in.

Why do I get a locale error even though it is set?

Making the "comment crowned by success" an answer:

sudo locale-gen en_US en_US.UTF-8
sudo dpkg-reconfigure locales
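A likely reading of why these commands help is that the locale named in the environment had not been generated on the machine yet. As a rough check, the output of locale -a can be searched for the name. The helper below is a sketch with a made-up name (has_locale); it assumes that comparing names case-insensitively and without hyphens is good enough, since glibc lists en_US.UTF-8 as en_US.utf8.

```shell
# Hypothetical helper: succeed if the locale named in $1 appears in a
# locale list read from standard input (for example the output of `locale -a`).
has_locale() {
    # Normalise the wanted name: lower-case it and drop hyphens,
    # so that "en_US.UTF-8" and "en_US.utf8" compare equal.
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    # Normalise the list the same way and look for an exact whole-line match.
    tr '[:upper:]' '[:lower:]' | tr -d '-' | grep -xF -- "$want" >/dev/null
}

# Possible usage:
#   locale -a | has_locale en_US.UTF-8 || echo "en_US.UTF-8 is not generated"
```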

Best practice: Should I try to change to UTF-8 as locale or is it safe to leave it as is?

This is not a perfect answer, but it is a good workaround. As Roland pointed out, changing the locale might be dangerous, so leave it as is. If a file gives you trouble, search it for non-UTF-8 characters as described here for RStudio. From what I have seen, most editors have such a feature.
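If no editor is at hand, one commonly used command-line check is to ask iconv to convert the file from UTF-8 to UTF-8, which fails on byte sequences that are not valid UTF-8. The sketch below assumes iconv is installed; the function name is_valid_utf8 is made up for this example.

```shell
# Hypothetical helper: succeed if the file named in $1 decodes cleanly as UTF-8.
is_valid_utf8() {
    # iconv exits with a non-zero status on the first invalid byte sequence;
    # the converted output and any error message are discarded.
    iconv -f UTF-8 -t UTF-8 < "$1" >/dev/null 2>&1
}

# Possible usage:
#   is_valid_utf8 script.R || echo "script.R contains non-UTF-8 bytes"
```

This only says whether the file is valid UTF-8. It does not identify which other encoding the file uses.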

Furthermore, this answer gives more insight into what you can do when you source() a file.

For a way to deal with locales when collation plays a crucial part, see here.

Why is PHP not picking up the system default locale settings?

In order to ensure that PHP uses the locale settings from the OS you have to call setlocale(LC_ALL, "") at the very beginning of your code.

The manual page for setlocale() at https://www.php.net/manual/en/function.setlocale.php states the following:

// If locales is the empty string "", the locale names will be set from 
// the values of environment variables with the same names as the above
// categories, or from "LANG".

// On Windows, setlocale(LC_ALL, '') sets the locale names from the
// system's regional/language settings (accessible via Control Panel).

Your example then looks as follows:

abc@ced4c553207d:~/$ locale -a
C
C.UTF-8
de_CH.utf8
en_US.utf8
POSIX
abc@ced4c553207d:~/$ locale
LANG=de_CH.UTF-8
LANGUAGE=de_CH.UTF-8
LC_CTYPE="de_CH.UTF-8"
LC_NUMERIC="de_CH.UTF-8"
LC_TIME="de_CH.UTF-8"
LC_COLLATE="de_CH.UTF-8"
LC_MONETARY="de_CH.UTF-8"
LC_MESSAGES="de_CH.UTF-8"
LC_PAPER="de_CH.UTF-8"
LC_NAME="de_CH.UTF-8"
LC_ADDRESS="de_CH.UTF-8"
LC_TELEPHONE="de_CH.UTF-8"
LC_MEASUREMENT="de_CH.UTF-8"
LC_IDENTIFICATION="de_CH.UTF-8"
LC_ALL=de_CH.UTF-8
abc@ced4c553207d:~/$ php -r "echo setlocale(LC_MONETARY, 0).\"\n\";"
C
abc@ced4c553207d:~/$ php -r " setlocale(LC_ALL, ''); echo setlocale(LC_MONETARY, 0).\"\n\";"
de_CH.UTF-8
abc@ced4c553207d:~/$
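To make the lookup order concrete, here is a small shell sketch of how a category's locale name is conventionally resolved from the environment: LC_ALL wins, then the category's own variable, then LANG. The function name effective_locale is made up, the final fallback to "C" is an assumption that mirrors the first PHP output above, and unlike a real setlocale() call the sketch does not check that the locale is actually installed.

```shell
# Hypothetical helper: print the locale name the environment requests for
# one category, e.g. `effective_locale LC_MONETARY`.
effective_locale() {
    category="$1"
    # Accept only plain variable names (A-Z and underscore) before using eval.
    case "$category" in
        ''|*[!A-Z_]*) return 2 ;;
    esac
    # Read the variable whose name is stored in $category.
    eval "category_value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"          # LC_ALL overrides everything
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"  # then the category's own variable
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"            # then the general default
    else
        printf 'C\n'                     # assumed fallback when nothing is set
    fi
}
```

With the environment from the session above, effective_locale LC_MONETARY would take the LC_ALL branch, which is consistent with what PHP reports once setlocale(LC_ALL, '') has been called.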

