Printing UTF-8 in Python 3 Using Sublime Text 3

printing UTF-8 in Python 3 using Sublime Text 3

The answer was actually in the question linked in your question - PYTHONIOENCODING needs to be set to "utf-8". However, since OS X is silly and doesn't pick up on environment variables set in Terminal or via .bashrc or similar files, this won't work in the way indicated in the answer to the other question. Instead, you need to pass that environment variable to Sublime.

Luckily, ST3 build systems (I don't know about ST2) have the "env" option. This is a dictionary of keys and values passed to exec.py, which is responsible for running build systems without the "target" option set. As I indicated in our comments above, your sample program worked fine on a UTF-8-encoded text file containing non-ASCII characters when run with ST3 (Build 3122) on Linux, but not with the same version run on OS X. All that was necessary to get it to run was to change the build system to include this line:

"env": {"PYTHONIOENCODING": "utf8"},

I saved the build system, hit ⌘B, and the program ran fine.
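
To check whether the "env" setting actually reached the interpreter, a small diagnostic script can be run through the build system. This is only a minimal sketch; the function name describe_stdout is made up for illustration and is not part of Sublime or Python.

```python
import os
import sys


def describe_stdout():
    """Return the details that decide how print() encodes text."""
    return {
        # Encoding Python picked for standard output in this process.
        "stdout_encoding": sys.stdout.encoding,
        # Value of the override variable, or None if it was not passed in.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


if __name__ == "__main__":
    for key, value in describe_stdout().items():
        print(key, "=", value)
```

If the build system passes the variable through, the second line of output should show the value from the "env" entry; if it shows None, the variable never arrived.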

BTW, if you'd like to read exec.py, or Packages/Python/Python.sublime-build, or any other file packed up in a .sublime-package archive, install PackageResourceViewer via Package Control. Use the "Open Resource" option in the Command Palette to pick individual files, or "Extract Package" (both are preceded by "PackageResourceViewer:", or prv using fuzzy search) to extract an entire package to your Packages folder, which is accessed by selecting Sublime Text → Preferences → Browse Packages… (just Preferences → Browse Packages… on other operating systems). It is located on your hard drive in the following location:

  • Linux: ~/.config/sublime-text-3/Packages
  • OS X: ~/Library/Application Support/Sublime Text 3/Packages
  • Windows Regular Install: C:\Users\YourUserName\AppData\Roaming\Sublime Text 3\Packages
  • Windows Portable Install: InstallationFolder\Sublime Text 3\Data\Packages

Once files are saved to your Packages folder (if you just view them via the "Open Resource" option and close without changing or saving them, they won't be), they will override the identically-named file contained within the .sublime-package archive. So, for instance, if you want to edit the default Python.sublime-build file in the Python package, your changes will be saved as Packages/Python/Python.sublime-build, and when you choose the Python build system from the menu, it will only use your version.
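
If you only want to read a packed file without installing anything, note that .sublime-package files are ordinary zip archives, so Python's zipfile module can open them. The sketch below assumes you know where the archive lives on your machine; the helper name read_packed_resource is hypothetical.

```python
import zipfile


def read_packed_resource(package_path, resource_name):
    """Return the text of one file stored inside a .sublime-package archive."""
    # .sublime-package files are plain zip archives.
    with zipfile.ZipFile(package_path) as archive:
        return archive.read(resource_name).decode("utf-8")
```

You would call it with the path to the archive and the name of the file inside it, for example "Python.sublime-build" (assuming that is how the file is named in your copy of the package). This only reads the file; the override behaviour described above still requires saving a copy into your Packages folder.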

Printing utf8 strings in Sublime Text's console with Windows

I have found a possible fix: add the encoding parameter in the Python.sublime-build file:

{
"cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python",
"encoding": "cp1252",
...

Note: "encoding": "latin1" seems to work as well, but - I don't know why - "encoding": "utf8" does not work, even if the .py file is UTF8, even if Python 3 uses UTF8, etc. Mystery!
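
One possible explanation, which I have not confirmed: the "encoding" key tells Sublime how to decode the bytes the program writes, so it has to match whatever Python actually emits. If Python on Windows falls back to cp1252 for piped output (an assumption about that setup), decoding those bytes as UTF-8 fails, while cp1252 or latin1 succeeds. A small sketch of the mismatch, using a sample character of my own choosing:

```python
text = "é"

# Bytes Python would write if its output encoding were cp1252 (assumed here).
raw = text.encode("cp1252")  # b'\xe9'

# Decoding with a matching or compatible codec recovers the character.
assert raw.decode("cp1252") == "é"
assert raw.decode("latin1") == "é"

# Decoding the same bytes as UTF-8 fails: 0xE9 starts a multi-byte
# sequence that is never completed.
try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

print(decoded_as_utf8)  # False
```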


Edit: This works now:

{
"cmd": ["python", "-u", "$file"],
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python",
"encoding": "utf8",
"env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},
}

Linked topics:

  • Setting the correct encoding when piping stdout in Python and this answer in particular

  • How to change the preferred encoding in Sublime Text 3 for MacOS for the env trick.

Sublime - Python3 not printing non utf-8 characters (Spanish)

Generally speaking, problems like this come from the interplay between how Python decides, behind the scenes, which encoding to use for its output and how Sublime launches the Python interpreter.

In particular, the interpreter may pick the correct encoding when run from a terminal but get confused and pick the wrong one when Sublime invokes it.

The PYTHONIOENCODING environment variable can be used to tell the interpreter to use a specific encoding in favor of whatever it might have otherwise automatically selected.
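
The effect of the variable can be seen outside Sublime by launching a child interpreter with piped output, which is roughly the situation a build system creates. This is a sketch under that assumption; the helper name child_stdout_encoding is made up for the example.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and report its stdout encoding."""
    # Copy the current environment and add the override for the child only.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,  # piped, like output captured by a build system
        check=True,
    )
    return result.stdout.decode("ascii").strip()


print(child_stdout_encoding("utf-8"))
```

The child reports the encoding it was told to use rather than whatever it would have guessed from the pipe.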

The sublime-build file lets you specify custom environment variables to apply during a build using the env key, so you can do something like the following:

{
"shell_cmd": "python -u \"$file\"",
"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
"selector": "source.python",

"env": {"PYTHONIOENCODING": "utf-8"},

"variants":
[
{
"name": "Syntax Check",
"shell_cmd": "python -m py_compile \"${file}\"",
}
]
}

Python 2.7 build on Sublime Text 3 doesn't print the '\uFFFD' character

I've reproduced your problem and I've found a solution that works on my platform anyhow: Remove the -u flag from your cmd build config option.

I'm not 100% sure why that works, but it seems to be a poor interaction resulting from the console interpreting an unbuffered stream of data containing multi-byte characters. Here's what I've found:

  • The -u option switches Python's output to unbuffered
  • This problem is not at all specific to the replacement character. I've gotten similar behaviour with other characters like "あ" (U+3042).
  • Similar bad results happen with other encodings. Setting "env": {"PYTHONIOENCODING": "utf-16be"} results in print u'\u3042' outputting 0B.

That last example with the encoding set to UTF-16BE illustrates what I think is going on. The console is receiving one byte at a time because the output is unbuffered. So it receives the 0x30 byte first. The console then determines this is not valid UTF-16BE and decides instead to fall back to ASCII and thus outputs 0. It of course receives the next byte right after and follows the same logic to output B.

With the UTF-8 encoding, the console receives bytes that can't possibly be interpreted as ASCII, so I believe the console is doing a slightly better job at properly interpreting the unbuffered stream, but it is still running into the difficulties that your question points out.
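
The byte-level reasoning above can be reproduced in plain Python. This only illustrates the hypothesis; it does not show what Sublime's console actually does internally.

```python
import codecs

encoded = "\u3042".encode("utf-16-be")
print(encoded)  # b'0B': the bytes 0x30 and 0x42 are ASCII '0' and 'B'

# Decoding each byte on its own as ASCII reproduces the broken output.
print("".join(bytes([b]).decode("ascii") for b in encoded))  # 0B

# An incremental decoder buffers partial sequences, so byte-at-a-time
# delivery is harmless when the reader handles it this way.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(bytes([b])) for b in "\u3042".encode("utf-8")]
print(pieces)  # ['', '', 'あ']
```

A reader that decodes each chunk independently, instead of incrementally, would show the kind of breakage described in the question.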

sublime text3 is not displaying utf8 character but their codes

That is valid JSON. The default is to write the data using only ASCII and Unicode escape codes. When you load the JSON back into Python with json.load() the strings will contain the original characters again.

If you want the JSON text file to be readable instead of using escape codes, you can use ensure_ascii=False:

import json

with open('./gsw_fb_r1_metadata.json', 'w', encoding='utf8') as f:
    json.dump(metadata, f, ensure_ascii=False)

This will write the data as UTF-8 to the file and be readable in a UTF-8-capable text editor.
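
The difference between the two modes is easy to see with json.dumps on a small sample. The dictionary below is made-up data, not taken from the question.

```python
import json

data = {"city": "Zürich"}

escaped = json.dumps(data)  # default: ensure_ascii=True
readable = json.dumps(data, ensure_ascii=False)

print(escaped)   # {"city": "Z\u00fcrich"}
print(readable)  # {"city": "Zürich"}

# Both forms decode to the same Python object.
assert json.loads(escaped) == json.loads(readable) == data
```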

Sublime will not print certain unicode chars on windows

Try setting the PYTHONIOENCODING=utf-8 environment variable in your build config:

{
"cmd": ["C:/Users/<username>/AppData/Local/Continuum/Anaconda3/python.exe","$file"],
"selector":"source.python",
"env": { "PYTHONIOENCODING": "utf-8" }
}

A python program fails to execute in sublime text 3, but success in bash

Sublime has a configuration problem. Python uses the default ascii codec when it can't determine the terminal encoding. It is figuring it out correctly in bash so it works.

If you set the environment variable PYTHONIOENCODING=utf8 before launching Sublime, you can force Python to use that encoding when printing. I'm not familiar with Sublime, so I can't suggest how to fix its configuration.

How to change the preferred encoding in Sublime Text 3 for MacOS

In ST3's build system for Python, you can specify that it should set the LANG environment variable, and this will affect the result returned from locale.getpreferredencoding(), so that you don't need to amend any Python scripts.

Example:

"env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},

This has been confirmed to work on Linux as well as MacOS and Windows.
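
To see what your build actually reports, you can run a short script through the build system before and after adding the "env" line. This is a minimal sketch; the function name preferred_encoding is only for illustration.

```python
import codecs
import locale


def preferred_encoding():
    """Return the locale-derived encoding that the LANG setting is meant to influence."""
    # Passing False asks for the current setting without calling setlocale().
    return locale.getpreferredencoding(False)


name = preferred_encoding()
# Normalise through the codec registry so aliases print consistently.
print(name, "->", codecs.lookup(name).name)
```

The value printed depends on the platform and environment, so compare the two runs rather than expecting a fixed result.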


