printing UTF-8 in Python 3 using Sublime Text 3
The answer was actually in the question linked in your question: PYTHONIOENCODING needs to be set to "utf-8". However, since OS X is silly and doesn't pick up on environment variables set in Terminal or via .bashrc or similar files, this won't work in the way indicated in the answer to the other question. Instead, you need to pass that environment variable to Sublime.
Luckily, ST3 build systems (I don't know about ST2) have the "env" option. This is a dictionary of keys and values passed to exec.py, which is responsible for running build systems without the "target" option set. As I mentioned in the comments above, your sample program worked fine on a UTF-8-encoded text file containing non-ASCII characters when run with ST3 (Build 3122) on Linux, but not with the same version run on OS X. All that was necessary to get it to run was to change the build system to include this line:
"env": {"PYTHONIOENCODING": "utf8"},
I saved the build system, hit ⌘B, and the program ran fine.
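To check whether the "env" setting actually reached the interpreter, a small diagnostic script can be run through the build system. This is only a sketch: the function name describe_io_encoding is made up for illustration, and the values it reports will differ between platforms and between a terminal and Sublime.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the values that influence how print() encodes its output."""
    return {
        "stdout_encoding": sys.stdout.encoding,
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "LANG": os.environ.get("LANG"),
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    # Only ASCII is printed here, so the report itself cannot trigger
    # the encoding error it is meant to diagnose.
    for name, value in describe_io_encoding().items():
        print("{}: {}".format(name, value))
```

Running it once from a terminal and once with ⌘B should show whether the two environments disagree.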
BTW, if you'd like to read exec.py, or Packages/Python/Python.sublime-build, or any other file packed up in a .sublime-package archive, install PackageResourceViewer via Package Control. Use the "Open Resource" option in the Command Palette to pick individual files, or "Extract Package" (both are preceded by "PackageResourceViewer:", or prv using fuzzy search) to extract an entire package to your Packages folder, which is accessed by selecting Sublime Text → Preferences → Browse Packages… (just Preferences → Browse Packages… on other operating systems). It is located on your hard drive in the following location:
- Linux: ~/.config/sublime-text-3/Packages
- OS X: ~/Library/Application Support/Sublime Text 3/Packages
- Windows Regular Install: C:\Users\YourUserName\AppData\Roaming\Sublime Text 3\Packages
- Windows Portable Install: InstallationFolder\Sublime Text 3\Data\Packages
Once files are saved to your Packages folder (if you just view them via the "Open Resource" option and close without changing or saving them, they won't be), they will override the identically-named file contained within the .sublime-package archive. So, for instance, if you want to edit the default Python.sublime-build file in the Python package, your changes will be saved as Packages/Python/Python.sublime-build, and when you choose the Python build system from the menu, it will only use your version.
Printing utf8 strings in Sublime Text's console with Windows
I have found a possible fix: add the encoding parameter in the Python.sublime-build file:
{
    "cmd": ["python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python",
    "encoding": "cp1252",
    ...
Note: "encoding": "latin1" seems to work as well, but (I don't know why) "encoding": "utf8" does not work, even if the .py file is UTF8, even if Python 3 uses UTF8, etc. Mystery!
Edit: This works now:
{
    "cmd": ["python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python",
    "encoding": "utf8",
    "env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},
}
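One plausible reading of the earlier "mystery", offered as an assumption rather than a confirmed diagnosis: the "encoding" key tells Sublime how to decode the bytes the program writes, so it only helps when it matches the encoding Python actually used for output. The sketch below shows what a mismatch looks like at the byte level.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Bytes as Python would write them under two different output encodings.
as_utf8 = text.encode("utf-8")      # two bytes: 0xC3 0xA9
as_cp1252 = text.encode("cp1252")   # one byte: 0xE9

# Decoding with the matching codec recovers the character.
assert as_utf8.decode("utf-8") == text
assert as_cp1252.decode("cp1252") == text

# Decoding with the wrong codec garbles it.
mojibake = as_utf8.decode("cp1252")                      # two wrong characters
replaced = as_cp1252.decode("utf-8", errors="replace")   # U+FFFD

# ascii() keeps the demo printable even if stdout itself is misconfigured.
print(ascii(mojibake), ascii(replaced))
```

Setting both "encoding" and PYTHONIOENCODING to UTF-8, as in the edited build file, keeps the writing and reading sides in agreement.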
Linked topics:
- Setting the correct encoding when piping stdout in Python, and this answer in particular
- How to change the preferred encoding in Sublime Text 3 for MacOS, for the env trick.
Sublime - Python3 not printing non utf-8 characters (Spanish)
Generally speaking, problems like this are caused by some interplay between how Python decides, behind the scenes, which encoding to use when generating output and how Sublime executes the Python interpreter.
In particular, the interpreter may determine the correct encoding when run from a terminal but get confused and pick the wrong one when Sublime invokes it.
The PYTHONIOENCODING environment variable can be used to tell the interpreter to use a specific encoding in place of whatever it might otherwise have selected automatically.
The sublime-build file lets you specify custom environment variables to apply during a build using the env key, so you can do something like the following:
{
    "shell_cmd": "python -u \"$file\"",
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python",
    "env": {"PYTHONIOENCODING": "utf-8"},
    "variants":
    [
        {
            "name": "Syntax Check",
            "shell_cmd": "python -m py_compile \"${file}\"",
        }
    ]
}
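The effect of the variable can be observed outside Sublime as well. The following is a minimal sketch, assuming Python 3: it launches a child interpreter with PYTHONIOENCODING set, roughly the way a build system with an "env" key would, and captures the raw bytes written to a pipe. The helper name run_with_io_encoding is hypothetical.

```python
import os
import subprocess
import sys


def run_with_io_encoding(code, encoding):
    """Run `code` in a child interpreter with PYTHONIOENCODING set and
    return the raw bytes it wrote to stdout."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    return subprocess.check_output([sys.executable, "-c", code], env=env)


# The child prints U+00F1 (n with tilde); the escape is resolved in the child.
child_code = "print('\\u00f1')"

utf8_bytes = run_with_io_encoding(child_code, "utf-8").strip()
latin1_bytes = run_with_io_encoding(child_code, "latin-1").strip()

print(utf8_bytes)    # two bytes under UTF-8
print(latin1_bytes)  # a single byte under Latin-1
```

The same character leaves the interpreter as different bytes depending only on the environment variable, which is why the build output panel has to decode with the matching encoding.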
Python 2.7 build on Sublime Text 3 doesn't print the '\uFFFD' character
I've reproduced your problem and I've found a solution that works on my platform anyhow: Remove the -u flag from your cmd build config option.
I'm not 100% sure why that works, but it seems to be a poor interaction resulting from the console interpreting an unbuffered stream of data containing multi-byte characters. Here's what I've found:
- The -u option switches Python's output to unbuffered.
- This problem is not at all specific to the replacement character. I've gotten similar behaviour with other characters like "あ" (U+3042).
- Similar bad results happen with other encodings. Setting "env": {"PYTHONIOENCODING": "utf-16be"} results in print u'\u3042' outputting 0B.
That last example with the encoding set to UTF-16BE illustrates what I think is going on. The console is receiving one byte at a time because the output is unbuffered. So it receives the 0x30 byte first. The console then determines this is not valid UTF-16BE and decides instead to fall back to ASCII and thus outputs 0. It of course receives the next byte right after and follows the same logic to output B.
With the UTF-8 encoding, the console receives bytes that can't possibly be interpreted as ASCII, so I believe the console is doing a slightly better job at properly interpreting the unbuffered stream, but it is still running into the difficulties that your question points out.
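The byte values behind the UTF-16BE observation can be checked directly. The second half of this sketch is an illustration only, not a description of how Sublime's console is implemented: it shows how a consumer that receives one byte at a time could still decode correctly by using an incremental decoder. Python 3 is assumed.

```python
import codecs

char = "\u3042"  # HIRAGANA LETTER A

utf16be = char.encode("utf-16be")
# The two bytes 0x30 and 0x42 are also the ASCII codes for "0" and "B".
print(utf16be == b"0B")
print(utf16be.decode("ascii"))

# An incremental decoder buffers partial characters until they are complete,
# so feeding it single bytes yields nothing until the final byte arrives.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(bytes([b])) for b in char.encode("utf-8")]

# ascii() keeps the demo printable even if stdout itself is misconfigured.
print(ascii(pieces))
```

A reader that decodes each chunk independently, without carrying state between chunks, has no such buffer, which is consistent with the behaviour described above.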
sublime text3 is not displaying utf8 character but their codes
That is valid JSON. The default is to write the data using only ASCII and Unicode escape codes. When you load the JSON back into Python with json.load() the strings will contain the original characters again.
If you want the JSON text file to be readable instead of using escape codes, you can use ensure_ascii=False:
import json

with open('./gsw_fb_r1_metadata.json', 'w', encoding='utf8') as f:
    json.dump(metadata, f, ensure_ascii=False)
This will write the data as UTF-8 to the file and be readable in a UTF-8-capable text editor.
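The difference between the two modes is easy to see with json.dumps on a small value. The sample dictionary here is hypothetical and stands in for the real metadata.

```python
import json

metadata = {"city": "Z\u00fcrich"}  # hypothetical sample data

escaped = json.dumps(metadata)                       # default: ASCII only
readable = json.dumps(metadata, ensure_ascii=False)  # keeps the character

print(escaped)
# ascii() keeps the demo printable even if stdout itself is misconfigured.
print(ascii(readable))

# Both forms are valid JSON and load back to the same Python object.
assert json.loads(escaped) == json.loads(readable) == metadata
```

Either file round-trips correctly; ensure_ascii=False only changes how the text looks in an editor.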
Sublime will not print certain unicode chars on windows
Try setting the PYTHONIOENCODING=utf-8 environment variable in your build config:
{
    "cmd": ["C:/Users/<username>/AppData/Local/Continuum/Anaconda3/python.exe", "$file"],
    "selector": "source.python",
    "env": {"PYTHONIOENCODING": "utf-8"}
}
A python program fails to execute in sublime text 3, but success in bash
Sublime has a configuration problem. Python uses the default ascii codec when it can't determine the terminal encoding. It is figuring it out correctly in bash, so it works.
If you set the environment variable PYTHONIOENCODING=utf8 before launching Sublime, you can force Python to use that encoding when printing. I'm not familiar with Sublime so I can't suggest how to fix its configuration.
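The failure mode can be reproduced without Sublime by writing through a text stream whose encoding is fixed to ascii. This is a sketch assuming Python 3; try_write is a hypothetical helper, and the in-memory stream only stands in for a misconfigured stdout.

```python
import io


def try_write(text, encoding):
    """Write `text` through a text stream that uses `encoding` and return
    the bytes produced, or the UnicodeEncodeError that was raised."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return raw.getvalue()


ascii_result = try_write("ma\u00f1ana", "ascii")
utf8_result = try_write("ma\u00f1ana", "utf-8")

print(type(ascii_result).__name__)  # the ascii stream rejects the text
print(utf8_result)                  # the utf-8 stream encodes it
```

Forcing the encoding through PYTHONIOENCODING amounts to giving stdout the second configuration instead of the first.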
How to change the preferred encoding in Sublime Text 3 for MacOS
In ST3's build system for Python, you can specify that it should set the LANG environment variable, and this will affect the result returned from locale.getpreferredencoding(), so that you don't need to amend any Python scripts.
Example:
"env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},
This has been confirmed to work on Linux as well as MacOS and Windows.