How to print UTF-8 strings to std::cout on Windows?
The problem is not std::cout but the Windows console. Using C stdio you will get the ü with fputs( "\xc3\xbc", stdout ); after setting the UTF-8 code page (either using SetConsoleOutputCP or chcp) and selecting a Unicode-capable font in cmd's settings (Consolas should support over 2000 characters, and there are registry hacks to add more capable fonts to cmd).
If you output one byte after the other with putc('\xc3', stdout); putc('\xbc', stdout); you will get the double tofu, because the console interprets each byte separately as an illegal character. This is probably what the C++ streams do.
See UTF-8 output on Windows console for a lengthy discussion.
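The fputs case described above can be sketched roughly as follows. This is only a minimal sketch: the helper names enable_utf8_console and write_utf8 are made up for illustration, the Windows-specific call is guarded so the code also compiles elsewhere, and it still assumes the console font contains the required glyphs.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switch the console to the UTF-8 code page.
// On non-Windows systems this is a no-op that reports success.
bool enable_utf8_console() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}

// Hypothetical helper: hand the complete string to C stdio in one call,
// instead of emitting the bytes of a multibyte character one at a time.
bool write_utf8(const std::string& text, std::FILE* out) {
    return std::fputs(text.c_str(), out) >= 0 && std::fflush(out) == 0;
}
```

Calling enable_utf8_console() once and then write_utf8("\xc3\xbc\n", stdout) corresponds to the fputs case; whether the ü is actually displayed still depends on the console and its font.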
For my own project, I finally implemented a std::stringbuf doing the conversion to Windows-1252. If you really need full Unicode output, however, this will not help you.
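To give an idea of what such a conversion involves, here is a rough sketch. It is not the stringbuf mentioned above: the function name utf8_to_cp1252 is hypothetical, and it only covers ASCII and the range U+00A0..U+00FF, where Windows-1252 coincides with the Unicode code point. Everything else, including malformed input, is replaced by '?'.

```cpp
#include <cstddef>
#include <string>

// Sketch only: UTF-8 to Windows-1252 for ASCII and U+00A0..U+00FF.
// Characters outside that range and malformed sequences become '?'.
std::string utf8_to_cp1252(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i++]);
        if (lead < 0x80) {                 // ASCII maps to itself
            out += static_cast<char>(lead);
            continue;
        }
        // Decode a well-formed two-byte sequence (U+0080..U+07FF).
        if (lead >= 0xC2 && lead <= 0xDF && i < in.size()) {
            const unsigned char next = static_cast<unsigned char>(in[i]);
            if ((next & 0xC0) == 0x80) {
                ++i;
                const unsigned cp = ((lead & 0x1Fu) << 6) | (next & 0x3Fu);
                out += (cp >= 0xA0 && cp <= 0xFF) ? static_cast<char>(cp) : '?';
                continue;
            }
        }
        // Anything else: skip the trailing continuation bytes, emit one '?'.
        while (i < in.size() &&
               (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
            ++i;
        }
        out += '?';
    }
    return out;
}
```

With this sketch the ü ("\xc3\xbc") becomes the single byte 0xFC, while Greek or Chinese characters degrade to '?', which is why this route does not help for full Unicode output.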
An alternative approach would be replacing cout's streambuf, using fputs for the actual output:
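#include <cstdio>
#include <iostream>
#include <sstream>
#include <Windows.h>

// Stream buffer that collects output and hands it to C stdio in one piece.
class MBuf : public std::stringbuf {
public:
    int sync() override {
        fputs( str().c_str(), stdout );  // write the whole buffered string at once
        str( "" );                       // clear the buffer
        return 0;
    }
};

int main() {
    SetConsoleOutputCP( CP_UTF8 );          // switch the console to the UTF-8 code page
    setvbuf( stdout, nullptr, _IONBF, 0 );  // turn off stdio buffering
    MBuf buf;
    std::streambuf* old = std::cout.rdbuf( &buf );
    // u8 literals are char-based before C++20; from C++20 on they are char8_t
    // and cannot be streamed to std::cout without a cast.
    std::cout << u8"Greek: αβγδ\n" << std::flush;
    std::cout.rdbuf( old );  // restore the original buffer before buf is destroyed
}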
I turned off output buffering here to prevent it from interfering with unfinished UTF-8 byte sequences.
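The concern about unfinished UTF-8 byte sequences can be made concrete with a small helper. The following is only a sketch under stated assumptions: the name complete_utf8_prefix is hypothetical, and the function only inspects the tail of the string rather than validating all of it.

```cpp
#include <cstddef>
#include <string>

// Returns the length of the longest prefix of `bytes` that does not end in
// the middle of a UTF-8 sequence. Only the trailing bytes are inspected.
std::size_t complete_utf8_prefix(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = n;
    std::size_t back = 0;
    // Walk back over at most three trailing continuation bytes (10xxxxxx).
    while (i > 0 && back < 3 &&
           (static_cast<unsigned char>(bytes[i - 1]) & 0xC0) == 0x80) {
        --i;
        ++back;
    }
    if (i == 0) return n;  // no lead byte found; nothing to hold back
    const unsigned char lead = static_cast<unsigned char>(bytes[i - 1]);
    std::size_t expected = 1;
    if      ((lead & 0xE0) == 0xC0) expected = 2;
    else if ((lead & 0xF0) == 0xE0) expected = 3;
    else if ((lead & 0xF8) == 0xF0) expected = 4;
    // The last sequence starts at i - 1 and currently has back + 1 bytes.
    return (back + 1 < expected) ? i - 1 : n;
}
```

A sync() like the one above could, in principle, write only the first complete_utf8_prefix(s) bytes of its buffer and keep the remainder for the next flush, so that a character is never split across two writes.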
How to print UTF-8 characters on console using C
The best answer given above was by Joachim Isaksson. Thank you, this indeed seems to be the problem. I solved it in Eclipse by setting the "Encoding" setting for the run configuration to UTF-8.
Show UTF-8 characters in console
There are some hacks you can find that demonstrate how to write multibyte character sets to the Console, but they are unreliable. They require your console font to be one that supports it, and in general, are something I would avoid. (All of these techniques break if your user doesn't do extra work on their part... so they are not reliable.)
If you need to write Unicode output, I highly recommend making a GUI application to handle this, instead of using the Console. It's fairly easy to make a simple GUI to just write your output to a control which supports Unicode.
UTF-8 output on Windows console
It's time to close this now. Stephan T. Lavavej says the behaviour is "by design", although I cannot follow this explanation.
My current knowledge is: Windows XP console in UTF-8 codepage does not work with C++ iostreams.
Windows XP is going out of fashion now, and so is VS 2008. I'd be interested to hear if the problem still exists on newer Windows systems.
On Windows 7 the effect is probably due to the way the C++ streams output characters. As seen in an answer to Properly print utf8 characters in windows console, UTF-8 output fails with C stdio as well when printing one byte after another, like putc('\xc3', stdout); putc('\xbc', stdout);. Perhaps this is what C++ streams do here.
How do I print Unicode to the output console in C with Visual Studio?
This is code that works for me (VS2017) - project with Unicode enabled
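#include <stdio.h>
#include <io.h>
#include <fcntl.h>

int main()
{
    /* Switch stdout to UTF-16 text mode (MSVC-specific). */
    _setmode(_fileno(stdout), _O_U16TEXT);
    const wchar_t *test = L"the 来. Testing unicode -- English -- Ελληνικά -- Español.";
    /* %ls is the portable specifier for a wide string. */
    wprintf(L"%ls\n", test);
    return 0;
}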
After copying the console output into Notepad++, I see the proper string:
the 来. Testing unicode -- English -- Ελληνικά -- Español.
OS - Windows 7 English, Console font - Lucida Console
Edits based on comments
I tried to fix the above code to work with VS2019 on Windows 10, and the best I could come up with is this:
#include <stdio.h>

int main()
{
    // const auto* requires compiling this file as C++
    const auto* test = L"the 来. Testing unicode -- English -- Ελληνικά -- Español.";
    wprintf(L"%s\n", test);
}
When I run it "as is", I see this: [screenshot]
When it is run with the console set to the Lucida Console font and UTF-8 encoding, I see this: [screenshot]
As for the 来 character being shown as an empty rectangle: I suppose this is a limitation of the font, which does not contain all the Unicode glyphs.
When the text is copied from the last console to Notepad++, all characters are shown correctly.