How to Open an std::fstream (ofstream or ifstream) With a Unicode Filename

How to open an std::fstream (ofstream or ifstream) with a Unicode filename?

The C++ standard library is not Unicode-aware. char and wchar_t are not required to be Unicode encodings.

On Windows, wchar_t is UTF-16, but the standard library has no direct support for UTF-8 filenames (the char datatype is not Unicode on Windows; narrow filenames are, by default, interpreted in the active ANSI code page).

With MSVC (and thus the Microsoft STL), a constructor for filestreams is provided which takes a const wchar_t* filename, allowing you to create the stream as:

wchar_t const name[] = L"filename.txt";
std::fstream file(name);

However, this overload is not specified by the C++11 standard (it only guarantees the presence of the char based version). It is also not present on alternative STL implementations like GCC's libstdc++ for MinGW(-w64), as of version g++ 4.8.x.

Note that just as char on Windows is not UTF-8, wchar_t on other operating systems may not be UTF-16. So overall, this isn't likely to be portable. Opening a stream given a wchar_t filename isn't defined by the standard, and specifying the filename in chars may be difficult because the encoding used by char varies between operating systems.

How to open a Unicode file with ifstream using MinGW under Windows?

Most likely (it's unclear whether the presented code is the real code) the reason that you see garbage is that std::cout in Windows defaults to presenting its result in a non-UTF-8 console window.

To properly check whether you're reading the UTF-8 file correctly, simply collect all the input in a string, convert it from UTF-8 to UTF-16 wstring, and display that using MessageBoxW (or wide direct console output).

The following UTF-8 → UTF-16 conversion function works nicely with Visual C++ 12.0:

#include <codecvt>      // std::codecvt_utf8_utf16
#include <locale>       // std::wstring_convert
#include <string>       // std::wstring

auto wstring_from_utf8( char const* const utf8_string )
    -> std::wstring
{
    std::wstring_convert< std::codecvt_utf8_utf16< wchar_t > > converter;
    return converter.from_bytes( utf8_string );
}

Unfortunately, even though it only uses standard C++11 functionality, it fails to compile with MinGW g++ 4.8.2, but hopefully you have Visual C++ (after all it's free).


As an alternative you can code up a conversion function using the Windows API MultiByteToWideChar.

For example, the following code works nicely with g++ 4.8.2 with -D USE_WINAPI:

#undef UNICODE
#define UNICODE
#include <windows.h>
#include <shellapi.h>       // ShellAbout

#ifndef USE_WINAPI
#   include <codecvt>       // std::codecvt_utf8_utf16
#   include <locale>        // std::wstring_convert
#endif
#include <fstream>          // std::ifstream
#include <iostream>         // std::cerr, std::endl
#include <stdexcept>        // std::runtime_error, std::exception
#include <stdlib.h>         // EXIT_SUCCESS, EXIT_FAILURE
#include <string.h>         // strlen
#include <string>           // std::string, std::wstring

namespace my {
    using std::ifstream;
    using std::ios;
    using std::runtime_error;
    using std::string;
    using std::wstring;

#ifndef USE_WINAPI
    using std::codecvt_utf8_utf16;
    using std::wstring_convert;
#endif

    auto hopefully( bool const c ) -> bool { return c; }
    auto fail( string const& s ) -> bool { throw runtime_error( s ); }

#ifdef USE_WINAPI
    auto wstring_from_utf8( char const* const utf8_string )
        -> wstring
    {
        int const n_bytes = static_cast<int>( strlen( utf8_string ) );
        if( n_bytes == 0 )
        {
            return L"";
        }
        wstring result( n_bytes, L'#' );    // More than enough.
        int const n_chars = MultiByteToWideChar(
            CP_UTF8,
            0,              // Flags, only alternative is MB_ERR_INVALID_CHARS
            utf8_string,
            n_bytes,        // Explicit length ==> no terminating null is written.
            &result[0],
            n_bytes         // Buffer size, in wide characters.
            );
        hopefully( n_chars > 0 )
            || fail( "MultiByteToWideChar" );
        result.resize( n_chars );
        return result;
    }
#else
    auto wstring_from_utf8( char const* const utf8_string )
        -> wstring
    {
        wstring_convert< codecvt_utf8_utf16< wchar_t > > converter;
        return converter.from_bytes( utf8_string );
    }
#endif

    auto text_of_file( string const& filename )
        -> string
    {
        ifstream f( filename, ios::in | ios::binary );
        hopefully( !f.fail() )
            || fail( "file open" );
        string result;
        string s;
        while( getline( f, s ) )
        {
            result += s + '\n';
        }
        return result;
    }

    void cpp_main()
    {
        string const utf8_text = text_of_file( "spanish.txt" );
        wstring const wide_text = wstring_from_utf8( utf8_text.c_str() );
        //ShellAbout( 0, L"Spanish text", wide_text.c_str(), LoadIcon( 0, IDI_INFORMATION ) );
        MessageBox(
            0,
            wide_text.c_str(),
            L"Spanish text",
            MB_ICONINFORMATION | MB_SETFOREGROUND
            );
    }
}  // namespace my

auto main()
    -> int
{
    using namespace std;
    try
    {
        my::cpp_main();
        return EXIT_SUCCESS;
    }
    catch( exception const& x )
    {
        cerr << "!" << x.what() << endl;
    }
    return EXIT_FAILURE;
}
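
If neither <codecvt> (deprecated as of C++17) nor MultiByteToWideChar is an option, the UTF-8 to UTF-16 step can also be written by hand. The following is only a sketch and is not part of the original answer: the name utf16_from_utf8 is made up, it returns std::u16string rather than std::wstring so that it doesn't depend on the size of wchar_t, and it assumes that replacing each malformed byte with U+FFFD is acceptable instead of reporting an error.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-16 code units.
// Each byte that does not start a well-formed sequence becomes U+FFFD.
auto utf16_from_utf8( std::string const& in )
    -> std::u16string
{
    std::u16string out;
    std::size_t i = 0;
    while( i < in.size() )
    {
        unsigned char const lead = static_cast<unsigned char>( in[i] );
        std::size_t len = 0;    // Sequence length in bytes; 0 ==> invalid lead byte.
        char32_t cp = 0;        // Code point being assembled.
        char32_t min = 0;       // Smallest code point allowed for this length.
        if( lead < 0x80 )                { len = 1; cp = lead; }
        else if( (lead & 0xE0) == 0xC0 ) { len = 2; cp = lead & 0x1F; min = 0x80; }
        else if( (lead & 0xF0) == 0xE0 ) { len = 3; cp = lead & 0x0F; min = 0x800; }
        else if( (lead & 0xF8) == 0xF0 ) { len = 4; cp = lead & 0x07; min = 0x10000; }

        bool ok = (len != 0 && i + len <= in.size());
        for( std::size_t k = 1; ok && k < len; ++k )
        {
            unsigned char const c = static_cast<unsigned char>( in[i + k] );
            ok = ((c & 0xC0) == 0x80);      // Must be a continuation byte.
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, encoded surrogates and values past U+10FFFF.
        ok = ok && cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);

        if( !ok )
        {
            out += char16_t( 0xFFFD );
            ++i;                            // Resynchronize on the next byte.
        }
        else if( cp < 0x10000 )
        {
            out += static_cast<char16_t>( cp );
            i += len;
        }
        else
        {
            cp -= 0x10000;                  // Encode as a surrogate pair.
            out += static_cast<char16_t>( 0xD800 + (cp >> 10) );
            out += static_cast<char16_t>( 0xDC00 + (cp & 0x3FF) );
            i += len;
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the result can be copied into a std::wstring with std::wstring( u16.begin(), u16.end() ) before it is passed to MessageBoxW.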

Opening a text file with fstream but filename characters are not in ASCII

In the question "How to open an std::fstream with a Unicode filename", @jalf notes that the C++ standard library is not Unicode-aware, but that there is a Windows extension that accepts wchar_t arrays.

On Windows (with the Microsoft standard library) you can open such a file by constructing an fstream object, or calling open on one, with a wchar_t array as the argument.

fstream fileHandle(L"δ»Wüste.txt");     // open at construction

fstream otherHandle;                    // or open a default-constructed stream
otherHandle.open(L"δ»Wüste.txt");

Both of the above call the wchar_t const* versions of the respective functions, because the L prefix makes the literal a wide (wchar_t) string, which is UTF-16 on Windows.

Edit: Here is a complete example that should compile and run. I created a file on my computer called δ»Wüste.txt with the contents "This is a test." and then compiled and ran the following code in the same directory.

#include <cstdlib>      // std::system
#include <fstream>
#include <iostream>
#include <string>

int main(int, char**)
{
    // The wchar_t const* constructor is a Microsoft extension.
    std::fstream fileHandle(L"δ»Wüste.txt", std::ios::in|std::ios::out);

    std::string text;
    std::getline(fileHandle, text);
    std::cout << text << std::endl;

    std::system("pause");

    return 0;
}

The output is:

This is a test.
Press any key to continue...

Opening fstream with file with Unicode file name under Windows using non-MSVC compiler

Currently there is no easy solution.

You need to create your own stream buffer that uses _wfopen under the hood. Boost.Iostreams, for example, can be used for this.
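
The stream-buffer idea can also be sketched without Boost. The class below is an illustration only, not a finished library: the names native_char, open_native and file_inbuf are made up, it supports reading only, and it assumes that _wfopen is visible through <wchar.h> on Windows (with MinGW this may require a gnu++ language mode rather than a strict -std=c++ one). On other systems it falls back to plain fopen so that the sketch still compiles.

```cpp
#include <cstddef>
#include <cstdio>
#include <istream>
#include <streambuf>
#include <string>
#include <wchar.h>

#ifdef _WIN32
using native_char = wchar_t;    // Windows: UTF-16 file names.
inline std::FILE* open_native( native_char const* name )
{
    return _wfopen( name, L"rb" );
}
#else
using native_char = char;       // Elsewhere the OS takes byte strings directly.
inline std::FILE* open_native( native_char const* name )
{
    return std::fopen( name, "rb" );
}
#endif

// Hypothetical read-only stream buffer on top of a C FILE*.
class file_inbuf : public std::streambuf
{
public:
    explicit file_inbuf( native_char const* name )
        : file_( open_native( name ) )
    {}

    ~file_inbuf() override
    {
        if( file_ ) { std::fclose( file_ ); }
    }

    file_inbuf( file_inbuf const& ) = delete;
    file_inbuf& operator=( file_inbuf const& ) = delete;

    bool is_open() const { return file_ != nullptr; }

protected:
    // Called by the stream when the get area is exhausted: refill it.
    int_type underflow() override
    {
        if( gptr() < egptr() )
        {
            return traits_type::to_int_type( *gptr() );
        }
        if( !file_ )
        {
            return traits_type::eof();
        }
        std::size_t const n = std::fread( buffer_, 1, sizeof buffer_, file_ );
        if( n == 0 )
        {
            return traits_type::eof();
        }
        setg( buffer_, buffer_, buffer_ + n );
        return traits_type::to_int_type( *gptr() );
    }

private:
    std::FILE* file_;
    char buffer_[4096];
};
```

To use it, construct the buffer first, for example file_inbuf buf( L"δ»Wüste.txt" ) on Windows, then attach an ordinary stream with std::istream in( &buf ) and read from it with std::getline as usual. The buffer must outlive the stream that refers to it.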

C++ UTF-16 ofstream file creation on Windows

Thank you all, but it seems that C++ streams are helpless in this case (at least that is my impression).

So I used the Windows API:

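#ifndef _WIN32    // for Linux
ofstream out(output);
out.close();
#else             // for Windows
// output is assumed to point to a null-terminated UTF-16 file name
LPWSTR lp = (LPWSTR)output;
HANDLE h = CreateFileW(lp, GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                       CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if (h != INVALID_HANDLE_VALUE)
    CloseHandle(h);   // close the handle so that it is not leaked
#endif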

And I got an output file with the correct name.

Thanks again!


