Convert const char* to wstring
I recommend using std::string instead of C-style strings (char*) wherever possible. You can create a std::string object from a const char* by simply passing it to the constructor.
Once you have a std::string, you can write a simple function that converts a std::string containing multi-byte UTF-8 characters to a std::wstring containing UTF-16 code units (the 16-bit representation of the characters in the std::string).
There are several ways to do that; here is one that uses the Windows MultiByteToWideChar function:
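#include <windows.h>
#include <string>

// Converts a UTF-8 encoded std::string to a UTF-16 std::wstring (Windows only).
std::wstring s2ws(const std::string& str)
{
    if (str.empty()) return std::wstring();  // MultiByteToWideChar fails on zero-length input
    // First call: ask for the required length in wide characters.
    int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
    std::wstring wstrTo(size_needed, 0);
    // Second call: convert into the preallocated buffer.
    MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
    return wstrTo;
}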
Check these questions too:
Mapping multibyte characters to their unicode point representation
Why use MultiByteToWideCharArray to convert std::string to std::wstring?
C++: convert char * to wstring
That's a very good question! :-)
As Maxim wrote: mbstowcs(), which converts according to the current C locale.
wsprintf() with "%S" (capital "S"). In the wide variant of wsprintf(), "S" means a multi-byte string (in the narrow variant, as in Microsoft's sprintf(), "S" means a wide-char string). This meaning of "%S" is Microsoft-specific.
You can use std::wstring_convert and choose the UTF-8 encoding. The facet for UTF-8 to UTF-16 is std::codecvt_utf8_utf16 (deprecated since C++17).
For Windows:
MultiByteToWideChar() in WINAPI
If you put ASCII text on the clipboard with SetClipboardData() using CF_TEXT, Windows lets you call GetClipboardData() for CF_UNICODETEXT and does the conversion for you!
You can also do it by hand (this works only in some cases, namely plain ASCII) by zero-extending each char to a wchar_t, that is, by inserting a zero byte between the ASCII characters.
That's all that comes to mind right now :-)
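As an illustration of the first option, here is a minimal sketch of a wrapper around mbstowcs(). The helper name widen_with_mbstowcs is made up for this example, and the conversion follows whatever encoding the current C locale uses, so text outside ASCII is only decoded correctly after a suitable std::setlocale call (for example std::setlocale(LC_ALL, "") on a system configured for UTF-8).

```cpp
#include <clocale>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper (not part of any library): converts a NUL-terminated
// multi-byte string to std::wstring using the current C locale (LC_CTYPE).
// Returns an empty string if the input is null or cannot be decoded.
std::wstring widen_with_mbstowcs(const char* narrow)
{
    if (narrow == nullptr) return std::wstring();

    // Each wide character consumes at least one byte of input, so a buffer
    // of strlen(narrow) wide characters is always large enough.
    std::wstring wide(std::strlen(narrow), L'\0');

    const std::size_t written = std::mbstowcs(&wide[0], narrow, wide.size());
    if (written == static_cast<std::size_t>(-1))
        return std::wstring();  // invalid multi-byte sequence for this locale

    wide.resize(written);  // shrink to the number of wide characters produced
    return wide;
}
```

Calling widen_with_mbstowcs("hello") yields L"hello" in any locale, because ASCII is encoded identically in the usual multi-byte encodings.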
Best way to fill wstring with const char*
Since you don't need to do any character conversions, you could initialize both strings from a vector of characters. Consider this example:
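#include <string>
#include <vector>

int main()
{
    char data[32] = {};  // zero-initialized so the copies below read defined values
    std::vector<char> v(data, data + 32);
    // Both strings copy the 32 elements one by one; no encoding conversion takes place.
    std::string str(v.begin(), v.end());
    std::wstring wstr(v.begin(), v.end());
}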
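The same element-by-element copy also works straight from a const char*, without the intermediate vector. The sketch below uses a made-up helper name, widen_ascii, and reads the bytes as unsigned char so that values above 0x7F are not sign-extended on platforms where char is signed. It performs no decoding, so it is only meaningful for 7-bit ASCII (or Latin-1) input.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copies each byte of a NUL-terminated string into one
// wchar_t. No decoding is done, so multi-byte encodings such as UTF-8 are
// NOT handled; every byte simply becomes its own wide character.
std::wstring widen_ascii(const char* text)
{
    // Reading through unsigned char avoids sign extension of bytes >= 0x80
    // when plain char is a signed type.
    const unsigned char* first = reinterpret_cast<const unsigned char*>(text);
    return std::wstring(first, first + std::strlen(text));
}
```

With this approach the UTF-8 encoding of "é" (the two bytes 0xC3 0xA9) turns into two separate wide characters, which is why a real conversion function is needed as soon as the input is not plain ASCII.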
UTF8 char array to std::wstring
Current solution
You can use std::wstring_convert to convert a string to or from wstring, using a codecvt to specify the conversion to be performed.
Example of use:
#include <codecvt>
#include <locale>
#include <string>

std::string so = u8"Jérôme Ângle";  // before C++20; since C++20, u8 literals are char8_t
std::wstring st;
std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
st = converter.from_bytes(so);
If you have a C string (array of char), the overloads of from_bytes() will do exactly what you want:
char p[] = u8"Jérôme Ângle";
std::wstring ws = converter.from_bytes(p);
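One detail worth knowing is that from_bytes() throws std::range_error when the input is not valid for the chosen codecvt (unless an error string was passed to the converter's constructor). The following sketch wraps the call in a helper that reports failure through its return value. The name try_utf8_to_wide is invented for this example, and the code relies on the deprecated <codecvt> header.

```cpp
#include <codecvt>    // deprecated since C++17
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a NUL-terminated UTF-8 string into `out`.
// Returns false instead of propagating std::range_error on invalid input.
bool try_utf8_to_wide(const char* utf8, std::wstring& out)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    try {
        out = converter.from_bytes(utf8);
        return true;
    } catch (const std::range_error&) {
        return false;  // the bytes were not valid UTF-8
    }
}
```

Spelling the input as explicit bytes, for example "J\xC3\xA9r\xC3\xB4me" for "Jérôme", sidesteps both the source-file encoding and the C++20 change of u8 literals to char8_t.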
Is it sustainable?
As pointed out in the comments, C++17 has deprecated the <codecvt> header and the wstring_convert utility:
"These features are hard to use correctly, and there are doubts whether they are even specified correctly. Users should use dedicated text-processing libraries instead."
In addition, a wstring is based on wchar_t, which has a very different encoding on Linux systems (typically 32-bit, UTF-32) and on Windows systems (16-bit, UTF-16).
So the first question to ask is why a wstring is needed at all, and why not just keep UTF-8 everywhere.
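To make the platform difference concrete, the small sketch below computes how many wchar_t units one Unicode code point occupies on the platform it is compiled for. The function name wchar_units_for is made up, and the sketch assumes the two common cases only: a 16-bit wchar_t holding UTF-16 and a wider wchar_t holding one code point per unit.

```cpp
#include <cstddef>

// Hypothetical helper: number of wchar_t units needed for one code point.
// Assumes 16-bit wchar_t means UTF-16 (as on Windows) and any wider
// wchar_t stores a full code point (as is typical on Linux and macOS).
constexpr std::size_t wchar_units_for(char32_t code_point)
{
    // With UTF-16, code points above U+FFFF need a surrogate pair.
    return (sizeof(wchar_t) == 2 && code_point > 0xFFFF) ? 2 : 1;
}
```

For U+1F600 this gives 2 on Windows and 1 where wchar_t is 32 bits wide, so wstring::size() for the same text differs between the platforms.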
Depending on the reasons, you may consider using:
- ICU and its UnicodeString for full, in-depth Unicode support
- boost.locale and its to_utf or utf_to_utf, for common Unicode-related tasks
- utf8-cpp for working with UTF-8 strings the Unicode way (attention, seems not maintained).
Why I can't construct a wstring from char*
You could work with wchar_t directly: use the wchar_t variant of the API to retrieve the data as wide characters, and then construct the wstring from it. The _dupenv_s function has a wide counterpart, _wdupenv_s.
Your code would then look like this:
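wchar_t* pathAppData = nullptr;
size_t sz = 0;
std::wstring wPathAppData;
// _wdupenv_s returns 0 on success; the buffer stays null if the variable is not set.
if (_wdupenv_s(&pathAppData, &sz, L"APPDATA") == 0 && pathAppData != nullptr)
{
    wPathAppData = pathAppData;
    free(pathAppData);  // the buffer is allocated with malloc and must be freed by the caller
}
wPathAppData.append(L"\\MyApplication");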
Also this could be an interesting read: std::wstring VS std::string
Cannot convert character array to wstring with utf-8 characters
I think you should change the code to the following, which passes CP_ACP as the code page:
std::wstring s2ws(const char* utf8Bytes)
{
    const std::string str(utf8Bytes);
    if (str.empty()) return std::wstring();
    // CP_ACP decodes the bytes with the system's ANSI code page, not as UTF-8.
    int size_needed = MultiByteToWideChar(CP_ACP, 0, &str[0], (int)str.size(), NULL, 0);
    std::wstring wstrTo(size_needed, 0);
    MultiByteToWideChar(CP_ACP, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
    return wstrTo;
}
The difference between the two flags is listed here.
Convert const char* to const wchar_t*
There are multiple questions on SO that address the problem on Windows. Sample posts:
- char* to const wchar_t * conversion
- conversion from unsigned char* to const wchar_t*
There is a platform agnostic method posted at http://ubuntuforums.org/showthread.php?t=1579640. The source from this site is (I hope I am not violating any copyright):
#include <locale>
#include <iostream>
#include <string>
#include <sstream>
using namespace std ;

wstring widen( const string& str )
{
    wostringstream wstm ;
    const ctype<wchar_t>& ctfacet = use_facet<ctype<wchar_t>>(wstm.getloc()) ;
    for( size_t i=0 ; i<str.size() ; ++i )
        wstm << ctfacet.widen( str[i] ) ;
    return wstm.str() ;
}

string narrow( const wstring& str )
{
    ostringstream stm ;
    // Incorrect code from the link
    // const ctype<char>& ctfacet = use_facet<ctype<char>>(stm.getloc());
    // Correct code.
    const ctype<wchar_t>& ctfacet = use_facet<ctype<wchar_t>>(stm.getloc());
    for( size_t i=0 ; i<str.size() ; ++i )
        stm << ctfacet.narrow( str[i], 0 ) ;
    return stm.str() ;
}
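int main()
{
    {
        const char* cstr = "abcdefghijkl" ;
        // Keep the wstring alive: calling c_str() on the temporary would leave a dangling pointer.
        const wstring wide = widen(cstr) ;
        const wchar_t* wcstr = wide.c_str() ;
        wcout << wcstr << L'\n' ;
    }
    {
        const wchar_t* wcstr = L"mnopqrstuvwx" ;
        // Same here: the string must outlive the pointer taken from it.
        const string narrowed = narrow(wcstr) ;
        const char* cstr = narrowed.c_str() ;
        cout << cstr << '\n' ;
    }
}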
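The per-character loop through a string stream is not strictly required, because the ctype facet also offers a range overload of widen(). The following is a possible variation rather than the code from the linked thread. The name widen_range and the locale parameter are assumptions made for this sketch.

```cpp
#include <locale>
#include <string>

// Hypothetical variation: widens a whole string in one call by using the
// range overload of std::ctype<wchar_t>::widen. As with the loop version,
// each char is widened on its own, so multi-byte encodings are not decoded.
std::wstring widen_range(const std::string& str,
                         const std::locale& loc = std::locale())
{
    std::wstring out(str.size(), L'\0');
    if (!str.empty()) {
        const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
        // Writes one wchar_t to `out` for every char in [data, data + size).
        facet.widen(str.data(), str.data() + str.size(), &out[0]);
    }
    return out;
}
```

Because the function returns a std::wstring by value, the caller should store the result in a variable before taking c_str() from it.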