Why does `std::basic_ifstream<char16_t>` not work in c++11?
The various stream classes need a set of definitions to be operational. The standard library requires the relevant definitions and objects only for `char` and `wchar_t`, but not for `char16_t` or `char32_t`. Off the top of my head, the following is needed to use `std::basic_ifstream<cT>` or `std::basic_ofstream<cT>`:

- `std::char_traits<cT>` to specify how the character type behaves. I think this template is specialized for `char16_t` and `char32_t`.
- The used `std::locale` needs to contain an instance of the `std::num_put<cT>` facet to format numeric types. This facet can just be instantiated and a new `std::locale` containing it can be created, but the standard doesn't mandate that it is present in a `std::locale` object.
- The used `std::locale` needs to contain an instance of the facet `std::num_get<cT>` to read numeric types. Again, this facet can be instantiated but isn't required to be present by default.
- The facet `std::numpunct<cT>` needs to be specialized and put into the used `std::locale` to deal with decimal points, thousands separators, and textual boolean values. Even if it isn't really used, it will be referenced from the numeric formatting and parsing functions. There is no ready specialization for `char16_t` or `char32_t`.
- The facet `std::ctype<cT>` needs to be specialized and put into the used `std::locale` to support widening, narrowing, and classification of the character type. There is no ready specialization for `char16_t` or `char32_t`.
- The facet `std::codecvt<cT, char, std::mbstate_t>` needs to be specialized and put into the used `std::locale` to convert between external byte sequences and internal "character" sequences. There is no ready specialization for `char16_t` or `char32_t`.
Most of the facets are reasonably easy to do: they just need to forward a simple conversion or do table look-ups. However, the `std::codecvt` facet tends to be rather tricky, especially because `std::mbstate_t` is an opaque type from the point of view of the standard C++ library.
All of that can be done. It has been a while since I last did a proof-of-concept implementation for a character type; it took me about a day's worth of work. Of course, I knew what I needed to do when I embarked on the work, having implemented the locales and IOStreams library before. Adding a reasonable amount of tests rather than merely having a simple demo would probably take me a week or so (assuming I can actually concentrate on this work).
char16_t printing
Give this a try:
#include <locale>
#include <codecvt>
#include <string>
#include <iostream>
int main()
{
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t> > myconv;
std::wstring ws(L"Your UTF-16 text");
std::string bs = myconv.to_bytes(ws);
std::cout << bs << '\n';
}
error: no matching function for call to ‘std::__cxx11::basic_string<char>::basic_string(int&)’
This statement

a = new T(MAX);

tries to create an object of the type `std::string` from the integer value `MAX`. However, the class `std::string` has no such constructor.

It seems you mean

a = new T[MAX];

that is, you want to create an array of objects of the type `std::string`.
This function
T Stack<T>::peek() {
if (top < 0) {
cout << "Stack is Empty" << endl;
return NULL;
} else {
return a[top];
}
}
is also wrong, because creating an object of the type `std::string` from a null pointer results in undefined behavior. You should throw an exception instead, for example `std::out_of_range`.
Note also that the class has no destructor, so the dynamically allocated array is leaked.
Instead of the dynamically allocated array you could use the class `std::vector<std::string>`.
Cannot read char8_t from basic_stringstream<char8_t>
This is actually an old issue, not specific to support for `char8_t`. The same issue occurs with `char16_t` or `char32_t` in C++11 and newer. The following gcc bug report has a similar test case.
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508
The issue is also discussed at the following:
- GCC 4.8 and char16_t streams - bug?
- Why does `std::basic_ifstream<char16_t>` not work in c++11?
- http://gcc.1065356.n8.nabble.com/UTF-16-streams-td1117792.html
The issue is that gcc does not implicitly imbue the global locale with facets for `ctype<char8_t>`, `ctype<char16_t>`, or `ctype<char32_t>`. When attempting to perform an operation that requires one of these facets, a `std::bad_cast` exception is thrown from `std::__check_facet` (which is subsequently silently swallowed by the IOS sentry object created for the character extraction operator, and which then sets `badbit` and `failbit`).
The C++ standard only requires that `ctype<char>` and `ctype<wchar_t>` be provided. See [locale.category]p2.
Using char16_t and char32_t in I/O
In the proposal Minimal Unicode support for the standard library (revision 2), it is indicated that there was only support among the Library Working Group for supporting the new character types in strings and codecvt facets. Apparently the majority was opposed to supporting iostream, fstream, facets other than codecvt, and regex.
According to minutes from the Portland meeting in 2006, "the LWG is committed to full support of Unicode, but does not intend to duplicate the library with Unicode character variants of existing library facilities." I haven't found any details; however, I would guess that the committee feels the current library interface is inappropriate for Unicode. One possible complaint could be that it was designed with fixed-size characters in mind, but Unicode completely obsoletes that: while Unicode data can use fixed-size code points, it does not limit characters to single code points.
Personally, I think there's no reason not to standardize the minimal support that's already provided on various platforms (Windows uses UTF-16 for wchar_t; most Unix platforms use UTF-32). More advanced Unicode support will require new library facilities, but supporting char16_t and char32_t in iostreams and facets won't get in the way and would enable basic Unicode I/O.
Compile error for (char based) STL (stream) containers in Visual Studio
Start notes:
- I am using VStudio Community 2015 (v14.0.25431.01 Update 3). The version is important here, since standard header files might change across versions (and line numbers might differ)
- Created [MSDN]: Compile error for STL (stream) containers in Visual Studio
Approaches:
Quick (shallow) investigation
In the VStudio IDE, double-click the 2nd note in the Output window (after attempting to compile the file), and from there repeatedly right-click the relevant macros and choose Go To Definition (F12) from the context menu:
xlocnum (#120): (comment is part of the original file/line)
__PURE_APPDOMAIN_GLOBAL _CRTIMP2_PURE static locale::id id; // unique facet id
yvals.h (#494):
#define _CRTIMP2_PURE _CRTIMP2
crtdefs.h (#29+):
#ifndef _CRTIMP2
#if defined CRTDLL2 && defined _CRTBLD
#define _CRTIMP2 __declspec(dllexport)
#else
#if defined _DLL && !defined _STATIC_CPPLIB
#define _CRTIMP2 __declspec(dllimport) // @TODO - cfati: line #34: Here is the definition
#else
#define _CRTIMP2
#endif
#endif
#endif
As seen, `__declspec(dllimport)` is defined on line #34. Repeating the process on the `_DLL` macro yielded no result. Found on [MSDN]: Predefined Macros: `_DLL` Defined as 1 when the /MD or /MDd (Multithreaded DLL) compiler option is set. Otherwise, undefined.
I thought of 2 possible ways to go on (both resulting in a successful build):

- Use the static version of the CRT Runtime ([MSDN]: /MD, /MT, /LD (Use Run-Time Library)). I don't consider it a viable option, especially when the project consists of .dlls (and it does): bad things can happen (e.g. [SO]: Errors when linking to protobuf 3 on MSVC 2013), or even nastier ones can occur at runtime
- Manually `#undef _DLL` (in main.cpp, before any `#include`). This is a lame workaround (a kludge). It builds fine, but tampering with these things could (and most likely will) trigger Undefined Behavior at runtime

Neither of these 2 options was fully satisfactory, so:
Going a (little) bit deeper
Tried to simplify things even more (main.cpp):
#include <sstream>
//typedef unsigned short CharType; // wchar_t unsigned short
#define CharType unsigned short
int main() {
std::basic_stringstream<CharType> stream;
CharType c = 0x41;
stream << c;
return 0;
}

Notes:

- Replaced `typedef` by `#define` (to strip out new type definition complexity)
- Switched to `unsigned short`, which is `wchar_t`'s definition (`/Zc:wchar_t-`), to avoid any possible type size / alignment differences
"Compiled" the above code with [MSDN]: /E (Preprocess to stdout) and [MSDN]: /EP (Preprocess to stdout Without #line Directives) (so that the warnings/errors only reference line numbers from the current file):

- Generated preprocessed files (using each flag from above): ~1MB+ (~56.5k lines)
- The only difference between the files was the `#define` (`wchar_t` vs. `unsigned short`) somewhere at the very end
- Compiling the files (shockingly :)) yielded the same result: the `wchar_t` one compiled, while the `unsigned short` one failed with the same error
- Added some `#pragma message` statements (yes, they are handled by the preprocessor, but still) in the file that fails (before each warning/note), and noticed some differences between the 2 `#define`s, but so far I have been unable to figure out why [1]
- While browsing the generated file(s), noticed a `template<> struct char_traits<char32_t>` definition, so I gave it a try, and it worked (at least the current program compiled) [1] (and, as expected, `sizeof(char32_t)` is 4). Then, found [MSDN]: char, wchar_t, char16_t, char32_t

Notes:

- Although this fixed my current problem (I still don't know why), I will have to give it a shot on the end goal
- [1] Although I looked over the file, I didn't see any template definitions targeting only the "privileged" types (e.g. I didn't see anything that would differentiate `wchar_t`, `signed char` or `char32_t` from `unsigned short`, for example), so I don't know (yet) why it works for some types but not for others. This is an open topic; whenever I get new updates, I will share them
Bottom line:

As empirically discovered, the following types are allowed when working with char based STL (stream) containers:

- char
- unsigned char
- signed char
- wchar_t
- char16_t
- char32_t
- unsigned short (`/Zc:wchar_t-` only)
Final note(s):
- I will incorporate anything useful (e.g. comments) in the answer
@EDIT0:
Based on @IgorTandetnik's answer on [MSDN]: Compile error for STL (stream) containers in Visual Studio, although there is still a little bit of fog left on:

- `unsigned char` and `signed char`
- the difference between the static and dynamic C++ RTLib

I'm going to accept this as an answer.
Implicit instantiation of undefined template 'std::basic_string<char, std::char_traits<char>, std::allocator<char>>'
You need to include this header:
#include <string>
to upper with char16_t array
The best way to do it is probably something like this:

char16_t upper = std::use_facet<std::ctype<char16_t>>(std::locale()).toupper(ch);

Note, however, that this requires the used `std::locale` to actually contain a `std::ctype<char16_t>` facet; the standard does not require one, so on implementations that don't provide it (e.g. libstdc++, as discussed above) `std::use_facet` throws `std::bad_cast`.