Are int8_t and uint8_t Intended to Be char Types?

Are int8_t and uint8_t intended to be char types?

From § 18.4.1 [cstdint.syn] of the C++0x FDIS (N3290), int8_t is an optional typedef that is specified as follows:

namespace std {
    typedef signed integer type int8_t; // optional
    // ...
} // namespace std

§ 3.9.1 [basic.fundamental] states:

There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.

...

Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type.

§ 3.9.1 also states:

In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

It is tempting to conclude that int8_t may be a typedef of char provided char objects take on signed values; however, this is not the case as char is not among the list of signed integer types (standard and possibly extended signed integer types). See also Stephan T. Lavavej's comments on std::make_unsigned and std::make_signed.

Therefore, either int8_t is a typedef of signed char or it is an extended signed integer type whose objects occupy exactly 8 bits of storage.
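
For a quick sanity check on a given implementation, here is a minimal sketch (assuming a C++11 compiler) using std::is_same:

#include <cstdint>
#include <iostream>
#include <type_traits>

int main() {
    std::cout << std::boolalpha
              // true on typical implementations, where int8_t is signed char
              << std::is_same<std::int8_t, signed char>::value << '\n'
              // always false when int8_t exists: plain char is not a signed integer type
              << std::is_same<std::int8_t, char>::value << '\n';
}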

To answer your question, though, you should not make assumptions. Because functions of both forms x.operator<<(y) and operator<<(x,y) have been defined, § 13.5.3 [over.binary] says that we refer to § 13.3.1.2 [over.match.oper] to determine the interpretation of std::cout << i. § 13.3.1.2 in turn says that the implementation selects from the set of candidate functions according to § 13.3.2 and § 13.3.3. We then look to § 13.3.3.2 [over.ics.rank] to determine that:

  • The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, signed char) template would be called if int8_t is an Exact Match for signed char (i.e. a typedef of signed char).
  • Otherwise, the int8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.

In the case of std::cout << u for u a uint8_t object:

  • The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char) template would be called if uint8_t is an Exact Match for unsigned char.
  • Otherwise, since int can represent all uint8_t values, the uint8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.

If you always want to print a character, the safest and clearest option is:

std::cout << static_cast<signed char>(i);

And if you always want to print a number:

std::cout << static_cast<int>(i);
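
Putting the two together, a small sketch (assuming int8_t is a typedef of signed char, as it is on typical implementations):

#include <cstdint>
#include <iostream>

int main() {
    std::int8_t i = 65;                        // 'A' in ASCII
    std::cout << i << '\n';                    // likely prints A (signed char overload)
    std::cout << static_cast<int>(i) << '\n';  // prints 65 (int overload)
}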

Why do int8_t and user input via cin show strange results?

int8_t is a typedef for an integer type with the required characteristics: a pure two's-complement representation, no padding bits, and a size of exactly 8 bits.

For most (perhaps all) compilers, that means it's going to be a typedef for signed char. (Because of a quirk in the definition of the term signed integer type, it cannot be a typedef for plain char, even if char happens to be signed.)

The >> operator treats character types specially. Reading a character reads a single input character, not a sequence of characters representing some integer value in decimal. So if the next input character is '0', the value read will be the character value '0', which is probably 48.

Since a typedef creates an alias for an existing type, not a new distinct type, there's no way for the >> operator to know that you want to treat int8_t as an integer type rather than as a character type.
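
For example, a minimal sketch of the surprise (again assuming int8_t is signed char):

#include <cstdint>
#include <iostream>

int main() {
    std::int8_t n;
    std::cin >> n;                             // extracts ONE character, e.g. '4' from the input "42"
    std::cout << static_cast<int>(n) << '\n';  // prints 52, the character code for '4', not 4
}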

The problem is that in most implementations there is no 8-bit integer type that's not a character type.

The only workaround is to read into an int variable and then convert to int8_t (with range checks if you need them).
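
A sketch of that workaround, with a range check (the variable names are illustrative):

#include <cstdint>
#include <iostream>
#include <limits>

int main() {
    int tmp;
    std::int8_t value = 0;
    if (std::cin >> tmp &&
        tmp >= std::numeric_limits<std::int8_t>::min() &&
        tmp <= std::numeric_limits<std::int8_t>::max()) {
        value = static_cast<std::int8_t>(tmp);  // tmp fits, so the conversion is safe
        std::cout << static_cast<int>(value) << '\n';
    } else {
        std::cerr << "input missing or out of range for int8_t\n";
    }
}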

Incidentally, int8_t is a signed type; the corresponding unsigned type is uint8_t, which has a range of 0..255.

(One more consideration: if CHAR_BIT > 8, which is permitted by the standard, then neither int8_t nor uint8_t will be defined at all.)
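
Because the exact-width types are optional, portable code can test for them via the companion limit macros, which are only defined when the corresponding types exist; a minimal sketch, assuming a C++11 <cstdint>:

#include <cstdint>
#include <iostream>

int main() {
#if defined(INT8_MAX) && defined(UINT8_MAX)
    std::cout << "int8_t and uint8_t are available\n";
#else
    std::cout << "no exact 8-bit integer types on this implementation\n";
#endif
}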

uint8_t vs unsigned char

It documents your intent - you will be storing small numbers, rather than a character.

Also it looks nicer if you're using other typedefs such as uint16_t or int32_t.
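
For instance, a hypothetical header layout where the fixed-width names make the intent obvious:

#include <cstdint>

// Hypothetical packet header: each field documents its width, and uint8_t
// reads consistently alongside uint16_t and uint32_t.
struct PacketHeader {
    std::uint8_t  version;
    std::uint16_t flags;
    std::uint32_t payload_length;
};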

Why doesn't uint8_t and int8_t work with file and console streams?

The declaration of the template std::basic_ifstream is:

template<
    class CharT,
    class Traits = std::char_traits<CharT>
> class basic_ifstream;

The C++03 Standard (21.1/1) requires the library to define specializations
of std::char_traits<CharT> for CharT = char, wchar_t.

The C++11 Standard (C++11 21.2/1) requires the library to define specializations
of std::char_traits<CharT> for CharT = char,char16_t,char32_t,wchar_t.

If you instantiate std::basic_ifstream<Other> with Other not one of
the two (C++03) or four (C++11) types nominated by the Standard to which you
are compiling, then the behaviour will be undefined, unless you yourself
define my_char_traits<Other> as you require and then instantiate
std::basic_ifstream<Other, my_char_traits<Other>>.

CONTINUED in response to OP's comments.

Requesting a std::char_traits<Other> will not provoke template instantiation
errors: the template is defined so that you may specialize it, but the
default (unspecialized) instantiation is very likely to be wrong for Other,
or indeed for any given CharT, where wrong means it does not satisfy the
Standard's requirements for a character traits class per C++03 § 21.1.1 /
C++11 § 21.2.1.

You suspect that a typedef might thwart the choice of a template specialization
for the typedef'd type, i.e. that the fact that uint8_t and int8_t
are typedefs for fundamental character types might result in std::basic_ifstream<byte>
not being the same as std::basic_ifstream<FCT>, where FCT
is the aliased fundamental character type.

Forget that suspicion. typedef is transparent. It seems you believe one of
the typedefs int8_t and uint8_t must be char, in which case - unless
the typedef was somehow interfering with template resolution -
one of the misbehaving basic_ifstream instantiations you have tested would
have to be std::basic_ifstream<char>.

But then why is typedef char byte harmless? Because the belief that either
int8_t or uint8_t is char is false. You will find that int8_t
is an alias for signed char, while uint8_t is an alias for unsigned char;
and neither signed char nor unsigned char is the same type as char:

C++03/11 § 3.9.1/1

Plain char, signed char, and unsigned char are three distinct types

So both char_traits<int8_t> and char_traits<uint8_t> are default,
unspecialized, instantiations of the char_traits template, and you have
no right to expect that they fulfill the Standard's requirements for a
character traits class.

The one test case in which you found no misbehaviour was for byte = char.
That is because char_traits<char> is a Standard specialization provided
by the library.

The connection between all the misbehaviour you have observed and the
types that you have substituted for SOMECAST in:

std::cout << (SOMECAST)buff; // <------- interesting

is none. Since your testfile contains ASCII text, basic_ifstream<char>
is the one and only instantiation of basic_ifstream that the Standard warrants
for reading it. If you read the file using typedef char byte in your program
then none of the casts that you say you substituted will have an unexpected
result: SOMECAST = char or unsigned char will output a, and
SOMECAST = int or unsigned int will output 97.

All the misbehaviour arises from instantiating basic_ifstream<CharT> with CharT
some type that the Standard does not warrant.
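
A minimal sketch of the warranted approach, reading the ASCII file with basic_ifstream<char> and choosing the cast only at output (the file name is hypothetical):

#include <fstream>
#include <iostream>

int main() {
    std::ifstream in("test.txt");   // basic_ifstream<char>: the warranted instantiation
    char buff;
    if (in.get(buff)) {
        std::cout << buff << '\n';                    // e.g. a
        std::cout << static_cast<int>(buff) << '\n';  // e.g. 97
    }
}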

What's the difference between UInt8 and uint8_t?

In C99 the available basic integer types (the ones without _t) were deemed insufficient, because their actual sizes may vary across different systems.

So, the C99 standard includes definitions of several new integer types to enhance the portability of programs. The new types are especially useful in embedded environments.

All of the new types are suffixed with _t and have an exactly specified width on every system that provides them.

For more info see the fixed-width integer types section of the Wikipedia article on stdint.h.

Difference between uint8_t, uint_fast8_t and uint_least8_t

uint_least8_t is the smallest type that has at least 8 bits.
uint_fast8_t is the fastest type that has at least 8 bits.

You can see the differences by imagining exotic architectures. Imagine a 20-bit architecture. Its unsigned int has 20 bits (one register), and its unsigned char has 10 bits. So sizeof(int) == 2, but using char types requires extra instructions to cut the registers in half. Then:

  • uint8_t: is undefined (no 8 bit type).
  • uint_least8_t: is unsigned char, the smallest type that is at least 8 bits.
  • uint_fast8_t: is unsigned int, because in my imaginary architecture, a half-register variable is slower than a full-register one.
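
On an everyday platform with 8-bit chars the distinction mostly collapses; a quick sketch to inspect what your implementation actually chose (assuming uint8_t exists there):

#include <cstdint>
#include <iostream>

int main() {
    // On common desktop platforms all three are 1 byte; uint_fast8_t in
    // particular is free to be wider if that is faster.
    std::cout << "uint8_t:       " << sizeof(std::uint8_t)       << " byte(s)\n";
    std::cout << "uint_least8_t: " << sizeof(std::uint_least8_t) << " byte(s)\n";
    std::cout << "uint_fast8_t:  " << sizeof(std::uint_fast8_t)  << " byte(s)\n";
}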

Is there any reason not to use fixed width integer types (e.g. uint8_t)?

It's actually quite common to store a number without needing to know the exact size of the type. There are plenty of quantities in my programs that I can reasonably assume won't exceed 2 billion, or enforce that they don't. But that doesn't mean I need an exact 32 bit type to store them, any type that can count to at least 2 billion is fine by me.

If you're trying to write very portable code, you must bear in mind that the fixed-width types are all optional.

On a C99 implementation where CHAR_BIT is greater than 8 there is no int8_t. The standard forbids it to exist because it would have to have padding bits, and intN_t types are defined to have no padding bits (7.18.1.1/1). uint8_t is therefore also forbidden, because (thanks, ouah) an implementation is not permitted to define uint8_t without int8_t.

So, in very portable code, if you need a signed type capable of holding values up to 127 then you should use one of signed char, int, int_least8_t or int_fast8_t according to whether you want to ask the compiler to make it:

  • work in C89 (signed char or int)
  • avoid surprising integer promotions in arithmetic expressions (int)
  • small (int_least8_t or signed char)
  • fast (int_fast8_t or int)

The same goes for an unsigned type up to 255, with unsigned char, unsigned int, uint_least8_t and uint_fast8_t.

If you need modulo-256 arithmetic in very portable code, then you can either take the modulus yourself, mask bits, or play games with bitfields.
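
A minimal sketch of the masking approach, relying only on unsigned int having at least 16 value bits (which the standard guarantees):

#include <iostream>

// Portable modulo-256 addition without relying on uint8_t existing:
// mask the result back down to the low 8 bits.
unsigned add_mod256(unsigned a, unsigned b) {
    return (a + b) & 0xFFu;
}

int main() {
    std::cout << add_mod256(200, 100) << '\n';  // prints 44 (300 mod 256)
}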

In practice, most people never need to write code that portable. At the moment CHAR_BIT > 8 only comes up on special-purpose hardware, and your general-purpose code won't get used on it. Of course that could change in future, but if it does I suspect that there is so much code that makes assumptions about Posix and/or Windows (both of which guarantee CHAR_BIT == 8), that dealing with your code's non-portability will be one small part of a big effort to port code to that new platform. Any such implementation is probably going to have to worry about how to connect to the internet (which deals in octets), long before it worries how to get your code up and running :-)

If you're assuming that CHAR_BIT == 8 anyway then I don't think there's any particular reason to avoid (u)int8_t other than if you want the code to work in C89. Even in C89 it's not that difficult to find or write a version of stdint.h for a particular implementation. But if you can easily write your code to only require that the type can hold 255, rather than requiring that it can't hold 256, then you might as well avoid the dependency on CHAR_BIT == 8.
