char!=(signed char), char!=(unsigned char)

Here is your answer from the standard:

3.9.1 Fundamental types [basic.fundamental]

Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (basic.types); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.
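The properties in the quoted paragraph can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the helper name plain_char_is_signed is made up for illustration and does not come from the standard.

```cpp
#include <cassert>
#include <climits>
#include <limits>
#include <type_traits>

// Plain char, signed char and unsigned char are three distinct types.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");
static_assert(!std::is_same<signed char, unsigned char>::value,
              "signed char and unsigned char are distinct types");

// All three occupy the same amount of storage and share alignment.
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char),
              "same storage size");
static_assert(alignof(char) == alignof(signed char) &&
              alignof(char) == alignof(unsigned char),
              "same alignment");

// Plain char takes on the values of exactly one of the other two types.
static_assert((CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX) ||
              (CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX),
              "char has the range of signed char or of unsigned char");

// Hypothetical helper: reports which choice this implementation made.
bool plain_char_is_signed() {
    const bool is_signed = std::numeric_limits<char>::is_signed;
    // CHAR_MIN is negative exactly when plain char is signed.
    assert(is_signed == (CHAR_MIN < 0));
    return is_signed;
}
```

If the file compiles, every static_assert held; the return value of plain_char_is_signed() depends on the platform and compiler flags.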

Difference between signed / unsigned char

There's no dedicated "character type" in the C language. char is an integer type, the same (in that regard) as int, short and the other integer types. char just happens to be the smallest integer type. So, just like any other integer type, it can be signed or unsigned.

It is true that (as the name suggests) char is mostly intended to be used to represent characters. But characters in C are represented by their integer "codes", so there's nothing unusual in the fact that an integer type char is used to serve that purpose.

The only general difference between char and other integer types is that plain char is not synonymous with signed char, while with other integer types the signed modifier is optional/implied.
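As a small illustration of char behaving like any other integer type, the sketch below does arithmetic directly on character codes. It is written in C++ for the static checks; the helper names digit_value and next_char are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// For int and short, the signed keyword is optional: both spellings name one type.
static_assert(std::is_same<int, signed int>::value, "int is signed int");
static_assert(std::is_same<short, signed short>::value, "short is signed short");

// Arithmetic on a char promotes it to int, like any other small integer type.
static_assert(std::is_same<decltype('a' + 1), int>::value, "char + int yields int");

// Hypothetical helper: numeric value of a decimal digit character.
// The codes of '0'..'9' are guaranteed to be consecutive.
int digit_value(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';   // plain integer subtraction on the character codes
}

// Hypothetical helper: the character whose code is one greater.
char next_char(char c) {
    return static_cast<char>(c + 1);
}
```

For example, digit_value('7') yields 7 and next_char('0') yields '1', with no character-specific machinery involved.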

comparison between signed and unsigned char

Your question is not stupid at all. You were close to the solution: i is assigned the value -3 but the implicit conversion to the type of i, unsigned char, changes the value to 253.

For a more precise explanation, there are multiple issues in your test code:

  • char may be signed or unsigned depending on the platform and compiler configuration, so char cnt = -1; may store the value -1 or 255 into cnt, or even some other value if char is unsigned and has more than 8 bits.

  • The behavior of for (i = cnt - 2; i < cnt; i--) also depends on whether char is signed or unsigned by default:

    • in all cases, the test i < cnt is evaluated with both operands converted to int (or unsigned int in the rare case where sizeof(int)==1). If int can represent all values of types char and unsigned char, this conversion does not change the values.

    • if char is unsigned and has 8 bits, cnt has the value 255 so i is initialized with the value 253 and the loop runs 254 times with i from 253 down to 0, then i-- stores the value 255 again into i, for which the test i < cnt evaluates to false. The loop prints 507, then 759, ... 32385.

    • if char is signed and has 8 bits, as is probably the case on your system, cnt has the value -1 and i is initialized with the value -3 converted to unsigned char, which is 253. The initial test i < cnt evaluates as 253 < -1, which is false, causing the loop body to be skipped immediately.

You can force char to be unsigned by default by giving the compiler the appropriate flag (e.g. gcc -funsigned-char) and test how the behavior changes. Using Godbolt's Compiler Explorer, you can see that gcc generates just 2 instructions to return 0 in the signed (default) case, and code that produces the expected output in the unsigned case.
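Both cases can be reproduced without changing compiler flags by choosing the type of cnt explicitly. This is a sketch only: the template count_iterations is a made-up name, the loop body from the question is replaced by a counter, and 8-bit characters are assumed.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit characters");

// Hypothetical helper: counts how often the loop from the question runs
// when cnt has the character type CharT.
template <typename CharT>
int count_iterations() {
    CharT cnt = static_cast<CharT>(-1);   // -1 if CharT is signed, 255 if unsigned
    unsigned char i;
    int iterations = 0;
    // cnt - 2 is computed as an int and then converted to unsigned char;
    // in i < cnt both operands are promoted to int before comparing.
    for (i = static_cast<unsigned char>(cnt - 2); i < cnt; i--) {
        ++iterations;
        assert(iterations <= UCHAR_MAX);  // guard against a runaway loop
    }
    return iterations;
}
```

count_iterations<signed char>() returns 0 because 253 < -1 is false on the first test, while count_iterations<unsigned char>() returns 254. count_iterations<char>() matches whichever of the two the implementation picked for plain char.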

Why doesn't C++ accept signed or unsigned char for arrays of characters

"z1y2x3w4" is const char[9] and there is no implicit conversion from const char* to const signed char*.

You could use reinterpret_cast:

const signed char *AnArrayOfStrings[] = {
    reinterpret_cast<const signed char *>("z1y2x3w4"),
    reinterpret_cast<const signed char *>("Aname")
};
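A self-contained version of the same idea is sketched below. The helpers as_signed and length_of are hypothetical names added for illustration. Reading through the cast pointer behaves as expected on mainstream compilers, but take this as a sketch rather than a statement about what the aliasing rules guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A string literal is an array of const char, terminator included.
static_assert(std::is_same<decltype("z1y2x3w4"), const char (&)[9]>::value,
              "the literal has type const char[9]");

// Pointers to different character types do not convert implicitly.
static_assert(!std::is_convertible<const char*, const signed char*>::value,
              "no implicit const char* -> const signed char* conversion");

// Hypothetical helper: wraps the cast so each literal is written only once.
const signed char* as_signed(const char* s) {
    return reinterpret_cast<const signed char*>(s);
}

const signed char* AnArrayOfStrings[] = { as_signed("z1y2x3w4"), as_signed("Aname") };

// Hypothetical helper: counts characters up to the terminating zero.
std::size_t length_of(const signed char* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}
```

With this in place, length_of(AnArrayOfStrings[0]) is 8 and length_of(AnArrayOfStrings[1]) is 5.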

Char, unsigned char and signed char as char&

They're different types.

It doesn't matter that they have the same numerical range.

You can't bind a reference-to-OneThing to an object of SomeOtherThing.

This is the very purpose of a type system: to make constraints in your program so that you don't make mistakes.

Options:

  1. void f(const char&)

    References to const are special. When you pass a signed char here, it is automatically converted to a temporary char. Temporaries can bind to const references. However, what would be the purpose of this if you can't change the original value? Might as well just pass by value.

  2. void f(char)

    This is pass by value. Now anything that can implicitly convert to char (such as signed char) will be accepted, though the original value will no longer be "connected" within f. You will also need to make sure yourself that the numerical range matches: when passing an unsigned char, that may not be the case.

  3. Type punning

    Your calling scope can do f(reinterpret_cast<char&>(mySignedChar)) and it'll work, because there are special rules for aliasing/punning chars in this way. However, this is a hack, and (contrary to popular belief) is not legal for most other types.

  4. Make your types consistent

    This is what the language wants you to do. Why do you have a function taking char, but a caller passing signed char? Why do your types not match? If this is beyond your control (e.g. different idioms from different third-party libraries) then you can play around with type punning if really necessary, though this should be a last resort.
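The first three options can be put side by side in a short sketch. All function names here (increment, read_by_const_ref, read_by_value, bump_via_pun) are hypothetical stand-ins for the f in the question.

```cpp
#include <cassert>
#include <climits>

// Stand-in for the original function: it needs a genuine char lvalue.
void increment(char& c) { ++c; }

// Option 1: a reference to const binds to a temporary char made from the argument.
char read_by_const_ref(const char& c) { return c; }

// Option 2: pass by value; the argument is converted to char.
char read_by_value(char c) { return c; }

// Option 3: view the signed char object as a char so increment() can modify it.
signed char bump_via_pun(signed char v) {
    assert(v >= 0 && v < SCHAR_MAX);   // stay in the range all char types share
    // increment(v);                   // error: char& cannot bind to a signed char
    increment(reinterpret_cast<char&>(v));
    return v;
}
```

Options 1 and 2 only read a converted copy, so the caller's signed char is never modified; only the punning version lets increment() change the original object.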

In the C++11 standard, why leave the char type implementation dependent?

Some processors prefer signed char, and others prefer unsigned char. For example, POWER can load an 8-bit value from memory with zero extension, but not sign extension. But SuperH-3 can load an 8-bit value from memory with sign extension but not zero extension. C++ derives from C, and C leaves many details of the language implementation-defined so that each implementation can be tailored to be most efficient for its target environment.

Types of char in C++ (confused by C++ primer explanation)

If an integer type is signed it means that it holds negative and positive values and the value 0.

If an integer type is unsigned it means that it holds positive values and the value 0. It cannot hold negative values.

Some types have "more names". E.g. long, long int and signed long int are all names for the same type.

Let's look at most of the integer types first. E.g. int and signed int are the exact same type: two names for one type. This type is signed (it holds negative and positive values and the value 0). unsigned int is a different type; it is unsigned, i.e. it only holds positive values and the value zero. This pattern repeats for all the other integer types (short, long, etc.).

The exception to this is char: char, unsigned char and signed char are 3 different types. signed char is signed, unsigned char is unsigned, and char can be either signed or unsigned depending on the compiler and platform.

Now what does it mean that two types are different? For beginners it really doesn't make much of an impact. This fact comes more into play in function overloading and meta programming.
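Function overloading is the easiest place to see the difference. In this sketch, which and demo_overloads are made-up names; the three character overloads can coexist precisely because the types are distinct, while int and signed int cannot be overloaded against each other.

```cpp
#include <cassert>
#include <string>

// Three separate overloads: legal because the three char types are distinct.
std::string which(char)          { return "char"; }
std::string which(signed char)   { return "signed char"; }
std::string which(unsigned char) { return "unsigned char"; }

// int and signed int are one type, so only one overload is possible.
std::string which(int)           { return "int"; }
// std::string which(signed int) { return "int"; }   // error: redefinition

// Hypothetical demo: a character literal has type char, whatever its signedness.
void demo_overloads() {
    assert(which('a') == "char");
    signed int n = 1;
    assert(which(n) == "int");
}
```

Note that which('a') selects the char overload regardless of whether plain char is signed on the platform.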

c: type casting char values into unsigned short

The type char is not a "third" signedness. It has the same range, representation and behavior as either signed char or unsigned char (while remaining a separate type), and which one it matches is implementation-defined.

This is dictated by section 6.2.5p15 of the C standard:

The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.

It appears that on your implementation, char is the same as signed char, so because the value is negative and because the destination type is unsigned it must be converted.

Section 6.3.1.3 dictates how conversions between integer types occur:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

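Since the value 0x80 == -128 cannot be represented in an unsigned short, the conversion in paragraph 2 occurs: one more than the maximum value of unsigned short is added, which gives 65408 (0xFF80) when unsigned short is 16 bits wide.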
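The arithmetic of paragraph 2 can be checked against the built-in conversion. This is a sketch in C++ (whose conversion rule gives the same result here); widen and by_hand are hypothetical names, and it assumes long long is wider than unsigned short.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: the conversion the compiler performs for the cast.
unsigned short widen(signed char c) {
    return static_cast<unsigned short>(c);
}

// Hypothetical helper: paragraph 2 done by hand, adding one more than the
// maximum value of unsigned short until the value is in range.
long long by_hand(signed char c) {
    const long long modulus =
        static_cast<long long>(std::numeric_limits<unsigned short>::max()) + 1;
    long long v = c;
    while (v < 0) v += modulus;
    assert(v >= 0 && v < modulus);   // the result now fits in unsigned short
    return v;
}
```

For -128 both functions agree; with a 16-bit unsigned short the shared result is 65408, i.e. -128 + 65536.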


