Are the Character Digits ['0'..'9'] Required to Have Contiguous Numeric Values

Are the character digits ['0'..'9'] required to have contiguous numeric values?

Indeed, I had not looked hard enough. In section 2.3 (Character sets), item 3:

In both the source and execution basic character sets, the value of each character after 0 in the
above list of decimal digits shall be one greater than the value of the previous.

And this is the above list of decimal digits:

0 1 2 3 4 5 6 7 8 9

Therefore, an implementation must use a character set where the decimal digits have a contiguous representation. Optimizations that rely on this property are safe. However, optimizations that rely on the contiguity of other characters (e.g. 'a'..'z') are not portable with respect to the standard (see also header <cctype>). If you rely on such a property, make sure to assert it.
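One way to assert the letter property is a small runtime check, sketched below in C. The function name lowercase_is_contiguous is hypothetical and not taken from the standard or from the answer above.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if 'a'..'z' occupy 26 consecutive
   values in the execution character set, 0 otherwise (as happens
   with EBCDIC, where the lowercase letters have gaps). */
int lowercase_is_contiguous(void)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    size_t i;

    for (i = 1; i < strlen(alphabet); i++) {
        /* Each letter must be exactly one greater than the previous. */
        if (alphabet[i] != alphabet[i - 1] + 1)
            return 0;
    }
    return 1;
}
```

A program could call assert(lowercase_is_contiguous()) once at startup before using expressions such as c - 'a'.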

Convert a character digit to the corresponding integer in C

As per other replies, this is fine:

char c = '5';
int x = c - '0';

Also, for error checking, you may wish to check isdigit(c) is true first. Note that you cannot completely portably do the same for letters, for example:

char c = 'b';
int x = c - 'a'; // x is now not necessarily 1

The standard guarantees that the char values for the digits '0' to '9' are contiguous, but makes no guarantees for other characters like letters of the alphabet.
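The two cases can be put side by side in a minimal sketch. The helper names digit_value and letter_index are made up for illustration. The letter version uses a lookup string rather than subtraction, so it does not depend on how the letters are encoded.

```c
#include <ctype.h>
#include <string.h>

/* Returns 0..9 for a decimal digit character, or -1 for anything else. */
int digit_value(char c)
{
    /* isdigit requires a value representable as unsigned char (or EOF). */
    if (!isdigit((unsigned char)c))
        return -1;
    return c - '0';   /* safe: digits are guaranteed contiguous */
}

/* Returns 0..25 for a lowercase letter 'a'..'z', or -1 for anything else.
   Uses a table lookup, so it makes no assumption about letter encoding. */
int letter_index(char c)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')    /* strchr would otherwise match the terminator */
        return -1;
    p = strchr(alphabet, c);
    return p ? (int)(p - alphabet) : -1;
}
```

With these, digit_value('5') yields 5 and letter_index('b') yields 1 on any conforming implementation.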

What is the purpose of using str[i]-'0' where str is a string?

If str contains stringified digits, then str[i] - '0' converts the character at position i to its numeric value. This works with ASCII, EBCDIC, and any other encoding in which the digits '0' to '9' are contiguous.
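As an illustration of the idiom in a loop, here is a short sketch that adds up the digits of a string. The function name digit_sum is an assumption chosen for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Sums the decimal digits found in str, skipping non-digit characters. */
int digit_sum(const char *str)
{
    int sum = 0;
    size_t i;

    for (i = 0; str[i] != '\0'; i++) {
        if (isdigit((unsigned char)str[i]))
            sum += str[i] - '0';   /* character digit -> numeric value */
    }
    return sum;
}
```

For the input "2024" this adds 2 + 0 + 2 + 4.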

Why is the regex to match 1 to 10 written as [1-9]|10 and not [1-10]?

Sometimes a good drawing is worth 1000 words...

Here are the three propositions in your question and the way a regex flavour would understand them:

[1-9]|10

[Regular expression image: matches either a single digit from 1 to 9, or the literal sequence 10]

[1-10]

[Regular expression image: a character class matching a single character, either 1 (the range 1-1) or 0]

[1-(10)]

Invalid regexp!

This regex is invalid because a range is opened with a digit (1-) but is not closed with another digit; it ends with ( instead.

A range is usually bounded by digits on both sides or by letters on both sides.

Images generated with Debuggex
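The difference between the first two patterns can also be checked in code. The sketch below assumes a POSIX system with <regex.h> and uses extended regular expressions; the wrapper name regex_matches is hypothetical. The patterns are anchored with ^ and $ so that the whole input has to match.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper around POSIX regcomp/regexec.
   Returns 1 if text matches pattern, 0 if it does not,
   and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, "^([1-9]|10)$" accepts "7" and "10" but rejects "0" and "11", while "^[1-10]$" accepts only the single characters "0" and "1".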

What does string - '0' do (string is a char)

This subtracts the ASCII code of the character '0' from the character that string points to. So '0' - '0' gives you 0, and so on up to '9' - '0', which gives you 9.

The entire loop is basically calculating "manually" the numerical value of the decimal integer in the string string points to.

The expression (i << 3) + (i << 1) is a way of writing i * 10: i << 3 is equivalent to i * 8 and i << 1 is equivalent to i * 2, so their sum is i * 8 + i * 2, or i * 10.
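The loop from the question is not reproduced here, so the following is only a reconstruction of the general technique, with a hypothetical function name parse_decimal. It uses an unsigned accumulator and performs no overflow checking.

```c
/* Parses the leading decimal digits of s into an unsigned value.
   Stops at the first non-digit character. No overflow check. */
unsigned parse_decimal(const char *s)
{
    unsigned n = 0;

    while (*s >= '0' && *s <= '9') {
        /* (n << 3) + (n << 1) is n * 8 + n * 2, i.e. n * 10 */
        n = (n << 3) + (n << 1) + (unsigned)(*s - '0');
        s++;
    }
    return n;
}
```

Modern compilers typically generate equally good code for n * 10, so the shift form is mostly of historical interest.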

Why does adding a '0' to an int digit allow conversion to a char?

If you look at an ASCII table, you'll see that the digits start at 48 (for '0') and go up to 57 (for '9'). So to get the character code for a digit, you add that digit's value to the character code of '0'.
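The reverse conversion can be sketched as follows. The name digit_to_char and the '?' fallback for out-of-range input are choices made for this example only.

```c
/* Returns the character for digit d (0..9), or '?' if d is out of range. */
char digit_to_char(int d)
{
    if (d < 0 || d > 9)
        return '?';
    /* '0' + d works in any conforming character set, not only ASCII. */
    return (char)('0' + d);
}
```

In ASCII, digit_to_char(7) computes 48 + 7 = 55, which is the code for '7'.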


