How to Convert Character Code to What I Want

Convert character to ASCII numeric value in java

Very simple. Just cast your char as an int.

char character = 'a';    
int ascii = (int) character;

In your case, you need to get the specific character from the String first and then cast it.

char character = name.charAt(0); // This gives the character 'a'
int ascii = (int) character; // ascii is now 97.

The explicit cast is not required, but it improves readability.

int ascii = character; // Even this will do the trick.
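
To get the code of every character rather than just the first, the same implicit widening can be applied in a loop. This is only a minimal sketch; the class and method names (CharCodes, codesOf) are made up for illustration.

```java
import java.util.Arrays;

class CharCodes {
    // Returns the numeric value of each char in s. These are UTF-16 code
    // units, which match ASCII for characters in the 0-127 range.
    static int[] codesOf(String s) {
        int[] codes = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            codes[i] = s.charAt(i); // implicit char -> int widening, no cast needed
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(codesOf("abc"))); // [97, 98, 99]
    }
}
```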

Convert character to ASCII code in JavaScript

"\n".charCodeAt(0); // 10

How do I convert a single character code to a `char` given a character set?

So, a couple of things.

First of all the page you linked to says this about the code point range in question:

The extended ASCII codes (character code 128-255)

There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 128-159 contain the Microsoft® Windows Latin-1 extended characters.

This is incorrect, or at least, to me, misleadingly worded. ISO 8859-1 / Latin-1 does not assign a printable character to code point 146 (the 128-159 range is left to control codes). So that's already asking for trouble. You can also see this if you do the conversion through String:

String s = new String(new byte[] {(byte)146}, "iso-8859-1");
System.out.println(s);

This outputs the same "unexpected" result. It appears that what they are actually referring to is the Windows-1252 set (aka "Windows Latin-1", but this name is almost completely obsolete these days), which does define that code point as a right single quote. Other charsets also provide this character at 146; look for encodings that define it at 0x92. We can verify the Windows-1252 mapping like this:

String s = new String(new byte[] {(byte)146}, "windows-1252");
System.out.println(s);

So the first problem is that the page is confusing.

But the big mistake is that you can't do what you're trying to do in the way you are doing it. A char in Java is a UTF-16 code unit, not a byte in an arbitrary charset. A single char corresponds to a code point in the BMP; a supplementary character (> 0xFFFF) needs a pair of chars (a surrogate pair), or an int if you want to hold the full code point in one value.

Unfortunately, Java doesn't really expose a lot of API for single-character conversions. Even Character doesn't have any readily available ways to convert from the charset of your choice to UTF-16.

So one option is to do it via String as hinted at in the examples above, e.g. express your code points as a raw byte[] array and convert from there:

String s = new String(new byte[] {(byte)146}, "windows-1252");
System.out.println(s);
char c = s.charAt(0);
System.out.println(c);

You could grab the char again via s.charAt(0). Note that you have to be mindful of your character set when doing this. Here we know that our byte sequence is valid for the specified encoding, and we know that the result is only one char long, so we can do this.

However, you have to watch out for things in the general case. For example, perhaps your byte sequence and character set yield a result that is in the UTF-16 supplementary character range. In that case s.charAt(0) would not be sufficient and s.codePointAt(0) stored in an int would be required instead.
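
To make that caveat concrete, here is a small sketch using the UTF-8 bytes of U+1F600, a code point outside the BMP. The choice of that code point and the class name SupplementaryDemo are mine, not something from the question.

```java
import java.nio.charset.StandardCharsets;

class SupplementaryDemo {
    // UTF-8 encoding of U+1F600, a code point above 0xFFFF.
    static final byte[] BYTES = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80};

    static String decode() {
        return new String(BYTES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = decode();
        System.out.println(s.length());                 // 2: one code point, two chars
        System.out.printf("0x%x%n", (int) s.charAt(0)); // 0xd83d: only the high surrogate
        System.out.printf("0x%x%n", s.codePointAt(0));  // 0x1f600: the full code point
    }
}
```

Here charAt(0) returns half of the character, which is why the int-based codePointAt is the safer choice when the input is not known to stay within the BMP.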

As an alternative, with the same caveats, you could use Charset to decode, although it's just as clunky, e.g.:

Charset cs = Charset.forName("windows-1252");
CharBuffer cb = cs.decode(ByteBuffer.wrap(new byte[] {(byte)146}));
char c = cb.get(0);
System.out.println(c);

Note that I am not entirely sure how Charset#decode handles supplementary characters and can't really test right now (but anybody, feel free to chime in).
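
For what it's worth, a CharBuffer holds UTF-16 code units just like a String does, so a supplementary character should come back from Charset#decode as a surrogate pair. The sketch below checks this with the UTF-8 bytes of U+1F600; the example code point and the class name CharsetDecodeDemo are assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

class CharsetDecodeDemo {
    // Decodes the UTF-8 bytes of U+1F600 and returns the resulting buffer.
    static CharBuffer decodeSample() {
        byte[] bytes = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80};
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    }

    public static void main(String[] args) {
        CharBuffer cb = decodeSample();
        System.out.println(cb.remaining());                        // 2 chars for one code point
        System.out.println(Character.isHighSurrogate(cb.get(0)));  // true
        System.out.printf("0x%x%n", Character.codePointAt(cb, 0)); // 0x1f600
    }
}
```

So cb.get(0) has the same limitation as s.charAt(0); Character.codePointAt(cb, 0) recovers the whole code point.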


As an aside: in your case, 146 (0x92) cast directly to char corresponds to the Unicode character U+0092, "PRIVATE USE TWO", and all bets are off for what you'll end up displaying there. This character is classified by Unicode as a control character, and it falls in the range of characters reserved for terminal control (although AFAIK it isn't actually used, it's in that range regardless). I wouldn't be surprised if browsers in some locales rendered it as a right single quote for compatibility while terminals did something weird with it.
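
The difference between the direct cast and the charset-aware conversion can be checked with a short sketch. The class and method names (C1ControlDemo, castDirectly, decodeWindows1252) are hypothetical.

```java
import java.nio.charset.Charset;

class C1ControlDemo {
    // Reinterprets the number as a UTF-16 code unit, ignoring any charset.
    static char castDirectly(int code) {
        return (char) code;
    }

    // Treats the number as a single windows-1252 byte and decodes it.
    static char decodeWindows1252(int code) {
        String s = new String(new byte[] {(byte) code}, Charset.forName("windows-1252"));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        char direct = castDirectly(146); // U+0092
        System.out.println(Character.getType(direct) == Character.CONTROL); // true
        System.out.println(Character.isISOControl(direct));                 // true
        System.out.printf("0x%x%n", (int) decodeWindows1252(146));          // 0x2019
    }
}
```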

Also, FYI, the official Unicode code point for the right single quote is U+2019 (0x2019). You could reliably store that in a char by using that value, e.g.:

System.out.println((char)0x2019);

You can also see this for yourself by looking at the value after the conversion from windows-1252:

String s = new String(new byte[] {(byte)146}, "windows-1252");
char c = s.charAt(0);
System.out.printf("0x%x\n", (int)c); // outputs 0x2019

Or, for completeness:

String s = new String(new byte[] {(byte)146}, "windows-1252");
int cp = s.codePointAt(0);
System.out.printf("0x%x\n", cp); // outputs 0x2019

How to get the ASCII value of a character

From here:

The function ord() gets the int value of the char. And in case you want to convert back after playing with the number, function chr() does the trick.

>>> ord('a')
97
>>> chr(97)
'a'
>>> chr(ord('a') + 3)
'd'
>>>

In Python 2, there was also the unichr function, returning the Unicode character whose ordinal is the unichr argument:

>>> unichr(97)
u'a'
>>> unichr(1234)
u'\u04d2'

In Python 3 you can use chr instead of unichr.


ord() - Python 3.6.5rc1 documentation

ord() - Python 2.7.14 documentation

How to convert ASCII code (0-255) to its corresponding character?

Character.toString((char) i);
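
A slightly fuller sketch of the same cast, with a range check added. Note that the cast interprets the number as a Unicode code point, so values 128-255 come out as Latin-1 characters rather than whatever an "extended ASCII" table might list. The class and method names (CodeToChar, fromCode) are made up for illustration.

```java
class CodeToChar {
    // Converts a code in the range 0-255 to a one-character String.
    static String fromCode(int i) {
        if (i < 0 || i > 255) {
            throw new IllegalArgumentException("code out of range: " + i);
        }
        return Character.toString((char) i);
    }

    public static void main(String[] args) {
        System.out.println(fromCode(65)); // A
        System.out.println(fromCode(97)); // a
    }
}
```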

Converting char to ASCII and display in C#?

Try

int asciiCode = (int)'c';
string asciiStr = asciiCode.ToString();

hope it helps

Convert a character digit to the corresponding integer in C

As per other replies, this is fine:

char c = '5';
int x = c - '0';

Also, for error checking, you may wish to check isdigit(c) is true first. Note that you cannot completely portably do the same for letters, for example:

char c = 'b';
int x = c - 'a'; // x is now not necessarily 1

The standard guarantees that the char values for the digits '0' to '9' are contiguous, but makes no guarantees for other characters like letters of the alphabet.
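
For comparison, the same digit conversion can be sketched in Java, the language used in the earlier answers. Unlike C, Java fixes char to UTF-16, so letter arithmetic such as 'b' - 'a' is well defined there. The class and method names (DigitValue, digitValue) are hypothetical.

```java
class DigitValue {
    // Returns 0-9 for the ASCII digits '0' to '9', or -1 for anything else.
    static int digitValue(char c) {
        if (c < '0' || c > '9') {
            return -1; // not an ASCII digit
        }
        return c - '0';
    }

    public static void main(String[] args) {
        System.out.println(digitValue('5'));          // 5
        System.out.println(digitValue('x'));          // -1
        System.out.println(Character.digit('5', 10)); // 5, using the built-in helper
        System.out.println('b' - 'a');                // 1
    }
}
```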

How to convert characters into ASCII code?

I guess you mean utf8ToInt, see the R manuals:

utf8ToInt("Hello")
# [1] 72 101 108 108 111

Or, if you want a mapping of the letters to their codes:

sapply(strsplit("Hello", NULL)[[1L]], utf8ToInt)
#   H   e   l   l   o
#  72 101 108 108 111

How to convert an ASCII character into an int in C

What about:

int a_as_int = (int)'a';

