How to Convert Unicode Character to Int in Swift

Hex to Int

If you are starting with \u{0D85} and you know the hex value of the Unicode character, then you might as well write it in the following format because it is an Int already.

let myInt = 0x0D85                          // Int: 3461

String to Int

I assume, though, that you have "\u{0D85}" (in quotes), which makes it a String by default. And since it is a String, you can't be certain that you will only have a single Int value for the general case.

let myString = "\u{0D85}"

for element in myString.unicodeScalars {
    let myInt = element.value // UInt32: 3461
}

I could have also used myString.utf16 and let myInt = Int(element), but I find it easier to deal with Unicode scalar values (UTF-32) when there is a possibility of things like emoji. (See this answer for more details.)
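
To illustrate the difference, here is a small sketch that reads the same string through both views. The sample string and the variable names are my own assumptions, chosen only to show what happens with an emoji:

```swift
let sample = "a\u{1F34E}" // "a" followed by U+1F34E (an emoji)

// UTF-16 view: the emoji is split into a surrogate pair
let utf16Values = sample.utf16.map { Int($0) }

// Unicode scalar view: one value per code point
let scalarValues = sample.unicodeScalars.map { Int($0.value) }

print(utf16Values)  // [97, 55356, 57166]
print(scalarValues) // [97, 127822]
```

With utf16 you get three integers for two code points, which is why the scalar view is usually easier to reason about.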

Character to Int

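In older Swift versions, Character (which is an extended grapheme cluster) does not have a utf16 or unicodeScalars property, so if you are starting with a Character, convert it to a String first and then follow the directions in the String to Int section above. Recent Swift versions expose unicodeScalars on Character directly, but the String route works everywhere.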

let myChar: Character = "\u{0D85}"
let myString = String(myChar)
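
In more recent Swift versions (Swift 5, as far as I know), Character has its own unicodeScalars view, so the String detour can be skipped. A minimal sketch under that assumption; the variable names are only illustrative:

```swift
let myChar: Character = "\u{0D85}"

// Assumes a Swift version in which Character has a unicodeScalars view
let values = myChar.unicodeScalars.map { Int($0.value) }
print(values) // [3461]

// A Character can consist of several scalars, so only take the first
// one if you know it is a single code point
if let first = myChar.unicodeScalars.first {
    let myInt = Int(first.value)
    print(myInt) // 3461
}
```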

How to convert an Int to a Character in Swift

You can't convert an integer directly to a Character instance, but you can go from integer to UnicodeScalar to Character and back again:

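let startingValue = Int(("A" as UnicodeScalar).value) // 65
for i in 0 ..< 26 {
    // In Swift 3 and later, UnicodeScalar(_:) is failable for UInt32 values
    if let scalar = UnicodeScalar(UInt32(i + startingValue)) {
        print(Character(scalar))
    }
}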
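
The same round trip can be wrapped in a pair of small helpers that return nil for values that are not valid Unicode scalars (surrogates, for example). The function names are hypothetical and not part of the answer above:

```swift
// Hypothetical helper: Int -> Character, nil if the value is not a valid scalar
func character(fromCodePoint value: Int) -> Character? {
    guard let raw = UInt32(exactly: value),
          let scalar = UnicodeScalar(raw) else {
        return nil
    }
    return Character(scalar)
}

// Hypothetical helper: Character -> Int, using the first scalar only
func codePoint(of character: Character) -> Int? {
    return String(character).unicodeScalars.first.map { Int($0.value) }
}

if let letter = character(fromCodePoint: 65) {
    print(letter)                          // A
    print(codePoint(of: letter) ?? -1)     // 65
}
print(character(fromCodePoint: 0xD800) == nil) // true (a surrogate is not a scalar)
```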

Convert Character to Int in Swift 2.0

In Swift 2.0, toInt(), etc., have been replaced with initializers. (In this case, Int(someString).)

Because not all strings can be converted to ints, this initializer is failable, which means it returns an optional int (Int?) instead of just an Int. The best thing to do is unwrap this optional using if let.

I'm not sure exactly what you're going for, but this code works in Swift 2, and accomplishes what I think you're trying to do:

let unsolved = "123abc"

var fileLines = [Int]()

for i in unsolved.characters {
    let someString = String(i)
    if let someInt = Int(someString) {
        fileLines += [someInt]
    }
    print(i)
}

Or, for a Swiftier solution:

let unsolved = "123abc"

let fileLines = unsolved.characters.filter({ Int(String($0)) != nil }).map({ Int(String($0))! })

// fileLines = [1, 2, 3]

You can shorten this more with flatMap:

let fileLines = unsolved.characters.flatMap { Int(String($0)) }

flatMap returns "an Array containing the non-nil results of mapping transform over self"… so when Int(String($0)) is nil, the result is discarded.
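
For anyone on a newer toolchain: the snippets above are Swift 2. As far as I know, from Swift 4.1 on the nil-dropping flatMap is called compactMap, and a String can be iterated directly without .characters. A sketch of the same idea under those assumptions:

```swift
let unsolved = "123abc"

// compactMap drops the nil results, like the Swift 2 flatMap above
let fileLines = unsolved.compactMap { Int(String($0)) }
print(fileLines) // [1, 2, 3]

// Character.wholeNumberValue (Swift 5) avoids the String round trip
let digits = unsolved.compactMap { $0.wholeNumberValue }
print(digits) // [1, 2, 3]
```

Note that wholeNumberValue also recognizes digits from other scripts, so the two versions can differ on non-ASCII input.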

Convert Character to Int in Swift

There is no need to work with characters, but your code to create an
array with the decimal digits of the input number can be greatly
simplified:

var checkSumArray = [Int]()
var tmp = number
while tmp > 0 {
    checkSumArray.append(tmp % 10)
    tmp /= 10
}
// Swift 3 and later: reverse() works in place.
// In Swift 2 this was: checkSumArray = checkSumArray.reverse()
checkSumArray.reverse()
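
Wrapped in a function, the same loop might look like the sketch below. It assumes Swift 3 or later and a non-negative input; the function name is hypothetical:

```swift
// Hypothetical helper: decimal digits of a non-negative number, most significant first
func decimalDigits(of number: Int) -> [Int] {
    var digits = [Int]()
    var tmp = number
    while tmp > 0 {
        digits.append(tmp % 10) // least significant digit first
        tmp /= 10
    }
    digits.reverse()            // in place, so the order matches the written number
    return digits
}

print(decimalDigits(of: 4096)) // [4, 0, 9, 6]
print(decimalDigits(of: 0))    // [] (the loop never runs for zero)
```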

Converting Unicode in Swift

This answer suggests using the NSString method stringByFoldingWithOptions.
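
In Swift 3 and later that method is spelled folding(options:locale:) on String (it needs Foundation). A minimal sketch, with a sample string I made up for illustration:

```swift
import Foundation

let original = "Café Zürich"

// Strip diacritics; other options such as .caseInsensitive can be combined
let folded = original.folding(options: .diacriticInsensitive, locale: nil)
print(folded) // Cafe Zurich
```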

The Swift String class has a concept called a "view" which lets you operate on the string under different encodings. It's pretty neat, and there are some views that might help you.

If you're dealing with strings in Swift, read this excellent post by Mike Ash. He discusses the idea of what a string really is with great detail and has some helpful hints for Swift 2.

Getting character ASCII value as an Integer in Swift

In Swift, a Character is not necessarily an ASCII character. For example, it would make no sense to return the ASCII value of an emoji, which requires a much larger Unicode encoding. This is why the asciiValue property is optional: its type is UInt8?.

The simplest solution

Since you have already checked that the character is ASCII, you can safely force-unwrap the value with !:

var tempo:Int = Int(n.asciiValue!)     // <--- just change this line

A more elegant alternative

You could also take advantage of optional binding that uses the fact that the optional is nil when there is no ascii value (i.e. n was not an ASCII character):

if let tempo = n.asciiValue {   // succeeds only if there is an ASCII value
    temp += (Int(tempo) | key)
}
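
Here is a self-contained sketch of the optional-binding approach. The sample text and variable names are assumptions of mine, not taken from the question:

```swift
let text = "A\u{E9}z" // "A", "é" (not ASCII), "z"
var asciiCodes = [Int]()

for n in text {
    if let value = n.asciiValue {
        asciiCodes.append(Int(value))   // UInt8 -> Int
    } else {
        print("\(n) has no ASCII value")
    }
}

print(asciiCodes) // [65, 122]
```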

Working with Unicode code points in Swift

Updated for Swift 3

String and Character

For almost everyone in the future who visits this question, String and Character will be the answer for you.

Set Unicode values directly in code:

var str: String = "I want to visit 北京, Москва, मुंबई, القاهرة, and 서울시. 😊"
var character: Character = "🌍"

Use hexadecimal to set values

var str: String = "\u{61}\u{5927}\u{1F34E}\u{3C0}" // a大🍎π
var character: Character = "\u{65}\u{301}" // é = "e" + accent mark

Note that the Swift Character can be composed of multiple Unicode code points, but appears to be a single character. This is called an Extended Grapheme Cluster.
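
A short sketch of what that means in practice, assuming Swift 4 or later (where String has a count property). The constant names are only illustrative:

```swift
let composed: Character = "\u{E9}"          // precomposed é
let decomposed: Character = "\u{65}\u{301}" // "e" + combining acute accent

// Both are one Character and compare as equal
print(composed == decomposed)                  // true

// ...but they are built from a different number of code points
print(String(composed).unicodeScalars.count)   // 1
print(String(decomposed).unicodeScalars.count) // 2
print(String(decomposed).count)                // 1
```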

See this question also.

Convert to Unicode values:

str.utf8
str.utf16
str.unicodeScalars // UTF-32

String(character).utf8
String(character).utf16
String(character).unicodeScalars

Convert from Unicode hex values:

let hexValue: UInt32 = 0x1F34E

// convert hex value to UnicodeScalar
guard let scalarValue = UnicodeScalar(hexValue) else {
    // early exit if hex does not form a valid unicode value
    return
}

// convert UnicodeScalar to String
let myString = String(scalarValue) // 🍎

Or alternatively:

let hexValue: UInt32 = 0x1F34E
if let scalarValue = UnicodeScalar(hexValue) {
let myString = String(scalarValue)
}

A few more examples

let value0: UInt8 = 0x61
let value1: UInt16 = 0x5927
let value2: UInt32 = 0x1F34E

let string0 = String(UnicodeScalar(value0)) // a
// The UInt16 and UInt32 initializers are failable, so unwrap them first
if let scalar1 = UnicodeScalar(value1), let scalar2 = UnicodeScalar(value2) {
    let string1 = String(scalar1) // 大
    let string2 = String(scalar2) // 🍎
}

// convert hex array to String
let myHexArray = [0x43, 0x61, 0x74, 0x203C, 0x1F431] // an Int array
var myString = ""
for hexValue in myHexArray {
    if let scalar = UnicodeScalar(UInt32(hexValue)) {
        myString.unicodeScalars.append(scalar)
    }
}
print(myString) // Cat‼🐱

Note that for UTF-8 and UTF-16 the conversion is not always this easy. (See UTF-8, UTF-16, and UTF-32 questions.)

NSString and unichar

It is also possible to work with NSString and unichar in Swift, but you should realize that unless you are familiar with Objective-C and good at converting the syntax to Swift, it will be difficult to find good documentation.

Also, unichar is a type alias for UInt16, and as mentioned above, the conversion from UInt16 to Unicode scalar values is not always easy (i.e., converting surrogate pairs for things like emoji and other characters in the upper code planes).
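
To show what converting a surrogate pair involves, here is a sketch that decodes one by hand and then lets the standard library do the same job. It assumes Swift 4 or later for String(decoding:as:); the sample values are mine:

```swift
// "C" followed by the surrogate pair for U+1F431
let units: [UInt16] = [0x0043, 0xD83D, 0xDC31]

// Manual decoding of the pair (high surrogate, then low surrogate)
let high = UInt32(units[1])
let low = UInt32(units[2])
let codePoint = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
print(String(codePoint, radix: 16, uppercase: true)) // 1F431

// Letting the standard library decode the UTF-16 code units
let decoded = String(decoding: units, as: UTF16.self)
print(decoded.unicodeScalars.map { $0.value }) // [67, 128049]
```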

Custom string structure

For the reasons mentioned in the question, I ended up not using any of the above methods. Instead I wrote my own string structure, which was basically an array of UInt32 to hold Unicode scalar values.

Again, this is not the solution for most people. First consider using extensions if you only need to extend the functionality of String or Character a little.

But if you really need to work exclusively with Unicode scalar values, you could write a custom struct.

The advantages are:

  • Don't need to constantly switch between Types (String, Character, UnicodeScalar, UInt32, etc.) when doing string manipulation.
  • After Unicode manipulation is finished, the final conversion to String is easy.
  • Easy to add more methods when they are needed
  • Simplifies converting code from Java or other languages

Disadvantages are:

  • makes code less portable and less readable for other Swift developers
  • not as well tested and optimized as the native Swift types
  • it is yet another file that has to be included in a project every time you need it

You can make your own, but here is mine for reference. The hardest part was making it Hashable.

// This struct is an array of UInt32 to hold Unicode scalar values
// Version 3.4.0 (Swift 3 update)

struct ScalarString: Sequence, Hashable, CustomStringConvertible {

fileprivate var scalarArray: [UInt32] = []


init() {
// does anything need to go here?
}

init(_ character: UInt32) {
self.scalarArray.append(character)
}

init(_ charArray: [UInt32]) {
for c in charArray {
self.scalarArray.append(c)
}
}

init(_ string: String) {

for s in string.unicodeScalars {
self.scalarArray.append(s.value)
}
}

// Iterator in order to conform to the Sequence protocol
// (to allow users to iterate as in `for myScalarValue in myScalarString { ... }`)
func makeIterator() -> AnyIterator<UInt32> {
return AnyIterator(scalarArray.makeIterator())
}

// append
mutating func append(_ scalar: UInt32) {
self.scalarArray.append(scalar)
}

mutating func append(_ scalarString: ScalarString) {
for scalar in scalarString {
self.scalarArray.append(scalar)
}
}

mutating func append(_ string: String) {
for s in string.unicodeScalars {
self.scalarArray.append(s.value)
}
}

// charAt
func charAt(_ index: Int) -> UInt32 {
return self.scalarArray[index]
}

// clear
mutating func clear() {
self.scalarArray.removeAll(keepingCapacity: true)
}

// contains
func contains(_ character: UInt32) -> Bool {
for scalar in self.scalarArray {
if scalar == character {
return true
}
}
return false
}

// description (to implement CustomStringConvertible protocol)
var description: String {
return self.toString()
}

// endsWith
func endsWith() -> UInt32? {
return self.scalarArray.last
}

// indexOf
// returns first index of scalar string match
func indexOf(_ string: ScalarString) -> Int? {

if scalarArray.count < string.length {
return nil
}

for i in 0...(scalarArray.count - string.length) {

for j in 0..<string.length {

if string.charAt(j) != scalarArray[i + j] {
break // substring mismatch
}
if j == string.length - 1 {
return i
}
}
}

return nil
}

// insert
mutating func insert(_ scalar: UInt32, atIndex index: Int) {
self.scalarArray.insert(scalar, at: index)
}
mutating func insert(_ string: ScalarString, atIndex index: Int) {
var newIndex = index
for scalar in string {
self.scalarArray.insert(scalar, at: newIndex)
newIndex += 1
}
}
mutating func insert(_ string: String, atIndex index: Int) {
var newIndex = index
for scalar in string.unicodeScalars {
self.scalarArray.insert(scalar.value, at: newIndex)
newIndex += 1
}
}

// isEmpty
var isEmpty: Bool {
return self.scalarArray.count == 0
}

// hashValue (to implement Hashable protocol)
var hashValue: Int {

// DJB Hash Function
return self.scalarArray.reduce(5381) {
($0 << 5) &+ $0 &+ Int($1)
}
}

// length
var length: Int {
return self.scalarArray.count
}

// remove character
mutating func removeCharAt(_ index: Int) {
self.scalarArray.remove(at: index)
}
func removingAllInstancesOfChar(_ character: UInt32) -> ScalarString {

var returnString = ScalarString()

for scalar in self.scalarArray {
if scalar != character {
returnString.append(scalar)
}
}

return returnString
}
func removeRange(_ range: CountableRange<Int>) -> ScalarString? {

if range.lowerBound < 0 || range.upperBound > scalarArray.count {
return nil
}

var returnString = ScalarString()

for i in 0..<scalarArray.count {
if i < range.lowerBound || i >= range.upperBound {
returnString.append(scalarArray[i])
}
}

return returnString
}


// replace
func replace(_ character: UInt32, withChar replacementChar: UInt32) -> ScalarString {

var returnString = ScalarString()

for scalar in self.scalarArray {
if scalar == character {
returnString.append(replacementChar)
} else {
returnString.append(scalar)
}
}
return returnString
}
func replace(_ character: UInt32, withString replacementString: String) -> ScalarString {

var returnString = ScalarString()

for scalar in self.scalarArray {
if scalar == character {
returnString.append(replacementString)
} else {
returnString.append(scalar)
}
}
return returnString
}
func replaceRange(_ range: CountableRange<Int>, withString replacementString: ScalarString) -> ScalarString {

var returnString = ScalarString()

for i in 0..<scalarArray.count {
if i < range.lowerBound || i >= range.upperBound {
returnString.append(scalarArray[i])
} else if i == range.lowerBound {
returnString.append(replacementString)
}
}
return returnString
}

// set (an alternative to myScalarString = "some string")
mutating func set(_ string: String) {
self.scalarArray.removeAll(keepingCapacity: false)
for s in string.unicodeScalars {
self.scalarArray.append(s.value)
}
}

// split
func split(atChar splitChar: UInt32) -> [ScalarString] {
var partsArray: [ScalarString] = []
if self.scalarArray.count == 0 {
return partsArray
}
var part: ScalarString = ScalarString()
for scalar in self.scalarArray {
if scalar == splitChar {
partsArray.append(part)
part = ScalarString()
} else {
part.append(scalar)
}
}
partsArray.append(part)
return partsArray
}

// startsWith
func startsWith() -> UInt32? {
return self.scalarArray.first
}

// substring
func substring(_ startIndex: Int) -> ScalarString {
// from startIndex to end of string
var subArray: ScalarString = ScalarString()
for i in startIndex..<self.length {
subArray.append(self.scalarArray[i])
}
return subArray
}
func substring(_ startIndex: Int, _ endIndex: Int) -> ScalarString {
// (startIndex is inclusive, endIndex is exclusive)
var subArray: ScalarString = ScalarString()
for i in startIndex..<endIndex {
subArray.append(self.scalarArray[i])
}
return subArray
}

// toString
func toString() -> String {
var string: String = ""

for scalar in self.scalarArray {
if let validScalar = UnicodeScalar(scalar) {
string.append(Character(validScalar))
}
}
return string
}

// trim
// removes leading and trailing whitespace (space, tab, newline)
func trim() -> ScalarString {

//var returnString = ScalarString()
let space: UInt32 = 0x00000020
let tab: UInt32 = 0x00000009
let newline: UInt32 = 0x0000000A

var startIndex = self.scalarArray.count
var endIndex = 0

// leading whitespace
for i in 0..<self.scalarArray.count {
if self.scalarArray[i] != space &&
self.scalarArray[i] != tab &&
self.scalarArray[i] != newline {

startIndex = i
break
}
}

// trailing whitespace
for i in stride(from: (self.scalarArray.count - 1), through: 0, by: -1) {
if self.scalarArray[i] != space &&
self.scalarArray[i] != tab &&
self.scalarArray[i] != newline {

endIndex = i + 1
break
}
}

if endIndex <= startIndex {
return ScalarString()
}

return self.substring(startIndex, endIndex)
}

// values
func values() -> [UInt32] {
return self.scalarArray
}

}

func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.scalarArray == right.scalarArray
}

func +(left: ScalarString, right: ScalarString) -> ScalarString {
var returnString = ScalarString()
for scalar in left.values() {
returnString.append(scalar)
}
for scalar in right.values() {
returnString.append(scalar)
}
return returnString
}
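
On newer compilers the Hashable part is much less work. As far as I know, from Swift 4.1 on the compiler synthesizes == and hashing for a struct whose stored properties are all Hashable. The stripped-down type below is a hypothetical illustration of that, not a replacement for the struct above:

```swift
// Hypothetical slimmed-down variant; Equatable and Hashable are synthesized
struct MiniScalarString: Hashable, CustomStringConvertible {
    private var scalars: [UInt32] = []

    init(_ string: String) {
        scalars = string.unicodeScalars.map { $0.value }
    }

    // Rebuild a String, skipping any value that is not a valid scalar
    var description: String {
        var result = ""
        for value in scalars {
            if let scalar = UnicodeScalar(value) {
                result.unicodeScalars.append(scalar)
            }
        }
        return result
    }
}

let first = MiniScalarString("Cat")
let second = MiniScalarString("Cat")
print(first == second)            // true
print(Set([first, second]).count) // 1
print(first)                      // Cat
```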

Convert Unicode symbol or its XML/HTML entities into its Unicode number in Swift

update: Xcode 11.4 • Swift 5.2

import UIKit // or AppKit on macOS

extension String {
    var data: Data { .init(utf8) }
    var html2AttributedString: NSAttributedString? {
        do {
            return try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print(error)
            return nil
        }
    }
    var html2String: String { html2AttributedString?.string ?? "" }
    var unicodes: [UInt32] { unicodeScalars.map(\.value) }
}


let str = "<span>€€</span>".html2String  // "€€"
str.unicodes // [8364, 8364]




extension StringTransform {
static let toUnicodeHex = Self("Hex/Unicode")
static let toJavaHex = Self("Hex/Java")
static let toPerlHex = Self("Hex/Perl")
}


extension String {
    var convertedToUnicodeHex: String { applyingTransform(.toUnicodeHex, reverse: false) ?? "" }
    var convertedToJavaHex: String { applyingTransform(.toJavaHex, reverse: false) ?? "" }
    var convertedToXMLHex: String { applyingTransform(.toXMLHex, reverse: false) ?? "" }
    var convertedToPerlHex: String { applyingTransform(.toPerlHex, reverse: false) ?? "" }
}


"෴".convertedToUnicodeHex  // U+0DF4
"෴".convertedToJavaHex // \u0DF4
"෴".convertedToXMLHex // &#xDF4;
"෴".convertedToPerlHex // \x{DF4}
"෴".unicodes // [3572]
0x0DF4 // 3572
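
If you only need the U+XXXX form and would rather not depend on Foundation's string transforms, something like the following sketch works with the standard library alone. The helper name is hypothetical:

```swift
// Hypothetical helper: formats each scalar of the text as U+XXXX
func codePointLabels(for text: String) -> [String] {
    return text.unicodeScalars.map { scalar in
        let hex = String(scalar.value, radix: 16, uppercase: true)
        // Left-pad to at least four hex digits
        let padding = String(repeating: "0", count: max(0, 4 - hex.count))
        return "U+" + padding + hex
    }
}

print(codePointLabels(for: "\u{0DF4}a")) // ["U+0DF4", "U+0061"]

// Going the other way: parse a hex string back into a number
let parsed = UInt32("0DF4", radix: 16)
print(parsed ?? 0) // 3572
```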

