Encoding Swift String as Escaped Unicode

Encoding Swift string as escaped unicode?

You can map over the characters of the string and check whether each one is ASCII (isASCII). If it is, keep the character; otherwise convert it to its escaped Unicode scalar values:

Swift 5.2 or later • Xcode 11.4 or later (the key-path shorthand .map(\.hexa) requires Swift 5.2)

extension Unicode.Scalar {
    var hexa: String { .init(value, radix: 16, uppercase: true) }
}

extension Character {
    var hexaValues: [String] {
        unicodeScalars
            .map(\.hexa)
            .map { #"\\U"# + repeatElement("0", count: 8-$0.count) + $0 }
    }
}

extension StringProtocol where Self: RangeReplaceableCollection {
    var asciiRepresentation: String { map { $0.isASCII ? .init($0) : $0.hexaValues.joined() }.joined() }
}

let textContainingUnicode = """
Let's go 🏊 in the 🌊.
 And some new lines.
"""

let asciiRepresentation = textContainingUnicode.asciiRepresentation
print(asciiRepresentation) // "Let's go \\U0001F3CA in the \\U0001F30A.\n And some new lines."
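
The same idea can be adapted to the \uXXXX form used by JSON and Java, which escapes UTF-16 code units rather than whole scalars. The following is a minimal sketch, not part of the original answer; the property name utf16Escaped is hypothetical:

```swift
extension StringProtocol {
    // Hypothetical helper: keeps ASCII as-is and writes every other
    // UTF-16 code unit as \uXXXX (so emoji become surrogate pairs).
    var utf16Escaped: String {
        utf16.map { unit -> String in
            if unit < 0x80 {
                return String(Unicode.Scalar(UInt8(unit)))
            }
            let hex = String(unit, radix: 16, uppercase: true)
            return #"\u"# + String(repeating: "0", count: 4 - hex.count) + hex
        }.joined()
    }
}

print("pok\u{E9} \u{1F3CA}".utf16Escaped) // pok\u00E9 \uD83C\uDFCA
```

Unlike the \U version above, this emits a single backslash and splits scalars outside the Basic Multilingual Plane into two escapes.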

Convert data with escaped unicode characters to string

Assuming your data has the same content as something like this:

import Foundation

let data = #"Pla\u010daj Izbri\u0161i"#.data(using: .utf8)!
print(data as NSData) //->{length = 24, bytes = 0x506c615c7530313064616a20497a6272695c753031363169}

You can decode it in this way:

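// Placeholder error type; substitute your own.
struct SomeError: Error {}

public func decode(data: Data) throws -> String {
    guard let text = String(data: data, encoding: .utf8) else {
        throw SomeError()
    }

    // reverse: true turns \uXXXX escapes back into characters
    let transform = StringTransform(rawValue: "Any-Hex/Java")
    return text.applyingTransform(transform, reverse: true) ?? text
}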

However, if a web API really returns this kind of data, it is better to ask the API engineer to use a standard encoding scheme.
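
As a usage sketch, the same transform can be applied to the sample bytes from above. It assumes Foundation on an Apple platform and reuses only the calls shown in the answer; the variable names are made up for illustration:

```swift
import Foundation

// Sample payload from above: ASCII bytes containing \uXXXX escapes.
let escapedData = Data(#"Pla\u010daj Izbri\u0161i"#.utf8)
let escapedText = String(decoding: escapedData, as: UTF8.self)

// reverse: true runs "Any-Hex/Java" backwards, i.e. hex escapes -> characters.
let hexTransform = StringTransform(rawValue: "Any-Hex/Java")
let decodedText = escapedText.applyingTransform(hexTransform, reverse: true) ?? escapedText

print(decodedText) // Plačaj Izbriši
```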

Convert unicode symbols \uXXXX in String to Character in Swift

You can use \u{my_unicode}:

print("Ain\u{2019}t this a beautiful day")
// Prints "Ain’t this a beautiful day"

From the Language Guide - Strings and Characters - Unicode:

String literals can include the following special characters:

...

  • An arbitrary Unicode scalar, written as \u{n}, where n is a 1–8 digit
    hexadecimal number with a value equal to a valid Unicode code point
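
The \u{n} form only works in string literals written in source code. When the hex digits arrive at runtime, a scalar can be built from the numeric value instead. A small sketch, where the function name character(fromHex:) is made up for illustration:

```swift
// Hypothetical helper: turns hex digits such as "2019" into a Character.
// Returns nil for non-hex input and for values that are not valid
// Unicode scalars (for example surrogates such as D800).
func character(fromHex hex: String) -> Character? {
    guard let value = UInt32(hex, radix: 16),
          let scalar = Unicode.Scalar(value) else { return nil }
    return Character(scalar)
}

if let apostrophe = character(fromHex: "2019") {
    print("Ain\(apostrophe)t this a beautiful day") // Ain’t this a beautiful day
}
```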

HowTo convert escaped Unicode String to Unicode Character

First of all, you should know that plist files edited in Xcode can contain emoji directly:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>emojis</key>
    <array>
        <string>😀</string>
        <string>😃</string>
        <string>😁</string><!-- or use an XML numeric character reference such as &#x1F601; when editing the XML directly -->
    </array>
</dict>
</plist>

But if you already have an escaped version of a plist file, you may need to convert it. Using NSRegularExpression, you can write something like this:

import Foundation

class EscapedUnicodeConverter: NSRegularExpression {
    override func replacementString(for result: NSTextCheckingResult, in string: String, offset: Int, template templ: String) -> String {
        if
            result.numberOfRanges == 2,
            case let nsString = string as NSString,
            // range(at:) replaces the Swift 3 name rangeAt(_:)
            case let matchedString = nsString.substring(with: result.range(at: 1)),
            let unicodeScalarValue = UInt32(matchedString, radix: 16),
            let unicodeScalar = UnicodeScalar(unicodeScalarValue)
        {
            return String(unicodeScalar)
        } else {
            return super.replacementString(for: result, in: string, offset: offset, template: templ)
        }
    }
}
// Pattern for Swift-style \u{...} escapes
let unicodeConverterForSwift = try! EscapedUnicodeConverter(pattern: "\\\\u\\{([0-9A-Fa-f]+)\\}")

let origStr = "\\u{1f600}"
let result = unicodeConverterForSwift.stringByReplacingMatches(in: origStr, range: NSRange(0..<origStr.utf16.count), withTemplate: "???")
print(result) //->😀
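
Subclassing is not the only option. Here is a sketch of the same replacement without a subclass, iterating over the matches manually. The function name unescapeSwiftStyle is hypothetical; the NSRegularExpression and NSMutableString calls are the standard Foundation ones:

```swift
import Foundation

// Hypothetical helper: replaces \u{XXXX} sequences with the scalar they name.
func unescapeSwiftStyle(_ input: String) -> String {
    // Raw string literal, so the regex backslashes need no doubling.
    let regex = try! NSRegularExpression(pattern: #"\\u\{([0-9A-Fa-f]{1,8})\}"#)
    let output = NSMutableString(string: input)
    let matches = regex.matches(in: input, range: NSRange(input.startIndex..., in: input))
    // Walk backwards so earlier match ranges stay valid while replacing.
    for match in matches.reversed() {
        let hex = (input as NSString).substring(with: match.range(at: 1))
        guard let value = UInt32(hex, radix: 16),
              let scalar = Unicode.Scalar(value) else { continue } // leave invalid escapes untouched
        output.replaceCharacters(in: match.range, with: String(scalar))
    }
    return output as String
}

print(unescapeSwiftStyle(#"\u{1f600} caf\u{e9}"#)) // 😀 café
```

Escapes that do not name a valid scalar, such as \u{D800}, are left in the output unchanged.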

I receive an improperly formatted unicode in a String

You may want to give us more context regarding what the raw server payload looked like, and show us how you're displaying the string. Some ways of examining strings in the debugger (or if you're looking at raw JSON) will show you escape strings, but if you use the string in the app, you'll see the actual Unicode character.

I wonder if you're just looking at raw JSON.
For example, I passed the JSON, {"foo": "Eat pok\u00e9."} to the following code:

import Foundation

let data = Data(#"{"foo": "Eat pok\u00e9."}"#.utf8)
let jsonString = String(data: data, encoding: .utf8)!
print(jsonString)
let dictionary = try! JSONSerialization.jsonObject(with: data) as! [String: String]
print(dictionary["foo"]!)

And it output:


{"foo": "Eat pok\u00e9."}
Eat poké.

By the way, this standard JSON escape syntax should not be confused with Swift's string literal escape syntax, in which the hex sequence must be wrapped in braces:

print("Eat pok\u{00e9}.")

In short, \u00e9 is JSON syntax and \u{00e9} is Swift string-literal syntax; the two are not interchangeable.
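
The difference can be checked directly. In this sketch a raw string literal keeps the JSON escapes intact in the source, and JSONDecoder resolves them, including a surrogate pair. The variable names are made up for illustration:

```swift
import Foundation

// Raw literal: the backslashes reach the JSON parser unchanged.
let jsonBytes = Data(#"["Eat pok\u00e9.", "\ud83d\ude00"]"#.utf8)
let strings = try! JSONDecoder().decode([String].self, from: jsonBytes)

// The decoded values equal the Swift literals written with \u{...}.
print(strings[0] == "Eat pok\u{00e9}.") // true
print(strings[1] == "\u{1F600}")        // true
```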

Swift can't decode string with escape sequence

You need 4 backslashes in the Swift string literal to represent a single actual backslash in model.str:

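import Foundation

// Assumed shape of the model being decoded.
struct Model: Decodable {
    let str: String
}

// No trailing comma after the last member: JSON does not allow one.
let json = """
{
    "str": "\\\\"
}
"""
let jsonData = Data(json.utf8)

let decoder = JSONDecoder()

do {
    let model = try decoder.decode(Model.self, from: jsonData)
    print(model.str) // prints a single backslash
} catch {
    print(error.localizedDescription)
}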

A backslash in a JSON string needs to be escaped, so you need 2 backslashes in the JSON string, but to write this in a Swift string literal, you need to escape those two backslashes too. Hence the 4 backslashes.
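
If counting backslashes gets confusing, a raw string literal (Swift 5 and later) removes the second level of escaping, so the JSON appears exactly as it would in a file. A sketch, where RawModel is a hypothetical type with a single str property:

```swift
import Foundation

// Hypothetical model with the same shape as the one above.
struct RawModel: Decodable {
    let str: String
}

// Inside #"""..."""# a backslash is literal, so the JSON escape \\ is written as-is.
let rawJSON = #"""
{
    "str": "\\"
}
"""#

let rawModel = try! JSONDecoder().decode(RawModel.self, from: Data(rawJSON.utf8))
print(rawModel.str)       // prints a single backslash
print(rawModel.str.count) // 1
```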

Using Swift to unescape unicode characters, i.e. \u1234

It's fairly similar in Swift, though you still need to use the Foundation string classes:

import Foundation

let transform = "Any-Hex/Java"
let input = "\\u5404\\u500b\\u90fd" as NSString
let convertedString = input.mutableCopy() as! NSMutableString

// reverse: true converts the hex escapes back into characters
CFStringTransform(convertedString, nil, transform as CFString, true)

print("convertedString: \(convertedString)")
// convertedString: 各個都

(The last parameter threw me for a loop: in early Swift versions the C type Boolean was imported as a type alias for UInt8, so YES in Objective-C became 1. Current Swift imports the parameter as Bool, so you pass true.)

How to get JSON response without unicode escape sequence Swift

Your JSON string has some issues. A \U escape should be followed by 8 hexadecimal digits (4 bytes), but here it is followed by only 4 hexadecimal digits (2 bytes). If you control what is being returned, the easiest solution is to return \u instead of \U, which is the correct Unicode escape for 4 hexadecimal digits. Otherwise you will have to do some string manipulation and apply a string transform before decoding your JSON:

import Foundation

let jsonData = Data(#"{"x_city":"\U041a\U0438\U0457\U0432"}"#.utf8)
let jsonString = String(data: jsonData, encoding: .utf8) ?? ""
let cleanedData = Data(
    jsonString
        .replacingOccurrences(of: #"\U"#, with: #"\u"#)
        .applyingTransform(.init("Hex-Any"), reverse: false)!
        .utf8
)
struct Response: Codable {
    let xCity: String
}
do {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    let response = try decoder.decode(Response.self, from: cleanedData)

    print(response) // Response(xCity: "Київ")
} catch {
    print(error)
}
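
Once \U has been rewritten to \u, the escapes are ordinary JSON, which JSONDecoder already understands, so the string transform may not be needed at all. A sketch under that assumption, with CityResponse as a hypothetical stand-in for the Response type above. Note that a blind text replacement like this would also alter a literal \U that is part of the data:

```swift
import Foundation

// Hypothetical stand-in for the Response type above.
struct CityResponse: Codable {
    let xCity: String
}

let malformed = #"{"x_city":"\U041a\U0438\U0457\U0432"}"#
// Turn the non-standard \U into the standard JSON escape \u.
let repaired = Data(malformed.replacingOccurrences(of: #"\U"#, with: #"\u"#).utf8)

let cityDecoder = JSONDecoder()
cityDecoder.keyDecodingStrategy = .convertFromSnakeCase
let city = try! cityDecoder.decode(CityResponse.self, from: repaired)
print(city.xCity) // Київ
```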

