Swift Utf8 Encoding and Non Utf8 Character

Swift UTF8 encoding and non UTF8 character

I've found a solution.

UTF-8 uses 8-bit code units and UTF-16 uses 16-bit code units. The fix is simple: encode the string as UTF-16 before handing it to NSAttributedString, so I changed my function to:

    import UIKit  // on macOS, import AppKit instead

    func stringToUTF16String(stringaDaConvertire stringa: String) -> String {
        // Encode as UTF-16 so the HTML importer does not misread non-ASCII characters.
        let encodedData = stringa.data(using: .utf16)!
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey: Any] = [.documentType: NSAttributedString.DocumentType.html]
        // The initializer throws in current Swift; fall back to the input if parsing fails.
        let attributedString = try? NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
        return attributedString?.string ?? stringa
    }
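As a rough illustration of the difference between the two encodings (a sketch only, not part of the original fix; the sample string is an assumption), the same text produces different code units in the utf8 and utf16 views:

```swift
// Sample text chosen for illustration: "é" (U+00E9) followed by "€" (U+20AC).
let sample = "\u{E9}\u{20AC}"

// UTF-8 view: a sequence of 8-bit code units (UInt8).
let utf8Units = Array(sample.utf8)
// UTF-16 view: a sequence of 16-bit code units (UInt16).
let utf16Units = Array(sample.utf16)

print(utf8Units)   // [195, 169, 226, 130, 172]
print(utf16Units)  // [233, 8364]
```

Both characters need more than one byte in UTF-8, but each fits in a single UTF-16 code unit.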

Swift: How to get UTF-8 representation of characters (as 0xXX 0xXX 0xXX...)?

The .nonLossyASCII conversion converts each non-ASCII character to a "\uNNNN" escape sequence, which is why your approach does not work.

self.utf8 gives the UTF-8 representation of a String. Then format each UTF-8 code unit (byte) as a "0xNN" string, and join the results with space characters:

    import Foundation  // needed for String(format:)

    extension String {
        var utf8Representation: String {
            return self.utf8.map { String(format: "0x%02hhx", $0) }.joined(separator: " ")
        }
    }

Example:

    print("😀".utf8Representation)
// 0xf0 0x9f 0x98 0x80
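If you would rather not depend on Foundation's String(format:), the same idea can be sketched with the standard library alone. The property name utf8HexBytes is hypothetical and only used for illustration:

```swift
extension String {
    /// Hypothetical helper: formats each UTF-8 byte as "0xNN" using only the standard library.
    var utf8HexBytes: String {
        return self.utf8.map { byte -> String in
            // String(_:radix:) produces lowercase hex digits without padding.
            let hex = String(byte, radix: 16)
            // Pad single-digit values to two digits.
            return "0x" + (hex.count == 1 ? "0" + hex : hex)
        }.joined(separator: " ")
    }
}

// "a" is 1 byte, "é" (U+00E9) is 2 bytes, "€" (U+20AC) is 3 bytes.
let hexOutput = "a\u{E9}\u{20AC}".utf8HexBytes
print(hexOutput)
// 0x61 0xc3 0xa9 0xe2 0x82 0xac
```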

Converting any UTF-8 encoded Characters in String in Swift from API

You can use NSAttributedString to decode these HTML entities into a plain string.

    let htmlString = "test&#21271;&#20140;&#30340;test"
    if let htmlData = htmlString.data(using: .utf8),
       let attributedString = try? NSAttributedString(
           data: htmlData,
           options: [.documentType: NSAttributedString.DocumentType.html,
                     .characterEncoding: String.Encoding.utf8.rawValue],
           documentAttributes: nil) {
        let finalString = attributedString.string
        print(finalString)
        // output: test北京的test
    }

Get the UTF-8 Encoding of a Character in Bytes

Character has no direct (public) accessor to its UTF-8 representation.

There are some internal methods in Character.swift dealing with the UTF-8 bytes, but the public API is implemented in
String.UTF8View in StringUTF8.swift.

Therefore String(myChar).utf8.count is the correct way to obtain
the length of the character's UTF-8 representation.
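A minimal sketch of that approach, with sample characters picked purely for illustration:

```swift
// Sample characters: "a", "é" (U+00E9), "€" (U+20AC), "😀" (U+1F600),
// and a flag built from two regional indicator scalars (one Character).
let samples: [Character] = ["a", "\u{E9}", "\u{20AC}", "\u{1F600}", "\u{1F1E9}\u{1F1EA}"]

// Wrap each Character in a String to reach its UTF-8 view.
let lengths = samples.map { String($0).utf8.count }

print(lengths)
// [1, 2, 3, 4, 8]
```

The last entry shows that a single Character can span several Unicode scalars, so its UTF-8 length is not bounded by 4 bytes.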

Swift base64 decode non alphabetic and non utf-8 strings

You're looking at fairly old examples. The syntax has changed to this:

let decodedData = Data(base64Encoded: base64String)

I've tested with your examples, and they work fine. Keep in mind that the output is raw Data; it isn't a String in any encoding (Windows-1252, ISO-8859-1, etc.). It's just a sequence of random bytes, and that's what it is expected to be. The online tool you're using is trying to decode it as ISO-8859-1, but the result is gibberish, and is in fact corrupted in the output you've shown: it doesn't display the first byte (which is 0x16, and unprintable).
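To see this for yourself, something along these lines should work. The Base64 input "Fv8=" is a made-up example (not the string from the question) that decodes to the two bytes 0x16 and 0xFF:

```swift
import Foundation

// Hypothetical input for illustration; decodes to the bytes 0x16 0xFF.
let base64String = "Fv8="

// Data(base64Encoded:) returns nil for malformed input, so fall back to empty Data.
let decodedData = Data(base64Encoded: base64String) ?? Data()
let bytes = [UInt8](decodedData)
print(bytes)  // [22, 255]

// 0xFF never occurs in valid UTF-8, so this conversion fails.
let asUTF8 = String(data: decodedData, encoding: .utf8)
print(asUTF8 == nil)  // true

// ISO-8859-1 maps every byte to a character, so this "succeeds",
// but the result is only meaningful if the bytes really were Latin-1 text.
let asLatin1 = String(data: decodedData, encoding: .isoLatin1)
print(asLatin1 != nil)  // true
```

Inspecting the bytes directly is usually more informative than forcing them through a text encoding.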

Can the conversion of a String to Data with UTF-8 encoding ever fail?

UTF-8 can represent all valid Unicode code points, therefore a conversion
of a Swift string to UTF-8 data cannot fail.

The forced unwrap in

let string = "some string .."
let data = string.data(using: .utf8)!

is safe.

The same would be true for .utf16 or .utf32, but not for
encodings which represent only a restricted character set,
such as .ascii or .isoLatin1.

You can alternatively use the .utf8 view of a string to create UTF-8 data,
avoiding the forced unwrap:

let string = "some string .."
let data = Data(string.utf8)
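A short sketch comparing the two approaches and showing a restricted encoding failing; the sample string is an assumption chosen to contain characters outside ASCII and ISO Latin-1:

```swift
import Foundation

// "café €": "é" is outside ASCII, "€" is outside both ASCII and ISO Latin-1.
let sampleString = "caf\u{E9} \u{20AC}"

// Both routes produce the same UTF-8 bytes.
let viaView = Data(sampleString.utf8)
let viaFoundation = sampleString.data(using: .utf8)
print(viaFoundation == viaView)  // true
print(viaView.count)             // 9

// Restricted encodings return nil when a character cannot be represented
// (with the default allowLossyConversion: false).
let asASCII = sampleString.data(using: .ascii)
let asLatin1 = sampleString.data(using: .isoLatin1)
print(asASCII == nil)   // true
print(asLatin1 == nil)  // true
```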

