NSAttributedString and emojis: issue with positions and lengths

A Swift String provides different "views" on its contents.
A good overview is given in "Strings in Swift 2" in the Swift Blog:

  • characters is a collection of Character values, or extended grapheme clusters.
  • unicodeScalars is a collection of Unicode scalar values.
  • utf8 is a collection of UTF-8 code units.
  • utf16 is a collection of UTF-16 code units.

As it turned out in the discussion, pos and len from your API
are indices into the Unicode scalars view.
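
For example, counting a short sample string in each view shows how far the indices can drift apart (Swift 4 syntax, runnable in a playground):

let sample = "Hi ❤️😀"
print(sample.count)                // 5 Characters
print(sample.unicodeScalars.count) // 6 (❤️ is two scalars: U+2764 U+FE0F)
print(sample.utf16.count)          // 7 (😀 is a surrogate pair)
print(sample.utf8.count)           // 13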

On the other hand, the addAttribute() method of NSMutableAttributedString takes an NSRange, i.e. a range of UTF-16 code unit indices in an NSString.

String provides methods to "translate" between indices of the
different views (compare NSRange to Range<String.Index>):

import UIKit

let text = "@ericd Some text. ✌ @apero"
let pos = 22
let len = 6

// Compute String.UnicodeScalarView indices for first and last position:
let from32 = text.unicodeScalars.index(text.unicodeScalars.startIndex, offsetBy: pos)
let to32 = text.unicodeScalars.index(from32, offsetBy: len)

// Convert to String.UTF16View indices:
let from16 = from32.samePosition(in: text.utf16)
let to16 = to32.samePosition(in: text.utf16)

// Convert to NSRange by computing the integer distances:
let nsRange = NSRange(location: text.utf16.distance(from: text.utf16.startIndex, to: from16),
                      length: text.utf16.distance(from: from16, to: to16))

This NSRange is what you need for the attributed string:

let attrString = NSMutableAttributedString(string: text)
attrString.addAttribute(NSForegroundColorAttributeName,
                        value: UIColor.red,
                        range: nsRange)

Update for Swift 4 (Xcode 9): In Swift 4, the standard library
provides methods to convert between Swift String ranges and NSString
ranges, therefore the calculations simplify to

import Foundation

let text = "@ericd Some text. ✌ @apero"
let pos = 22
let len = 6

// Compute String.UnicodeScalarView indices for first and last position:
let fromIdx = text.unicodeScalars.index(text.unicodeScalars.startIndex, offsetBy: pos)
let toIdx = text.unicodeScalars.index(fromIdx, offsetBy: len)

// Compute corresponding NSRange:
let nsRange = NSRange(fromIdx..<toIdx, in: text)
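
Applying the range is unchanged, apart from the Swift 4 spelling of the attribute name:

let attrString = NSMutableAttributedString(string: text)
attrString.addAttribute(.foregroundColor,
                        value: UIColor.red,
                        range: nsRange)

The conversion also works in the other direction, should you need it: Range(nsRange, in: text) returns the corresponding Range<String.Index>?.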

Swift: NSAttributedString & emojis

The most important thing to remember when working with NSAttributedString, NSRange, and String is that NSAttributedString (and NSString) and NSRange are based on UTF-16 code unit counts, while String and its count are based on Characters (extended grapheme clusters). The two don't mix.

If you ever try to create an NSRange with someSwiftString.count, you will get the wrong range. Always use someSwiftString.utf16.count.
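
For example (a quick playground check):

let heart = "Swift ❤️"
print(heart.count)       // 7 Characters
print(heart.utf16.count) // 8 (❤️ is U+2764 U+FE0F: two UTF-16 code units)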

In your specific case you are applying attributes to half of the ❤️ character due to the wrong length in the NSRange, and that cascades into the errors you see.

And in the code you posted, you need to change:

guard (range.location + range.length - 1) < string.count else {

to:

guard (range.location + range.length - 1) < string.utf16.count else {

for the same reasons described above.
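
A minimal sketch of the corrected guard; string and range here are hypothetical stand-ins for the values in your code:

let string = "Hi 🇩🇪"
let range = NSRange(location: 3, length: 4) // the flag: two surrogate pairs

guard (range.location + range.length - 1) < string.utf16.count else {
    fatalError("range out of bounds")
}
// With string.count (4 Characters) this guard would wrongly reject
// a perfectly valid range; string.utf16.count is 7.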

Emoji loss when using NSMutableAttributedString on UILabel

Your problem is with your NSRange calculations when setting the attributes. NS[Mutable]AttributedString needs the NSRange to be based on NSString (UTF-16) ranges, not String (Character) ranges.

So code like:

NSRange.init(location: 0, length: commentString.count)

needs to be written as either:

NSRange(location: 0, length: (commentString as NSString).length)

or:

NSRange(location: 0, length: commentString.utf16.count)

The following demonstrates the issue with commentString.count:

let comment = "br>print(comment.count) // 3
print((comment as NSString).length) // 6
print(comment.utf16.count) // 6

This is why your code seems to be splitting the middle character in half. You are passing in half (in this case) the needed length.
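
A minimal sketch of the corrected call; commentString here is a hypothetical stand-in for your label text:

import UIKit

let commentString = "Nice! 👍🏽" // 👍🏽 alone is four UTF-16 code units
let attrString = NSMutableAttributedString(string: commentString)
attrString.addAttribute(.foregroundColor,
                        value: UIColor.red,
                        range: NSRange(location: 0, length: commentString.utf16.count))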

Emoji doesn't show in NSAttributedString when typed using iOS keyboard, does when typed on Android

We verified in the comments that the source string is okay. There's just one parameter to change: Foundation's NSString machinery is UTF-16 based, and you're losing information by processing the text as UTF-8.

Change:

msg.dataUsingEncoding(NSUTF8StringEncoding)

To:

msg.dataUsingEncoding(NSUTF16StringEncoding)

For Swift 5, use:

msg.data(using: String.Encoding.utf16, allowLossyConversion: false)
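
A quick round-trip sketch (msg is a hypothetical stand-in for your message text):

import Foundation

let msg = "Hello 😀"
if let data = msg.data(using: .utf16, allowLossyConversion: false),
   let decoded = String(data: data, encoding: .utf16) {
    print(decoded) // "Hello 😀", the emoji survives intact
}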

Only 2 emoji return an incorrect length when compared against a character set containing them

Hardly an answer, but too much to fit into a comment, so bear with me :)

I don't know if you've already seen this but I think your problem is addressed in the Platform State of the Union talk from WWDC 2017 (https://developer.apple.com/videos/play/wwdc2017/102/) in the section about what is new in Swift 4.

If you look at the video at about the 23 minute 12 second mark, you'll see Ted Kremenek talk about how they've fixed the separation of Unicode characters in Swift 4 using Unicode 9 grapheme breaking.
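
For illustration, a ZWJ sequence such as the family emoji comes back as a single Character under Swift 4's Unicode 9 grapheme breaking (the counts below assume Swift 4):

let family = "👨‍👩‍👧‍👦" // man + ZWJ + woman + ZWJ + girl + ZWJ + boy
print(family.count)                // 1 (Swift 3's characters.count gave 4)
print(family.unicodeScalars.count) // 7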

Also, have a look at this question and answer.

Yes... don't ask me in detail what all this means, but it seems as if they're working on it :)

Use NSAttributedString multiple times in one string

I found a way to do it:

I first check for the brackets, never mind what the content is. Once that's done, I match the exact strings I'm interested in more precisely. For example:

import Foundation

// Returns the matched substrings for `pattern` in `string`.
func regexMatchesToString(_ pattern: String, string: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern)
    let range = NSRange(string.startIndex..<string.endIndex, in: string)
    let results = regex.matches(in: string, options: [], range: range)
    return results.map {
        String(string[Range($0.range, in: string)!])
    }
}

// Returns the raw match objects (with their NSRanges) for `pattern` in `string`.
func regexMatches(_ pattern: String, string: String) throws -> [NSTextCheckingResult] {
    let regex = try NSRegularExpression(pattern: pattern)
    let range = NSRange(string.startIndex..<string.endIndex, in: string)
    return regex.matches(in: string, options: [], range: range)
}

string = algAtCell // the text displayed in my UITableView
atStr = NSMutableAttributedString(string: string)

if let stringMatches1 = try? regexMatchesToString("\\((.*?)\\)", string: string) {
    for triggerString in stringMatches1 {
        print("String \(string)")
        if triggerString == "(R U R' U')" {
            let matches = try? regexMatches("\\((R U R' U')\\)", string: string)
            for match in matches! {
                atStr.addAttribute(.foregroundColor,
                                   value: UIColor(displayP3Red: 255/255, green: 10/255, blue: 10/255, alpha: 1.0),
                                   range: match.range)
            }
        }
    }
}

Hope this helps if anyone needs it.

HKWTextView emoji detection

I found the answer to this: it turns out that HKWTextView does some rewiring of the UITextView delegate methods that get fired. Try handling the input in the UITextView delegate method textViewDidChangeSelection; that method will be fired when an emoji is typed.
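
A minimal sketch of that hook, written against a plain UITextViewDelegate (HKWTextView forwards this callback):

import UIKit

class EmojiWatcher: NSObject, UITextViewDelegate {
    func textViewDidChangeSelection(_ textView: UITextView) {
        // Per the answer above, this fires when an emoji is typed
        // from the iOS keyboard, so inspect the latest text here.
        print("current text:", textView.text ?? "")
    }
}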


