How to Check If a String Contains Chinese in Swift

How can I check if a string contains Chinese in Swift?

The answer to "How to determine if a character is a Chinese character" can easily be translated from Ruby to Swift (now updated for Swift 3):

import Foundation

extension String {
    var containsChineseCharacters: Bool {
        return self.range(of: "\\p{Han}", options: .regularExpression) != nil
    }
}

if myString.containsChineseCharacters {
    print("Contains Chinese")
}

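In a regular expression, "\p{Han}" matches every character whose Unicode script is "Han", that is, the ideographs shared by Chinese, Japanese (kanji) and Korean (hanja). The check therefore also succeeds for Japanese or Korean text that contains such characters.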
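To see what the property reports in practice, it can be run on a few sample strings. The sketch below repeats the extension so that it is self-contained; the sample strings are my own and only illustrative.

```swift
import Foundation

extension String {
    /// True if at least one character belongs to the Han script.
    var containsChineseCharacters: Bool {
        return self.range(of: "\\p{Han}", options: .regularExpression) != nil
    }
}

// Sample strings chosen for illustration (not from the original question).
let samples = ["Hello", "你好", "Hello 世界", "こんにちは", "日本語"]
for sample in samples {
    print(sample, sample.containsChineseCharacters)
}
```

The hiragana-only string is reported as false, while the kanji string is reported as true, because kanji belong to the Han script as well.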

How to get the link if the URL contains Chinese in Swift

The URL you are trying to use needs to be properly encoded. One solution is to build the URL using URLComponents.

import Foundation

let root = "https://zh.wikipedia.org"
let path = "/wiki/斯蒂芬·科里"
var urlcomps = URLComponents(string: root)!
urlcomps.path = path  // the path is percent-encoded when the URL is built
let url = urlcomps.url!
print(url)

Output:

https://zh.wikipedia.org/wiki/%E6%96%AF%E8%92%82%E8%8A%AC%C2%B7%E7%A7%91%E9%87%8C
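The same idea can be wrapped in a small function that avoids the force unwraps. This is only a sketch: the name makeURL(root:path:) is hypothetical, and it assumes the path is passed in unencoded, as in the example above.

```swift
import Foundation

/// Hypothetical helper: builds a URL from a root and an unencoded path.
/// Returns nil instead of crashing when the parts do not form a valid URL.
func makeURL(root: String, path: String) -> URL? {
    guard var components = URLComponents(string: root) else { return nil }
    components.path = path   // URLComponents percent-encodes the path on output
    return components.url
}

if let url = makeURL(root: "https://zh.wikipedia.org", path: "/wiki/斯蒂芬·科里") {
    print(url.absoluteString)
} else {
    print("Could not build URL")
}
```

Returning an optional lets the caller decide what to do with bad input, which is usually preferable when the path comes from user data.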

Check whether a string contains Japanese/Chinese characters

The ranges of Unicode characters which are routinely used for Chinese and Japanese text are:

  • U+3040 - U+30FF: hiragana and katakana (Japanese only)
  • U+3400 - U+4DBF: CJK unified ideographs extension A (Chinese, Japanese, and Korean)
  • U+4E00 - U+9FFF: CJK unified ideographs (Chinese, Japanese, and Korean)
  • U+F900 - U+FAFF: CJK compatibility ideographs (Chinese, Japanese, and Korean)
  • U+FF66 - U+FF9F: half-width katakana (Japanese only)

As a regular expression, this would be expressed as:

/[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\uff66-\uff9f]/

This does not include every character which will appear in Chinese and Japanese text, but any significant piece of typical Chinese or Japanese text will be mostly made up of characters from these ranges.

Note that this regular expression will also match on Korean text that contains hanja. This is an unavoidable result of Han unification.
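In Swift, the ranges listed above can also be checked directly against the string's Unicode scalars, without a regular expression. A minimal sketch, assuming the five ranges above are the only ones of interest; the names cjkRanges and containsChineseOrJapanese are hypothetical.

```swift
/// Scalar ranges taken from the list above.
let cjkRanges: [ClosedRange<UInt32>] = [
    0x3040...0x30FF,   // hiragana and katakana
    0x3400...0x4DBF,   // CJK unified ideographs extension A
    0x4E00...0x9FFF,   // CJK unified ideographs
    0xF900...0xFAFF,   // CJK compatibility ideographs
    0xFF66...0xFF9F    // half-width katakana
]

/// Hypothetical helper: true if any scalar of the text falls in one of the ranges.
func containsChineseOrJapanese(_ text: String) -> Bool {
    return text.unicodeScalars.contains { scalar in
        cjkRanges.contains { $0.contains(scalar.value) }
    }
}

print(containsChineseOrJapanese("Hello"))      // false
print(containsChineseOrJapanese("こんにちは"))  // true
print(containsChineseOrJapanese("你好"))        // true
```

As with the regular expression, this also reports true for Korean text that contains hanja.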

How can I get the Swift/Xcode console to show Chinese characters instead of unicode?

Instead of

print(json)

use

print((json as NSDictionary)["result"]!)

This will print the correct Chinese result.

Reason: the first print outputs the debug description of the NSDictionary, which escapes all non-ASCII characters, not only Chinese ones.

Special - Chinese characters in string

I have also localised my app into Chinese without problems so far, but I mostly use %d, not %ld.

Can you try using %d instead?

self.title.text = [NSString stringWithFormat:NSLocalizedString(@"Q%d", nil), (int)quizNumber];

Take a look at
https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/Strings/Articles/formatSpecifiers.html

How can I check if a string contains only latin characters?

As in "How can I check if a string contains Chinese in Swift?", you can use a regular expression to check that the string has no character outside the "Latin" character class:

import Foundation

extension String {
    var latinCharactersOnly: Bool {
        return self.range(of: "\\P{Latin}", options: .regularExpression) == nil
    }
}

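\P{Latin} (with a capital "P") matches any character that does not have the "Latin" Unicode script property. Digits, spaces and punctuation belong to the "Common" script, so a string containing them is not reported as Latin-only.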
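The behaviour is easiest to see on a few inputs. The sketch below repeats the extension and adds a second, hypothetical property (hasOnlyLatinLetters, my own name) that ignores anything that is not a letter; it assumes you want spaces and punctuation to be tolerated.

```swift
import Foundation

extension String {
    /// Strict check: every character must have the Latin script property.
    var latinCharactersOnly: Bool {
        return self.range(of: "\\P{Latin}", options: .regularExpression) == nil
    }

    /// Hypothetical variant: only letters are checked.
    /// The pattern finds a letter (\p{L}) that is not Latin (negative lookahead).
    var hasOnlyLatinLetters: Bool {
        return self.range(of: "(?!\\p{Latin})\\p{L}", options: .regularExpression) == nil
    }
}

print("Hello".latinCharactersOnly)          // true
print("Привет".latinCharactersOnly)         // false
print("Hello, world".latinCharactersOnly)   // false: comma and space are not Latin
print("Hello, world".hasOnlyLatinLetters)   // true
print("Hello, мир".hasOnlyLatinLetters)     // false
```

Which of the two is appropriate depends on whether the input is a single word or free text.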

Check if string latin or cyrillic

What about something like this?

extension String {
    var isLatin: Bool {
        let upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        let lower = "abcdefghijklmnopqrstuvwxyz"

        // Every character must be one of the basic Latin letters.
        for c in self {
            if !upper.contains(c) && !lower.contains(c) {
                return false
            }
        }

        return true
    }

    var isCyrillic: Bool {
        // Full Russian alphabet (the letters Ё, Ъ, Ы and Э were missing).
        let upper = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
        let lower = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

        // Every character must be one of the Russian Cyrillic letters.
        for c in self {
            if !upper.contains(c) && !lower.contains(c) {
                return false
            }
        }

        return true
    }

    var isBothLatinAndCyrillic: Bool {
        return self.isLatin && self.isCyrillic
    }
}

Usage:

let s = "Hello"
if s.isLatin && !s.isBothLatinAndCyrillic {
    // String is Latin
} else if s.isCyrillic && !s.isBothLatinAndCyrillic {
    // String is Cyrillic
} else if s.isBothLatinAndCyrillic {
    // String can be either Latin or Cyrillic
} else {
    // String is neither Latin nor Cyrillic
}

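Consider that some letters look the same in both alphabets, for example:

let s = "A"

This could be read as either Latin or Cyrillic, which is why there is an "is both" property. Note that the code compares characters, not shapes: Latin "A" (U+0041) and Cyrillic "А" (U+0410) are distinct characters, so as written isBothLatinAndCyrillic is true only for the empty string.

The string can also be neither: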

let s = "*"
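An alternative that does not depend on hard-coded alphabets is to use the Unicode script properties in a regular expression. This is a sketch of a different technique from the answer above; the ScriptKind enum and the scriptKind(of:) function are hypothetical names.

```swift
import Foundation

/// Hypothetical classification type for this sketch.
enum ScriptKind {
    case empty, latin, cyrillic, mixedOrOther
}

/// Classifies a string by checking that no character falls outside a script.
func scriptKind(of text: String) -> ScriptKind {
    if text.isEmpty { return .empty }
    // \P{Latin} matches any character that is NOT Latin; no match means all Latin.
    if text.range(of: "\\P{Latin}", options: .regularExpression) == nil {
        return .latin
    }
    // Same check for the Cyrillic script.
    if text.range(of: "\\P{Cyrillic}", options: .regularExpression) == nil {
        return .cyrillic
    }
    return .mixedOrOther
}

print(scriptKind(of: "Hello"))    // latin
print(scriptKind(of: "Привет"))   // cyrillic
print(scriptKind(of: "*"))        // mixedOrOther
```

Unlike the alphabet lists, this also accepts accented Latin letters and Cyrillic letters used outside Russian.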

How to limit UILabel's max characters

You can iterate over the string's indices and count the characters, adding 2 for a Chinese character and 1 for anything else. As soon as the next character would push the count past 16, return the substring before it with "…" appended. Checking for "past 16" rather than "equal to 16" matters because a Chinese character can move the count from 15 straight to 17. Something like:

import Foundation

extension Character {
    var isChinese: Bool {
        String(self).range(of: "\\p{Han}", options: .regularExpression) != nil
    }
}


extension StringProtocol {
    var limitedLengthLabelText: String {
        var count = 0
        for index in indices {
            let width = self[index].isChinese ? 2 : 1
            // Stop before the first character that would exceed the limit of 16.
            if count + width > 16 {
                return String(self[..<index]) + "…"
            }
            count += width
        }
        return String(self)
    }
}


"蝙蝠侠喜欢猫女和阿福".limitedLengthLabelText   // "蝙蝠侠喜欢猫女和…"

"Batman喜欢猫女和阿福".limitedLengthLabelText // "Batman喜欢猫女和…"

