swift 3 - search result also with diacritics
You can use range(of:options:) with options for a diacritic-insensitive (and optionally case-insensitive) search. Example:
let list = ["holesovice", "holešovice"]
let searchTerm = "sovi"
let filtered = list.filter {
    $0.range(of: searchTerm, options: [.diacriticInsensitive, .caseInsensitive]) != nil
}
print(filtered) // ["holesovice", "holešovice"]
Search arabic text while ignoring diacritics or accents
NSStringCompareOptions also has a DiacriticInsensitiveSearch option that you can use (in the same way as the case-insensitive one).
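As a minimal sketch of the same idea in current Swift, where the option is spelled .diacriticInsensitive: the helper name containsIgnoringDiacritics and the sample strings are made up for illustration, and the assumption is that Arabic vowel marks are treated like any other combining diacritic.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): true if `text` contains
// `term` when diacritics and case are ignored.
func containsIgnoringDiacritics(_ text: String, _ term: String) -> Bool {
    return text.range(of: term, options: [.diacriticInsensitive, .caseInsensitive]) != nil
}

let vocalized = "مُحَمَّد"   // sample word written with vowel marks
let plain = "محمد"          // the same word without them
print(containsIgnoringDiacritics(vocalized, plain))

// The same call with Latin text.
print(containsIgnoringDiacritics("Café au lait", "cafe")) // true
```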
Diacritic Sensitive sort for objects fetched from core data?
There are two approaches I can think of:
1. "Therefore should I sort the results after fetching them from core data?" - that's one way of doing it. Apple doesn't document the exact complexity of string sorting, but I think the bigger problem is the need to first fetch all objects from the persistent store. If you have a lot of data, this can hurt performance. It's best to profile it and only then decide whether the performance is acceptable.
2. You can try to use NSString methods which are translated into SQL: localizedStandardCompare:, localizedCompare: or localizedCaseInsensitiveCompare:. A sort descriptor using any of these methods can be created in the following way:
sortDescriptor = [NSSortDescriptor sortDescriptorWithKey:@"sortTitle"
                                               ascending:YES
                                                selector:@selector(localizedCaseInsensitiveCompare:)];
If none of these methods sorts the data the way you want, you can also preprocess it beforehand, e.g. when the title changes you remove the diacritics etc. (see Normalize User-Generated Content at NSHipster - CFStringTransform).
UPDATE: Let me assume the title attribute is named title and the title for sorting is named sortTitle. In a subclass of NSManagedObject you can override didChangeValueForKey: as follows:
- (void)didChangeValueForKey:(NSString *)key
{
    [super didChangeValueForKey:key];
    if ([key isEqualToString:@"title"]) {
        // CFStringTransform edits the string in place, so it needs a mutable copy.
        NSMutableString *cleanTitle = [self.title mutableCopy];
        CFStringTransform((__bridge CFMutableStringRef)(cleanTitle), NULL, kCFStringTransformStripCombiningMarks, NO);
        self.sortTitle = [cleanTitle copy];
    }
}
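The preprocessing idea can also be sketched in Swift without Core Data. Note that this sketch swaps CFStringTransform for folding(options:locale:), and that makeSortTitle and the sample titles are hypothetical names and data used only for illustration:

```swift
import Foundation

// Hypothetical helper: builds the value that would be stored in `sortTitle`
// by removing diacritics and case differences from the title.
func makeSortTitle(from title: String) -> String {
    return title.folding(options: [.diacriticInsensitive, .caseInsensitive], locale: nil)
}

let titles = ["Zebra", "éclair", "Apple", "Émile"]

// Sorting by the derived key keeps accented titles next to their base letters.
let sortedTitles = titles.sorted { makeSortTitle(from: $0) < makeSortTitle(from: $1) }
print(sortedTitles) // ["Apple", "éclair", "Émile", "Zebra"]
```

Storing the derived key in its own attribute is what lets the persistent store sort on it directly instead of sorting in memory after the fetch.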
How to remove diacritics from a String in Swift?
This can also be done by applying a StringTransform:
let foo = "één"
let bar = foo.applyingTransform(.stripDiacritics, reverse: false)!
print(bar) // een
Or by implementing a custom property on StringProtocol:
extension StringProtocol {
    var strippingDiacritics: String {
        applyingTransform(.stripDiacritics, reverse: false)!
    }
}
let bar = foo.strippingDiacritics
print(bar) // een
swift string diacriticInsensitive not working correct
This precisely matches the meaning of diacriticInsensitive. UTR #30 covers this. "Diacritic removal" includes "stroke, hook, descender" and all other "diacritics" returning the "related base character." While in Swedish å is considered a separate letter for sorting purposes, it still has a "base character" of (Latin) a. (Similarly for ä and ö.) This is a complex problem in Swedish, but the results should not be surprising.
The ultimate rules are in Unicode's DiacriticFolding. These rules are not locale specific. It's possible that Foundation applies some additional locale rules, but clearly not in this case. The relevant Unicode folding rule is:
0061 030A; 0061 # å → a LATIN SMALL LETTER A, COMBINING RING ABOVE → LATIN SMALL LETTER A
Many cultures have subtle definitions of what is "a letter" vs "an extension of another letter" vs "a half-letter" vs "a non-letter symbol." When computing diacritics, the Turkish "İ" has a base form of "I", but "i" does not have a base form of "ı". That's bizarre, but true, because it's treating "basic latin" as the base alphabet. ("Basic Latin" is itself a bizarre classification, with letters j, u, and w being somewhat modern additions. But still we call it "Latin.")
Unicode tries to "thread the needle" on these complex issues, with varying success. It tends to be biased towards Romance languages (and particularly Western European countries). But it does try. And it has a focus on what users will expect. So should a search for "halla" find "Hallå"? I'm betting that most Swedes would consider that "close enough."
Keyboards are designed to be useful to the cultures they're created for, so whether a particular symbol appears on the keyboard shouldn't be assumed to be making any strong claim about how the alphabet works. The iOS Arabic keyboard includes the half-letter "ء". That isn't making a claim about how the alphabet works. It's just saying that ء is somewhat commonly typed when writing Arabic.
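The folding behaviour described in this answer can be checked with a short sketch; the strings are illustrative only:

```swift
import Foundation

let word = "Hallå"

// Folding removes the ring above and leaves the base letter.
let folded = word.folding(options: .diacriticInsensitive, locale: nil)
print(folded) // Halla

// A diacritic-insensitive search for "halla" therefore matches "Hallå".
let foundWhenInsensitive = word.range(of: "halla", options: [.diacriticInsensitive, .caseInsensitive]) != nil
print(foundWhenInsensitive) // true

// Without the option, å and a are treated as different characters.
let foundWhenSensitive = word.range(of: "halla", options: .caseInsensitive) != nil
print(foundWhenSensitive) // false
```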
FMDB: SQLite Statement ORDER BY orders diacritics incorrectly
You can define your own SQLite function that removes the accents, here using String's folding(options:locale:). Using FMDB 2.7:
db.makeFunctionNamed("unaccented", arguments: 1) { context, argc, argv in
    guard db.valueType(argv[0]) == .text || db.valueType(argv[0]) == .null else {
        db.resultError("Expected string parameter", context: context)
        return
    }
    if let string = db.valueString(argv[0])?.folding(options: .diacriticInsensitive, locale: nil) {
        db.resultString(string, context: context)
    } else {
        db.resultNull(context: context)
    }
}
You can then use this new unaccented function in your SQL:
do {
    // executeQuery(_:values:) throws on failure, so the error path is a catch block.
    let rs = try db.executeQuery("SELECT * FROM spesenValues ORDER BY unaccented(country) ASC", values: nil)
    while rs.next() {
        // do what you want with results
    }
    rs.close()
} catch {
    NSLog("executeQuery error: %@", db.lastErrorMessage())
}
You suggest that you want to replace "ä", "ö", and "ü" with "ae", "oe", and "ue", respectively. This is generally only done with proper names and geographical names (see Wikipedia's entry for German orthography), but if you wanted to do that, have your custom function (which I've renamed "sortstring") replace these values as appropriate:
db.makeFunctionNamed("sortstring", arguments: 1) { context, argc, argv in
    guard argc == 1 && (db.valueType(argv[0]) == .text || db.valueType(argv[0]) == .null) else {
        db.resultError("Expected string parameter", context: context)
        return
    }
    // A NULL argument yields a NULL result instead of crashing on a forced unwrap.
    guard var string = db.valueString(argv[0])?.lowercased() else {
        db.resultNull(context: context)
        return
    }
    let replacements = ["ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"]
    for (searchString, replacement) in replacements {
        string = string.replacingOccurrences(of: searchString, with: replacement)
    }
    db.resultString(string.folding(options: .diacriticInsensitive, locale: nil), context: context)
}
By the way, since you're using this just for sorting, you probably want to convert this to lowercase, too, so that the upper case values are not separated from the lower case values.
But the idea is the same: define whatever function you want for sorting, and then you can use FMDB's makeFunctionNamed to make it available in SQLite.
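Because the key-building logic does not depend on SQLite, it can be tried out on its own before being wrapped in a custom function. A minimal sketch, where germanSortKey and the sample country names are hypothetical and used only for illustration:

```swift
import Foundation

// Hypothetical helper mirroring the "sortstring" logic: lowercase the value,
// expand German umlauts and ß, then fold away any remaining diacritics.
func germanSortKey(_ value: String) -> String {
    var key = value.lowercased()
    let replacements = ["ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"]
    for (searchString, replacement) in replacements {
        key = key.replacingOccurrences(of: searchString, with: replacement)
    }
    return key.folding(options: .diacriticInsensitive, locale: nil)
}

let countries = ["Ägypten", "Zypern", "Österreich", "Albanien"]
let sortedCountries = countries.sorted { germanSortKey($0) < germanSortKey($1) }
print(sortedCountries) // ["Ägypten", "Albanien", "Österreich", "Zypern"]
```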
Bounding box of character with diacritics using CoreText
In the first example, you seem to ignore the fact that the bounding rect for glyphs most probably has a negative y origin. The returned rect usually treats y=0 as the baseline for text. You thus set an offset in the bounds rect, and that is probably also the reason the layer has an offset in the text. (I didn't try it, but I think so.)
If you're not interested in the bounds of a specific text but in choosing a height that encloses all kinds of text, you might also want to go for CTFontGetBoundingBox.
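To make the baseline point concrete, here is a small arithmetic sketch. It does not call CoreText; the rect values are invented to stand in for what a glyph bounding-box query might return, and drawingOffset is a hypothetical helper:

```swift
import Foundation

// Hypothetical glyph bounds relative to the baseline (y = 0): the glyph
// reaches 4 points below the baseline and 16 points above it.
let glyphBounds = CGRect(x: 1, y: -4, width: 10, height: 20)

// Offset to add to the text position so that the glyph's bounding box
// starts at (0, 0) in a layer sized to `glyphBounds.size`.
func drawingOffset(for bounds: CGRect) -> CGPoint {
    return CGPoint(x: -bounds.origin.x, y: -bounds.origin.y)
}

let offset = drawingOffset(for: glyphBounds)
print(offset.x, offset.y) // -1.0 4.0
```

If the negative origin is used directly as the layer's bounds origin instead of being compensated like this, the drawn text would be expected to appear shifted by the same amount.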
Check if string contains special characters in Swift
Your code checks whether no character in the string is from the given set.
What you want is to check if any character is not in the given set:
if (searchTerm!.rangeOfCharacterFromSet(characterSet.invertedSet).location != NSNotFound){
println("Could not handle special characters")
}
You can also achieve this using regular expressions:
let regex = NSRegularExpression(pattern: ".*[^A-Za-z0-9].*", options: nil, error: nil)!
if regex.firstMatchInString(searchTerm!, options: nil, range: NSMakeRange(0, searchTerm!.length)) != nil {
println("could not handle special characters")
}
The pattern [^A-Za-z0-9] matches a character which is not from the ranges A-Z, a-z, or 0-9.
Update for Swift 2:
let searchTerm = "a+b"
let characterset = NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
if searchTerm.rangeOfCharacterFromSet(characterset.invertedSet) != nil {
    print("string contains special characters")
}
Update for Swift 3:
let characterset = CharacterSet(charactersIn: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
if searchTerm.rangeOfCharacter(from: characterset.inverted) != nil {
    print("string contains special characters")
}
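The Swift 3 version can be wrapped in a reusable function. In this sketch the name containsSpecialCharacters is made up for illustration; note that with an ASCII-only allowed set, letters with diacritics also count as special:

```swift
import Foundation

// Hypothetical helper: true if `text` contains any character outside
// the ASCII letters and digits.
func containsSpecialCharacters(_ text: String) -> Bool {
    let allowed = CharacterSet(charactersIn: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    return text.rangeOfCharacter(from: allowed.inverted) != nil
}

print(containsSpecialCharacters("a+b"))        // true
print(containsSpecialCharacters("abc123"))     // false
print(containsSpecialCharacters("holešovice")) // true, because š is not in the allowed set
```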