Case Insensitive 'Contains(String)'

Case insensitive 'Contains(string)'

To test if the string paragraph contains the string word (thanks @QuarterMeister)

culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

Where culture is the instance of CultureInfo describing the language that the text is written in.

This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.

Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeia word. (Turks, please correct me if I'm wrong, or suggest a better example)

To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.

How to check if a String contains another String in a case insensitive manner in Java?

Yes, contains is case sensitive. You can use java.util.regex.Pattern with the CASE_INSENSITIVE flag for case insensitive matching:

Pattern.compile(Pattern.quote(wantedStr), Pattern.CASE_INSENSITIVE).matcher(source).find();

EDIT: If s2 contains regex special characters (of which there are many) it's important to quote it first. I've corrected my answer since it is the first one people will see, but vote up Matt Quail's since he pointed this out.

String contains - ignore case

You can use

org.apache.commons.lang3.StringUtils.containsIgnoreCase(CharSequence str,
CharSequence searchStr);

Checks if CharSequence contains a search CharSequence irrespective of
case, handling null. Case-insensitivity is defined as by
String.equalsIgnoreCase(String).

A null CharSequence will return false.

This one will be better than regex as regex is always expensive in terms of performance.

For official doc, refer to : StringUtils.containsIgnoreCase

Update :

If you are among the ones who

  • don't want to use Apache commons library
  • don't want to go with the expensive regex/Pattern based solutions,
  • don't want to create additional string object by using toLowerCase,

you can implement your own custom containsIgnoreCase using java.lang.String.regionMatches

public boolean regionMatches(boolean ignoreCase,
int toffset,
String other,
int ooffset,
int len)

ignoreCase : if true, ignores case when comparing characters.

public static boolean containsIgnoreCase(String str, String searchStr)     {
if(str == null || searchStr == null) return false;

final int length = searchStr.length();
if (length == 0)
return true;

for (int i = str.length() - length; i >= 0; i--) {
if (str.regionMatches(true, i, searchStr, 0, length))
return true;
}
return false;
}

Make String.Contains() case insensitive

If x.IndexOf(y, StringComparison.CurrentCultureIgnoreCase) >= 0 Then
'do nothing
End If

What is the fastest, case insensitive, way to see if a string contains another string in C#?

Since ToUpper would actually result in a new string being created, StringComparison.OrdinalIgnoreCase would be faster, also, regex has a lot of overhead for a simple compare like this. That said, String.IndexOf(String, StringComparison.OrdinalIgnoreCase) should be the fastest, since it does not involve creating new strings.

I would guess (there I go again) that RegEx has the better worst case because of how it evaluates the string, IndexOf will always do a linear search, I'm guessing (and again) that RegEx is using something a little better. RegEx should also have a best case which would likely be close, though not as good, as IndexOf (due to additional complexity in it's language).

15,000 length string, 10,000 loop

00:00:00.0156251 IndexOf-OrdinalIgnoreCase
00:00:00.1093757 RegEx-IgnoreCase
00:00:00.9531311 IndexOf-ToUpper
00:00:00.9531311 IndexOf-ToLower

Placement in the string also makes a huge difference:

At start:
00:00:00.6250040 Match
00:00:00.0156251 IndexOf
00:00:00.9687562 ToUpper
00:00:01.0000064 ToLower

At End:
00:00:00.5781287 Match
00:00:01.0468817 IndexOf
00:00:01.4062590 ToUpper
00:00:01.4218841 ToLower

Not Found:
00:00:00.5625036 Match
00:00:01.0000064 IndexOf
00:00:01.3750088 ToUpper
00:00:01.3906339 ToLower

Case-insensitive contains in Linq

the easy way is to use ToLower() method

var lists = rec.Where(p => p.Name.ToLower().Contains(records.Name.ToLower())).ToList();

a better solution (based on this post: Case insensitive 'Contains(string)')

 var lists = rec.Where(p => 
CultureInfo.CurrentCulture.CompareInfo.IndexOf
(p.Name, records.Name, CompareOptions.IgnoreCase) >= 0).ToList();

Case-Insensitive List Search

Instead of String.IndexOf, use String.Equals to ensure you don't have partial matches. Also don't use FindAll as that goes through every element, use FindIndex (it stops on the first one it hits).

if(testList.FindIndex(x => x.Equals(keyword,  
StringComparison.OrdinalIgnoreCase) ) != -1)
Console.WriteLine("Found in list");

Alternately use some LINQ methods (which also stops on the first one it hits)

if( testList.Any( s => s.Equals(keyword, StringComparison.OrdinalIgnoreCase) ) )
Console.WriteLine("found in list");

Find a substring in a case-insensitive way - C#

You can use the IndexOf() method, which takes in a StringComparison type:

string s = "foobarbaz";
int index = s.IndexOf("BAR", StringComparison.CurrentCultureIgnoreCase); // index = 3

If the string was not found, IndexOf() returns -1.

Linq IEqualityComparerstring Ignore Case

Update as per comment

Search string inside another string:

var matchEle = listOfElements
.Where(e => e.Properties().Any(p => p.Name.IndexOf("Key", System.StringComparison.OrdinalIgnoreCase) >= 0))
.First();

Old solutions

You have two options, that depends on Name type:

1 - Without IEqualityComparer, and if Name in Properties is a string. replace Contains by Equals like :

var matchEle = listOfElements
.Where(e => e.Properties().Any(p => p.Name.Equals("Key", StringComparison.OrdinalIgnoreCase)))
.First();

2 - With IEqualityComparer, and if Name in Properties is a list of string:

2.1 : Create a custom comparer, like:

public class StringIEqualityComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Equals(y, StringComparison.OrdinalIgnoreCase);
}

public int GetHashCode(string obj)
{
return obj.GetHashCode();
}
}

2.2 : change little your query to :

var matchEle = listOfElements
.Where(e => e.Properties().Any(p => p.Name.Contains("Key", new StringIEqualityComparer())))
.First();

I hope this helps you.



Related Topics



Leave a reply



Submit