Python Wildcard Search in String

Python wildcard search in string

Use fnmatch:

import fnmatch
lst = ['this','is','just','a','test']
filtered = fnmatch.filter(lst, 'th?s')

If you want to allow _ as a wildcard, replace every underscore in the pattern with ? (matches exactly one character) or * (matches any number of characters, including none) before passing it to fnmatch.
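The underscore substitution can be sketched as follows. The helper name `underscore_filter` is hypothetical, and the sketch assumes `_` should stand for exactly one character.

```python
import fnmatch

def underscore_filter(names, pattern):
    """Filter names, treating '_' in pattern as a single-character wildcard.

    Hypothetical helper, not part of the fnmatch module.
    """
    # Translate the user's '_' into fnmatch's '?' before matching.
    glob_pattern = pattern.replace('_', '?')
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in names if fnmatch.fnmatchcase(name, glob_pattern)]

words = ['this', 'is', 'just', 'a', 'test']
result = underscore_filter(words, 'th_s')
print(result)  # ['this']
```

Replacing `_` with `*` instead would let one underscore cover a run of characters of any length.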

If you want your users to use even more powerful filtering options, consider allowing them to use regular expressions.
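As a rough sketch of what regex-based filtering could look like, the snippet below uses fnmatch.translate() to turn a shell-style pattern into a regular expression string, so both kinds of pattern can go through the re module. The sample list and the pattern 't*' are only illustrative.

```python
import fnmatch
import re

words = ['this', 'is', 'just', 'a', 'test']

# translate() converts the shell-style pattern into an anchored regex string.
regex = re.compile(fnmatch.translate('t*'))

# match() is enough here because the translated regex anchors the end itself.
matches = [w for w in words if regex.match(w)]
print(matches)  # ['this', 'test']
```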

Wildcard matching in Python

It looks like you're essentially implementing a subset of regular expressions. Luckily, Python has a built-in library for that: the re module. If you're not familiar with how regular expressions (or, as their friends call them, regexes) work, I highly recommend reading through the documentation for them.

In any event, the function re.search is, I think, exactly what you're looking for. Its first argument is the pattern to match and its second is the string to search. If the pattern is found, search returns a match object (shown as re.Match in recent Python versions and as SRE_Match in older ones), which conveniently has a start() method that returns the index at which the match begins. If nothing matches, search returns None.

To use the data from your example:

import re
start_index = re.search(r'x.z', 'xxxxxgzg').start()

Note that in regexes, . (not *) is the single-character wildcard, so you'll have to replace each * with . in the pattern you're using.
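Because re.search returns None when nothing matches, calling .start() directly on the result can raise an AttributeError. A minimal sketch of a guarded lookup, where the helper name `find_start` and the -1 return value are assumptions chosen for illustration:

```python
import re

def find_start(pattern, text):
    """Return the index where pattern first matches in text, or -1 if it does not.

    Hypothetical helper; -1 mirrors the convention used by str.find.
    """
    match = re.search(pattern, text)
    # Only call start() when a match object was actually returned.
    return match.start() if match is not None else -1

print(find_start(r'x.z', 'xxxxxgzg'))  # 4
print(find_start(r'x.z', 'abcdef'))    # -1
```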

Wildcard string search

For this simple form of wildcard search, you can use the in operator:

i_list = ['hello', 'good', 'bad', 'bye', 'yup', 'yupnogood', 'hellogood']
final_list = [f"{i}_hello" for i in i_list if 'nogood' in i]

If you want something more complex, you could look into regular expressions.
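For instance, a condition that the in operator cannot express on its own, such as "ends in good with at least one character before it", might be written with a regular expression like this. The pattern is only an illustrative choice:

```python
import re

i_list = ['hello', 'good', 'bad', 'bye', 'yup', 'yupnogood', 'hellogood']

# '.+' requires at least one character before 'good'; '$' pins it to the end.
pattern = re.compile(r'.+good$')
final_list = [item for item in i_list if pattern.search(item)]
print(final_list)  # ['yupnogood', 'hellogood']
```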

How to use wildcard in string matching

You are going to want to look at the re module. It lets you match strings against regular expressions, which can do the same thing as * does on the Linux command line (in a regex, the equivalent of * is .*).
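One way this could look, as a sketch: split the pattern on *, escape the literal pieces, and join them with .* so that each * matches any run of characters. The helper name `glob_to_regex` and the file names are made up for the example.

```python
import re

def glob_to_regex(pattern):
    """Build a regex string in which '*' matches any run of characters.

    Hypothetical helper; every other character is matched literally.
    """
    return '.*'.join(re.escape(part) for part in pattern.split('*'))

regex = re.compile(glob_to_regex('data*.csv'))

# fullmatch requires the whole string to fit the pattern, like a shell glob.
print(bool(regex.fullmatch('data_2021.csv')))  # True
print(bool(regex.fullmatch('notes.csv')))      # False
```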

String Matching with wildcard in Python

The idea is to convert what you are looking for, ABCDEF in this case, into the following regular expression:

([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)

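Each character is placed in [] so that it is matched literally even if it is a regex special character, and each group also accepts a literal . because that is the wildcard character in the string being searched. The only complication is if one of the search characters is ^, as in ABCDEF^. A ^ at the start of a character class negates it, so that character is escaped with a backslash instead and is therefore handled specially.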

Then you search the string for that pattern using re.search:

import re

substring = 'ABCDEF'
large_string = 'QQQQQABC.EF^QQQQQ'

new_substring = re.sub(r'([^^])', r'([\1]|\\.)', substring)
new_substring = re.sub(r'\^', r'(\\^|\\.)', new_substring)
print(new_substring)
regex = re.compile(new_substring)
m = regex.search(large_string)
if m:
    print(m.span())

Prints:

([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)
(5, 11)
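A variation on the same idea, offered as a sketch rather than as the method above: re.escape can escape each character instead of wrapping it in [], which also covers ^ and the backslash without a second substitution pass. The search string ABCDEF^ is taken from the special case mentioned earlier.

```python
import re

substring = 'ABCDEF^'
large_string = 'QQQQQABC.EF^QQQQQ'

# For each character, accept either the character itself (escaped)
# or a literal '.', which acts as the wildcard in large_string.
pattern = ''.join(f'(?:{re.escape(ch)}|\\.)' for ch in substring)

m = re.search(pattern, large_string)
if m:
    print(m.span())  # (5, 12)
```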

Python: Find If Substring Exists in String Including Wildcard

def search(fullstring, substring):
    def check(s1, s2):
        # Compare position by position; '*' in the pattern matches any character.
        for a, b in zip(s1, s2):
            if a != b and b != "*":
                return False
        return True

    # Slide a window of len(substring) across fullstring.
    for i in range(len(fullstring) - len(substring) + 1):
        if check(fullstring[i : i + len(substring)], substring):
            return True

    return False

print(search("hitherehello", "the*e"))

Prints:

True

More tests:

print(search("hiXherehello", "*he*e")) # True
print(search("hitherXhello", "the*e")) # False
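For comparison, here is a sketch of the same check using the re module, assuming * should match exactly one character as it does in search above. The function name `search_regex` is hypothetical.

```python
import re

def search_regex(fullstring, substring):
    """Return True if substring occurs in fullstring, with '*' matching any one character."""
    # Escape the literal pieces and join them with '.', the regex
    # single-character wildcard.
    pattern = '.'.join(re.escape(part) for part in substring.split('*'))
    # DOTALL lets '.' match a newline too, like the character-by-character loop.
    return re.search(pattern, fullstring, flags=re.DOTALL) is not None

print(search_regex("hitherehello", "the*e"))  # True
print(search_regex("hiXherehello", "*he*e"))  # True
print(search_regex("hitherXhello", "the*e"))  # False
```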

Find strings in list using wildcard

You can use fnmatch.filter() for this:

import fnmatch
l = ['RT07010534.txt', 'RT07010533.txt', 'RT02010534.txt']
pattern = 'RT0701*.txt'
matching = fnmatch.filter(l, pattern)
print(matching)

Outputs:

['RT07010534.txt', 'RT07010533.txt']

Is there a wildcard character in python

You can access the last character by indexing with -1:

lst = ['&', 'A', 'B', 'C']

s = 'some random string which ends on &'

if s[-1] in lst:
    print('hurray!')

# hurray!

Alternatively, you can use .endswith() if it's only a few entries:

s = 'some random string which ends on &'

if s.endswith('&') or s.endswith('A'):
    print('hurray!')

# hurray!
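When there are more than a couple of candidates, str.endswith also accepts a tuple of suffixes, which avoids chaining or. A short sketch using the same list; the variable name `ends_with_any` is just for illustration:

```python
lst = ['&', 'A', 'B', 'C']
s = 'some random string which ends on &'

# endswith returns True if the string ends with any suffix in the tuple.
ends_with_any = s.endswith(tuple(lst))
print(ends_with_any)  # True
```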

Since you also asked how to replace the last character, this can be done like this:

s = s[:-1] + '!'
# s is now 'some random string which ends on !'

As per your comment, here is a wildcard solution:

import re

s = r' &'
# A space followed by any single character at the end of the string
pattern = r' .{1}$'
if re.search(pattern, s):
    print('hurray!')

# hurray!

Search string in string with wildcard char

One way is to replace every letter of the search pattern with a group that accepts either that letter or 'N'.

You can build the whole pattern with a comprehension:

import re

raw_pattern = 'QWER'
# Each letter may be matched by itself or by the wildcard 'N'.
pattern = ''.join('(?:' + letter + '|N)' for letter in raw_pattern)
# pattern == '(?:Q|N)(?:W|N)(?:E|N)(?:R|N)'

Then

sentence = 'QNENVFRZ'
re.findall(pattern, sentence)
# ['QNEN']

If the resulting list is not empty, the pattern was found in the sentence.

Edit: the question was later changed so that 'N' is accepted only in place of 'B', 'W', or 'C'. In that case, build the pattern like this:

pattern = ''.join(['(?:' + letter + '|N)' if letter in ('B', 'W', 'C') else letter for letter in list(raw_pattern)])
# pattern = 'Q(?:W|N)ER'

With this pattern the original example no longer matches, because 'N' can no longer stand in for 'R'.
We get:

re.findall(pattern, sentence)
# []

An empty list means nothing matched, which you can test directly:

re.findall(pattern, sentence) == []
# True
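Both variants can be folded into one function, sketched below. The name `build_pattern` and its parameters are hypothetical, and re.escape is added on the assumption that the search pattern might contain regex special characters.

```python
import re

def build_pattern(raw_pattern, wildcard='N', replaceable=None):
    """Build a regex in which the wildcard may stand in for letters.

    If replaceable is None, every letter may be replaced by the wildcard;
    otherwise only the letters in replaceable may be. Hypothetical helper.
    """
    parts = []
    for letter in raw_pattern:
        escaped = re.escape(letter)
        if replaceable is None or letter in replaceable:
            parts.append(f'(?:{escaped}|{re.escape(wildcard)})')
        else:
            parts.append(escaped)
    return ''.join(parts)

sentence = 'QNENVFRZ'

# Any letter may be replaced by 'N'.
print(bool(re.search(build_pattern('QWER'), sentence)))  # True
# Only 'B', 'W' and 'C' may be replaced by 'N'.
print(bool(re.search(build_pattern('QWER', replaceable={'B', 'W', 'C'}), sentence)))  # False
```

Using re.search and checking the result for truthiness avoids building the full list that re.findall returns when only a yes/no answer is needed.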

