Regular Expressions: Search in List

With a compiled pattern r, filter returns an iterator in Python 3.x and a list in Python 2.x:

filter(r.match, mylist)

To get a list in Python 3.x, wrap the call in list(): list(filter(r.match, mylist)). Avoid naming your own variable list, since that shadows the built-in.
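
As a minimal sketch of the above, assuming a made-up word list and pattern (neither comes from the original answer):

```python
import re

# Hypothetical sample data and pattern, chosen only for illustration.
words = ["apple", "banana", "avocado", "cherry"]
r = re.compile(r"a")  # r.match only succeeds at the start of each string

matches = list(filter(r.match, words))
print(matches)  # ['apple', 'avocado']
```

Only the words that start with "a" survive, because r.match anchors at the beginning of each string.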

Fastest and most elegant way to check whether a list contains an element matching a regular expression

As Loocid suggested, you can use any. I would do it with a generator expression like so:

import re

newlist = ['this', 'thiis', 'thas', 'sada']
regex = re.compile('th.s')

result = any(regex.match(word) for word in newlist)
print(result)  # True

Here is another version with map, which can be slightly faster (in Python 3, map is lazy, so any still stops at the first match):

result = any(map(regex.match, newlist))

RegEx for matching specific pattern in a Python list

In your pattern you are using 4 alternations but you are not taking the word data into account.

You could use re.match instead, which anchors the match at the beginning of the string, together with the pattern data\d+$ to match data followed by one or more digits up to the end of the string:

import re
strings_of_text = ['data0', 'data23', 'data2', 'data55', 'data_mismatch', 'green']
strings_to_keep = []
expression_to_use = r'data\d+$'

for string in strings_of_text:
    # Keep the string if it is "data" followed only by digits
    if re.match(expression_to_use, string):
        strings_to_keep.append(string)

print(strings_to_keep)

Instead of building a separate list, you could filter the collection and reassign it, for example with filter:

import re
strings_of_text = ['data0', 'data23', 'data2', 'data55', 'data_mismatch', 'green']
expression_to_use = r'data\d+$'

strings_of_text = list(filter(lambda x: re.match(expression_to_use, x), strings_of_text))
print(strings_of_text)

Result

['data0', 'data23', 'data2', 'data55']
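
One caveat worth knowing: $ also matches just before a trailing newline, so re.match with data\d+$ would accept 'data7\n'. If that matters, re.fullmatch is a possible alternative. This is a sketch only; the extra 'data7\n' entry and the name kept are assumptions added for illustration:

```python
import re

# Sample list from above, shortened, plus a hypothetical entry ending in a newline.
strings_of_text = ['data0', 'data23', 'data_mismatch', 'green', 'data7\n']
pattern = re.compile(r'data\d+')

# fullmatch requires the whole string to match, trailing newline included.
kept = [s for s in strings_of_text if pattern.fullmatch(s)]
print(kept)  # ['data0', 'data23']

# For comparison, match with $ still accepts the newline-terminated string.
print(bool(re.match(r'data\d+$', 'data7\n')))  # True
```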

If any strings in a list match regex

You can use the builtin any():

import re

r = re.compile('.*search.*')
if any(r.match(line) for line in output):
    do_stuff()

Passing a lazy generator to any() lets it exit on the first match, without checking any further into the iterable.
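
To see the early exit, here is a small sketch that records which lines actually get tested. The sample output list and the helper is_hit are hypothetical, added only for this demonstration:

```python
import re

r = re.compile(r".*search.*")
output = ["first line", "a search hit", "never inspected"]  # hypothetical data

checked = []  # records which lines were actually tested

def is_hit(line):
    checked.append(line)
    return r.match(line) is not None

found = any(is_hit(line) for line in output)
print(found)    # True
print(checked)  # ['first line', 'a search hit']
```

The third line is never tested, because any() stops as soon as one element is truthy.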

Matching list of regular expression to list of strings

Use any() to test if any of the regular expressions match, rather than looping over the entire list.

Compile all the regular expressions first, so this doesn't have to be done repeatedly.

reg_list = [re.compile(rx) for rx in reg_list]

for word in y:
    if any(rx.search(word) for rx in reg_list):
        RESULT_LIST.append(word)
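
A self-contained version of the same idea might look like this. The patterns and words are invented stand-ins for reg_list and y, which the answer does not show:

```python
import re

# Hypothetical inputs standing in for reg_list and y from the answer.
patterns = [r"^ab", r"\d{2}"]
words = ["abacus", "room 42", "plain", "cab"]

# Compile once, then keep every word that at least one pattern matches.
compiled = [re.compile(p) for p in patterns]
result = [w for w in words if any(rx.search(w) for rx in compiled)]
print(result)  # ['abacus', 'room 42']
```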

Find words in a list that match the input string using Regular Expressions

If you don't need to find overlapping matches, you can turn the list into a regular expression that uses | to match alternatives. Then use re.findall() to get all the matches.

import re

words = ["123","hello","nice","red","boy"]
string = "helloniceboy"
regex = re.compile('|'.join(re.escape(x) for x in words))
result = re.findall(regex, string)

re.escape() ensures that the words will be matched literally, even if they contain characters that have special meaning in regular expressions.

If you do need to find overlapping matches, the other answer that uses if word in input in a loop will work better.
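
To illustrate the difference, here is a sketch with a hypothetical word list in which "on" straddles "hello" and "nice" in the input:

```python
import re

words = ["hello", "on", "nice"]  # hypothetical list; "on" overlaps two other words
string = "helloniceboy"

# Alternation consumes text as it matches, so overlapping words are skipped.
regex = re.compile("|".join(re.escape(w) for w in words))
non_overlapping = regex.findall(string)

# A plain substring test checks every word independently.
contained = [w for w in words if w in string]

print(non_overlapping)  # ['hello', 'nice']
print(contained)        # ['hello', 'on', 'nice']
```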

How to match any string from a list of strings in regular expressions in python?

Join the list on the pipe character |, which represents different options in regex.

import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "I love to have fun."

print(re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x))

Output: ['fun']

You cannot use match, as it only matches at the start of the string, and search returns only the first match, so use findall instead.

Also use lookahead if you have overlapping matches not starting at the same point.
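
The effect of the lookahead shows up when the alternatives overlap. A small sketch, using a made-up list and input rather than the ones from the question:

```python
import re

string_lst = ['ana', 'nan']  # hypothetical alternatives that overlap in the input
x = "banana"

alternation = '|'.join(re.escape(s) for s in string_lst)

# Plain alternation consumes characters, so overlapping hits are lost.
plain = re.findall(alternation, x)

# A capturing group inside a lookahead matches without consuming input.
overlapping = re.findall(r"(?=(" + alternation + r"))", x)

print(plain)        # ['ana']
print(overlapping)  # ['ana', 'nan', 'ana']
```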

Regular Expression in Middle of a String in a List (Python)

With filter(r.search, mylist), you just receive all items where there is a regex match anywhere inside an item. When you use filter(r.match, mylist), you only get items where the match is at the start of the string.

You may use

import re
mylist = ["dog", "cat named bob", "wildcat", "thundercat", "cow also named bob", "hooo"]
r = re.compile('named')
# You might go through the list, check whether there is a match
# by running re.search, and if there is, extract it
newlist = [r.search(x).group() for x in mylist if r.search(x)]
print(newlist)
# Or use map to get the match objects first, then keep
# those that are not None and retrieve their values
newlist = [x.group() for x in map(r.search, mylist) if x]
print(newlist)
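
The difference between filter(r.search, mylist) and filter(r.match, mylist) described above can be seen on the same list. The pattern 'cat' here is a hypothetical choice, picked because it occurs both at the start and in the middle of items:

```python
import re

mylist = ["dog", "cat named bob", "wildcat", "thundercat", "cow also named bob", "hooo"]
r = re.compile('cat')  # hypothetical pattern for this comparison

anywhere = list(filter(r.search, mylist))  # match anywhere in the item
at_start = list(filter(r.match, mylist))   # match only at the start

print(anywhere)  # ['cat named bob', 'wildcat', 'thundercat']
print(at_start)  # ['cat named bob']
```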


Regular expressions: search a list, but return a list of the same size

Here's one way:

[m.group(0) if m else "" for m in map(r.match, mylist)]

Produces:

['', 'cat', 'wildcat', 'thundercat', '', '']
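
The answer does not show its inputs. As a runnable sketch, the list and pattern below are assumptions chosen so that they reproduce the output above:

```python
import re

mylist = ["dog", "cat", "wildcat", "thundercat", "cow", "hooo"]  # assumed input
r = re.compile(r".*cat")  # assumed pattern

# Non-matching items become "", so the result keeps the input's length.
result = [m.group(0) if m else "" for m in map(r.match, mylist)]
print(result)  # ['', 'cat', 'wildcat', 'thundercat', '', '']
```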

Find a specific pattern (regular expression) in a list of strings (Python)

Use the re.search function together with a list comprehension.

>>> import re
>>> teststr = ['1 FirstString', '2x Sec String', '3rd String', 'x forString', '5X fifth']
>>> [i for i in teststr if re.search(r'\d+[xX]', i) ]
['2x Sec String', '5X fifth']

\d+ matches one or more digits. [xX] matches both upper and lowercase x.

Or define it as a separate function:

>>> def SomeFunc(s):
...     return [i for i in s if re.search(r'\d+[xX]', i)]
...
>>> print(SomeFunc(['1 FirstString', '2x Sec String', '3rd String', 'x forString', '5X fifth']))
['2x Sec String', '5X fifth']

