Python- How To Remove Elements From a List Containing a Specific Word
A list comprehension accomplishes this easily. The in operator checks whether a keyword occurs in each element of the given list.
list1= [ 'one', 'one-test', 'two', 'two-test', 'three', 'three-test']
newList = [elements for elements in list1 if '-test' not in elements]
output
['one', 'two', 'three']
How to remove items from a list that contains words found in items in another list
Lists should not be modified while they're being iterated over. Doing so can have undesirable side effects, such as the loop skipping over items.
Generally in Python you should avoid loops that add and remove elements from lists one at a time. Usually those kinds of loops can be replaced with more idiomatic list comprehensions.
[sa for sa in a if not any(sb in sa for sb in b)]
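As a self-contained sketch of that comprehension, assuming a holds the strings to filter and b the words to exclude (the sample values below are made up for illustration):

```python
# Hypothetical sample data: `a` is the list to filter, `b` the words to exclude.
a = ["apple pie", "banana split", "cherry tart", "apple crumble"]
b = ["apple", "cherry"]

# Keep an item only if none of the words in `b` occurs inside it.
filtered = [sa for sa in a if not any(sb in sa for sb in b)]
print(filtered)  # ['banana split']
```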
For what it's worth, one way to fix your loops as written would be to iterate over a copy of the list so the loop isn't affected by the changes to the original.
for i in a[:]:
    for x in b:
        if x in i:
            a.remove(i)
            break  # stop after the first match so i is not removed twice
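To see the skipping problem concretely, here is a small sketch (the function names and data are made up for illustration) that compares removing items while iterating over the list itself with removing them while iterating over a copy:

```python
def remove_while_iterating(items, needle):
    """Buggy: mutates `items` while looping over it, so neighbours get skipped."""
    for item in items:
        if needle in item:
            items.remove(item)
    return items


def remove_using_copy(items, needle):
    """Loops over a shallow copy, so removals do not disturb the iteration."""
    for item in items[:]:
        if needle in item:
            items.remove(item)
    return items


print(remove_while_iterating(["x-test", "y-test", "z"], "-test"))  # ['y-test', 'z']
print(remove_using_copy(["x-test", "y-test", "z"], "-test"))       # ['z']
```

In the first call, removing 'x-test' shifts 'y-test' into the slot the loop has already visited, so it is never examined.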
Remove an item from a list if it only contains one word
Check the length of the list resulting from splitting the string, e.g.:
my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [x for x in my_list if len(x.strip().split()) > 1]
print(my_list)
# ['how is it going today?', 'It is nice weather outside']
strip() removes leading and trailing whitespace, while split() divides the string into whitespace-separated substrings.
This is not very efficient because it creates a temporary list for every string, which can be avoided in a number of common scenarios.
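The intermediate values make the filter easier to follow. This short sketch simply prints what each step returns for two of the strings used above:

```python
text = ' word'

print(repr(text.strip()))         # 'word'
print(text.strip().split())       # ['word']
print(len(text.strip().split()))  # 1, so this item is filtered out

sentence = 'how is it going today?'
print(sentence.split())       # ['how', 'is', 'it', 'going', 'today?']
print(len(sentence.split()))  # 5, so this item is kept
```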
If you can assume that separators are not present at the beginning and/or end of the string, a more efficient approach (which would work for multi-char separators) would be:
separators = ' ', '\t', '\n' # etc.
my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [
    x for x in my_list
    if any(separator in x for separator in separators)]
print(my_list)
# ['how is it going today?', 'It is nice weather outside', ' word']
If you can assume that separators are single characters, an efficient approach that is also robust against leading and trailing separators would be:
separators = ' \t\n' # etc.
my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [
    x for x in my_list
    if any(separator in x.strip(separators) for separator in separators)]
print(my_list)
# ['how is it going today?', 'It is nice weather outside']
Remove an element in Python list with partial word in list
Use a comprehension:
>>> [i for i in ss if not i.startswith('Sheet')]
['14', '13', '11', '10', '9', '8', '6', '3', '2', '1', '0', '7', '4', '12', '5']
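The list ss is not shown in the answer, so here is a self-contained sketch with made-up contents. It also relies on the fact that str.startswith accepts a tuple of prefixes, which helps when several prefixes should be dropped at once:

```python
# Hypothetical contents; the original `ss` is not shown.
ss = ['14', 'Sheet1', '13', 'Sheet2', 'Chart1', '11']

no_sheets = [i for i in ss if not i.startswith('Sheet')]
print(no_sheets)  # ['14', '13', 'Chart1', '11']

# startswith also accepts a tuple of prefixes.
no_sheets_or_charts = [i for i in ss if not i.startswith(('Sheet', 'Chart'))]
print(no_sheets_or_charts)  # ['14', '13', '11']
```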
Delete item from list if it contains a substring from a "blacklist"
You could join the blacklist into one expression:
import re
blacklist = re.compile('|'.join([re.escape(word) for word in B]))
then filter words out if they match:
C = [word for word in A if not blacklist.search(word)]
Words in the pattern are escaped (so that . and other metacharacters are treated as literal characters), and joined into a series of | alternatives:
>>> '|'.join([re.escape(word) for word in B])
'XXX|BBB'
Demo:
>>> import re
>>> A = [ 'cat', 'doXXXg', 'monkey', 'hoBBBrse', 'fish', 'snake']
>>> B = ['XXX', 'BBB']
>>> blacklist = re.compile('|'.join([re.escape(word) for word in B]))
>>> [word for word in A if not blacklist.search(word)]
['cat', 'monkey', 'fish', 'snake']
This should outperform any explicit membership testing, especially as the number of words in your blacklist grows:
>>> import string, random, timeit
>>> def regex_filter(words, blacklist):
...     [word for word in A if not blacklist.search(word)]
...
>>> def any_filter(words, blacklist):
...     [word for word in A if not any(bad in word for bad in B)]
...
>>> words = [''.join([random.choice(string.ascii_letters) for _ in range(random.randint(3, 20))])
...          for _ in range(1000)]
>>> blacklist = [''.join([random.choice(string.ascii_letters) for _ in range(random.randint(2, 5))])
...              for _ in range(10)]
>>> timeit.timeit('any_filter(words, blacklist)', 'from __main__ import any_filter, words, blacklist', number=100000)
0.36232495307922363
>>> timeit.timeit('regex_filter(words, blacklist)', "from __main__ import re, regex_filter, words, blacklist; blacklist = re.compile('|'.join([re.escape(word) for word in blacklist]))", number=100000)
0.2499098777770996
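The above is meant to test 10 random short blacklisted words (2 - 5 characters) against a list of 1000 random words (3 - 20 characters long); the regex version took about 0.25 s against 0.36 s for the any() version. Note, however, that both functions as written iterate over the global list A rather than their words argument, and any_filter also uses the global B, so these timings were actually measured on the six-word demo list. Use the function parameters to benchmark the random data.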
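For anyone who wants to rerun the comparison, the following is a Python 3 sketch, not the original benchmark: both functions use their arguments and return the filtered list, the random_word helper is made up for brevity, and the repetition count is an arbitrary small value. Timings depend on the machine and the random data, so none are shown.

```python
import random
import re
import string
import timeit


def regex_filter(words, pattern):
    """Keep words in which the compiled blacklist pattern does not match."""
    return [word for word in words if not pattern.search(word)]


def any_filter(words, blacklist):
    """Keep words that contain none of the blacklisted substrings."""
    return [word for word in words if not any(bad in word for bad in blacklist)]


def random_word(low, high):
    """Hypothetical helper: a random ASCII-letter word of length low..high."""
    return ''.join(random.choice(string.ascii_letters)
                   for _ in range(random.randint(low, high)))


words = [random_word(3, 20) for _ in range(1000)]
blacklist = [random_word(2, 5) for _ in range(10)]
pattern = re.compile('|'.join(re.escape(word) for word in blacklist))

# Both approaches should agree on the result before being compared for speed.
assert regex_filter(words, pattern) == any_filter(words, blacklist)

runs = 100  # arbitrary; raise it for more stable numbers
print('any  :', timeit.timeit(lambda: any_filter(words, blacklist), number=runs))
print('regex:', timeit.timeit(lambda: regex_filter(words, pattern), number=runs))
```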
How to delete an item in a list if it exists?
1) Almost-English style: Test for presence using the in operator, then apply the remove method.
if thing in some_list: some_list.remove(thing)
The remove method will remove only the first occurrence of thing; in order to remove all occurrences you can use while instead of if.
while thing in some_list: some_list.remove(thing)
- Simple enough, probably my choice for small lists (can't resist one-liners)
2) Duck-typed, EAFP style: This shoot-first-ask-questions-last attitude is common in Python. Instead of testing in advance if the object is suitable, just carry out the operation and catch relevant Exceptions:
try:
    some_list.remove(thing)
except ValueError:
    pass  # or scream: thing not in some_list!
except AttributeError:
    call_security("some_list not quacking like a list!")
Of course, the second except clause in the example above is not only of questionable humor but totally unnecessary (the point was to illustrate duck-typing for people not familiar with the concept).
If you expect multiple occurrences of thing:
while True:
    try:
        some_list.remove(thing)
    except ValueError:
        break
- a little verbose for this specific use case, but very idiomatic in Python.
- this can perform better than #1 when thing is present, because #1 scans the list twice (once for in, once for remove)
- PEP 463 proposed a shorter syntax for try/except simple usage that would be handy here, but it was not approved.
However, with contextlib's suppress() context manager (introduced in Python 3.4) the above code can be simplified to this:
from contextlib import suppress

with suppress(ValueError, AttributeError):
    some_list.remove(thing)
Again, if you expect multiple occurrences of thing:
with suppress(ValueError):
    while True:
        some_list.remove(thing)
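Put together as a runnable sketch (the list contents are made up for illustration), the suppress() variant looks like this:

```python
from contextlib import suppress

some_list = [1, 2, 3, 2, 4, 2]  # hypothetical data
thing = 2

# remove() raises ValueError once no occurrence is left;
# suppress() swallows it, which also ends the otherwise infinite loop.
with suppress(ValueError):
    while True:
        some_list.remove(thing)

print(some_list)  # [1, 3, 4]
```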
3) Functional style: Around 1993, Python got lambda, reduce(), filter() and map(), courtesy of a Lisp hacker who missed them and submitted working patches*. You can use filter to remove elements from the list:
is_not_thing = lambda x: x is not thing
cleaned_list = filter(is_not_thing, some_list)
There is a shortcut that may be useful for your case: if you want to filter out empty items (in fact items where bool(item) == False, like None, zero, empty strings or other empty collections), you can pass None as the first argument:
cleaned_list = filter(None, some_list)
- In Python 2.x, filter(function, iterable) used to be equivalent to [item for item in iterable if function(item)] (or [item for item in iterable if item] if the first argument is None); in Python 3.x, it is now equivalent to (item for item in iterable if function(item)). The subtle difference is that filter used to return a list, now it works like a generator expression - this is OK if you are only iterating over the cleaned list and discarding it, but if you really need a list, you have to enclose the filter() call with the list() constructor.
- *These Lispy flavored constructs are considered a little alien in Python. Around 2005, Guido was even talking about dropping filter - along with companions map and reduce (they are not gone yet but reduce was moved into the functools module, which is worth a look if you like high order functions).
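A short Python 3 sketch of the filter() behaviour described in the first note above, using made-up data: filter() returns a lazy iterator that can be consumed only once, so wrap it in list() when a list is needed.

```python
some_list = [0, 1, '', 'a', None, [], [2]]  # hypothetical data

lazy = filter(None, some_list)   # an iterator in Python 3, not a list
cleaned_list = list(lazy)        # materialize the truthy items
print(cleaned_list)              # [1, 'a', [2]]

# The iterator is now exhausted, so a second pass yields nothing.
print(list(lazy))                # []
```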
4) Mathematical style: List comprehensions became the preferred style for list manipulation in Python since introduced in version 2.0 by PEP 202. The rationale behind it is that list comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.
cleaned_list = [ x for x in some_list if x is not thing ]
Generator expressions were introduced in version 2.4 by PEP 289. A generator expression is better for situations where you don't really need (or want) to have a full list created in memory - like when you just want to iterate over the elements one at a time. If you are only iterating over the list, you can think of a generator expression as a lazy evaluated list comprehension:
for item in (x for x in some_list if x is not thing):
    do_your_thing_with(item)
- See this Python history blog post by GvR.
- This syntax is inspired by the set-builder notation in math.
- Python 3 has also set and dict comprehensions.
- you may want to use the inequality operator != instead of is not (the difference is important)
- for critics of methods implying a list copy: contrary to popular belief, generator expressions are not always more efficient than list comprehensions - please profile before complaining
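The difference between != and is not shows up as soon as the list contains objects that are equal to thing without being the same object. A small sketch with made-up data:

```python
thing = [1, 2]
some_list = [[1, 2], thing, [3]]  # first item equals `thing` but is a distinct object

# `is not` compares identity: only the very same object is dropped.
by_identity = [x for x in some_list if x is not thing]
print(by_identity)  # [[1, 2], [3]]

# `!=` compares value: every item equal to `thing` is dropped.
by_equality = [x for x in some_list if x != thing]
print(by_equality)  # [[3]]
```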
How to delete item in nested list if it contains keyword?
You could join each of the tuples into a string and then check if any keyword is in the string to filter your list.
newlist = [m for m in mylist if not any(k for k in keywords if k in ' '.join(m))]
print(newlist)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
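The input data is not shown, so the following sketch assumes a list of (name, subjects) tuples and a keyword list chosen to reproduce the output above; both are hypothetical. It also shows a variation that checks each field separately, which avoids accidental matches spanning two joined fields.

```python
# Hypothetical inputs, chosen to reproduce the output shown above.
mylist = [('Bob', 'English'), ('Alice', 'History and Art'), ('Brian', 'Math and Gym')]
keywords = ['History', 'Science']

# Join each tuple into one string and drop it if any keyword occurs in it.
newlist = [m for m in mylist if not any(k in ' '.join(m) for k in keywords)]
print(newlist)  # [('Bob', 'English'), ('Brian', 'Math and Gym')]

# Variation: test each field on its own instead of the joined string.
per_field = [m for m in mylist
             if not any(k in field for field in m for k in keywords)]
print(per_field)  # [('Bob', 'English'), ('Brian', 'Math and Gym')]
```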
Find and delete list elements if matching a string
Normally, when we use a list comprehension, we build a new list and assign it to the same name as the old list. This gives the desired result, but it does not modify the old list in place.
To make sure the reference remains the same, you must use this:
>>> stringlist[:] = [x for x in stringlist if "Two" not in x]
>>> stringlist
['elementOne', 'elementThree']
Advantages:
Since it is assigning to a list slice, it will replace the contents with the same Python list object, so the reference remains the same, thereby preventing some bugs if it is being referenced elsewhere.
If you do this below, you will lose the reference to the original list.
>>> stringlist = [x for x in stringlist if "Two" not in x]
>>> stringlist
['elementOne', 'elementThree']
So to preserve the reference, you build the new list object and assign it to a slice of the original list.
To understand the subtle difference:
Let us take a list a1 containing some elements and assign list a2 equal to a1.
>>> a1 = [1,2,3,4]
>>> a2 = a1
Approach-1:
>>> a1 = [x for x in a1 if x<2]
>>> a1
[1]
>>> a2
[1, 2, 3, 4]
Approach-2:
>>> a1[:] = [x for x in a1 if x<2]
>>> a1
[1]
>>> a2
[1]
Approach-2 actually replaces the contents of the original a1 list whereas Approach-1 does not.
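This matters most when a function is expected to modify the caller's list. A minimal sketch, with a made-up helper name and data:

```python
def drop_matching(items, needle):
    """Remove, in place, every string in `items` that contains `needle`."""
    items[:] = [x for x in items if needle not in x]


stringlist = ['elementOne', 'elementTwo', 'elementThree']
alias = stringlist  # a second reference to the same list object

drop_matching(stringlist, 'Two')
print(stringlist)           # ['elementOne', 'elementThree']
print(alias is stringlist)  # True
print(alias)                # ['elementOne', 'elementThree']
```

With items = [...] inside the function instead of items[:] = [...], only the local name would be rebound and the caller's list would stay unchanged.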