How to Remove an Item from a List in Python If That Item Contains a Word

Python: How to Remove Elements From a List Containing a Specific Word

A list comprehension handles this easily: the in operator checks whether the keyword occurs in each element, and the comprehension keeps only the elements in which it does not.

list1 = ['one', 'one-test', 'two', 'two-test', 'three', 'three-test']
newList = [element for element in list1 if '-test' not in element]

Output:

['one', 'two', 'three']

How to remove items from a list that contains words found in items in another list

Lists should not be modified while they're being iterated over. Doing so can have undesirable side effects, such as the loop skipping over items.
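
The skipping effect can be seen in a minimal sketch. The list `items` below is made-up data, not taken from the question:

```python
# Hypothetical data: try to remove every entry containing '-test'
# while iterating over the same list.
items = ['one-test', 'two-test', 'three']

for item in items:
    if '-test' in item:
        # Removing shifts the later items left, so the loop's internal
        # index now points past the element that moved into this slot.
        items.remove(item)

print(items)
# ['two-test', 'three']
```

'two-test' survives because it slid into index 0 right after the loop had finished with that index.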

Generally in Python you should avoid loops that add and remove elements from lists one at a time. Usually those kinds of loops can be replaced with more idiomatic list comprehensions.

[sa for sa in a if not any(sb in sa for sb in b)]
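
As a self-contained sketch of the comprehension above (the contents of `a` and `b` are assumed here, since the question's data is not shown):

```python
# Hypothetical inputs: `a` holds the strings to filter,
# `b` holds the words that disqualify a string.
a = ['red apple', 'green pear', 'yellow banana', 'red cherry']
b = ['apple', 'banana']

# Keep a string only if none of the words in `b` occurs inside it.
filtered = [sa for sa in a if not any(sb in sa for sb in b)]

print(filtered)
# ['green pear', 'red cherry']
```

Note that `in` on strings is a substring test, so 'apple' would also match 'pineapple'.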

For what it's worth, one way to fix your loops as written would be to iterate over a copy of the list so the loop isn't affected by the changes to the original.

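for i in a[:]:  # iterate over a copy so removals don't disturb the loop
    for x in b:
        if x in i:
            a.remove(i)
            break  # i is already removed; stop checking the remaining words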

Remove an item from a list if it only contains one word

Check the length of the list resulting from splitting the string, e.g.:

my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [x for x in my_list if len(x.strip().split()) > 1]


print(my_list)
# ['how is it going today?', 'It is nice weather outside']

strip() removes leading and trailing whitespace, while split() divides the string into whitespace-separated substrings. (Called with no arguments, split() already ignores leading and trailing whitespace, so the strip() call is optional here.)
This is not very efficient because it creates a temporary list, which can be avoided in a number of common scenarios.


If you can assume that separators are not present at the beginning and/or end of the string, a more efficient approach (which would work for multi-char separators) would be:

separators = ' ', '\t', '\n'  # etc.


my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [
    x for x in my_list
    if any(separator in x for separator in separators)]


print(my_list)
# ['how is it going today?', 'It is nice weather outside', ' word']

If you can assume that the separators are single characters, an efficient approach that is robust against leading and trailing separators would be:

separators = ' \t\n'  # etc.


my_list = ['how is it going today?','good','the','It is nice weather outside',' word']
my_list = [
    x for x in my_list
    if any(separator in x.strip(separators) for separator in separators)]


print(my_list)
# ['how is it going today?', 'It is nice weather outside']

Remove an element in Python list with partial word in list

Use a comprehension:

>>> [i for i in ss if not i.startswith('Sheet')]
['14',
'13',
'11',
'10',
'9',
'8',
'6',
'3',
'2',
'1',
'0',
'7',
'4',
'12',
'5']
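
The list `ss` is not shown in the answer above. A runnable sketch with assumed contents might look like this; the second comprehension relies on the fact that str.startswith also accepts a tuple of prefixes:

```python
# Hypothetical input; the original `ss` is not part of the answer.
ss = ['Sheet1', '14', 'Chart2', '13', 'Sheet', '11']

# Drop every entry that begins with 'Sheet'.
kept = [i for i in ss if not i.startswith('Sheet')]
print(kept)
# ['14', 'Chart2', '13', '11']

# Drop entries beginning with any of several prefixes.
kept_both = [i for i in ss if not i.startswith(('Sheet', 'Chart'))]
print(kept_both)
# ['14', '13', '11']
```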

Delete item from list if it contains a substring from a "blacklist"

You could join the blacklist into one expression:

import re

blacklist = re.compile('|'.join([re.escape(word) for word in B]))

then filter words out if they match:

C = [word for word in A if not blacklist.search(word)]

Words in the pattern are escaped (so that . and other meta characters are not treated as such, but as literal characters instead), and joined into a series of | alternatives:

>>> '|'.join([re.escape(word) for word in B])
'XXX|BBB'

Demo:

>>> import re
>>> A = [ 'cat', 'doXXXg', 'monkey', 'hoBBBrse', 'fish', 'snake']
>>> B = ['XXX', 'BBB']
>>> blacklist = re.compile('|'.join([re.escape(word) for word in B]))
>>> [word for word in A if not blacklist.search(word)]
['cat', 'monkey', 'fish', 'snake']

This should outperform any explicit membership testing, especially as the number of words in your blacklist grows:

>>> import string, random, timeit
>>> def regex_filter(words, blacklist):
...     # blacklist is a compiled pattern here
...     return [word for word in words if not blacklist.search(word)]
...
>>> def any_filter(words, blacklist):
...     # blacklist is a plain list of substrings here
...     return [word for word in words if not any(bad in word for bad in blacklist)]
...
>>> words = [''.join([random.choice(string.ascii_letters) for _ in range(random.randint(3, 20))])
...          for _ in range(1000)]
>>> blacklist = [''.join([random.choice(string.ascii_letters) for _ in range(random.randint(2, 5))])
...              for _ in range(10)]
>>> timeit.timeit('any_filter(words, blacklist)', 'from __main__ import any_filter, words, blacklist', number=100000)
0.36232495307922363
>>> timeit.timeit('regex_filter(words, blacklist)', "from __main__ import re, regex_filter, words, blacklist; blacklist = re.compile('|'.join([re.escape(word) for word in blacklist]))", number=100000)
0.2499098777770996

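The above tests 10 random blacklisted short words (2 to 5 characters) against a list of 1000 random words (3 to 20 characters long). In the run shown, the regex version is roughly 45% faster (about 0.25 s against 0.36 s); exact figures depend on the machine, the Python version, and the data.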
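
To repeat the comparison on your own machine, here is a minimal Python 3 sketch. The helper `random_word`, the fixed seed, and the repeat count of 100 are assumptions made for illustration; no particular result is implied, so profile with data that resembles yours.

```python
import random
import re
import string
import timeit


def regex_filter(words, pattern):
    """Keep words in which the compiled pattern matches nowhere."""
    return [word for word in words if not pattern.search(word)]


def any_filter(words, blacklist):
    """Keep words that contain none of the blacklisted substrings."""
    return [word for word in words if not any(bad in word for bad in blacklist)]


def random_word(shortest, longest):
    """Hypothetical helper: a random ASCII word of random length."""
    length = random.randint(shortest, longest)
    return ''.join(random.choice(string.ascii_letters) for _ in range(length))


random.seed(0)  # fixed seed so repeated runs use the same data
words = [random_word(3, 20) for _ in range(1000)]
blacklist = [random_word(2, 5) for _ in range(10)]
pattern = re.compile('|'.join(re.escape(bad) for bad in blacklist))

# Both approaches must agree before their timings are worth comparing.
assert regex_filter(words, pattern) == any_filter(words, blacklist)

for name, func, arg in (('any', any_filter, blacklist),
                        ('regex', regex_filter, pattern)):
    seconds = timeit.timeit(lambda: func(words, arg), number=100)
    print(f'{name}: {seconds:.4f} s for 100 runs')
```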

How to delete an item in a list if it exists?

1) Almost-English style:

Test for presence using the in operator, then apply the remove method.

if thing in some_list: some_list.remove(thing)

The remove method removes only the first occurrence of thing; to remove all occurrences, use while instead of if.

while thing in some_list: some_list.remove(thing)    
  • Simple enough, probably my choice for small lists (can't resist one-liners)
2) Duck-typed, EAFP style:

This shoot-first-ask-questions-last attitude is common in Python. Instead of testing in advance if the object is suitable, just carry out the operation and catch relevant Exceptions:

try:
    some_list.remove(thing)
except ValueError:
    pass  # or scream: thing not in some_list!
except AttributeError:
    call_security("some_list not quacking like a list!")

Of course, the second except clause in the example above is not only of questionable humor but totally unnecessary (the point was to illustrate duck-typing for people not familiar with the concept).

If you expect multiple occurrences of thing:

while True:
    try:
        some_list.remove(thing)
    except ValueError:
        break
  • a little verbose for this specific use case, but very idiomatic in Python.
  • this performs better than #1
  • PEP 463 proposed a shorter syntax for try/except simple usage that would be handy here, but it was not approved.

However, with contextlib's suppress() context manager (introduced in Python 3.4), the above code can be simplified to this:

from contextlib import suppress

with suppress(ValueError, AttributeError):
    some_list.remove(thing)

Again, if you expect multiple occurrences of thing:

with suppress(ValueError):
    while True:
        some_list.remove(thing)
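
A runnable sketch of the suppress() variant, with made-up values for `some_list` and `thing`:

```python
from contextlib import suppress

# Hypothetical data for illustration.
some_list = ['a', 'b', 'a', 'c']
thing = 'a'

# remove() raises ValueError once no occurrence is left; suppress()
# swallows that exception, which also ends the otherwise endless loop.
with suppress(ValueError):
    while True:
        some_list.remove(thing)

print(some_list)
# ['b', 'c']

# Removing a value that is absent is now a silent no-op.
with suppress(ValueError):
    some_list.remove('missing')

print(some_list)
# ['b', 'c']
```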
3) Functional style:

Around 1993, Python got lambda, reduce(), filter() and map(), courtesy of a Lisp hacker who missed them and submitted working patches*. You can use filter to remove elements from the list:

is_not_thing = lambda x: x is not thing
cleaned_list = filter(is_not_thing, some_list)

There is a shortcut that may be useful for your case: if you want to filter out empty items (in fact items where bool(item) == False, like None, zero, empty strings or other empty collections), you can pass None as the first argument:

cleaned_list = filter(None, some_list)
  • [update]: in Python 2.x, filter(function, iterable) used to be equivalent to [item for item in iterable if function(item)] (or [item for item in iterable if item] if the first argument is None); in Python 3.x, it is now equivalent to (item for item in iterable if function(item)). The subtle difference is that filter used to return a list, now it works like a generator expression - this is OK if you are only iterating over the cleaned list and discarding it, but if you really need a list, you have to enclose the filter() call with the list() constructor.
  • *These Lisp-flavored constructs are considered a little alien in Python. Around 2005, Guido was even talking about dropping filter, along with its companions map and reduce (they are not gone yet, but reduce was moved into the functools module, which is worth a look if you like higher-order functions).
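
The Python 3 behaviour described in the update can be checked with a short sketch (the sample values are assumed):

```python
# Hypothetical data mixing truthy and falsy items.
some_list = [0, 'spam', None, '', 'eggs', []]

lazy = filter(None, some_list)  # in Python 3: a lazy filter object, not a list
cleaned_list = list(lazy)       # materialize it when a real list is needed

print(cleaned_list)
# ['spam', 'eggs']

# The filter object is an iterator, so it is exhausted after one pass.
print(list(lazy))
# []
```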
4) Mathematical style:

List comprehensions have been the preferred style for list manipulation in Python since they were introduced in version 2.0 by PEP 202. The rationale behind them is that list comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.

cleaned_list = [ x for x in some_list if x is not thing ]

Generator expressions were introduced in version 2.4 by PEP 289. A generator expression is better for situations where you don't really need (or want) to have a full list created in memory, such as when you just want to iterate over the elements one at a time. If you are only iterating over the list, you can think of a generator expression as a lazily evaluated list comprehension:

for item in (x for x in some_list if x is not thing):
    do_your_thing_with(item)
  • See this Python history blog post by GvR.
  • This syntax is inspired by the set-builder notation in math.
  • Python 3 has also set and dict comprehensions.
Notes

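  1. you may want to use the inequality operator != instead of is not (the difference is important: is not compares object identity, while != compares values, so an equal but distinct object is kept by is not and dropped by !=)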
  2. for critics of methods implying a list copy: contrary to popular belief, generator expressions are not always more efficient than list comprehensions - please profile before complaining
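
The difference mentioned in note 1 shows up as soon as the list holds an object that is equal to `thing` without being the same object. A small sketch with assumed data:

```python
# Hypothetical data: the first element is equal to `thing`
# but is a separate list object; the second element is `thing` itself.
thing = [1, 2]
some_list = [[1, 2], thing, [3]]

by_identity = [x for x in some_list if x is not thing]
by_equality = [x for x in some_list if x != thing]

print(by_identity)
# [[1, 2], [3]]
print(by_equality)
# [[3]]
```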

How to delete item in nested list if it contains keyword?

You could join each of the tuples into a string and then check if any keyword is in the string to filter your list.

newlist = [m for m in mylist if not any(k in ' '.join(m) for k in keywords)]

print(newlist)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
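
The inputs `mylist` and `keywords` are not shown in the answer. The sketch below uses invented values chosen to reproduce the output above, and adds a variant that tests each field separately instead of joining them:

```python
# Hypothetical data; the question's real mylist and keywords are not shown.
mylist = [('Bob', 'English'), ('Alice', 'Physics'),
          ('Brian', 'Math and Gym'), ('Carol', 'Art History')]
keywords = ['Physics', 'History']

# Join each tuple into one string and drop it if any keyword occurs in it.
newlist = [m for m in mylist
           if not any(k in ' '.join(m) for k in keywords)]
print(newlist)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]

# Variant: check field by field, so a keyword can never match
# across the boundary between two joined fields.
per_field = [m for m in mylist
             if not any(k in field for field in m for k in keywords)]
print(per_field)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
```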

Find and delete list elements if matching a string

Normally, when we use a list comprehension, we build a new list and assign it to the same name as the old list. This gives the desired result, but it does not modify the old list in place.

To make sure the reference remains the same, you must use this:

>>> stringlist[:] = [x for x in stringlist if "Two" not in x]
>>> stringlist
['elementOne', 'elementThree']

Advantages:

Since it is assigning to a list slice, it will replace the contents with the same Python list object, so the reference remains the same, thereby preventing some bugs if it is being referenced elsewhere.

If you do this below, you will lose the reference to the original list.

>>> stringlist = [x for x in stringlist if "Two" not in x]
>>> stringlist
['elementOne', 'elementThree']

So to preserve the reference, you build the new list object and assign it to the list slice.

To understand the subtle difference:

Let us take a list a1 containing some elements and assign list a2 equal to a1.

>>> a1 = [1,2,3,4]
>>> a2 = a1

Approach-1:

>>> a1 = [x for x in a1 if x<2]

>>> a1
[1]
>>> a2
[1, 2, 3, 4]

Approach-2:

>>> a1[:] = [x for x in a1 if x<2]

>>> a1
[1]
>>> a2
[1]

Approach-2 actually replaces the contents of the original a1 list whereas Approach-1 does not.
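
One place where this matters is inside a function: slice assignment changes the caller's list, whereas rebinding the parameter name would not. A minimal sketch, in which `remove_matching` and `alias` are hypothetical names introduced only for this example:

```python
def remove_matching(strings, word):
    """Hypothetical helper: drop, in place, every string containing word."""
    # Slice assignment replaces the contents of the list object passed in.
    strings[:] = [s for s in strings if word not in s]


stringlist = ['elementOne', 'elementTwo', 'elementThree']
alias = stringlist  # a second reference to the same list object

remove_matching(stringlist, 'Two')

print(alias)
# ['elementOne', 'elementThree']
print(alias is stringlist)
# True
```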


