Remove Substring from the String

How can I remove a substring from a given String?

In Java, you can use String.replace():

String helloWorld = "Hello World!";
String hellWrld = helloWorld.replace("o", ""); // "Hell Wrld!"

How to remove specific substrings from a set of strings in Python?

Strings are immutable, so str.replace returns a new string rather than modifying the original. This is stated in the documentation:

str.replace(old, new[, count])

Return a copy of the string with all occurrences of substring old replaced by new. [...]

This means you have to build a new set or re-populate the existing one (building a new one is easiest with a set comprehension):

new_set = {x.replace('.good', '').replace('.bad', '') for x in set1}

P.S. If you only want to remove a prefix or suffix of a string and you're using Python 3.9 or newer, use str.removeprefix() or str.removesuffix() instead:

new_set = {x.removesuffix('.good').removesuffix('.bad') for x in set1}
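
The difference between the two approaches shows up when the substring also appears in the middle of a string. Here is a minimal sketch, assuming a hypothetical set1 (the sample values are invented for illustration) and Python 3.9 or newer:

```python
# Hypothetical sample data; the values are illustrative only.
set1 = {"Apple.good", "Orange.good", "Pear.bad", "my.good.file.bad"}

# replace() removes every occurrence, wherever it appears in the string.
replaced = {x.replace(".good", "").replace(".bad", "") for x in set1}

# removesuffix() (Python 3.9+) only removes a match at the very end.
suffix_removed = {x.removesuffix(".good").removesuffix(".bad") for x in set1}

print(sorted(replaced))        # ['Apple', 'Orange', 'Pear', 'my.file']
print(sorted(suffix_removed))  # ['Apple', 'Orange', 'Pear', 'my.good.file']
```

Which one is appropriate depends on whether an inner ".good" should survive.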

How do I remove a list of substrings from a given string in Python?

You could just use str.replace() to replace each substring with "". Removing a substring can leave two or more consecutive spaces behind, so the final result needs to be split and re-joined with " " to leave a single space between words. You can use str.split() and str.join() for this.

string = "Play soccer tomorrow from 2pm to 3pm @homies"

times = ["tomorrow", "from 2pm to 3pm"]

for time in times:
    string = string.replace(time, "")

print(" ".join(string.split()))
# Play soccer @homies

Note: Strings are immutable in Python, so calling string.replace(time, "") on its own does not modify the string in place. You need to reassign the result with string = string.replace(time, "").
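
The same idea can be wrapped in a small function. This is only a sketch: the name remove_substrings is hypothetical, and sorting the substrings longest-first is an added assumption, so that a short entry such as "2pm" cannot break up a longer entry that contains it.

```python
def remove_substrings(text, substrings):
    """Remove each substring from text, then collapse runs of whitespace."""
    # Longest first (an assumption, not part of the answer above), so a short
    # substring cannot destroy a longer one that contains it.
    for sub in sorted(substrings, key=len, reverse=True):
        text = text.replace(sub, "")
    # split() with no arguments splits on any whitespace run and drops empties.
    return " ".join(text.split())


result = remove_substrings(
    "Play soccer tomorrow from 2pm to 3pm @homies",
    ["tomorrow", "from 2pm to 3pm"],
)
print(result)  # Play soccer @homies
```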

Remove substring from the string

In Ruby, you can use the slice! method:

a = "foobar"
a.slice! "foo"
=> "foo"
a
=> "bar"

There is also a non-'!' version, slice, which returns the matched substring but leaves the original string unchanged. See the documentation for details and for the other forms of the method:
http://www.ruby-doc.org/core/classes/String.html#method-i-slice-21

Remove substring from string like lstrip but not as single characters

If your goal is to remove all occurrences of ABC, then use replace as the other answers suggest. If you only want to remove repeated occurrences from the left, then use a regex:

import re

s = "ABCABCABCBCADABC"
re.sub("^(ABC)+", "", s) # 'BCADABC'
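
For comparison, here is a short sketch of why lstrip does not do the job, alongside a regex-free alternative. The helper name remove_leading is hypothetical and not part of the answer above.

```python
import re

s = "ABCABCABCBCADABC"

# lstrip() treats its argument as a set of characters, not as a substring,
# so it keeps stripping any leading 'A', 'B' or 'C' until it reaches 'D'.
stripped = s.lstrip("ABC")

# The anchored regex removes only whole leading repetitions of "ABC".
regex_result = re.sub(r"^(?:ABC)+", "", s)


def remove_leading(text, prefix):
    """Hypothetical helper: drop repeated leading copies of prefix."""
    if not prefix:
        return text  # avoid an infinite loop on an empty prefix
    while text.startswith(prefix):
        text = text[len(prefix):]
    return text


loop_result = remove_leading(s, "ABC")

print(stripped)      # DABC
print(regex_result)  # BCADABC
print(loop_result)   # BCADABC
```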

Remove substring only if followed by space or nothing (i.e. end of string) - Python

Actually, I think the logic you want here is:

remove_list = ['tree']
terms = r'\s*\b(?:' + '|'.join(remove_list) + r')\b\s*'

df['column'] = df['column'].str.replace(terms, ' ', regex=True).str.strip()

Note that the regex pattern used above is, for a one-word term list, \s*\b(?:tree)\b\s*. This will match only the exact word tree and not tree appearing as a substring of another word. We also grab any spaces on either side of the word. Then we replace the match with a single space, and trim the column to make sure there are no stray spaces at the start or end.
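
To check the word-boundary behaviour without a DataFrame, the same pattern can be tried with plain re.sub. This is a sketch under a few assumptions: the helper remove_terms and the sample strings are invented for illustration, and re.escape is an addition that guards against terms containing regex metacharacters.

```python
import re

remove_list = ["tree"]
# re.escape is an addition here; it is a no-op for a plain word like "tree".
terms = r"\s*\b(?:" + "|".join(map(re.escape, remove_list)) + r")\b\s*"


def remove_terms(text):
    """Hypothetical helper mirroring the str.replace(...).str.strip() chain."""
    return re.sub(terms, " ", text).strip()


print(remove_terms("big tree here"))     # big here
print(remove_terms("a pine tree"))       # a pine
print(remove_terms("treehouse street"))  # treehouse street
```

The last line is unchanged because neither "treehouse" nor "street" contains tree as a whole word.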

Edit:

To address the edge case put forth by @user2357112, consider the following input:

apple tree tree squirrel

In this case, the above solution would leave behind two spaces in between apple and squirrel. We can get around this by expanding our regex pattern to allow for multiple consecutive keyword matches:

terms = r'\s*\b(?:' + '|'.join(remove_list) + r')\b(?: \b(?:' + '|'.join(remove_list) + r'))*\b\s*'
df['column'] = df['column'].str.replace(terms, ' ', regex=True).str.strip()

Here we are using the following regex pattern:

\s*\b(?:tree)\b(?: \b(?:tree))*\b\s*
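
The two patterns can be compared side by side on the edge-case input. This sketch uses plain re.sub instead of a DataFrame column; the variable names and the use of re.escape are assumptions added for illustration.

```python
import re

remove_list = ["tree"]
alternation = "|".join(map(re.escape, remove_list))

# Original pattern: one keyword plus the surrounding whitespace.
simple = r"\s*\b(?:" + alternation + r")\b\s*"
# Expanded pattern: also consumes further keywords separated by one space.
expanded = (
    r"\s*\b(?:" + alternation + r")\b(?: \b(?:" + alternation + r"))*\b\s*"
)

text = "apple tree tree squirrel"
simple_result = re.sub(simple, " ", text).strip()
expanded_result = re.sub(expanded, " ", text).strip()

print(repr(simple_result))    # 'apple  squirrel' (two spaces)
print(repr(expanded_result))  # 'apple squirrel'
```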

