Replace Multiple Words from a String Based on the Values in an Array

PySpark replace multiple words in string column based on values in array column

Use aggregate function on text_entity array with splitted text column as the initial value like this:

from pyspark.sql import functions as F

jsonSting = """{"id":1,"text":"I talked with Christian today at Cafe Heimdal last Wednesday","text_entity":[{"word":"Christian","index":4,"start":14,"end":23},{"word":"Heimdal","index":8,"start":38,"end":45}]}"""
df = spark.read.json(spark.sparkContext.parallelize([jsonSting]))

df1 = df.withColumn(
"text",
F.array_join(
F.expr(r"""aggregate(
text_entity,
split(text, " "),
(acc, x) -> transform(acc, (y, i) -> IF(i=x.index, '(BLEEP)', y))
)"""),
" "
)
)

df1.show(truncate=False)
#+---+----------------------------------------------------------+----------------------------------------------+
#|id |text |text_entity |
#+---+----------------------------------------------------------+----------------------------------------------+
#|1 |I talked with (BLEEP) today at Cafe (BLEEP) last Wednesday|[{23, 4, 14, Christian}, {45, 8, 38, Heimdal}]|
#+---+----------------------------------------------------------+----------------------------------------------+

Replace multiple strings with multiple other strings

As an answer to:

looking for an up-to-date answer

If you are using "words" as in your current example, you might extend the answer of Ben McCormick using a non capture group and add word boundaries \b at the left and at the right to prevent partial matches.

\b(?:cathy|cat|catch)\b
  • \b A word boundary to prevent a partial match
  • (?: Non capture group
    • cathy|cat|catch match one of the alternatives
  • ) Close non capture group
  • \b A word boundary to prevent a partial match

Example for the original question:

let str = "I have a cat, a dog, and a goat.";
const mapObj = {
cat: "dog",
dog: "goat",
goat: "cat"
};
str = str.replace(/\b(?:cat|dog|goat)\b/gi, matched => mapObj[matched]);
console.log(str);

Replace multiple strings at once

You could extend the String object with your own function that does what you need (useful if there's ever missing functionality):

String.prototype.replaceArray = function(find, replace) {
var replaceString = this;
for (var i = 0; i < find.length; i++) {
replaceString = replaceString.replace(find[i], replace[i]);
}
return replaceString;
};

For global replace you could use regex:

String.prototype.replaceArray = function(find, replace) {
var replaceString = this;
var regex;
for (var i = 0; i < find.length; i++) {
regex = new RegExp(find[i], "g");
replaceString = replaceString.replace(regex, replace[i]);
}
return replaceString;
};

To use the function it'd be similar to your PHP example:

var textarea = $(this).val();
var find = ["<", ">", "\n"];
var replace = ["<", ">", "<br/>"];
textarea = textarea.replaceArray(find, replace);

Replace multiple words in string

If you're planning on having a dynamic number of replacements, which could change at any time, and you want to make it a bit cleaner, you could always do something like this:

// Define name/value pairs to be replaced.
var replacements = new Dictionary<string,string>();
replacements.Add("<Name>", client.FullName);
replacements.Add("<EventDate>", event.EventDate.ToString());

// Replace
string s = "Dear <Name>, your booking is confirmed for the <EventDate>";
foreach (var replacement in replacements)
{
s = s.Replace(replacement.Key, replacement.Value);
}

How to replace multiple substrings of a string?

Here is a short example that should do the trick with regular expressions:

import re

rep = {"condition1": "", "condition2": "text"} # define desired replacements here

# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.iteritems())
#Python 3 renamed dict.iteritems to dict.items so use rep.items() for latest versions
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)

For example:

>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'

Replacing multiple words in a string from different data sets in Python

The following program may be somewhat closer to what you are trying to accomplish. Please note that documentation has been included to help explain what is going on. The templates are a little different than yours but provide customization options.

#! /usr/bin/env python3
import random

PATH_TEMPLATE = './prototype/default_{}s.txt'

def main():
"""Demonstrate the StringReplacer class with a test sting."""
replacer = StringReplacer(PATH_TEMPLATE)
text = "Just been to see {film} in {location}, I'd highly recommend it!"
result = replacer.process(text)
print(result)

class StringReplacer:

"""StringReplacer(path_template) -> StringReplacer instance"""

def __init__(self, path_template):
"""Initialize the instance attribute of the class."""
self.path_template = path_template
self.cache = {}

def process(self, text):
"""Automatically discover text keys and replace them at random."""
keys = self.load_keys(text)
result = self.replace_keys(text, keys)
return result

def load_keys(self, text):
"""Discover what replacements can be made in a string."""
keys = {}
while True:
try:
text.format(**keys)
except KeyError as error:
key = error.args[0]
self.load_to_cache(key)
keys[key] = ''
else:
return keys

def load_to_cache(self, key):
"""Warm up the cache as needed in preparation for replacements."""
if key not in self.cache:
with open(self.path_template.format(key)) as file:
unique = set(filter(None, map(str.strip, file)))
self.cache[key] = tuple(unique)

def replace_keys(self, text, keys):
"""Build a dictionary of random replacements and run formatting."""
for key in keys:
keys[key] = random.choice(self.cache[key])
new_string = text.format(**keys)
return new_string

if __name__ == '__main__':
main()

(Multiple) Replace string with array

No jQuery is required.

var reps = {
UN: "Ali",
LC: "Turkey",
AG: "29",
...
};

return str.replace(/\[(\w+)\]/g, function(s, key) {
return reps[key] || s;
});
  • The regex /\[(\w+)\]/g finds all substrings of the form [XYZ].
  • Whenever such a pattern is found, the function in the 2nd parameter of .replace will be called to get the replacement.
  • It will search the associative array and try to return that replacement if the key exists (reps[key]).
  • Otherwise, the original substring (s) will be returned, i.e. nothing is replaced. (See In Javascript, what does it mean when there is a logical operator in a variable declaration? for how || makes this work.)


Related Topics



Leave a reply



Submit