Replace Text Based on a Dictionary

How to replace words in a string using a dictionary mapping

Here is one way.

a = "you don't need a dog"

d = {"don't": "do not" }

res = ' '.join([d.get(i, i) for i in a.split()])

# 'you do not need a dog'

Explanation

  • Never name a variable after a built-in class, e.g. use d instead of dict; otherwise the name shadows the built-in.
  • Use str.split to split by whitespace.
  • There is no need to wrap str around values which are already strings.
  • str.join works marginally faster with a list comprehension than with a generator expression, because join builds a list from its argument anyway.
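One limitation worth seeing in code: because str.split breaks only on whitespace, a word with punctuation attached (such as "dog.") is looked up as-is and will not match a key. The sketch below wraps the one-liner in a helper; the name replace_words and the extra "dog" entry are made up for illustration.

```python
def replace_words(text, mapping):
    """Replace whitespace-separated tokens found in mapping (hypothetical helper)."""
    return " ".join([mapping.get(word, word) for word in text.split()])

# "dog" -> "cat" is an invented entry, added only to show the punctuation case.
d = {"don't": "do not", "dog": "cat"}

plain = replace_words("you don't need a dog", d)
punctuated = replace_words("you don't need a dog.", d)  # "dog." is not a key

print(plain)
print(punctuated)
```

If punctuation matters for your input, a regular expression with word boundaries is probably a better fit than splitting.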

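Replace text based on a dictionary (awk)

Usage: awk -f foo.awk dict.dat user.dat

References:
http://www.gnu.org/software/gawk/manual/html_node/String-Functions.html
http://www.gnu.org/software/gawk/manual/html_node/Arrays.html

Note that gsub treats each key as a regular expression, so regex metacharacters in a key are interpreted rather than matched literally.

# First file (dict.dat): store each key/replacement pair.
NR == FNR {
    rep[$1] = $2
    next
}

# Remaining files: apply every replacement to the line, then print it.
{
    for (key in rep)
        gsub(key, rep[key])
    print
}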
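For comparison, a rough Python sketch of the same two-step idea: load key/replacement pairs first, then apply them to every line. It is not a faithful port. It assumes each dictionary line holds two whitespace-separated fields, skips lines with fewer, and treats keys as literal text rather than regular expressions. The function names and sample data are hypothetical.

```python
def load_replacements(dict_lines):
    """Build a mapping from lines of 'key replacement' pairs (first two fields)."""
    rep = {}
    for line in dict_lines:
        fields = line.split()
        if len(fields) >= 2:  # assumption: ignore malformed lines
            rep[fields[0]] = fields[1]
    return rep

def apply_replacements(text_lines, rep):
    """Apply every replacement to each line as a literal substring."""
    out = []
    for line in text_lines:
        for key, value in rep.items():
            line = line.replace(key, value)
        out.append(line)
    return out

# Invented sample data standing in for dict.dat and user.dat.
dict_lines = ["colour color", "centre center"]
user_lines = ["the colour of the centre", "no match here"]

rep = load_replacements(dict_lines)
result = apply_replacements(user_lines, rep)
print(result)
```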

Easiest way to replace a string using a dictionary of replacements?

Using re:

import re

s = 'Спорт not russianA'
d = {
    'Спорт': 'Досуг',
    'russianA': 'englishA',
}

# Escape each key so regex metacharacters in it are matched literally.
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, d)) + r')\b')
result = pattern.sub(lambda x: d[x.group()], s)
# Output: 'Досуг not englishA'

This will match whole words only. If you don't need that, use the pattern:

pattern = re.compile('|'.join(map(re.escape, d)))

Note that in this case you should sort the words descending by length if some of your dictionary entries are substrings of others.
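A minimal sketch of why the ordering matters, using an invented dictionary in which "cat" is a substring of "category". Regex alternation tries the alternatives left to right, so the shorter key wins unless the longer one comes first.

```python
import re

# Invented example data: "cat" is a substring of "category".
d = {"cat": "dog", "category": "group"}
s = "cat category"

# Unsorted: "cat" is tried first and also matches inside "category".
unsorted_pattern = re.compile("|".join(re.escape(k) for k in d))
unsorted_result = unsorted_pattern.sub(lambda m: d[m.group()], s)

# Sorted by descending length: "category" is tried before "cat".
keys = sorted(d, key=len, reverse=True)
sorted_pattern = re.compile("|".join(re.escape(k) for k in keys))
sorted_result = sorted_pattern.sub(lambda m: d[m.group()], s)

print(unsorted_result)
print(sorted_result)
```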

Replacing words in text file using a dictionary

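I used items() to iterate over the keys and values of your fields dict.

I skip blank lines with continue and strip trailing whitespace from the others with rstrip().

I replace every key found in the line with its value from your fields dict, and I write each line with print. With inplace=True, fileinput redirects standard output to the file, so the printed lines overwrite the original content.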

import fileinput

text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

for line in fileinput.input(text, inplace=True):
    line = line.rstrip()
    if not line:
        continue  # blank lines are dropped from the file
    for f_key, f_value in fields.items():
        if f_key in line:
            line = line.replace(f_key, f_value)
    print(line)
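An alternative sketch without fileinput, assuming the file is small enough to read into memory in one go. Unlike the loop above, it keeps blank lines and trailing whitespace untouched. The helper name replace_in_file is hypothetical, and the demonstration runs against a temporary file rather than a real one.

```python
from pathlib import Path
import tempfile

def replace_in_file(path, fields):
    """Rewrite the file at path, replacing each key in fields with its value."""
    p = Path(path)
    content = p.read_text(encoding="utf-8")
    for key, value in fields.items():
        content = content.replace(key, value)
    p.write_text(content, encoding="utf-8")

fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

# Demonstration on a throwaway file with invented content.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample file.txt"
    sample.write_text("pattern 1 here\n\npattern 2 there\n", encoding="utf-8")
    replace_in_file(sample, fields)
    rewritten = sample.read_text(encoding="utf-8")

print(rewritten)
```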

Replacing text with dictionary keys (having multiple values) in Python - more efficiency

You can build a reverse index from product to type by creating a dictionary whose keys are the products in the sublists and whose values are the types:

product_to_type = {}
for typ, product_lists in CountryList.items():
    for product_list in product_lists:
        for product in product_list:
            product_to_type[product] = typ

A little Python magic lets you compress this step into a dict comprehension:

product_to_type = {product: typ for typ, product_lists in CountryList.items()
                   for product_list in product_lists for product in product_list}

Then you can create a function that splits the ingredients and maps them to type and apply that to the dataframe.

import pandas as pd

CountryList = {'FRUIT': [['apple'], ['orange'], ['banana']],
               'CEREAL': [['oat'], ['wheat'], ['corn']],
               'MEAT': [['chicken'], ['lamb'], ['pork'], ['turkey'], ['duck']]}

product_to_type = {product: typ for typ, product_lists in CountryList.items()
                   for product_list in product_lists for product in product_list}

def convert_product_to_type(products):
    return " ".join(product_to_type.get(product, product)
                    for product in products.split(" "))

df = pd.DataFrame({'Dish': ['A', 'B', 'C'],
                   'Price': [15, 8, 20],
                   'Ingredient': ['apple banana apricot lamb ', 'wheat pork venison', 'orange lamb guinea']
                   })

df["Ingredient"] = df["Ingredient"].apply(convert_product_to_type)

print(df)

Note: This solution splits the ingredient list on single spaces, which assumes that the ingredients themselves don't contain spaces.
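If some ingredients did contain spaces, one possible workaround is a single regular expression built from the keys instead of splitting. The sketch below is an assumption-laden illustration: the "guinea fowl" entry and the function name convert_products are invented, and keys are sorted longest first so a multi-word product is tried before any shorter key it contains.

```python
import re

# Invented mapping; "guinea fowl" is a hypothetical multi-word product.
product_to_type = {"apple": "FRUIT", "guinea fowl": "MEAT", "lamb": "MEAT"}

# Longest keys first, each escaped, wrapped in word boundaries.
keys = sorted(product_to_type, key=len, reverse=True)
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")

def convert_products(text):
    """Replace every known product in text with its type."""
    return pattern.sub(lambda m: product_to_type[m.group()], text)

converted = convert_products("apple guinea fowl apricot lamb")
print(converted)
```

A function like this could be passed to df["Ingredient"].apply in the same way as convert_product_to_type.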

How do I replace letters in text using a dictionary?

You're iterating over the words in text, not over the letters in the words.

There's no need to use text.split(). Just iterate over text itself to get the letters. And then join them using an empty string.

res = "".join(Letters.get(i, i) for i in text)
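A short runnable sketch of the character-by-character approach, with invented values for Letters and text. It also shows str.translate with str.maketrans as an alternative; this works here because every key is a single character.

```python
# Invented sample mapping and input.
Letters = {"a": "4", "e": "3"}
text = "replace me"

# Iterate over the characters of the string, not over its words.
joined = "".join(Letters.get(ch, ch) for ch in text)

# Alternative: build a translation table from the same dict.
table = str.maketrans(Letters)
translated = text.translate(table)

print(joined)
print(translated)
```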

Replace text in a list of dictionaries

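The answer by @CryptoFool seems like the one you want. A slightly more blunt-force answer might be to just work with strings: serialize the list to JSON, replace every "." with "_", and parse the result back. Note that this changes the values as well as the keys.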

import json

orig = [
    {"health": "good", "status": "up", "date": "2022.03.10", "device.id": "device01"},
    {"health": "poor", "status": "down", "date": "2022.03.10", "device.id": "device02"}
]
orig_new = json.loads(json.dumps(orig).replace(".", "_"))
print(orig_new)

That will give you:

[
    {'health': 'good', 'status': 'up', 'date': '2022_03_10', 'device_id': 'device01'},
    {'health': 'poor', 'status': 'down', 'date': '2022_03_10', 'device_id': 'device02'}
]
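If the dots in values such as the date should be preserved, a sketch that renames only the keys might look like the following. This is an assumption about what may be wanted, not part of the answer above, and it handles only flat dictionaries.

```python
orig = [
    {"health": "good", "status": "up", "date": "2022.03.10", "device.id": "device01"},
    {"health": "poor", "status": "down", "date": "2022.03.10", "device.id": "device02"},
]

# Rebuild each dict with "." replaced by "_" in the keys only.
keys_only = [
    {key.replace(".", "_"): value for key, value in item.items()}
    for item in orig
]

print(keys_only)
```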

