Replace Specific Characters Within Strings

Replacing instances of a character in a string

Strings in python are immutable, so you cannot treat them as a list and assign to indices.

Use .replace() instead:

line = line.replace(';', ':')

If you need to replace only certain semicolons, you'll need to be more specific. You could use slicing to isolate the section of the string to replace in:

line = line[:10].replace(';', ':') + line[10:]

That'll replace all semi-colons in the first 10 characters of the string.

Replace specific characters within strings

With a regular expression and the function gsub():

group <- c("12357e", "12575e", "197e18", "e18947")
group
[1] "12357e" "12575e" "197e18" "e18947"

gsub("e", "", group)
[1] "12357" "12575" "19718" "18947"

What gsub does here is to replace each occurrence of "e" with an empty string "".


See ?regexp or gsub for more help.

Best way to replace multiple characters in a string?

Replacing two characters

I timed all the methods in the current answers along with one extra.

With an input string of abc&def#ghi and replacing & -> \& and # -> \#, the fastest way was to chain together the replacements like this: text.replace('&', '\&').replace('#', '\#').

Timings for each function:

  • a) 1000000 loops, best of 3: 1.47 μs per loop
  • b) 1000000 loops, best of 3: 1.51 μs per loop
  • c) 100000 loops, best of 3: 12.3 μs per loop
  • d) 100000 loops, best of 3: 12 μs per loop
  • e) 100000 loops, best of 3: 3.27 μs per loop
  • f) 1000000 loops, best of 3: 0.817 μs per loop
  • g) 100000 loops, best of 3: 3.64 μs per loop
  • h) 1000000 loops, best of 3: 0.927 μs per loop
  • i) 1000000 loops, best of 3: 0.814 μs per loop

Here are the functions:

def a(text):
chars = "&#"
for c in chars:
text = text.replace(c, "\\" + c)


def b(text):
for ch in ['&','#']:
if ch in text:
text = text.replace(ch,"\\"+ch)


import re
def c(text):
rx = re.compile('([&#])')
text = rx.sub(r'\\\1', text)


RX = re.compile('([&#])')
def d(text):
text = RX.sub(r'\\\1', text)


def mk_esc(esc_chars):
return lambda s: ''.join(['\\' + c if c in esc_chars else c for c in s])
esc = mk_esc('&#')
def e(text):
esc(text)


def f(text):
text = text.replace('&', '\&').replace('#', '\#')


def g(text):
replacements = {"&": "\&", "#": "\#"}
text = "".join([replacements.get(c, c) for c in text])


def h(text):
text = text.replace('&', r'\&')
text = text.replace('#', r'\#')


def i(text):
text = text.replace('&', r'\&').replace('#', r'\#')

Timed like this:

python -mtimeit -s"import time_functions" "time_functions.a('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.b('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.c('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.d('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.e('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.f('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.g('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.h('abc&def#ghi')"
python -mtimeit -s"import time_functions" "time_functions.i('abc&def#ghi')"

Replacing 17 characters

Here's similar code to do the same but with more characters to escape (\`*_{}>#+-.!$):

def a(text):
chars = "\\`*_{}[]()>#+-.!$"
for c in chars:
text = text.replace(c, "\\" + c)


def b(text):
for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
if ch in text:
text = text.replace(ch,"\\"+ch)


import re
def c(text):
rx = re.compile('([&#])')
text = rx.sub(r'\\\1', text)


RX = re.compile('([\\`*_{}[]()>#+-.!$])')
def d(text):
text = RX.sub(r'\\\1', text)


def mk_esc(esc_chars):
return lambda s: ''.join(['\\' + c if c in esc_chars else c for c in s])
esc = mk_esc('\\`*_{}[]()>#+-.!$')
def e(text):
esc(text)


def f(text):
text = text.replace('\\', '\\\\').replace('`', '\`').replace('*', '\*').replace('_', '\_').replace('{', '\{').replace('}', '\}').replace('[', '\[').replace(']', '\]').replace('(', '\(').replace(')', '\)').replace('>', '\>').replace('#', '\#').replace('+', '\+').replace('-', '\-').replace('.', '\.').replace('!', '\!').replace('$', '\$')


def g(text):
replacements = {
"\\": "\\\\",
"`": "\`",
"*": "\*",
"_": "\_",
"{": "\{",
"}": "\}",
"[": "\[",
"]": "\]",
"(": "\(",
")": "\)",
">": "\>",
"#": "\#",
"+": "\+",
"-": "\-",
".": "\.",
"!": "\!",
"$": "\$",
}
text = "".join([replacements.get(c, c) for c in text])


def h(text):
text = text.replace('\\', r'\\')
text = text.replace('`', r'\`')
text = text.replace('*', r'\*')
text = text.replace('_', r'\_')
text = text.replace('{', r'\{')
text = text.replace('}', r'\}')
text = text.replace('[', r'\[')
text = text.replace(']', r'\]')
text = text.replace('(', r'\(')
text = text.replace(')', r'\)')
text = text.replace('>', r'\>')
text = text.replace('#', r'\#')
text = text.replace('+', r'\+')
text = text.replace('-', r'\-')
text = text.replace('.', r'\.')
text = text.replace('!', r'\!')
text = text.replace('$', r'\$')


def i(text):
text = text.replace('\\', r'\\').replace('`', r'\`').replace('*', r'\*').replace('_', r'\_').replace('{', r'\{').replace('}', r'\}').replace('[', r'\[').replace(']', r'\]').replace('(', r'\(').replace(')', r'\)').replace('>', r'\>').replace('#', r'\#').replace('+', r'\+').replace('-', r'\-').replace('.', r'\.').replace('!', r'\!').replace('$', r'\$')

Here's the results for the same input string abc&def#ghi:

  • a) 100000 loops, best of 3: 6.72 μs per loop
  • b) 100000 loops, best of 3: 2.64 μs per loop
  • c) 100000 loops, best of 3: 11.9 μs per loop
  • d) 100000 loops, best of 3: 4.92 μs per loop
  • e) 100000 loops, best of 3: 2.96 μs per loop
  • f) 100000 loops, best of 3: 4.29 μs per loop
  • g) 100000 loops, best of 3: 4.68 μs per loop
  • h) 100000 loops, best of 3: 4.73 μs per loop
  • i) 100000 loops, best of 3: 4.24 μs per loop

And with a longer input string (## *Something* and [another] thing in a longer sentence with {more} things to replace$):

  • a) 100000 loops, best of 3: 7.59 μs per loop
  • b) 100000 loops, best of 3: 6.54 μs per loop
  • c) 100000 loops, best of 3: 16.9 μs per loop
  • d) 100000 loops, best of 3: 7.29 μs per loop
  • e) 100000 loops, best of 3: 12.2 μs per loop
  • f) 100000 loops, best of 3: 5.38 μs per loop
  • g) 10000 loops, best of 3: 21.7 μs per loop
  • h) 100000 loops, best of 3: 5.7 μs per loop
  • i) 100000 loops, best of 3: 5.13 μs per loop

Adding a couple of variants:

def ab(text):
for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
text = text.replace(ch,"\\"+ch)


def ba(text):
chars = "\\`*_{}[]()>#+-.!$"
for c in chars:
if c in text:
text = text.replace(c, "\\" + c)

With the shorter input:

  • ab) 100000 loops, best of 3: 7.05 μs per loop
  • ba) 100000 loops, best of 3: 2.4 μs per loop

With the longer input:

  • ab) 100000 loops, best of 3: 7.71 μs per loop
  • ba) 100000 loops, best of 3: 6.08 μs per loop

So I'm going to use ba for readability and speed.

Addendum

Prompted by haccks in the comments, one difference between ab and ba is the if c in text: check. Let's test them against two more variants:

def ab_with_check(text):
for ch in ['\\','`','*','_','{','}','[',']','(',')','>','#','+','-','.','!','$','\'']:
if ch in text:
text = text.replace(ch,"\\"+ch)

def ba_without_check(text):
chars = "\\`*_{}[]()>#+-.!$"
for c in chars:
text = text.replace(c, "\\" + c)

Times in μs per loop on Python 2.7.14 and 3.6.3, and on a different machine from the earlier set, so cannot be compared directly.

╭────────────╥──────┬───────────────┬──────┬──────────────────╮
│ Py, input ║ ab │ ab_with_check │ ba │ ba_without_check │
╞════════════╬══════╪═══════════════╪══════╪══════════════════╡
│ Py2, short ║ 8.81 │ 4.22 │ 3.45 │ 8.01 │
│ Py3, short ║ 5.54 │ 1.34 │ 1.46 │ 5.34 │
├────────────╫──────┼───────────────┼──────┼──────────────────┤
│ Py2, long ║ 9.3 │ 7.15 │ 6.85 │ 8.55 │
│ Py3, long ║ 7.43 │ 4.38 │ 4.41 │ 7.02 │
└────────────╨──────┴───────────────┴──────┴──────────────────┘

We can conclude that:

  • Those with the check are up to 4x faster than those without the check

  • ab_with_check is slightly in the lead on Python 3, but ba (with check) has a greater lead on Python 2

  • However, the biggest lesson here is Python 3 is up to 3x faster than Python 2! There's not a huge difference between the slowest on Python 3 and fastest on Python 2!

How do you replace a specific character in a string in R with an existing character?

You can use

str_replace(string=sec, pattern = "^(\\d)-$", "0\\1")

The regex matches

  • ^ - start of string
  • (\d) - capturing group #1 (\1 in the replacement pattern refers to the string captured in this group): a digit
  • - - a hyphen
  • $ - end of string.

How to replace specific characters within a string in JavaScript?

You need to move your alert line out of the for loop to the end, and use Array.join() to convert newa into a string without commas.

Also, it would make sense to move the newa array declaration inside the swit function.

Lastly, you should consider also adding an else condition where you just push the exact same current character onto the newa array if it is not an a or b, so the output string length is the same as the original string.

function swit(x) {   var newa = [];  for (var i = 0; i < x.length; i++) {    if (x[i] === 'a') {      newa.push('b');    } else if (x[i] === 'b') {      newa.push('a');    } else {      newa.push(x[i])    }  }  alert(newa.join(""));}
swit("aaab");swit("aasdfcvbab");

How to replace set or group of characters with string in Python

My knowledge tells me there are 3 different ways of doing this, all of which are shorter than your method:

  • Using a for-loop
  • Using a generator-comprehension
  • Using regular expressions

First, using a for-loop. This is probably the most straight-forward improvement to your code and essentially just reduces the 5 lines with .replace on down to 2:

def replace_all(text, repl):
for c in "aeiou":
text = text.replace(c, repl)
return text

You could also do it in one-line using a generator-comprehension, combined with the str.join method. This would be faster (if that is of importance) as it is of complexity O(n) since we will go through each character and evaluate it once (the first method is complexity O(n^5) as Python will loop through text five times for the different replaces).

So, this method is simply:

def replace_all(text, repl):
return ''.join(repl if c in 'aeiou' else c for c in text)

Finally, we can use re.sub to substitute all of the characters in the set: [aeiou] with the text repl. This is the shortest of the solutions and probably what I would recommend:

import re
def replace_all(text, repl):
return re.sub('[aeiou]', repl, text)

As I said at the start, all these methods complete the task so there is no point me providing individual test cases but they do work as seen in this test:

>>> replace_all('hello world', 'x')
'hxllx wxrld'

Update

A new method has been brought to my attention: str.translate.

>>> {c:'x' for c in 'aeiou'}
{'a': 'x', 'e': 'x', 'i': 'x', 'o': 'x', 'u': 'x'}
>>> 'hello world'.translate({ord(c):'x' for c in 'aeiou'})
'hxllx wxrld'

This method is also O(n), so just as efficient as the previous two.

Java ArrayList - Replace a specific letter or character within an ArrayList of Strings?

newList.get(i).replace(".", "");

This doesn't update the list element - you construct the replaced string, but then discard that string.

You could use set:

newList.set(i, newList.get(i).replace(".", ""));

Or you could use a ListIterator:

ListIterator<String> it = newList.listIterator();
while (it.hasNext()) {
String s = it.next();
if (s.contains(".")) {
it.set(s.replace(".", ""));
}
}

but a better way would be to use replaceAll, with no need to loop explicitly:

newList.replaceAll(s -> s.replace(".", ""));

There's no need to check contains either: replace won't do anything if the search string is not present.



Related Topics



Leave a reply



Submit