Python: String Iteration Replace a Space With a Hyphen (Or Other Character)

How do I replace hyphen (-) with space in a string except for the matched pattern in a string?

You have to describe the two possibilities for your hyphen using negative lookarounds:

  • not preceded by four digits: (?<!\b[0-9]{4})
  • not followed by two or four digits: (?![0-9]{2}(?:[0-9]{2})?\b)

( "not preceded by A or not followed by B" is the negation of "preceded by A and followed by B" )

example:

import re

text = 'Studied b-tech from college in 2010-13'

result = re.sub(r'-(?:(?<!\b[0-9]{4}-)|(?![0-9]{2}(?:[0-9]{2})?\b))', ' ', text)

demo

( writing - (?: (?<! ... - ) | (?! ... ) ) is more efficient than (?<! ... )-|-(?! ... ), that's why you retrieve the hyphen in the lookbehind )

Python: Replace - with whitespace

You have the following errors:

Method 1: You are not assigning the return value of replace to any variable

Method 2: Strip only strips the characters from the beginning and the end of the string

Method 3 and 4: You are joining every character using either an empty string ('') or space (' '), not every word.

You can try this method:

text1 = [x.replace('-', ' ') for x in text1]

or this:

text1 = [' '.join(x.split('-')) for x in text1]

How to replace whitespaces with underscore?

You don't need regular expressions. Python has a built-in string method that does what you need:

mystring.replace(" ", "_")

Remove words with spaces or - in them Python

The problem with your regex is grouping. Using (-)?|( )? as a separator does not do what you think it does.

Consider what happens when the list of words is a,b:

>>> regex = "(-)?|( )?".join(["a", "b"])
>>> regex
'a(-)?|( )?b'

You'd like this regex to match ab or a b or a-b, but clearly it does not do that. It matches a, a-, b or <space>b instead!

>>> re.match(regex, 'a')
<_sre.SRE_Match object at 0x7f68c9f3b690>
>>> re.match(regex, 'a-')
<_sre.SRE_Match object at 0x7f68c9f3b718>
>>> re.match(regex, 'b')
<_sre.SRE_Match object at 0x7f68c9f3b690>
>>> re.match(regex, ' b')
<_sre.SRE_Match object at 0x7f68c9f3b718>

To fix this you can enclose the separator in its own group: ([- ])?.

If you also want to match words like wonder - land (i.e. where there are spaces before/after the hyphen) you should use the following (\s*-?\s*)?.

How can I replace a nested list of phrases with dashes?

You have to loop over the characters in the string in a nested loop, and replace the non-space characters with _.

result = []
for word1, word2 in LIST:
hide = ""
for c in word2:
if c == " ":
hide += " "
else:
hide += "_"
result.append([word1, hide])


Related Topics



Leave a reply



Submit