Merge Multiple Spaces to Single Space; Remove Trailing/Leading Spaces

This seems to meet your needs.

string <- "  Hi buddy   what's up   Bro "
library(stringr)
str_replace(gsub("\\s+", " ", str_trim(string)), "B", "b")
# [1] "Hi buddy what's up bro"

How do I replace multiple spaces with a single space in C#?


// Requires: using System.Text.RegularExpressions;
string sentence = "This is a sentence with multiple    spaces";
RegexOptions options = RegexOptions.None;
Regex regex = new Regex("[ ]{2,}", options);
sentence = regex.Replace(sentence, " ");

Is there a simple way to remove multiple spaces in a string?


>>> import re
>>> re.sub(' +', ' ', 'The     quick brown    fox')
'The quick brown fox'

Regex to replace multiple spaces with a single space

Given that you also want to cover tabs, newlines, etc., replace \s\s+ with ' ':

string = string.replace(/\s\s+/g, ' ');

If you really want to cover only spaces (and thus not tabs, newlines, etc.), use this instead:

string = string.replace(/  +/g, ' ');
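The difference between the two patterns is easy to check. Here is a minimal sketch in Python rather than JavaScript; the helper names collapse_whitespace and collapse_spaces are made up for illustration, and the patterns are the same two shown above:

```python
import re

def collapse_whitespace(text):
    # \s\s+ matches a run of two or more whitespace characters
    # (spaces, tabs, newlines, ...) and replaces it with one space.
    return re.sub(r"\s\s+", " ", text)

def collapse_spaces(text):
    # Two literal spaces followed by + match runs of two or more
    # spaces only; tabs and newlines are left untouched.
    return re.sub(r"  +", " ", text)

sample = "a  b\t\tc \n d"
print(repr(collapse_whitespace(sample)))  # 'a b c d'
print(repr(collapse_spaces(sample)))      # 'a b\t\tc \n d'
```

Note that \s\s+ needs at least two whitespace characters in a row, so a lone tab or newline between words stays as it is.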

Replace multiple spaces in string, but leave single spaces be

Whenever I come across string and regex problems I like to refer to the stringr cheat sheet: https://raw.githubusercontent.com/rstudio/cheatsheets/master/strings.pdf

On the second page you can see a section titled "Quantifiers", which tells us how to solve this:

library(tidyverse)

s <- "This is the first address    This is the second one"

str_replace(s, "\\s{2,}", "_")

(I am loading the complete tidyverse instead of just stringr here out of habit.)
The first run of 2 or more whitespace characters will now be replaced with _. Use str_replace_all() instead if the string can contain several such runs.
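For comparison, here is a rough Python sketch of the same quantifier, assuming a sample string with two runs of extra whitespace (the string and variable names are illustrative). It shows the difference between replacing only the first run and replacing every run:

```python
import re

s = "This is the first address    This is the second one  end"

# count=1 replaces only the first run of 2+ whitespace characters.
first_only = re.sub(r"\s{2,}", "_", s, count=1)

# Without count, every such run is replaced.
all_runs = re.sub(r"\s{2,}", "_", s)

print(first_only)
print(all_runs)
```

Single spaces between words do not match \s{2,} and are left alone in both cases.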

Substitute multiple whitespace with single whitespace in Python

A simple possibility (if you'd rather avoid REs) is

' '.join(mystring.split())

The split and join perform the task you're explicitly asking about. They also handle the extra one that you don't mention but that appears in your example: removing trailing spaces ;-).
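A small sketch of why this works, wrapped in a hypothetical helper called normalize_whitespace. One thing to be aware of: split() with no argument treats tabs and newlines as separators too, so they are collapsed along with the spaces.

```python
def normalize_whitespace(text):
    # str.split() with no argument splits on any run of whitespace
    # and discards leading/trailing whitespace, so joining the pieces
    # with one space collapses and trims in a single step.
    return " ".join(text.split())

print(repr(normalize_whitespace("  Hi  buddy \t what's\nup  ")))
# 'Hi buddy what\'s up'
```

If tabs or newlines must survive, a regex restricted to the space character is the safer choice.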

Removing multiple spaces and trailing spaces using gsub

Use a positive lookbehind to see if the current space is preceded by a space:

^ *|(?<= ) | *$

See it here in action: http://regex101.com/r/bJ1mU0
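To try the pattern outside regex101, here is a minimal sketch in Python (squeeze_spaces is an illustrative name, not part of the original answer). Every match is replaced with an empty string. As far as I know, using the same pattern with gsub in R requires perl = TRUE, because the default regex engine does not support lookbehind.

```python
import re

# Three alternatives: leading spaces, a space preceded by another
# space, and trailing spaces. Each match is replaced with nothing.
PATTERN = re.compile(r"^ *|(?<= ) | *$")

def squeeze_spaces(text):
    return PATTERN.sub("", text)

print(repr(squeeze_spaces("  Hi buddy   what's up   Bro ")))
# "Hi buddy what's up Bro"
```

The first space of each internal run is kept because it is not preceded by a space; only the extra ones match the lookbehind alternative.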

Remove leading/ending and internal multiple spaces but NOT tabs, newlines, or return characters, in Python

In that case str.strip() won't help you, even if you use " " as an argument: it only removes spaces at the start and end of the string, not the ones inside, and it would also remove the single space before "and".

Instead, use regex to remove 2 or more spaces from your strings:

l= ['\n                        \n                    ',
'\n ',
'Some text',
' and some more text\n',
' and on another a line some more text']

import re

result = "".join([re.sub("  +","",x) for x in l])

print(repr(result))

prints:

'\n\n\n Some text and some more text\n and on another a line some more text'

EDIT: if we apply the regex to each line, we cannot detect \n in some cases, as you noted. So, the alternate and more complex solution would be to join the strings before applying regex, and apply a more complex regex (note that I changed the test list of strings to add more corner cases):

l= ['\n                        \n                    ',
'\n ',
'Some text',
' and some more text \n',
'\n and on another a line some more text ']

import re

result = re.sub("(^ |(?<=\n) |  +| (?=\n)| $)","","".join(l))

print(repr(result))

prints:

'\n\n\nSome text and some more text\n\nand on another a line some more text'

There are 5 cases in the regex now that will be removed:

  • a single space at the start of the string
  • a space immediately after a newline
  • a run of 2 or more spaces
  • a space immediately before a newline
  • a single space at the end of the string
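Each case can be checked in isolation. The sketch below assumes that the third alternative is written with two literal spaces before the +, so that it only matches runs of two or more spaces as the list states; the test strings are made up for illustration.

```python
import re

# Assumption: the third alternative uses two literal spaces before +,
# so that it only matches runs of two or more spaces.
PATTERN = re.compile(r"(^ |(?<=\n) |  +| (?=\n)| $)")

cases = {
    "start by one space": " a",
    "space following a newline": "a\n b",
    "2 or more spaces": "a   b",
    "space followed by a newline": "a \nb",
    "end by one space": "a ",
}

# Every match is deleted, not replaced by a single space.
results = {name: PATTERN.sub("", text) for name, text in cases.items()}
for name, cleaned in results.items():
    print(f"{name}: {cleaned!r}")
```

Because matches are deleted outright, a run of several spaces between two words disappears completely ("a   b" becomes "ab"), so this pattern is only suitable when such runs occur next to newlines or at the ends of the string.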

Afterthought: this looks (and is) complicated. There is a non-regex solution after all that gives exactly the same result (if there aren't multiple spaces between words):

result = "\n".join([x.strip(" ") for x in "".join(l).split("\n")])
print(repr(result))

Just join the strings, split on newlines, apply strip with " " as the argument so that tabs are preserved, and join again with newlines.

Chain with re.sub("  +"," ",x.strip(" ")) to take care of possible double spaces between words:

result = "\n".join([re.sub("  +"," ",x.strip(" ")) for x in "".join(l).split("\n")])
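Put together as a reusable function, this could look like the sketch below. The name clean_spaces and the sample list are illustrative; the sample includes a tab and an internal run of spaces to show that tabs and newlines are kept while extra spaces are removed.

```python
import re

def clean_spaces(parts):
    # Join the fragments, then handle each line separately so that
    # newlines are preserved exactly as they are.
    lines = "".join(parts).split("\n")
    # strip(" ") trims spaces only (tabs survive); the regex then
    # collapses internal runs of two or more spaces to one space.
    return "\n".join(re.sub("  +", " ", line.strip(" ")) for line in lines)

sample = ["\n   ", "Some   text", " and\tmore \n", "\n last  line "]
print(repr(clean_spaces(sample)))
# '\nSome text and\tmore\n\nlast line'
```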

