Merge Multiple spaces to single space; remove trailing/leading spaces
This seems to meet your needs.
string <- "  Hi buddy   what's up   Bro "
library(stringr)
str_replace(gsub("\\s+", " ", str_trim(string)), "B", "b")
# [1] "Hi buddy what's up bro"
How do I replace multiple spaces with a single space in C#?
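using System.Text.RegularExpressions;

string sentence = "This is a sentence with     multiple    spaces";
RegexOptions options = RegexOptions.None;
// Match runs of two or more literal spaces and replace each run with one space.
Regex regex = new Regex("[ ]{2,}", options);
sentence = regex.Replace(sentence, " ");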
Is there a simple way to remove multiple spaces in a string?
>>> import re
>>> re.sub(' +', ' ', 'The     quick brown    fox')
'The quick brown fox'
Regex to replace multiple spaces with a single space
Given that you also want to cover tabs, newlines, etc., just replace \s\s+ with ' ':
string = string.replace(/\s\s+/g, ' ');
If you really want to cover only spaces (and thus not tabs, newlines, etc), do so:
string = string.replace(/ +/g, ' ');
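For comparison, the same distinction can be sketched in Python. This is only an illustration: the sample string and the variable names are assumptions, not part of the original answer.

```python
import re

# Illustrative input (an assumption): a run of spaces, a run of tabs,
# and a newline surrounded by single spaces.
text = "a" + " " * 2 + "b\t\tc \n d"

# \s\s+ collapses any run of two or more whitespace characters
# (spaces, tabs, newlines) into a single space.
any_whitespace = re.sub(r"\s\s+", " ", text)

# " {2,}" only touches runs of two or more literal spaces,
# so tabs and newlines are left alone.
spaces_only = re.sub(r" {2,}", " ", text)

print(repr(any_whitespace))
print(repr(spaces_only))
```

Note that a single tab or newline on its own is not matched by either pattern; use \s+ if every whitespace character should become a space.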
Replace multiple spaces in string, but leave singles spaces be
Whenever I come across string and regex problems I like to refer to the stringr cheat sheet: https://raw.githubusercontent.com/rstudio/cheatsheets/master/strings.pdf
On the second page you can see a section titled "Quantifiers", which tells us how to solve this:
library(tidyverse)
s <- "This is the first address    This is the second one"
# str_replace_all() replaces every match; str_replace() would stop after the first one
str_replace_all(s, "\\s{2,}", "_")
(I am loading the complete tidyverse instead of just stringr here due to force of habit).
Any run of 2 or more whitespace characters will now be replaced with _, while single spaces are left alone.
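The same quantifier idea carries over to other regex engines. Below is a minimal Python sketch, assuming an input string with one wide gap between the two addresses (the exact number of spaces is an assumption):

```python
import re

# Assumed sample: single spaces between words, one run of four spaces in the middle.
s = "This is the first address" + " " * 4 + "This is the second one"

# \s{2,} matches runs of two or more whitespace characters, so single
# spaces between words never match and are left untouched.
result = re.sub(r"\s{2,}", "_", s)

print(result)
```

re.sub replaces every match by default, so no separate "replace all" variant is needed here.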
Substitute multiple whitespace with single whitespace in Python
A simple possibility (if you'd rather avoid REs) is
' '.join(mystring.split())
The split and join perform the task you're explicitly asking about -- plus, they also do the extra one that you don't talk about but is seen in your example, removing trailing spaces;-).
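A short sketch of what the split/join idiom does and does not preserve. The sample string and the spaces-only variant are illustrative additions, not part of the original answer:

```python
# Assumed sample with leading/trailing spaces, a tab, and a newline.
mystring = "  The   quick\tbrown\n fox  "

# split() with no argument splits on any run of whitespace and drops
# empty fields at both ends, so tabs and newlines are normalized too.
collapsed = " ".join(mystring.split())

# If only literal spaces should be collapsed, split on " " explicitly
# and drop the empty fields that consecutive spaces produce.
spaces_only = " ".join(part for part in mystring.split(" ") if part)

print(repr(collapsed))
print(repr(spaces_only))
```

The first form is the one from the answer above; the second is a possible variant for cases where tabs and newlines must survive.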
Removing multiple spaces and trailing spaces using gsub
Use a positive lookbehind to see if the current space is preceded by a space:
^ *|(?<= ) | *$
See it here in action: http://regex101.com/r/bJ1mU0
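To see how the three alternatives cooperate, here is the same pattern applied with Python's re module (a sketch; the question itself is about R's gsub, and the sample string is an assumption):

```python
import re

# ^ *      leading spaces
# (?<= )   a space that is preceded by another space (positive lookbehind)
#  *$      trailing spaces
pattern = re.compile(r"^ *|(?<= ) | *$")

# Assumed sample: three leading spaces, three between words, two trailing.
text = " " * 3 + "Hi" + " " * 3 + "buddy" + " " * 2

# Replacing every match with the empty string leaves exactly one space
# between words, because the first space of each run is never matched.
cleaned = pattern.sub("", text)

print(repr(cleaned))
```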
Remove leading/ending and internal multiple spaces but NOT tabs, newlines, or return characters, in Python
In that case str.strip() won't help you (even if you use " " as an argument), because it only removes spaces at the start and end of each string, not the ones inside, and it would also remove the single space before "and".
Instead, use regex to remove 2 or more spaces from your strings:
l= ['\n    \n    ',
'\n    ',
'Some text',
' and some more text\n',
' and on another a line some more text']
import re
# " {2,}" matches runs of two or more spaces; single spaces between words are kept
result = "".join([re.sub(" {2,}","",x) for x in l])
print(repr(result))
prints:
'\n\n\nSome text and some more text\n and on another a line some more text'
EDIT: if we apply the regex to each line, we cannot detect \n in some cases, as you noted. So, the alternate and more complex solution would be to join the strings before applying the regex, and apply a more complex regex (note that I changed the test list of strings to add more corner cases):
l= ['\n \n ',
'\n ',
'Some text',
' and some more text \n',
'\n and on another a line some more text ']
import re
result = re.sub("(^ |(?<=\n) | {2,}| (?=\n)| $)","","".join(l))
print(repr(result))
prints:
'\n\n\nSome text and some more text\n\nand on another a line some more text'
There are 5 cases in the regex now that will be removed:
- start by one space
- space following a newline
- 2 or more spaces
- space followed by a newline
- end by one space
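Each of the five cases can be checked in isolation. The sketch below writes the "2 or more spaces" case with an explicit {2,} quantifier; the sample strings and the dictionary keys are made up for illustration:

```python
import re

# One alternative per case, in the order listed above.
PATTERN = r"(^ |(?<=\n) | {2,}| (?=\n)| $)"

# Hypothetical one-case-at-a-time inputs.
samples = {
    "start by one space": " text",
    "space following a newline": "a\n b",
    "2 or more spaces": "a" + " " * 3 + "b",
    "space followed by a newline": "a \nb",
    "end by one space": "text ",
}

# Every match is deleted, so a run of 2+ spaces disappears entirely
# rather than being reduced to one space.
results = {name: re.sub(PATTERN, "", s) for name, s in samples.items()}

for name, out in results.items():
    print(f"{name}: {out!r}")
```

Because runs of two or more spaces are removed outright, this pattern suits indentation-style padding; it would glue words together if they were separated by double spaces.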
Afterthought: this looks (and is) complicated. There is a non-regex solution after all which gives exactly the same result (if there aren't multiple spaces between words):
result = "\n".join([x.strip(" ") for x in "".join(l).split("\n")])
print(repr(result))
Just join the strings, split on newlines, apply strip with " " as the argument to preserve tabs, and join again with newlines.
Chain with re.sub(" +"," ",x.strip(" ")) to take care of possible double spaces between words:
result = "\n".join([re.sub(" +"," ",x.strip(" ")) for x in "".join(l).split("\n")])
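The final one-liner can be wrapped in a small helper for reuse. This is a sketch only: the function name normalize_spaces and the sample text are hypothetical.

```python
import re


def normalize_spaces(text):
    """Strip spaces at the edges of each line and collapse inner runs of
    spaces to one, leaving tabs and newlines untouched."""
    return "\n".join(
        re.sub(" +", " ", line.strip(" ")) for line in text.split("\n")
    )


# Assumed sample mixing padded lines, a tab, and blank lines.
sample = "\n  \n Some\ttext   and more \n\n  last line "
cleaned = normalize_spaces(sample)

print(repr(cleaned))
```

Splitting on "\n" rather than calling splitlines() keeps empty lines and a trailing newline intact, which matches the behaviour of the one-liner above.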