Remove Excess Whitespace from Within a String

Remove excess whitespace from within a string

Not sure exactly what you want but here are two situations:

  1. If you are just dealing with excess whitespace on the beginning or end of the string you can use trim(), ltrim() or rtrim() to remove it.

  2. If you are dealing with extra spaces within a string consider a preg_replace of multiple whitespaces " "* with a single whitespace " ".

Example:

$foo = preg_replace('/\s+/', ' ', $foo);

How to remove the extra spaces in a string?

You're close.

Remember that replace replaces the found text with the second argument. So:

newString = string.replace(/\s+/g,''); // "thiscontainsspaces"

Finds any number of sequential spaces and removes them. Try replacing them with a single space instead!

newString = string.replace(/\s+/g,' ').trim();

Remove all whitespace in a string

If you want to remove leading and ending spaces, use str.strip():

>>> "  hello  apple  ".strip()
'hello apple'

If you want to remove all space characters, use str.replace() (NB this only removes the “normal” ASCII space character ' ' U+0020 but not any other whitespace):

>>> "  hello  apple  ".replace(" ", "")
'helloapple'

If you want to remove duplicated spaces, use str.split() followed by str.join():

>>> " ".join("  hello  apple  ".split())
'hello apple'

How to remove all whitespace from a string?

In general, we want a solution that is vectorised, so here's a better test example:

whitespace <- " \t\n\r\v\f" # space, tab, newline, 
# carriage return, vertical tab, form feed
x <- c(
" x y ", # spaces before, after and in between
" \u2190 \u2192 ", # contains unicode chars
paste0( # varied whitespace
whitespace,
"x",
whitespace,
"y",
whitespace,
collapse = ""
),
NA # missing
)
## [1] " x y "
## [2] " ← → "
## [3] " \t\n\r\v\fx \t\n\r\v\fy \t\n\r\v\f"
## [4] NA

The base R approach: gsub

gsub replaces all instances of a string (fixed = TRUE) or regular expression (fixed = FALSE, the default) with another string. To remove all spaces, use:

gsub(" ", "", x, fixed = TRUE)
## [1] "xy" "←→"
## [3] "\t\n\r\v\fx\t\n\r\v\fy\t\n\r\v\f" NA

As DWin noted, in this case fixed = TRUE isn't necessary but provides slightly better performance since matching a fixed string is faster than matching a regular expression.

If you want to remove all types of whitespace, use:

gsub("[[:space:]]", "", x) # note the double square brackets
## [1] "xy" "←→" "xy" NA

gsub("\\s", "", x) # same; note the double backslash

library(regex)
gsub(space(), "", x) # same

"[:space:]" is an R-specific regular expression group matching all space characters. \s is a language-independent regular-expression that does the same thing.


The stringr approach: str_replace_all and str_trim

stringr provides more human-readable wrappers around the base R functions (though as of Dec 2014, the development version has a branch built on top of stringi, mentioned below). The equivalents of the above commands, using [str_replace_all][3], are:

library(stringr)
str_replace_all(x, fixed(" "), "")
str_replace_all(x, space(), "")

stringr also has a str_trim function which removes only leading and trailing whitespace.

str_trim(x) 
## [1] "x y" "← →" "x \t\n\r\v\fy" NA
str_trim(x, "left")
## [1] "x y " "← → "
## [3] "x \t\n\r\v\fy \t\n\r\v\f" NA
str_trim(x, "right")
## [1] " x y" " ← →"
## [3] " \t\n\r\v\fx \t\n\r\v\fy" NA

The stringi approach: stri_replace_all_charclass and stri_trim

stringi is built upon the platform-independent ICU library, and has an extensive set of string manipulation functions. The equivalents of the above are:

library(stringi)
stri_replace_all_fixed(x, " ", "")
stri_replace_all_charclass(x, "\\p{WHITE_SPACE}", "")

Here "\\p{WHITE_SPACE}" is an alternate syntax for the set of Unicode code points considered to be whitespace, equivalent to "[[:space:]]", "\\s" and space(). For more complex regular expression replacements, there is also stri_replace_all_regex.

stringi also has trim functions.

stri_trim(x)
stri_trim_both(x) # same
stri_trim(x, "left")
stri_trim_left(x) # same
stri_trim(x, "right")
stri_trim_right(x) # same

Is there a simple way to remove multiple spaces in a string?

>>> import re
>>> re.sub(' +', ' ', 'The quick brown fox')
'The quick brown fox'

How to remove extra white space between words inside a character vector using?

gsub is your friend:

test <- "Hi,  this is a   good  time to   start working   together."
gsub("\\s+"," ",test)
#[1] "Hi, this is a good time to start working together."

\\s+ will match any space character (space, tab etc), or repeats of space characters, and will replace it with a single space " ".

Remove whitespaces inside a string in javascript

For space-character removal use

"hello world".replace(/\s/g, "");

for all white space use the suggestion by Rocket in the comments below!



Related Topics



Leave a reply



Submit