How can I remove non-numeric characters from strings using gsub in R?
Simply use
gsub("[^0-9.-]", "", x)
If a string can contain more than one - or ., add a second regex to deal with that.
If you struggle with it, open a new question.
(Change the . in the pattern to , if your data uses a decimal comma.)
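One possible second pass, as a minimal sketch: the helper name clean_number is hypothetical (it is not part of the answer above), and it assumes that only a leading - and the first . are meaningful.

```r
# Hypothetical helper (not from the original answer): strip non-numeric
# characters, then keep only a leading "-" and the first ".".
clean_number <- function(x) {
  x <- gsub("[^0-9.-]", "", x)          # first pass, as in the answer
  neg <- startsWith(x, "-")             # does a sign lead the cleaned string?
  x <- gsub("-", "", x, fixed = TRUE)   # drop every "-"
  # Keep everything up to and including the first ".", drop any later "."
  x <- gsub("^([^.]*\\.)|\\.", "\\1", x, perl = TRUE)
  paste0(ifelse(neg, "-", ""), x)       # restore the leading sign, if any
}

clean_number(c("-12.5", "1-2-3", "1.2.3", "$-3,004.50-"))
## => "-12.5" "123" "1.23" "-3004.50"
```

The sign is checked after the first pass, so "$-3" counts as negative. Whether that is right depends on your data.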
Regex to remove all non-digit symbols from string in R
Remove all non-digit symbols:
# Named x rather than list, to avoid masking base::list
x <- c("1010.1-1", "1010.2-1", "1010.3-1", "1030-1", "1040-1", "1060.1-1", "1060.2-1", "1070-1", "1100.1-1", "1100.2-1")
as.numeric(gsub("\\D+", "", x))
## => [1] 101011 101021 101031 10301 10401 106011 106021 10701 110011 110021
See the R demo online
Remove non-numeric characters within parentheses
How about substituting
(?:\(([^)\d]+)\)(.*?))?\([^\d)]*(\d{5,6})[^\d)]*\)
with
$1$2($3)
(?:\(([^)\d]+)\)(.*?))? - the first, optional part captures any preceding parenthesized text to $1. Anything that follows before the parenthesized 5-6 digit part is captured to $2.
\([^\d)]*(\d{5,6})[^\d)]*\) - the second part captures the 5-6 digits to $3.
See the demo at regex101
In R, using gsub:
gsub(pattern='(?:\\(([^)\\d]+)\\)(.*?))?\\([^\\d)(]*(\\d{5,6})[^\\d)(]*\\)',
replacement='\\1\\2(\\3)',
x=text,
perl=TRUE, fixed = FALSE)
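To see the call in action, here is a small sketch. The input vector text is made up for illustration and is not from the original question.

```r
# Hypothetical sample input: one string with a preceding parenthesized part,
# one with only the digit group, one with too many digits to match.
text <- c("(abc) foo (x12345y)", "(id: 123456)", "(1234567)")

res <- gsub(pattern = '(?:\\(([^)\\d]+)\\)(.*?))?\\([^\\d)(]*(\\d{5,6})[^\\d)(]*\\)',
            replacement = '\\1\\2(\\3)',
            x = text,
            perl = TRUE)
res
## => "abc foo (12345)" "(123456)" "(1234567)"
```

Note that the parentheses around the first group are not part of \\1, so they are dropped. The third string is left unchanged because seven digits do not satisfy \\d{5,6} followed by a closing parenthesis.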
Regex to remove all (non numeric OR period)
This should do it:
string s = "joe ($3,004.50)";
s = Regex.Replace(s, "[^0-9.]", "");
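The snippet above is C#. In R the same character class can be passed to gsub; a minimal sketch, using the same sample string:

```r
s <- "joe ($3,004.50)"
s <- gsub("[^0-9.]", "", s)   # keep only digits and periods
s
## => "3004.50"
as.numeric(s)                 # the result is still a string until converted
## => 3004.5
```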
Remove non numeric values from vector in r
A simple solution is to use Filter. Given
vec <- list(1, 2, T, 'x', 'abc', '6', 7, F, F, 10)
keep only the numeric elements:
> unlist(Filter(is.numeric,vec))
[1] 1 2 7 10
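Filter(is.numeric, ...) works here because vec is a list, so each element keeps its own type. If the data is an atomic character vector instead (c() coerces everything to one type), a different approach is needed. A sketch under that assumption, keeping the values that parse as numbers:

```r
# Assumed input: a plain character vector rather than a list
vec_chr <- c("1", "2", "TRUE", "x", "abc", "6", "7.5")

# as.numeric() returns NA (with a warning) for anything it cannot parse
nums <- suppressWarnings(as.numeric(vec_chr))
nums[!is.na(nums)]
## => 1.0 2.0 6.0 7.5
```

Unlike the Filter version, this keeps "6", because the test is whether the text parses as a number, not what type the element has.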
How to replace all non numeric character from a string except any NewLine (\n) using regex?
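Matching every character except newlines and digits is pretty straightforward:
Regex.Replace(text, "[^\r\n0-9]", "")
A newline on Windows is CR (\r) followed by LF (\n). 0-9 can also be written as \d, although in .NET \d also matches non-ASCII Unicode digits unless RegexOptions.ECMAScript is used.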
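For comparison, the same character class works with gsub in R. A minimal sketch with a made-up input string; the \r and \n here are R string escapes, so the class contains the actual CR and LF characters:

```r
text <- "a1b\r\nc2d"                        # hypothetical input with a Windows newline
cleaned <- gsub("[^\r\n0-9]", "", text)     # drop everything except CR, LF and digits
identical(cleaned, "1\r\n2")
## => TRUE
```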
remove non-digits except E+ and E- in string
You may extract the numbers using the following regex:
[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?
Details
[-+]? - an optional + or -
[0-9]* - 0+ digits
\.? - an optional .
[0-9]+ - 1+ digits
([eE][-+]?[0-9]+)? - an optional capturing group (add ?: after ( to use a non-capturing group) matching 1 or 0 occurrences of:
  [eE] - e or E
  [-+]? - an optional - or +
  [0-9]+ - 1 or more digits
R demo:
vec <- c('1234', '+ 42', '1E+4', 'NR 12', '4.5E+04', '8.6E-02')
res <- regmatches(vec, regexpr("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?", vec))
unlist(res)
## => [1] "1234" "42" "1E+4" "12" "4.5E+04" "8.6E-02"
If multiple matches per item in a character vector are expected, replace regexpr with gregexpr.
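A short sketch of the gregexpr variant, using made-up input in which one item holds two numbers and one holds none:

```r
vec2 <- c("1E+4 and 2.5", "NR 12", "none")   # hypothetical input
pattern <- "[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"

# gregexpr finds every match, so regmatches returns a list with one
# character vector per input item (empty when nothing matched)
res2 <- regmatches(vec2, gregexpr(pattern, vec2))
res2[[1]]
## => "1E+4" "2.5"
lengths(res2)
## => 2 1 0
```

Because the result is a list, unlist(res2) loses track of which item each number came from. Keep the list if that mapping matters.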
Split string column on non-numeric characters in R
You want to do something like this?
library(dplyr)
library(tidyr)
df %>%
separate(lat,into = paste0("lat",1:4),sep = "[^0-9]",remove = FALSE) %>%
separate(long,into = paste0("long",1:4),sep = "[^0-9]",remove = FALSE)
# A tibble: 4 x 10
  lat               lat1  lat2  lat3  lat4  long               long1 long2 long3 long4
  <chr>             <chr> <chr> <chr> <chr> <chr>              <chr> <chr> <chr> <chr>
1 "22ª29'56.06\""   22    29    56    06    "105º21'37.27\""   105   21    37    27
2 "22°29`53.14\""   22    29    53    14    "105°21'29.48\""   105   21    29    48
3 "22º30'00.43\""   22    30    00    43    "105°21'37.46''"   105   21    37    46
4 "105'29'27.17\""  105   29    27    17    "105°21'39.68"     105   21    39    68
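If you would rather avoid the package dependency, a base R sketch of the same idea uses strsplit. The two sample strings are taken from the lat column above, and the separator is "[^0-9]+" (with +) so that a run of non-digits counts as a single split point:

```r
lat <- c("22\u00b029'56.06\"", "105'29'27.17\"")   # \u00b0 is the degree sign

parts <- strsplit(lat, "[^0-9]+")   # split on runs of non-digit characters
parts[[1]]
## => "22" "29" "56" "06"

# Bind the pieces into a character matrix, one row per input string
lat_mat <- do.call(rbind, parts)
dim(lat_mat)
## => 2 4
```

strsplit drops the empty piece after a trailing separator, so the closing quote mark does not produce an extra column.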