Can I use gsub() on each element of a data frame?
I think you could do it the following way, though I don't know whether it is better or cleaner than yours:
df <- data.frame(tbl)
df[,-1] <- as.numeric(gsub("%", "", as.matrix(df[,-1])))
Which gives:
R> head(df)
Date Internet.Explorer Chrome Firefox Safari Opera Mobile
1 January 2013 30.71 36.52 21.42 8.29 1.19 14.13
2 December 2012 30.78 36.42 21.89 7.92 1.26 14.55
3 November 2012 31.23 35.72 22.37 7.83 1.39 13.08
4 October 2012 32.08 34.77 22.32 7.81 1.63 12.30
5 September 2012 32.70 34.21 22.40 7.70 1.61 12.03
6 August 2012 32.85 33.59 22.85 7.39 1.63 11.78
R> sapply(df, class)
Date Internet.Explorer Chrome Firefox
"factor" "numeric" "numeric" "numeric"
Safari Opera Mobile
"numeric" "numeric" "numeric"
gsub() on all values in a dataframe with multiple replacements
lapply returns a list; you can assign it back to the data frame with [] to keep the dimensions.
Land_Use[] <- lapply(Land_Use, function(y) gsub("native forest", "forest", y))
Here gsub will be applied to every column in the data frame.
For a single column, assign the output back to that column instead of to the whole data frame.
Land_Use$`1972` <- gsub('native forest','forest.',Land_Use$`1972`)
If you want to collapse multiple values into one value, you may want to look at the fct_collapse function from forcats.
library(dplyr)
library(forcats)
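# everything() makes the column selection explicit; calling across() without .cols is deprecated in recent dplyr
Land_Use <- Land_Use %>%
  mutate(across(everything(), ~ fct_collapse(.x, 'Forest' = c('native forest', 'exotic forest'),
                                             'water' = c('lake', 'river', 'ocean', 'pond'))))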
Land_Use
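If you would rather stay in base R, several replacements can be chained with a small loop over a named vector. This is only a sketch: the data frame `land`, the helper `replace_all`, and the pattern map are made up for illustration and are not part of the original question.

```r
# Hypothetical data, loosely modelled on the Land_Use example.
land <- data.frame(
  `1972` = c("native forest", "lake", "exotic forest"),
  `2002` = c("river", "native forest", "pond"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Named vector: names are regex patterns, values are their replacements.
replacements <- c("native forest|exotic forest" = "Forest",
                  "lake|river|ocean|pond"       = "water")

# Hypothetical helper: apply every pattern in turn to one character vector.
replace_all <- function(x, map) {
  for (pattern in names(map)) {
    x <- gsub(pattern, map[[pattern]], x)
  }
  x
}

# land[] keeps the data frame structure while every column is replaced.
land[] <- lapply(land, replace_all, map = replacements)
land
```

Unlike fct_collapse, this keeps the columns as character vectors, and the patterns are applied in order, so a later pattern sees the output of an earlier one.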
apply gsub over a certain column in a list of data frames
Solution with tidyverse
library(purrr)
library(dplyr)
library(stringr)
map(results1, ~ .x %>%
      mutate(names = str_replace_all(names, "\\.\\.", "")))
[[1]]
names coefficients
1 a15.pdf 1.27679608
2 a17.pdf 1.05090176
3 a18.pdf 1.51820192
4 a21.pdf 2.30296037
5 a2TTT.pdf 1.48568732
6 a5.pdf 0.49371310
7 B11.pdf 1.02705905
8 B12.pdf 0.99974736
9 B13.pdf 2.40828102
10 B22.pdf 0.69515213
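The same idea also works without the tidyverse. The sketch below uses a made-up list `results_demo` (the original `results1` is not shown, so the doubled dots are an assumption about its contents) and removes the literal ".." from the names column of each data frame.

```r
# Hypothetical input: a list of data frames whose `names` contain a literal "..".
results_demo <- list(
  data.frame(names = c("..a15.pdf", "..a17.pdf"), coefficients = c(1.2, 1.0),
             stringsAsFactors = FALSE),
  data.frame(names = "..B11.pdf", coefficients = 0.9,
             stringsAsFactors = FALSE)
)

# Modify one column per data frame and return the whole data frame each time.
cleaned <- lapply(results_demo, function(d) {
  d$names <- gsub("..", "", d$names, fixed = TRUE)  # fixed = TRUE: no regex escaping needed
  d
})

cleaned
```

Returning `d` at the end of the anonymous function matters: without it, lapply would return only the modified column.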
Using gsub in list of dataframes with R
Maybe you can try something like the following:
lapply(rapply(lt, function(x)
gsub("^-$", "", x), how = "list"),
as.data.frame)
# [[1]]
# name1 name2
# 1 nd:f 21-12-2001
# 2 nd:i name
# 3 nd:c
# 4 nd:g 15
# 5 b:rd
#
# [[2]]
# name1 name2
# 1 nd:i 11-01-2001
# 2 nd:c name
# 3 nd:g 3
# 4 nd:y
# 5 a:nd
It seems that although rapply can keep the data as a list, the data.frame attribute is lost (hence the extra lapply(..., as.data.frame)).
By using "^-$" as our pattern in gsub, we're saying to match only values that consist of exactly one hyphen. Dates won't be affected.
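To see what the anchors do, here is a small sketch on a made-up vector `x` (not the original data) comparing the anchored and the unanchored pattern:

```r
# Hypothetical values mixing lone hyphens, a date, and ordinary text.
x <- c("-", "21-12-2001", "name", "-", "15")

# Anchored: only elements that are exactly "-" are blanked.
anchored <- gsub("^-$", "", x)

# Unanchored: every hyphen is removed, including those inside the date.
unanchored <- gsub("-", "", x)

anchored
unanchored
```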
Perhaps a better option, though, is to convert those "-"s into NA. For this, you can try my makemeNA function from my "SOfun" package.
To use this approach you would simply do:
library(SOfun)
lapply(lt, makemeNA, "-")
# [[1]]
# name1 name2
# 1 nd:f 21-12-2001
# 2 nd:i name
# 3 nd:c <NA>
# 4 nd:g 15
# 5 b:rd <NA>
#
# [[2]]
# name1 name2
# 1 nd:i 11-01-2001
# 2 nd:c name
# 3 nd:g 3
# 4 nd:y <NA>
# 5 a:nd <NA>
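If installing SOfun is not an option, a similar result can be approximated in base R with logical indexing. This is a sketch on a made-up list `lt_demo`, not a reimplementation of makemeNA, and it assumes all columns are character and contain no NA values beforehand.

```r
# Hypothetical list of data frames using "-" as a missing-value marker.
lt_demo <- list(
  data.frame(name1 = c("nd:f", "nd:c"), name2 = c("21-12-2001", "-"),
             stringsAsFactors = FALSE),
  data.frame(name1 = c("nd:i", "nd:y"), name2 = c("11-01-2001", "-"),
             stringsAsFactors = FALSE)
)

# d == "-" yields a logical matrix; only cells equal to exactly "-" become NA.
na_list <- lapply(lt_demo, function(d) {
  d[d == "-"] <- NA
  d
})

na_list
```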
Applying gsub to various columns
You can use apply to run it over every column of the data.frame (note that the result is a matrix, not a data.frame):
apply(x, 2, function(y) as.numeric(gsub("%", "", y)))
x1 x2 x3
[1,] 10 60 1
[2,] 20 50 2
[3,] 30 40 3
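If the result should stay a data frame, assigning an lapply result back with [] is one possible alternative. The sketch below uses a made-up data frame `x_demo` to contrast the two approaches.

```r
# Hypothetical data: every column holds percentage strings.
x_demo <- data.frame(x1 = c("10%", "20%", "30%"),
                     x2 = c("60%", "50%", "40%"),
                     stringsAsFactors = FALSE)

# apply() coerces the data frame to a matrix and returns a numeric matrix.
m <- apply(x_demo, 2, function(y) as.numeric(gsub("%", "", y)))

# lapply() with [] replaces the columns in place and keeps the data frame.
x_demo[] <- lapply(x_demo, function(y) as.numeric(gsub("%", "", y)))

is.matrix(m)
is.data.frame(x_demo)
```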
Removing some text string and characters from a column in dataframe in R
We can match the . (written as \\. because an unescaped . is a metacharacter that matches any character) followed by one or more digits (\\d+) up to the end ($) of the string and replace the match with a blank (""). The result is then wrapped in gsub to match the backquote ("`") and remove it:
df$Regression <- gsub("`", "", sub("\\.\\d+$", '', df$Regression))
df$Regression
[1] "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A"
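The two steps can be run separately to see what each one removes. The input vector `regression` below is a guess at what the original column looked like (the question's data is not shown), so treat it as an assumption.

```r
# Hypothetical input: backquoted formula text with a trailing ".<digits>" suffix.
regression <- c("`TLC~7_A`.1", "`TLC~7_A`.12")

# Step 1: drop the final dot and the digits that follow it.
step1 <- sub("\\.\\d+$", "", regression)

# Step 2: remove every backquote.
step2 <- gsub("`", "", step1)

step1
step2
```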
Using gsub or sub function to only get part of a string?
The following may help here too.
sub("([^:]*):([^:]*).*","\\1:\\2",df$dat)
Output will be as follows.
> sub("([^:]*):([^:]*).*","\\1:\\2",df$dat)
[1] "WBU-ARGU*06:03" "WBU-ARDU*08:01" "WBU-ARFU*11:03" "WBU-ARFU*03:456b"
The input data frame is built as follows.
dat <- c("WBU-ARGU*06:03:04","WBU-ARDU*08:01:01","WBU-ARFU*11:03:05","WBU-ARFU*03:456b")
df <- data.frame(dat)
Explanation: the annotated breakdown below is for explanation only and is not runnable as written.
sub(" ##sub() replaces only the first match in each element (gsub() is the global version).
([^:]*) ##The parentheses capture the match as group 1 so it can be reused later; [^:]* matches everything up to the next colon.
: ##A literal colon (:).
([^:]*) ##Capture group 2: again everything up to the next colon, or up to the end of the string if there is no further colon.
.*" ##.* matches whatever remains after the second group.
,"\\1:\\2" ##The replacement: \\1 is the text captured by group 1 and \\2 the text captured by group 2, joined by a colon.
,df$dat) ##The input vector: the dat column of the data frame df.
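To check what each capture group holds, the same pattern can be reused with only one backreference in the replacement. This is a small sketch on two of the values above; the variable names `first` and `second` are made up.

```r
dat <- c("WBU-ARGU*06:03:04", "WBU-ARFU*03:456b")
pattern <- "([^:]*):([^:]*).*"

# Keep only the text captured by group 1 (everything before the first colon).
first <- sub(pattern, "\\1", dat)

# Keep only the text captured by group 2 (between the first and second colon,
# or up to the end of the string when there is no second colon).
second <- sub(pattern, "\\2", dat)

first
second
```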
How to replace '+' using gsub() function in R
Simply do the replacement with fixed = TRUE (no need to use a regular expression), but you have to do the replacement for each "column" of the data.frame by specifying the column name:
txtdf <- data.frame(job = c("government", "poli+tician", "parliament"))
txtdf
gives
job
1 government
2 poli+tician
3 parliament
Now replace the "+":
txtdf$job <- gsub("+", "", txtdf$job, fixed = TRUE)
txtdf
The result is:
job
1 government
2 politician
3 parliament
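For comparison, the regular-expression route also works as long as the + is escaped or placed in a character class. A short sketch on the same values (the variable names are made up):

```r
job <- c("government", "poli+tician", "parliament")

# Literal matching: "+" is treated as a plain character.
with_fixed <- gsub("+", "", job, fixed = TRUE)

# Regex matching: "+" is a quantifier, so it must be escaped ...
with_escape <- gsub("\\+", "", job)

# ... or wrapped in a character class, where it loses its special meaning.
with_class <- gsub("[+]", "", job)

with_fixed
```

All three calls return the same vector; fixed = TRUE is simply the least error-prone when no pattern matching is needed.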
using gsub with a column on a dataframe
You will need to escape the . with either \\. or [.]. See ?regex. So the call becomes:
sub("\\..*", "", dat$Dx1)
For example,
x <- c("F20.0", "F13.2", "F31.3", "F33.1")
sub("\\..*", "", x)
# [1] "F20" "F13" "F31" "F33"
We can use sub() instead of gsub() since we are always matching the first (and only) occurrence of the . character.
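A short sketch of why the escape matters and how sub() and gsub() differ once a value contains more than one dot. The value `y` is made up for illustration.

```r
x <- c("F20.0", "F13.2", "F31.3", "F33.1")

# Unescaped: "." matches any character, so "..*" swallows the whole string.
unescaped <- sub("..*", "", x)

# Character-class form, equivalent to "\\..*".
bracket <- sub("[.].*", "", x)

# Hypothetical value with two dots, to contrast sub() and gsub().
y <- "A1.2.3"
first_only <- sub("\\.", "", y)   # removes the first dot only
all_dots   <- gsub("\\.", "", y)  # removes every dot

unescaped
bracket
first_only
all_dots
```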