How do I map a vector of values to another vector with my own custom map in R
A couple of options, all using:
myVector<-c(1,2,3,2,3,3,1)
Factor
newvals <- c(.2,.4,.5)
newvals[as.factor(myVector)]
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
Named vector
newvals <- c(`1`=.2,`2`=.4,`3`=.5)
newvals
#  1   2   3
#0.2 0.4 0.5
newvals[as.character(myVector)]
#  1   2   3   2   3   3   1
#0.2 0.4 0.5 0.4 0.5 0.5 0.2
Lookup table
mapdf <- data.frame(old=c(1,2,3),new=c(.2,.4,.5))
mapdf$new[match(myVector,mapdf$old)]
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
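The lookup-table variant extends naturally to keys that are not 1, 2, 3 and to values that have no mapping at all. A minimal sketch, assuming a hypothetical helper named remap (it is not part of base R or of the answer above):

```r
# Hypothetical helper: map x through old -> new,
# using `default` wherever x has no entry in old
remap <- function(x, old, new, default = NA) {
  idx <- match(x, old)          # position of each x in old, NA if absent
  out <- new[idx]
  out[is.na(idx)] <- default    # fill unmatched positions
  out
}

myVector <- c(1, 2, 3, 2, 3, 3, 1)
res_num <- remap(myVector, old = c(1, 2, 3), new = c(.2, .4, .5))

# Keys need not be consecutive integers; 99 has no mapping and gets the default
res_chr <- remap(c(10, 30, 99), old = c(10, 20, 30),
                 new = c("a", "b", "c"), default = "other")
```

Indexing directly with newvals[myVector] only works because the keys happen to be the positions 1, 2, 3; match makes the key-to-value pairing explicit.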
Benchmarks to quantify @Joe's comment below and to address @Ananda's comment as well.
myVector <- c(1,2,3,2,3,3,1)
# setup for the benchmarking
library(microbenchmark)
test <- sample(myVector,1e6,replace=TRUE)
newvals <- c(.2,.4,.5)
newvalsvec <- c(`1`=.2,`2`=.4,`3`=.5)
mapdf <- data.frame(old=c(1,2,3),new=c(.2,.4,.5))
microbenchmark(
  factor      = newvals[as.factor(test)],
  namedvector = newvalsvec[as.character(test)],
  lookup      = mapdf$new[match(test,mapdf$old)],
  newvals[test],
  times=10L
)
#Unit: milliseconds
#          expr        min         lq     median         uq        max
#        factor 1863.40146 1876.04197 1890.99147 1913.13046 2014.23609
#   namedvector 1809.26883 1812.76272 1837.18852 1851.42954 1858.44996
#        lookup   38.48697   38.83405   39.90146   69.65140   71.75051
# newvals[test]   34.07380   34.55885   50.61287   65.69495   66.08699
Pass a vector of arguments to map function
Skip the group_by() step and just use nest(); otherwise your data will remain grouped after nesting and will need to be ungrouped. To get your function to work, just pass the parameters as a list.
library(tidyverse)
mtcars %>%
  nest(data = -cyl) %>%
  mutate(
    newold = map2_df(data, list(c(5, 10)), myf)
  ) %>%
  unpack(newold)
# A tibble: 3 x 4
#     cyl data                 old   new
#   <dbl> <list>             <dbl> <dbl>
# 1     6 <tibble [7 x 10]>   19.7  30.7
# 2     4 <tibble [11 x 10]>  26.7  31.1
# 3     8 <tibble [14 x 10]>  15.1  17.0
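Wrapping the parameter vector in list() turns it into a single element that is reused for every group; a bare c(5, 10) would instead be treated as two separate elements to iterate over. The same recycling idea can be sketched in base R with Map. Note that myf below is a hypothetical stand-in (the real function is not shown above), so the numbers will not match the output printed there.

```r
# Hypothetical stand-in for myf: takes one group's data and a parameter vector,
# returns a one-row data frame with columns old and new
myf <- function(df, params) {
  data.frame(old = mean(df$mpg),
             new = mean(df$mpg) + params[1] / params[2])
}

groups <- split(mtcars, mtcars$cyl)          # one data frame per cyl value

# list(c(5, 10)) has length 1, so the whole vector is passed to every call
res <- Map(myf, groups, list(c(5, 10)))
out <- do.call(rbind, res)                   # one row per group
out
```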
Map a function over a vector
With purrr, we use map:
library(purrr)
map(myknots, ~ myfoo(x, y, nknots = .x))
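Here the function is called once per element of myknots while x and y stay fixed. A minimal base R sketch of the same pattern with lapply; myfoo, x, y, and myknots are all hypothetical stand-ins, since the originals are not shown:

```r
# Hypothetical data and function: residual sum of squares of a step-function
# fit that splits x into (nknots + 1) equal-width bins
x <- 1:20
y <- x^2
myfoo <- function(x, y, nknots) {
  bins <- cut(x, breaks = nknots + 1)   # assign each x to a bin
  sum((y - ave(y, bins))^2)             # squared deviation from the bin mean
}

myknots <- c(1, 3, 9)

# One call per knot count; x and y are captured from the enclosing scope
fits <- lapply(myknots, function(k) myfoo(x, y, nknots = k))
unlist(fits)
```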
Applying purrr::map over each of a vector of characters
Your syntax is a little bit off: you would use either map(animals, make_df) or map(animals, ~ make_df(.)). The second argument of map needs to be a function, just as with lapply:
data.frame(animals) %>% mutate(ldf = map(animals, make_df)) %>% as_tibble()
# A tibble: 3 x 2
# animals ldf
# <fctr> <list>
#1 sheep <data.frame [5 x 3]>
#2 cow <data.frame [5 x 3]>
#3 horse <data.frame [5 x 3]>
data.frame(animals) %>% mutate(ldf = map(animals, ~ make_df(.))) %>% as_tibble()
# A tibble: 3 x 2
# animals ldf
# <fctr> <list>
#1 sheep <data.frame [5 x 3]>
#2 cow <data.frame [5 x 3]>
#3 horse <data.frame [5 x 3]>
Or, if using the data.frame constructor, you need to use I to create a list-type column:
data.frame(animals, ldf = I(lapply(animals, make_df)))
#                         ^
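To try the I() version without any packages, here is a small self-contained sketch. The animals vector and make_df below are hypothetical reconstructions chosen to produce 5 x 3 data frames like those in the printed output; the original definitions are not shown.

```r
# Hypothetical inputs: three animal names and a function returning a 5 x 3 data frame
animals <- c("sheep", "cow", "horse")
make_df <- function(name) {
  data.frame(id = 1:5, animal = name, value = seq_len(5) * 2)
}

# I() marks the list "as is", so data.frame() keeps it as one list column
# instead of trying to expand each element into its own columns
res <- data.frame(animals, ldf = I(lapply(animals, make_df)))

dim(res)         # one row per animal, two columns
res$ldf[[2]]     # the data frame stored for "cow"
```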
How to replace the elements of a large character vector
You could create a lookup data frame with vec1 and vec2 as columns. Note that there was some whitespace in the data that you shared, which I have removed.
dat <- data.frame(a = c("Zone 1A", "Zone 1C","Zone 2B","Zone 3C"), b = 1:4)
vec1 <- c("Zone 1A","Zone 1B","Zone 1C","Zone 2A","Zone 2B","Zone 2C")
vec2 <- c("Zone 1","Zone 1","Zone 1","Zone 2","Zone 2","Zone 2")
lookup <- data.frame(vec1, vec2)
You can use match to replace values.
dat$a <- lookup$vec2[match(dat$a, lookup$vec1)]
Alternatively, based on the data shared, we can remove the last character of the a column, which leaves the proper "Zone" value. You can use sub to do that.
dat$a <- sub('[A-Z]$', '', dat$a)
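The two approaches differ for values missing from the lookup: in the sample data, "Zone 3C" is not in vec1, so match returns NA for it, while sub still strips the trailing letter. A small sketch comparing them, plus one possible fallback (an assumption, not part of the answer above) that keeps the original value when there is no match:

```r
dat  <- data.frame(a = c("Zone 1A", "Zone 1C", "Zone 2B", "Zone 3C"), b = 1:4)
vec1 <- c("Zone 1A", "Zone 1B", "Zone 1C", "Zone 2A", "Zone 2B", "Zone 2C")
vec2 <- c("Zone 1", "Zone 1", "Zone 1", "Zone 2", "Zone 2", "Zone 2")

idx <- match(dat$a, vec1)      # NA where dat$a is not found in vec1
by_match <- vec2[idx]          # "Zone 3C" becomes NA

# Fallback: keep the original value where the lookup has no entry
by_match_keep <- ifelse(is.na(idx), dat$a, by_match)

# Pattern-based alternative: drop one trailing capital letter
by_sub <- sub("[A-Z]$", "", dat$a)
```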
Replace values in a vector based on another vector
Working with factors might be faster:
xf <- as.factor(x)
y[xf]
Note that levels(xf) gives you a character vector similar to your x.lvl. Thus, for this method to work, the elements of y must be in the same order as the corresponding elements of levels(xf).
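The ordering requirement matters because factor levels are sorted, not kept in order of first appearance. A short sketch with hypothetical x and y:

```r
x  <- c("b", "a", "c", "a")       # hypothetical input
xf <- as.factor(x)
lv <- levels(xf)                  # sorted: "a" "b" "c"

# y must follow the level order: y[1] for "a", y[2] for "b", y[3] for "c"
y <- c(10, 20, 30)
res <- y[xf]                      # a factor index is used as its integer codes

# If y is named by key, reordering it by levels(xf) makes the alignment explicit
y_named  <- c(c = 30, a = 10, b = 20)
res_safe <- unname(y_named[lv][xf])
```

Both res and res_safe come out as 20 10 30 10 for this input.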
How to sort a vector by alternating its values
We may use rowid from data.table:
library(data.table)
x[order(rowid(x))]
#[1] 4 5 4 5
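rowid numbers each occurrence of a value (first 4, second 4, and so on), and ordering by that counter interleaves the values. A base R sketch of the same idea using ave in place of rowid, assuming x is c(4, 4, 5, 5) (the original x is not shown; this input is only a guess consistent with the output above):

```r
x <- c(4, 4, 5, 5)                               # assumed input

# Occurrence counter within each distinct value: 1 2 1 2
occ <- ave(seq_along(x), x, FUN = seq_along)

# Ordering by the counter puts all first occurrences before all second ones
res <- x[order(occ)]
res
```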