How to Do Vlookup and Fill Down (Like in Excel) in R

How to do vlookup and fill down (like in Excel) in R?

If I understand your question correctly, here are four methods to do the equivalent of Excel's VLOOKUP and fill down using R:

# load sample data from Q
hous <- read.table(header = TRUE,
stringsAsFactors = FALSE,
text="HouseType HouseTypeNo
Semi 1
Single 2
Row 3
Single 2
Apartment 4
Apartment 4
Row 3")

# create a toy large table with a 'HouseType' column
# but no 'HouseTypeNo' column (yet)
largetable <- data.frame(HouseType = as.character(sample(unique(hous$HouseType), 1000, replace = TRUE)), stringsAsFactors = FALSE)

# create a lookup table to get the numbers to fill
# the large table
lookup <- unique(hous)
HouseType HouseTypeNo
1 Semi 1
2 Single 2
3 Row 3
5 Apartment 4

Here are four methods to fill the HouseTypeNo in the largetable using the values in the lookup table:

First with merge in base:

# 1. using base 
base1 <- (merge(lookup, largetable, by = 'HouseType'))

A second method with named vectors in base:

# 2. using base and a named vector
housenames <- as.numeric(1:length(unique(hous$HouseType)))
names(housenames) <- unique(hous$HouseType)

base2 <- data.frame(HouseType = largetable$HouseType,
HouseTypeNo = (housenames[largetable$HouseType]))

Third, using the plyr package:

# 3. using the plyr package
library(plyr)
plyr1 <- join(largetable, lookup, by = "HouseType")

Fourth, using the sqldf package

# 4. using the sqldf package
library(sqldf)
sqldf1 <- sqldf("SELECT largetable.HouseType, lookup.HouseTypeNo
FROM largetable
INNER JOIN lookup
ON largetable.HouseType = lookup.HouseType")

If it's possible that some house types in largetable do not exist in lookup then a left join would be used:

sqldf("select * from largetable left join lookup using (HouseType)")

Corresponding changes to the other solutions would be needed too.

Is that what you wanted to do? Let me know which method you like and I'll add commentary.

Using R function similar to VLOOKUP in Excel

No need to use dplyr, simple base-R is enough, with the match-function.

largedataset$Val <- lookuptable$Val[match(largedataset$Lookup, lookuptable$Lookup)]

If you need to do the lookup more often and have a really large lookuptable, there may be some benefit in using the fastmatch-package with the similar fmatch-function, but only if regular match is too slow.

R equivalent of Excel VLOOKUP

We can use deframe to create a named vector and remap more efficiently

library(dplyr)
library(tibble)
tibble(
country = c("AE", "AF", "foo")
) %>% mutate(country_remap = deframe(test_mapping)[country])
# A tibble: 3 x 2
# country country_remap
# <chr> <chr>
#1 AE United Arab Emirates
#2 AF Afghanistan
#3 foo <NA>

Using Merge in R to replicate Vlookup

A dplyr solution.

library(dplyr)
dbase <- data.frame(machine_number = c("10","20","30","10","10","50"),
second_attribute=c("a","b","c","c","a","d"))
mbase <- data.frame(machine_number = c("10","20","30","40","50","60","70","80","90","100"),
rate=c(22,22,25,17,15,15,55,12,15,19))

left_join(dbase, mbase, by = "machine_number")

machine_number second_attribute rate
1 10 a 22
2 20 b 22
3 30 c 25
4 10 c 22
5 10 a 22
6 50 d 15

Fill down column based on values in another

You can use ave to fill na per group.

df$Tag  <- with(df, ave(Tag, ID, FUN=function(x) {
i <- is.na(x)
x[i] <- x[!i][1]
x
}))
df
# ID Tag
#1 999 YES
#2 999 YES
#3 100 NO
#4 100 NO
#5 500 <NA>

Or using fill

library(dplyr)
library(tidyr)
df %>% group_by(ID) %>% fill(Tag, .direction = "downup")
## A tibble: 5 x 2
## Groups: ID [3]
# ID Tag
# <int> <fct>
#1 999 YES
#2 999 YES
#3 100 NO
#4 100 NO
#5 500 NA

Matching values in a column using R - excel vlookup

We can just use match to find the index of those matching elements of 'y' with that of 'x' and use that to index to get the corresponding 'z'

dt[, Result1 := z[match(y,x)]]
dt
# x y z Result Result1
#1: 1 6 a na NA
#2: 2 5 b e e
#3: 3 4 c d d
#4: 4 3 d c c
#5: 5 2 e b b
#6: 5 1 f a a

VLookup type method in R

merge them:

> merge(full_list, replace, by.x="Ticker", by.y="Symbol")
Ticker Long_Name Type Location Month
1 AC Ethanol -- CBOT F US U13
2 AIC DJ UBS Commodity Index -- CBOT F US U13
3 BBS South American Soybeans -- CBOT F US U13
4 BO Soybean Oil -- CBT F US V13
5 C Corn -- CBT F US U13
6 DF Dow Jones Industrial Average -- CBT F US U13

VLOOKUP in R Programming Language. Currently doing VLOOKUP in Excel, still cannot get the desired results in R

Here's a demonstration of doing this using dplyr and magrittr (for %>%) packages

library(dplyr)
library(magrittr)

df1 <- data.frame(AccountID = 1:3, AccountName = c("Superman", "Batman", "Joker"))
df2 <- data.frame(AccountID = 1:2, AccountBalance = c(1000, 1000000000))

# joining (or vlookup) to get account balance for all accounts

left_join(df1, df2, by = "AccountID") %>% mutate(Flag = ifelse(is.na(AccountBalance), "N", "Y"))

# AccountID AccountName AccountBalance Flag
# 1 1 Superman 1e+03 Y
# 2 2 Batman 1e+09 Y
# 3 3 Joker NA N

Hope this is what you were looking for. If not, share some sample data and desired output.

How to turn simple Excel LOOKUP into R code?

This should do:

library(tidyverse)

dfmain <- dfmain %>% left_join(dfsub1) %>% left_join(dfsub2)

see here for details on left_join.



Related Topics



Leave a reply



Submit