Change Values in Multiple Columns of a Dataframe Using a Lookup Table

Here's a solution that works on each column successively using lapply():

as.data.frame(lapply(example,function(col) lookup$letter[match(col,lookup$number)]));
## a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E

Alternatively, if you don't mind switching over to a matrix, you can achieve a "more vectorized" solution, as a matrix will allow you to call match() and index lookup$letter just once for the entire input:

matrix(lookup$letter[match(as.matrix(example),lookup$number)],nrow(example));
## [,1] [,2] [,3]
## [1,] "A" "E" "A"
## [2,] "B" "D" "D"
## [3,] "C" "C" "C"
## [4,] "D" "B" "B"
## [5,] "E" "A" "E"

(And of course you can coerce back to data.frame via as.data.frame() afterward, although you'll have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)

Efficiently replace string in multiple columns based on lookup table

With stri_replace_all_fixed from stringi, you can replace many patterns at once. The syntax is a bit confusing, but with vectorise_all = FALSE every pattern is applied to every element of the input, and each match is replaced with the corresponding replacement.

First, let's create some example data as you did not provide any:

library(tidyverse)
set.seed(1)
exp <- data.frame(matrix(sample(LETTERS, 1000, replace = TRUE), ncol = 100))

lookup <- tribble(
~pattern, ~replacement,
"A", ":",
"F", " ",
"Y", "Test"
)

Use mutate() with across(), which supersedes mutate_at() (the mutate_at family is slowly being phased out):

exp %>%
  mutate(across(c(X1, X3), ~ stringi::stri_replace_all_fixed(
    str = .x,
    pattern = lookup[["pattern"]],
    replacement = lookup[["replacement"]],
    vectorise_all = FALSE
  ))) %>%
  as_tibble()
#> # A tibble: 10 × 100
#> X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Test A U L T Y N H M V W B U
#> 2 D U E O T W B F H L S J L
#> 3 G U I A Z X M W Y P V A G
#> 4 : J Test T L F R L P A R K X
#> 5 B V N C Y Z V F Y M Z Z U
#> 6 W N E F W G N H W U P O V
#> 7 K J E J F S F G N F K Z H
#> 8 N G B J Y J A K T Q J X A
#> 9 R I J F H F S Q G I G J S
#> 10 S O Test O L X S D M G S P Z
#> # … with 87 more variables: X14 <chr>, X15 <chr>, X16 <chr>, X17 <chr>,
#> # X18 <chr>, X19 <chr>, X20 <chr>, X21 <chr>, X22 <chr>, X23 <chr>,
#> # X24 <chr>, X25 <chr>, X26 <chr>, X27 <chr>, X28 <chr>, X29 <chr>,
#> # X30 <chr>, X31 <chr>, X32 <chr>, X33 <chr>, X34 <chr>, X35 <chr>,
#> # X36 <chr>, X37 <chr>, X38 <chr>, X39 <chr>, X40 <chr>, X41 <chr>,
#> # X42 <chr>, X43 <chr>, X44 <chr>, X45 <chr>, X46 <chr>, X47 <chr>,
#> # X48 <chr>, X49 <chr>, X50 <chr>, X51 <chr>, X52 <chr>, X53 <chr>, …

Created on 2022-02-16 by the reprex package (v2.0.1)

I believe this is about as fast as it gets.

Replace values in column of Pandas DataFrame using a Series lookup table

You can use the Series map() method for that, passing the lookup Series (here called name, indexed by code):

In [38]: df_normalised['name'] = df_normalised['code'].map(name)

In [39]: df_normalised
Out[39]:
code name
0 8 Human development
1 11 Environment and natural resources management
2 1 Economic management
3 6 Social protection and risk management
4 5 Trade and integration
5 2 Public sector governance
6 11 Environment and natural resources management
7 6 Social protection and risk management
8 7 Social dev/gender/inclusion
9 7 Social dev/gender/inclusion
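One detail worth checking with map() is what happens to codes that are missing from the lookup. The following is a minimal sketch with made-up data: the contents of name, the code 99, and the name_filled column are assumptions for illustration, not taken from the original question.

```python
import pandas as pd

# Hypothetical lookup Series: the index holds the codes, the values hold the names.
name = pd.Series({
    8: "Human development",
    11: "Environment and natural resources management",
    1: "Economic management",
})

# Hypothetical data; code 99 deliberately has no entry in the lookup.
df_normalised = pd.DataFrame({"code": [8, 11, 1, 99]})

# map() looks each code up in the index of `name`; unmatched codes become NaN.
df_normalised["name"] = df_normalised["code"].map(name)

# Optionally substitute a placeholder for codes that were not found.
df_normalised["name_filled"] = df_normalised["name"].fillna("Unknown")

print(df_normalised)
```

If unmatched codes should keep their original value instead of becoming NaN, replace() with a dictionary may be a better fit than map().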

R- How do I use a lookup table containing threshold values that vary for different variables (columns) to replace values below those thresholds?

Perhaps this helps

library(dplyr)
dat %>%
  mutate(across(all_of(detect_level$Parameter),
    ~ pmax(., detect_level$LOD[match(cur_column(), detect_level$Parameter)])))

For the updated case

dat %>%
  mutate(across(all_of(detect_level$Parameter),
    ~ replace(., . < detect_level$LOD[match(cur_column(), detect_level$Parameter)],
      detect_level$halfLOD[match(cur_column(), detect_level$Parameter)])))

Function to replace values in data.table using a lookup table

We don't need as.name. The object on the left-hand side of = is not evaluated, so the column name stored in col is never substituted in. Instead, we can pass a named vector to on, built with setNames:

dt.replaceValueUsingLookup <- function(dt, col, dtLookup) {
  dt[
    dtLookup,
    on = setNames("old", col),
    (col) := new
  ]
}

Testing:

dt %>% 
dt.replaceValueUsingLookup("chapter", dtLookup)

dt
# chapter
#1: 101
#2: 102
#3: 13
#4: 105
#5: 104

How to match multiple columns based on lookup table

We could unlist the dataframe and match directly.

new_df <- results
names(new_df) <- paste0("id", seq_along(new_df))
new_df[] <- lookup$id[match(unlist(new_df), lookup$price)]
cbind(results, new_df)

# price_1 price_2 id1 id2
#1 2 3 B C
#2 2 1 B A
#3 1 1 A A

In dplyr, we can do

library(dplyr)
bind_cols(results, results %>% mutate_all(~lookup$id[match(., lookup$price)]))

Replace values in a dataframe based on lookup table

The approach you posted in your question was not bad. Here's a similar one:

new <- df  # create a copy of df
# using lapply, loop over columns and match values to the look up table. store in "new".
new[] <- lapply(df, function(x) look$class[match(x, look$pet)])

An alternative approach which will be faster is:

new <- df
new[] <- look$class[match(unlist(df), look$pet)]

Note that I use empty brackets ([]) in both cases to keep the structure of new as it was (a data.frame).

(I'm using df instead of table and look instead of lookup in my answer)

How do I replace the values in a dataframe based on a lookup table in another dataframe

Use pandas' replace method: it searches the dataframe for the dictionary's keys and replaces each key it finds with the associated value. Your dataframe was missing a few NaNs, so I edited it to match what you posted.

  #create a dictionary from the lookup
repl = lookup.set_index('value')['description'].to_dict()

#print(repl)

{653: '30 to 39',
654: '40 to 49',
1056: 'Belgium',
1158: 'Taiwan',
1203: 'Czech Republic',
545: 'White',
530: 'Other'}

#pass it using pandas' replace method
df.replace(repl)


age cty eth
0 30 to 39 Belgium NaN
1 30 to 39 Belgium White
2 40 to 49 NaN Other
3 30 to 39 Taiwan Other
4 30 to 39 Czech Republic White
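To make the behaviour of replace() concrete, here is a small self-contained sketch. The data is invented for illustration (including the unmatched value 9999), and the nested-dictionary call restricting the replacement to one column is an addition, not part of the original answer.

```python
import pandas as pd

# Hypothetical lookup table: numeric codes and their descriptions.
lookup = pd.DataFrame({
    "value": [653, 654, 1056, 545],
    "description": ["30 to 39", "40 to 49", "Belgium", "White"],
})

# Hypothetical data; 9999 has no entry in the lookup.
df = pd.DataFrame({"age": [653, 654], "cty": [1056, 9999], "eth": [545, 545]})

# Build the code -> description dictionary from the lookup.
repl = lookup.set_index("value")["description"].to_dict()

# Replace across the whole frame; values that are not keys are left unchanged.
result = df.replace(repl)

# A nested dict restricts the replacement to the named column only.
only_age = df.replace({"age": repl})

print(result)
print(only_age)
```

Unlike map(), replace() leaves values without a matching key as they are, which is why 9999 survives in the cty column.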

Replace column values in table with values from lookup based on matches in R using data.table

We can do a join on the 'code' and 'old' from table and lookup respectively

table[lookup, code := new, on = .(code = old)]

Output:

 table
code sn
1: CBa 1
2: CBe 2
3: CBa 3
4: CBe 4
5: OOO 5
6: PPP 6
7: CBa 7

Is there a way to calculate a new column of a dataframe on base of values in a(nother) lookup table

As I understand the formula: for each column C[i], we multiply it by PV[i], W[i] and RC_A[i], then sum the results over all i:

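result = 0

# accumulate the weighted sum, one lookup row (one column of df_data) at a time
for i in range(len(df_lookup)):
    result = result + (df_data[df_lookup.loc[i, "C"]] * df_lookup.PV.iloc[i] *
                       df_lookup.W.iloc[i] * df_lookup.RC_A.iloc[i])

# result is a column (a Series aligned with df_data's index)

# then multiply element-wise; this assumes T is a column of df_data,
# so use df_data["T"] because df_data.T would return the transpose
df_data['A_calc'] = (df_data["T"] / (df_data.SF * df_data.SP)).multiply(result, axis="index")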
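The same calculation can also be written without an explicit loop. This is a rough sketch on invented numbers: it assumes T, SF and SP are ordinary columns of df_data and that df_lookup["C"] holds the names of the data columns to weight. The names coef and cols are introduced here for illustration.

```python
import pandas as pd

# Hypothetical lookup: one row per data column, with its three factors.
df_lookup = pd.DataFrame({
    "C": ["C1", "C2"],
    "PV": [1.0, 2.0],
    "W": [0.5, 0.25],
    "RC_A": [2.0, 4.0],
})

# Hypothetical data; T, SF and SP are assumed to be plain columns.
df_data = pd.DataFrame({
    "C1": [1.0, 2.0],
    "C2": [3.0, 4.0],
    "T": [10.0, 40.0],
    "SF": [2.0, 4.0],
    "SP": [5.0, 5.0],
})

# One combined coefficient per lookup row: PV * W * RC_A.
coef = (df_lookup["PV"] * df_lookup["W"] * df_lookup["RC_A"]).to_numpy()

# Weighted sum over the listed columns, computed row by row via a dot product.
cols = df_lookup["C"].tolist()
result = df_data[cols].dot(coef)

# Scale by T / (SF * SP); df_data["T"] avoids the .T transpose attribute.
df_data["A_calc"] = (df_data["T"] / (df_data["SF"] * df_data["SP"])) * result

print(df_data)
```

The dot product relies on the order of cols matching the order of coef, which holds here because both are derived from the same rows of df_lookup.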

