How to Replace Numeric Codes with Value Labels from a Lookup Table

how do I replace numeric codes with value labels from a lookup table?

You can do it this way:

df <- data.frame(id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))

profession.code <- c(`0` = 0, Optometrists=1, Accountants=2, Veterinarians=3,
`Financial analysts`=4, Nurses=5)

df$profession.str <- names(profession.code)[match(df$profession, profession.code)]
df
# id profession profession.str
# 1 1 1 Optometrists
# 2 2 5 Nurses
# 3 3 4 Financial analysts
# 4 4 NA <NA>
# 5 5 0 0
# 6 6 5 Nurses

Note that I had to add a 0 entry in your profession.code vector to account for those zeroes.

EDIT: here is an updated solution to account for Eric's comment below that the data may contain any number of profession codes for which there are no corresponding descriptions:

match.idx <- match(df$profession, profession.code)
df$profession.str <- ifelse(is.na(match.idx),
df$profession,
names(profession.code)[match.idx])

how do I replace numeric codes in a data frame with value labels from a (data.frame) lookup table?

This is one approach in base:

df <- data.frame(id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))

pc <- data.frame(profession.code = c(1,2,3,4,5),
profession.label = c('Optometrists',
'Accountants', 'Veterinarians',
'Financial analysts', 'Nurses'))

df$new <- as.character(pc[match(df$profession,
pc$profession.code), 'profession.label'])
df[is.na(df$new), 'new'] <- df[is.na(df$new), 'profession']
df$new <- as.factor(df$new)
df

Which yields:

  id profession                new
1 1 1 Optometrists
2 2 5 Nurses
3 3 4 Financial analysts
4 4 NA <NA>
5 5 0 0
6 6 5 Nurses

Replace values in a dataframe based on lookup table

You posted an approach in your question which was not bad. Here's a smiliar approach:

new <- df  # create a copy of df
# using lapply, loop over columns and match values to the look up table. store in "new".
new[] <- lapply(df, function(x) look$class[match(x, look$pet)])

An alternative approach which will be faster is:

new <- df
new[] <- look$class[match(unlist(df), look$pet)]

Note that I use empty brackets ([]) in both cases to keep the structure of new as it was (a data.frame).

(I'm using df instead of table and look instead of lookup in my answer)

Use sjlabelled::set_labels with lookup table

Match the column names of data with id, split the answers on ; and pass the labels as a list.

library(sjlabelled)

data <- set_labels(data, labels = strsplit(lookup$answers[match(names(data), lookup$id)], ';'))
get_labels(data)

#$q1
#[1] "atext1" "btext1" "ctext1"

#$q2
#[1] "atext2" "btext2" "ctext2"

#$q3
#[1] "atext3" "btext3" "ctext3"

#$q4
#[1] "atext4" "btext4" "ctext4"

#$q5
#[1] "atext5" "btext5" "ctext5"

How to replace string values in a column based on a lookup table

An option would be gsubfn. We match one or more digits with a dot ([0-9.]+) as the pattern and in the replacement, pass a list of key/value pairs created from the second dataset ('df2'). For the matching pattern with the 'keys', it replace the corresponding 'value' within the string

library(gsubfn)
df1$Node <- gsubfn("([0-9.]+)", as.list(setNames(df2$Label,df2$Node)), df1$Node)
df1$Node
#[1] "One one > Two one > Three two" "One two > Two two > Three one" "One one > Two two > Three one" "One two > Two one > Three two"
#[5] "One one > Two two > Three two"

data

df1 <- data.frame(ID = 1:5, Node = c("1.1 > 2.1 > 3.2", "1.2 > 2.2 > 3.1", "1.1 > 2.2 > 3.1", "1.2 > 2.1 > 3.2", "1.1 > 2.2 > 3.2"), stringsAsFactors = FALSE)

df2 <- data.frame(Label = c("One one", "One two", "Two one", "Two two", "Three one", "Three two"), Node = c("1.1", "1.2", "2.1", "2.2", "3.1", "3.2"), stringsAsFactors = FALSE)

Creating a new variable from a lookup table

define your lookup table

lookup= data.frame( 
base=c(0,1,2,3,4),
aresult=c("strikeout","single","double","triple","home run"))

then use join from plyr

dataset = join(dataset,lookup,by='aresult')

How to replace value by label from another table

If there are many columns to be mapped by labels, my SQL require many
LEFT JOIN clauses. Is this avoidable?

This data model is somewhat suspicious, but if you have to deal with it you can avoid multiple left joins using pivot or case when:

SQL Fiddle demo

select c.id, c.name, 
max(case when col = 'gender' then label end) gender,
max(case when col = 'group' then label end) grp
from customer c
left join label l
on l.tbl = 'Customer' and (l.col, l.value) in (('gender', c.gender), ('group', c.grp))
group by c.id, c.name

Try to avoid naming columns as "table", "column", "group", they are reserved words in Oracle.

Label variables with lookup table and sjlabelled::set_label

You can use set_label directly on a dataframe.

library(sjlabelled)

data <- set_label(data, lookup$question[match(names(data), lookup$id)])

get_label(data)
# q1 q2 q3 q4 q5
#"qtext1" "qtext2" "qtext3" "qtext4" "qtext5"


Related Topics



Leave a reply



Submit