how do I replace numeric codes with value labels from a lookup table?
You can do it this way:
df <- data.frame(id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))
profession.code <- c(`0` = 0, Optometrists=1, Accountants=2, Veterinarians=3,
`Financial analysts`=4, Nurses=5)
df$profession.str <- names(profession.code)[match(df$profession, profession.code)]
df
# id profession profession.str
# 1 1 1 Optometrists
# 2 2 5 Nurses
# 3 3 4 Financial analysts
# 4 4 NA <NA>
# 5 5 0 0
# 6 6 5 Nurses
Note that I had to add a 0
entry in your profession.code
vector to account for those zeroes.
EDIT: here is an updated solution to account for Eric's comment below that the data may contain any number of profession codes for which there are no corresponding descriptions:
match.idx <- match(df$profession, profession.code)
df$profession.str <- ifelse(is.na(match.idx),
df$profession,
names(profession.code)[match.idx])
how do I replace numeric codes in a data frame with value labels from a (data.frame) lookup table?
This is one approach in base:
df <- data.frame(id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))
pc <- data.frame(profession.code = c(1,2,3,4,5),
profession.label = c('Optometrists',
'Accountants', 'Veterinarians',
'Financial analysts', 'Nurses'))
df$new <- as.character(pc[match(df$profession,
pc$profession.code), 'profession.label'])
df[is.na(df$new), 'new'] <- df[is.na(df$new), 'profession']
df$new <- as.factor(df$new)
df
Which yields:
id profession new
1 1 1 Optometrists
2 2 5 Nurses
3 3 4 Financial analysts
4 4 NA <NA>
5 5 0 0
6 6 5 Nurses
Replace values in a dataframe based on lookup table
You posted an approach in your question which was not bad. Here's a smiliar approach:
new <- df # create a copy of df
# using lapply, loop over columns and match values to the look up table. store in "new".
new[] <- lapply(df, function(x) look$class[match(x, look$pet)])
An alternative approach which will be faster is:
new <- df
new[] <- look$class[match(unlist(df), look$pet)]
Note that I use empty brackets ([]
) in both cases to keep the structure of new
as it was (a data.frame).
(I'm using df
instead of table
and look
instead of lookup
in my answer)
Use sjlabelled::set_labels with lookup table
Match the column names of data
with id
, split the answers
on ;
and pass the labels as a list.
library(sjlabelled)
data <- set_labels(data, labels = strsplit(lookup$answers[match(names(data), lookup$id)], ';'))
get_labels(data)
#$q1
#[1] "atext1" "btext1" "ctext1"
#$q2
#[1] "atext2" "btext2" "ctext2"
#$q3
#[1] "atext3" "btext3" "ctext3"
#$q4
#[1] "atext4" "btext4" "ctext4"
#$q5
#[1] "atext5" "btext5" "ctext5"
How to replace string values in a column based on a lookup table
An option would be gsubfn
. We match one or more digits with a dot ([0-9.]+
) as the pattern and in the replacement, pass a list
of key/value pairs created from the second dataset ('df2'). For the matching pattern with the 'keys', it replace the corresponding 'value' within the string
library(gsubfn)
df1$Node <- gsubfn("([0-9.]+)", as.list(setNames(df2$Label,df2$Node)), df1$Node)
df1$Node
#[1] "One one > Two one > Three two" "One two > Two two > Three one" "One one > Two two > Three one" "One two > Two one > Three two"
#[5] "One one > Two two > Three two"
data
df1 <- data.frame(ID = 1:5, Node = c("1.1 > 2.1 > 3.2", "1.2 > 2.2 > 3.1", "1.1 > 2.2 > 3.1", "1.2 > 2.1 > 3.2", "1.1 > 2.2 > 3.2"), stringsAsFactors = FALSE)
df2 <- data.frame(Label = c("One one", "One two", "Two one", "Two two", "Three one", "Three two"), Node = c("1.1", "1.2", "2.1", "2.2", "3.1", "3.2"), stringsAsFactors = FALSE)
Creating a new variable from a lookup table
define your lookup table
lookup= data.frame(
base=c(0,1,2,3,4),
aresult=c("strikeout","single","double","triple","home run"))
then use join from plyr
dataset = join(dataset,lookup,by='aresult')
How to replace value by label from another table
If there are many columns to be mapped by labels, my SQL require many
LEFT JOIN clauses. Is this avoidable?
This data model is somewhat suspicious, but if you have to deal with it you can avoid multiple left joins using pivot
or case when
:
SQL Fiddle demo
select c.id, c.name,
max(case when col = 'gender' then label end) gender,
max(case when col = 'group' then label end) grp
from customer c
left join label l
on l.tbl = 'Customer' and (l.col, l.value) in (('gender', c.gender), ('group', c.grp))
group by c.id, c.name
Try to avoid naming columns as "table", "column", "group", they are reserved words in Oracle.
Label variables with lookup table and sjlabelled::set_label
You can use set_label
directly on a dataframe.
library(sjlabelled)
data <- set_label(data, lookup$question[match(names(data), lookup$id)])
get_label(data)
# q1 q2 q3 q4 q5
#"qtext1" "qtext2" "qtext3" "qtext4" "qtext5"
Related Topics
Create Polygon from Set of Points Distributed
R - File.Choose() Customizing Dialogue Window
Add Na Value to Ggplot Legend for Continuous Data Map
How to Sweep Specific Columns with Dplyr
What Is R's Crossproduct Function
Grid.Arrange Using List of Plots
Frequency Table with Several Variables in R
Difference Between Installing a Package from Source and from Compiled Binary
Are Dataframe[ ,-1] and Dataframe[-1] the Same
How to Flip Rows and Columns in R
Creating a Sankey Diagram Using Networkd3 Package in R
Removing a List of Columns from a Data.Frame Using Subset
How to Change the Order of the Panels in Simple Lattice Graphs
Saving a List of Plots by Their Names()
Shiny R - Download the Result of a Table
Unquote the Variable Name on the Right Side of Mutate Function in Dplyr