Coding variable values into classes using R
The cut function as outlined by @Greg is probably what you want here. One thing to note is that cut returns a factor by default, which you can suppress by supplying labels = FALSE to return the integer values:
cut(data$wt, c(178, 200, 300, Inf), labels = FALSE)
Alternatively, if your cutting does not lend itself to natural breaks, you can use ifelse(). You can "nest" the ifelse statements similar to Excel. I use "with" to cut down on the typing needed:
data$group2 <- with(data, ifelse(wt >= 179 & wt < 200, 1,
                          ifelse(wt >= 200 & wt < 300, 2, 3)))
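One detail worth checking is how the two approaches treat the boundaries. By default cut uses right-closed intervals, so a value of exactly 200 lands in the first group, while the ifelse version puts it in the second. The sketch below uses a made-up wt vector (an assumption for illustration, not the asker's data) and shows that right = FALSE with breaks starting at 179 reproduces the ifelse grouping for values of 179 and above:

```r
# Hypothetical weights chosen to sit on the interval boundaries
wt <- c(179, 199, 200, 250, 300, 450)

# Default cut: intervals are (178,200], (200,300], (300,Inf]
grp_default <- cut(wt, breaks = c(178, 200, 300, Inf), labels = FALSE)

# Left-closed intervals: [179,200), [200,300), [300,Inf)
grp_left <- cut(wt, breaks = c(179, 200, 300, Inf), right = FALSE, labels = FALSE)

# The nested ifelse from the answer
grp_ifelse <- ifelse(wt >= 179 & wt < 200, 1,
              ifelse(wt >= 200 & wt < 300, 2, 3))

print(grp_default)  # 1 1 1 2 2 3
print(grp_left)     # 1 1 2 2 3 3
print(grp_ifelse)   # 1 1 2 2 3 3
```

Values below the first break behave differently as well: cut returns NA for them, whereas the nested ifelse falls through to group 3.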
Recoding a dataset with variables of different classes
I've spotted at least one small problem with your custom function: if you're using ifelse, you need to start off with the is.na condition. See this example:
x <- c(1, 2, NA)
ifelse(x == 1, "foo", "bar")
# > [1] "foo" "bar" NA
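For comparison, here is a minimal sketch of the same call with the is.na check placed first. The "missing" label is just a placeholder chosen for illustration:

```r
x <- c(1, 2, NA)

# Test for NA before any comparison, so the NA never reaches x == 1
res <- ifelse(is.na(x), "missing",
       ifelse(x == 1, "foo", "bar"))

print(res)
# [1] "foo"     "bar"     "missing"
```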
Here's an alternative I've made. The coalesce function comes from the dplyr package.
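library(dplyr)

recode.var <- function(x) {
  # Character columns: "Yes" becomes 1, anything else 0, NA becomes 0
  if (is.character(x)) {
    return(coalesce(as.numeric(x == "Yes"), 0))
  }
  # Numeric columns: keep the values, replace NA with 0
  if (is.numeric(x)) {
    return(coalesce(x, 0))
  }
  # Logical columns: TRUE/FALSE become 1/0, NA becomes 0
  if (is.logical(x)) {
    return(coalesce(as.numeric(x), 0))
  }
  # Any other class (e.g. factor) is returned unchanged
  x
}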
My version does not deal with values outside the options you've mentioned. I'm assuming they don't exist in your dataset, so they don't need to be accounted for, but do tell me if that's a problem.
The final step is how to apply the function to the dataframe. Using dplyr you can use the following:
tmp2 <- mutate_all(tmp, recode.var)
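If you would rather not depend on dplyr, the same idea can be sketched in base R. This is a swapped-in variant, not the function above: recode_base is a hypothetical name, and the tmp data frame is made up for illustration since the original data is not shown.

```r
# Base R sketch of the recoding logic (recode_base is a hypothetical helper)
recode_base <- function(x) {
  if (is.character(x)) {
    x <- as.numeric(x == "Yes")   # "Yes" -> 1, other strings -> 0, NA stays NA
  } else if (is.logical(x)) {
    x <- as.numeric(x)            # TRUE/FALSE -> 1/0, NA stays NA
  }
  if (is.numeric(x)) {
    x[is.na(x)] <- 0              # replace remaining NA with 0
  }
  x
}

# Made-up example data with one column of each class
tmp <- data.frame(
  answer = c("Yes", "No", NA),
  count  = c(2, NA, 5),
  flag   = c(TRUE, NA, FALSE),
  stringsAsFactors = FALSE
)

# Assigning into tmp2[] keeps the data.frame structure while replacing columns
tmp2 <- tmp
tmp2[] <- lapply(tmp, recode_base)
print(tmp2)

# With dplyr >= 1.0, the mutate_all call can also be written as
# mutate(tmp, across(everything(), recode.var))
```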
How to get class of a variable in R using loop?
Since data.frames are really just a list of columns, I do this often using lapply:
lapply(df, class)
As for the for loop you have in the example, when you call df$name, R is trying to find the column called "name". Instead, you want df[, name]:
for (name in names(df)) {
  print(name)
  print(class(df[, name]))
}
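To make the difference concrete, here is a small sketch with a made-up data frame (the column names a and b are assumptions for illustration). It also shows vapply as a way to collect the classes into a named character vector instead of printing them:

```r
# Hypothetical data frame for illustration
df <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)

name <- "a"

# df$name looks for a column literally called "name", which does not exist
print(is.null(df$name))    # TRUE

# df[[name]] (or df[, name]) uses the value stored in the variable
print(class(df[[name]]))   # "integer"

# Collect the first class of every column into a named character vector
classes <- vapply(df, function(col) class(col)[1], character(1))
print(classes)
```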
Changing Class and Mode from Character to Numeric
The lines
as.factor(df$StudyAreaVisitNote)
as.numeric(df$Year)
as.numeric(df$Session)
do not permanently change the values in df. They return transformed vectors that are printed to the console; because you do not save them anywhere, they disappear as soon as that line is done being evaluated. Generally, objects in R are not updated by reference: you must always re-assign the returned result to wherever you would like to store it. So try
df$Year <- as.numeric(df$Year)
df$Session <- as.numeric(df$Session)
instead.
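A quick way to convince yourself is to check the class before and after. The data frame below is made up for illustration:

```r
# Hypothetical data: Year was read in as character
df <- data.frame(Year = c("2019", "2020"), stringsAsFactors = FALSE)

# The conversion result is computed but not stored anywhere
as.numeric(df$Year)
class_before <- class(df$Year)   # still "character"

# Re-assigning stores the converted vector back into the data frame
df$Year <- as.numeric(df$Year)
class_after <- class(df$Year)    # now "numeric"

print(c(before = class_before, after = class_after))
```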
Reading the class of each variable in a DF based on a DF list of variables in R
I think what you want is the following:
sapply(DF1[, DF2[,1]], class)
What this does is first subset DF1 to only include those columns which are named in DF2, then map the class function over each column; sapply makes it return a vector. To get the class of each column in a dataset you need to use a mapping function like lapply, or a for loop. For instance, lapply(mtcars, class) gives you the class of each column.
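Here is a small sketch with made-up DF1 and DF2 (the column names are assumptions for illustration). One caveat: if DF2 lists only a single variable, DF1[, DF2[,1]] drops to a plain vector, so the list-style subset DF1[cols] is the safer form in that case.

```r
# Hypothetical data frames for illustration
DF1 <- data.frame(id = 1:3,
                  name = c("a", "b", "c"),
                  score = c(1.5, 2, 3.5),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(variable = c("id", "score"), stringsAsFactors = FALSE)

# Column names to look up; as.character guards against a factor column
cols <- as.character(DF2[, 1])

# Same result as sapply(DF1[, DF2[,1]], class) when several columns are named
classes <- sapply(DF1[cols], class)
print(classes)

# List-style subsetting keeps a data frame even for a single column
single <- sapply(DF1["id"], class)
print(single)
```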
Change class of variables in a data frame using another reference data frame
You could try it like this:
Make sure both tables are in the same order:
variable_info <- variable_info[match(names(df), variable_info$variable_name), ]
Create a list of function calls:
funs <- sapply(paste0("as.", variable_info$variable_class), match.fun)
Then map them to each column:
df[] <- Map(function(dd, f) f(as.character(dd)), df, funs)
With data.table you could do it almost the same way, except you replace the last line by:
library(data.table)
dt <- as.data.table(df) # or use setDT(df)
dt[, names(dt) := Map(function(dd, f) f(as.character(dd)), dt, funs)]
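The base R steps can be put together as a minimal end-to-end sketch. The df and variable_info contents below are made up for illustration, and lapply is used in place of sapply only to make it explicit that funs is a list of functions:

```r
# Hypothetical data: every column starts out as character
df <- data.frame(a = c("1", "2"),
                 b = c("x", "y"),
                 c = c("3.5", "4"),
                 stringsAsFactors = FALSE)

# Hypothetical reference table, deliberately in a different order than df
variable_info <- data.frame(variable_name  = c("b", "c", "a"),
                            variable_class = c("factor", "numeric", "integer"),
                            stringsAsFactors = FALSE)

# For each column of df, find the row of variable_info that describes it
variable_info <- variable_info[match(names(df), variable_info$variable_name), ]

# Look up the conversion functions as.integer, as.factor, as.numeric
funs <- lapply(paste0("as.", variable_info$variable_class), match.fun)

# Apply the i-th function to the i-th column, keeping the data.frame structure
df[] <- Map(function(col, f) f(as.character(col)), df, funs)

result <- sapply(df, class)
print(result)
```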