Convert factor to integer in a data frame
With anna.table (it is a data frame by the way; a table is something else!), the easiest way is to just do:
anna.table2 <- data.matrix(anna.table)
as data.matrix() will convert factors to their underlying numeric (integer) levels. This will work for a data frame that contains only numeric, integer, factor or other variables that can be coerced to numeric. Be careful with character columns: in current versions of R, data.matrix() first converts them to factors and then to integer codes, so the text itself is lost (it is as.matrix() that would give you a character matrix).
If you want anna.table2 to be a data frame, not a matrix, then you can subsequently do:
anna.table2 <- data.frame(anna.table2)
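As a minimal sketch of the round trip described above, here is the same idea on a small made-up data frame. The object anna_demo and its columns are hypothetical stand-ins, since the original anna.table is not shown.

```r
## hypothetical stand-in for anna.table (the real data are not shown)
anna_demo <- data.frame(grp = factor(c("low", "high", "low")),
                        n   = c(10L, 20L, 30L),
                        x   = c(0.5, 1.5, 2.5))

## the factor column is replaced by its integer level codes
m <- data.matrix(anna_demo)
str(m)

## back to a data frame; every column is now numeric
anna_demo2 <- data.frame(m)
str(anna_demo2)
```

The levels of grp are sorted alphabetically ("high", "low"), so "low" is stored as 2 and "high" as 1.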
Another option is to coerce each factor variable to its integer codes yourself. Here is an example of that:
## dummy data
set.seed(1)
dat <- data.frame(a = factor(sample(letters[1:3], 10, replace = TRUE)),
                  b = runif(10))
## sapply over `dat`, converting factor to numeric
dat2 <- sapply(dat, function(x) if(is.factor(x)) {
    as.numeric(x)
} else {
    x
})
dat2 <- data.frame(dat2) ## convert to a data frame
Which gives:
> str(dat)
'data.frame': 10 obs. of 2 variables:
$ a: Factor w/ 3 levels "a","b","c": 1 2 2 3 1 3 3 2 2 1
$ b: num 0.206 0.177 0.687 0.384 0.77 ...
> str(dat2)
'data.frame': 10 obs. of 2 variables:
$ a: num 1 2 2 3 1 3 3 2 2 1
$ b: num 0.206 0.177 0.687 0.384 0.77 ...
However, do note that the above will work only if you want the underlying numeric representation. If your factor has essentially numeric levels, then we need to be a bit cleverer in how we convert the factor to a numeric whilst preserving the "numeric" information coded in the levels. Here is an example:
## dummy data
set.seed(1)
dat3 <- data.frame(a = factor(sample(1:3, 10, replace = TRUE), levels = 3:1),
                   b = runif(10))
## sapply over `dat3`, converting factor to numeric
dat4 <- sapply(dat3, function(x) if(is.factor(x)) {
    as.numeric(as.character(x))
} else {
    x
})
dat4 <- data.frame(dat4) ## convert to a data frame
Note how we need to do as.character(x) first before we do as.numeric(). The extra call replaces each internal code with its level label, and it is those labels that we then convert to numeric. To see why this matters, note what dat3$a is:
> dat3$a
[1] 1 2 2 3 1 3 3 2 2 1
Levels: 3 2 1
If we just convert that to numeric, we get the wrong data, as R converts the underlying level codes:
> as.numeric(dat3$a)
[1] 3 2 2 1 3 1 1 2 2 3
If we coerce the factor to a character vector first, then to a numeric one, we preserve the original information, not R's internal representation:
> as.numeric(as.character(dat3$a))
[1] 1 2 2 3 1 3 3 2 2 1
If your data are like this second example, then you can't use the simple data.matrix() trick, as that is the same as applying as.numeric() directly to the factor and, as this second example shows, that doesn't preserve the original information.
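To make that last point concrete, here is a small sketch comparing the two routes on a factor with numeric-looking levels. The data frame d and its column f are made up for illustration.

```r
## hypothetical data: numeric values stored as a factor with reordered levels
d <- data.frame(f = factor(c(10, 20, 10, 30), levels = c(30, 20, 10)))

codes  <- data.matrix(d)[, "f"]          # internal level codes
values <- as.numeric(as.character(d$f))  # the numbers the labels stand for

codes
values
```

Here codes holds the positions in the level set (3 2 3 1), while values recovers the original numbers (10 20 10 30).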
How to convert data.frame column from Factor to numeric
breast$class <- as.numeric(as.character(breast$class))
If you have many columns to convert to numeric:
indx <- sapply(breast, is.factor)
breast[indx] <- lapply(breast[indx], function(x) as.numeric(as.character(x)))
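Another option is to use stringsAsFactors=FALSE while reading the file with read.table or read.csv, so that text columns come in as character rather than factor (this is the default from R 4.0.0 onward).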
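A rough sketch of the read-time option, using inline text instead of a file. The CSV content and the name breast_demo are invented for illustration; the "?" entry stands in for whatever stray value made the column non-numeric in the first place.

```r
## hypothetical CSV content; the "?" prevents the column from being read as numeric
csv_text <- "id,class\n1,2\n2,4\n3,?"

breast_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(breast_demo$class)   # "character", not "factor"

## no as.character() step needed now; "?" cannot be parsed and becomes NA
breast_demo$class <- suppressWarnings(as.numeric(breast_demo$class))
breast_demo$class
```

Because the column is already character, as.numeric() works on the text directly and there are no level codes to trip over.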
Just in case, here are other ways to create or change columns:
breast[,'class'] <- as.numeric(as.character(breast[,'class']))
or
breast <- transform(breast, class=as.numeric(as.character(class)))
Convert factor to integer
You can combine the two functions: coerce to character, then to numeric:
> fac <- factor(c("1","2","1","2"))
> as.numeric(as.character(fac))
[1] 1 2 1 2
How to convert factor levels to integer in R
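We can use match with the unique elements of each column:
library(dplyr)
## across() replaces the deprecated funs() / superseded mutate_all() (needs dplyr >= 1.0.0)
dat %>%
    mutate(across(everything(), ~ match(.x, unique(.x))))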
#  ID Season Year Weekday
#1  1      1    1       1
#2  2      1    2       2
#3  3      2    1       1
#4  4      2    2       3
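The match/unique approach numbers values by order of first appearance, which is not the same as a factor's internal codes. A base R sketch on a made-up vector (season is hypothetical, not the data from the question) shows the difference:

```r
## hypothetical column; not the data from the question
season <- c("winter", "winter", "summer", "summer")

by_appearance <- match(season, unique(season))  # numbered in order of first appearance
by_level      <- as.integer(factor(season))     # numbered by sorted level order

by_appearance
by_level
```

Here by_appearance is 1 1 2 2, while by_level is 2 2 1 1 because "summer" sorts before "winter". Pick whichever numbering you actually need.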
convert factor and character to numeric in a dataframe
It would help to have some example data to work with, but try:
df$your_factor_variable_now_numeric <-
as.numeric(as.character(df$your_old_factor_variable))
And use it only to convert a factor variable, not the complete dataframe. You can also have a look at type.convert. If you want to convert all factors in the dataframe, you can use something along the lines
df[] <- lapply(df, function(x) as.numeric(as.character(x)))
Note that this converts every column, not just the factors, and any value that does not represent a number becomes NA (with a warning). If unnecessary conversion is a problem, or if there are character columns that should be left untouched, the following restricts the conversion to factors:
numerify <- function(x) if(is.factor(x)) as.numeric(as.character(x)) else x
df[] <- lapply(df, numerify)
On a more general point though, the type of your variables should not prevent you from filtering, if, with filtering, you mean subsetting the dataframe. However, the type conversion should be solved with the above code.
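One caveat with the numerify helper above: it still converts every factor, so a factor with text levels ends up as NA. A possible stricter variant is sketched below. The helper name numerify_safe and the demo data frame are hypothetical and not part of the original answer; the helper converts a factor only when all of its levels parse as numbers.

```r
## hypothetical helper: convert a factor only if every level parses as a number
numerify_safe <- function(x) {
  if (!is.factor(x)) return(x)
  vals <- suppressWarnings(as.numeric(levels(x)))
  if (anyNA(vals)) x else vals[x]   # index the parsed levels by the factor codes
}

## made-up data: one numeric-like factor, one text factor, one integer column
df_demo <- data.frame(num_f = factor(c("1.5", "2", "1.5")),
                      txt_f = factor(c("a", "b", "a")),
                      y     = 1:3)

df_demo[] <- lapply(df_demo, numerify_safe)
str(df_demo)
```

With this version num_f becomes numeric, while txt_f stays a factor and y stays an integer column.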
R: How to convert factors into numeric for a DATA FRAME?
We can try
yourdat[] <- lapply(yourdat, function(x) if(is.factor(x)) as.numeric(levels(x))[x]
                                         else x)
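The as.numeric(levels(x))[x] idiom converts only the level labels to numbers and then indexes them with the factor's codes, which gives the same result as as.numeric(as.character(x)); the ?factor help page describes it as slightly more efficient. A quick check on a made-up factor:

```r
## hypothetical factor with numeric-looking levels
x <- factor(c("3.5", "1", "3.5", "10"))

via_levels <- as.numeric(levels(x))[x]     # convert the levels once, then index by code
via_char   <- as.numeric(as.character(x))  # convert every element

identical(via_levels, via_char)
```

Both routes return the original values, 3.5 1 3.5 10.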
Convert factor to numeric in data frame
One way of doing this:
tbl_alles[sapply(tbl_alles, is.factor)] <- lapply(tbl_alles[sapply(tbl_alles, is.factor)], function(x) as.numeric(as.character(x)))
This looks up the columns of type factor and converts them to class numeric.
Another option (maybe a bit faster) is using the data.table package:
library(data.table)
setDT(tbl_alles)[, names(tbl_alles) := lapply(.SD, function(x) if(is.factor(x)) as.numeric(as.character(x)) else x)]
If your whole data set is of type factor and you want to convert all the columns to numeric type, you could do
tbl_alles[] <- lapply(tbl_alles, function(x) as.numeric(as.character(x)))