How to convert data.frame column from Factor to numeric
breast$class <- as.numeric(as.character(breast$class))
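The reason for going through as.character first can be seen in a small sketch. The factor `f` below is a made-up example, not part of the original data: calling as.numeric directly on a factor returns its internal integer codes, not the values shown as labels.

```r
# Hypothetical factor whose labels are numbers stored as text
f <- factor(c("10", "20", "10", "35"))

# Direct conversion returns the internal level codes
codes <- as.numeric(f)
print(codes)    # 1 2 1 3

# Going through character first recovers the labels as numbers
values <- as.numeric(as.character(f))
print(values)   # 10 20 10 35
```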
If you have many columns to convert to numeric
indx <- sapply(breast, is.factor)
breast[indx] <- lapply(breast[indx], function(x) as.numeric(as.character(x)))
Another option is to use stringsAsFactors=FALSE while reading the file using read.table or read.csv
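As a rough illustration of the read-time option, the sketch below reads a made-up CSV string through the text= argument. The column names and the "?" placeholder for missing values are assumptions for the example, not taken from the original file.

```r
# Hypothetical CSV content; "?" stands in for a missing value
csv_text <- "id,class\n1,2\n2,?\n3,4"

# With stringsAsFactors = FALSE the mixed column is character, not factor
raw_df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(raw_df$class))    # "character"

# Declaring the placeholder as NA lets read.csv infer a numeric type directly
clean_df <- read.csv(text = csv_text, stringsAsFactors = FALSE, na.strings = "?")
print(class(clean_df$class))  # "integer"
print(clean_df$class)         # 2 NA 4
```

If the file marks missing values with some other token, na.strings would need to be set to that token instead.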
Just in case, other options to create/change columns
breast[,'class'] <- as.numeric(as.character(breast[,'class']))
or
breast <- transform(breast, class=as.numeric(as.character(class)))
How to convert a data frame column to numeric type?
Since (still) nobody has got the check-mark, I assume that you have some practical issue in mind, mostly because you haven't specified what type of vector you want to convert to numeric. I suggest that you apply the transform function in order to complete your task.
Now I'm about to demonstrate a certain "conversion anomaly":
# create dummy data.frame
d <- data.frame(char = letters[1:5],
fake_char = as.character(1:5),
fac = factor(1:5),
char_fac = factor(letters[1:5]),
num = 1:5, stringsAsFactors = FALSE)
Let us have a glance at data.frame
> d
  char fake_char fac char_fac num
1    a         1   1        a   1
2    b         2   2        b   2
3    c         3   3        c   3
4    d         4   4        d   4
5    e         5   5        e   5
and let us run:
> sapply(d, mode)
       char   fake_char         fac    char_fac         num
"character" "character"   "numeric"   "numeric"   "numeric"
> sapply(d, class)
       char   fake_char         fac    char_fac         num
"character" "character"    "factor"    "factor"   "integer"
Now you probably ask yourself "Where's an anomaly?" Well, I've bumped into quite peculiar things in R, and this is not the most confounding thing, but it can confuse you, especially if you read this before rolling into bed.
Here goes: the first two columns are character. I've deliberately called the 2nd one fake_char. Spot the similarity of this character variable with the one that Dirk created in his reply. It's actually a numerical vector converted to character. The 3rd and 4th columns are factor, and the last one is "purely" numeric.
If you utilize the transform function, you can convert fake_char into numeric, but not the char variable itself.
> transform(d, char = as.numeric(char))
  char fake_char fac char_fac num
1   NA         1   1        a   1
2   NA         2   2        b   2
3   NA         3   3        c   3
4   NA         4   4        d   4
5   NA         5   5        e   5
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion
but if you do the same thing on fake_char and char_fac, you'll be lucky, and get away with no NA's:
> transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
  char fake_char fac char_fac num
1    a         1   1        1   1
2    b         2   2        2   2
3    c         3   3        3   3
4    d         4   4        4   4
5    e         5   5        5   5
If you save the transformed data.frame and check for mode and class, you'll get:
> D <- transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
> sapply(D, mode)
       char   fake_char         fac    char_fac         num
"character"   "numeric"   "numeric"   "numeric"   "numeric"
> sapply(D, class)
       char   fake_char         fac    char_fac         num
"character"   "numeric"    "factor"   "numeric"   "integer"
So, the conclusion is: Yes, you can convert a character vector into a numeric one, but only if its elements are "convertible" to numeric. If there's just one non-numeric character element in the vector, you'll get a warning when trying to convert that vector to numeric, and that element becomes NA.
And just to prove my point:
> err <- c(1, "b", 3, 4, "e")
> mode(err)
[1] "character"
> class(err)
[1] "character"
> char <- as.numeric(err)
Warning message:
NAs introduced by coercion
> char
[1] 1 NA 3 4 NA
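Building on the output above, one way to find out which elements refuse to convert is to compare the NA positions before and after coercion. This is only a sketch; the names `converted` and `bad` are introduced here for illustration.

```r
# Same vector as above: the numbers are coerced to character by c()
err <- c(1, "b", 3, 4, "e")

# Coerce quietly, then flag elements that became NA only because of coercion
converted <- suppressWarnings(as.numeric(err))
bad <- is.na(converted) & !is.na(err)

print(err[bad])    # "b" "e"
print(which(bad))  # 2 5
```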
And now, just for fun (or practice), try to guess the output of these commands:
> fac <- as.factor(err)
> fac
???
> num <- as.numeric(fac)
> num
???
Kind regards to Patrick Burns! =)
Converting a dataframe from factors to numerical creates all NA's
Sticking to the last approach of the OP: the conversion function needs to be replaced with a (less efficient) function that can deal with NA's, namely:
as.numeric(as.character(x))
The code then becomes:
df <- as.data.frame(df)
as.numeric.factor <- function(x) {as.numeric(as.character(x))}
df[] = lapply(df, as.numeric.factor)
df[df < 0] <- NA
df <- df[,colMeans(is.na(df)) <= 0.999]
library(data.table)  # needed for data.table() and the [, j, by] syntax below
df <- data.table(df)
cols = sapply(df, is.numeric)
cols = names(cols)[cols]
dfclevel = df[, lapply(.SD, mean, na.rm=TRUE), .SDcols = cols, by=matchcode]
Convert from factor to numeric a column in data.frames within a list
I think the problem is that your function in your second lapply is only returning the vector of the numeric factor levels, not your entire data.frame. I believe the following should work:
foo <- function(y) {
y$Var1 <- as.numeric(levels(y$Var1))[y$Var1]
return(y)
}
lst_table <- lapply(lst_table, foo)
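To see the idea at work, here is a minimal sketch on a made-up list of two data frames. `lst_demo` and `to_numeric_var1` are hypothetical names, and the column Var1 is assumed to be a factor with numeric labels. The expression as.numeric(levels(f))[f] converts each level once and then indexes the result by the factor's codes.

```r
# Hypothetical list of data frames, each with a factor column Var1
lst_demo <- list(
  a = data.frame(Var1 = factor(c("3", "10", "3")), n = 1:3),
  b = data.frame(Var1 = factor(c("7.5", "2")), n = 1:2)
)

# Convert Var1 and return the whole data frame, not just the vector
to_numeric_var1 <- function(y) {
  y$Var1 <- as.numeric(levels(y$Var1))[y$Var1]
  y
}

lst_demo <- lapply(lst_demo, to_numeric_var1)
str(lst_demo)
```

Returning `y` at the end is what keeps every list element a data frame.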
R: How to convert factors into numeric for a DATA FRAME?
We can try
yourdat[] <- lapply(yourdat, function(x) if(is.factor(x)) as.numeric(levels(x))[x]
else x)
convert factor and character to numeric in a dataframe
It would help to have some example data to work with, but try:
df$your_factor_variable_now_numeric <-
as.numeric(as.character(df$your_old_factor_variable))
And use it only to convert a factor variable, not the complete dataframe. You can also have a look at type.convert. If you want to convert all factors in the dataframe, you can use something along the lines
df[] <- lapply(df, function(x) as.numeric(as.character(x)))
Note that this converts every column and might not be what you want if some columns do not represent numeric values. If unnecessary conversion is a problem, or if there are character columns in the data that should be left alone, the following, which only touches factors, would be appropriate:
numerify <- function(x) if(is.factor(x)) as.numeric(as.character(x)) else x
df[] <- lapply(df, numerify)
On a more general point though, the type of your variables should not prevent you from filtering, if, with filtering, you mean subsetting the dataframe. However, the type conversion should be solved with the above code.
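The type.convert route mentioned above could look roughly like this. The data frame `df_demo` is invented for the example, and each column is turned into character first so that type.convert can guess a type from the text. With as.is = TRUE, non-numeric columns come back as character, not factor.

```r
# Hypothetical data frame mixing numeric-looking and text columns
df_demo <- data.frame(
  counts = factor(c("1", "2", "3")),
  ratio  = c("0.5", "1.5", "2"),
  label  = factor(c("x", "y", "z")),
  stringsAsFactors = FALSE
)

# Let type.convert pick a type per column from its character representation
df_demo[] <- lapply(df_demo, function(x) type.convert(as.character(x), as.is = TRUE))

col_classes <- sapply(df_demo, class)
print(col_classes)  # counts: integer, ratio: numeric, label: character
```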
Convert factor to numeric in data frame
One way of doing this:
tbl_alles[sapply(tbl_alles, is.factor)] <- lapply(tbl_alles[sapply(tbl_alles, is.factor)], function(x) as.numeric(as.character(x)))
This will look up columns of type factor and convert them to class numeric.
Another option (maybe a bit faster) is using the data.table package
library(data.table)
setDT(tbl_alles)[, names(tbl_alles) := lapply(.SD, function(x) if(is.factor(x)) as.numeric(as.character(x)) else x)]
If your whole data set is of type factor and you want to convert all the columns to numeric type, you could do
tbl_alles[] <- lapply(tbl_alles, function(x) as.numeric(as.character(x)))
Convert multiple columns from factor to numeric but obtaining NAs in R
as.character/as.numeric expects a vector as input. With df[, cols] you are passing a dataframe to it (check class(df[, cols])).
If you are talking about the accepted answer in the link, it says to change the code in the for loop and doesn't suggest passing the entire dataframe. To change the class of multiple columns you can use a for loop, apply or lapply.
df[cols] <- lapply(df[cols], function(x) as.numeric(as.character(x)))
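A small sketch of the difference, using an invented data frame `df_example` with two factor columns: selecting several columns still yields a data frame, while lapply hands each column to the conversion function as a plain vector.

```r
# Hypothetical data frame: two factor columns to convert, one to leave alone
df_example <- data.frame(
  a  = factor(c("1.5", "2")),
  b  = factor(c("10", "20")),
  id = c("x", "y"),
  stringsAsFactors = FALSE
)
cols <- c("a", "b")

# Selecting several columns gives a data frame, not a vector
print(class(df_example[, cols]))   # "data.frame"

# lapply passes one column (a vector) at a time to the function
df_example[cols] <- lapply(df_example[cols], function(x) as.numeric(as.character(x)))

result_classes <- sapply(df_example, class)
print(result_classes)
```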