Error in running factor() on a column of a data frame
Your data is a tbl_df. I don't have your data, but we can look at an example using mtcars.
library(dplyr)
tbl_df(mtcars)[, "mpg"]
# Source: local data frame [32 x 1]
#
# mpg
# (dbl)
# 1 21.0
# 2 21.0
# 3 22.8
# 4 21.4
# 5 18.7
# 6 18.1
# 7 14.3
# 8 24.4
# 9 22.8
# 10 19.2
# .. ...
It's still a data frame, whereas in base R it would have been dropped to an atomic vector. dplyr:::`[.tbl_df` does not drop single columns, as `[.data.frame` from base R does. This is why we can't run factor() on it.
factor(tbl_df(mtcars)[, "mpg"])
# Error in sort.list(y) : 'x' must be atomic for 'sort.list'
# Have you called 'sort' on a list?
So you'll need to use [[, as in df[["my_col"]], or just use $.
df[["my_col"]] <- factor(df[["my_col"]])
Note: When you use the $ operator you can do it without the quotes around the column name.
df$my_col <- factor(df$my_col)
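As a minimal base R sketch of the same distinction (no dplyr needed): drop = FALSE is used here only to imitate the tibble behaviour of keeping a single column as a data frame, while [[ and $ return the underlying vector that factor() expects. The variable names are illustrative.

```r
# Single-bracket subsetting with drop = FALSE keeps a one-column data frame,
# which mirrors what `[` returns for a tibble.
as_frame <- mtcars[, "mpg", drop = FALSE]
is.data.frame(as_frame)   # TRUE
is.atomic(as_frame)       # FALSE

# Double brackets (or $) extract the column itself as an atomic vector.
as_vector <- mtcars[["mpg"]]
is.atomic(as_vector)      # TRUE

# factor() works on the extracted vector.
mpg_factor <- factor(as_vector)
nlevels(mpg_factor)       # number of distinct mpg values
```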
Why do I get this error while running factor() on a column of a data.frame
The column was created with dplyr's mutate() from a list(), so it was stored as a list column rather than an atomic vector. To solve it, unlist the column first:
mydata$finding<-unlist(mydata$finding)
factor(mydata$finding)
Now it works.
Credit to @User20650 for the solution.
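A small self-contained sketch of the list-column situation, using a made-up data frame (the names toy and finding are illustrative, and the list column is created directly in base R rather than with mutate()). Note that unlist() only lines up with the rows when every list element has length one.

```r
# Build a data frame whose 'finding' column is a list, not an atomic vector.
toy <- data.frame(id = 1:3)
toy$finding <- list("a", "b", "a")
was_list <- is.list(toy$finding)     # TRUE: factor() would not accept this

# Flatten the list column into a character vector, then convert to factor.
toy$finding <- unlist(toy$finding)
finding_factor <- factor(toy$finding)
levels(finding_factor)               # "a" "b"
```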
R: In a data frame, I get an error using a factor variable's level
The reason it doesn't work is that you are comparing a tibble with a tibble. I suggest reading Hadley Wickham's R book, where it's written:
Subsetting a tibble with [ always returns a tibble:
We can try an example:
library(readr)  # provides read_csv(), cols(), col_integer(), col_factor()
sizes <- factor(c(1,2,3,7,9,2,1,3,7,3,9,2,3), levels = c(1,3,2,7,9), ordered = TRUE)
write.csv(data.frame(A = 1:length(sizes), sizes = sizes), "test.csv", row.names = FALSE)
A_Dataset <- read_csv("test.csv",
                      col_types = cols(A = col_integer(),
                                       sizes = col_factor(levels = c("1", "3", "2", "7", "9"))))
A_Dataset$sizes = factor(A_Dataset$sizes, levels = c(1,3,2,7,9), ordered = TRUE)
If you look at the class:
class(A_Dataset[1,2])
[1] "tbl_df" "tbl" "data.frame"
You cannot compare these one-cell tibbles directly. Instead, extract the column with $ and compare the factor elements:
class(A_Dataset$sizes[2])
[1] "ordered" "factor"
A_Dataset$sizes[2] > A_Dataset$sizes[1]
[1] TRUE
And this works:
as.data.frame(A_Dataset[2,2]) >as.data.frame(A_Dataset[1,2])
sizes
[1,] TRUE
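The comparison can also be sketched without readr, assuming a plain data.frame built from the same sizes vector (toy_sizes is an illustrative name). It also shows that an ordered factor compares by the position of each value in levels, not by numeric value.

```r
sizes <- factor(c(1, 2, 3, 7, 9, 2, 1, 3, 7, 3, 9, 2, 3),
                levels = c(1, 3, 2, 7, 9), ordered = TRUE)
toy_sizes <- data.frame(A = seq_along(sizes), sizes = sizes)

# Keeping the data frame structure (as a tibble always does with `[`)
# yields a one-cell data frame, not a factor.
cell_class <- class(toy_sizes[2, "sizes", drop = FALSE])   # "data.frame"

# Extracting the column first gives ordered-factor elements that can be compared.
second_gt_first <- toy_sizes$sizes[2] > toy_sizes$sizes[1]  # value 2 vs value 1
third_lt_second <- toy_sizes$sizes[3] < toy_sizes$sizes[2]  # value 3 vs value 2

second_gt_first   # TRUE
third_lt_second   # TRUE, because 3 comes before 2 in the declared level order
```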
Converting DF columns to factor is less than straightforward
To change multiple columns to factor, use:
DF[,1:3] <- lapply(DF[,1:3], factor)
To change from factor to numeric, remember to use as.numeric(as.character(x)), like this:
DF[,1:3] <- lapply(DF[,1:3], function(x) as.numeric(as.character(x)))
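A short sketch of why the as.character() step matters, using a made-up factor: as.numeric() on a factor returns the internal integer codes, not the values shown in the labels.

```r
f <- factor(c("10", "20", "10", "5"))

# Levels are sorted as character strings: "10" "20" "5"
levels(f)

# as.numeric() alone returns the level codes.
codes <- as.numeric(f)                  # 1 2 1 3

# Going through character recovers the original numbers.
values <- as.numeric(as.character(f))   # 10 20 10 5
```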
Only certain values of column as levels in factor
Yes. Use the labels option:
x <- c("a","a","b","b","happy", "sad", "angry")
levels = c("a", "b", "happy", "sad", "angry")
labels = c("letter", "letter", "happy", "sad", "angry")
y <- factor(x, levels, labels = labels)
y
https://rdrr.io/r/base/factor.html
"Duplicated values in labels can be used to map different values of x to the same factor level."
EDIT: Your mistake in the above code example is the nested vector.
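To see what the duplicated labels do, the example can be inspected as follows. This is a sketch that assumes an R version where factor() accepts duplicated labels (3.4.0 or later); lv and lb are illustrative names for the levels and labels vectors.

```r
x  <- c("a", "a", "b", "b", "happy", "sad", "angry")
lv <- c("a", "b", "happy", "sad", "angry")
lb <- c("letter", "letter", "happy", "sad", "angry")

y <- factor(x, levels = lv, labels = lb)

# "a" and "b" are merged into the single level "letter".
levels(y)        # "letter" "happy" "sad" "angry"
as.integer(y)    # 1 1 1 1 2 3 4
```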
Error when mutating a dataframe in R to add a column with an if condition
We just need to change the 'date' column to Date class and it should work:
data.cur$date <- as.Date(data.cur$date)
The error arises mainly because a factor column is being compared where a Date class is required.
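A minimal sketch of the conversion with made-up dates (toy_dates and cutoff are illustrative names, and the ISO yyyy-mm-dd format is an assumption about the data): comparing an unordered factor with > yields NA, while the same comparison on a Date column behaves as expected.

```r
toy_dates <- data.frame(date = factor(c("2020-01-15", "2021-06-30")))
cutoff <- as.Date("2020-12-31")

# Comparing an unordered factor is not meaningful: R warns and returns NA.
factor_cmp <- suppressWarnings(toy_dates$date > "2020-12-31")

# Convert via character to Date, then compare.
toy_dates$date <- as.Date(as.character(toy_dates$date))
date_cmp <- toy_dates$date > cutoff   # FALSE TRUE
```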
R: unused argument in levels
The help page ?as.factor shows that the function takes only one argument (in your case filtered_table$column), so the error message indicates that there is no parameter to match the second argument you supplied in the call. To specify the levels explicitly, use the factor() function.
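A small sketch of the difference, with an illustrative vector x and level order lv: passing levels to as.factor() fails, while factor() accepts it.

```r
x  <- c("low", "high", "mid", "low")
lv <- c("low", "mid", "high")

# as.factor() has a single parameter, so an extra 'levels' argument is an error.
result <- tryCatch(as.factor(x, levels = lv), error = function(e) e)
failed <- inherits(result, "error")   # TRUE

# factor() takes 'levels' and keeps the order given.
f <- factor(x, levels = lv)
levels(f)                             # "low" "mid" "high"
```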
Running into R error with matching data frame columns
Consider forgoing the for loop and using the base R merge() function on both data frames. However, a little data management is needed: 1) temporarily convert factors to characters (or use stringsAsFactors=FALSE in read.csv() or read.table()) and 2) add suffixes to repeated column names. Once the new MAF is calculated with ifelse(), split the merged data frame and reset column names and data types to the original structure:
# CONVERT FACTORS TO CHARACTER
gwas.data[, c("A1","A2")] <- sapply(gwas.data[,c("A1","A2")],as.character)
# SUFFIXING COL NAMES TO IDENTIFY IN MERGED DF
names(gwas.data) <- paste0(names(gwas.data), "_A")
# CONVERT FACTORS TO CHARACTER
correct.orientation[, c("A1","A2")] <- sapply(correct.orientation[,c("A1","A2")],as.character)
# SUFFIXING COL NAMES TO IDENTIFY IN MERGED DF
names(correct.orientation) <- paste0(names(correct.orientation ), "_B")
# MERGE DATA FRAMES (ASSUMING SNP IS UNIQUE IDENTIFIER)
comparedf <- merge(gwas.data, correct.orientation, by.x="SNP_A", by.y="SNP_B", all=TRUE)
# CALCULATE NEW MAF
comparedf$MAF_A <- ifelse(((comparedf$A1_A == comparedf$A2_B) &
                           (comparedf$A2_A == comparedf$A1_B)),
                          (1 - comparedf$MAF_A),
                          comparedf$MAF_A)
comparedf$zscore_A <- ifelse(((comparedf$A1_A == comparedf$A2_B) &
                              (comparedf$A2_A == comparedf$A1_B)),
                             -1 * comparedf$zscore_A,
                             comparedf$zscore_A)
# SPLIT MERGE BACK TO ORIGINAL STRUCTURE
newgwas.data <- comparedf[,names(gwas.data)]
# REMOVE SUFFIX
names(newgwas.data) <- sub("_A$", "", names(newgwas.data))
# RESET FACTORS
newgwas.data$A1 <- as.factor(newgwas.data$A1)
newgwas.data$A2 <- as.factor(newgwas.data$A2)
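The merge-then-ifelse() idea can be tried on toy data. This is a sketch under stated assumptions: the two small data frames, their values, and the "flipped" rule (A1/A2 swapped between the two tables) are made up for illustration, and merge()'s own suffixes argument is used instead of renaming columns by hand.

```r
# Toy inputs; characters rather than factors so == compares values directly.
gwas <- data.frame(SNP = c("rs1", "rs2"),
                   A1 = c("A", "C"), A2 = c("G", "T"),
                   MAF = c(0.2, 0.4), zscore = c(1.5, -2),
                   stringsAsFactors = FALSE)
ref <- data.frame(SNP = c("rs1", "rs2"),
                  A1 = c("G", "C"), A2 = c("A", "T"),
                  stringsAsFactors = FALSE)

# Shared column names (A1, A2) get the _A / _B suffixes automatically.
merged <- merge(gwas, ref, by = "SNP", suffixes = c("_A", "_B"))

# Assumed rule: a SNP is flipped when its alleles are swapped relative to ref.
flipped <- merged$A1_A == merged$A2_B & merged$A2_A == merged$A1_B

merged$MAF    <- ifelse(flipped, 1 - merged$MAF, merged$MAF)
merged$zscore <- ifelse(flipped, -1 * merged$zscore, merged$zscore)
```

Here only rs1 has swapped alleles, so only its MAF and zscore change.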