Change stringsAsFactors settings for data.frame
It depends on how you fill your data frame, for which you haven't given any code. When you construct a new data frame, you can do it like this:
x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)
In this case, if e.g. aVector is a character vector, then the data frame column x$aName will be a character vector as well, and not a factor vector. Combining that with an existing data frame (using rbind, cbind or similar) should preserve that mode.
When you execute
options(stringsAsFactors = FALSE)
you change the global default setting. So every data frame you create after executing that line will not auto-convert to factors unless explicitly told to do so. If you only need to avoid conversion in a single place, then I'd rather not change the default. However, if this affects many places in your code, changing the default seems like a good idea.
One more thing: if your vector is already a factor, then neither of the above will change it back into a character vector. To do so, you should explicitly convert it back using as.character or similar.
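As a small illustration (the factor f and the column name level are hypothetical), an existing factor stays a factor even when stringsAsFactors = FALSE is passed, and only an explicit conversion changes it:

```r
f <- factor(c("low", "high", "low"))

# stringsAsFactors only affects character input, so the factor is kept as-is
df <- data.frame(level = f, stringsAsFactors = FALSE)
is.factor(df$level)      # TRUE

# Explicit conversion back to a character vector
df$level <- as.character(df$level)
df$level                 # "low" "high" "low"
```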
How to disable stringsAsFactors=TRUE in data.frame permanently?
Set options(stringsAsFactors = FALSE) at the beginning of your R session, or in your .Rprofile.
As the comments below may suggest, stringsAsFactors is a bit of a controversial topic within the R community. How irritating you find this default value may depend somewhat on how much time you spend using R to fit many "standard" statistical models (lm, glm, etc). Many of those model fitting and related functions are built around using the factor data type.
If you spend most of your time doing other more "generic" types of data analysis, you might find this default more irritating.
It is widely considered dangerous to globally set stringsAsFactors = FALSE for the reasons mentioned below: it can cause significant confusion when sharing code. Indeed, even if you work mainly alone, participating in online communities like StackOverflow can be tricky if you insist on running R with stringsAsFactors = FALSE: your answer to a question may not work for the OP, or you may not be able to replicate errors others are seeing!
Of course, everyone can make their own choices about how best to manage these risks for themselves.
Reading in Data.Frames with Strings as factors = False in R using chain operator
Though the statement should logically be data.frame(stringsAsFactors = FALSE) if you are applying chaining, even this statement doesn't produce the required output.
The reason is a misunderstanding of what the stringsAsFactors option does. It only takes effect when you build the data.frame column by column. Example:
a <- data.frame(x = c('a', 'b'), y = c(1, 2), stringsAsFactors = TRUE)
str(a)
# 'data.frame': 2 obs. of 2 variables:
# $ x: Factor w/ 2 levels "a","b": 1 2
# $ y: num 1 2
a <- data.frame(x = c('a', 'b'), y = c(1, 2), stringsAsFactors = FALSE)
str(a)
# 'data.frame': 2 obs. of 2 variables:
# $ x: chr "a" "b"
# $ y: num 1 2
If you pass an existing data.frame as input, the stringsAsFactors option has no effect.
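A short sketch of that last point, using a made-up data frame called src: a column that is already a factor comes out unchanged when the whole data frame is passed back through data.frame().

```r
# A data frame whose column x is already a factor (built explicitly here)
src <- data.frame(x = c("a", "b"), y = c(1, 2), stringsAsFactors = TRUE)

# Passing the whole data frame back in does not undo the factor
out <- data.frame(src, stringsAsFactors = FALSE)
class(out$x)   # "factor"
class(out$y)   # "numeric"
```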
Solution:
Store the chaining result to a variable like this:
library(rvest)
pvbData <- read_html(pvbURL)
pvbDF <- pvbData %>%
  html_nodes(xpath = '//*[@id="ajax_result_table"]') %>%
  html_table()
And then apply this command:
data.frame(as.list(pvbDF), stringsAsFactors = FALSE)
Update:
If the column is already a factor, then you can't convert it to a character vector using this command. Convert it with as.character first, then retry.
You may refer to Change stringsAsFactors settings for data.frame for more details.
Convert data.frame columns from factors to characters
Just following on from Matt and Dirk. If you want to recreate your existing data frame without changing the global option, you can recreate it with an lapply call:
bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)
This will convert all variables to class "character". If you want to convert only the factors, see Marek's solution below.
As @hadley points out, the following is more concise.
bob[] <- lapply(bob, as.character)
In both cases, lapply outputs a list; however, owing to the magical properties of R, the use of [] in the second case keeps the data.frame class of the bob object, thereby eliminating the need to convert back to a data.frame using as.data.frame with the argument stringsAsFactors = FALSE.
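To convert only the factor columns, one common approach is to index the factor columns first and apply as.character to just those. This is a sketch with a made-up bob; it may or may not match the solution referred to above.

```r
# Example data frame (illustrative, not the asker's data)
bob <- data.frame(name = factor(c("x", "y")),
                  n = c(10, 20),
                  stringsAsFactors = FALSE)

# Logical index of the columns that are factors
is_fac <- vapply(bob, is.factor, logical(1))

# Convert only those columns; other columns keep their type
bob[is_fac] <- lapply(bob[is_fac], as.character)
str(bob)
```

Assigning into bob[is_fac] rather than rebuilding the object keeps bob a data.frame, for the same reason as the bob[] form.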
Avoid (as)data.frame change data to factors when converting from zoo object
I found the solution based on a comment by @thelatemail. It works for the current version of zoo (Sept/2017). As @G. Grothendieck commented, future versions of zoo will honour the stringsAsFactors = FALSE argument.
str(base::as.data.frame(coredata(dtfz), stringsAsFactors = FALSE))
#'data.frame': 5 obs. of 2 variables:
# $ X1: chr "d" "d" "d" "d" ...
# $ X2: chr "d" "d" "d" "d" ...
mix `stringsAsFactors` in dataframe
# creating the dataset (no usage of rbind if possible); before R 4.0.0 the
# character columns become factors by default
fruits <- data.frame(fruit = c("apple", "apple", "pear"),
                     path = c("jjrkgnser", "aprtgh", "akjreg"))
# transform this column to a character vector
fruits$path <- as.character(fruits$path)
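An alternative sketch that does not depend on the default at all: switch automatic conversion off and mark the column that should be a factor explicitly with factor(). The data are the same as above; which column should be a factor is an assumption.

```r
fruits <- data.frame(fruit = factor(c("apple", "apple", "pear")),
                     path = c("jjrkgnser", "aprtgh", "akjreg"),
                     stringsAsFactors = FALSE)

class(fruits$fruit)  # "factor"
class(fruits$path)   # "character"
```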
Factor variable shown as character variable in data frame after recent R update from 3.5.2 to 4.0.1
From what I know, you have two options:
- Set options(stringsAsFactors = TRUE) at the start of each of your R scripts.
- Set options(stringsAsFactors = TRUE) in your .Rprofile.
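Since R 4.0.0 the default for stringsAsFactors is FALSE, which is why the columns now show up as character. If changing the global option is not desirable, a per-script alternative is to convert the character columns after the data frame has been created. This is a sketch with a made-up data frame, not the asker's data:

```r
# Illustrative data frame with a character column
df <- data.frame(group = c("a", "b", "a"),
                 value = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# Turn every character column into a factor; leave other columns untouched
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], factor)
str(df)
```

Passing stringsAsFactors = TRUE directly to data.frame() or read.table() in the calls that need it is another option with the same effect for those calls.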