Change stringsAsFactors Settings for data.frame

Change stringsAsFactors settings for data.frame

It depends on how you fill your data frame, for which you haven't given any code. When you construct a new data frame, you can do it like this:

x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)

In this case, if aVector is a character vector, for example, then the data frame column x$aName will be a character vector as well, not a factor. Combining it with an existing data frame (using rbind, cbind or similar) should preserve that type.
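A minimal sketch of the behaviour described above. The contents of aVector and bVector are made up for illustration, as is the second data frame y; only the data.frame call itself comes from the answer.

```r
# Hypothetical input vectors, for illustration only
aVector <- c("apple", "pear")
bVector <- c(1, 2)

x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)
class(x$aName)   # "character"

# Appending rows from another data frame built the same way
# keeps the column as a character vector
y <- data.frame(aName = "plum", bName = 3, stringsAsFactors = FALSE)
xy <- rbind(x, y)
class(xy$aName)  # "character"
```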

When you execute

options(stringsAsFactors = FALSE)

you change the global default setting. Every data frame you create after executing that line will not auto-convert strings to factors unless explicitly told to do so. If you only need to avoid conversion in a single place, then I'd rather not change the default. However, if this affects many places in your code, changing the default seems like a good idea.

One more thing: if your vector is already a factor, then neither of the above will change it back into a character vector. To do so, you should explicitly convert it back using as.character or similar.
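To illustrate that last point, here is a small sketch with a made-up factor f; the names f, d and v are assumptions for the example only.

```r
f <- factor(c("a", "b", "a"))   # already a factor

# stringsAsFactors = FALSE only affects character input,
# so the factor is kept as it is
d <- data.frame(v = f, stringsAsFactors = FALSE)
is.factor(d$v)      # TRUE

# An explicit conversion turns the column back into a character vector
d$v <- as.character(d$v)
is.character(d$v)   # TRUE
```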

How to disable stringsAsFactors=TRUE in data.frame permanently?

Set options(stringsAsFactors = FALSE) at the beginning of your R session, or in your .Rprofile.

As the comments below may suggest, stringsAsFactors is a bit of a controversial topic within the R community. How irritating you find this default value may depend somewhat on how much time you spend using R to fit many "standard" statistical models (lm, glm, etc). Many of those model fitting and related functions are built around using the factor data type.

If you spend most of your time doing other more "generic" types of data analysis, you might find this default more irritating.

It is widely considered dangerous to globally set stringsAsFactors = FALSE for the reasons mentioned below: it can cause significant confusion when sharing code. Indeed, even if you work mainly alone, participating in online communities like StackOverflow can be tricky if you insist on running R with stringsAsFactors = FALSE: your answer to a question may not work for the OP, or you may not be able to replicate errors others are seeing!

Of course, everyone can make their own choices about how best to manage these risks for themselves.

Reading in data.frames with stringsAsFactors = FALSE in R using chain operator

Though the statement should logically be data.frame(stringsAsFactors=FALSE) if you are applying chaining, even this statement doesn't produce the required output.

The reason is a misunderstanding of how the stringsAsFactors option works. It only takes effect when you build the data.frame column by column. Example:

a <- data.frame(x = c('a', 'b'), y = c(1, 2), stringsAsFactors = TRUE)
str(a)

'data.frame': 2 obs. of 2 variables:
$ x: Factor w/ 2 levels "a","b": 1 2
$ y: num 1 2

a <- data.frame(x = c('a', 'b'), y = c(1, 2), stringsAsFactors = FALSE)
str(a)

'data.frame': 2 obs. of 2 variables:
$ x: chr "a" "b"
$ y: num 1 2

If you pass an existing data.frame as input, the stringsAsFactors option has no effect on its columns.

Solution:

Store the chaining result to a variable like this:

library(rvest)
pvbData <- read_html(pvbURL)
# the XPath must be a quoted string; backticks would be parsed as an object name
pvbDF <- pvbData %>%
  html_nodes(xpath = '//*[@id="ajax_result_table"]') %>%
  html_table()

And then apply this command:

data.frame(as.list(pvbDF), stringsAsFactors = FALSE)

Update:

If the column is already a factor, this command will not convert it to a character vector. Convert it with as.character first, then retry.

You may refer to Change stringsAsFactors settings for data.frame for more details.

Convert data.frame columns from factors to characters

Just following on Matt and Dirk. If you want to recreate your existing data frame without changing the global option, you can recreate it with an apply statement:

bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)

This will convert all variables to class "character". If you only want to convert factors, see Marek's solution below.

As @hadley points out, the following is more concise.

bob[] <- lapply(bob, as.character)

In both cases, lapply outputs a list; however, owing to the magical properties of R, the use of [] in the second case keeps the data.frame class of the bob object, thereby eliminating the need to convert back to a data.frame using as.data.frame with the argument stringsAsFactors = FALSE.
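If only the factor columns should be converted, one possible approach is sketched below. It is not necessarily the solution the answer refers to; the data frame contents and the helper name is_fac are assumptions for illustration.

```r
# Hypothetical data frame with a numeric, a factor and a character column
bob <- data.frame(
  id    = c(1, 2, 3),
  group = factor(c("x", "y", "x")),
  note  = c("p", "q", "r"),
  stringsAsFactors = FALSE
)

# Find the factor columns, then convert only those;
# assigning into bob[is_fac] keeps bob a data.frame
is_fac <- vapply(bob, is.factor, logical(1))
bob[is_fac] <- lapply(bob[is_fac], as.character)

str(bob)
```

The numeric column id is left untouched, which is the difference from converting every column with as.character.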

Avoid (as)data.frame change data to factors when converting from zoo object

I found the solution based on a comment by @thelatemail. It works for the current version of zoo (as of Sept. 2017). As @G. Grothendieck commented, future versions of zoo will honor the stringsAsFactors = FALSE argument.

str(base:::as.data.frame(coredata(dtfz),stringsAsFactors = FALSE))
#'data.frame': 5 obs. of 2 variables:
# $ X1: chr "d" "d" "d" "d" ...
# $ X2: chr "d" "d" "d" "d" ...

mix `stringsAsFactors` in dataframe

# create the dataset (without rbind, if possible); stringsAsFactors = TRUE makes
# the character columns factors, which was the default before R 4.0.0
fruits <- data.frame(fruit = c("apple", "apple", "pear"),
                     path = c("jjrkgnser", "aprtgh", "akjreg"),
                     stringsAsFactors = TRUE)

# transform this column to a character vector
fruits$path <- as.character(fruits$path)

Factor variable shown as character variable in data frame after recent R update from 3.5.2 to 4.0.1

From what I know, you have two options:

  1. Set options(stringsAsFactors = TRUE) at the start of every R script.
  2. Set options(stringsAsFactors = TRUE) in your .Rprofile.

