Convert data.frame columns from factors to characters
Just following on Matt and Dirk. If you want to recreate your existing data frame without changing the global option, you can recreate it with an apply statement:
bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)
This will convert all variables to class "character". If you only want to convert factors, see Marek's solution below.
As @hadley points out, the following is more concise.
bob[] <- lapply(bob, as.character)
In both cases, lapply outputs a list; however, owing to the magical properties of R, the use of [] in the second case keeps the data.frame class of the bob object, thereby eliminating the need to convert back to a data.frame using as.data.frame with the argument stringsAsFactors = FALSE.
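A minimal sketch of the difference, using a small made-up data frame (the name bob follows the answer above; its columns are assumptions for illustration):

```r
# Hypothetical data frame with one factor column and one numeric column
bob <- data.frame(
  name  = factor(c("a", "b", "c")),
  score = c(1.5, 2.5, 3.5)
)

# lapply() on its own returns a plain list, not a data frame
converted <- lapply(bob, as.character)
class(converted)

# Assigning into bob[] replaces the columns but keeps the data.frame wrapper
bob[] <- lapply(bob, as.character)
class(bob)

# Every column is now character, including the formerly numeric score
sapply(bob, class)
```

Note that the numeric column is converted as well, which is why the factor-only variant may be preferable.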
convert factor and character to numeric in a dataframe
It would help to have some example data to work with, but try:
df$your_factor_variable_now_numeric <-
as.numeric(as.character(df$your_old_factor_variable))
Use this only to convert a single factor variable, not the complete dataframe. You can also have a look at type.convert. If you want to convert all factors in the dataframe, you can use something along these lines:
df[] <- lapply(df, function(x) as.numeric(as.character(x)))
Note that this converts all factors and might not be what you want if you have factors that do not represent numeric values. If unnecessary conversion is a problem, or if there are non-numeric factors or characters in the data, the following would be appropriate:
numerify <- function(x) if(is.factor(x)) as.numeric(as.character(x)) else x
df[] <- lapply(df, numerify)
On a more general point though, the type of your variables should not prevent you from filtering, if, with filtering, you mean subsetting the dataframe. However, the type conversion should be solved with the above code.
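The intermediate as.character() step matters because as.numeric() applied directly to a factor returns the internal integer codes rather than the values the labels represent. A small illustration with a made-up factor:

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "20", "10", "30"))

# Direct conversion yields the level codes: 1 2 1 3
codes <- as.numeric(f)
codes

# Going through character first yields the labelled values: 10 20 10 30
values <- as.numeric(as.character(f))
values
```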
Change factor levels and rearrange dataframe
This mistake is easy to make. You have to supply the column vector to fct_relevel, like so:
library(dplyr, warn.conflicts = FALSE)
library(forcats)
df <-
structure(
list(layer = structure(
1:5,
.Label = c(
'CEOS and managers',
'Clerks and services',
'Production',
'Professionals',
'Technicians'
),
class = 'factor'
)),
row.names = c(NA,-5L),
class = c('tbl_df', 'tbl', 'data.frame')
)
df %>%
mutate(layer = forcats::fct_relevel(
layer,c(
'CEOS and managers',
'Professionals',
'Technicians',
'Clerks and services',
'Production'))) %>%
arrange(layer)
#> # A tibble: 5 x 1
#> layer
#> <fct>
#> 1 CEOS and managers
#> 2 Professionals
#> 3 Technicians
#> 4 Clerks and services
#> 5 Production
Created on 2021-01-11 by the reprex package (v0.3.0)
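If forcats is not available, a comparable result can be sketched in base R by rebuilding the factor with an explicit levels argument. This is a swapped-in alternative to fct_relevel, not the method used in the answer above; the name df_base is made up, and the labels mirror the example:

```r
# Same labels as the example above, with default level ordering
layer <- factor(c("CEOS and managers", "Clerks and services", "Production",
                  "Professionals", "Technicians"))
df_base <- data.frame(layer = layer)

# Desired order of the levels
new_order <- c("CEOS and managers", "Professionals", "Technicians",
               "Clerks and services", "Production")

# Rebuild the factor with explicit levels, then sort rows by level order
df_base$layer <- factor(df_base$layer, levels = new_order)
df_base <- df_base[order(df_base$layer), , drop = FALSE]
as.character(df_base$layer)
```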
How to convert data.frame column from Factor to numeric
breast$class <- as.numeric(as.character(breast$class))
If you have many columns to convert to numeric:
indx <- sapply(breast, is.factor)
breast[indx] <- lapply(breast[indx], function(x) as.numeric(as.character(x)))
Another option is to use stringsAsFactors=FALSE while reading the file using read.table or read.csv.
Just in case, here are other options to create/change columns:
breast[,'class'] <- as.numeric(as.character(breast[,'class']))
or
breast <- transform(breast, class=as.numeric(as.character(class)))
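As a rough sketch of the read-time option, the snippet below reads a small inline CSV instead of a file (the name breast_demo, the column names, and the values are all made up). Since R 4.0.0 the default for stringsAsFactors is already FALSE, so the argument mostly matters on older versions:

```r
# Inline CSV standing in for a real file; contents are hypothetical
csv_text <- "id,class\n1,2\n2,4\n3,benign"

# Strings are kept as character vectors instead of being turned into factors
breast_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(breast_demo)

# Entries that are not numbers become NA (normally with a warning)
class_num <- suppressWarnings(as.numeric(breast_demo$class))
class_num
```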
Replace contents of factor column in R dataframe
I bet the problem is when you are trying to replace values with a new one, one that is not currently part of the existing factor's levels:
levels(iris$Species)
# [1] "setosa" "versicolor" "virginica"
Your example was bad; this works:
iris$Species[iris$Species == 'virginica'] <- 'setosa'
This is what more likely creates the problem you were seeing with your own data:
iris$Species[iris$Species == 'virginica'] <- 'new.species'
# Warning message:
# In `[<-.factor`(`*tmp*`, iris$Species == "virginica", value = c(1L, :
# invalid factor level, NAs generated
It will work if you first increase your factor levels:
levels(iris$Species) <- c(levels(iris$Species), "new.species")
iris$Species[iris$Species == 'virginica'] <- 'new.species'
If you want to replace "species A" with "species B" you'd be better off with
levels(iris$Species)[match("oldspecies",levels(iris$Species))] <- "newspecies"
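To see the level-renaming approach on real labels, here is a small sketch on a copy of iris; the copy name iris2 and the replacement label are arbitrary choices for illustration:

```r
# Work on a copy so the built-in dataset is left untouched
iris2 <- iris

# Rename the level itself; every row carrying it is updated in one step
levels(iris2$Species)[match("virginica", levels(iris2$Species))] <- "new.species"

levels(iris2$Species)
table(iris2$Species)
```

Because the level is renamed rather than values being reassigned, no NAs are generated and no unused level is left behind.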
Pandas: convert categories to numbers
First, change the type of the column:
df.cc = pd.Categorical(df.cc)
Now the data look similar but are stored categorically. To capture the category codes:
df['code'] = df.cc.cat.codes
Now you have:
   cc  temp  code
0  US  37.0     2
1  CA  12.0     1
2  US  35.0     2
3  AU  20.0     0
If you don't want to modify your DataFrame but simply get the codes:
df.cc.astype('category').cat.codes
Or use the categorical column as an index:
df2 = pd.DataFrame(df.temp)
df2.index = pd.CategoricalIndex(df.cc)
Need an efficient way to change factor values from one column of a data frame to another columns
We can use fct_collapse, which returns a factor with new levels:
library(dplyr)
library(forcats)
library(magrittr)
df %<>%
mutate(B = fct_collapse(B, CHANGED = as.character(B)[A== "Kelly"]))
glimpse(df)
#Rows: 7
#Columns: 2
#$ A <fct> Jerry, Kelly, Kelly, Lion, Zebra, Bear, Kelly
#$ B <fct> Eats, CHANGED, CHANGED, Roars, Runs, Sleeps, CHANGED
Convert the factors of a variable into the columns of the dataframe
Use an id variable for rows by group:
dat %>%
group_by(Concentration) %>%
mutate(id = row_number()) %>%
pivot_wider(names_from = Concentration, values_from = Value)
id Low Medium High
<int> <dbl> <dbl> <dbl>
1 1 0.21 0.85 2.21
2 2 0.1 0.5 1.85
3 3 0.36 NA NA
Convert data.frame column format from character to factor
Hi, welcome to the world of R.
mtcars #look at this built in data set
str(mtcars) #allows you to see the classes of the variables (all numeric)
#one approach is to index with the $ sign and the as.factor function
mtcars$am <- as.factor(mtcars$am)
#another approach
mtcars[, 'cyl'] <- as.factor(mtcars[, 'cyl'])
str(mtcars) # now look at the classes
The same pattern also works for character, dates, integers and other classes, using the matching as.* function.
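For instance, the same assignment pattern with other as.* converters might look like the sketch below (the copy name cars2 and the choice of columns are arbitrary):

```r
# Work on a copy of the built-in dataset
cars2 <- mtcars

# Same pattern as as.factor(), with different target classes
cars2$gear <- as.integer(cars2$gear)
cars2$carb <- as.character(cars2$carb)

# Check the resulting classes of the two converted columns
sapply(cars2[c("gear", "carb")], class)
```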
Since you're new to R I'd suggest you have a look at these two websites:
R reference manuals: http://cran.r-project.org/manuals.html
R Reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf