typeof returns integer for something that is clearly a factor
If what you wanted to know was "what class does this vector have?", use class(). If you wanted to test whether a vector is a factor, use is.factor().
The value returned by typeof() being "integer" for factors is a language feature that confused me as well in my early days of R programming. The typeof() function gives information at a "lower" level of abstraction: it reports how the data is stored. Factor variables are stored as integers; Dates and date-times are normally stored as doubles, for which mode() reports "numeric". Learn to use class() or str() rather than typeof() (or mode()); they give more useful information. You can look at the full "structure" of a factor variable with dput():
dput( factor( rep( letters[1:5], 2) ) )
# structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
.Label = c("a", "b", "c", "d", "e"), class = "factor")
The character values that are usually thought of as the factor's values are actually stored in an attribute (shown as .Label in the dput() output above, and returned by levels()), while the "main" part of the variable is a vector of integer indices pointing into those levels. That is why mode() returns "numeric" and typeof() returns "integer". For this reason one usually needs as.character() to coerce a factor to what most people think of as its values, namely their character representations.
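As a minimal sketch of the point above (the vector f is made up for illustration), the following compares what the inspection functions report for a factor, and shows why a factor whose labels look like numbers has to go through as.character() before being converted to numbers:

```r
# Hypothetical example vector; the labels happen to look like numbers
f <- factor(c("10", "20", "10", "30"))

typeof(f)    # "integer": the underlying storage
mode(f)      # "numeric"
class(f)     # "factor": the class attribute
levels(f)    # "10" "20" "30": the labels kept in the attribute

as.integer(f)                 # 1 2 1 3: the internal codes, not the labels
as.character(f)               # "10" "20" "10" "30"
as.numeric(as.character(f))   # 10 20 10 30
```

Calling as.numeric(f) directly would return the codes 1 2 1 3, which is a common source of silent errors.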
Odd type of a column in a Data Frame
The state column is a factor.
class(incomeAndState$state)
#[1] "factor"
and factors are internally stored as integers hence you see
typeof(incomeAndState$state)
#[1] "integer"
which can also be verified with mode
mode(incomeAndState$state)
#[1] "numeric"
This can be avoided if you use stringsAsFactors = FALSE while constructing the data frame (this has been the default since R 4.0.0):
incomeAndState <- data.frame(state=state, income=income, stringsAsFactors = FALSE)
which will give you
typeof(incomeAndState$state)
#[1] "character"
mode(incomeAndState$state)
#[1] "character"
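A self-contained sketch of the two cases follows. The state and income vectors are invented stand-ins, since the original data is not shown, and stringsAsFactors is passed explicitly in both calls so the result does not depend on the default of the R version in use:

```r
# Hypothetical stand-in data
state  <- c("CA", "NY", "TX")
income <- c(52000, 61000, 48000)

# Character column converted to a factor
as_factor <- data.frame(state = state, income = income,
                        stringsAsFactors = TRUE)
# Character column kept as character
as_char <- data.frame(state = state, income = income,
                      stringsAsFactors = FALSE)

class(as_factor$state)    # "factor"
typeof(as_factor$state)   # "integer"
class(as_char$state)      # "character"
typeof(as_char$state)     # "character"
```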
Data type double not converting to factor
The issue here is that typeof() checks the internal representation of an object, and factors are represented as integers. To check that something is actually a factor, use is.factor() instead. From the docs:
typeof determines the (R internal) type or storage mode of any object
To verify this claim, you can check the well-known Species column of the iris data set, which is a factor. typeof(iris$Species) will nevertheless return "integer", because to R factors are integers. Using is.factor() is the better option; this ultimately boils down to the difference between types and classes in R.
is.factor(iris$Species)
#[1] TRUE
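To make the type/class distinction concrete, here is a small sketch on the same column. It only uses base R functions; the variable names x and codes are chosen for this illustration:

```r
x <- iris$Species

typeof(x)              # "integer": how the data is stored
class(x)               # "factor": how R treats the object
is.factor(x)           # TRUE
inherits(x, "factor")  # TRUE, a class test that works for any class name

# Stripping the class leaves the integer codes (plus the levels attribute)
codes <- unclass(x)
is.factor(codes)       # FALSE
typeof(codes)          # "integer"
```

The stored data is the same before and after unclass(); only the class attribute decides whether R treats it as a factor.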
Return factor level from a function, not an integer in R
ifelse() drops the factor attributes and returns the underlying integer codes, so you can use its output to index the levels of the variable classes, as follows:
data <- data.frame(a = 1:10)
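find_class <- function(i) {
  classes <- factor(c('A', 'B', 'C'))
  # ifelse() returns the integer codes of the factor elements, not the labels
  idx <- ifelse(i %in% c(1, 3, 5), classes[1],
                ifelse(i %in% c(2, 4, 9), classes[2], classes[3]))
  # map the codes back to labels, then rebuild the factor
  res <- levels(classes)[idx]
  factor(res, levels(classes))
}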
data$class <- find_class(data$a)
data$class
# [1] A B A B A C C C B C
# Levels: A B C
data
# a class
# 1 1 A
# 2 2 B
# 3 3 A
# 4 4 B
# 5 5 A
# 6 6 C
# 7 7 C
# 8 8 C
# 9 9 B
# 10 10 C
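One possible alternative, shown only as a sketch and not as the approach from the answer above, is to let ifelse() work on plain character labels and convert to a factor once at the end. The name find_class_chr is made up for this illustration:

```r
# Hypothetical variant: build character labels first, convert to factor last
find_class_chr <- function(i) {
  labels <- ifelse(i %in% c(1, 3, 5), "A",
                   ifelse(i %in% c(2, 4, 9), "B", "C"))
  # fixing the levels keeps all three classes even if one is absent
  factor(labels, levels = c("A", "B", "C"))
}

res <- find_class_chr(1:10)
as.character(res)   # "A" "B" "A" "B" "A" "C" "C" "C" "B" "C"
```

This avoids the detour through integer codes, at the cost of writing the labels out twice.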
Check whether a factor variable is of type integer or float
Using the following code you can check whether the levels, once converted to numeric, are all whole numbers:
vals <- as.numeric(levels(data_rating$rating))
isTRUE(all.equal(vals, as.integer(vals)))
If that returns TRUE, every level is an integer value; if it returns FALSE, at least one level is a float. (isTRUE() is needed because all.equal() returns a description of the difference, not FALSE, when the values differ.)
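The check can be tried on small made-up factors. In this sketch the helper name levels_are_whole and both example factors are hypothetical; note also that all.equal() compares with a small numerical tolerance rather than exactly:

```r
# Hypothetical helper wrapping the check described above
levels_are_whole <- function(f) {
  vals <- as.numeric(levels(f))
  # as.integer() truncates, so any fractional part makes the vectors differ
  isTRUE(all.equal(vals, as.integer(vals)))
}

whole      <- factor(c("1", "2", "5", "2"))
fractional <- factor(c("1.5", "2", "4.5"))

levels_are_whole(whole)        # TRUE
levels_are_whole(fractional)   # FALSE
```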
R Remove an integer element from a factor-type vector
We could use !is.na(as.numeric()) to identify the strings that are numeric and remove them. as.numeric() returns NA, with a coercion warning, for any string that is not a number.
onlynumbers <- "123.4"
onlyletters <- "abcd."
strings <- c(onlynumbers, onlyletters)
!is.na(as.numeric(strings))
#[1]  TRUE FALSE
As you can see, this works. Now the removal:
result <- strings[is.na(as.numeric(strings))]
result
#[1] "abcd."
EDIT: You should first convert your factor to character using as.character(), and afterwards you can convert the result back using as.factor().
EDIT 2: To keep the names you could use names(result) <- names(strings)[is.na(as.numeric(strings))]
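Putting the steps together for an actual factor might look like the sketch below. The input vector f is invented for illustration, and suppressWarnings() is used here only because the coercion warning is expected:

```r
# Hypothetical factor mixing numeric-looking and non-numeric labels
f <- factor(c("123.4", "abcd.", "7", "xyz"))

# Work on the labels, not on the internal integer codes
chr <- as.character(f)

# TRUE where the label parses as a number
is_number <- !is.na(suppressWarnings(as.numeric(chr)))

# Keep the non-numeric elements and convert back to a factor
result <- factor(chr[!is_number])
levels(result)   # "abcd." "xyz"
```

Rebuilding with factor() rather than subsetting f directly also drops the levels that are no longer used.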