`levels<-`( What sorcery is this?
The answers here are good, but they are missing an important point. Let me try and describe it.
R is a functional language and does not like to mutate its objects. But it does allow assignment statements, using replacement functions:
levels(x) <- y
is equivalent to
x <- `levels<-`(x, y)
The trick is, this rewriting is done by <-; it is not done by levels<-. levels<- is just a regular function that takes an input and gives an output; it does not mutate anything.
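A minimal sketch of that point, using a small made-up factor (the values and level names are illustrative only): calling `levels<-` directly returns a new factor and leaves the original alone, while the assignment form is the same call followed by a rebinding of x.

```r
x <- factor(c("a", "b", "a"))

# Calling the replacement function directly returns a new factor...
y <- `levels<-`(x, c("low", "high"))
levels(y)

# ...and leaves x untouched.
levels(x)

# The assignment form is the same call plus a rebinding of x.
levels(x) <- c("low", "high")
identical(x, y)
```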
One consequence of that is that, according to the above rule, <- must be recursive:
levels(factor(x)) <- y
is
factor(x) <- `levels<-`(factor(x), y)
is
x <- `factor<-`(x, `levels<-`(factor(x), y))
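Base R does not ship a `factor<-` function, so the expansion above is best read as illustrative. The same rule can be checked with two replacement functions that do exist, `names<-` and `[<-`. The vector below is made up for the example, and the hand-written expansion is a sketch of the rewriting rule, not the exact code R generates internally.

```r
x <- c(a = 1, b = 2, c = 3)

# Nested replacement call...
x1 <- x
names(x1)[2] <- "B"

# ...written out by hand, applying the rewriting rule twice.
x2 <- x
x2 <- `names<-`(x2, `[<-`(names(x2), 2, "B"))

identical(x1, x2)
names(x2)
```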
It's kind of beautiful that this pure-functional transformation (up until the very end, where the assignment happens) is equivalent to what an assignment would be in an imperative language. If I remember correctly this construct in functional languages is called a lens.
But then, once you have defined replacement functions like levels<-, you get another, unexpected windfall: you don't just have the ability to make assignments, you have a handy function that takes in a factor, and gives out another factor with different levels. There's really nothing "assignment" about it!
So, the code you're describing is just making use of this other interpretation of levels<-. I admit that the name levels<- is a little confusing because it suggests an assignment, but this is not what is going on. The code is simply setting up a sort of pipeline:
Start with dat$product
Convert it to a factor
Change the levels
Store that in res
Personally, I think that line of code is beautiful ;)
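As a sketch of that pipeline: the data frame and the level mapping below are hypothetical stand-ins (the question's actual data is not reproduced here), and the list form of the new levels is just one way to express the change.

```r
# Hypothetical data: product codes 1..6 (not from the original question).
dat <- data.frame(product = c(1, 2, 3, 4, 5, 6))

# Start with dat$product, convert it to a factor, change the levels,
# and store the result in res. The list maps new level names to old levels.
res <- `levels<-`(factor(dat$product), list(Tylenol = 1:3, Advil = 4:6))

levels(res)
as.character(res)
```

Nothing is assigned to dat here; the only binding created is res.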
What do 'names<-' and 'class<-' do in R codes below?
‘class<-‘ (with single quotes, whether fancy ‘ or straight-ASCII ') is wrong, likely a side effect of copying and pasting and/or misinterpreting somebody else's code. (MS Word does this a lot ... never never never write/edit R code in Word; R is intolerant of these fancy niceties :-).
Typically, it should be backticks instead, as in `class<-` and `names<-`.
Usually, one would see something like
vec <- c(1, 3, 11)
names(vec) <- c("a", "BB", "quux")
vec
#    a   BB quux
#    1    3   11
The names(vec) <- ... line is actually calling a special form of function, `names<-` (not names itself), which is what R uses when names(...) appears on the left-hand side (LHS) of an assignment operator (= or <- in R).
Sometimes, coders try to be slick (code-golf) by calling the function directly so that they can get the returned value without necessarily changing the original variable's value. For instance:
vec <- c(1, 3, 11)
`names<-`(vec, c("a", "BB", "quux"))
#    a   BB quux
#    1    3   11
vec
# [1]  1  3 11
`class<-` works the same way: it changes the class of an object "inline", returning the modified copy.
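A small sketch of the `class<-` case, assuming a made-up class name ("myclass" is hypothetical and has no methods attached):

```r
x <- 1:3

# Returns a copy of x carrying the new class attribute.
y <- `class<-`(x, "myclass")

class(y)
class(x)  # x itself is unchanged

# Equivalent two-step form using the assignment syntax.
z <- x
class(z) <- "myclass"
identical(y, z)
```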
Factor levels by group
A data.table solution:
library(data.table)
# dt is assumed to be a data.table with columns Sex and Height
dt[, height_cat := cut(Height, breaks = c(0, 165, 180, 300), right = FALSE)]
dt[, height_f :=
     factor(
       paste(Sex, height_cat, sep = ":"),
       levels = dt[, CJ(Sex, height_cat, unique = TRUE)][, paste(Sex, height_cat, sep = ":")]
     )]
table(dt$height_f)
#   F:[0,165) F:[165,180) F:[180,300)   M:[0,165) M:[165,180) M:[180,300)
#           2           2           0           0           2           2
How does the `[<-` function work in R?
This is what you would need to do to get the assignment to stick:
`<-`(`[`(x, 1, 2), 7)  # or x <- `[<-`(x, 1, 2, 7)
x
     [,1] [,2]
[1,]    1    7
[2,]    2    4
Essentially what is happening is that <- sees the call to [ on its left-hand side and, rather than evaluating it, rewrites the expression as a call to `[<-`, which returns a modified copy of x. Then <- (which behaves much like assign, but can also be used in infix notation) does the actual "permanent" assignment by binding that result to x. Do not be misled into thinking this is a call-by-reference assignment. I'm reasonably sure there will still be a temporary value of x created.
Your version did make a subassignment (as can be seen by what it returned) but that assignment was only in the local environment of the call to `[<-`, which did not encompass the global environment.
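A minimal sketch of the difference, assuming x is the 2 x 2 matrix implied by the output above (matrix(1:4, nrow = 2) is an assumption about how it was built):

```r
x <- matrix(1:4, nrow = 2)

# Direct call: the modified matrix is returned, but x is not rebound.
y <- `[<-`(x, 1, 2, 7)
x[1, 2]  # still the original value
y[1, 2]  # the new value lives only in the returned copy

# Assigning the result back is what makes the change stick.
x <- `[<-`(x, 1, 2, 7)
x[1, 2]
```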
R: unused argument in levels
The help page ?as.factor shows that the function takes only one argument (in your case filtered_table$column), so the error message is telling you that there is no parameter to match the second argument you supplied in the call. To specify the levels explicitly, use the factor() function instead.
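A short sketch of both behaviours; the vector below is a hypothetical stand-in for filtered_table$column, and the level order is made up for illustration.

```r
# Hypothetical column standing in for filtered_table$column.
column <- c("low", "high", "medium", "low")

# as.factor() has no levels parameter, so this call fails;
# tryCatch() captures the error message instead of stopping.
err <- tryCatch(
  as.factor(column, levels = c("low", "medium", "high")),
  error = function(e) conditionMessage(e)
)
err

# factor() accepts explicit levels.
f <- factor(column, levels = c("low", "medium", "high"))
levels(f)
```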
summing rows by combining levels R
Using dplyr, we arrange by 'placette' and 'year', group by 'placette', and take the cumsum of the variables whose names start with 'SP' (selected with starts_with):
library(dplyr)
data %>%
  arrange(placette, year) %>%
  group_by(placette) %>%
  mutate_at(vars(starts_with("SP")), cumsum)
# A tibble: 12 x 4
# Groups:   placette [3]
#    placette  year   SP1   SP2
#       <int> <int> <int> <int>
#  1        1  2013    43     4
#  2        1  2014    43     4
#  3        1  2015    59     7
#  4        1  2016   113    11
#  5        2  2013    30     0
#  6        2  2014    32     2
#  7        2  2015    48     3
#  8        2  2016    99     5
#  9        3  2013    23     3
# 10        3  2014    28     3
# 11        3  2015    48     3
# 12        3  2016    99     3
data
data <- structure(list(placette = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L), year = c(2013L, 2013L, 2013L, 2014L, 2014L, 2014L,
2015L, 2015L, 2015L, 2016L, 2016L, 2016L), SP1 = c(43L, 30L,
23L, 0L, 2L, 5L, 16L, 16L, 20L, 54L, 51L, 51L), SP2 = c(4L, 0L,
3L, 0L, 2L, 0L, 3L, 1L, 0L, 4L, 2L, 0L)), class = "data.frame",
row.names = c(NA, -12L))