`levels<-`( What Sorcery Is This

`levels<-`( What sorcery is this?

The answers here are good, but they are missing an important point. Let me try and describe it.

R is a functional language and does not like to mutate its objects. But it does allow assignment statements, using replacement functions:

levels(x) <- y

is equivalent to

x <- `levels<-`(x, y)

The trick is, this rewriting is done by `<-`; it is not done by `levels<-`. `levels<-` is just a regular function that takes an input and gives an output; it does not mutate anything.
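To make the equivalence concrete, here is a minimal sketch with made-up data (the vectors and level names are illustrative, not from the question). It applies both spellings to copies of the same factor, then calls `levels<-` without assigning its result:

```r
# Two copies of the same factor (illustrative data).
x1 <- factor(c("a", "b", "a"))
x2 <- x1

# Assignment syntax ...
levels(x1) <- c("low", "high")

# ... and the explicit call that `<-` rewrites it into.
x2 <- `levels<-`(x2, c("low", "high"))

identical(x1, x2)

# Calling the function without assigning the result leaves its input alone.
x3 <- factor(c("a", "b", "a"))
relabelled <- `levels<-`(x3, c("low", "high"))
levels(x3)          # still the original levels
levels(relabelled)  # the new levels live only in the returned copy
```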

One consequence of that is that, according to the above rule, <- must be recursive:

levels(factor(x)) <- y

is

factor(x) <- `levels<-`(factor(x), y)

is

x <- `factor<-`(x, `levels<-`(factor(x), y))
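Base R does not actually define a `factor<-` function, so the expansion above is illustrative rather than runnable. The same nesting can be tried with two replacement functions that do exist, `names<-` and `[<-`. This is a small sketch on made-up data:

```r
# Illustrative named vector, plus a copy for the hand-written expansion.
x <- c(a = 1, b = 2, c = 3)
y <- x

# Nested replacement written the usual way.
names(x)[2] <- "B"

# The same thing spelled out as plain function calls, inside out:
# first replace element 2 of names(y), then put the new names back on y.
y <- `names<-`(y, `[<-`(names(y), 2, "B"))

identical(x, y)
```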

It's kind of beautiful that this pure-functional transformation (up until the very end, where the assignment happens) is equivalent to what an assignment would be in an imperative language. If I remember correctly, this construct is called a lens in functional languages.

But then, once you have defined replacement functions like levels<-, you get another, unexpected windfall: you don't just have the ability to make assignments, you have a handy function that takes in a factor, and gives out another factor with different levels. There's really nothing "assignment" about it!

So, the code you're describing is just making use of this other interpretation of levels<-. I admit that the name levels<- is a little confusing because it suggests an assignment, but this is not what is going on. The code is simply setting up a sort of pipeline:

  • Start with dat$product

  • Convert it to a factor

  • Change the levels

  • Store that in res

Personally, I think that line of code is beautiful ;)
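The pipeline can be sketched as follows. The data frame and the new level names here are hypothetical stand-ins, since the question's actual data is not reproduced in this answer:

```r
# Hypothetical stand-in for the question's data.
dat <- data.frame(product = c("p1", "p2", "p1", "p3"),
                  stringsAsFactors = FALSE)

# Start with dat$product, convert it to a factor, change the levels,
# and store the result in res; dat itself is never modified.
res <- `levels<-`(factor(dat$product), c("A", "B", "C"))

res
dat$product
```

Nothing is assigned to except res, so this reads as an ordinary function call that happens to have `<-` in its name.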

What do 'names<-' and 'class<-' do in R codes below?

  1. ‘class<-‘ (with fancy single quotes) is wrong, likely a side-effect of copying and pasting and/or misinterpreting somebody else's code. (MS Word does this a lot ... never never never write/edit R code in Word, R is intolerant of these fancy-niceties :-). Straight-ascii ' quotes do happen to parse when the string sits in the function position of a call, but that is not the idiomatic spelling.

    Typically, it should be backticks instead, as in `class<-` and `names<-`.

  2. Typically, one would see something like

    vec <- c(1, 3, 11)
    names(vec) <- c("a", "BB", "quux")
    vec
    #    a   BB quux
    #    1    3   11

    The assignment on the second line does not call names itself. It calls the replacement function `names<-`, which is what R uses whenever names(...) appears on the left-hand side (LHS) of an assignment operator (= or <- in R).

    Sometimes, coders try to be slick (code-golf) by calling the function directly so that they can get the returned value without necessarily changing the original variable's value. For instance:

    vec <- c(1, 3, 11)
    `names<-`(vec, c("a", "BB", "quux"))
    #    a   BB quux
    #    1    3   11
    vec
    # [1]  1  3 11

    With `class<-`, it is changing the class of an object "inline": the call returns a re-classed copy and the original variable keeps its class.
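The same direct-call pattern with `class<-` can be sketched like this. The class name "my_class" and the object are made up for illustration, and the "original is untouched" behaviour is what recent R versions do when the function is called directly:

```r
# Illustrative object; "my_class" is a hypothetical class name.
obj <- list(value = 42)

# Direct call: returns a re-classed copy without touching obj.
tagged <- `class<-`(obj, "my_class")

class(tagged)
class(obj)

# The everyday spelling, which does change the variable it names.
obj2 <- obj
class(obj2) <- "my_class"
identical(tagged, obj2)
```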

Factor levels by group

A data.table solution:

dt[, height_cat := cut(Height, breaks = c(0, 165, 180, 300), right = FALSE)]
dt[, height_f :=
     factor(
       paste(Sex, height_cat, sep = ":"),
       levels = dt[, CJ(Sex, height_cat, unique = TRUE)][, paste(Sex, height_cat, sep = ":")]
     )]

table(dt$height_f)
# F:[0,165) F:[165,180) F:[180,300) M:[0,165) M:[165,180) M:[180,300)
# 2 2 0 0 2 2

How does the `[<-` function work in R?

This is what you would need to do to get the assignment to stick:

`<-`(`[`(x, 1, 2), 7)  # or x <- `[<-`(x, 1, 2, 7)
x
     [,1] [,2]
[1,]    1    7
[2,]    2    4

Essentially what is happening is that `<-` sees a call to `[` on its left-hand side and therefore evaluates x <- `[<-`(x, 1, 2, value = 7); it is that outer `<-` that does the actual "permanent" assignment. Do not be misled into thinking this is a call-by-reference assignment: `[` does not create a pointer into the row-col location of x. I'm reasonably sure there will still be a temporary value of x created.

Your version did make a subassignment (as can be seen by what it returned), but that assignment only happened inside the call to `[<-`. The modified copy was returned and never assigned back to x in the global environment.
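A small sketch of the difference, assuming x starts out as the 2 x 2 matrix of 1:4 (which is what the printed result above suggests):

```r
# Assumed starting point: a 2 x 2 integer matrix.
x <- matrix(1:4, nrow = 2)

# Direct call: the modified matrix is returned, x itself keeps its old value.
y <- `[<-`(x, 1, 2, 7)
unchanged <- x[1, 2] == 3

# Assigning the result back is what makes the change stick.
x <- `[<-`(x, 1, 2, 7)

unchanged
x
```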

R: unused argument in levels

The help page ?as.factor shows that the function takes only one argument (in your case the filtered_table$column). The error message is telling you that there is no parameter for the second argument in your call to match. To specify the levels explicitly, use the factor() function instead.
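A minimal sketch of the difference, using a made-up character vector in place of filtered_table$column:

```r
# Illustrative stand-in for the column in the question.
x <- c("low", "high", "medium", "low")

# as.factor() has no 'levels' parameter, so this call signals an error;
# tryCatch() captures the condition instead of stopping the script.
err <- tryCatch(
  as.factor(x, levels = c("low", "medium", "high")),
  error = function(e) e
)
inherits(err, "error")

# factor() accepts the levels explicitly and keeps them in the given order.
f <- factor(x, levels = c("low", "medium", "high"))
levels(f)
```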

summing rows by combining levels R

Using dplyr, we arrange the rows by 'placette' and 'year', group by 'placette', and take the cumsum of every variable whose name starts_with 'SP':

library(dplyr)
data %>%
  arrange(placette, year) %>%
  group_by(placette) %>%
  mutate_at(vars(starts_with("SP")), cumsum)
# A tibble: 12 x 4
# Groups:   placette [3]
#    placette  year   SP1   SP2
#       <int> <int> <int> <int>
#  1        1  2013    43     4
#  2        1  2014    43     4
#  3        1  2015    59     7
#  4        1  2016   113    11
#  5        2  2013    30     0
#  6        2  2014    32     2
#  7        2  2015    48     3
#  8        2  2016    99     5
#  9        3  2013    23     3
# 10        3  2014    28     3
# 11        3  2015    48     3
# 12        3  2016    99     3

data

data <- structure(list(placette = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L), year = c(2013L, 2013L, 2013L, 2014L, 2014L, 2014L,
2015L, 2015L, 2015L, 2016L, 2016L, 2016L), SP1 = c(43L, 30L,
23L, 0L, 2L, 5L, 16L, 16L, 20L, 54L, 51L, 51L), SP2 = c(4L, 0L,
3L, 0L, 2L, 0L, 3L, 1L, 0L, 4L, 2L, 0L)), class = "data.frame",
row.names = c(NA,
-12L))
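For comparison, the same grouped cumulative sum can be sketched in base R with order() and ave(). This is a different technique from the dplyr answer above, shown only as a dependency-free way to check the numbers. The data frame is rebuilt here so the block runs on its own:

```r
# Same values as the data above, rebuilt so this block is self-contained.
data <- data.frame(
  placette = rep(1:3, times = 4),
  year     = rep(2013:2016, each = 3),
  SP1      = c(43L, 30L, 23L, 0L, 2L, 5L, 16L, 16L, 20L, 54L, 51L, 51L),
  SP2      = c(4L, 0L, 3L, 0L, 2L, 0L, 3L, 1L, 0L, 4L, 2L, 0L)
)

# Sort by placette, then year, so cumsum runs in chronological order.
out <- data[order(data$placette, data$year), ]
rownames(out) <- NULL

# Apply a per-placette cumulative sum to every column starting with "SP".
sp_cols <- grep("^SP", names(out), value = TRUE)
out[sp_cols] <- lapply(out[sp_cols], function(col) {
  ave(col, out$placette, FUN = cumsum)
})

out
```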

