What's the Difference Between '1L' and '1'

What's the difference between `1L` and `1`?

So, @James and @Brian explained what 3L means. But why would you use it?

Most of the time it makes no difference - but sometimes you can use it to get your code to run faster and consume less memory. A double ("numeric") vector uses 8 bytes per element. An integer vector uses only 4 bytes per element. For large vectors, that's less wasted memory and less to wade through for the CPU (so it's typically faster).

Mostly this applies when working with indices.
Here's an example where adding 1 to an integer vector turns it into a double vector:

x <- 1:100
typeof(x) # integer

y <- x+1
typeof(y) # double, twice the memory size
object.size(y) # 840 bytes (on win64)

z <- x+1L
typeof(z) # still integer
object.size(z) # 440 bytes (on win64)

...but also note that working excessively with integers can be dangerous:

1e9L * 2L # Works fine; fast, lean and mean!
1e9L * 4L # Oops, overflow! The result is NA, with a warning

...and as @Gavin pointed out, the range for integers is roughly -2e9 to 2e9.

A caveat though is that this applies to the current R version (2.13). R might change this at some point (64-bit integers would be sweet, which could enable vectors of length > 2e9). To be safe, you should use .Machine$integer.max whenever you need the maximum integer value (and negate that for the minimum).
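As a small sketch of that advice (the variable names `big` and `overflowed` are only illustrative), the limit can be read from `.Machine` rather than hard-coded, and an overflowing integer operation yields `NA` rather than wrapping around:

```r
# Largest integer this build of R can represent
big <- .Machine$integer.max
typeof(big)                      # "integer"

# Integer overflow does not wrap around: it gives NA (plus a warning,
# which is suppressed here only to keep the output quiet)
overflowed <- suppressWarnings(big + 1L)
is.na(overflowed)                # TRUE

# Adding a double instead of an integer promotes the result to double,
# so nothing overflows
typeof(big + 1)                  # "double"
big + 1 > big                    # TRUE
```

The smallest integer is `-.Machine$integer.max`, as described above.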

Order of operator precedence when using : (the colon)

The operator : has higher precedence than +, so 1+1:3 is really 1+(1:3) (i.e. 2:4) and not 2:3. To change the order of evaluation defined by operator precedence, use parentheses ().

You can see the order of precedence of operators in the help file ?Syntax. Here is the relevant part:

The following unary and binary operators are defined. They are listed in precedence groups, from highest to lowest.

:: :::      access variables in a namespace
$ @         component / slot extraction
[ [[        indexing
^           exponentiation (right to left)
- +         unary minus and plus
:           sequence operator
%any%       special operators (including %% and %/%)
* /         multiply, divide
+ - (binary) add, subtract
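A few expressions that follow directly from these precedence rules may help; this is only an illustrative sketch, and `n` is an arbitrary example value:

```r
n <- 5

s1 <- 1 + 1:3     # ':' binds tighter than binary '+': 1 + (1:3) -> 2 3 4
s2 <- (1 + 1):3   # parentheses force the addition first     -> 2 3
s3 <- -1:3        # unary minus binds tighter than ':'        -> -1 0 1 2 3
s4 <- 1:n - 1     # (1:n) - 1                                 -> 0 1 2 3 4
s5 <- 1:(n - 1)   # usually what was intended                 -> 1 2 3 4
```

The `1:n - 1` versus `1:(n - 1)` pair is the case that most often causes off-by-one bugs in loops.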

What's the difference between a double and a numeric?

I guess this has to do with converting your data.frame into a tibble. Replicating your code on mtcars dataset, we get:

library(dplyr)  # provides %>%, mutate(), select() and re-exports as_tibble()

mtcars %>%
  as_tibble() %>%
  mutate(year = as.double(seq(1956, 2009, 0.25)[1:nrow(mtcars)])) %>%
  dplyr::select(year) %>%
  head()

# year
# <dbl>
# 1 1956
# 2 1956.
# 3 1956.
# 4 1957.
# 5 1957
# 6 1957.

Here's the output if we comment out as_tibble(), so that the result stays a plain data.frame:

# year
# 1 1956.00
# 2 1956.25
# 3 1956.50
# 4 1956.75
# 5 1957.00
# 6 1957.25

Swapping as.double with as.numeric does not change anything.
From ?double:

as.double is a generic function. It is identical to as.numeric. 
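To see the relationship in a console, a short sketch along these lines can be used (the vector `x` is just an example value): "double" is the storage type, while "numeric" is the mode and class that R reports for it.

```r
x <- c(1, 2.5, 3)

typeof(x)    # "double"  (storage type)
mode(x)      # "numeric"
class(x)     # "numeric"

# as.double() and as.numeric() give the same result
same_result <- identical(as.double("3.7"), as.numeric("3.7"))   # TRUE

# Integers also count as numeric, but they are not doubles
is.numeric(1L)   # TRUE
is.double(1L)    # FALSE
```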

In R programming, what's the difference between & vs &&, and | vs ||

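&& and || only handle a single logical value on each side of the operator, whereas & and | work element-wise on whole vectors.

a <- c(TRUE, FALSE, FALSE, FALSE)
b <- c(TRUE, FALSE, FALSE, FALSE)
a && b

In R versions before 4.3.0 this returns

[1] TRUE

because only the first elements of a and b are tested. Since R 4.3.0, passing a vector of length greater than one to && or || is an error.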
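For comparison, here is a minimal sketch of the element-wise operators on the same kind of input, together with the usual ways of reducing the result to the single value that `&&` and `if` expect:

```r
a <- c(TRUE, FALSE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE, TRUE)

and_vec <- a & b          # element-wise: TRUE FALSE FALSE FALSE
or_vec  <- a | b          # element-wise: TRUE FALSE TRUE  TRUE

# Collapse to a single TRUE/FALSE before using the result in if ()
all_true <- all(a & b)    # FALSE
any_true <- any(a & b)    # TRUE

# && is fine when each side is already a single value
first_pair <- a[1] && b[1]   # TRUE
```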

Edit:

Consider the following, where we 'rotate' a and b after each && test (run in R before 4.3.0, where && silently uses only the first element of each side):

a <- c(TRUE, FALSE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE, TRUE)
for (i in seq_along(a)) {
  cat(paste0("'a' is: ", paste0(a, collapse = ", "), " and\n'b' is: ", paste0(b, collapse = ", "), "\n"))
  print(paste0("'a && b' is: ", a && b))
  # Rotate both vectors one position to the left
  a <- c(a[2:length(a)], a[1])
  b <- c(b[2:length(b)], b[1])
}

Gives us:

'a' is: TRUE, FALSE, TRUE, FALSE and
'b' is: TRUE, FALSE, FALSE, TRUE
[1] "'a && b' is: TRUE"
'a' is: FALSE, TRUE, FALSE, TRUE and
'b' is: FALSE, FALSE, TRUE, TRUE
[1] "'a && b' is: FALSE"
'a' is: TRUE, FALSE, TRUE, FALSE and
'b' is: FALSE, TRUE, TRUE, FALSE
[1] "'a && b' is: FALSE"
'a' is: FALSE, TRUE, FALSE, TRUE and
'b' is: TRUE, TRUE, FALSE, FALSE
[1] "'a && b' is: FALSE"

Additionally, && and || stop evaluating as soon as the result is determined (short-circuit evaluation), whereas & and | always evaluate both sides:

FALSE & a_not_existing_object
TRUE | a_not_existing_object

Returns:

Error: object 'a_not_existing_object' not found
Error: object 'a_not_existing_object' not found

But:

FALSE && a_not_existing_object
TRUE || a_not_existing_object

Returns:

[1] FALSE

[1] TRUE

This works because FALSE AND anything is always FALSE, and TRUE OR anything is always TRUE, so the right-hand side never needs to be evaluated.

This last behavior of && and || is especially useful if you want to check in your control-flow for an element that may not exist:

if (exists("a_not_existing_object") && a_not_existing_object > 42) {...}

Note that exists() takes the object's name as a character string. This way the evaluation stops after the first expression evaluates to FALSE, and the a_not_existing_object > 42 part is not even attempted!
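A runnable version of this guard pattern might look like the sketch below. The helper `is_above_42` and the variable `an_existing_object` are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: TRUE only if a variable with this name exists,
# is numeric, and is greater than 42
is_above_42 <- function(name) {
  # Thanks to short-circuiting, get(name) is only reached
  # when exists(name) has already returned TRUE
  exists(name) && is.numeric(get(name)) && get(name) > 42
}

# No such object: returns FALSE instead of raising an error
result_missing <- is_above_42("a_not_existing_object")

# Object exists and passes the test: returns TRUE
an_existing_object <- 100
result_present <- is_above_42("an_existing_object")
```

Replacing `&&` with `&` in the helper would make the first call fail, because `get()` would then be evaluated even though the object does not exist.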

What's the difference between as.integer() and +0L used on booleans?

x + 0L is an element-wise operation on x; as such, it preserves the shape (the dim attribute) of the data. as.integer doesn’t: it takes the whole structure – here, a matrix – and converts it into a one-dimensional integer vector.

That said, in the general case I’d strongly suggest using as.integer and discourage + 0L as a clever hack (remember: often, clever ≠ good). If you want to preserve the shape of data I suggest using David’s method from the comments, rather than the + 0L hack:

a[] = as.integer(a)

This uses the normal meaning of as.integer, but the result is assigned to the individual elements of a, rather than a itself. In other words, a’s shape remains untouched.
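The three approaches can be compared side by side with a small sketch like this one (the 2 × 2 logical matrix is just an example input):

```r
a <- matrix(c(TRUE, FALSE, TRUE, TRUE), nrow = 2)

# as.integer() drops attributes, including dim: the result is a plain vector
flat <- as.integer(a)
dim(flat)        # NULL

# + 0L is element-wise arithmetic, so the matrix shape is kept
hacked <- a + 0L
dim(hacked)      # 2 2
typeof(hacked)   # "integer"

# Assigning into a[] replaces the elements but keeps the attributes
kept <- a
kept[] <- as.integer(kept)
dim(kept)        # 2 2
typeof(kept)     # "integer"
```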

Compute the difference between two values for two dates?

The solution has two parts:

  • wrangle the dataset so that it is tidy for your purposes
  • plot the graph

Wrangling the data is straightforward. (The code here could be shortened, but I've written it as it is to make the various steps clear.) Also note the "obvious" correction to the typo mentioned by @user2974951.

library(dplyr)  # filter(), select(), full_join(), mutate(), %>%
library(tidyr)  # pivot_longer()

# Extract the baseline values and convert to long format
baseline <- myd %>%
  filter(year == 1990) %>%
  select(-year) %>%
  pivot_longer(everything(), names_to = "variable", values_to = "baseline")

# Extract the endpoint values and convert to long format
endpoint <- myd %>%
  filter(year == 1999) %>%
  select(-year) %>%
  pivot_longer(everything(), names_to = "variable", values_to = "endpoint")

# Merge by variable and calculate difference
difference <- baseline %>%
  full_join(endpoint, by = "variable") %>%
  mutate(diff = endpoint - baseline)

At this point, difference looks like this:

> difference
# A tibble: 2 × 4
  variable baseline endpoint  diff
  <chr>       <dbl>    <dbl> <dbl>
1 ud           137.     128. -9.75
2 ax            67       68   1

Now create the bar chart.

library(ggplot2)

# Create the bar chart: one bar per variable, height = change between the two years
difference %>%
  ggplot() +
  geom_col(aes(x = variable, y = diff))


Note that this solution is robust with respect to the number of variables in the original dataset, and their names. It will also handle missing values without error. It could easily be generalised to calculate and plot the difference between any two years (e.g. the earliest year as baseline and the most recent as endpoint).
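One possible way to sketch that generalisation is shown below. To keep it self-contained it uses base R rather than the dplyr/tidyr pipeline above. The sample data and the helper name `year_difference` are made up for illustration, and the helper assumes exactly one row per year.

```r
# Made-up sample data with the same layout as 'myd' (values are illustrative only)
myd <- data.frame(
  year = c(1990, 1995, 1999),
  ud   = c(137.25, 130, 127.5),
  ax   = c(67, 66, 68)
)

# Hypothetical helper: difference of every non-year column between two years.
# By default it compares the earliest and the most recent year.
year_difference <- function(data, from = min(data$year), to = max(data$year)) {
  vars     <- setdiff(names(data), "year")
  baseline <- unlist(data[data$year == from, vars])
  endpoint <- unlist(data[data$year == to, vars])
  data.frame(
    variable  = vars,
    baseline  = baseline,
    endpoint  = endpoint,
    diff      = endpoint - baseline,
    row.names = NULL
  )
}

difference <- year_difference(myd)                     # 1990 vs 1999
mid_change <- year_difference(myd, from = 1995, to = 1999)
```

The resulting `difference` data frame has the same columns as the tibble above, so it can be passed to the same `geom_col()` call.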

What is the difference between y ~ 1, y ~ 0 and y ~ -1 in R formulas?

From the ?formula documentation:

 The ‘-’ operator removes the specified terms, so that ‘(a+b+c)^2 -
a:b’ is identical to ‘a + b + c + b:c + a:c’. It can also used to
remove the intercept term: when fitting a linear model ‘y ~ x - 1’
specifies a line through the origin. A model with no intercept
can be also specified as ‘y ~ x + 0’ or ‘y ~ 0 + x’.

So:

  • y ~ 1 includes an intercept
  • y ~ 0 does not include an intercept
  • y ~ -1 does not include an intercept

The last two are functionally equivalent.
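This can be checked without fitting anything by inspecting the "intercept" attribute that `terms()` attaches to a formula. The helper `has_intercept` and the small data frame `d` below are illustrative assumptions, not part of the original answer:

```r
# Hypothetical helper: does this formula include an intercept term?
has_intercept <- function(f) attr(terms(f), "intercept") == 1

has_intercept(y ~ 1)       # TRUE
has_intercept(y ~ 0)       # FALSE
has_intercept(y ~ -1)      # FALSE
has_intercept(y ~ x)       # TRUE  (the intercept is implicit)
has_intercept(y ~ x - 1)   # FALSE

# The same effect seen in fitted coefficients, using made-up data
d <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))
with_int    <- names(coef(lm(y ~ x, data = d)))       # "(Intercept)" "x"
without_int <- names(coef(lm(y ~ x + 0, data = d)))   # "x"
```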


