R Object Identity

R object identity

UPDATE: A more robust and faster implementation of address(x) (not using .Internal(inspect(x))) was added to data.table v1.8.9. From NEWS:

New function address() returns the address in RAM of its argument. Sometimes useful in determining whether a value has been copied or not by R, programmatically.

There's probably a neater way but this seems to work.

address = function(x) substring(capture.output(.Internal(inspect(x)))[1],2,17)
x = 1
y = 1
z = x
identical(x,y)
# [1] TRUE
identical(x,z)
# [1] TRUE
address(x)==address(y)
# [1] FALSE
address(x)==address(z)
# [1] TRUE

You could modify it to work on 32-bit R by changing 17 to 9.
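The printed address is not guaranteed to have the same width on every platform, so a fixed substring may pick up extra characters. As a minimal sketch (address_of is a hypothetical name, and it still relies on the undocumented output format of .Internal(inspect(x))), the leading hex run can be extracted with a regular expression instead, and then used to watch copy-on-modify happen:

```r
# Hypothetical helper: pull the hex address out of the first line printed by
# .Internal(inspect(x)), which is assumed to start with "@<address> ".
address_of <- function(x) {
  first_line <- capture.output(.Internal(inspect(x)))[1]
  sub("^@(0x)?([0-9A-Fa-f]+).*$", "\\2", first_line)
}

x <- c(1, 2, 3)
z <- x                                   # no copy yet: z refers to the same vector
same_before <- address_of(x) == address_of(z)

z[1] <- 10                               # modifying z forces a copy
same_after <- address_of(x) == address_of(z)

same_before
same_after
```

The comparison is expected to be TRUE before the modification and FALSE after it, since z is only copied once it is changed.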

Is there a native object-id or pointer-value in R for R objects?

address, inspect and object_size in the pryr package can be useful here.

library(pryr)

address(df)
## [1] "0x7e0b688"

inspect(df)
## <VECSXP 0x7e0b688>
##   <REALSXP 0x7e0d028>
##   <INTSXP 0x96e7278>
## ...snip...

For example, in the code below the size of L1 plus the size of L2 (248 B + 176 B = 424 B) is greater than the size of the two taken together (312 B), so clearly some memory is shared. Inspecting them shows that the two components of L2 have the same addresses as the first and third components of L1, i.e. they are the same objects.

L1 <- list(1:2, 3:4, 5:6)
L2 <- L1[-2]

object_size(L1)
## 248 B
object_size(L2)
## 176 B
object_size(L1, L2)
## 312 B

inspect(L1)
## <VECSXP 0x88622a8>
##   <INTSXP 0x90ba950>
##   <INTSXP 0x90ba870>
##   <INTSXP 0x90ba790>

inspect(L2)
## <VECSXP 0x971dbf8>
##   <INTSXP 0x90ba950>
##   <INTSXP 0x90ba790>

R object identification

I usually start out with some combination of:

typeof(obj)
class(obj)
sapply(obj, class)
sapply(obj, attributes)
attributes(obj)
names(obj)

as appropriate based on what's revealed. For example, try with:

obj <- data.frame(a=1:26, b=letters)
obj <- list(a=1:26, b=letters, c=list(d=1:26, e=letters))
data(cars)
obj <- lm(dist ~ speed, data=cars)

...etc.

If obj is an S3 or S4 object, you can also try methods or showMethods, showClass, etc. Patrick Burns' R Inferno has a pretty good section on this (sec #7).

EDIT: Dirk and Hadley mention str(obj) in their answers. It really is much better than any of the above for a quick and even detailed peek into an object.
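As a rough illustration, here are a few of the calls listed above next to str(), applied to the nested list from the examples. This is only a sketch of how they compare, not an exhaustive recipe:

```r
obj <- list(a = 1:26, b = letters, c = list(d = 1:26, e = letters))

typeof(obj)          # storage type of the container
class(obj)           # class used for method dispatch
sapply(obj, class)   # class of each top-level component
names(obj)           # component names

# str() summarises the whole nested structure in one call
str(obj)
```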

Evaluation error of identity in lubridate::interval objects

UPDATE

If you look at the code for Interval classes, you will see that when the object is created it stores the start date and then calculates the difference between start and end and stores that as .Data.

interval <- function(start, end = NULL, tzone = tz(start)) {

  if (is.null(tzone)) {
    tzone <- tz(end)
    if (is.null(tzone))
      tzone <- "UTC"
  }

  if (is.character(start) && is.null(end)) {
    return(parse_interval(start, tzone))
  }

  if (is.Date(start)) start <- date_to_posix(start)
  if (is.Date(end)) end <- date_to_posix(end)

  start <- as_POSIXct(start, tzone)
  end <- as_POSIXct(end, tzone)

  span <- as.numeric(end) - as.numeric(start)
  starts <- start + rep(0, length(span))
  if (tzone != tz(starts)) starts <- with_tz(starts, tzone)

  new("Interval", span, start = starts, tzone = tzone)
}

In other words, the returned object has no concept of the "end date". The default value for the end argument is NULL, meaning you can even create an interval without an end date.

interval("2019-03-29")
[1] 2019-03-29 UTC--NA

The "end date" is simply text generated from a calculation that occurs when the Interval object is formatted for printing.

format.Interval <- function(x, ...) {
  if (length(x@.Data) == 0) return("Interval(0)")
  paste(format(x@start, tz = x@tzone, usetz = TRUE), "--",
        format(x@start + x@.Data, tz = x@tzone, usetz = TRUE), sep = "")
}

int_end <- function(int) int@start + int@.Data

Both of those code snippets are taken from https://github.com/tidyverse/lubridate/blob/f7a7c2782ba91b821f9af04a40d93fbf9820c388/R/intervals.r.
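The same storage idea can be sketched in base R without lubridate. The names make_span and span_end are hypothetical stand-ins, not lubridate functions; the point is only that the end is rebuilt from start + span rather than stored:

```r
# Hypothetical stand-in for an Interval: only the start and the span are stored.
make_span <- function(start, end) {
  list(start = start, span = as.numeric(end) - as.numeric(start))
}

# The end is never stored; it is recomputed from start + span on demand.
span_end <- function(int) int$start + int$span

int <- make_span(as.POSIXct("2019-03-29", tz = "UTC"),
                 as.POSIXct("2019-04-05", tz = "UTC"))

int$span        # seconds between the two instants (7 days = 604800)
span_end(int)   # recomputed end, the same instant as the original end
```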

Accessing the underlying slots of overlap allows you to complete the comparison without converting to character: you have to check that start and .Data are both equal. Converting to character is much cleaner, but if you are trying to avoid it, this is how you could do that.

ifelse(lead(overlap@start) == overlap@start & lead(overlap@.Data) == overlap@.Data, 1, 0)

Taken all together:

df %>%
  mutate_at(2:3, funs(as.Date(., format = "%Y-%m-%d"))) %>%
  mutate(overlap = interval(time1, time2),
         overlap_char = as.character(interval(time1, time2))) %>%
  group_by(id) %>%
  mutate(cond1_original = ifelse(lead(overlap_char) == overlap_char, 1, 0),
         cond1_new = ifelse(lead(overlap@start) == overlap@start & lead(overlap@.Data) == overlap@.Data, 1, 0),
         cond2_original = ifelse(lag(overlap_char) == overlap_char, 1, 0),
         cond2_new = ifelse(lag(overlap@start) == overlap@start & lag(overlap@.Data) == overlap@.Data, 1, 0)) 

     id time1      time2      overlap                        overlap_char                   cond1_original cond1_new cond2_original cond2_new
  <int> <date>     <date>     <S4: Interval>                 <chr>                                   <dbl>     <dbl>          <dbl>     <dbl>
1     1 2008-10-12 2009-03-20 2008-10-12 UTC--2009-03-20 UTC 2008-10-12 UTC--2009-03-20 UTC              0         0             NA        NA
2     1 2008-08-10 2009-06-15 2008-08-10 UTC--2009-06-15 UTC 2008-08-10 UTC--2009-06-15 UTC             NA        NA              0         0
3     2 2006-01-09 2006-02-13 2006-01-09 UTC--2006-02-13 UTC 2006-01-09 UTC--2006-02-13 UTC              0         0             NA        NA
4     2 2008-03-13 2008-04-17 2008-03-13 UTC--2008-04-17 UTC 2008-03-13 UTC--2008-04-17 UTC             NA        NA              0         0
5     3 2008-09-12 2008-10-17 2008-09-12 UTC--2008-10-17 UTC 2008-09-12 UTC--2008-10-17 UTC              0         0             NA        NA
6     3 2007-05-30 2007-07-04 2007-05-30 UTC--2007-07-04 UTC 2007-05-30 UTC--2007-07-04 UTC             NA        NA              0         0
7     4 2003-09-29 2004-01-15 2003-09-29 UTC--2004-01-15 UTC 2003-09-29 UTC--2004-01-15 UTC              1         1             NA        NA
8     4 2003-09-29 2004-01-15 2003-09-29 UTC--2004-01-15 UTC 2003-09-29 UTC--2004-01-15 UTC             NA        NA              1         1
9     5 2003-04-01 2003-07-04 2003-04-01 UTC--2003-07-04 UTC 2003-04-01 UTC--2003-07-04 UTC              1         1             NA        NA
10    5 2003-04-01 2003-07-04 2003-04-01 UTC--2003-07-04 UTC 2003-04-01 UTC--2003-07-04 UTC             NA        NA              1         1

You can read more about Intervals here: https://lubridate.tidyverse.org/reference/Interval-class.html

I believe your exact case has to do with the == comparison. As you can see above, "overlap" is an S4 Interval object (shown as <S4: Interval>), not a plain atomic vector. From ?==:

At least one of x and y must be an atomic vector, but if the other is
a list R attempts to coerce it to the type of the atomic vector: this
will succeed if the list is made up of elements of length one that can
be coerced to the correct type.

If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.

We can coerce "overlap" to both numeric and character to see the difference.

df %>%
  mutate_at(2:3, funs(as.Date(., format = "%Y-%m-%d"))) %>%
  mutate(overlap = interval(time1, time2)) %>%
  group_by(id) %>%
  mutate(cond1 = ifelse(lead(overlap) == overlap, 1, 0),
         cond2 = ifelse(lag(overlap) == overlap, 1, 0)) %>%
  mutate(overlap.n = as.numeric(overlap),
         overlap.c = as.character(overlap))

# A tibble: 10 x 8
# Groups:   id [5]
       id time1      time2      overlap                        cond1 cond2 overlap.n overlap.c
    <int> <date>     <date>     <S4: Interval>                 <dbl> <dbl>     <dbl> <chr>
  1     1 2008-10-12 2009-03-20 2008-10-12 UTC--2009-03-20 UTC     0    NA  13737600 2008-10-12 U…
  2     1 2008-08-10 2009-06-15 2008-08-10 UTC--2009-06-15 UTC    NA     0  26697600 2008-08-10 U…
  3     2 2006-01-09 2006-02-13 2006-01-09 UTC--2006-02-13 UTC     1    NA   3024000 2006-01-09 U…
  4     2 2008-03-13 2008-04-17 2008-03-13 UTC--2008-04-17 UTC    NA     1   3024000 2008-03-13 U…
  5     3 2008-09-12 2008-10-17 2008-09-12 UTC--2008-10-17 UTC     1    NA   3024000 2008-09-12 U…
  6     3 2007-05-30 2007-07-04 2007-05-30 UTC--2007-07-04 UTC    NA     1   3024000 2007-05-30 U…
  7     4 2003-09-29 2004-01-15 2003-09-29 UTC--2004-01-15 UTC     1    NA   9331200 2003-09-29 U…
  8     4 2003-09-29 2004-01-15 2003-09-29 UTC--2004-01-15 UTC    NA     1   9331200 2003-09-29 U…
  9     5 2003-04-01 2003-07-04 2003-04-01 UTC--2003-07-04 UTC     1    NA   8121600 2003-04-01 U…
 10     5 2003-04-01 2003-07-04 2003-04-01 UTC--2003-07-04 UTC    NA     1   8121600 2003-04-01 U…

Per the output above, I believe that using == is coercing the "overlap" interval to a numeric vector, resulting in the duration comparison @hmhensen mentions above. When you force the coercion to character rather than numeric, you get your desired result.
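The effect can be reproduced in base R for rows 3 and 4 of the table above. This is only a sketch using plain POSIXct values; utc and label are hypothetical helpers, and label merely imitates the printed form of an interval:

```r
utc <- function(d) as.POSIXct(d, tz = "UTC")

# Rows 3 and 4 of the table: different start dates, same length.
start_a <- utc("2006-01-09"); end_a <- utc("2006-02-13")
start_b <- utc("2008-03-13"); end_b <- utc("2008-04-17")

span_a <- as.numeric(end_a) - as.numeric(start_a)
span_b <- as.numeric(end_b) - as.numeric(start_b)

span_a == span_b                        # TRUE: both are 3024000 seconds
start_a == start_b & span_a == span_b   # FALSE: the start dates differ

# Comparing a text form of start and end also tells them apart.
label <- function(s, e) {
  paste(format(s, usetz = TRUE), format(e, usetz = TRUE), sep = "--")
}
label(start_a, end_a) == label(start_b, end_b)   # FALSE
```

Comparing the numeric span alone treats the two rows as equal, which matches the cond1 and cond2 values of 1 in the table.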

Is cppreference using the term [Object's] identity in two different meanings for C++11 and for C++17?

Why do you think that the C++11 concepts no longer apply? The page says that that version introduced the “has identity” idea, not that it’s the only version that uses it. What C++17 did was say that prvalues “wait until” they are used for initialization of an object to become an object at all. (It therefore merges “being an object” with “having identity”, which is simpler; plainly every object can have this be detected.)

What that really means is that the address of that target object is invisibly passed to the site of the prvalue construction so that it appears in the correct place from the start. Compilers were already doing that under the RVO name, but the change guaranteed that and removed formal movability restrictions.

The xvalue certainly has an identity: if you pass a prvalue to a function by (either kind of) reference, it can take its address. It can even return it so that (during the same full-expression) the function that created it can use its address. The prohibition on taking the address of a temporary is a separate safety measure that applies even after (say) using static_cast<A&&>(get_a()) to force temporary materialization. (However, it can’t stop you from taking the address of a glvalue that refers to the temporary returned from another function.)

How to efficiently check the identity of R list elements if these are vectors?

dt[, o := sapply(listcol, function(x) 1 %in% x)]
dt
#    numericcol        listcol     o
# 1:         42        1,22, 3  TRUE
# 2:         42              6 FALSE
# 3:         42              1  TRUE
# 4:         42             12 FALSE
# 5:         42    5,   6,1123 FALSE
# 6:         42              3 FALSE
# 7:         42             42 FALSE
# 8:         42              1  TRUE
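The same membership test can be sketched in base R, without data.table. The listcol values below are reconstructed from the printed output above and are an assumption about the original data; vapply is swapped in for sapply so the result is guaranteed to be a logical vector:

```r
# Assumed contents of listcol, read off the printed table above.
listcol <- list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)

# TRUE where the element (a numeric vector) contains the value 1.
o <- vapply(listcol, function(x) 1 %in% x, logical(1))
o
```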

What does object identity mean in Java?

Suppose you have this simple class:

class Example {
    private int value;

    Example(int v) {
        this.value = v;
    }

    public void showValue() {
        System.out.println(this.value);
    }
}

And we have this code (for instance, in a method somewhere else):

Example e1 = new Example(42);
Example e2 = new Example(42);

Then:

e1 and e2 have state (their value member). In this case, it happens both have the same state (42).
e1 and e2 have behavior: A method, showValue, that will dump out their value to the console. (Note that they don't necessarily have to have the same behavior: We could create a subclass of Example that did something different with showValue [perhaps showed it in a pop-up dialog box], and make e2 an instance of that subclass instead.)
e1 and e2 have identity: The expression e1 == e2 is false; they are not the same object. They each have a unique identity. They may be equivalent objects (we could implement equals and hashCode such that they were deemed equivalent), but they will never have the same identity.

No object ever has the same identity as another object; an object's identity is guaranteed unique within the memory of the running process.

(They also have other characteristics, such as their class type, but those are the main three.)

R sf equivalent of Esri Identity

Answering in case someone else runs into this issue. One of my coworkers helped me get to the solution: bind the intersection of the two layers together with the difference between the focal layer and the intersecting layer.

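library(sf)
library(dplyr)

# Pieces of layer_a that overlap layer_b (carrying layer_b's attributes),
# plus the parts of layer_a that fall outside layer_b.
arc.ident <- function(layer_a, layer_b){
  int_a_b <- st_intersection(layer_a, layer_b)
  rest_of_a <- st_difference(layer_a, st_union(layer_b))
  output <- bind_rows(int_a_b, rest_of_a)
  return(st_as_sf(output))
}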

or as a tidyverse pipe

arc.ident.output <-  st_intersection(layer_a, layer_b) %>% 
  bind_rows(st_difference(layer_a, st_union(layer_b)))

Assign identity code based on factor name

You can use dplyr::group_indices().

library(dplyr)

df <- df %>%
  mutate(id = group_indices(., NAME))
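In dplyr 1.0.0 and later, the group_indices(., NAME) form is, as far as I know, deprecated in favour of group_by() followed by cur_group_id(). If you would rather not depend on dplyr at all, a minimal base R sketch gives one integer code per distinct name; the df below is made-up example data, not the data from the question:

```r
# Made-up example data with a NAME column.
df <- data.frame(NAME = c("b", "a", "b", "c", "a"), stringsAsFactors = FALSE)

# factor() sorts the distinct names into levels; as.integer() returns the
# level number, so equal names always get the same id.
df$id <- as.integer(factor(df$NAME))
df
```

If ids should follow order of first appearance rather than sorted order, match(df$NAME, unique(df$NAME)) is an alternative.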
