Inheritance in R
A first pass, not quite good enough
Here are two classes
.A <- setClass("A", representation(a="integer"))
.B <- setClass("B", contains="A", representation(b="integer"))
The symbol .A is a class generator function (essentially a call to new()), and is a relatively new addition to the methods package.
Here we write an initialize,A-method, using callNextMethod to call the next method (the default constructor, initialize,ANY-method) for the class:
setMethod("initialize", "A", function(.Object, ..., a=integer()) {
    ## do work of initialization
    cat("A\n")
    callNextMethod(.Object, ..., a=a)
})
The argument corresponding to the slot, a=a, comes after ... so that the function does not assign any unnamed arguments to a. This is important because unnamed arguments are supposed (from ?initialize) to be used to initialize base classes, not slots; the importance of this becomes apparent below. Similarly for "B":
setMethod("initialize", "B", function(.Object, ..., b=integer()) {
    cat("B\n")
    callNextMethod(.Object, ..., b=b)
})
and in action
> b <- .B(a=1:5, b=5:1)
B
A
> b
An object of class "B"
Slot "b":
[1] 5 4 3 2 1
Slot "a":
[1] 1 2 3 4 5
Actually, this is not quite correct, because the default initialize is a copy constructor:
.C <- setClass("C", representation(c1="numeric", c2="numeric"))
c <- .C(c1=1:5, c2=5:1)
> initialize(c, c1=5:1)
An object of class "C"
Slot "c1":
[1] 5 4 3 2 1
Slot "c2":
[1] 5 4 3 2 1
and our initialize method has broken this aspect of the contract
> initialize(b, a=1:5) # BAD: no copy construction
B
A
An object of class "B"
Slot "b":
integer(0)
Slot "a":
[1] 1 2 3 4 5
Copy construction turns out to be quite handy, so we don't want to break it.
Retaining copy construction
There are two solutions employed to retain copy construction functionality. The first avoids defining an initialize method, but instead creates a plain old function as a constructor
.A1 <- setClass("A1", representation(a="integer"))
.B1 <- setClass("B1", contains="A1", representation(b="integer"))
A1 <- function(a = integer(), ...) {
    .A1(a=a, ...)
}
B1 <- function(a=integer(), b=integer(), ...) {
    .B1(A1(a), b=b, ...)
}
These functions include ... as arguments, so that class "B1" can be extended and its constructor still used. This is actually quite attractive; the constructor can have a sensible signature with documented arguments. initialize can still be used as a copy constructor (remember, there is no initialize,A1-method or initialize,B1-method, so the call .A1() invokes the default initialize method, which supports copy construction). The call .B1(A1(a), b=b, ...) says "call the generator for class B1, with an unnamed argument creating its superclass using the "A1" constructor, and a named argument corresponding to slot b". As mentioned above, from ?initialize, the unnamed argument(s) are used to initialize superclass(es) (plural when the class structure involves multiple inheritance). The use of constructors means that classes A1 and B1 can be ignorant of each other's structure and implementation.
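As a minimal sketch of how these plain-function constructors behave (the values are made up for illustration), the following builds a "B1" object and then uses initialize as a copy constructor, relying only on the default initialize method:

```r
library(methods)

.A1 <- setClass("A1", representation(a="integer"))
.B1 <- setClass("B1", contains="A1", representation(b="integer"))

A1 <- function(a = integer(), ...) .A1(a=a, ...)
B1 <- function(a = integer(), b = integer(), ...) .B1(A1(a), b=b, ...)

b1 <- B1(a=1:3, b=3:1)               # superclass part built by A1(), slot b set by name
b1_copy <- initialize(b1, a=10:12)   # copy b1, replacing only slot a

b1_copy@a   # 10 11 12
b1_copy@b   # 3 2 1, carried over from b1
```

Because no initialize method is defined for either class, the copy keeps every slot that is not explicitly overridden.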
The second solution, less commonly used in its full glory, is to write an initialize method that retains copy construction, along the lines of
.A2 <- setClass("A2", representation(a1="integer", a2="integer"),
    prototype=prototype(a1=1:5, a2=5:1))
setMethod("initialize", "A2",
    function(.Object, ..., a1=.Object@a1, a2=.Object@a2)
{
    callNextMethod(.Object, ..., a1=a1, a2=a2)
})
The argument a1=.Object@a1 uses the current value of the a1 slot of .Object as a default, which is relevant when the method is being used as a copy constructor. The example also illustrates the use of a prototype to provide initial values different from 0-length vectors. In action:
> a <- .A2(a2=1:3)
> a
An object of class "A2"
Slot "a1":
[1] 1 2 3 4 5
Slot "a2":
[1] 1 2 3
> initialize(a, a1=-(1:3)) # GOOD: copy constructor
An object of class "A2"
Slot "a1":
[1] -1 -2 -3
Slot "a2":
[1] 1 2 3
Unfortunately this approach fails when trying to initialize a derived class from a base class.
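A sketch of that failure, assuming (as ?initialize describes) that the default initialize method applies unnamed superclass arguments first and named slots afterwards. The derived class "B2" is hypothetical and introduced only for this illustration:

```r
library(methods)

.A2 <- setClass("A2", representation(a1="integer", a2="integer"),
                prototype=prototype(a1=1:5, a2=5:1))
setMethod("initialize", "A2",
    function(.Object, ..., a1=.Object@a1, a2=.Object@a2) {
        callNextMethod(.Object, ..., a1=a1, a2=a2)
    })

# Hypothetical derived class; it inherits the initialize,A2-method.
.B2 <- setClass("B2", contains="A2", representation(b="integer"))

a <- .A2(a2=1:3)
b2 <- .B2(a, b=1L)   # try to build a "B2" from an existing "A2"

# The defaults a1=.Object@a1 and a2=.Object@a2 are taken from the "B2"
# prototype and passed on as named slots, so they override the values
# carried by `a`.
b2@a2   # expected: 5 4 3 2 1 (the prototype), not 1 2 3
```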
Other considerations
One final point is the structure of the initialize method itself. Illustrated above is the pattern
## do class initialization steps, then...
callNextMethod(<...>)
so callNextMethod() is at the end of the initialize method. An alternative is
.Object <- callNextMethod(<...>)
## do class initialization steps by modifying .Object, e.g.,...
.Object@a <- <...>
.Object
The reason to prefer the first approach is that there is less copying involved; the default initialize,ANY-method populates slots with a minimum of copying, whereas the slot update approach copies the entire object each time a slot is modified; this can be very bad if the object contains large vectors.
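The two patterns can be put side by side in a small sketch. The classes "P1" and "P2" and the derived slot n are hypothetical, chosen only to show where the work happens in each pattern:

```r
library(methods)

# Pattern 1: compute values first, then pass everything to the next method once.
setClass("P1", representation(a="integer", n="integer"))
setMethod("initialize", "P1", function(.Object, ..., a=.Object@a) {
    callNextMethod(.Object, ..., a=a, n=length(a))
})

# Pattern 2: let the next method build the object, then update slots afterwards.
setClass("P2", representation(a="integer", n="integer"))
setMethod("initialize", "P2", function(.Object, ...) {
    .Object <- callNextMethod(.Object, ...)
    .Object@n <- length(.Object@a)   # each slot assignment may copy the object
    .Object
})

p1 <- new("P1", a=1:4)
p2 <- new("P2", a=1:4)
identical(p1@n, p2@n)   # TRUE: both record n = 4L
```

Both give the same result here; the difference discussed above is only in how much copying happens along the way.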
In R, does integer type inherit from numeric type or not?
This is really to do with the difference between mode and class. When you use as.integer(4), you're explicitly creating an object of class integer. inherits checks class inheritance, and therefore inherits(as.integer(4), "numeric") returns FALSE because the object's class is not numeric.
as.integer(4) is still a numeric object (its mode is numeric), but R's inherits checks only classes, not modes, which may differ from what you expected in terms of inheritance. From the documentation of inherits:
inherits indicates whether its first argument inherits from any of the classes specified in the what argument. If which is TRUE then an integer vector of the same length as what is returned. Each element indicates the position in the class(x) matched by the element of what; zero indicates no match. If which is FALSE then TRUE is returned by inherits if any of the names in what match with any class.
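A short base R sketch of the distinction, including the which = TRUE form described in the documentation excerpt:

```r
x <- as.integer(4)

class(x)                 # "integer"
mode(x)                  # "numeric"
inherits(x, "integer")   # TRUE
inherits(x, "numeric")   # FALSE: "numeric" is not in class(x)
is.numeric(x)            # TRUE: integer vectors count as numeric here

# With which = TRUE, the result gives the position of each element of
# `what` within class(x), or 0 when there is no match.
pos <- inherits(x, c("numeric", "integer"), which = TRUE)
pos                      # 0 1
```

If the goal is "any kind of number", is.numeric(x) is usually the check to reach for rather than inherits(x, "numeric").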
Now, looking at the source code of checkmate::check_class, it is essentially doing the same thing (checking inheritance of class, not mode):
function (x, classes, ordered = FALSE, null.ok = FALSE)
{
qassert(classes, "S+")
qassert(ordered, "B1")
qassert(null.ok, "B1")
if (is.null(x) && null.ok)
return(TRUE)
ord = inherits(x, classes, TRUE)
w = wf(ord == 0L)
if (length(w) > 0L) {
cl = class(x)
return(sprintf("Must inherit from class '%s', but has class%s '%s'",
classes[w], if (length(cl) > 1L) "es" else "",
paste0(cl, collapse = "','")))
}
Why does the first work?
checkmate::check_class(4, "numeric") correctly returns TRUE because:
class(4)
[1] "numeric"
I think inheritance here is not used in the way you might have expected, because it seems to stop at "one level" (just the class and not the mode):
mode(as.integer(4))
[1] "numeric"
Can (should) I inherit parts of a function in R?
There is no inheritance of function parts in R. You cannot "inherit parts" of functions from other functions, only call functions from other functions. All OO paradigms in R (S3, S4, refClasses) are exactly what they say: object-oriented. Methods are dispatched according to the class of the objects they receive.
Your question is really how to get rid of code repetition.
There are two ways, one standard and one not so standard.
Standard way: Write functions for repeated code and call them from other functions. The drawback is that functions return only one object, but you have three. So you can do something like this:
repeated_code <- function(table, pattern){
    objects <- list()
    objects$dframe <- get(table)
    objects$cn <- colnames(objects$dframe)
    objects$qs <- subset(objects$cn, objects$cn %in% grep(pattern, objects$cn, value=TRUE))
    objects  # return all three objects as one list
}
firstfunc <- function(table,pattern="^Variable") {
    objects <- repeated_code(table, pattern)
    ...
    # manipulate objects
    ...
}
secondfunc <- function(table,pattern="^Variable") {
    objects <- repeated_code(table, pattern)
    ...
    # manipulate objects
    ...
}
Not so standard way: Use unevaluated expressions:
redundant_code <- expression({
    dframe <- get(table)
    cn <- colnames(dframe)
    qs <- subset(cn, cn %in% grep(pattern, cn, value=TRUE))
})
firstfunc <- function(table,pattern="^Variable") {
    # eval()'s default envir is the calling frame, i.e. this function's environment;
    # passing parent.frame() explicitly would instead target the caller of firstfunc
    eval(redundant_code)
    ...
}
secondfunc <- function(table,pattern="^Variable") {
    eval(redundant_code)
    ...
}
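A runnable sketch of the unevaluated-expression approach, using a made-up data frame. The names demo_df, shared_setup and matching_columns are hypothetical, and the data frame is passed directly rather than looked up by name. It relies on eval() evaluating in the calling function's environment when no envir is given:

```r
# Made-up data for illustration only.
demo_df <- data.frame(Variable1 = 1:3, Variable2 = 4:6, Other = 7:9)

# Shared code, stored unevaluated; it refers to `dframe` and `pattern`,
# which must exist wherever the expression is evaluated.
shared_setup <- expression({
    cn <- colnames(dframe)
    qs <- grep(pattern, cn, value = TRUE)
})

matching_columns <- function(dframe, pattern = "^Variable") {
    eval(shared_setup)   # creates cn and qs inside this function's environment
    qs
}

matching_columns(demo_df)   # "Variable1" "Variable2"
```

The cost of this style is that the shared code silently depends on variable names in the calling function, which is why it is the "not so standard" option.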
[Update: Since R 2.12.0 there is yet another, multi-assign way.
Write a function which returns the list of objects (as in the "standard" case above). Then assign the objects in the returned list to the current environment with list2env:
secondfunc <- function(table,pattern="^Variable") {
    objects <- repeated_code(table, pattern)
    list2env(objects, envir = environment())  # environment() is this function's own frame
    ...
}
]
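A self-contained sketch of the list2env variant. The data frame demo_df and the function count_matches are hypothetical, and the helper is written so that it returns its list explicitly:

```r
# Made-up data for illustration only.
demo_df <- data.frame(Variable1 = 1:3, Variable2 = 4:6, Other = 7:9)

# Helper that gathers the shared objects into one named list.
repeated_code <- function(table, pattern) {
    dframe <- get(table)               # look the data frame up by name
    cn <- colnames(dframe)
    list(dframe = dframe, cn = cn, qs = grep(pattern, cn, value = TRUE))
}

count_matches <- function(table, pattern = "^Variable") {
    # Unpack the list into this function's own environment, so that
    # dframe, cn and qs become ordinary local variables.
    list2env(repeated_code(table, pattern), envir = environment())
    length(qs)
}

count_matches("demo_df")   # 2
```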
How to make an S4 class inherit correctly from another S4 class?
Per nrussel's comment:
The argument contains of the function setClass deals with inheritance. You want the class Employee to inherit from the class Person (i.e. an employee is a special type of person). So
setClass("Person", slots = list(name="character", age="numeric"))
setClass("Employee", slots = list(boss="Person"), contains = "Person")
will do the trick.
> alice <- new("Person", name="Alice", age = 40)
> john <- new("Employee", name = "John", age = 20, boss= alice)
> john
An object of class "Employee"
Slot "boss":
An object of class "Person"
Slot "name":
[1] "Alice"
Slot "age":
[1] 40
Slot "name":
[1] "John"
Slot "age":
[1] 20
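To check that the inheritance took effect, one can test the relationship with is() and confirm that a method written for Person is also dispatched for an Employee. In this sketch the generic describe is hypothetical and exists only for the demonstration:

```r
library(methods)

setClass("Person", slots = list(name = "character", age = "numeric"))
setClass("Employee", slots = list(boss = "Person"), contains = "Person")

# Hypothetical generic with a method defined only for the base class.
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "Person", function(x) paste0(x@name, " (", x@age, ")"))

alice <- new("Person", name = "Alice", age = 40)
john <- new("Employee", name = "John", age = 20, boss = alice)

is(john, "Person")    # TRUE: an Employee is a Person
describe(john)        # "John (20)", via the inherited Person method
describe(john@boss)   # "Alice (40)"
```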
Function for columns to inherit custom class of data frame in R
We can use imap, with map_dfr inside:
library(purrr)
dd$dat <- imap(dd$dat, ~ {nm1 <- .y
map_dfr(append_classes(.x, nm1), ~ append_classes(.x, nm1))
})
class(dd$dat$one$a)
#[1] "numeric" "foo"
class(dd$dat$two$d)
#[1] "numeric" "bar"
Or this can be done in base R using Map/lapply:
dd$dat <- Map(function(x, y) {
tmp <- append_classes(x, y)
tmp[] <- lapply(tmp, append_classes, nm = y)
tmp} , dd$dat, names(dd$dat))
class(dd$dat$one$a)
#[1] "numeric" "foo"