How to see the source code of R .Internal or .Primitive function?
The R source code of pnorm is:
function (q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
.Call(C_pnorm, q, mean, sd, lower.tail, log.p)
So, technically speaking, typing "pnorm" does show you the source code. More usefully, though: the guts of pnorm are coded in C, so the advice in the previous question, "view source code in R", is only peripherally useful (most of it concentrates on functions hidden in namespaces etc.).
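A quick way to confirm this from the console, as a minimal sketch assuming a reasonably recent R in which pnorm is still a thin wrapper around .Call (the variable names are only illustrative):

```r
# pnorm is an ordinary R closure, not a primitive
f <- stats::pnorm
is.primitive(f)

# Collect every name used in the body; finding ".Call" there
# means the real work happens in compiled code
calls_in_body <- all.names(body(f))
".Call" %in% calls_in_body
```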
Uwe Ligges's article in R news, Accessing the Sources (p. 43), is a good general reference. From that document:
When looking at R source code, sometimes calls to one of the following functions show up: .C(), .Call(), .Fortran(), .External(), or .Internal() and .Primitive(). These functions are calling entry points in compiled code such as shared objects, static libraries or dynamic link libraries. Therefore, it is necessary to look into the sources of the compiled code, if complete understanding of the code is required.
...
The first step is to look up the entry point in file ‘$R_HOME/src/main/names.c’, if the calling R function is either .Primitive() or .Internal(). This is done in the following example for the code implementing the ‘simple’ R function sum().
(Emphasis added because the precise function you asked about (sum) is covered in Ligges's article.)
Depending on how seriously you want to dig into the code, it may be worth downloading and unpacking the source code as Ligges suggests (for example, then you can use command-line tools such as grep to search through the source code). For more casual inspection, you can view the sources online via the R Subversion server or Winston Chang's GitHub mirror (links here are specifically to src/nmath/pnorm.c). (Guessing the right place to look, src/nmath/pnorm.c, takes some familiarity with the structure of the R source code.)
mean and sum are both implemented in summary.c.
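The two functions reach that C file by different routes, which the following sketch makes visible. It assumes a standard R installation where mean.default ends in an .Internal call; the variable name is illustrative.

```r
# sum is a primitive: it has no R-level body at all
is.primitive(sum)

# mean is an S3 generic written in R; the default method is what
# hands over to C through .Internal
is.primitive(mean)
uses_internal <- ".Internal" %in% all.names(body(base::mean.default))
uses_internal
```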
Studying the source code of primitive and internal R functions: How is R connected with C?
Great question. I started R under gdb (R -d gdb), set a breakpoint at do_isna, then continued R and entered is.na(3).
$ R -d gdb
(gdb) run
Starting program: /home/mtmorgan/bin/R-3-3-branch/bin/exec/R --no-save --no-restore --silent
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> ## break, cntrl-C
Program received signal SIGINT, Interrupt.
0x00007ffff722fd83 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) b do_isna
Breakpoint 1 at 0x7ffff77e0b3b: file /home/mtmorgan/src/R-3-3-branch/src/main/coerce.c, line 1982.
(gdb) continue
Continuing.
> is.na(3)
Breakpoint 1, do_isna (call=0x1838888, op=0x628218, args=0x1838770, rho=0x63f648)
at /home/mtmorgan/src/R-3-3-branch/src/main/coerce.c:1982
1982 checkArity(op, args);
(gdb)
At the gdb prompt I asked
(gdb) where
#0 do_isna (call=0x1838888, op=0x628218, args=0x1838770, rho=0x63f648) at /home/mtmorgan/src/R-3-3-branch/src/main/coerce.c:1982
#1 0x00007ffff7869170 in Rf_eval (e=0x1838888, rho=0x63f648) at /home/mtmorgan/src/R-3-3-branch/src/main/eval.c:717
#2 0x00007ffff78b36af in Rf_ReplIteration (rho=0x63f648, savestack=0, browselevel=0, state=0x7fffffffcaf0) at /home/mtmorgan/src/R-3-3-branch/src/main/main.c:258
...
Starting from #2, Rf_ReplIteration is the REPL (read-eval-print loop) trying to evaluate is.na(3). It's provided with the environment where the function is being called from. By the time it calls Rf_eval() on line 258, it knows the environment and the call
(gdb) call Rf_PrintValue(rho)
<environment: R_GlobalEnv>
(gdb) call Rf_PrintValue(thisExpr)
is.na(3)
By #1 (eval.c:717), R has figured out the values of op and tmp.
(gdb) call Rf_PrintValue(op)
function (x) .Primitive("is.na")
(gdb) call TYPEOF(op)
$2 = 8
(type 8 is 'BUILTINSXP', from the table in Rinternals.h). It does this by finding out that e is a LANGSXP (line 614), that is.na is a SYMSXP (line 670), and that the function it references (op) is a BUILTINSXP (line 700). It then uses (line 717)
(gdb) call PRIMFUN(op)
$8 = (SEXP (*)(SEXP, SEXP, SEXP, SEXP)) 0x7ffff77e0b20 <do_isna>
to discover that it should invoke do_isna with the values it's discovered.
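The same three type tags can be seen from the R side without a debugger. This is only a small sketch of the correspondence: typeof() reports the R-level name of each SEXP type that the evaluator inspected above.

```r
e <- quote(is.na(3))

typeof(e)        # the unevaluated call, a LANGSXP
typeof(e[[1]])   # the function name inside the call, a SYMSXP
typeof(is.na)    # the function the symbol resolves to, a BUILTINSXP
```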
Hopefully that removes some of the mystery, and points to relevant parts of the code.
How can I view the source code for a function?
UseMethod("t") is telling you that t() is a (S3) generic function that has methods for different object classes.
The S3 method dispatch system
For S3 classes, you can use the methods function to list the methods for a particular generic function or class.
> methods(t)
[1] t.data.frame t.default t.ts*
Non-visible functions are asterisked
> methods(class="ts")
[1] aggregate.ts as.data.frame.ts cbind.ts* cycle.ts*
[5] diffinv.ts* diff.ts kernapply.ts* lines.ts
[9] monthplot.ts* na.omit.ts* Ops.ts* plot.ts
[13] print.ts time.ts* [<-.ts* [.ts*
[17] t.ts* window<-.ts* window.ts*
Non-visible functions are asterisked
"Non-visible functions are asterisked" means the function is not exported from its package's namespace. You can still view its source code via the ::: operator (e.g. stats:::t.ts), or by using getAnywhere(). getAnywhere() is useful because you don't have to know which package the function came from.
> getAnywhere(t.ts)
A single object matching ‘t.ts’ was found
It was found in the following places
registered S3 method for t from namespace stats
namespace:stats
with value
function (x)
{
cl <- oldClass(x)
other <- !(cl %in% c("ts", "mts"))
class(x) <- if (any(other))
cl[other]
attr(x, "tsp") <- NULL
t(x)
}
<bytecode: 0x294e410>
<environment: namespace:stats>
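When the generic and the class are already known, utils::getS3method() is another way to fetch the method object directly. A minimal sketch, assuming the stats namespace still registers t.ts as in the output above:

```r
# Look the method up by generic name and class
m <- utils::getS3method("t", "ts")

# It is the same object that ::: returns, and it lives in stats
identical(m, stats:::t.ts)
environmentName(environment(m))
```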
The S4 method dispatch system
The S4 system is a newer method dispatch system and is an alternative to the S3 system. Here is an example of an S4 function:
> library(Matrix)
Loading required package: lattice
> chol2inv
standardGeneric for "chol2inv" defined from package "base"
function (x, ...)
standardGeneric("chol2inv")
<bytecode: 0x000000000eafd790>
<environment: 0x000000000eb06f10>
Methods may be defined for arguments: x
Use showMethods("chol2inv") for currently available ones.
The output already offers a lot of information. standardGeneric is an indicator of an S4 function. The output also helpfully suggests how to list the defined S4 methods:
> showMethods(chol2inv)
Function: chol2inv (package base)
x="ANY"
x="CHMfactor"
x="denseMatrix"
x="diagonalMatrix"
x="dtrMatrix"
x="sparseMatrix"
getMethod can be used to see the source code of one of the methods:
> getMethod("chol2inv", "diagonalMatrix")
Method Definition:
function (x, ...)
{
chk.s(...)
tcrossprod(solve(x))
}
<bytecode: 0x000000000ea2cc70>
<environment: namespace:Matrix>
Signatures:
x
target "diagonalMatrix"
defined "diagonalMatrix"
Some generics also have methods with more complex signatures, for example
require(raster)
showMethods(extract)
Function: extract (package raster)
x="Raster", y="data.frame"
x="Raster", y="Extent"
x="Raster", y="matrix"
x="Raster", y="SpatialLines"
x="Raster", y="SpatialPoints"
x="Raster", y="SpatialPolygons"
x="Raster", y="vector"
To see the source code for one of these methods the entire signature must be supplied, e.g.
getMethod("extract" , signature = c( x = "Raster" , y = "SpatialPolygons") )
It will not suffice to supply the partial signature
getMethod("extract",signature="SpatialPolygons")
#Error in getMethod("extract", signature = "SpatialPolygons") :
# No method found for function "extract" and signature SpatialPolygons
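The same behaviour can be reproduced without installing raster. The sketch below defines a hypothetical two-argument generic (pick) and two hypothetical classes (Grid and Region), all invented for this illustration, and shows that getMethod() matches a partial signature against the first argument only.

```r
library(methods)

# Hypothetical classes and generic, defined only for this illustration
setClass("Grid", representation(n = "numeric"))
setClass("Region", representation(id = "character"))
setGeneric("pick", function(x, y, ...) standardGeneric("pick"))
setMethod("pick", signature(x = "Grid", y = "Region"),
          function(x, y, ...) paste("region", y@id, "from grid of size", x@n))

# The full signature finds the method
full <- getMethod("pick", signature = c(x = "Grid", y = "Region"))

# A partial signature is read as x = "Region", y = "ANY", so the lookup fails;
# the error message is captured instead of stopping the script
partial <- tryCatch(getMethod("pick", signature = "Region"),
                    error = function(e) conditionMessage(e))
```

If inherited methods should be considered as well, selectMethod() is the function to reach for; getMethod() only returns methods defined for exactly the given signature.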
Functions that call unexported functions
In the case of ts.union, .cbindts and .makeNamesTs are unexported functions from the stats namespace. You can view the source code of unexported functions by using the ::: operator or getAnywhere.
> stats:::.makeNamesTs
function (...)
{
l <- as.list(substitute(list(...)))[-1L]
nm <- names(l)
fixup <- if (is.null(nm))
seq_along(l)
else nm == ""
dep <- sapply(l[fixup], function(x) deparse(x)[1L])
if (is.null(nm))
return(dep)
if (any(fixup))
nm[fixup] <- dep
nm
}
<bytecode: 0x38140d0>
<environment: namespace:stats>
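To check programmatically whether a name is exported before reaching for :::, something like the following can be used. It is a sketch: is_exported is a hypothetical helper, and it assumes .makeNamesTs still exists in the stats namespace of the R version at hand.

```r
# Hypothetical helper: is `name` part of the package's exported interface?
is_exported <- function(name, pkg) name %in% getNamespaceExports(pkg)

is_exported("ts.union", "stats")
is_exported(".makeNamesTs", "stats")

# getFromNamespace() retrieves the object whether or not it is exported
hidden <- utils::getFromNamespace(".makeNamesTs", "stats")
```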
Functions that call compiled code
Note that "compiled" does not refer to byte-compiled R code as created by the compiler package. The <bytecode: 0x294e410> line in the above output indicates that the function is byte-compiled, and you can still view the source from the R command line.
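A small sketch of that point, using a hypothetical toy function (add_one): byte-compiling it with compiler::cmpfun() leaves the R-level body intact and readable.

```r
# Hypothetical toy function, used only for illustration
add_one <- function(x) x + 1

# Byte-compile it explicitly
compiled <- compiler::cmpfun(add_one)

# The compiled copy still carries its R-level body
body_text <- deparse(body(compiled))
body_text
```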
Functions that call .C, .Call, .Fortran, .External, .Internal, or .Primitive are calling entry points in compiled code, so you will have to look at sources of the compiled code if you want to fully understand the function. This GitHub mirror of the R source code is a decent place to start. The function pryr::show_c_source can be a useful tool as it will take you directly to a GitHub page for .Internal and .Primitive calls. Packages may use .C, .Call, .Fortran, and .External; but not .Internal or .Primitive, because these are used to call functions built into the R interpreter.
Calls to some of the above functions may use an object instead of a character string to reference the compiled function. In those cases, the object is of class "NativeSymbolInfo", "RegisteredNativeSymbol", or "NativeSymbol"; and printing the object yields useful information. For example, optim calls .External2(C_optimhess, res$par, fn1, gr1, con) (note that's C_optimhess, not "C_optimhess"). optim is in the stats package, so you can type stats:::C_optimhess to see information about the compiled function being called.
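As a sketch of what that inspection looks like, assuming the installed R still registers the routine under the name C_optimhess:

```r
# The object referenced inside optim's body
sym <- stats:::C_optimhess

class(sym)   # includes "NativeSymbolInfo"
sym$name     # name of the C routine as registered in the stats DLL

# optim's body confirms which interface function is used to call it
uses_external2 <- ".External2" %in% all.names(body(stats::optim))
```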
Compiled code in a package
If you want to view compiled code in a package, you will need to download/unpack the package source. The installed binaries are not sufficient. A package's source code is available from the same CRAN (or CRAN compatible) repository that the package was originally installed from. The download.packages() function can get the package source for you.
download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")
This will download the source version of the Matrix package and save the corresponding .tar.gz file in the current directory. Source code for compiled functions can be found in the src directory of the uncompressed and untarred file. The uncompressing and untarring step can be done outside of R, or from within R using the untar() function. It is possible to combine the download and expansion step into a single call (note that only one package at a time can be downloaded and unpacked in this way):
untar(download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")[,2])
Alternatively, if the package development is hosted publicly (e.g. via GitHub, R-Forge, or RForge.net), you can probably browse the source code online.
Compiled code in a base package
Certain packages are considered "base" packages. These packages ship with R and their version is locked to the version of R. Examples include base, compiler, stats, and utils. As such, they are not available as separate downloadable packages on CRAN as described above. Rather, they are part of the R source tree in individual package directories under /src/library/. How to access the R source is described in the next section.
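To see which installed packages count as "base" and to verify the version lock, a short sketch (the variable name is illustrative):

```r
# Packages whose DESCRIPTION declares Priority: base
base_pkgs <- rownames(installed.packages(priority = "base"))
base_pkgs

# A base package carries the same version number as R itself
packageVersion("stats") == getRversion()
```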
Compiled code built into the R interpreter
If you want to view the code built into the R interpreter, you will need to download/unpack the R sources; or you can view the sources online via the R Subversion repository or Winston Chang's GitHub mirror.
Uwe Ligges's R news article (PDF) (p. 43) is a good general reference of how to view the source code for .Internal and .Primitive functions. The basic steps are to first look for the function name in src/main/names.c and then search for the "C-entry" name in the files in src/main/*.
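The first of those steps can be sketched in R. find_c_entry below is a hypothetical helper, and the two table rows written to the temporary file are illustrative lines modeled on the layout of names.c rather than a copy of the real file; with an unpacked R source tree you would pass the path to the real src/main/names.c instead.

```r
# Hypothetical helper: return the C entry name that a names.c-style
# table maps an R function name to
find_c_entry <- function(fun_name, names_c) {
  rows <- trimws(readLines(names_c, warn = FALSE))
  # Rows start with {"<R name>", followed by the C entry point
  prefix <- paste0("{\"", fun_name, "\",")
  hits <- rows[startsWith(rows, prefix)]
  fields <- strsplit(hits, ",", fixed = TRUE)
  # The second comma-separated field is the C function
  unique(trimws(vapply(fields, function(f) f[2], character(1))))
}

# Illustrative stand-in for names.c (assumed row layout)
demo <- tempfile(fileext = ".c")
writeLines(c(
  '{"sum",\t\tdo_summary,\t0,\t1,\t-1,\t{PP_FUNCALL, PREC_FN,\t0}},',
  '{"is.na",\tdo_isna,\t0,\t1,\t1,\t{PP_FUNCALL, PREC_FN,\t0}},'
), demo)

find_c_entry("sum", demo)
find_c_entry("is.na", demo)
```

The returned name is then what you search for in the files under src/main/, for example with grep.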
How to get the C/C++ source code of a secondary function of R?
The searchable R source code at https://github.com/wch/r-source is really useful for this:
- First we can look for the read.table definition. The actual data reading is done by the scan function, which in the end uses .Internal(scan(file, what, nmax, sep, dec, quote, skip, nlines, [...]
- Now scan is mapped to do_scan
So here you are: The underlying C implementation for read.table can be found in src/main/scan.c, starting with the function do_scan.
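The first two links of that chain can also be checked from the console without browsing GitHub. A minimal sketch, assuming the current base and utils implementations (the variable names are illustrative):

```r
# read.table is plain R code that calls scan() somewhere in its body
calls_scan <- "scan" %in% all.names(body(utils::read.table))

# scan in turn hands over to C through .Internal
scan_is_internal <- ".Internal" %in% all.names(body(base::scan))

c(calls_scan, scan_is_internal)
```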
view source code in R
In response to "Non-visible functions are asterisked": this means that the actual functions that are dispatched on ts or default objects, respectively, are in the tseries namespace but not exported. So just type tseries:::portfolio.optim.default and you see the function code once you specify the full path including the namespace.
Whether R downloads source or a binary depends on your operating system. In either event, source for the tseries package is available. Reading source code written by experienced coders is a good way to learn.
Is it possible to see source code of a value of function
The easiest way to find out how functions work is by looking at the source. There is a good chance that typing the function name in the R console will print the function definition (although not always with good layout, so looking up the source file, where brackets are present, is a viable option).
In your case, you have a function dtw from the package of the same name. This function uses a function called globalCostMatrix. If you type that name into R, you will get an error that the object was not found. This happens because the function was not exported when the package was created, probably because the author thinks this is not something a regular user would use (but not see!) or to prevent clashes with other packages that may use the same function name.
However, for an interested reader, there are at least two ways to access the code in this function. One is by going to CRAN, downloading the source tarball and finding the function in the R folder of the tarball. The other, easier one is by using the getAnywhere function. This will give you the definition of the function just like you're used to for other, user-accessible functions like dtw.
> library(dtw)
> getAnywhere("globalCostMatrix")
A single object matching ‘globalCostMatrix’ was found
It was found in the following places
namespace:dtw
with value
function (lm, step.matrix = symmetric1, window.function = noWindow,
native = TRUE, seed = NULL, ...)
{
if (!is.stepPattern(step.matrix))
stop("step.matrix is no stepMatrix object")
n <- nrow(lm)
... omitted for brevity
R: Further understanding on the .Internal function
You can view the C source code with this:
pryr::show_c_source(.Internal(mean()))
From @Dominic Comtois's post here, "the show_c_source function will search on GitHub for the relevant piece of code in the C source files. Works for .Internal and .Primitive functions."