How to Read Knitr/Rmd Cache in Interactive Session

How to read knitr/Rmd cache in interactive session?

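I think that running library("knitr"); knit("foo.Rmd") in the console/R session is the easiest way to do this, although it will rewrite foo.md, figures, etc. Because knit() evaluates chunks in the calling environment by default, the cached objects should then be available in your session. (Too busy/lazy to test it at the moment.)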

You could probably poke around in the cache directory and read the cached files directly, but that would be a lot more work/trickier.
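For the "poke around in the cache directory" route, here is a minimal sketch. It assumes the chunks were cached lazily (knitr's default), so the objects sit in .rdb/.rdx file pairs that base R's lazyLoad() can read. The helper name load_knitr_cache and the directory passed to it are hypothetical. Chunk order is not preserved, so an object defined in several chunks may end up with any of its cached values.

```r
# Load every lazy-load database found in a knitr cache directory.
# Assumption: the cache was written lazily, so each cached chunk has a
# <label>_<hash>.rdb / <label>_<hash>.rdx pair in `cache_dir`.
load_knitr_cache <- function(cache_dir, envir = globalenv()) {
  rdx <- list.files(cache_dir, pattern = "\\.rdx$", full.names = TRUE)
  bases <- sub("\\.rdx$", "", rdx)   # lazyLoad() wants the path without extension
  for (b in bases) {
    lazyLoad(b, envir = envir)       # objects are only read from disk when first used
  }
  invisible(basename(bases))         # names of the databases that were attached
}

# Example call (the path is an assumption; check where your cache actually lives):
# load_knitr_cache("cache")
```

Loading into a fresh environment (envir = new.env()) instead of the global one keeps the cached objects from overwriting anything in the current session.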

Possible to use knitr cache chunk in interactive rmarkdown doc?

I ran into the same problem: with runtime: shiny, the cache option was ignored.

Nowadays there is a workaround, using runtime: shiny_prerendered and context="data" with cache=TRUE:

---
title: "Cache test"
output: html_document
runtime: shiny_prerendered
---

```{r,context="data", cache=TRUE}
Sys.sleep(10)
```

This behaves as expected: on the first run, rendering takes 10 seconds; on all subsequent runs, the cached chunk is used.

How to elegantly + robustly cache external script in knitr rmd document?

There should be better approaches than the do-it-yourself caching you currently use. To start with, you could split external.R into chunks:

# ---- CreateRandomDFs----
df.rand1 <- data.frame(rnorm(n = 100), rnorm(n = 100))
df.rand2 <- data.frame(rnorm(n = 100), rnorm(n = 100))

# ---- CreateOtherObjects----

# stuff

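In main.Rmd, add read_chunk(path = 'external.R') in an uncached chunk (!). Then execute the chunks by referencing their labels (shown here in Rnw syntax; in an Rmd file, use empty R chunks with the same labels):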

<<CreateRandomDFs>>=
@
<<CreateOtherObjects>>=
@

If autodep doesn't work, add dependson to your chunks. A chunk that only uses df.rand1 and df.rand2 gets dependson = "CreateRandomDFs"; when other objects are also used, set dependson = c("CreateRandomDFs", "CreateOtherObjects").

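You may also invalidate a chunk's cache when a certain object changes: cache.whatever = quote(df.rand1). The option name is arbitrary; knitr takes all chunk options into account when deciding whether a cache is still valid, so a change in the object's value invalidates the chunk.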

This way, you avoid invalidating the whole cache with any change in external.R. How you split the code in that file into chunks is crucial: with too many chunks, you will have to list many dependencies; with too few, the cache is invalidated more often than necessary.

Forcing interactive session with knitr to add drop down list (GUI)

The question asks for a way to have the user interactively select an item from a list inside an RNW document (the same applies for other files that are knitted, like RMD):

%mydocument.Rnw

\documentclass{article}
\begin{document}
<<>>=
letterIndex <- menu(LETTERS, graphics = TRUE, title = "Select your favorite letter")
sprintf("My favorite letter is '%s'.", LETTERS[letterIndex])
@
\end{document}

This throws an error when the document is knitted using the "Compile PDF" button in RStudio, because menu needs an interactive R session but "Compile PDF" starts a new, non-interactive session to process the document.

Error in menu(LETTERS, graphics = TRUE, title = "Select your favorite letter"): menu() cannot be used non-interactively

To solve this issue, the "Compile PDF" button must be avoided. Instead, the document can be knitted by calling knit/knit2pdf from an interactive session. Note that this may have some unexpected side effects, since the document is then evaluated in the calling session rather than in a fresh one.

knit2pdf("mydocument.Rnw") works (which I didn't expect when writing that comment). The menu of choices pops up in the middle of the knitting process. Nevertheless, I would prefer a solution that separates knitting and user interaction (as suggested in the comment):

#control.R
library(knitr)  # provides knit2pdf()
letterIndex <- menu(LETTERS, graphics = TRUE, title = "Select your favorite letter")
knit2pdf("mydocument2.Rnw")

%mydocument2.Rnw

\documentclass{article}
\begin{document}
<<>>=
sprintf("My favorite letter is '%s'.", LETTERS[letterIndex])
@
\end{document}

Here, the user interaction takes place before the document is knitted. The result letterIndex is saved in the global environment and the knitting process reads it from there.

In both cases, instead of opening the RNW file and clicking "Compile PDF", the user now opens an R script containing knit2pdf (and possibly the menu call) and clicks "Source". This should not increase the difficulty level too much.
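If control.R might also be run non-interactively (for example via Rscript), the menu call can be guarded. The following is only a sketch: the helper name choose_letter and its default argument are hypothetical, and falling back to a fixed default is one possible policy, not something the original answer prescribes.

```r
# Ask the user for a letter when a session is interactive; otherwise fall
# back to a default index so that knitting does not fail.
choose_letter <- function(default = 1L, is_interactive = interactive()) {
  if (!is_interactive) {
    return(default)                  # menu() would throw an error here
  }
  idx <- menu(LETTERS, graphics = TRUE,
              title = "Select your favorite letter")
  if (idx == 0L) default else idx    # menu() returns 0 when the user cancels
}

# In control.R one would then write:
# letterIndex <- choose_letter()
# knitr::knit2pdf("mydocument2.Rnw")
```

The is_interactive argument exists only so the fallback branch can be exercised without an interactive session.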

knitr: keep cache when I make small change in chunk

Using the knitr cache to store the results of a week-long simulation sounds susceptible to disaster.

My suggestion for a safer workflow is:

  1. Run the simulation and store the results in a file (csv, rda, whatever is suitable).

  2. Load that data inside a chunk (probably with echo = FALSE) near the start of your knitr report.

Now simulating and reporting are decoupled.
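The two steps above can be sketched as follows. Everything here is illustrative: run_simulation is a hypothetical stand-in for the real week-long job, and the file name is an assumption. saveRDS()/readRDS() are used because a single object is stored; csv or save()/load() would work just as well.

```r
# --- simulate.R: run once, outside the report --------------------------
run_simulation <- function(n = 1000) {    # hypothetical placeholder
  set.seed(42)
  data.frame(id = seq_len(n), value = rnorm(n))
}

save_results <- function(path, ...) {
  res <- run_simulation(...)
  saveRDS(res, path)                      # one object per file
  invisible(path)
}

# --- inside the report: a chunk with echo = FALSE ----------------------
load_results <- function(path) {
  if (!file.exists(path)) {
    stop("Results file not found: ", path,
         "; run the simulation script first.")
  }
  readRDS(path)
}

# save_results("simulation_results.rds")          # in simulate.R
# sim <- load_results("simulation_results.rds")   # in the report
```

Failing loudly when the file is missing means the report can never silently trigger the long computation.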

knitr vs. interactive R behaviour

Thanks to Aleksey Vorona and Duncan Murdoch, this bug is now fixed in R-devel!

See: https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15411

In an R interactive session, how can I access the value of the most recent expression?

.Last.value does the trick:

> .Last.value
$help_type
NULL

> 5
[1] 5
> .Last.value
[1] 5
> iris; .Last.value

However, I am not sure one should ever use it. Just give things an explicit name. It takes fewer keystrokes to write

> (a <- 5)
[1] 5
> a
[1] 5

and then everybody can easily see what happens; if you come back to your script later and add a line, it will do no harm.

From the Zen of Python:

Explicit is better than implicit.

Simple is better than complex.

Readability counts.

Special cases aren't special enough to break the rules.

If the implementation is hard to explain, it's a bad idea.

Can I cache data loading in R?

Sort of. There are a few answers:

  1. Use a faster csv reader: fread() in the data.table package is beloved by many. Your time may come down to a second or two.

  2. Similarly, read the csv once and then write it in compact binary form via saveRDS(), so that next time you can call readRDS(), which is faster because the text does not have to be parsed again.

  3. Don't read the data but memory-map it via the mmap package. That is more involved but likely very fast. Databases use such a technique internally.

  4. Load on demand; the SOAR package, for example, is useful here.

Direct caching, however, is not possible.

Edit: Actually, direct caching "sort of" works if you save your data set with your R session at the end. Many of us advise against that, as clearly reproducible scripts which make the loading explicit are preferable in our view, but R can help via the load() / save() mechanism (which handles several objects at once, whereas saveRDS() / readRDS() work on a single object).
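Option 2 from the list can be wrapped in a small helper. This is a sketch under assumptions: the function name read_csv_cached and the ".rds" naming scheme are made up for illustration, and file modification times are used as a simple staleness check.

```r
# Read a csv file, keeping a binary copy next to it for faster reloads.
# The binary copy is reused as long as it is not older than the csv.
read_csv_cached <- function(csv_path,
                            rds_path = paste0(csv_path, ".rds"),
                            ...) {
  fresh <- file.exists(rds_path) &&
    file.mtime(rds_path) >= file.mtime(csv_path)
  if (fresh) {
    return(readRDS(rds_path))        # fast path: no text parsing
  }
  dat <- read.csv(csv_path, ...)     # slow path: parse the csv
  saveRDS(dat, rds_path)             # store for next time
  dat
}

# dat <- read_csv_cached("big_file.csv")   # hypothetical file name
```

Swapping read.csv() for data.table::fread() inside the helper would combine options 1 and 2.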


