Importing common YAML in rstudio/knitr document
I have found two options to do this portably (i.e. no .Rprofile customisation needed, minimal duplication of YAML frontmatter):
- You can provide common YAML to pandoc on the command line! D'oh!
- You can set the knit: property of the metadata to your own function to have greater control over what happens when you Ctrl+Shift+K.
Option 1: common YAML to command line.
Put all the common YAML in its own file, common.yaml:
---
author: me
date: "`r format (Sys.time(), format='%Y-%m-%d %H:%M:%S %z')`"
link-citations: true
reference-section-title: References
---
Note it's complete, i.e. the --- delimiter lines are needed.
Then in the document you can specify the YAML file as the last argument to pandoc, and it'll apply the YAML (see this github issue).
In example.rmd:
---
title: On the Culinary Preferences of Anthropomorphic Cats
output:
  html_document:
    pandoc_args: './common.yaml'
---
I do not like green eggs and ham. I do not like them, Sam I Am!
You could even put the html_document: stuff in an _output.yaml, since rmarkdown will take that and place it under output: for all the documents in that folder. In this way there can be no duplication of YAML between all documents using this frontmatter.
Pros:
- no duplication of YAML frontmatter.
- very clean
Cons:
- the common YAML is not passed through knit, so the date field above will not be parsed. You will get the literal string "r format(Sys.time(), format='%Y-%m-%d %H:%M:%S %z')" as your date.
- from the same github issue: "Metadata definitions seen first are kept and left unchanged, even if conflicting data is parsed at a later point."
Perhaps this could be a problem at some point depending on your setup.
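One possible workaround for the un-knitted date (my own suggestion, not something from the github issue) is to regenerate common.yaml from R just before rendering, so the date is already a plain string by the time pandoc sees it. A minimal sketch in base R; the helper name write_common_yaml is made up for illustration:

```r
# Hypothetical helper: writes a complete YAML block (including the ---
# delimiters) in which the date has already been evaluated by R.
write_common_yaml <- function(path = "common.yaml", author = "me") {
  stamp <- format(Sys.time(), format = "%Y-%m-%d %H:%M:%S %z")
  lines <- c(
    "---",
    paste0("author: ", author),
    paste0("date: \"", stamp, "\""),
    "link-citations: true",
    "reference-section-title: References",
    "---"
  )
  writeLines(lines, path)
  invisible(lines)
}

# Example: write to a temporary location and look at the result
tmp_yaml <- file.path(tempdir(), "common.yaml")
common_lines <- write_common_yaml(tmp_yaml)
print(common_lines)
```

You would have to run this before each knit, so it trades the "very clean" pro for a correct date.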
Option 2: override the knit command
This allows for much greater control, though it is a bit more cumbersome/tricky.
This link and this one mention an undocumented feature in rmarkdown: the knit: part of the YAML will be executed when one clicks the "Knit" button of RStudio.
In short:
- define a function myknit(inputFile, encoding) that reads the common YAML, puts it into the RMD and calls render on the result. Save it in its own file, myknit.r.
- in the YAML of example.rmd, add
knit: (function (...) { source('myknit.r'); myknit(...) })
It seems to have to be on one line. The reason for source('myknit.r') instead of just putting the function definition in the YAML is portability: if I modify myknit.r I don't have to modify every document's YAML. This way, the only common YAML that all documents must repeat in their frontmatter is the knit line; all other common YAML can stay in common.yaml.
Then Ctrl+Shift+K works as I would hope from within RStudio.
Further notes:
- myknit could just be a system call to make if I had a makefile set up.
- the injected YAML will be passed through rmarkdown and hence knitted, since it is injected before the call to render.
- Preview window: so long as myknit produces a (single) message Output created: path/to/file.html, the file will be shown in the preview window. I have found that there can be only one such message in the output (not multiple), or you get no preview window. So if you use render (which emits an "Output created: basename.extension" message) and the final produced file is actually elsewhere, you will need to suppress this message via either render(..., quiet=TRUE) or suppressMessages(render(...)) (the former suppresses knitr progress and pandoc output too), and create your own message with the correct path.
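The preview-window point could be sketched like this. It assumes rmarkdown is installed; the function name render_to and its render_fun argument are made up for illustration (the argument only exists so the renderer can be swapped out), and dest_dir is whatever folder you want the result to end up in.

```r
# Hypothetical wrapper: render quietly, copy the result to its final home,
# then emit exactly one "Output created:" message with the final path.
render_to <- function(input, dest_dir, render_fun = rmarkdown::render, ...) {
  # quiet = TRUE suppresses render's own "Output created:" message
  # (and the knitr progress / pandoc output as well)
  built <- render_fun(input, quiet = TRUE, ...)
  dest <- file.path(dest_dir, basename(built))
  file.copy(built, dest, overwrite = TRUE)
  message("Output created: ", dest)
  invisible(dest)
}
```

Inside myknit this would be called as render_to(ofile, dirname(inputFile), encoding = encoding) in place of the separate render and file.copy steps.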
Pros:
- the YAML frontmatter is knitted
- much more control than option 1 if you need to do custom pre- / post-processing.
Cons:
- a bit more effort than option 1
- the knit: line must be duplicated in each document (though by using source('./myknit.r') at least the function definition may be stored in one central location)
Here is the setup for posterity. For portability, you only need to carry around myknit.r and common.yaml. No .Rprofile or project-specific config needed.
example.rmd:
---
title: On the Culinary Preferences of Anthropomorphic Cats
knit: (function (...) { source('myknit.r'); myknit(...) })
---
I do not like green eggs and ham. I do not like them, Sam I Am!
common.yaml [for example]:
author: me
date: "`r format (Sys.time(), format='%Y-%m-%d %H:%M:%S %z')`"
link-citations: true
reference-section-title: References
myknit.r:
myknit <- function (inputFile, encoding, yaml='common.yaml') {
    # read in the YAML + src file
    yaml <- readLines(yaml)
    rmd <- readLines(inputFile)

    # insert the YAML in after the first ---
    # I'm assuming all my RMDs have properly-formed YAML and that the first
    # occurrence of --- starts the YAML. You could do proper validation if you wanted.
    yamlHeader <- grep('^---$', rmd)[1]
    # put the yaml in
    rmd <- append(rmd, yaml, after=yamlHeader)

    # write out to a temp file
    ofile <- file.path(tempdir(), basename(inputFile))
    writeLines(rmd, ofile)

    # render with rmarkdown; render() returns the path of the file it produced.
    message(ofile)
    ofile <- rmarkdown::render(ofile, encoding=encoding, envir=new.env())

    # copy the rendered file back next to the source document.
    file.copy(ofile, file.path(dirname(inputFile), basename(ofile)), overwrite=TRUE)
}
Pressing Ctrl+Shift+K/Knit from the editor of example.rmd will compile the result and show a preview. I know it is using common.yaml, because the result includes the date and author whereas example.rmd on its own does not have a date or author.
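If you would like to share one common.yaml between both options, the injection step has to cope with the --- lines that option 1 requires. A minimal sketch of that step on its own, in base R; inject_yaml is a made-up name, and this is a variation on the insertion done in myknit rather than the code I actually run:

```r
# Hypothetical helper: insert common YAML lines right after the opening ---
# of a document, whether or not the common YAML carries its own delimiters.
inject_yaml <- function(rmd, yaml) {
  # drop any --- delimiter lines from the common YAML
  yaml <- yaml[!grepl("^---\\s*$", yaml)]
  start <- grep("^---\\s*$", rmd)[1]
  if (is.na(start)) stop("no YAML header found in the document")
  append(rmd, yaml, after = start)
}

rmd <- c(
  "---",
  "title: On the Culinary Preferences of Anthropomorphic Cats",
  "---",
  "I do not like green eggs and ham."
)
common <- c("---", "author: me", "link-citations: true", "---")
merged <- inject_yaml(rmd, common)
print(merged)
```

The common fields land directly under the opening ---, ahead of the document's own title line.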
Programmatically add tags to yaml header during knitting R markdown file
To generate a valid YAML array, you could use the alternative syntax [ ], e.g.,
tags: ["`r paste(head(letters), collapse = '", "')`"]
which generates:
tags: ["a", "b", "c", "d", "e", "f"]
Note the hack collapse = '", "': since there already exists a pair of double quotes outside the R expression, you should only generate the part a", "b", "c", "d", "e", "f from the R expression.
-- solution copied from Yihui's explanation at blogdown#647
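To see why the hack works, the same string can be assembled in a plain R session. This is only an illustration of the string arithmetic; the variable names are arbitrary:

```r
tags <- head(letters)

# What the inline R expression contributes: no leading or trailing quote
inner <- paste(tags, collapse = '", "')

# What the YAML line looks like once the surrounding ["  and  "] are added
line <- paste0('tags: ["', inner, '"]')
cat(line, "\n")
```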
How to access yaml metadata from knitr
That is stored in rmarkdown::metadata as a list of the form list(title = ...).
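A small sketch of reading a field with a fallback. The helper get_meta is hypothetical, not part of rmarkdown; as far as I know rmarkdown::metadata is only filled in while a render is running, so the example passes a stand-in list of the same shape:

```r
# Hypothetical helper: look up a front-matter field, with a default for
# fields that are absent. `meta` defaults to rmarkdown::metadata and is
# only evaluated if no other list is supplied.
get_meta <- function(key, default = NULL, meta = rmarkdown::metadata) {
  value <- meta[[key]]
  if (is.null(value)) default else value
}

# Stand-in list, so the helper can be tried outside of a render
example_meta <- list(
  title  = "On the Culinary Preferences of Anthropomorphic Cats",
  author = "me"
)
doc_title <- get_meta("title", meta = example_meta)
doc_date  <- get_meta("date", default = "undated", meta = example_meta)
```

Inside a chunk of a document being rendered, get_meta("title") on its own should be enough.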
Strip YAML from child docs in knitr
In the meantime, maybe the following will work for you; it is kind of an ugly and inefficient workaround (I am new to knitr and am not a real programmer), but it achieves what I believe you are wanting to do.
I had written a function for a similar personal use that includes the following relevant bit; the original is in Spanish, so I've translated it some below:
extraction <- function(matter, escape = FALSE, ruta = ".", patron) {
  require(yaml)
  # Gather together directory of documents to be processed
  doc_list <- list.files(
    path = ruta,
    pattern = patron,
    full.names = TRUE
  )
  # Extract desired contents
  lapply(
    X = doc_list,
    FUN = function(i) {
      raw_contents <- readLines(con = i, encoding = "UTF-8")
      switch(
        EXPR = matter,
        # !YAML (e.g., HTML)
        "no_yaml" = {
          if (escape == FALSE) {
            paste(raw_contents, sep = "", collapse = "\n")
          } else if (escape == TRUE) {
            require(XML)
            to_be_escaped <- paste(raw_contents, sep = "", collapse = "\n")
            xmlTextNode(value = to_be_escaped)
          }
        },
        # YAML header and Rmd contents
        "rmd" = {
          yaml_pattern <- "[-]{3}|[.]{3}"
          limits_yaml <- grep(pattern = yaml_pattern, x = raw_contents)[1:2]
          indices_yaml <- seq(
            from = limits_yaml[1] + 1,
            to = limits_yaml[2] - 1
          )
          yaml <- mapply(
            FUN = function(i) {yaml.load(string = i)},
            raw_contents[indices_yaml],
            USE.NAMES = FALSE
          )
          indices_rmd <- seq(
            from = limits_yaml[2] + 1,
            to = length(x = raw_contents)
          )
          rmd <- paste(raw_contents[indices_rmd], sep = "", collapse = "\n")
          c(yaml, "contents" = rmd)
        },
        # Anything else (just in case)
        {
          stop("Matter not extractable")
        }
      )
    }
  )
}
Say my main Rmd document main.Rmd lives in my_directory and my child documents, 01-abstract.Rmd, 02-intro.Rmd, ..., 06-conclusion.Rmd, are housed in ./sections; note that for my amateur function it is best to have the child documents saved in the order they will be summoned into the main document (see below). I have my function extraction.R in ./assets. Here is the structure of my example directory:
.
+--assets
| +--extraction.R
+--sections
| +--01-abstract.Rmd
| +--02-intro.Rmd
| +--03-methods.Rmd
| +--04-results.Rmd
| +--05-discussion.Rmd
| +--06-conclusion.Rmd
+--stats
| +--analysis.R
+--main.Rmd
In main.Rmd I import my child documents from ./sections:
---
title: Main
author: me
date: Today
output:
  html_document
---
```{r, 'setup', include = FALSE}
knitr::opts_chunk$set(autodep = TRUE)
knitr::dep_auto()
```
```{r, 'import_children', cache = TRUE, include = FALSE}
source('./assets/extraction.R')
rmd <- extraction(
  matter = 'rmd',
  ruta = './sections',
  patron = "\\.Rmd$"  # list.files() expects a regular expression, not a glob
)
```
# Abstract
```{r, 'abstract', echo = FALSE, results = 'asis'}
cat(x = rmd[[1]][["contents"]], sep = "\n")
```
# Introduction
```{r, 'intro', echo = FALSE, results = 'asis'}
cat(x = rmd[[2]][["contents"]], sep = "\n")
```
# Methods
```{r, 'methods', echo = FALSE, results = 'asis'}
cat(x = rmd[[3]][["contents"]], sep = "\n")
```
# Results
```{r, 'results', echo = FALSE, results = 'asis'}
cat(x = rmd[[4]][["contents"]], sep = "\n")
```
# Discussion
```{r, 'discussion', echo = FALSE, results = 'asis'}
cat(x = rmd[[5]][["contents"]], sep = "\n")
```
# Conclusion
```{r, 'conclusion', echo = FALSE, results = 'asis'}
cat(x = rmd[[6]][["contents"]], sep = "\n")
```
# References
I then knit this document and only the contents of my child documents are incorporated thereinto, e.g.:
---
title: Main
author: me
date: Today
output:
  html_document
---
# Abstract
This is **Child Doc 1**, my abstract.
# Introduction
This is **Child Doc 2**, my introduction.
- Point 1
- Point 2
- Point *n*
# Methods
This is **Child Doc 3**, my "Methods" section.
| method 1 | method 2 | method *n* |
|---------------|---------------|----------------|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|
# Results
This is **Child Doc 4**, my "Results" section.
## Result 1
```{r}
library(knitr)
```
```{r, 'analysis', cache = FALSE}
source(file = '../stats/analysis.R')
```
# Discussion
This is **Child Doc 5**, where the results are discussed.
# Conclusion
This is **Child Doc 6**, where I state my conclusions.
# References
The foregoing document is the knitted version of main.Rmd, i.e., main.md. Note under ## Result 1 that in my child document, 04-results.Rmd, I sourced an external R script, ./stats/analysis.R, which is now incorporated as a new knitr chunk in my knitted document; consequently, I now need to knit the document again.
When child documents also include chunks, instead of knitting into .md I would knit the main document into another .Rmd as many times as I have chunks nested, e.g., continuing the example above:
- Using knit(input = './main.Rmd', output = './main_2.Rmd'), instead of knitting main.Rmd into main.md, I would knit it into another .Rmd so as to be able to knit the resulting file containing the newly imported chunks, e.g., my R script analysis.R above.
- I can now knit my main_2.Rmd into main.md or render it as main.html via rmarkdown::render(input = './main_2.Rmd', output_file = './main.html').
Note: in the example above of main.md, the path to my R script is ../stats/analysis.R. This is the path relative to the child document that sourced it, ./sections/04-results.Rmd. Once I import the child document into the main document located at the root of my_directory, i.e., ./main.md or ./main_2.Rmd, the path becomes wrong; I therefore must correct it manually to ./stats/analysis.R before the next knit.
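The manual correction could probably be scripted. A minimal sketch, assuming every wrong path starts with the literal prefix ../stats/; fix_child_paths is a made-up name:

```r
# Hypothetical helper: rewrite paths that were relative to ./sections so
# that they are relative to the root of my_directory instead.
fix_child_paths <- function(lines, from = "../stats/", to = "./stats/") {
  # fixed = TRUE treats `from` as a literal string, not a regular expression
  gsub(from, to, lines, fixed = TRUE)
}

knitted <- c("## Result 1", "source(file = '../stats/analysis.R')")
fixed <- fix_child_paths(knitted)
print(fixed)
```

For the real file, something like writeLines(fix_child_paths(readLines('./main_2.Rmd')), './main_2.Rmd') between the two knits would apply it.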
I mentioned above that it is best to have the child documents saved in the same order that they are imported into the main document. This is because my simple function extraction() simply stores the contents of all the files specified to it in an unnamed list, hence I must access each file in main.Rmd by number, i.e., rmd[[5]][["contents"]] refers to the child document ./sections/05-discussion.Rmd; consider:
> str(rmd)
List of 6
$ :List of 4
..$ title : chr "child doc 1"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 1**, my abstract."
$ :List of 4
..$ title : chr "child doc 2"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 2**, my introduction.\n\n- Point 1\n- Point 2\n- Point *n*"
$ :List of 4
..$ title : chr "child doc 3"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 3**, my \"Methods\" section.\n\n| method 1 | method 2 | method *n* |\n|--------------|--------------|----"| __truncated__
$ :List of 4
..$ title : chr "child doc 4"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 4**, my \"Results\" section.\n\n## Result 1\n\n```{r}\nlibrary(knitr)\n```\n\n```{r, cache = FALSE}\nsour"| __truncated__
$ :List of 4
..$ title : chr "child doc 5"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 5**, where the results are discussed."
$ :List of 4
..$ title : chr "child doc 6"
..$ layout : chr "default"
..$ etc : chr "etc"
..$ contents: chr "\nThis is **Child Doc 6**, where I state my conclusions."
So, extraction() here is actually storing both the R Markdown contents of the specified child documents, as well as their YAML, in case you had a use for this as well (I myself do).
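If accessing the children by number feels fragile, the list could be named after the files. A sketch under the assumption that you have the same file vector that list.files() returned inside extraction() (calling list.files() again with the same arguments should give it); name_by_file is a made-up helper and the two-element list below is stand-in data:

```r
# Hypothetical helper: name each element after its file, minus the
# extension and any leading numeric prefix such as "05-".
name_by_file <- function(contents, files) {
  stems <- tools::file_path_sans_ext(basename(files))
  names(contents) <- sub("^[0-9]+-", "", stems)
  contents
}

files <- c("./sections/01-abstract.Rmd", "./sections/05-discussion.Rmd")
rmd_demo <- list(
  list(contents = "abstract text"),
  list(contents = "discussion text")
)
rmd_demo <- name_by_file(rmd_demo, files)
discussion <- rmd_demo[["discussion"]][["contents"]]
```

With that, rmd[["discussion"]][["contents"]] would replace rmd[[5]][["contents"]] in main.Rmd.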
How can I modify yaml instructions outside of the document I am rendering
You could cat a sink into a tempfile.
xxx <- "
#' # Title
Hello world

#+ one_plus_one
1 + 1
"

tmp <- tempfile()
sink(tmp)
cat("
---
title: 'Sample Document'
output:
  html_document:
    toc: true
    theme: united
  pdf_document:
    toc: true
    highlight: zenburn
---", xxx)
sink()
w.d <- getwd()
rmarkdown::render(tmp, output_file=paste(w.d, "myfile", sep="/"))
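A variation without sink, in case redirecting console output feels heavy-handed: build the header as a character vector and write it with writeLines. This is only a sketch; the body text is a placeholder, and the render call is left commented out because it needs rmarkdown and pandoc.

```r
# YAML header built outside of the document; the indentation inside the
# strings matters, since it is what nests toc/theme under html_document.
header <- c(
  "---",
  "title: 'Sample Document'",
  "output:",
  "  html_document:",
  "    toc: true",
  "    theme: united",
  "---"
)
body <- c("", "# Title", "", "Hello world")

# Give the temporary file an .Rmd extension so it is treated as R Markdown
tmp_rmd <- tempfile(fileext = ".Rmd")
writeLines(c(header, body), tmp_rmd)

# rmarkdown::render(tmp_rmd, output_file = file.path(getwd(), "myfile.html"))
```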