Title: | Formal Parser and Related Tools for R Markdown Documents |
---|---|
Description: | An implementation of a formal grammar and parser for R Markdown documents using the Boost Spirit X3 library. It also includes a collection of high level functions for working with the resulting abstract syntax tree. |
Authors: | Colin Rundel [aut, cre] |
Maintainer: | Colin Rundel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.2.9000 |
Built: | 2024-11-02 03:57:43 UTC |
Source: | https://github.com/statnmap/parsermd |
Currently only supports conversion of rmd_tibble objects back to rmd_ast.
as_ast(x, ...)
x | Object to convert |
... | Unused, for extensibility. |
Returns an rmd_ast object.
parse_rmd(system.file("hw01.Rmd", package="parsermd")) %>% as_tibble() %>% as_ast()
Convert an rmd_ast, rmd_tibble, or any ast node into text.
as_document(x, padding = "", collapse = NULL, ...)
x | |
padding | Padding to add between nodes when assembling the text. |
collapse | If not |
... | Unused, for extensibility. |
Returns a character vector.
Helper functions for obtaining or changing chunk options within an rmd object.
rmd_set_options(x, ...)
rmd_get_options(x, ..., defaults = list())
x | An |
... | Either a collection of named values for the setter or character values of the option names for the getter. |
defaults | A named list of default values for the options. |
rmd_set_options returns the modified version of the original object.
rmd_get_options returns a list of the requested options (or all options if none are specified). Non-chunk nodes return NULL.
rmd = parse_rmd(system.file("minimal.Rmd", package = "parsermd"))
str(rmd_get_options(rmd))
str(rmd_get_options(rmd, "include"))
rmd_set_options(rmd, include = TRUE)
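The role of defaults in the getter can be pictured with plain lists. The sketch below is not parsermd code: chunk_opts and get_opts_sketch() are hypothetical names, and it assumes that defaults only fill in options a chunk has not set itself.

```r
# Hypothetical options for a single chunk, stored as a named list.
chunk_opts <- list(echo = FALSE, message = FALSE)

# Assumed behaviour: defaults fill in options the chunk has not set,
# then only the requested names are returned (all options if none are given).
get_opts_sketch <- function(opts, ..., defaults = list()) {
  merged <- utils::modifyList(defaults, opts)
  requested <- c(...)
  if (length(requested) == 0) merged else merged[requested]
}

res <- get_opts_sketch(chunk_opts, "echo", "include",
                       defaults = list(include = TRUE))
str(res)
```

Here echo comes from the chunk and include comes from the defaults, which is the kind of result the defaults argument is meant to make possible.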
Documents are parsed into an rmd_ast object.
parse_rmd(rmd, allow_incomplete = FALSE, parse_yaml = TRUE)
rmd | Either the path to an |
allow_incomplete | Allow incomplete parsing of the document. |
parse_yaml | Use the yaml package to parse the document's yaml. |
Returns an rmd_ast object.
parse_rmd(system.file("hw01.Rmd", package="parsermd"))
Render parsermd objects using rmarkdown::render().
Object contents are converted to a character vector and written to a temporary directory before rendering.
Note that this function has the potential to overwrite existing output files (e.g. .html, .pdf, etc).
render(x, name = NULL, ...)
x | Object to render, e.g. a |
name | Name of the output file, if not given it will be inferred from the name of |
... | Any additional arguments to be passed to |
Returns the results of rmarkdown::render().
This function compares the provided Rmd against a template and reports on discrepancies (e.g. missing or unmodified components).
rmd_check_template(rmd, template, ...)
rmd | The rmd to be checked, can be an |
template | |
... | Unused, for extensibility. |
Invisibly returns TRUE if the rmd matches the template, FALSE otherwise.
tmpl = parse_rmd(system.file("hw01.Rmd", package = "parsermd")) %>%
  rmd_select(by_section(c("Exercise *", "Solution"))) %>%
  rmd_template(keep_content = TRUE)
rmd_check_template(
  system.file("hw01-student.Rmd", package = "parsermd"),
  tmpl
)
Functions for extracting information from Rmd nodes.
rmd_node_label(x, ...)
rmd_node_type(x, ...)
rmd_node_length(x, ...)
rmd_node_content(x, ...)
rmd_node_attr(x, attr, ...)
rmd_node_engine(x, ...)
rmd_node_options(x, ...)
rmd_node_code(x, ...)
x | An rmd object, e.g. |
... | Unused, for extensibility. |
attr | Attribute name to extract. |
rmd_node_label() - returns a character vector of node labels, nodes without labels return NA.
rmd_node_type() - returns a character vector of node types.
rmd_node_length() - returns an integer vector of node lengths (i.e. lines of code, lines of text, etc.), nodes without a length return NA.
rmd_node_content() - returns a character vector of node textual content, nodes without content return NA.
rmd_node_attr() - returns a list of node attribute values.
rmd_node_engine() - returns a character vector of chunk engines, NA for all other node types.
rmd_node_options() - returns a list of chunk node options (named list), NULL for all other node types.
rmd_node_code() - returns a list of chunk node code (character vector), NULL for all other node types.
rmd = parse_rmd(system.file("hw01.Rmd", package="parsermd"))
rmd_node_label(rmd)
rmd_node_type(rmd)
rmd_node_content(rmd)
rmd_node_attr(rmd, "level")
rmd_node_engine(rmd)
rmd_node_options(rmd)
rmd_node_code(rmd)
Uses the section headings of an rmd object to identify the hierarchical structure of the document.
rmd_node_sections(x, levels = 1:6, drop_na = FALSE)
x | An rmd object, e.g. |
levels | Limit which section heading levels to return. |
drop_na | Should |
A list of section names for each node.
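One way to picture how heading levels translate into a section path for every node is the base R sketch below. It is an illustration only, not the package's implementation: the node vectors and section_paths() are hypothetical, headings are limited to three levels for brevity, and a heading is assumed to belong to its own section.

```r
# Hypothetical flattened document: one entry per node, in document order.
type  <- c("heading", "markdown", "heading", "chunk", "heading", "markdown")
name  <- c("Intro", NA, "Setup", NA, "Results", NA)
level <- c(1L, NA, 2L, NA, 1L, NA)

# Track the currently open heading at each level. A new heading replaces the
# entry at its own level and closes every deeper level.
section_paths <- function(type, name, level, max_level = 3L) {
  current <- rep(NA_character_, max_level)
  paths <- vector("list", length(type))
  for (i in seq_along(type)) {
    if (type[i] == "heading") {
      current[level[i]] <- name[i]
      if (level[i] < max_level) {
        current[(level[i] + 1L):max_level] <- NA_character_
      }
    }
    paths[[i]] <- current
  }
  paths
}

paths <- section_paths(type, name, level)
paths[[4]]                        # path of the chunk under "Setup"
paths[[4]][!is.na(paths[[4]])]    # the same path with unused levels dropped
```

The last line shows the kind of trimming a drop_na style option could perform, under the assumption that unused levels are stored as NA.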
This function is implemented using tidyselect::eval_select() which enables a variety of useful syntax for selecting nodes from the ast.
Additionally, a number of parsermd specific selection helpers are available: by_section(), has_type(), has_label(), and has_option().
rmd_select(x, ...)
x | Rmd object, e.g. |
... | One or more unquoted expressions separated by commas. Chunk labels can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of nodes. |
Returns a subset Rmd object (either rmd_ast or rmd_tibble depending on input).
rmd = parse_rmd(system.file("hw01.Rmd", package = "parsermd"))
rmd_select(rmd, "plot-dino", "cor-dino")
rmd_select(rmd, "plot-dino":"cor-dino")
rmd_select(rmd, `plot-dino`:`cor-dino`)
rmd_select(rmd, has_type("rmd_chunk"))
rmd_select(rmd, by_section(c("Exercise *", "Solution")))
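The idea that labels behave like positions, so that a range between two labels selects every node in between, can be sketched with base R alone. The labels vector below is made up for illustration and does not come from hw01.Rmd; the real selection is handled by tidyselect.

```r
# Hypothetical node labels in document order; NA marks nodes without a label.
labels <- c(NA, "setup", NA, "plot-dino", NA, "cor-dino", NA)

# Translate the two labels into positions, then take the range between them.
from <- match("plot-dino", labels)
to <- match("cor-dino", labels)
selected <- seq(from, to)
selected
```

Note that the range also picks up the unlabelled node sitting between the two chunks, which is what a positional range implies.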
These functions are used in conjunction with rmd_select() to select nodes from an Rmd ast.
by_section() - uses section selectors to select nodes.
has_type() - selects all nodes that have the given type(s).
has_label() - selects nodes with labels matching the given glob.
has_option() - selects nodes that have the given option(s) set.
has_type(types)
by_section(sec_ref, keep_parents = TRUE)
has_label(label)
has_option(...)
types | Vector of character type names, e.g. |
sec_ref | character vector, a section reference selector. See the details below for how these are constructed. |
keep_parents | Logical, retain the parent headings of selected sections. Default: |
label | character vector, glob patterns for matching chunk labels. |
... | Either option names represented by a scalar string or a named argument with the form |
Section reference selectors are a simplified version of CSS selectors that are designed to enable the selection of nodes in a way that respects the implied hierarchy of a document's section headings.
They consist of a character vector of heading names where each subsequent value is assumed to be nested within the preceding value. For example, the section selector c("Sec 1", "Sec 2") would select all nodes that are contained within a section named Sec 2 that is in turn contained within a section named Sec 1 (or a section contained within a section named Sec 1, and so on).
The individual section names can be specified using wildcards (aka globbing patterns), which may match one or more sections within the document, e.g. c("Sec 1", "Sec *"). See utils::glob2rx() or Wikipedia for more details on the syntax for these patterns.
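The nesting and wildcard rules described above can be sketched in base R. This is an assumption-laden illustration rather than the package's matching code: matches_selector() is a hypothetical helper, and a node's position is represented as a character vector of its enclosing heading names, outermost first.

```r
# Does a node's heading path satisfy a section reference selector?
# Each selector element must match some heading in the path, in order,
# with any number of intermediate headings allowed in between.
matches_selector <- function(path, sec_ref) {
  # glob2rx() turns each wildcard pattern into an anchored regular expression.
  regexes <- vapply(sec_ref, utils::glob2rx, character(1), USE.NAMES = FALSE)
  i <- 1L
  for (heading in path) {
    if (i <= length(regexes) && grepl(regexes[i], heading)) {
      i <- i + 1L
    }
  }
  i > length(regexes)
}

sel <- c("Sec 1", "Sec *")
direct  <- matches_selector(c("Sec 1", "Sec 2"), sel)           # directly nested
nested  <- matches_selector(c("Sec 1", "Intro", "Sec 2"), sel)  # nested deeper
outside <- matches_selector(c("Sec 2"), sel)                    # no "Sec 1" parent
c(direct, nested, outside)
```

The first two paths satisfy the selector because a Sec 1 heading encloses a heading matching Sec *, while the third does not because no Sec 1 ancestor is present.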
All helper functions return an integer vector of selected indexes.
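What "an integer vector of selected indexes" means for an option-based helper can be shown with plain lists. The sketch below is hypothetical throughout (node_options, option_is_set() and option_equals() are not parsermd functions) and assumes a name-only test checks that the option is set, while a named argument checks for an exact value.

```r
# Hypothetical per-node chunk options; a non-chunk node is represented by NULL.
node_options <- list(
  NULL,
  list(message = FALSE),
  list(message = TRUE, echo = FALSE),
  list(echo = TRUE)
)

# Indexes of nodes where the option `name` is set at all.
option_is_set <- function(opts, name) {
  which(vapply(opts, function(o) name %in% names(o), logical(1)))
}

# Indexes of nodes where the option `name` is set to exactly `value`.
option_equals <- function(opts, name, value) {
  which(vapply(
    opts,
    function(o) name %in% names(o) && identical(o[[name]], value),
    logical(1)
  ))
}

set_idx   <- option_is_set(node_options, "message")
false_idx <- option_equals(node_options, "message", FALSE)
set_idx
false_idx
```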
rmd = parse_rmd(system.file("hw01.Rmd", package="parsermd"))
rmd_select(rmd, has_type("rmd_chunk"))
rmd_select(rmd, has_label("*dino"))
rmd_select(rmd, has_option("message"))
rmd_select(rmd, has_option(message = FALSE))
rmd_select(rmd, has_option(message = TRUE))
This is the equivalent of the source() function for Rmd files or their resulting asts.
rmd_source(x, local = FALSE, ..., label_comment = TRUE, use_eval = TRUE)
x | An Rmd document (e.g. |
local | |
... | Additional arguments passed to |
label_comment | Attach chunk labels as a comment before each code block. |
use_eval | Use the |
Returns the result of source() for any R code chunks.
rmd_source(system.file("minimal.Rmd", package = "parsermd"), echo=TRUE)
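Conceptually, sourcing an Rmd document means collecting the code of its R chunks and evaluating it in order. The base R sketch below illustrates that idea, including label comments placed before each block. It is not how rmd_source() is implemented: the chunks list is made up, and a fresh environment stands in for whatever evaluation environment is chosen.

```r
# Hypothetical chunk code keyed by chunk label, in document order.
chunks <- list(
  setup   = "x <- 1:10",
  summary = "total <- sum(x)"
)

# Build a single script, placing each label as a comment before its code.
script <- unlist(lapply(names(chunks), function(label) {
  c(paste("#", label), chunks[[label]])
}))

# Evaluate the script in its own environment so the workspace stays untouched.
env <- new.env()
eval(parse(text = script), envir = env)
env$total
```

Evaluating in a separate environment is a design choice of this sketch that keeps the example free of side effects.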
Subset an rmd object based on sections, node types, or names.
rmd_subset(
  x,
  sec_refs = NULL,
  type_refs = NULL,
  name_refs = NULL,
  exclude = FALSE,
  keep_yaml = TRUE,
  keep_setup = FALSE,
  ...
)
x | rmd object, e.g. |
sec_refs | Section references, TODO - add details. |
type_refs | Node type references, TODO - add details. |
name_refs | Name references, TODO - add details. |
exclude | Should the matching nodes be excluded. |
keep_yaml | Should the document yaml be kept. |
keep_setup | Should the document setup chunk be kept. |
... | Unused, for extensibility. |
Returns a subset Rmd object (either rmd_ast or rmd_tibble depending on input).
Tools for selecting or checking a single node using rmd_subset() selection.
rmd_get_node(x, sec_refs = NULL, type_refs = NULL, name_refs = NULL, ...)
rmd_get_chunk(x, sec_refs = NULL, name_refs = NULL)
rmd_get_markdown(x, sec_refs = NULL)
rmd_has_node(x, sec_refs = NULL, type_refs = NULL, name_refs = NULL, ...)
rmd_has_chunk(x, sec_refs = NULL, name_refs = NULL, ...)
rmd_has_markdown(x, sec_refs = NULL, ...)
x | rmd object, e.g. |
sec_refs | Section references, TODO - add details. |
type_refs | Node type references, TODO - add details. |
name_refs | Name references, TODO - add details. |
... | Unused, for extensibility. |
rmd_get_*() functions return a single Rmd node object (e.g. rmd_heading, rmd_chunk, rmd_markdown, etc.).
rmd_has_*() functions return TRUE if a matching node exists, FALSE otherwise.
Templates are objects which are meant to capture the structure of an R Markdown document and facilitate the comparison between the template and new Rmd documents, usually to ensure the structure and/or content matches sufficiently.
rmd_template(
  rmd,
  keep_content = FALSE,
  keep_labels = TRUE,
  keep_headings = FALSE,
  keep_yaml = FALSE,
  ...
)
rmd | R Markdown document in the form of an |
keep_content | Should the template keep the document's content (markdown text and chunk code). |
keep_labels | Should the template keep the document's code chunk labels. |
keep_headings | Should the template keep the document's headings. |
keep_yaml | Should the template keep the document's yaml. |
... | Unused, for extensibility. |
Returns an rmd_template object, which is a derived tibble containing relevant structural details of the document.
rmd = parse_rmd(system.file("hw01.Rmd", package="parsermd"))
rmd_select(rmd, by_section(c("Exercise *", "Solution"))) %>%
  rmd_template()