class: middle, inverse .leftcol30[ <center> <img src="https://github.com/emse-p4a-gwu/emse-p4a-gwu.github.io/raw/master/images/p4a_hex_sticker.png" width=250> </center> ] .rightcol70[ # Week 12: .fancy[Reproducible Reporting] ###
EMSE 4571: Intro to Programming for Analytics ###
John Paul Helveston ###
April 14, 2022 ] --- class: inverse # Quiz 7 (last one!)
05:00
.leftcol[ ## Go to `#class` channel in Slack for quiz link ## Open RStudio first! ## Rules: - You may use your notes and RStudio - You may **not** use any other resources (e.g. the internet, your classmates, etc.) ] .rightcol[ <br> <center> <img src="https://github.com/emse-p4a-gwu/2022-Spring/raw/main/images/quiz_doge.png" width="400"> </center> ] --- .center[# (You should always cite your sources)] .leftcol60[ ## A bunch of today's slides are adapted from the brilliant<br>[Alison Presmanes Hill](https://alison.rbind.io/) ### [
@apreshill](http://twitter.com/apreshill) ### Check out her [RMarkdown slide deck](https://apreshill.github.io/rmd4cdc/) ] .rightcol40[.circle[ <center> <img src="images/allison.jpg" width=350> </center> ]] --- class: inverse, middle # Week 12: .fancy[Reproducible Reporting] ### 1. Why RMarkdown? ### 2. Metadata and output formats ### BREAK ### 3. Text ### 4. Code chunks --- class: inverse, middle # Week 12: .fancy[Reproducible Reporting] ### 1. .orange[Why RMarkdown?] ### 2. Metadata and output formats ### BREAK ### 3. Text ### 4. Code chunks --- class:center ## The horrors of a non-reproducible workflow <center> <iframe width="800" height="500" src="https://www.youtube.com/embed/s3JldKoA0zw" frameborder="0" allowfullscreen></iframe> </center> --- class:center ## RMarkdown to the rescue! <center> <iframe src="https://player.vimeo.com/video/178485416?color=428bca&title=0&byline=0&portrait=0" width="800" height="500" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe> </center> --- class: center # "Literate programming" .leftcol[.left[ > ### Treat programs as a "literature" understandable to **human beings** ]] .rightcol[.center[ <center> <img src="images/Knuth.jpg" width=400> </center> [Donald E. Knuth](https://en.wikipedia.org/wiki/Donald_Knuth) ]] --- class: inverse, middle # Week 12: .fancy[Reproducible Reporting] ### 1. Why RMarkdown? ### 2. .orange[Metadata and output formats] ### BREAK ### 3. Text ### 4. 
Code chunks --- background-image: url(images/horst_monsters_rmarkdown.png) background-size: contain background-color: #FFFFFF .footnote[Art by [Allison Horst](https://twitter.com/allison_horst?lang=en)] --- name: card0 background-image: url(images/card0.png) background-size: contain .footnote[https://www.dear-data.com/] --- name: card1 background-image: url(images/card1.png) background-size: contain .footnote[https://www.dear-data.com/] --- background-color: #FFFFFF .leftcol[ # metadata: YAML **Y**AML<br> **A**in't<br> **M**arkup<br> **L**anguage ```yaml --- key: value --- ``` ] .rightcol[ <img src="images/orchestra.jpg" width="75%" style="display: block; margin: auto;" /> ] --- class: inverse, middle, center # Output options --- background-color: #FFFFFF .leftcol[ # Save output options in your YAML ```yaml --- title: Your title here author: Your name here output: html_document --- ``` ```yaml --- title: Your title here author: Your name here output: html_document: toc: true toc_float: true theme: flatly --- ``` ] .rightcol[ <img src="images/orchestra.jpg" width="75%" style="display: block; margin: auto;" /> ] --- background-image: url(images/Single-rmd.png) background-size: contain --- background-image: url(images/Single-rmd1.png) background-size: contain --- background-image: url(images/Single-rmd2.png) background-size: contain --- background-image: url(images/Single-rmd3.png) background-size: contain --- class: top center <video width="1530" height="610" controls> <source src="images/single-doc-knit.mov"> </video> --- class: middle center background-image: url(images/Single-knit1.png) background-size: contain ??? Here is what we knit. --- background-image: url(images/Single-knit2.png) background-size: contain ??? Using the `THEME:` key in our YAML, we changed our font and colors. --- background-image: url(images/Single-knit3.png) background-size: contain ??? And we have this nice table of contents floating off to the side... 
--- class: center ## Preview bootswatch themes <a href="https://bootswatch.com/default/" target="_blank"><img src="images/bootswatch.png" width="55%" style="display: block; margin: auto;" /></a> https://bootswatch.com/ --- class: inverse .leftcol[ ## Quick check-in .font90[How do you use only the **default** output options?] (a) ```yaml --- output: html_document() --- ``` (b) ```yaml --- output: html_document --- ``` ] -- .rightcol[ <br><br> .font90[How do you add an output **option** to a format in your YAML?] (a) ```yaml --- output: html_document: toc: true --- ``` (b) ```yaml --- output: html_document(toc=true) --- ``` ] --- class: inverse, middle, center # Output formats --- class: center <a href="https://rmarkdown.rstudio.com/docs/reference/index.html#section-output-formats" target="_blank"><img src="images/rmdbase-formats.png" width="42%" style="display: block; margin: auto;" /></a> https://rmarkdown.rstudio.com/docs/reference/index.html#section-output-formats --- # 🧶 Knit to multiple outputs ```r rmarkdown::render("notes.Rmd", output_format = "all") ``` .leftcol[ <img src="images/knit-dropdown.png" width="60%" style="display: block; margin: auto;" /> ] .rightcol[ ```yaml --- title: Your title here author: Your name here output: html_document: toc: true toc_float: true theme: flatly word_document: default pdf_document: default --- ``` ] ??? This is a great way to "control" your knit button! Notice that when you knit, it respects those output options in your YAML. This way you "save" your output options. --- class: inverse ## Quick check-in .leftcol[ .font80[How do you add another output **format** to your YAML? (a) ```yaml --- output: html_document: default word_document: default --- ``` (b) ```yaml --- output: html_document() word_document() --- ``` ]] -- .rightcol[ .font80[How do you now add output **options** to your YAML? 
(a) ```yaml --- output: html_document: toc: true word_document: default --- ``` (b) ```yaml --- output: html_document(toc=true) word_document(default) --- ``` ]] --- class: middle, center # Built-in output formats <img src="https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/rmarkdown.png" width="32%" style="display: block; margin: auto;" /> --- class: middle <center> <img src="images/RMarkdownOutputFormats.png" width=600> </center> --- class: middle, center # Extension output formats .cols3[ <img src="https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/flexdashboard.png"> ] .cols3[ <img src="https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/bookdown.png"> ] .cols3[ <img src="https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/xaringan.png"> ] --- class: center <a href="https://rstudio.github.io/distill/" target="_blank"><img src="images/distill.png" width="70%" style="display: block; margin: auto;" /></a> https://rstudio.github.io/distill/ --- class: middle .center[ # Use an extension package ] .leftcol[ ```yaml --- author: Your name here title: Your title here output: distill::distill_article --- ``` ] .rightcol[ ```yaml --- author: Your name here title: Your title here output: distill::distill_article: toc: true --- ``` ] --- class: inverse
02:00
## Quick practice Go to your `notes.Rmd` file and knit it to the following outputs: - `html_document` with a table of contents - `distill_article` with a table of contents - `word_document` - `pdf_document` --- class: inverse, middle # Week 12: .fancy[Reproducible Reporting] ### 1. Why RMarkdown? ### 2. Metadata and output formats ### BREAK ### 3. .orange[Text] ### 4. Code chunks --- template: card0 --- template: card1 --- name: card2 background-image: url(images/card2.png) background-size: contain .footnote[https://www.dear-data.com/] --- class: center, middle # Right now, bookmark this! # https://commonmark.org/help/ <hr> # (When you have 10 minutes, do this!) # https://commonmark.org/help/tutorial/ --- # .center[Headers] -- .leftcol[ ```markdown # HEADER 1 ## HEADER 2 ### HEADER 3 #### HEADER 4 ##### HEADER 5 ###### HEADER 6 ``` ] -- .rightcol[ # HEADER 1 ## HEADER 2 ### HEADER 3 #### HEADER 4 ##### HEADER 5 ###### HEADER 6 ] --- # .center[Text] -- .leftcol[ ```markdown Childhood **vaccines** are one of the _great triumphs_ of modern medicine. ``` ] -- .rightcol[ Childhood **vaccines**<br> are one of the<br> _great triumphs_<br> of modern medicine. ] --- # .center[Text] .leftcol[ ## Type this... - `normal text` - `*italic text*` - `**bold text**` - `***bold italic text***` - `~~strikethrough~~` - `` `code text` `` ] .rightcol[ ## ...to get this - normal text - *italic text* - **bold text** - ***bold italic text*** - ~~strikethrough~~ - `code text` ] --- class: top # .center[Lists] .leftcol[ Bullet list: ```r - first item - second item - third item ``` - first item - second item - third item ] .rightcol[ Numbered list: ```r 1. first item 2. second item 3. 
third item ] --- # .center[Links] Simple **url link** to another site: ```r [Download R](http://www.r-project.org/) ``` [Download R](http://www.r-project.org/) --- # .center[Images] .leftcol70[.code80[ ```markdown ![](https://p4a.seas.gwu.edu/2022-Spring/images/helveston.jpg) ``` ]] .rightcol30[ ![](https://p4a.seas.gwu.edu/2022-Spring/images/helveston.jpg) ] --- # .center[Local images] .leftcol70[.code80[ ```markdown ![](images/p4a_hex_sticker.png) ``` ]] .rightcol30[ ![](images/p4a_hex_sticker.png) ] --- class: inverse ## Quick check-in .leftcol[ How do you add headers in Markdown? a. `! Header` b. `- Header` c. `# Header` d. `1. Header` ] -- .rightcol[ What about lists? Bulleted? Numbered? a. `! Item 1` b. `- Item 1` c. `# Item 1` d. `1. Item 1` ] --- class: center background-image: url(images/typewriter.jpg) background-size: contain background-color: #f6f6f6 # Markdown tables --- class: center # Markdown tables .leftcol[ ## This... ```markdown | Column 1 | Column 2 | |------------| ----------| | Cell 1, 1 | Cell 2, 1 | | Cell 1, 2 | Cell 2, 2 | ``` ] .rightcol[ ## ...produces this | Column 1 | Column 2 | |------------| ----------| | Cell 1, 1 | Cell 2, 1 | | Cell 1, 2 | Cell 2, 2 | ] --- class: center # Markdown tables .leftcol55[ ## This... ```markdown | Time | Session | Topic | |:--------------|:-------:|----------:| | _left_ | _center_| _right_ | | 01:00 - 01:50 | 1 | Practice | | 01:50 - 02:00 | | **Break** | | 02:00 - 02:45 | 2 | Class | | 02:45 - 03:00 | | **Break** | ``` ] -- .rightcol45[ ## ...produces this | Time | Session | Topic | |:--------------|:-------:|----------:| | _left_ | _center_| _right_ | | 01:00 - 01:50 | 1 | Practice | | 01:50 - 02:00 | | **Break** | | 02:00 - 02:45 | 2 | Class | | 02:45 - 03:00 | | **Break** | ] --- class: inverse, center # .fancy[Break]
05:00
--- class: inverse, middle # Week 12: .fancy[Reproducible Reporting] ### 1. Why RMarkdown? ### 2. Metadata and output formats ### BREAK ### 3. Text ### 4. .orange[Code chunks] --- template: card0 --- template: card1 --- template: card2 --- name: card3 background-image: url(images/card3.png) background-size: contain --- class: center # R Code -- .leftcol[ ## Inline code .left[ ```r `r insert code here` ``` ]] -- .rightcol[ ## Code chunks .left[ ````markdown ```{r} insert code here insert more code here ``` ```` ]] --- # Inline R code Embed R code directly in Markdown text: ```r `r <insert code here>` ``` -- For example: ```r The sum of 3 and 4 is `r 3 + 4` ``` -- Produces this: The sum of 3 and 4 is 7 --- # R Code chunks .leftcol[ ````markdown ```{r} bears %>% count(month) ``` ```` What is the fate of this chunk? ] -- .rightcol[ ```r bears %>% count(month) ``` ``` #> # A tibble: 12 × 2 #> month n #> <dbl> <int> #> 1 1 3 #> 2 2 1 #> 3 3 1 #> 4 4 4 #> 5 5 18 #> 6 6 20 #> 7 7 27 #> 8 8 28 #> 9 9 25 #> 10 10 25 #> 11 11 12 #> 12 12 2 ``` ] --- # Code chunks .leftcol[ ````markdown ```{r} monthlyCount <- bears %>% count(month) ``` ```` What fate do you predict here? ] -- .rightcol[ ```r monthlyCount <- bears %>% count(month) ``` ] --- # Code chunks .leftcol[ ````markdown ```{r} monthlyCount <- bears %>% count(month) monthlyCount ``` ```` ] -- .rightcol[.code80[ ```r monthlyCount <- bears %>% count(month) monthlyCount ``` ``` #> # A tibble: 12 × 2 #> month n #> <dbl> <int> #> 1 1 3 #> 2 2 1 #> 3 3 1 #> 4 4 4 #> 5 5 18 #> 6 6 20 #> 7 7 27 #> 8 8 28 #> 9 9 25 #> 10 10 25 #> 11 11 12 #> 12 12 2 ``` ]] --- # Chunk options Control what chunks output using options inside `{r}`: Example: `{r, echo=FALSE, message=FALSE}` <img src="images/chunks_options.png" width="60%" /> --- # Chunk options By default, code chunks print **code** + **output**: -- .leftcol[ ## This... 
````markdown ```{r} cat('hello world!') ``` ```` ] -- .rightcol[ ## ...produces this ```r cat('hello world!') ``` ``` #> hello world! ``` ] --- # .center[Chunk output options] -- .cols3[ ````markdown ```{r, echo=FALSE} cat('hello world!') ``` ```` Prints only **output**<br>(doesn't show code) **Output**: ``` #> hello world! ``` ] -- .cols3[ ````markdown ```{r, eval=FALSE} cat('hello world!') ``` ```` Prints only **code**<br>(doesn't run the code) **Output**: ```r cat('hello world!') ``` ] -- .cols3[ ````markdown ```{r, include=FALSE} cat('hello world!') ``` ```` Runs, but doesn't print anything **Output**: ] --- # message / warning ![](https://www.tidyverse.org/images/tidyverse_1.2.0/tidyverse_1-2-0_pkg_load.gif) --- # message / warning .leftcol[ ````markdown ```{r, message=FALSE, warning=FALSE} library(tidyverse) ``` ```` ] .rightcol[ ```r library(tidyverse) ``` ] --- # Using chunk options .leftcol80[ ````markdown ```{r, message=FALSE, warning=FALSE} library(tidyverse) ``` ```` - Place between curly braces<br>`{r option=value}` - Multiple options separated by commas<br>`{r option1=value, option2=value}` - Careful! The `r` part is the **code engine** (other engines possible) ] --- # Inserting a Python code chunk Change `{r}` to `{python}` in the code chunk. -- Example: ```python 'In Python, you can concatenate strings' + ' like this!' ``` ``` 'In Python, you can concatenate strings like this!' ``` --- # A global `setup` chunk One chunk to rule them all! .leftcol[ ````markdown ```{r setup, include = FALSE} knitr::opts_chunk$set( warning = FALSE, message = FALSE, comment = "#>", fig.retina = 3, fig.path = "figs/" ) ``` ```` ] .rightcol[ - A special chunk label: `setup` - Typically the first chunk - All following chunks will use these options (i.e., sets global chunk options) - **Tip**: set `include=FALSE` - You can (and should) use individual chunk options too ] --- class: inverse
15:00
## Your turn: Birds & Bears .font90[ 1) Create a new R Markdown file (`.Rmd`) in RStudio - title it _"Birds and Bears Analysis"_ 2) Create a "setup" code chunk to load the `tidyverse` library and the `birds.csv` and `bears.csv` files. 3) Use text and code to find answers to each of the following questions - show your code and results to justify each answer: - Which months have the highest and lowest number of bird impacts with aircraft? - Does the annual number of bird impacts appear to be changing over time? - Which months have the highest frequency of bear killings? - Who has been killed more often by bears: hunters or hikers? - How does the number of bear attacks on men vs. women compare? ] --- class: inverse, middle, center # Including plots --- # Including plots .leftcol[ ```r bears %>% count(month) %>% ggplot() + geom_col( aes(x = as.factor(month), y=n)) + theme_minimal(base_size = 22) + labs(x = 'Month', y = 'Count') ``` Will this print? ] -- .rightcol[ <img src="figs/unnamed-chunk-41-1.png" width="432" /> ] --- # Including plots .leftcol[ ```r bearMonthPlot <- bears %>% count(month) %>% ggplot() + geom_col( aes(x = as.factor(month), y=n)) + theme_minimal(base_size = 22) + labs(x = 'Month', y = 'Count') ``` What about this? ] --- # Including plots .leftcol[ ```r bearMonthPlot <- bears %>% count(month) %>% ggplot() + geom_col( aes(x = as.factor(month), y=n)) + theme_minimal(base_size = 22) + labs(x = 'Month', y = 'Count') *bearMonthPlot ``` What about this? ] -- .rightcol[ <img src="figs/unnamed-chunk-44-1.png" width="432" /> ] ??? So, how did we get a figure into R Markdown? Answer: it has to print! 
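
The "it has to print" rule can be sketched outside the deck's data. This is only an illustration: it assumes the `ggplot2` package is installed, swaps in the built-in `mtcars` data for `bears`, and the name `carPlot` is made up for the example.

```r
library(ggplot2)

# Assigning a plot to a name stores it but draws nothing
# (mtcars and carPlot are stand-ins, not part of the bears example)
carPlot <- ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg))

# Typing the name at the top level of a chunk prints the object,
# and printing is what makes the figure appear in the knitted output
carPlot
```

Inside a `for` loop, values are not auto-printed, so an explicit `print(carPlot)` would be needed there.
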
--- # Chunk options for plots - fig size - fig resolution .footnote[https://yihui.name/knitr/options/#plots] --- # `out.width` .leftcol[ ````markdown ```{r, out.width="70%"} bearMonthPlot ``` ```` <img src="figs/unnamed-chunk-45-1.png" width="70%" /> ] -- .rightcol[ ````markdown ```{r, out.width="20%"} bearMonthPlot ``` ```` <img src="figs/unnamed-chunk-46-1.png" width="20%" /> ] --- # `fig.width` & `fig.height` .leftcol[ ````markdown ```{r, fig.width=6, fig.height=4} bearMonthPlot ``` ```` <img src="figs/unnamed-chunk-47-1.png" width="432" /> ] -- .rightcol[ ````markdown ```{r, fig.width=3, fig.height=4} bearMonthPlot ``` ```` <img src="figs/unnamed-chunk-48-1.png" width="216" /> ] --- # `fig.path` ````markdown ```{r, fig.path="figs/", echo=FALSE} bearMonthPlot ``` ```` <img src="figs/unnamed-chunk-49-1.png" width="432" /> --- # `fig.path` ````markdown ```{r bear-month-plot, fig.path="figs/", echo=FALSE} bearMonthPlot ``` ```` <img src="figs/bear-month-plot-1.png" width="432" /> --- class: middle, center # A good chunk label **Think: kebabs, not snakes** .leftcol[ ### Good `my-plot` `myplot` `myplot1` `myplot-1` `MY-PLOT` ] .rightcol[ ### Bad `my_plot` `my plot` everything else! 
] --- ### View default options .code60[ ```r str(knitr::opts_chunk$get()) ``` ``` #> List of 53 #> $ eval : logi TRUE #> $ echo : logi TRUE #> $ results : chr "markup" #> $ tidy : logi FALSE #> $ tidy.opts : NULL #> $ collapse : logi FALSE #> $ prompt : logi FALSE #> $ comment : chr "#>" #> $ highlight : logi TRUE #> $ size : chr "normalsize" #> $ background : chr "#F7F7F7" #> $ strip.white : 'AsIs' logi TRUE #> $ cache : logi FALSE #> $ cache.path : chr "index_cache/html/" #> $ cache.vars : NULL #> $ cache.lazy : logi TRUE #> $ dependson : NULL #> $ autodep : logi FALSE #> $ cache.rebuild: logi FALSE #> $ fig.keep : chr "high" #> $ fig.show : chr "asis" #> $ fig.align : chr "default" #> $ fig.path : chr "figs/" #> $ dev : chr "png" #> $ dev.args : NULL #> $ dpi : num 72 #> $ fig.ext : NULL #> $ fig.width : num 7.25 #> $ fig.height : num 4 #> $ fig.env : chr "figure" #> $ fig.cap : NULL #> $ fig.scap : NULL #> $ fig.lp : chr "fig:" #> $ fig.subcap : NULL #> $ fig.pos : chr "" #> $ out.width : NULL #> $ out.height : NULL #> $ out.extra : NULL #> $ fig.retina : num 3 #> $ external : logi TRUE #> $ sanitize : logi FALSE #> $ interval : num 1 #> $ aniopts : chr "controls,loop" #> $ warning : logi FALSE #> $ error : logi FALSE #> $ message : logi FALSE #> $ render : NULL #> $ ref.label : NULL #> $ child : NULL #> $ engine : chr "R" #> $ split : logi FALSE #> $ include : logi TRUE #> $ purl : logi TRUE ``` ] --- class: inverse, middle # Two more important chunks: - ## Images - ## Tables --- # .center[Image chunks] -- ### Insert images with markdown ```markdown ![](images/p4a_hex_sticker.png) ``` -- ### Insert images with chunks (so you can resize it) ````markdown ```{r, echo=FALSE, out.width="20%"} knitr::include_graphics("images/p4a_hex_sticker.png") ``` ```` --- # .center[Image chunks] .leftcol[ ````markdown ```{r, echo=FALSE, out.width="20%"} knitr::include_graphics("images/p4a_hex_sticker.png") ``` ```` <img src="images/p4a_hex_sticker.png" width="20%" /> ] 
.rightcol[ ````markdown ```{r, echo=FALSE, out.width="50%"} knitr::include_graphics("images/p4a_hex_sticker.png") ``` ```` <img src="images/p4a_hex_sticker.png" width="50%" /> ] --- # Convert a data frame to a table with `kable()` -- .leftcol[ ```r bears %>% count(bearType, wildOrCaptive) ``` ``` #> # A tibble: 6 × 3 #> bearType wildOrCaptive n #> <chr> <chr> <int> #> 1 Black Captive 16 #> 2 Black Wild 60 #> 3 Brown Captive 8 #> 4 Brown Wild 72 #> 5 Polar Captive 4 #> 6 Polar Wild 6 ``` ] -- .rightcol[ ```r bears %>% count(bearType, wildOrCaptive) %>% kable() ``` <table> <thead> <tr> <th style="text-align:left;"> bearType </th> <th style="text-align:left;"> wildOrCaptive </th> <th style="text-align:right;"> n </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Black </td> <td style="text-align:left;"> Captive </td> <td style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> Black </td> <td style="text-align:left;"> Wild </td> <td style="text-align:right;"> 60 </td> </tr> <tr> <td style="text-align:left;"> Brown </td> <td style="text-align:left;"> Captive </td> <td style="text-align:right;"> 8 </td> </tr> <tr> <td style="text-align:left;"> Brown </td> <td style="text-align:left;"> Wild </td> <td style="text-align:right;"> 72 </td> </tr> <tr> <td style="text-align:left;"> Polar </td> <td style="text-align:left;"> Captive </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Polar </td> <td style="text-align:left;"> Wild </td> <td style="text-align:right;"> 6 </td> </tr> </tbody> </table> ] --- class: inverse
15:00
## Your turn: College Majors .font90[ 1) Create a new R Markdown file (`.Rmd`) in RStudio - title it _"College Majors Analysis"_ 2) Create a "setup" code chunk to load the `tidyverse` library and the `recent_grads.csv` file. 3) Use text, code, and plots to find answers to each of the following questions - show your code and results to justify each answer: - What are the highest earning engineering majors? - Within the engineering majors, which ones have better employment rates? - Within the engineering majors, which ones have a better gender balance? (Use good code chunk names for your figures!) ]
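
---

# Hint: labeling figure chunks

A sketch of what a labeled figure chunk for this exercise might look like. The label `earnings-plot` and the object `earningsPlot` are hypothetical names chosen for illustration, and the saved file name assumes `fig.path = "figs/"` is set, as in the `setup` chunk shown earlier.

````markdown
```{r earnings-plot, fig.width=6, fig.height=4, echo=FALSE}
earningsPlot
```
````

With that label, the figure should be saved as `figs/earnings-plot-1.png` rather than as an `unnamed-chunk` file, following the same pattern as `bear-month-plot` above.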