Commit 88ea83a

kimberlyh66 authored and dlebauer committed
Reorganized tutorials book to include vignette section (#81)
* fixed link to agronomic metadata tutorial
* Updated tutorial. Added a summary showing the 3 ways to access API data.
* Further updated the tutorial on how to access API using URL query and bash shell.
* Further updated tutorial. Added short comment on the use of the fromJSON function.
* Further updated R jsonlite portion of tutorial
* Update traits/01-web-access.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com>
* added comment. link to traitvis webapp not working.
* fixed links to terraref and pecan sites, betydb schemas, and doi reference. also updated comment on embedded shiny app (link to traitvis app does work).
* updated beta user program link, and fixed links to agronomic metadata tutorial and terraref bety home page
* made changes according to previous comments
* made minor edits - assuming betykey in traits directory since ~ does not work on Windows
* minor edit - removed comment
* -changed trait in sorghum height search from 'canopy_cover' to 'canopy_height' -removed geom_smooth layer from ggplot
* made minor edits. changed api version from 'beta' to 'v1' and hid the results of the 'query-danforth' r chunk
* changed image name from 'betydb-postgis' to 'bety-postgis'
* changed ~ (home directory) to . (current directory)
* changed ~ (home directory) to . (current directory)
* Removed YAML metadata header. Added Chapter title.
* Removed YAML header. Added chapter title.
* Changed chapter title to be more specific. Chapter 6 title is also Phenotype Analysis.
* Removed text comment
* updated chapter title - made more specific
* commented out all chunks that were related to the traits-05-get-variables chunk (running this chunk results in a HTTP 504 Gateway Timeout)
* set betydb_query parameters using options
* loaded in some needed packages; set betydb_query function parameters using options
* set options for chunks to improve look of output
* changed headers to level 2 so not treated like a chapter title
* set warning option to FALSE
* removed .Rmd files that are not ready to be built into book
* changed path to .betykey file
* added tutorial 07 to rmd_files
* minor edit to rmd_files
* minor edits
* removed extra parentheses; commented out portion of code that called the 'yields' object (object not ever created - and no rows returned for yields table query)
* rearranged chunks and added some chunk options to improve output
* added chunk option
* Traits tutorials revisions (#41)
* added some context to index
* updated / added some stubs where we need more details
* updated rmd_files (added tutorials 5 and 6)
* added section header; added pointer urls
* added some context to traits/06
* changed files to include in rmd_files
* removed section header - not supposed to be in this file
* created section header
* changed some chunk parameters and added repository for traits installation
* changed chunk parameters
* changed chunk parameters and commented out sections that were not running correctly (HTTP 504)
* changed chunk parameters
* fix canopy_height query ... using sitename now.
* remove cache from canopy_height chunk
* remove cache from canopy_height chunk
* query canopy_height from season 2 instead of season 6
* updated tutorials to include in rmd_files
* removed cache = TRUE from chunk option
* changed chunk options and path to .betykey
* changed chunk options and path to .betykey
* loaded in jsonlite package; changed path to .betykey
* changed chunk options and installation of traits package from CRAN to github
* updated file to include new vignettes section
* created vignette folder and files
* changed traits section number to 2
* minor edits to chunk options and variable names
* updated introduction
* changed names of Rmd files - deleted old ones
* created new Rmd files (changed file names)
* removed tutorials that have not yet been revised and updated and added sensor tutorials on images and weather
* removed all references to the public key
* Update traits/05-maricopa-field-scanner.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com>
* Update traits/05-maricopa-field-scanner.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com>
* removed references to public key and made minor spacing edits
* deleted file
* removed "installing database locally" section
* changed title and added my draft of traits vignette
* Added in vignette contents
* Reverting my changes
* Fleshed out images vignette
* updated file names to match those in the vignettes folder
* Update vignettes/03-get-images-python.Rmd Co-Authored-By: Chris-Schnaufer <schnaufer@email.arizona.edu>
* Added information on getting API keys
* added ending to comment in index.Rmd
* set python engine to python3
* Cleaned up an API key related spot
* Removed some extra words
* Update vignettes/01-get-trait-data-R.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com>
* added a clarifying comment on how to use ~ for partial string matching in a query
1 parent 7dab0d8 commit 88ea83a

14 files changed

Lines changed: 425 additions & 130 deletions

_bookdown.yml

Lines changed: 3 additions & 3 deletions
Columns: original file line number, diff line number, diff line change
@@ -3,7 +3,7 @@ output_dir: "docs"
33
language:
44
ui:
55
chapter_name: "Chapter "
6-
rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd",
7-
"traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd",
8-
"traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"]#, "10-simulated-sorghum.Rmd"
6+
rmd_files: ["index.Rmd", "vignettes/00-introduction.Rmd", "vignettes/01-get-trait-data-R.Rmd", "vignettes/02-get-weather-data-R.Rmd",
7+
"vignettes/03-get-images-python.Rmd", "vignettes/04-synthesis-data.Rmd", "traits/03-access-r-traits.Rmd","sensors/01-meteorological-data.Rmd",
8+
"sensors/06-list-datasets-by-plot.Rmd"]
99

index.Rmd

Lines changed: 37 additions & 34 deletions
@@ -11,15 +11,38 @@ output:
1111

1212
# Overview
1313

14-
This book is intended to introduce users to TERRA REF data as quickly as possible.
14+
This book is intended to quickly introduce users to TERRA REF data through a series of tutorials. TERRA REF has many types of data, and most can be accessed in multiple ways. Although this makes it more complicated to learn (and teach!), the objective is to provide users with the flexibility to access data in the most useful way.
1515

16-
It introduces to the wide range of phenomics datasets generated by the TERRA Reference program. Not only does TERRA REF have a large number of data sets, but many of the databases can be accessed in a number of different ways. While this makes it more complicated to learn, the goal is to provide users with the flexibility to access data in the most useful way.
16+
17+
## Contents
18+
19+
The first section walks the user through the steps of downloading and combining three different types of data: plot level phenotypes, meteorological data, and images. Subsequent sections provide more detailed examples that show how to access a larger variety of data and meta-data.
20+
21+
## Pre-requisites
22+
23+
While we assume that readers will have some familiarity with the nature of the problem - remote sensing of crop plants - for the most part, these tutorials assume that the user will bring their own scientific questions and a sense of curiosity and are eager to learn.
24+
25+
These tutorials are aimed at users who are familiar with or willing to learn programming languages including R (particularly for accessing plot level trait data) and Python (primarily for accessing environmental data and sensor data). In addition, there are examples of using SQL for more sophisticated database queries as well as the bash terminal.
26+
27+
Some of the lessons only require a web browser; others will assume familiarity with programming at the command line in (typically only one of) Python, R, and / or SQL. You should be willing to find help (see finding help, below).
28+
29+
## Technical Requirements
30+
31+
At a minimum, you should have:
32+
33+
* An internet connection
34+
* Web Browser
35+
* Access to the data that you are using
36+
+ The tutorials will state which databases you will need access to
37+
* Software:
38+
+ Software requirements vary with the tutorials, and may be complex
1739

1840
## User Accounts and permission to access TERRA REF data
1941

20-
TODO: link to relevant parts of docs.terraref.org
42+
We have tried to write these tutorials using open access sample data sets. However, access to much of the data will require you to 1) fill out the TERRA REF Beta user questionnaire ([terraref.org/beta](terraref.org/beta)) and 2) request access to specific databases.
2143

22-
* Info on how to [request access to data](https://docs.terraref.org/user-manual/how-to-access-data/using-betydb-trait-data-experimental-metadata)
44+
<!-- Not sure where this goes, either in documentation or perhaps in an appendix. But I don't think this belongs in the introduction. Perhaps after the vignettes chapter
45+
-->
2346

2447
## Ways of Acessing Data
2548

@@ -40,41 +63,21 @@ The TERRA REF website: [terraref.org](http://terraref.org/)
4063

4164
The TERRA REF Technical Documentation: [docs.terraref.org](docs.terraref.org)
4265

43-
## Contents
44-
45-
Scope ...
46-
47-
Audience ...
48-
49-
50-
## Pre-requisites
51-
52-
While we assume that readers will have some familiarity with the nature of the problem - remote sensing of crop plants - for the most part, these tutorials assume that the user will bring their own scientific questions and a sense of curiosity and are eager to learn.
53-
54-
Some of the lessons only require a web browser; others will assume familarity with programming at the command line in (typically only one of) Python, R, and / or SQL. You should be willing to find help (see finding help, below).
55-
56-
## Technical Requirements
57-
58-
At a minimum, you should have:
59-
60-
* An internet connection
61-
* Web Browser
62-
* A TERRA REF Beta User account
63-
+ If you have not done so, please sign up at [terraref.org/beta](terraref.org/beta)
64-
* Access to the data that you are using
65-
+ The tutorials will state which databases you will need access to
66-
* Software:
67-
+ Software requirements vary with the tutorials, and may be complex
68-
6966

7067
## Finding help
7168

72-
- [Slack](terra-ref.slack.com)
73-
- [GitHub](https://github.com/terraref/tutorials)
74-
- [Google](https://www.google.com/)
69+
- Slack at terra-ref.slack.com ([signup](https://terraref-slack-invite.herokuapp.com/))
70+
- Browse issues and repositories in GitHub:
71+
- search the organization at github.com/terraref
72+
- questions about the tutorials in the [tutorials repository](https://github.com/terraref/tutorials/issues)
73+
- about the data in the [reference-data repository](https://github.com/terraref/reference-data/issues)
7574

7675
```{r, include = FALSE}
77-
knitr::opts_chunk$set(echo = FALSE, cache = TRUE)
76+
knitr::opts_chunk$set(echo = FALSE,
77+
engine.path = list(
78+
python = 'python3'
79+
))
80+
7881
options(warn = -1)
7982
```
8083

traits/00-BETYdb-getting-started.Rmd

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
# (PART\*) Secton 1: Traits {-}
1+
# (PART\*) Secton 2: Traits {-}
22

33
# Getting Started with BETYdb
44

traits/02-betydb-api-access.Rmd

Lines changed: 22 additions & 12 deletions
@@ -26,18 +26,19 @@ The first step toward reproducible pipelines is to automate the process of searc
2626
### Using Your API key to Connect
2727

2828
An API key is like a password. It allows you to access data, and should be kept private.
29-
Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the https://terraref.ncsa.illinois.edu/bety-test database.
3029

31-
A common way of handling private API keys is to place it in a text file in your current directory.
32-
Don't put it in a project directory where it might be inadvertently shared.
30+
Therefore, we are not going to put it in code that we share.
31+
32+
A common way of handling private API keys is to place it in a text file in your current directory. Don't put it in a project directory where it might be inadvertently shared.
3333

3434
Here is how to find and save your API key:
3535

3636
* click file --> new --> text file
3737
* copy the api key that was sent when you registered into the file
3838
* file --> save as '.betykey'
3939

40-
For the public key, you can call this file `.betykey_public`.
40+
An API key is not needed to access public data. This includes metadata tables and simulated data in the https://terraref.ncsa.illinois.edu/bety-test database.
41+
4142

4243

4344
## Accessing data using a URL query
@@ -49,7 +50,9 @@ For the public key, you can call this file `.betykey_public`.
4950
* path to the api: `/api/v1`
5051
* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables.
5152
* Query parameters: `genus=Sorghum`
52-
* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database.
53+
54+
* Authentication: `key=api_key` is your assigned API key. This will only be needed when querying trait data. No key is needed to access the public metadata tables.
55+
5356

5457
### Constructing a URL query
5558

@@ -62,17 +65,17 @@ First, lets construct a query by putting together a URL.
6265
3. Add the name of the table you want to query. Lets start with `variables`
6366
* terraref.ncsa.illinois.edu/bety/api/v1/variables
6467
4. add query terms by appending a `?` and combining with `&`, for example:
65-
* `key=9999999999999999999999999999999999999999`
6668
* `type=trait` where the variable type is 'trait'
6769
* `name=~height` where the variable name contains 'height'
6870
5. This is your complete query:
69-
* `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999`
71+
* `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height`
7072
* it will query all variables that are type trait and have 'height' in the name
7173
* Does it return the expected values?
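Review note: the numbered steps for assembling a query URL can be sketched in R. This is a minimal illustration, not code from the tutorial or the traits package; the function name `build_bety_url` and its arguments are hypothetical. It does no URL encoding, so terms containing spaces would still need the `+` form mentioned in the exercise.

```r
# Hypothetical helper (not part of the tutorial or the traits package):
# assemble a BETYdb API URL from a table name and named query terms.
build_bety_url <- function(table, ...,
                           base = "https://terraref.ncsa.illinois.edu/bety/api/v1") {
  terms <- list(...)
  url <- paste0(base, "/", table)
  if (length(terms) == 0) {
    return(url)
  }
  # Join name=value pairs with '&' and attach them after a single '?'
  pairs <- paste0(names(terms), "=", unlist(terms), collapse = "&")
  paste0(url, "?", pairs)
}

# Same query as step 5: variables of type 'trait' whose name contains 'height'
query_url <- build_bety_url("variables", type = "trait", name = "~height")
print(query_url)
```

Keeping the base path as a default argument means the same helper could point at the bety-test database by passing a different `base`.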
7274

7375
## Your Turn
7476

75-
> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return?
77+
> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum return?
78+
7679

7780
> Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner`
7881
@@ -84,23 +87,30 @@ Type the following command into a bash shell (the `-o` option names the output f
8487

8588
```sh
8689
curl -o sorghum.json \
87-
"https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999"
90+
"https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum"
8891
```
8992

9093
If you want to write the query without exposing the key in plain text, you can construct it like this:
9194

9295
```sh
9396
curl -o sorghum.json \
94-
"https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`"
97+
"https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum"
9598
```
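Review note: with the public key removed, the two `curl` commands are now identical, so the "without exposing the key" variant no longer reads a key from a file. One way to keep a private key out of shared code is sketched below in R, under the assumption that the key sits on the first line of a `.betykey` file as described earlier. The helper name `add_key` is hypothetical, and the key written in the demonstration is a placeholder, not a real key.

```r
# Hypothetical helper: append key=<value> only when a key file exists,
# so the key itself never appears in the shared source.
add_key <- function(url, key_file = ".betykey") {
  if (!file.exists(key_file)) {
    return(url)  # public tables: no key needed
  }
  key <- trimws(readLines(key_file, n = 1, warn = FALSE))
  # Use '&' when the URL already has query terms, otherwise start with '?'
  sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
  paste0(url, sep, "key=", key)
}

url <- "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum"

# Demonstrate with a throwaway file holding a placeholder key
demo_file <- tempfile()
writeLines("PLACEHOLDER_KEY", demo_file)
keyed_url <- add_key(url, key_file = demo_file)
print(keyed_url)
unlink(demo_file)
```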
9699

97100
## Using the R jsonlite package to access the API with a URL query
98101

102+
103+
```{r 02-jsonlite-load, include = FALSE}
104+
105+
library(jsonlite)
106+
107+
```
108+
99109
```{r text-api, warning = FALSE}
100110
sorghum.json <- readLines(
101111
paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=",
102-
readLines('traits/.betykey')))
103-
112+
readLines('.betykey')))
113+
104114
## print(sorghum.json)
105115
## not a particularly useful format
106116
## lets convert to a data frame

traits/03-access-r-traits.Rmd

Lines changed: 28 additions & 22 deletions
@@ -8,59 +8,61 @@ The rOpenSci traits package makes it easier to query the TERRA REF trait databas
88

99
Install the traits package
1010

11-
The traits package is on CRAN, and can therefore be installed using the following command:
11+
The traits package can be installed through github using the following command:
1212

1313
```{r install_traits, echo = TRUE, message = FALSE}
14-
install.packages('traits', repos = 'http://cran.rstudio.com/')
14+
15+
if(packageVersion("traits") == '0.2.0'){
16+
devtools::install_github('ropensci/traits')
17+
}
18+
1519
```
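Review note: `packageVersion("traits")` raises an error when the package is not installed at all, so the new chunk only works for readers who already have some version of traits. A guarded check is sketched below; the helper name `needs_github_install` is hypothetical, and the version string `'0.2.0'` is taken from the chunk above rather than from any knowledge of which release is required. The sketch only prints a message instead of installing, so it runs offline.

```r
# Hypothetical helper: TRUE when `pkg` is missing, or when the installed
# version equals the release that the chunk above wants to replace.
needs_github_install <- function(pkg, cran_version = "0.2.0") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  utils::packageVersion(pkg) == cran_version
}

if (needs_github_install("traits")) {
  message("traits is missing or at 0.2.0; ",
          "install with devtools::install_github('ropensci/traits')")
}
```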
1620

1721
Load other packages that we will need to get started.
1822

19-
```{r 00-setup, message = FALSE, echo = TRUE}
23+
```{r 00-setup, message = FALSE, echo = TRUE, warning = FALSE}
2024
library(traits)
2125
library(ggplot2)
2226
library(ggthemes)
2327
theme_set(theme_bw())
2428
library(dplyr)
2529
```
30+
Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. You will need this personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions.
2631

27-
Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. The public key will provide access to all metadata; you will need a personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions.
2832

2933
```{r writing-key, echo = TRUE}
3034
# This should be done once with the key sent to you in your email
31-
# writeLines('abcdefg_rest_of_key_sent_in_email',
35+
36+
# Example:
37+
#writeLines('abcdefg_rest_of_key_sent_in_email',
3238
# con = '.betykey')
3339
34-
# Example with the public key:
35-
writeLines('9999999999999999999999999999999999999999',
36-
con = '.betykey_public')
3740
```
3841

42+
3943
#### R - using the traits package
4044

4145
The R traits package is an API 'client'. It does two important things:
4246
1. It makes it easier to specify the query parameters without having to construct a URL
4347
2. It returns the results as a data frame, which is easier to use within R
4448

45-
Lets start with the query of information about Sorghum from species table from above
49+
Lets start with the query of information about Sorghum from the species table
4650

47-
```{r query-species, echo = TRUE}
51+
```{r query-species, results = 'hide', echo = TRUE}
4852
4953
sorghum_info <- betydb_query(table = 'species',
50-
genus = "Sorghum",
51-
api_version = 'v1',
52-
limit = 'none',
53-
betyurl = "https://terraref.ncsa.illinois.edu/bety/",
54-
key = readLines('.betykey', warn = FALSE))
54+
genus = "Sorghum",
55+
api_version = 'v1',
56+
limit = 'none',
57+
betyurl = "https://terraref.ncsa.illinois.edu/bety/",
58+
key = readLines('.betykey', warn = FALSE))
5559
5660
```
5761

5862
#### R - setting options for the traits package
5963

6064
Notice all of the arguments that the `betydb_query` function requires? We can change this by setting the default connection options thus:
6165

62-
63-
6466
```{r 03-set-up, echo = TRUE}
6567
options(betydb_key = readLines('.betykey', warn = FALSE),
6668
betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
@@ -69,7 +71,8 @@ options(betydb_key = readLines('.betykey', warn = FALSE),
6971

7072
Now the same query can be reduced to:
7173

72-
```{r query-species-reduce, echo = TRUE, results = FALSE}
74+
```{r query-species-reduce, message = FALSE, echo = TRUE}
75+
7376
sorghum_info <- betydb_query(table = 'species',
7477
genus = "Sorghum",
7578
limit = 'none')
@@ -78,20 +81,23 @@ sorghum_info <- betydb_query(table = 'species',
7881
### Time series of height
7982

8083
Now let's query some trait data.
81-
```{r canopy_height, echo = TRUE, results = FALSE}
82-
sorghum_height <- betydb_query(table = 'search',
84+
85+
```{r canopy_height, echo = TRUE, message = FALSE}
86+
canopy_height <- betydb_query(table = 'search',
8387
trait = "canopy_height",
84-
sitename = "~Season 6",
88+
sitename = "~Season 2",
8589
limit = 'none')
8690
```
8791

8892
```{r plot_height}
89-
ggplot(data = sorghum_height,
93+
94+
ggplot(data = canopy_height,
9095
aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) +
9196
geom_point(size = 0.5, position = position_jitter(width = 0.1)) +
9297
# scale_x_datetime(date_breaks = '6 months') +
9398
xlab("Day of Year") + ylab("Plant Height") +
9499
guides(color = guide_legend(title = 'Genotype')) +
95100
theme_bw()
101+
96102
```
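Review note: the x aesthetic converts `raw_date` to a day of year with `lubridate::yday(lubridate::ymd_hms(...))`. The same conversion is sketched below in base R on made-up timestamps, assuming `raw_date` arrives as `YYYY-MM-DD HH:MM:SS` text in UTC; the actual format returned by the database is not shown in this diff.

```r
# Made-up timestamps for illustration; not data from the TERRA REF database
raw_date <- c("2017-06-01 12:00:00", "2017-12-31 12:00:00")

# Parse the text, then read the zero-based day-of-year field and add 1
parsed <- as.POSIXct(raw_date, tz = "UTC", format = "%Y-%m-%d %H:%M:%S")
day_of_year <- as.POSIXlt(parsed)$yday + 1L
print(day_of_year)
```

Because day of year discards the year, observations from different years would overlap on this axis, which is worth keeping in mind if a site name pattern ever matches more than one season.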
97103

traits/04-danforth-indoor-phenotyping-facility.Rmd

Lines changed: 4 additions & 2 deletions
@@ -1,6 +1,7 @@
11
# Danforth Indoor Phenotype Analysis
22

33
```{r 02-setup, include=FALSE}
4+
45
knitr::opts_chunk$set(echo = TRUE, cache = TRUE)
56
library(jsonlite)
67
library(dplyr)
@@ -21,7 +22,8 @@ library(traits)
2122
Unlike the first two tutorials, now we will be querying real data from the public TERRA REF database. So we will use a new URL, https://terraref.ncsa.illinois.edu/bety/, and we will need to use our own private key.
2223

2324
```{r terraref-connect-options}
24-
options(betydb_key = readLines('traits/.betykey', warn = FALSE),
25+
26+
options(betydb_key = readLines('.betykey', warn = FALSE),
2527
betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
2628
betydb_api_version = 'v1')
2729
```
@@ -92,7 +94,7 @@ ggplot(data = danforth_sorghum) +
9294

9395
### Growth rate over time
9496

95-
```{r danforth-phenotypes, fig.width=8, fig.height=4}
97+
```{r danforth-phenotypes, fig.width=8, fig.height=4, message = FALSE}
9698
9799
ggplot(data = danforth_sorghum, aes(x = date, y = mean, color = cultivar)) +
98100
# geom_line(aes(group = entity), size = 0.1) +
