Commit 52c3a61

reorganizing tutorials - mostly separating out different methods of access
1 parent c06eb82 commit 52c3a61

9 files changed

Lines changed: 285 additions & 245 deletions

_bookdown.yml

Lines changed: 5 additions & 2 deletions
@@ -3,5 +3,8 @@ output_dir: "docs"
33
language:
44
ui:
55
chapter_name: "Chapter "
6-
rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-simulated-sorghum.Rmd", "traits/02-danforth-phenotyping-facility.Rmd",
7-
"traits/03-maricopa-field-scanner.Rmd","traits/04-agronomic-metadata.Rmd"]#, "sensors/05-command-line-hyperspectral-workflow.Rmd"]
6+
rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd",
7+
"traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd",
8+
"traits/04-danforth-indoor-phenotyping-facility.Rmd", "traits/05-maricopa-field-scanner.Rmd",
9+
"traits/06-agronomic-metadata.Rmd"]#, "10-simulated-sorghum.Rmd"
10+

traits/00-BETYdb-getting-started.Rmd

Lines changed: 12 additions & 213 deletions
@@ -8,8 +8,10 @@ output: html_document
88
## TERRA Ref Trait Database
99

1010
The TERRA Ref program uses the BETYdb database and web application software to store plant and plot level trait data.
11+
We use BETYdb to organize, manage and distribute agricultural and ecological data.
12+
It contains trait (phenotype) data at the plot or plant level as well as metadata including plot boundaries, experimental design, methods, trait definitions, cultivars (genotypes) and agronomic management.
1113

12-
### BETYdb: database software and web application
14+
### Introduction to BETYdb
1315

1416
The TERRA REF trait database (terraref.ncsa.illinois.edu/bety) uses the BETYdb data schema (structure) and web application.
1517
The BETYdb software is actively used and developed by the [TERRA Reference](terraref.org) program as well as by the [PEcAn project](pecanproject.org).
@@ -22,224 +24,21 @@ For more information about BETYdb, see the following:
2224
* _BETYdb Technical Documentation_ is written for advanced users and website and database administrators who may also be interested in the [full database schema](betydb.org/schemas)
2325
* BETYdb: A Yield, Trait and Ecosystem Service Database Applied to Second Generation Bioenergy Feedstocks. ([LeBauer et al, 2017](dx.doi.org/10.1111/gcbb.12420))
2426

25-
There are at least a half-dozen other databases using the BETYdb software that these exercises will work with, though the results will depend on the available data.
26-
The first, betydb.org is described in LeBauer et al, 2017.
27-
Others are listed in the 'distributed BETYdb' section of the technical documentation.
27+
In addition to the TERRA REF trait database, a handful of other projects use the BETYdb software, mostly within the PEcAn and TERRA programs. The content presented here focuses on the TERRA REF instance of BETYdb. Most of it is relevant to other databases, but the TERRA program places more emphasis on trait diversity among cultivars or genotypes within a crop, whereas PEcAn focuses on the diversity of traits within ecosystems and plant functional types. The TERRA program is also more focused on high-throughput phenotyping (intensive monitoring of agricultural breeding trials), whereas PEcAn focuses on assimilating heterogeneous data to forecast ecosystem functioning. Fortunately, both use cases can draw on the same ecosystem of software. For example, the PEcAn crop modeling infrastructure can be used directly to infer additional targets of breeding, and the diversity of traits observed in breeding trials can be a first step toward predicting the impacts of crop traits on productivity and ecosystem functioning.
28+
29+
The original instance, betydb.org, is described in LeBauer et al, 2017.
30+
Other instances are listed in the 'distributed BETYdb' section of the technical documentation.
2831

2932
When there is a public-facing website, BETYdb is designed to keep only its trait and yield data private.
3033
Metadata such as field management and experimental design are available if the URL is public.
3134

32-
## Getting an account for the TERRA trait database
33-
34-
* sign up for an account at terraref.ncsa.illinois.edu/bety
35-
* sign up for alpha user [link to form]
36-
* wait for database access to be granted
37-
* Your API key will be sent in the email; it can also be found - and regenerated - by navigating to 'data --> users' in the web interface
38-
39-
TODO add signup info from handout
40-
41-
## First steps: download data from web interface
42-
43-
* Point your browser to terraref.ncsa.illinois.edu/bety
44-
* login
45-
* enter "NDVI" in the search box
46-
* on the next page you will see the results of this search
47-
* if you want all of the data, including data that has not gone through QA/QC, make sure to check the 'include unchecked records' option
48-
* in the upper right, you will see a button that will allow you to download the search results as a CSV file. Click it. Open the file in a text editor or spreadsheet program and review its contents.
49-
50-
Note that the web interface only provides a core set of data and limited meta-data. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in the [Agronomic metadata](../traits/04-agronomic-metadata.Rmd).
51-
52-
## Advanced: Using URLs to construct Queries
53-
54-
The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now.
55-
56-
### What is an API?
57-
58-
An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data.
59-
60-
All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs.
61-
62-
63-
### Using Your API key to Connect
64-
65-
An API key is like a password. It allows you to access data, and should be kept private.
66-
Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the terraref.ncsa.illinois.edu/bety-test database.
67-
68-
A common way of handling private API keys is to place it in a text file in your home directory.
69-
Don't put it in a project directory where it might be inadvertently shared.
70-
71-
Here is how to find and save your API key:
72-
73-
* click file --> new --> text file
74-
* copy the api key that was sent when you registered into the file
75-
* file --> save as '~/.betykey'
76-
77-
For the public key, you can call this file `~/.betykey_public`.
78-
79-
### Components of a URL query
80-
81-
82-
* base url: `terraref.ncsa.illinois.edu/bety`
83-
* path to the api: `/api/beta`
84-
* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables.
85-
* Query parameters: `genus=Sorghum`
86-
* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database.
87-
88-
89-
### Constructing a URL query
90-
91-
First, lets construct a query by putting together a URL.
92-
93-
1. start with the database url: `terraref.ncsa.illinois.edu/bety`
94-
* this url brings you to the home page
95-
2. Add the path to the API, `/api/beta`
96-
* now we have terraref.ncsa.illinois.edu/bety/api/beta, which points to the API documentation
97-
3. Add the name of the table you want to query. Lets start with `variables`
98-
* terraref.ncsa.illinois.edu/bety/api/beta/variables
99-
4. add query terms by appending a `?` and combining with `&`, for example:
100-
* `key=9999999999999999999999999999999999999999`
101-
* `type=trait` where the variable type is 'trait'
102-
* `name=~height` where the variable name contains 'height'
103-
5. This is your complete query:
104-
* `terraref.ncsa.illinois.edu/bety/api/beta/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999`
105-
* it will query all variables that are type trait and have 'height' in the name
106-
* Does it return the expected values?
107-
108-
109-
#### Your Turn
110-
111-
> What will the URL https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return?
112-
113-
> write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner`
114-
115-
What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`?
116-
117-
### Our first Query
118-
119-
#### Shell
120-
121-
```sh
122-
wget -O sorghum.json \\ # -O names the output file
123-
"https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=999999999999999999999999999999999999
124-
9999"
125-
```
126-
127-
If you want to write the query without exposing the key in plain text, you can construct it thus:
128-
129-
```sh
130-
wget -O sorghum.json \\
131-
"https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=`cat ~/.betykey_public`"
132-
```
133-
134-
> What does `cat ~/.betykey_public` do?
135-
136-
> How can you look at the files?
35+
### Current Contents
13736

37+
The traitvis webapp, which is updated daily, provides an interface for exploring the available data. The page is embedded below.
13838

139-
#### R - using the jsonlite package
140-
141-
```{r text-api}
142-
sorghum.json <- readLines(
143-
paste0("https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=",
144-
readLines('~/.betykey')))
145-
146-
## print(sorghum.json)
147-
## not a particularly useful format
148-
## lets convert to a data frame
149-
sorghum <- jsonlite::fromJSON(sorghum.json)
150-
```
151-
152-
## Using the R traits package to query the database
153-
154-
The rOpenSci traits package makes it easier to query the TERRA REF trait database, or any database that uses BETYdb software.
155-
156-
First, make sure we have the latest version from the terraref fork of the repository on github. (you can install using the standard `install.packages('traits')` but I can't promise everything will work.
157-
158-
### Install the package
159-
160-
```{r install_traits, echo=FALSE}
161-
devtools::install_github('terraref/traits')
162-
```
163-
164-
Now, we can load the packages that we will need to get started.
165-
166-
```{r 00-setup}
167-
library(traits)
168-
knitr::opts_chunk$set(echo = FALSE, cache = TRUE)
169-
library(ggplot2)
170-
library(ggthemes)
171-
theme_set(theme_bw())
172-
library(dplyr)
173-
```
174-
175-
176-
177-
```{r writing-key}
178-
# This should be done once with the key sent to you in your email
179-
# writeLines('abcdefg_rest_of_key_sent_in_email',
180-
# con = '~/.betykey')
181-
182-
# Example with the public key:
183-
writeLines('9999999999999999999999999999999999999999',
184-
con = '~/.betykey_public')
185-
```
186-
187-
#### R - using the traits package
188-
189-
The R traits package is an API 'client'. It does two important things:
190-
1. It makes it easier to specify the query parameters without having to construct a URL
191-
2. It returns the results as a data frame, which is easier to use within R
192-
193-
Lets start with the query of information about Sorghum from species table from above
194-
195-
```{r query-species}
196-
197-
sorghum_info <- betydb_query(table = 'species',
198-
genus = "Sorghum",
199-
api_version = 'beta',
200-
limit = 'none',
201-
betyurl = "https://terraref.ncsa.illinois.edu/bety/",
202-
key = readLines('~/.betykey', warn = FALSE))
203-
204-
```
205-
206-
#### R - setting options for the traits package
207-
208-
Notice all of the arguments that the `betydb_query` function requires? We can change this by setting the default connection options thus:
209-
210-
211-
```{r}
212-
options(betydb_key = readLines('~/.betykey', warn = FALSE),
213-
betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
214-
betydb_api_version = 'beta')
215-
```
216-
217-
Now the same query can be reduced to:
218-
219-
```{r sv_area}
220-
sorghum_height <- betydb_query(table = 'search',
221-
trait = "plant_height",
222-
site = "~MAC",
223-
api_version = 'beta',
224-
limit = 'none',
225-
betyurl = "https://terraref.ncsa.illinois.edu/bety/",
226-
key = readLines('~/.betykey', warn = FALSE))
227-
```
228-
229-
### Time series of height
230-
231-
Now we can take a look at the data that we have just queried.
39+
* GitHub repository: https://github.com/terraref/traitvis-webapp
40+
* website: https://traitvis.workbench.terraref.org
23241

23342
```{r}
234-
ggplot(data = sorghum_height,
235-
aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean, color = cultivar)) +
236-
geom_smooth(se = FALSE, size = 0.5) +
237-
geom_point(size = 0.5, position = position_jitter(width = 0.1)) +
238-
# scale_x_datetime(date_breaks = '6 months', date_labels = "%b %Y") +
239-
# ylim(c(0,6)) +
240-
xlab("Day of Year") + ylab("Plant Height") +
241-
guides(color = guide_legend(title = 'Genotype')) +
242-
theme_bw()
243-
43+
knitr::include_app("https://traitvis.workbench.terraref.org/", height = "1400px")
24444
```
245-

traits/01-web-access.Rmd

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
---
2+
title: "Accessing Trait Data Via the BETYdb Web Interface"
3+
author: "David LeBauer"
4+
date: "`r Sys.Date()`"
5+
output: html_document
6+
---
7+
8+
9+
10+
## Getting an account for the TERRA trait database
11+
12+
### Web interface
13+
14+
* Sign up for an account at https://terraref.ncsa.illinois.edu/bety
15+
* Sign up for the TERRA REF [beta user program](https://docs.google.com/forms/d/e/1FAIpQLScBsD042RrRok70BCGCRwARTcm9etvVHqvQaz1c5X7c5y0H3w/viewform?c=0&w=1)
16+
* Wait for database access to be granted
17+
* Your API key will be sent in the email. It can also be found - and regenerated - by navigating to the [Users page (data --> users)](https://terraref.ncsa.illinois.edu/bety/users) in the web interface.
18+
19+
TODO add signup info from handout
20+
21+
## Searching for data
22+
23+
On the Welcome page there is a search option for trait and yield data. This tool allows users to search the entire collection of trait data for specific sites, citations, species, and traits.
24+
25+
26+
### Download search results as a CSV file from the web interface
27+
28+
* Point your browser to terraref.ncsa.illinois.edu/bety
29+
* Log in
30+
* Enter "NDVI" in the search box
31+
* On the next page you will see the results of this search
32+
* If you want all of the data, including data that has not gone through QA/QC, make sure to check the 'include unchecked records' option
33+
* In the upper right, you will see a button that lets you download the search results as a CSV file. Click it, then open the file in a text editor or spreadsheet program and review its contents.
34+
35+
Note that the web interface only provides a core set of data and limited metadata. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in [Agronomic metadata](../traits/04-agronomic-metadata.Rmd), require this kind of multi-table search.
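
Once the search results have been downloaded, the CSV file can also be read into R for a first look. The chunk below is only a minimal sketch: it writes a small stand-in file so that it runs on its own, and the comment line, the column names (`trait`, `mean`, `cultivar`, `raw_date`) and all values are assumptions made for illustration, so a real download may look different. Replace `csv_file` with the path to your own file.

```{r read-search-csv}
# Hypothetical stand-in for a file downloaded from the search page.
# The comment line, column names and values are invented for illustration.
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "# example comment line (assumed; a real download may differ)",
  "trait,mean,cultivar,raw_date",
  "NDVI,0.61,cultivar_A,2017-05-01",
  "NDVI,0.72,cultivar_B,2017-05-01",
  "NDVI,0.68,cultivar_A,2017-05-08"
), con = csv_file)

# comment.char = "#" makes read.csv skip any lines that start with '#'
search_results <- read.csv(csv_file, comment.char = "#",
                           stringsAsFactors = FALSE)

# Inspect the column names and types
str(search_results)

# Average the 'mean' column within each cultivar
per_cultivar <- aggregate(mean ~ cultivar, data = search_results, FUN = mean)
print(per_cultivar)
```

Reading the file with base R keeps the example free of extra package dependencies; the same data frame can then be passed to plotting or summary functions.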
