Merge branch 'traits_tutorials' of github.com:terraref/tutorials into traits_tutorials

kimberlyh66 · kimberlyh66 · commit 026acff173b7 · 2018-12-10T12:52:33.000-08:00
diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd
@@ -1,106 +1,106 @@
-# Accessing Trait Data Via the BETYdb API
-
-## Using URLs to construct Queries
-
-The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. 
-
-### What is an API?
-
-An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data. 
-
-All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs. 
-
-
-### Using Your API key to Connect
-
-An API key is like a password. It allows you to access data, and should be kept private. 
-Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the https://terraref.ncsa.illinois.edu/bety-test database.
-
-A common way of handling private API keys is to place it in a text file in your current directory. 
-Don't put it in a project directory where it might be inadvertently shared.
-
-Here is how to find and save your API key:
-
-* click file --> new --> text file
-* copy the api key that was sent when you registered into the file
-* file --> save as '.betykey'
-
-For the public key, you can call this file `.betykey_public`. 
-
-### Ways to access API data
-
-1. Through a URL query
-2. Using the bash shell
-3. Using the R jsonlite package
-
-
-### Accessing data using a URL query
-
-
-## Components of a URL query
-
-* base url: `terraref.ncsa.illinois.edu/bety`
-* path to the api: `/api/v1`
-* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. 
-* Query parameters: `genus=Sorghum`
-* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. 
-
-## Constructing a URL query
-
-First, lets construct a query by putting together a URL.
-
-1. start with the database url: `terraref.ncsa.illinois.edu/bety`
-  * this url brings you to the home page
-2. Add the path to the API, `/api/v1`
-  * now we have terraref.ncsa.illinois.edu/bety/api/v1, which points to the API documentation
-3. Add the name of the table you want to query. Lets start with `variables`
-  * terraref.ncsa.illinois.edu/bety/api/v1/variables
-4. add query terms by appending a `?` and combining with `&`, for example:
-  * `key=9999999999999999999999999999999999999999`
-  * `type=trait` where the variable type is 'trait'
-  * `name=~height` where the variable name contains 'height'
-5. This is your complete query:
-  * `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999`
-  * it will query all variables that are type trait and have 'height' in the name
-  * Does it return the expected values?
-  
-## Your Turn
-
-> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return?
-
-> Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner`
-
-What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? 
-
-
-
-#### Accessing data using the Shell
-
-Type the following command into a bash shell (the `-o` option names the output file): 
-
-```sh
-curl -o sorghum.json \
-   "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999"
-```
-
-If you want to write the query without exposing the key in plain text, you can construct it like this:
-
-```sh
-curl -o sorghum.json \
-    "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`"
-```
-
-### Accessing API data using the R jsonlite package
-
-```{r text-api, warning = FALSE}
-sorghum.json <- readLines(
-  paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", 
-         readLines('traits/.betykey')))
-
-## print(sorghum.json) 
-## not a particularly useful format
-## lets convert to a data frame
-sorghum <- jsonlite::fromJSON(sorghum.json)
-```
-
-More on how to use the rOpenSci traits package coming up in the [next tutorial](03-access-r-traits.Rmd)
+# Accessing Trait Data Via the BETYdb API
+
+## Using URLs to construct Queries
+
+The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. 
+
+### What is an API?
+
+An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data. 
+
+All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs. 
+
+
+### Using Your API key to Connect
+
+An API key is like a password. It allows you to access data, and should be kept private. 
+Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the https://terraref.ncsa.illinois.edu/bety-test database.
+
+A common way of handling private API keys is to place it in a text file in your current directory. 
+Don't put it in a project directory where it might be inadvertently shared.
+
+Here is how to find and save your API key:
+
+* click file --> new --> text file
+* copy the api key that was sent when you registered into the file
+* file --> save as '.betykey'
+
+For the public key, you can call this file `.betykey_public`. 
+
+### Ways to access API data
+
+1. Through a URL query
+2. Using the bash shell
+3. Using the R jsonlite package
+
+
+### Accessing data using a URL query
+
+
+## Components of a URL query
+
+* base url: `terraref.ncsa.illinois.edu/bety`
+* path to the api: `/api/v1`
+* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. 
+* Query parameters: `genus=Sorghum`
+* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. 
+
+## Constructing a URL query
+
+First, lets construct a query by putting together a URL.
+
+1. start with the database url: `terraref.ncsa.illinois.edu/bety`
+  * this url brings you to the home page
+2. Add the path to the API, `/api/v1`
+  * now we have terraref.ncsa.illinois.edu/bety/api/v1, which points to the API documentation
+3. Add the name of the table you want to query. Lets start with `variables`
+  * terraref.ncsa.illinois.edu/bety/api/v1/variables
+4. add query terms by appending a `?` and combining with `&`, for example:
+  * `key=9999999999999999999999999999999999999999`
+  * `type=trait` where the variable type is 'trait'
+  * `name=~height` where the variable name contains 'height'
+5. This is your complete query:
+  * `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999`
+  * it will query all variables that are type trait and have 'height' in the name
+  * Does it return the expected values?
+  
+## Your Turn
+
+> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return?
+
+> Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner`
+
+What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? 
+
+
+
+#### Accessing data using the Shell
+
+Type the following command into a bash shell (the `-o` option names the output file): 
+
+```sh
+curl -o sorghum.json \
+   "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999"
+```
+
+If you want to write the query without exposing the key in plain text, you can construct it like this:
+
+```sh
+curl -o sorghum.json \
+    "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`"
+```
+
+### Accessing API data using the R jsonlite package
+
+```{r text-api, warning = FALSE}
+sorghum.json <- readLines(
+  paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", 
+         readLines('traits/.betykey')))
+
+## print(sorghum.json) 
+## not a particularly useful format
+## lets convert to a data frame
+sorghum <- jsonlite::fromJSON(sorghum.json)
+```
+
+More on how to use the rOpenSci traits package coming up in the [next tutorial](03-access-r-traits.Rmd)
diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd
@@ -6,7 +6,9 @@ The rOpenSci traits package makes it easier to query the TERRA REF trait databas
 
 First, make sure we have the latest version from the terraref fork of the repository on github. (you can install using the standard `install.packages('traits')` but as of Jan 2018 this version times out on very large datasets).
 
-### Install the package
+### Install the traits package
+
+This is for users working on their own computer - if you are using the TERRA REF Rstudio Docker container (including on workbench) you can skip this step. 
 
 ```{r install_traits, echo = FALSE, include = FALSE}
 devtools::install_github('terraref/traits')
@@ -16,7 +18,6 @@ Now, we can load the packages that we will need to get started.
 
 ```{r 00-setup, message = FALSE}
 library(traits)
-knitr::opts_chunk$set(echo = FALSE, cache = TRUE)
 library(ggplot2)
 library(ggthemes)
 theme_set(theme_bw())
@@ -59,6 +60,7 @@ sorghum_info <- betydb_query(table = 'species',
 Notice all of the arguments that the `betydb_query` function requires? We can change this by setting the default connection options thus:
 
 
+
 ```{r set-up, echo = TRUE}
 options(betydb_key = readLines('traits/.betykey', warn = FALSE),
         betydb_url = "https://terraref.ncsa.illinois.edu/bety/",