
Commit 187f538

Merge branch 'traits_tutorials' of github.com:terraref/tutorials into traits_tutorials
2 parents b65ce86 + bcc4a5b commit 187f538

2 files changed: 121 additions & 2 deletions


index.Rmd

Lines changed: 5 additions & 1 deletion
We have tried to write these tutorials using open access sample data sets. However, access to much of the data will require you to 1) fill out the TERRA REF Beta user questionnaire ([terraref.org/beta](terraref.org/beta)) and 2) request access to specific databases.

<!-- Not sure where this goes, either in documentation or perhaps in an appendix. But I don't think this belongs in the introduction. Perhaps after the vignettes chapter
-->

## Ways of Accessing Data

The TERRA REF Technical Documentation: [docs.terraref.org](docs.terraref.org)

- about the data in the [reference-data repository](https://github.com/terraref/reference-data/issues)

```{r, include = FALSE}
knitr::opts_chunk$set(echo = FALSE,
                      engine.path = list(
                        python = 'python3'
                      ))

options(warn = -1)
```
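The `engine.path` option above points knitr's Python chunks at the `python3` interpreter. As a quick sanity check (our suggestion, not part of the original file), a Python chunk can confirm which interpreter it actually runs under:

```python
# Confirm the chunk is executed by a Python 3 interpreter
# (relevant because engine.path above selects 'python3').
import sys

print(sys.executable)        # path of the interpreter that was launched
print(sys.version_info[:2])  # major and minor version, e.g. (3, 8)
```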

vignettes/03-get-images-python.Rmd

Lines changed: 116 additions & 1 deletion
---
title: "Get Source Image Files"
output: html_document
---

# Objective: Demonstrate how to locate and retrieve RGB image files

This vignette shows how to locate and retrieve image files associated with growing Season 6 at the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) using Python. The files are stored online in the data management system Clowder, which is accessed through an API. We will work with the image files generated during the month of May by limiting the requests to that time period.

After completing this vignette you should be able to search for and retrieve other files through the API.

As an added bonus, we have also included an example of how to retrieve the list of available sensor names through the API. Using the sensor names returned, it is possible to retrieve other files containing the data those sensors have collected.

**Requirements**

* Python 3
* the terrautils library
  * this can be installed from PyPI by running `pip install terrautils` in the terminal
* an API key to access these data

The API key is a string generated on request through your Clowder account. Existing API keys will work with this vignette. To get a new API key, first register with Clowder at "https://terraref.ncsa.illinois.edu/clowder/": click the `Login` button and wait for the login screen to appear, then select the `Sign up` button and enter an email address you have access to. An email with instructions for completing the registration process is sent to that address. Once registration is complete, log into Clowder and select the `View profile` menu option from the drop-down near the search control. Clicking the `+ Add` button under the "User API Keys" heading on the profile page generates a new key.
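The examples below paste the key directly into the code. A common alternative (our suggestion, not part of the original vignette) is to keep the key in an environment variable so it never lands in version control:

```python
# Read the Clowder API key from an environment variable instead of
# hard-coding it; CLOWDER_API_KEY is an illustrative name we chose,
# not one required by Clowder or terrautils.
import os

key = os.environ.get('CLOWDER_API_KEY', 'YOUR_KEY_GOES_HERE')
```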
## Locating the images

To begin looking for files, a sensor name and site name are needed. We will use 'RGB GeoTIFFs Datasets' as the sensor name and '' (an empty string) as the site name. Later in this vignette we show how to retrieve the list of available sensors.

As mentioned in the overview, the url string points to the API to use. In this case we'll use "https://terraref.ncsa.illinois.edu/clowder/api", and the key will be the one you created for your Clowder account.

```{python eval=FALSE}
from terrautils.products import get_file_listing

url = 'https://terraref.ncsa.illinois.edu/clowder/api'
key = 'YOUR_KEY_GOES_HERE'
sensor = 'RGB GeoTIFFs Datasets'
sitename = ''
files = get_file_listing(None, url, key, sensor, sitename,
                         since='2018-05-01', until='2018-05-31')
```

The `files` variable now contains an array of all the files in the datasets that match the sensor in the plot for the month of May. When performing your own queries it is possible that no matches are found, in which case the `files` array will be empty.
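The `since`/`until` arguments restrict the listing to a time window on the server side. To make that window concrete, here is a self-contained sketch of the same kind of date filtering, using hypothetical records rather than a live API call:

```python
# Hypothetical file records; a real listing comes from get_file_listing().
from datetime import date

records = [
    {'filename': 'rgb_0001.tif', 'date': date(2018, 4, 30)},
    {'filename': 'rgb_0002.tif', 'date': date(2018, 5, 15)},
    {'filename': 'rgb_0003.tif', 'date': date(2018, 6, 2)},
]

# Keep only records that fall inside the May window, inclusive on both ends
since, until = date(2018, 5, 1), date(2018, 5, 31)
may_files = [r for r in records if since <= r['date'] <= until]
print([r['filename'] for r in may_files])  # → ['rgb_0002.tif']
```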
## Retrieving the images

Now that we have a list of files we can retrieve them one by one. We do this by creating a URL that identifies the file to retrieve, making the API call to retrieve the file contents, and writing the contents to disk.

To create the correct URL we start with the one defined before and append the keyword '/files/' followed by the ID of each file. Assuming a file ID of '111', the final URL for retrieving the file would be:

```{sh eval=FALSE}
https://terraref.ncsa.illinois.edu/clowder/api/files/111
```
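The URL assembly is plain string concatenation; a minimal Python sketch, using the hypothetical ID '111' from above:

```python
# Build the per-file download URL from the API base and a file ID.
url = 'https://terraref.ncsa.illinois.edu/clowder/api'
file_id = '111'  # hypothetical ID; real IDs come from the file listing

fileurl = url + '/files/' + file_id
print(fileurl)  # → https://terraref.ncsa.illinois.edu/clowder/api/files/111
```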
By looping through each of the files returned in the previous example, and using their ID and filename, we can retrieve the files from the server and store them locally.

We stream the data returned from the server request (`stream=True` in the code below) because the files are likely to be large. If the `stream=True` parameter were omitted, the file's entire contents would be held in the `r` variable and could then be written to the local file in one step.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous example.
import requests

fileurl = url + '/files/'
params = {'key': key}

for f in files:
    r = requests.get(fileurl + f.id, params=params, stream=True)
    with open(f.filename, 'wb') as o:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                o.write(chunk)
```

The images are now stored on the local file system.
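The chunked-write pattern in the loop above does not depend on HTTP. Here is a self-contained sketch of the same pattern, using an in-memory buffer in place of the streamed response, to show what reading in fixed-size chunks does:

```python
# Simulate a streamed download: read fixed-size chunks from a source
# and write each non-empty chunk to an output buffer, as the
# iter_content loop above does with the HTTP response body.
import io

source = io.BytesIO(b'x' * 2500)  # stands in for the response body
out = io.BytesIO()                # stands in for the local file

chunk_size = 1024
while True:
    chunk = source.read(chunk_size)
    if not chunk:  # empty read means the stream is exhausted
        break
    out.write(chunk)

print(len(out.getvalue()))  # → 2500
```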
## Retrieving sensor names

In this section we retrieve the names of the different sensor types that are available. This will allow you to retrieve files other than those containing RGB image data.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous examples.
from terrautils.products import get_sensor_list, unique_sensor_names

sensors = get_sensor_list(None, url, key)
names = unique_sensor_names(sensors)
```

The variable `names` now contains the list of all available sensors. Using these sensor names it is possible to repeat the search above to locate and retrieve additional data files: substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is assigned.
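`unique_sensor_names` reduces the raw sensor listing to distinct names. Assuming each sensor record carries a name field (an assumption about the data shape, not taken from the vignette), the deduplication step amounts to:

```python
# Hypothetical sensor records; real ones come from get_sensor_list().
sensors = [
    {'name': 'RGB GeoTIFFs Datasets'},
    {'name': 'Thermal IR GeoTIFFs Datasets'},
    {'name': 'RGB GeoTIFFs Datasets'},  # duplicates collapse to one entry
]

# dict.fromkeys removes duplicates while preserving first-seen order
names = list(dict.fromkeys(s['name'] for s in sensors))
print(names)  # → ['RGB GeoTIFFs Datasets', 'Thermal IR GeoTIFFs Datasets']
```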
