
Commit bcc4a5b

Merge pull request #96 from terraref/images_vignette
Images vignette
2 parents 589f4c7 + f4f25d3 commit bcc4a5b

1 file changed

Lines changed: 116 additions & 1 deletion

File tree

vignettes/03-get-images-python.Rmd

@@ -1 +1,116 @@
---
title: "Get Source Image Files"
output: html_document
---

# Objective: Demonstrate how to locate and retrieve RGB image files

This vignette shows how to locate and retrieve image files associated with growing Season 6
at the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/)
using Python. The files are stored online in the data management system Clowder,
which is accessed through an API. We will work with the image files generated during the
month of May by limiting the requests to that time period.

After completing this vignette you should be able to search for and retrieve other
files through the API.

As an added bonus, we've also included an example of how to retrieve the list of available
sensor names through the API. Using the sensor names returned, it's possible to retrieve
other files containing the data those sensors have collected.

**Requirements**

* Python 3
* the terrautils library
  * this can be installed from PyPI by running `pip install terrautils` in the terminal
* an API key to access these data

The API key is a string that is generated on request through your Clowder account. Existing
API keys will work with this vignette. To get a new API key you must first register
with Clowder at "https://terraref.ncsa.illinois.edu/clowder/". First click the `Login` button and
wait for the login screen to appear, then select the `Sign up` button and enter an email
address you have access to. An email is sent to that address with instructions for
completing the registration process. Once registration is complete, log
into Clowder and select the `View profile` menu option from the drop-down near the search
control. Clicking the `+ Add` button under the "User API Keys" heading on the profile page
generates a new key.
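Rather than pasting the key into every script, one option is to read it from an environment variable. This is a minimal sketch: the `CLOWDER_KEY` variable name and the helper function are our own convention, not something Clowder or terrautils requires.

```{python eval=FALSE}
import os

def get_clowder_key(env_var='CLOWDER_KEY'):
    """Read the Clowder API key from an environment variable.

    'CLOWDER_KEY' is a naming convention chosen for this vignette;
    neither Clowder nor terrautils requires it.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError('Set %s to your Clowder API key' % env_var)
    return key
```

`key = get_clowder_key()` can then stand in for the hard-coded `'YOUR_KEY_GOES_HERE'` string used in the examples below.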

## Locating the images

To begin looking for files, a sensor name and a site name are needed. We will be using
'RGB GeoTIFFs Datasets' as the sensor name and '' (an empty string) as the site name. Later in
this vignette we show how to retrieve the list of available sensors.

As mentioned in the overview, the url string points to the API to use. In this case
we'll be using "https://terraref.ncsa.illinois.edu/clowder/api", and the key will be the
one you created for your Clowder account.

```{python eval=FALSE}
from terrautils.products import get_file_listing

url = 'https://terraref.ncsa.illinois.edu/clowder/api'
key = 'YOUR_KEY_GOES_HERE'
sensor = 'RGB GeoTIFFs Datasets'
sitename = ''
files = get_file_listing(None, url, key, sensor, sitename,
                         since='2018-05-01', until='2018-05-31')
```

The `files` variable now contains an array of all the files in the datasets that match the
sensor in the plot for the month of May. When performing your own queries it's possible that
no matches are found, in which case the `files` array will be empty.

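It's worth guarding against an empty result before moving on. A small sketch (the helper name `check_listing` is ours, not part of terrautils):

```{python eval=FALSE}
def check_listing(files):
    """Return the file listing unchanged, or raise when it is empty."""
    if not files:
        raise ValueError('No files matched the sensor/site/date query; '
                         'double-check the sensor name and date range')
    print('Found %d matching file(s)' % len(files))
    return files
```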
## Retrieving the images

Now that we have a list of files we can retrieve them one by one. We do this by building a URL
that identifies the file to retrieve, making the API call to fetch the file contents, and writing
the contents to disk.

To build the correct URL we start with the one defined before and append the keyword '/files/'
followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving
the file would be:

```{sh eval=FALSE}
https://terraref.ncsa.illinois.edu/clowder/api/files/111
```
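The same URL can be assembled in Python with plain string concatenation. A small sketch, where '111' is just the placeholder ID from above:

```{python eval=FALSE}
# Build a file-download URL from the API base URL and a file ID.
# '111' is only a placeholder; real IDs come from the file listing.
url = 'https://terraref.ncsa.illinois.edu/clowder/api'
file_id = '111'
file_url = url + '/files/' + file_id
print(file_url)  # https://terraref.ncsa.illinois.edu/clowder/api/files/111
```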

By looping through each of the files returned in the previous example, and using their ID and
filename, we can retrieve the files from the server and store them locally.

We stream the data returned from our server request (`stream=True` in the code below) because
the files are likely to be large. If the `stream=True` parameter were omitted, the file's entire
contents would be loaded into memory in the `r` variable before being written to the local file.

```{python eval=FALSE}
import requests

# We are using the same `url` and `key` variables declared in the previous example.
fileurl = url + '/files/'
params = {'key': key}

for f in files:
    r = requests.get(fileurl + f.id, params=params, stream=True)
    with open(f.filename, 'wb') as o:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                o.write(chunk)
```
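The chunked-write pattern can be tried in isolation, with no network access, by substituting an in-memory stream for the HTTP response. A self-contained sketch, where `io.BytesIO` merely stands in for the streamed response body:

```{python eval=FALSE}
import io

def write_in_chunks(source, dest, chunk_size=1024):
    """Copy a binary stream to a writable file object chunk by chunk,
    mirroring the r.iter_content() loop above."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)

# Stand-in for a streamed HTTP response body: 3000 bytes of data.
source = io.BytesIO(b'x' * 3000)
dest = io.BytesIO()
write_in_chunks(source, dest)
print(len(dest.getvalue()))  # 3000
```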

The images are now stored on the local file system.

## Retrieving sensor names

In this section we retrieve the names of the different sensor types that are available. This will
allow you to retrieve files other than those containing RGB image data.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous example.
from terrautils.products import get_sensor_list, unique_sensor_names

sensors = get_sensor_list(None, url, key)
names = unique_sensor_names(sensors)
```

The variable `names` now contains the list of all available sensors. Using these sensor
names it's possible to use the search above to locate and then retrieve additional data files.
Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is
assigned above.

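Since `names` is a plain list of strings, standard list tools apply when narrowing it down. A small sketch; the example names below are made up for illustration and are not guaranteed to match actual TERRA-REF sensor names:

```{python eval=FALSE}
# Hypothetical sensor names for illustration only; real names come from
# unique_sensor_names() above.
names = ['RGB GeoTIFFs Datasets', 'Thermal IR GeoTIFFs Datasets', 'EnvironmentLogger']

# Keep only the GeoTIFF-producing sensors.
geotiff_sensors = [n for n in names if 'GeoTIFFs' in n]
print(geotiff_sensors)  # ['RGB GeoTIFFs Datasets', 'Thermal IR GeoTIFFs Datasets']
```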
