Commit 6d839f3

Added in vignette contents
1 parent 467d274 commit 6d839f3

1 file changed

Lines changed: 100 additions & 1 deletion

vignettes/03-get-images-python.Rmd

---
title: "Get Source Image Files"
output: html_document
---

# Objective: Demonstrate how to locate and retrieve RGB image files

This vignette shows how to locate and retrieve image files associated with growing Season 6
from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/)
using Python. The files are stored online on the data management system Clowder,
which is accessed through an API. We will work with the image files generated during the
month of May by limiting the requests to that time period.

After completing this vignette, it should be possible to search for and retrieve other
files through the API.

As an added bonus, we've also included an example of how to retrieve the list of available
sensor names through the API. Using the returned sensor names, it's possible to retrieve
other files containing the data those sensors have collected.

## Locating the images

To begin looking for files, a sensor name and site name are needed. We will be using
'RGB GeoTIFFs Datasets' as the sensor name and '' as the site name. Later in this
vignette we show how to retrieve the list of available sensors.

As mentioned in the overview, the URL string points to the API to use. In this case
we'll be using "https://terraref.ncsa.illinois.edu/clowder/api", and the key will be the
one you received in an email.

```{python eval=FALSE}
from terrautils.products import get_file_listing

url = 'https://terraref.ncsa.illinois.edu/clowder/api'
key = 'YOUR_KEY_GOES_HERE'
sensor = 'RGB GeoTIFFs Datasets'
sitename = ''
files = get_file_listing(None, url, key, sensor, sitename,
                         since='2018-05-01', until='2018-05-31')
```

The `files` variable now contains an array of all the files in the datasets that match the
sensor in the plot for the month of May. When performing your own queries, it's possible that
no matches are found, in which case the `files` array will be empty.

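Before processing the results, it can help to check for that empty case. Here is a minimal sketch; the helper name `summarize_listing` is ours and not part of `terrautils`:

```{python eval=FALSE}
def summarize_listing(files):
    """Return a short summary of a file listing.

    `files` is assumed to be the list returned by get_file_listing;
    an empty list means no datasets matched the query.
    """
    if not files:
        return 'No files matched the query'
    return 'Found {} matching files'.format(len(files))
```

If the summary reports no matches, try widening the date range or using a different sensor name.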
## Retrieving the images

Now that we have a list of files, we can retrieve them one by one. We do this by creating a URL
that identifies the file to retrieve, making the API call to retrieve the file contents, and writing
the contents to disk.

To create the correct URL, we start with the one defined before and append the keyword '/files/'
followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving
the file would be:

```{sh eval=FALSE}
https://terraref.ncsa.illinois.edu/clowder/api/files/111
```

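This URL construction can be captured in a small helper (hypothetical, not part of `terrautils`):

```{python eval=FALSE}
def file_download_url(base_url, file_id):
    # Join the API base URL with the '/files/' keyword and the file ID.
    return base_url.rstrip('/') + '/files/' + str(file_id)

# file_download_url('https://terraref.ncsa.illinois.edu/clowder/api', '111')
# returns 'https://terraref.ncsa.illinois.edu/clowder/api/files/111'
```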
By looping through each of the files returned in the previous example, and using their ID and
filename, we can retrieve the files from the server and store them locally.

We stream the data returned by the server (`stream=True` in the code below) because the
files are likely to be large. If the `stream=True` parameter were omitted, the file's entire
contents would be held in memory as `r.content`, which could then be written to the local file.

```{python eval=FALSE}
import requests

# We are using the same `url` and `key` variables declared in the previous example above.
fileurl = url + '/files/'
params = {'key': key}

for f in files:
    # Stream each file's contents and write it to disk in chunks.
    r = requests.get(fileurl + f.id, params=params, stream=True)
    with open(f.filename, 'wb') as o:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                o.write(chunk)
```

The images are now stored on the local file system.

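As an optional sanity check, you can confirm the downloads landed on disk. This is a hypothetical helper of ours, assuming the files were written to the current working directory:

```{python eval=FALSE}
import os

def missing_downloads(filenames):
    """Return the expected filenames that are not present on disk."""
    return [name for name in filenames if not os.path.isfile(name)]
```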
## Retrieving sensor names

In this section we retrieve the names of the different sensor types that are available. This will
allow you to retrieve files other than those containing RGB image data.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous example above.
from terrautils.products import get_sensor_list, unique_sensor_names

sensors = get_sensor_list(None, url, key)
names = unique_sensor_names(sensors)
```

The variable `names` will now contain the list of all available sensors. Using these sensor
names, it's possible to use the search shown above to locate and retrieve additional data files.
Simply substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is
assigned above.
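Since the returned list can be long, a small filter can help narrow it down before running a new search. The helper `find_sensors` below is our name, not a `terrautils` function:

```{python eval=FALSE}
def find_sensors(names, keyword):
    # Case-insensitive substring match over the sensor name list.
    keyword = keyword.lower()
    return [name for name in names if keyword in name.lower()]

# For example, find_sensors(names, 'thermal') would list any thermal sensors.
```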
