Commit 5c5ab18

KristinaRiemer authored and dlebauer committed
WIP Add second walkthrough notes (#199)
* Create first draft of second walkthrough notes
* Add outline of second walkthrough
* Complete full version of notes for second walkthrough
* Edit notes for walkthrough
1 parent 815661e commit 5c5ab18

2 files changed

Lines changed: 396 additions & 0 deletions

videos/second_walkthrough.Rmd

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
1+
---
2+
title: "Second Walkthrough Notes"
3+
author: "Kristina Riemer"
4+
output: github_document
5+
urlcolor: blue
6+
---
7+
8+
## Introduction
9+
10+
The purpose of this walkthrough is to review the experimental design of TERRA REF, and show how to find and access data on the two platforms. The focus will be on image data, using RGB data from the project as an example.
11+
12+
## Video 1: Explore Data with Traitvis Web App
13+
14+
Go to [website](https://traitvis.workbench.terraref.org/), which takes a minute to load. Displays plots of various data across collection time.
15+
16+
As example, we'll look at data from Season 6, which we looked at last week, by going to "MAC Season 6" tab.
17+
18+
Can choose different variables and cultivars. Select "Canopy Cover" from first dropdown and set second dropdown to "PI656026" to look at cultivar from last time. How much ground is covered increases across year with max at 100. Hover over parts of graph to get specific values.
19+
20+
Change second dropdown to "None" to get all cultivars. Select "Map" tab to look at data spatially. Shows long field in Maricopa, Arizona.
21+
22+
## Video 2: TERRA REF Experimental Design
23+
24+
*first slide*
25+
26+
Field is passed over by large robot called a gantry. Lot of equipment in hanging box to collect many types of data.
27+
28+
Gantry goes systematically over entire field once a day. Some sensors take data every day, like 9,000 images from RGB camera. Others more intermittently, like hyperspectral images, because of data space limits.
29+
30+
*second slide* + *table*
31+
32+
Sensors include:
33+
34+
* camera that takes pairs of red-green-blue images (**Stereo RGB**)
35+
* thermal infrared images (**FLIR**)
36+
* images at a bunch of wavelengths to get hyperspectral data (**VNIR/SWIR**)
37+
* laser that collects points on plant surfaces to create 3D image (**3D Laser**)
38+
* measures plant fluorescence (**PS II Fluor**)
39+
* handful of others, including environmental data such as temperature and light
40+
41+
*rest of slides*
42+
43+
See example data for some of these.
44+
45+
*traitvis webapp*
46+
47+
Collecting data since 2015, and are up to 8 seasons worth in that time. Originally for sorghum, but now open to other crop species and organizations that want to use system.
48+
49+
Field is split up into plots. Referenced using range by column system, can see by hovering over map.
50+
51+
Can choose data within season with slider bar. Set to July 25, or 2018-07-25. Takes a moment to pull data for that date.
52+
53+
Currently shows canopy cover value for each plot. See for single plot "Range 20 Column 1". Zoom in on lower left hand part. Hover over that and see a canopy cover value of ~18%.
54+
55+
These data are summarized from camera data, can see that by unselecting "Heat Map" button on left. These are downscaled versions of infrared data. Main image data are infrared and RGB.
56+
57+
## Video 3: Downloading RGB Files from Globus
58+
59+
We can download these individual files. TERRA REF data are on two platforms. We pulled data from the first platform Clowder last week. Other platform is called Globus, let's work with that first.
60+
61+
All data are on Globus. Go to [website](https://www.globus.org/) and log in. Need to set up account and get access to Terraref collection. Instructions are [here](https://docs.terraref.org/user-manual/how-to-access-data/using-globus-sensor-and-genomics-data).
62+
63+
In File Manager section. Click "Start here", "Shared With You" tab, and then "Terraref" option. This gives access to a bunch of files for the project.
64+
65+
Go to "ua-mac" folder, which contains data from Maricopa site. Can look at plot-level data like in map in web app. Select "Level_1_Plots" and "ir_geotiff" and "2018-07-15" and "MAC Field Scanner Season 6 Range 20 Column 1" (fourth of way down). Returns all of the infrared files for that date and plot.
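The folder clicks above amount to one path on the Terraref collection. A minimal sketch of how that path could be assembled, assuming the folder names shown in the File Manager join directly into a path; `plot_folder` is a hypothetical helper, not part of any TERRA REF tool:

```python
import posixpath


def plot_folder(sensor, date, season, rng, column,
                site="ua-mac", level="Level_1_Plots"):
    """Build the folder path for one sensor, date, and plot (hypothetical helper)."""
    # Plot folder naming copied from the example in these notes
    plot_name = f"MAC Field Scanner Season {season} Range {rng} Column {column}"
    return posixpath.join(site, level, sensor, date, plot_name)


ir_path = plot_folder("ir_geotiff", "2018-07-15", 6, 20, 1)
print(ir_path)
```

Swapping the first argument to `"rgb_geotiff"` and the date to `"2018-07-25"` would give the matching RGB folder under the same assumptions.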
66+
67+
Let's do same but focus instead on RGB images. Back up three levels, then "rgb_geotiff", "2018-07-25", "Range 20 Column 1". Click on single RGB image of interest within that plot on that date. "rgb_geotiff_L1_ua-mac_2018-07-25__13-30-49-010_left.tif" is near bottom. They're labeled by exact time image was taken.
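Because the file names carry the capture time, it can be pulled out in code. A rough sketch, assuming the pattern is date, double underscore, then hour-minute-second-millisecond as in the example name; `capture_time` is a hypothetical helper:

```python
import re
from datetime import datetime

# Assumed pattern: YYYY-MM-DD__HH-MM-SS-mmm, based on the example file name
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2})__(\d{2}-\d{2}-\d{2}-\d{3})")


def capture_time(file_name):
    """Return the capture time embedded in a TERRA REF image file name."""
    match = STAMP.search(file_name)
    if match is None:
        raise ValueError(f"No timestamp found in {file_name!r}")
    date_part, time_part = match.groups()
    # %f reads the last field as fractional seconds (010 -> 10 milliseconds)
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S-%f")


taken = capture_time("rgb_geotiff_L1_ua-mac_2018-07-25__13-30-49-010_left.tif")
print(taken)
```

This could be used to sort or filter a folder of images by time of day before choosing which ones to transfer.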
68+
69+
Can get file locally. First have to create an endpoint on local computer. In Globus, right click on "Endpoints" on right hand side, then "Create new endpoint" in upper right hand corner and select "Globus Connect Personal". This walks through how to name endpoint, get key, and download Globus Connect Personal program on computer. I already have one.
70+
71+
Go back to file in File Manager. Click "Transfer or Sync to" button on right hand side. Click "Select a collection" and double click on my endpoint "My University of Arizona MacBook". Select "Desktop" on right hand side to specify where file should go. Still have file of interest selected, can hit Start button on lower left to transfer. Look at Finder, can see it's downloading. Can open it up and look at it, nice image of plant.
72+
73+
Globus is good for downloading a bunch of images, from a particular date and/or plot. These can take a long time, especially with lots of files. But can't preview images ahead of time.
74+
75+
## Video 4: Downloading RGB Files from Clowder
76+
77+
Second platform is Clowder, which is better for browsing through files. Website is [here](https://terraref.ncsa.illinois.edu/clowder/). You can follow along, these are publicly accessible files. Clowder is an interface on top of Globus.
78+
79+
All data are organized in several ways. In spaces, collections, or datasets. We can find same RGB tif from before. Under "Explore" tab, select "Collections". Look at "Season 6 (2018)", which takes a minute to load. Then "RGB Camera Data (Season6 Samples)". Scroll down to third file, can see it's the one from the same date and time as before.
80+
81+
Unlike Globus, can see thumbnails and previews of images to better browse. Can click on this and see preview of image.
82+
83+
Can download like before by clicking Download button. This will take a minute to download. Move this into Desktop and Clowder_RGB folder. Can look at file like before.
84+
85+
Clowder is easier in some ways for browsing, but can be slower because of the interface and files have to be downloaded one at a time.
86+
87+
Open new tab with Clowder home page. These are all the publicly accessible files, that are sample data drawn from a couple of seasons. Can also get an account on Clowder and access more of the data. Click on "Sign Up" in top right hand and enter email.
88+
89+
Log in to my account. If I navigate to Datasets with Explore tab, can see at the top some RGB tifs that were collected this year.
90+
91+
## Video 5: Download with Python
92+
93+
In addition to manually downloading data through Clowder or Globus, can get it programmatically using Python. Will walk through this now, can follow along if you have the VICE app like I used last week.
94+
95+
Go to CyVerse Discovery Environment and open up TERRA REF RStudio app. I will be running Python in the command line. The benefit of using this app is that the Python modules we'll use are already installed. Go to Terminal tab.
96+
97+
First thing we'll do is start running Python. Specifically the newest version of Python.
98+
99+
```{bash, eval=FALSE}
python3
```
102+
103+
Read in requests Python module, which is used to connect to the URL where we'll pull data from and makes sure there is data there.
104+
105+
This function needs a url and an additional parameter input. Create object with url, like we did on Friday. Combine Clowder base URL with the string of letters and numbers that identify file. Navigate back to Clowder and copy that from the URL.
106+
107+
Can see if it works by running object, should return a 200 message.
108+
109+
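```{python, eval=FALSE}
import requests

# URL of the file on Clowder: base URL plus the file's ID
file_url = 'https://terraref.ncsa.illinois.edu/clowder/files/5c5488fa4f0c4b0cbe7af98a'
# Additional parameter input; the key is left blank here
api_key = {'key': ''}
# stream=True defers the download so the file can be read in chunks later
file_request = requests.get(file_url, params=api_key, stream=True)
file_request
```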
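The URL step described above (base URL plus the file's ID) and the 200 check can be wrapped in small helpers. This is only a sketch: `file_page_url` and `looks_ok` are hypothetical names, and the base URL is taken from the example in these notes:

```python
# Base URL copied from the example file URL in these notes
CLOWDER_FILES = "https://terraref.ncsa.illinois.edu/clowder/files"


def file_page_url(file_id, base=CLOWDER_FILES):
    """Join the Clowder base URL and a file ID copied from the browser."""
    return base.rstrip("/") + "/" + file_id


def looks_ok(response):
    """True when a response object reports HTTP status 200."""
    return response.status_code == 200


url = file_page_url("5c5488fa4f0c4b0cbe7af98a")
print(url)
```

With the requests module, `looks_ok(requests.get(url, params={'key': ''}))` would give a True/False answer instead of having to read the printed response.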
117+
118+
We then want to open up the file and save it locally. Use another Python module called io. Because the files can be big and the URL request could time out while waiting, want to pull down file in chunks using a for loop.
119+
120+
Create object for name of file we want. Copy and paste from Clowder page.
121+
122+
Use io function `open`, specifying file name and that we want to write file (`w`) as a binary file (`b`), and we'll call it object.
123+
124+
The for loop does this in chunks of 1024 bytes. For every chunk that exists, it writes it to the object.
125+
126+
Should then see file in file system. Can use it within this app or export to local machine.
127+
128+
```{python, eval=FALSE}
from io import open

file_name = 'rgb_geotiff_L1_ua-mac_2018-07-25__13-30-49-010_left.tif'
# 'wb' opens the file for writing in binary mode
with open(file_name, 'wb') as object:
    # Write the response to disk 1024 bytes at a time
    for chunk in file_request.iter_content(chunk_size=1024):
        if chunk:
            object.write(chunk)
```
138+
139+
This can be scaled up by using a list of file urls and names to download a bunch of files.
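One way the scaling up might look, as a sketch rather than a tested pipeline. `save_in_chunks` and `download_all` are hypothetical helpers, and `fetch` stands in for whatever makes the request, so the loop itself does not depend on a particular library:

```python
def save_in_chunks(response, file_name, chunk_size=1024):
    """Write a response to disk one chunk at a time, as in the single-file example."""
    with open(file_name, "wb") as out_file:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:
                out_file.write(chunk)
    return file_name


def download_all(files, fetch):
    """Download every (url, file_name) pair; skip any that do not return 200."""
    saved = []
    for url, name in files:
        response = fetch(url)
        if response.status_code != 200:
            print(f"Skipping {url}: status {response.status_code}")
            continue
        saved.append(save_in_chunks(response, name))
    return saved
```

With the requests module, `fetch` could be `lambda url: requests.get(url, params={'key': ''}, stream=True)`, and `files` a list of URL and file name pairs copied from Clowder.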
140+
141+
There is also a Python module called terrautils that is specifically designed for interacting with TERRA REF data. Documentation is [here](https://pypi.org/project/terrautils/). Don't have time to get into that today.
142+
143+
Working with data across plots and/or across time can be difficult because these files are large. They take a long time to download and process.
144+
145+
Most researchers have this workflow. They download a few files, like RGB images, and develop an algorithm or extraction method. They then work with TERRA REF team to implement their method in a processing pipeline for larger amounts of data.
146+
147+
We're trying to make these data more usable to anyone who wants to do that, so feedback on either of these interfaces or any of the documentation is very welcome. These data are on a large enough scale that there are storage and access challenges.
148+
149+
In next week's webinar, we will follow up on this work by getting some RGB images, calculating a greenness index, and combining with trait data like we worked with last week.
150+
151+
I will be sending out an email with the followup survey, if everyone could take that, and notes from this session. This session was recorded and I will be posting it as YouTube videos soon.

videos/second_walkthrough.md

Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
1+
Second Walkthrough Notes
2+
================
3+
Kristina Riemer
4+
5+
## Introduction
6+
7+
The purpose of this walkthrough is to review the experimental design of
8+
TERRA REF, and show how to find and access data on the two platforms.
9+
The focus will be on image data, using RGB data from the project as an
10+
example.
11+
12+
## Video 1: Explore Data with Traitvis Web App
13+
14+
Go to [website](https://traitvis.workbench.terraref.org/), which takes a
15+
minute to load. Displays plots of various data across collection time.
16+
17+
As example, we’ll look at data from Season 6, which we looked at last
18+
week, by going to “MAC Season 6” tab.
19+
20+
Can choose different variables and cultivars. Select “Canopy Cover” from
21+
first dropdown and set second dropdown to “PI656026” to look at cultivar
22+
from last time. How much ground is covered increases across year with
23+
max at 100. Hover over parts of graph to get specific values.
24+
25+
Change second dropdown to “None” to get all cultivars. Select “Map” tab
26+
to look at data spatially. Shows long field in Maricopa, Arizona.
27+
28+
## Video 2: TERRA REF Experimental Design
29+
30+
*first slide*
31+
32+
Field is passed over by large robot called a gantry. Lot of equipment in
33+
hanging box to collect many types of data.
34+
35+
Gantry goes systematically over entire field once a day. Some sensors
36+
take data every day, like 9,000 images from RGB camera. Others more
37+
intermittently, like hyperspectral images, because of data space limits.
38+
39+
*second slide* + *table*
40+
41+
Sensors include:
42+
43+
- camera that takes pairs of red-green-blue images (**Stereo RGB**)
44+
- thermal infrared images (**FLIR**)
45+
- images at a bunch of wavelengths to get hyperspectral data
46+
(**VNIR/SWIR**)
47+
- laser that collects points on plant surfaces to create 3D image
48+
(**3D Laser**)
49+
- measures plant fluorescence (**PS II Fluor**)
50+
- handful of others, including environmental data such as temperature
51+
and light
52+
53+
*rest of slides*
54+
55+
See example data for some of these.
56+
57+
*traitvis webapp*
58+
59+
Collecting data since 2015, and are up to 8 seasons worth in that time.
60+
Originally for sorghum, but now open to other crop species and
61+
organizations that want to use system.
62+
63+
Field is split up into plots. Referenced using range by column system,
64+
can see by hovering over map.
65+
66+
Can choose data within season with slider bar. Set to July 25, or
67+
2018-07-25. Takes a moment to pull data for that date.
68+
69+
Currently shows canopy cover value for each plot. See for single plot
70+
“Range 20 Column 1”. Zoom in on lower left hand part. Hover over that
71+
and see a canopy cover value of ~18%.
72+
73+
These data are summarized from camera data, can see that by unselecting
74+
“Heat Map” button on left. These are downscaled versions of infrared
75+
data. Main image data are infrared and RGB.
76+
77+
## Video 3: Downloading RGB Files from Globus
78+
79+
We can download these individual files. TERRA REF data are on two
80+
platforms. We pulled data from the first platform Clowder last week.
81+
Other platform is called Globus, let’s work with that first.
82+
83+
All data are on Globus. Go to [website](https://www.globus.org/) and log in. Need to
84+
set up account and get access to Terraref collection. Instructions are
85+
[here](https://docs.terraref.org/user-manual/how-to-access-data/using-globus-sensor-and-genomics-data).
86+
87+
In File Manager section. Click “Start here”, “Shared With You” tab, and
88+
then “Terraref” option. This gives access to a bunch of files for the
89+
project.
90+
91+
Go to “ua-mac” folder, which contains data from Maricopa site. Can look
92+
at plot-level data like in map in web app. Select “Level\_1\_Plots” and
93+
“ir\_geotiff” and “2018-07-15” and “MAC Field Scanner Season 6 Range
94+
20 Column 1” (fourth of way down). Returns all of the infrared files for
95+
that date and plot.
96+
97+
Let’s do same but focus instead on RGB images. Back up three levels,
98+
then “rgb\_geotiff”, “2018-07-25”, “Range 20 Column 1”. Click on single
99+
RGB image of interest within that plot on that date.
100+
"rgb\_geotiff\_L1\_ua-mac\_2018-07-25\_\_13-30-49-010\_left.tif" is near
101+
bottom. They’re labeled by exact time image was taken.
102+
103+
Can get file locally. First have to create an endpoint on local
104+
computer. In Globus, right click on “Endpoints” on right hand side, then
105+
“Create new endpoint” in upper right hand corner and select “Globus
106+
Connect Personal”. This walks through how to name endpoint, get key, and
107+
download Globus Connect Personal program on computer. I already have
108+
one.
109+
110+
Go back to file in File Manager. Click “Transfer or Sync to” button on
111+
right hand side. Click “Select a collection” and double click on my
112+
endpoint “My University of Arizona MacBook”. Select “Desktop” on right
113+
hand side to specify where file should go. Still have file of interest
114+
selected, can hit Start button on lower left to transfer. Look at
115+
Finder, can see it’s downloading. Can open it up and look at it, nice
116+
image of plant.
117+
118+
Globus is good for downloading a bunch of images, from a particular date
119+
and/or plot. These can take a long time, especially with lots of files.
120+
But can’t preview images ahead of time.
121+
122+
## Video 4: Downloading RGB Files from Clowder
123+
124+
Second platform is Clowder, which is better for browsing through files.
125+
Website is [here](https://terraref.ncsa.illinois.edu/clowder/). You can
126+
follow along, these are publicly accessible files. Clowder is an
127+
interface on top of Globus.
128+
129+
All data are organized in several ways. In spaces, collections, or
130+
datasets. We can find same RGB tif from before. Under “Explore” tab,
131+
select “Collections”. Look at “Season 6 (2018)”, which takes a minute to
132+
load. Then “RGB Camera Data (Season6 Samples)”. Scroll down to third
133+
file, can see it’s the one from the same date and time as before.
134+
135+
Unlike Globus, can see thumbnails and previews of images to better
136+
browse. Can click on this and see preview of image.
137+
138+
Can download like before by clicking Download button. This will take a
139+
minute to download. Move this into Desktop and Clowder\_RGB folder. Can
140+
look at file like before.
141+
142+
Clowder is easier in some ways for browsing, but can be slower because
143+
of the interface and files have to be downloaded one at a time.
144+
145+
Open new tab with Clowder home page. These are all the publicly
146+
accessible files, that are sample data drawn from a couple of seasons.
147+
Can also get an account on Clowder and access more of the data. Click on
148+
“Sign Up” in top right hand and enter email.
149+
150+
Log in to my account. If I navigate to Datasets with Explore tab, can
151+
see at the top some RGB tifs that were collected this year.
152+
153+
## Video 5: Download with Python
154+
155+
In addition to manually downloading data through Clowder or Globus, can
156+
get it programmatically using Python. Will walk through this now, can
follow along if you have the VICE app like I used last week.
158+
159+
Go to CyVerse Discovery Environment and open up TERRA REF RStudio app. I
160+
will be running Python in the command line. The benefit of using this
161+
app is that the Python modules we’ll use are already installed. Go to
162+
Terminal tab.
163+
164+
First thing we’ll do is start running Python. Specifically the newest
165+
version of Python.
166+
167+
``` bash
python3
```
170+
171+
Read in requests Python module, which is used to connect to the URL
172+
where we’ll pull data from and makes sure there is data there.
173+
174+
This function needs a url and an additional parameter input. Create
175+
object with url, like we did on Friday. Combine Clowder base URL with
176+
the string of letters and numbers that identify file. Navigate back to
177+
Clowder and copy that from the URL.
178+
179+
Can see if it works by running object, should return a 200 message.
180+
181+
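``` python
import requests

# URL of the file on Clowder: base URL plus the file's ID
file_url = 'https://terraref.ncsa.illinois.edu/clowder/files/5c5488fa4f0c4b0cbe7af98a'
# Additional parameter input; the key is left blank here
api_key = {'key': ''}
# stream=True defers the download so the file can be read in chunks later
file_request = requests.get(file_url, params=api_key, stream=True)
file_request
```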
189+
190+
We then want to open up the file and save it locally. Use another Python
191+
module called io. Because the files can be big and the URL request could
192+
time out while waiting, want to pull down file in chunks using a for
193+
loop.
194+
195+
Create object for name of file we want. Copy and paste from Clowder
196+
page.
197+
198+
Use io function `open`, specifying file name and that we want to write
199+
file (`w`) as a binary file (`b`), and we’ll call it object.
200+
201+
The for loop does this in chunks of 1024 bytes. For every chunk that
202+
exists, it writes it to the object.
203+
204+
Should then see file in file system. Can use it within this app or
205+
export to local machine.
206+
207+
``` python
from io import open

file_name = 'rgb_geotiff_L1_ua-mac_2018-07-25__13-30-49-010_left.tif'
# 'wb' opens the file for writing in binary mode
with open(file_name, 'wb') as object:
    # Write the response to disk 1024 bytes at a time
    for chunk in file_request.iter_content(chunk_size=1024):
        if chunk:
            object.write(chunk)
```
216+
217+
This can be scaled up by using a list of file urls and names to download
218+
a bunch of files.
219+
220+
There is also a Python module called terrautils that is specifically
221+
designed for interacting with TERRA REF data. Documentation is
222+
[here](https://pypi.org/project/terrautils/). Don’t have time to get
223+
into that today.
224+
225+
Working with data across plots and/or across time can be difficult
226+
because these files are large. They take a long time to download and
227+
process.
228+
229+
Most researchers have this workflow. They download a few files, like RGB
230+
images, and develop an algorithm or extraction method. They then work with
231+
TERRA REF team to implement their method in a processing pipeline for
232+
larger amounts of data.
233+
234+
We’re trying to make these data more usable to anyone who wants to do
235+
that, so feedback on either of these interfaces or any of the
236+
documentation is very welcome. These data are on a large enough scale
237+
that there are storage and access challenges.
238+
239+
In next week’s webinar, we will follow up on this work by getting some
240+
RGB images, calculating a greenness index, and combining with trait data
241+
like we worked with last week.
242+
243+
I will be sending out an email with the followup survey, if everyone
244+
could take that, and notes from this session. This session was recorded
245+
and I will be posting it as YouTube videos soon.
