Commit 7dab0d8

KristinaRiemer authored and dlebauer committed
Add first draft of weather vignette (#85)
* Create new weather vignette with relevant parts from weather tutorial
* Add objective and explanatory paragraph to weather vignette, change title
* Restructure and rewrite db structure and API sections of weather vignette
* Improve R section, add GDD calculations
* Add section for getting all possible variables to weather vignette
* Add final sentence and fit text into readable margins
* Move API intro to beginning of weather vignette
* Remove extra backslash from URL creation section
* Change two R chunk names to be unique
* Fix name of folder for vignettes
1 parent 3f15190 commit 7dab0d8

1 file changed

Lines changed: 150 additions & 0 deletions

@@ -0,0 +1,150 @@
---
title: "Get Weather Data"
output: html_document
---

# Objective: To demonstrate how to get TERRA REF meteorological data

This vignette shows how to read weather data for the month of January 2017 from
the [weather station](https://cals-mac.arizona.edu/weather-station) at the
University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/)
into R. These data are stored online in the data management system Clowder,
which is accessed using an API. Two of the weather variables, temperature and
precipitation, are then plotted across time in R. Lastly, the vignette shows how
to get the list of all available weather variables.

### Using the API to get data

In order to access the data, we need to construct a URL that links to where the
data is located on [Clowder](https://clowder.ncsa.illinois.edu/). The data is
then pulled down using the Clowder API, which
["receives requests and sends responses"](https://medium.freecodecamp.org/what-is-an-api-in-english-please-b880a3214a82).

### The structure of the database

The meteorological data that is collected for the TERRA REF project is contained
in multiple related tables, also known as a [relational database](https://datacarpentry.org/sql-socialsci/01-relational-database/index.html).
The first table contains data about the sensor that is collecting data. This is
then linked to a stream table, which contains information about a datastream
from the sensor. Sensors can have multiple datastreams. The actual weather data
is in the third table, the datapoint table. A visual representation of this
structure is shown below.

![](https://cloud.githubusercontent.com/assets/9286213/16991300/b2f2b09a-4e60-11e6-96b7-8b63c3d1f995.jpg)

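The links between the three tables can be sketched with small data frames in
base R. This is only an illustration: the column names and values are
hypothetical, apart from the sensor number 438 and the stream number 46431,
which are the ones used in the URLs in this vignette. The real Clowder tables
hold more fields.

```r
# Hypothetical miniature versions of the three related tables.
sensors <- data.frame(sensor_id = 438,
                      sensor_name = "UA-MAC AZMET Weather Station")
streams <- data.frame(stream_id = c(46431, 99999),  # 99999 is made up
                      sensor_id = c(438, 438))
datapoints <- data.frame(stream_id = c(46431, 46431, 99999),
                         air_temperature = c(285.2, 284.9, NA))  # made-up values

# Follow the links: sensor -> its streams -> their datapoints.
sensor_streams <- merge(sensors, streams, by = "sensor_id")
weather <- merge(sensor_streams, datapoints, by = "stream_id")

# Keep only the datapoints from the one stream of interest.
one_stream <- weather[weather$stream_id == 46431, ]
nrow(one_stream)
```

Each datapoint row carries the number of its stream, and each stream row carries
the number of its sensor, which is why one sensor can have several datastreams.
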
In this vignette, we will be using data from a weather station at the Maricopa
Agricultural Center, with datapoints for the month of January 2017 from a
certain sensor. These data are five minute summaries aggregated from
observations taken every second.

### Creating the URLs for all data table types

All URLs have the same beginning
(https://terraref.ncsa.illinois.edu/clowder/api/geostreams),
then additional information is added for each type of data table as shown below.

* Station: /sensors?sensor_name=[name]
* Sensor: /sensors/[sensor number]/streams
* Datapoints: /datapoints?stream_id=[datapoints number]&since=[start date]&until=[end date]

The `since` and `until` parameters restrict the datapoints to a certain time
period.

For example, below are the URLs for the particular data being used in this
vignette. These can be pasted into a browser to see how the data is stored as
text using JSON.

* Station: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/sensors?sensor_name=UA-MAC+AZMET+Weather+Station
* Sensor: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/sensors/438/streams
* Datapoints: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31

Possible sensor numbers for a station are found on the page for that station
under "id:", and the datapoints numbers are then found on the sensor page under
"stream_id:".

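The three URLs listed above can also be assembled from their parts in R rather
than typed by hand. The sketch below uses only base R. The variable names
(`base_url`, `station_name`, `sensor_id`, `stream_id`) are chosen here for
illustration and are not part of the API.

```r
base_url <- "https://terraref.ncsa.illinois.edu/clowder/api/geostreams"

# Station: spaces in the name are written as "+" in the query string.
station_name <- "UA-MAC AZMET Weather Station"
station_url <- paste0(base_url, "/sensors?sensor_name=",
                      gsub(" ", "+", station_name, fixed = TRUE))

# Sensor: list the streams that belong to one sensor number.
sensor_id <- 438
sensor_url <- paste0(base_url, "/sensors/", sensor_id, "/streams")

# Datapoints: one stream, restricted to a time period.
stream_id <- 46431
datapoints_url <- paste0(base_url, "/datapoints?stream_id=", stream_id,
                         "&since=", "2017-01-02", "&until=", "2017-01-31")
```

Changing `stream_id` or the two dates is then enough to request a different
stream or time period.
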
### Download data using the command line

Data can be downloaded from Clowder using the command line program curl. If the
following is typed into the command line, it will download the datapoints data
that we're interested in as a file which we have chosen to call `spectra.json`.
The URL is quoted so that the shell does not interpret the `&` characters.

```{sh eval=FALSE}
curl -o spectra.json -X GET "https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31"
```

### Read in data using R

```{r met-vignette-setup, include=FALSE}
knitr::opts_chunk$set(cache = TRUE, message = FALSE)
```

The same data can be accessed with the URL using the R package `jsonlite`. We
load that library along with several others that will be used to clean
and plot the data. The data is read in by the `fromJSON` function as a dataframe
that also has two nested dataframes, called `properties` and `geometries`.

```{r met-geostream-example}
library(dplyr)
library(ggplot2)
library(jsonlite)
library(lubridate)

weather_all <- fromJSON('https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31', flatten = FALSE)
```

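To see what a nested dataframe looks like without calling the API, the sketch
below parses a small hand-written JSON string. The two records, their values,
and the time format are made up for illustration. Only the names `end_time`,
`properties`, `air_temperature`, and `precipitation_rate` are taken from this
vignette, and real responses contain more fields.

```r
library(jsonlite)

# Made-up two-record response; real responses contain more fields.
json_text <- '[
  {"end_time": "2017-01-02T00:05:00Z",
   "properties": {"air_temperature": 285.2, "precipitation_rate": 0}},
  {"end_time": "2017-01-02T00:10:00Z",
   "properties": {"air_temperature": 285.0, "precipitation_rate": 0}}
]'

# flatten = FALSE keeps `properties` as a dataframe nested in one column.
nested <- fromJSON(json_text, flatten = FALSE)
is.data.frame(nested$properties)
names(nested$properties)

# flatten = TRUE instead spreads the nested columns into the top level,
# prefixing them with the name of the nested dataframe.
flat <- fromJSON(json_text, flatten = TRUE)
names(flat)
```

Keeping the data nested makes it easy to pull out all of the weather variables
at once with `$properties`.
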
The `properties` dataframe, which contains the datapoints from this stream, is
then pulled out from these data. It is combined with the end of each datapoint's
time period (`end_time`), converted to a date-time.

```{r met-datapoints}
weather_data <- weather_all$properties %>%
  mutate(time = ymd_hms(weather_all$end_time))
```

The temperature data, which is five minute averages for the entire month of
January 2017, is used to calculate the growing degree days for each day. Growing
degree days is a measurement that is used to predict when certain plant
developmental phases happen. Here it is calculated from the daily minimum and
maximum temperatures, converted from kelvin to degrees Celsius, with a base
temperature of 10 degrees. This new dataframe will be used in the last
vignette to synthesize the trait, weather, and image data.

```{r met-GDD}
# Convert kelvin to degrees Celsius, then summarise by day
daily_values <- weather_data %>%
  mutate(date = as.Date(time),
         air_temp_converted = air_temperature - 273.15) %>%
  group_by(date) %>%
  summarise(min_temp = min(air_temp_converted),
            max_temp = max(air_temp_converted),
            # GDD: mean of daily min and max above a base of 10, otherwise 0
            gdd = ifelse(sum(min_temp, max_temp) / 2 > 10,
                         (max_temp + min_temp) / 2 - 10, 0))
```

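The growing degree day rule used in the chunk above can be checked by hand on a
few values. The sketch below uses made-up daily minimum and maximum temperatures
for three days, and writes the same rule with `pmax()`, which gives the same
result as the `ifelse()` form: the mean of the daily minimum and maximum minus
the base temperature, or zero when that mean is not above the base.

```r
base_temp <- 10  # degrees Celsius, the base used in the chunk above

# Made-up daily minimum and maximum temperatures in kelvin for three days.
min_temp_k <- c(275.15, 281.15, 285.15)
max_temp_k <- c(287.15, 293.15, 299.15)

# Convert to degrees Celsius.
min_temp <- min_temp_k - 273.15
max_temp <- max_temp_k - 273.15

# Mean of daily min and max, minus the base, floored at zero.
gdd <- pmax((min_temp + max_temp) / 2 - base_temp, 0)

# Growing degree days are often accumulated over a period.
cumsum(gdd)
```

The first day has a mean of 8 degrees Celsius, below the base, so it contributes
no growing degree days. The other two days contribute 4 and 9.
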
### Plot data using R

The five minute summary weather variables in the `weather_data` dataframe can be
plotted across time, as shown below for temperature and precipitation.

```{r temp-plot}
theme_set(ggthemes::theme_few())
ggplot(data = weather_data) +
  geom_point(aes(x = time, y = air_temperature), size = 0.1) +
  labs(x = "Date", y = "Temperature (K)")
```

```{r precip-plot}
ggplot(data = weather_data) +
  geom_point(aes(x = time, y = precipitation_rate), size = 0.1) +
  labs(x = "Date", y = "Precipitation (kg/m2s)")
```

### Get all available weather variables

The weather variables that are available from these datapoints data are
extracted below from the column names of the dataframe that we read in earlier.
Any of these variables that are of interest can be analyzed and plotted.

```{r variables-list}
cols <- colnames(weather_data)
cols[!cols %in% c("source", "source_file", "time")]
```

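The same filtering can be tried without the downloaded data. In the sketch
below, the vector `cols` is a hypothetical stand-in for
`colnames(weather_data)`, containing only the column names that appear in this
vignette. The real dataframe has more columns.

```r
# Hypothetical column names standing in for colnames(weather_data).
cols <- c("source", "air_temperature", "precipitation_rate",
          "source_file", "time")

# Drop the bookkeeping columns, keeping the weather variables.
weather_vars <- setdiff(cols, c("source", "source_file", "time"))
weather_vars
```

`setdiff()` is an alternative to the `%in%` indexing used above and keeps the
remaining names in their original order.
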
You should now be able to find, get, and use weather data from the TERRA REF
project via Clowder.
