| 1 | +--- |
| 2 | +title: "Get Weather Data" |
| 3 | +output: html_document |
| 4 | +--- |
| 5 | + |
| 6 | +# Objective: Demonstrate how to get TERRA REF meteorological data
| 7 | + |
| 8 | +This vignette shows how to read weather data for the month of January 2017 from |
| 9 | +the [weather station](https://cals-mac.arizona.edu/weather-station) at the |
| 10 | +University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) |
| 11 | +into R. These data are stored online on the data management system Clowder, |
| 12 | +which is accessed using an API. Data across time for two of the weather |
| 13 | +variables, temperature and precipitation, are plotted in R. Lastly, how to get |
| 14 | +the list of all possible weather variables is demonstrated. |
| 15 | + |
| 16 | +### Using the API to get data |
| 17 | + |
| 18 | +To access the data, we need to construct a URL that points to where the data
| 19 | +is located on [Clowder](https://clowder.ncsa.illinois.edu/). The data is then
| 20 | +pulled down using Clowder's API, which
| 21 | +["receives requests and sends responses"](https://medium.freecodecamp.org/what-is-an-api-in-english-please-b880a3214a82).
| 22 | + |
| 23 | +### The structure of the database |
| 24 | + |
| 25 | +The meteorological data that is collected for the TERRA REF project is contained |
| 26 | +in multiple related tables, also known as a [relational database](https://datacarpentry.org/sql-socialsci/01-relational-database/index.html).
| 27 | +The first table contains data about the sensor that is collecting data. This is |
| 28 | +then linked to a stream table, which contains information about a datastream |
| 29 | +from the sensor. Sensors can have multiple datastreams. The actual weather data |
| 30 | +is in the third table, the datapoint table. A visual representation of this |
| 31 | +structure is shown below. |
| 32 | + |
| 33 | + |
| 34 | + |
| 35 | +In this vignette, we will be using data from a weather station at the Maricopa |
| 36 | +Agricultural Center, with datapoints for the month of January 2017 from a |
| 37 | +certain sensor. These data are five minute summaries aggregated from |
| 38 | +observations taken every second. |
| 39 | + |
| 40 | +### Creating the URLs for all data table types |
| 41 | + |
| 42 | +All URLs have the same beginning
| 43 | +(https://terraref.ncsa.illinois.edu/clowder/api/geostreams).
| 44 | +Additional information is then added for each type of data table, as shown below.
| 45 | + |
| 46 | +* Station: /sensors?sensor_name=[name]
| 47 | +* Sensor: /sensors/[sensor number]/streams |
| 48 | +* Datapoints: /datapoints?stream_id=[datapoints number]&since=[start date]&until=[end date] |
| 49 | + |
| 50 | +A time period can be specified for the datapoints with `since` and `until`.
| 51 | + |
| 52 | +For example, below are the URLs for the particular data being used in this |
| 53 | +vignette. These can be pasted into a browser to see how the data is stored as |
| 54 | +text using JSON. |
| 55 | + |
| 56 | +* Station: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/sensors?sensor_name=UA-MAC+AZMET+Weather+Station |
| 57 | +* Sensor: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/sensors/438/streams |
| 58 | +* Datapoints: https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31 |
| 59 | + |
| 60 | +Possible sensor numbers for a station are found on the page for that station |
| 61 | +under "id:", and then datapoints numbers are found on the sensor page under |
| 62 | +"stream_id:". |
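
The URL patterns above can also be assembled in R rather than by hand. The following is a minimal sketch using only base R. The helper names `station_url`, `streams_url`, and `datapoints_url` are hypothetical and are not part of any TERRA REF or Clowder package. The sketch assumes spaces in a station name are written as `+`, as in the example station URL.

```r
# Base URL shared by all geostreams endpoints (taken from the text above).
base_url <- "https://terraref.ncsa.illinois.edu/clowder/api/geostreams"

# Hypothetical helper: URL for a station, looked up by name.
# Assumption: spaces are written as "+" to match the example station URL.
station_url <- function(sensor_name) {
  paste0(base_url, "/sensors?sensor_name=",
         gsub(" ", "+", sensor_name, fixed = TRUE))
}

# Hypothetical helper: URL listing the streams of one sensor.
streams_url <- function(sensor_id) {
  paste0(base_url, "/sensors/", as.integer(sensor_id), "/streams")
}

# Hypothetical helper: URL for the datapoints of one stream in a date range.
# Dates are passed as text in YYYY-MM-DD form, as in the example URL.
datapoints_url <- function(stream_id, since, until) {
  paste0(base_url, "/datapoints?stream_id=", as.integer(stream_id),
         "&since=", since, "&until=", until)
}

# Rebuild the three URLs used in this vignette.
station_url("UA-MAC AZMET Weather Station")
streams_url(438)
datapoints_url(46431, "2017-01-02", "2017-01-31")
```

Building the URLs from functions makes it easier to switch to another stream or date range without editing a long string.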
| 63 | + |
| 64 | +### Download data using the command line |
| 65 | + |
| 66 | +Data can be downloaded from Clowder using the command line program Curl. If the |
| 67 | +following is typed into the command line, it will download the datapoints data
| 68 | +that we're interested in as a file which we have chosen to call `spectra.json`. |
| 69 | + |
| 70 | +```{sh eval=FALSE} |
| 71 | +curl -o spectra.json -X GET "https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31"
| 72 | +``` |
| 73 | + |
| 74 | +### Read in data using R |
| 75 | + |
| 76 | +```{r met-vignette-setup, include=FALSE} |
| 77 | +knitr::opts_chunk$set(cache = TRUE, message = FALSE) |
| 78 | +``` |
| 79 | + |
| 80 | +The same data can be accessed with the URL using the R package `jsonlite`. We |
| 81 | +are calling that library along with several others that will be used to clean |
| 82 | +and plot the data. The data is read in by the `fromJSON` function as a dataframe |
| 83 | +that also has two nested dataframes, called `properties` and `geometries`. |
| 84 | + |
| 85 | +```{r met-geostream-example} |
| 86 | +library(dplyr) |
| 87 | +library(ggplot2) |
| 88 | +library(jsonlite) |
| 89 | +library(lubridate) |
| 90 | +
|
| 91 | +weather_all <- fromJSON('https://terraref.ncsa.illinois.edu/clowder/api/geostreams/datapoints?stream_id=46431&since=2017-01-02&until=2017-01-31', flatten = FALSE) |
| 92 | +``` |
| 93 | + |
| 94 | +The `properties` dataframe, which contains the datapoints from this stream, is
| 95 | +then pulled out from these data. It is combined with the end time of each
| 96 | +observation period, converted from text to a date-time with `ymd_hms`.
| 97 | + |
| 98 | +```{r met-datapoints} |
| 99 | +weather_data <- weather_all$properties %>% |
| 100 | + mutate(time = ymd_hms(weather_all$end_time)) |
| 101 | +``` |
| 102 | + |
| 103 | +The temperature data, which is five minute averages for the entire month of |
| 104 | +January 2017, is used to calculate the growing degree days for each day. Growing |
| 105 | +degree days is a measurement that is used to predict when certain plant |
| 106 | +developmental phases happen. This new dataframe will be used in the last |
| 107 | +vignette to synthesize the trait, weather, and image data. |
| 108 | + |
| 109 | +```{r met-GDD} |
| 110 | +daily_values = weather_data %>% |
| 111 | + mutate(date = as.Date(time), |
| 112 | + air_temp_converted = air_temperature - 273.15) %>% |
| 113 | + group_by(date) %>% |
| 114 | + summarise(min_temp = min(air_temp_converted), |
| 115 | + max_temp = max(air_temp_converted), |
| 116 | + gdd = ifelse(sum(min_temp, max_temp) / 2 > 10, |
| 117 | + (max_temp + min_temp) / 2 - 10, 0)) |
| 118 | +``` |
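
The growing degree day calculation in the chunk above subtracts a base temperature of 10 degrees Celsius from the mean of the daily minimum and maximum, and returns zero when that mean does not exceed the base. The sketch below shows the same arithmetic on its own in base R. The function name `growing_degree_days` and the temperatures are made up for illustration and do not come from the TERRA REF data.

```r
# Hypothetical helper: growing degree days from daily minimum and maximum
# temperature in degrees Celsius. The base temperature defaults to 10,
# matching the threshold used in the chunk above.
growing_degree_days <- function(min_temp, max_temp, base_temp = 10) {
  # pmax() floors the result at zero, which is equivalent to the
  # ifelse(mean > base, mean - base, 0) form.
  pmax((min_temp + max_temp) / 2 - base_temp, 0)
}

# Made-up temperatures for three days, for illustration only.
min_temp <- c(2, 8, 12)
max_temp <- c(14, 20, 26)

# Daily means are 8, 14, and 19, so the first day falls below the base.
gdd <- growing_degree_days(min_temp, max_temp)
gdd
```

Because `pmax` is vectorized, the helper works on whole columns of daily values without grouping.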
| 119 | + |
| 120 | +### Plot data using R |
| 121 | + |
| 122 | +The five minute summary weather variables in the `weather_data` dataframe can be |
| 123 | +plotted across time, as shown below for temperature and precipitation. |
| 124 | + |
| 125 | +```{r temp-plot} |
| 126 | +theme_set(ggthemes::theme_few()) |
| 127 | +ggplot(data = weather_data) + |
| 128 | + geom_point(aes(x = time, y = air_temperature), size = 0.1) + |
| 129 | + labs(x = "Date", y = "Temperature (K)") |
| 130 | +``` |
| 131 | + |
| 132 | +```{r precip-plot} |
| 133 | +ggplot(data = weather_data) + |
| 134 | + geom_point(aes(x = time, y = precipitation_rate), size = 0.1) + |
| 135 | + labs(x = "Date", y = "Precipitation (kg/m2s)") |
| 136 | +``` |
| 137 | + |
| 138 | +### Get all available weather variables |
| 139 | + |
| 140 | +The weather variables that are available from these datapoints data are |
| 141 | +extracted below from the column names of the dataframe that we read in earlier. |
| 142 | +Any of these variables that are of interest can be analyzed and plotted. |
| 143 | + |
| 144 | +```{r variables-list} |
| 145 | +cols = colnames(weather_data) |
| 146 | +cols[!cols %in% c("source", "source_file", "time")] |
| 147 | +``` |
| 148 | + |
| 149 | +You should now be able to find, get, and use weather data from the TERRA REF |
| 150 | +project via Clowder. |