|
1 | | -# Traits Vignette |
| 1 | +# Accessing trait data in R |
| 2 | + |
| 3 | +```{r chunk-options-setup, echo = FALSE} |
| 4 | +
|
| 5 | +options(width = 100) |
| 6 | +
|
| 7 | +``` |
| 8 | + |
| 9 | +# Introduction |
| 10 | + |
| 11 | +This vignette demonstrates how to query TERRA REF trait data using the traits package. The traits package lets users pass query parameters to an R function and returns the data in a tabular format that can be analyzed. |
| 12 | + |
| 13 | +Through this vignette, users will learn how to query and visualize season 6 canopy height data for May 2018. It also shows how to find more information on a season, such as the available traits and dates, when performing your own queries. |
| 14 | + |
| 15 | +\newline |
| 16 | +\newline |
| 17 | + |
| 18 | +# Getting Started |
| 19 | + |
| 20 | +First, you will need to install and load the traits package from GitHub. |
| 21 | + |
| 22 | +```{r traits-setup, message = FALSE, results = FALSE} |
| 23 | +
|
| 24 | +devtools::install_github('terraref/traits', force = TRUE) |
| 25 | +library(traits) |
| 26 | +
|
| 27 | +
|
| 28 | +``` |
| 29 | + |
| 30 | +\newline |
| 31 | +\newline |
| 32 | + |
| 33 | +# How to query trait data |
| 34 | + |
| 35 | +## Setting options |
| 36 | + |
| 37 | +The function that you will be using to perform your queries is `betydb_query`. Options can be set to reduce the number of arguments that need to be passed into the function. |
| 38 | + |
| 39 | +Note: the `betydb_key` option only needs to be set when accessing non-public data. We will be using public data, so this option does not need to be set. However, when needed, pass in the API key that you were assigned when you first registered for access to the TERRA REF database. The key should be kept private and saved to a file named `.betykey` in your current directory. If you are having trouble locating your API key, you can go to [https://terraref.ncsa.illinois.edu/bety/users](https://terraref.ncsa.illinois.edu/bety/users). |
| 40 | + |
| 41 | + |
| 42 | +```{r options-setup} |
| 43 | +
|
| 44 | +options(betydb_key = readLines('.betykey', warn = FALSE), # only needed for non-public data; omit for public queries |
| 45 | + betydb_url = "https://terraref.ncsa.illinois.edu/bety/", |
| 46 | + betydb_api_version = 'v1') |
| 47 | +
|
| 48 | +``` |
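Because the key is only required for non-public data, one option is to read it only when the key file is actually present. The following is a minimal sketch of that idea, not part of the traits package itself; the variable `key_file` is a name introduced here for illustration, and the URL and API version are copied from the chunk above.

```r
# Sketch: only read the API key when a .betykey file is present.
key_file <- ".betykey"  # file name taken from the text above

if (file.exists(key_file)) {
  # Non-public data: read the private key from disk
  options(betydb_key = readLines(key_file, warn = FALSE))
}

# The URL and API version are set either way, as in the chunk above
options(betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
        betydb_api_version = "v1")

# Confirm what is currently set
getOption("betydb_url")
getOption("betydb_api_version")
```

With this arrangement the chunk runs without error on a machine that has no `.betykey` file, which is the expected situation when only public data are queried.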
| 49 | + |
| 50 | +## An example: Season 6 canopy height data |
| 51 | + |
| 52 | +The following is an example of how to query season 6 canopy height data for May 2018. |
| 53 | + |
| 54 | +```{r canopy_height_query, message = FALSE} |
| 55 | +
|
| 56 | +canopy_height <- betydb_query(table = "search", |
| 57 | + trait = "canopy_height", |
| 58 | + sitename = "~Season 6", |
| 59 | + date = "~2018 May", |
| 60 | + limit = "none") |
| 61 | +
|
| 62 | +
|
| 63 | +``` |
| 64 | + |
| 65 | +A breakdown of the above query: |
| 66 | + |
| 67 | +* `table = "search"` |
| 68 | + + Specify a table to query with the `table` parameter. Trait data may be queried using the `search` table. |
| 69 | + |
| 70 | +* `trait = "canopy_height"` |
| 71 | + + Specify the trait of interest with the `trait` parameter. |
| 72 | + + Trait names must be written exactly as they appear in the TERRA REF database, so passing in `Canopy height` instead of `canopy_height` would give NULL results. |
| 73 | + + More information on how to determine available traits for a season can be found below under `How to query other seasons, traits, and dates`. |
| 74 | + |
| 75 | +* `sitename = "~Season 6"` |
| 76 | + + Indicate the sites that you would like to query using the `sitename` parameter. |
| 77 | + + A tilde `~` is used in this query to get all site names that contain `Season 6`. |
| 78 | + |
| 79 | +* `date = "~2018 May"` |
| 80 | + + Indicate the date of data collection using the `date` parameter. |
| 81 | + + A tilde `~` is used in this query to get all records whose collection date contains `2018 May`. |
| 82 | + |
| 83 | +* `limit = "none"` |
| 84 | + + Indicate the maximum number of records you would like returned with the `limit` parameter. We want all records for this query, so we set `limit` to `none`. |
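
The breakdown above describes the tilde `~` as selecting values that contain the given text. As a purely local analogy (it does not show how the database performs the match), the sketch below applies a "contains" test to a few site names invented for illustration and compares it with an exact match.

```r
# Hypothetical site names, invented for illustration only
sitenames <- c("MAC Field Scanner Season 6 Range 10 Column 3",
               "MAC Field Scanner Season 4 Range 2 Column 7",
               "MAC Field Scanner Season 6 Range 11 Column 1")

# "Contains" match, mimicking what the text says sitename = "~Season 6" selects
contains_season_6 <- grepl("Season 6", sitenames, fixed = TRUE)
sitenames[contains_season_6]

# An exact match on the same string selects nothing
exact_season_6 <- sitenames[sitenames == "Season 6"]
exact_season_6
```

The comparison suggests why the tilde is used in the example query: a full site name carries more than the season label, so an exact match on `Season 6` alone would not select it.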
| 85 | + |
| 86 | +## Time series of canopy height |
| 87 | + |
| 88 | +Here is an example of how to visualize the data that we just queried. |
| 89 | + |
| 90 | +```{r canopy_height_plot, warning = FALSE, message = FALSE, results = FALSE} |
| 91 | +
|
| 92 | +#load in necessary packages |
| 93 | +library(ggplot2) |
| 94 | +library(lubridate) |
| 95 | +
|
| 96 | +#plot a time series of canopy height |
| 97 | +ggplot(data = canopy_height, |
| 98 | + aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) + |
| 99 | + geom_point(size = 0.5, position = position_jitter(width = 0.1)) + |
| 100 | + xlab("Day of Year") + ylab("Plant Height") + |
| 101 | + guides(color = guide_legend(title = 'Genotype')) + |
| 102 | + theme_bw() |
| 103 | +
|
| 104 | +``` |
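
The plot above shows one point per record. If a single value per day is easier to read, the records can be averaged by day of year before plotting. The sketch below does this in base R on a small made-up data frame; only the column names `raw_date` and `mean` come from the plotting code above, while `toy_height` and its values are invented for illustration.

```r
# Made-up rows standing in for a query result; only the column names
# raw_date and mean are taken from the plotting code above
toy_height <- data.frame(
  raw_date = c("2018-05-01 12:00:00", "2018-05-01 12:00:00",
               "2018-05-02 12:00:00", "2018-05-02 12:00:00"),
  mean = c(10, 14, 12, 16),
  stringsAsFactors = FALSE
)

# Day of year from the timestamp (a base R alternative to lubridate::yday)
timestamps <- as.POSIXct(toy_height$raw_date, tz = "UTC",
                         format = "%Y-%m-%d %H:%M:%S")
toy_height$day_of_year <- as.integer(format(timestamps, "%j"))

# Average height per day of year
daily_mean <- aggregate(mean ~ day_of_year, data = toy_height, FUN = mean)
daily_mean
```

The resulting `daily_mean` data frame has one row per day and could be passed to `ggplot()` in place of the raw records.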
| 105 | + |
| 106 | +\newline |
| 107 | +\newline |
| 108 | + |
| 109 | +# May 2018 Season 6 Summary |
| 110 | + |
| 111 | +The TERRA REF database contains other trait data for season 6 in May 2018. Each trait was measured using a specific method. Here is a summary of the available traits and their corresponding methods of measurement. |
| 112 | + |
| 113 | +```{r season_6_query, message = FALSE, results = FALSE, echo = FALSE} |
| 114 | +
|
| 115 | +#load in dplyr package |
| 116 | +library(dplyr) |
| 117 | +
|
| 118 | +#get all season 6 data for May 2018 |
| 119 | +season_6 <- betydb_query(table = "search", |
| 120 | + sitename = "~Season 6", |
| 121 | + date = "~2018 May", |
| 122 | + limit = "none") |
| 123 | +#get summary |
| 124 | +season_6_summary <- season_6 %>% group_by(trait, method_name) %>% summarise(number_of_observations = n()) |
| 125 | +
|
| 126 | +``` |
| 127 | + |
| 128 | +```{r season_6_summary, echo = FALSE, comment = ""} |
| 129 | +
|
| 130 | +print.data.frame(season_6_summary) |
| 131 | +
|
| 132 | +``` |
| 133 | + |
| 134 | +\newline |
| 135 | +\newline |
| 136 | + |
| 137 | +# How to query other seasons, traits, and dates |
| 138 | + |
| 139 | +You can query other seasons, traits, and dates by changing the season number, trait name, and date in the example query. If you are unsure of what traits or dates are available for a season, you can use the following R code to get a subset of a season and figure out what specific dates and traits are available. |
| 140 | + |
| 141 | +To broaden your queries, remove specific parameters. For example, in order to get all of season 2's data for October 2016, remove the `trait` parameter. |
| 142 | + |
| 143 | +```{r season_2_query, results = FALSE, message = FALSE} |
| 144 | +
|
| 145 | +#get all of season 2 data for October 2016 |
| 146 | +season_2_sub <- betydb_query(table = "search", |
| 147 | + sitename = "~Season 2", |
| 148 | + date = "~2016 Oct", |
| 149 | + limit = "none") |
| 150 | +
|
| 151 | +``` |
| 152 | + |
| 153 | +```{r season_2_traits, comment = ""} |
| 154 | +
|
| 155 | +#get traits available for the subset of season 2 data |
| 156 | +traits <- unique(season_2_sub$trait) |
| 157 | +
|
| 158 | +print(traits) |
| 159 | +
|
| 160 | +``` |
| 161 | + |
| 162 | +```{r season_2_dates, comment = ""} |
| 163 | +
|
| 164 | +#filter for NDVI trait records |
| 165 | +ndvi <- dplyr::filter(season_2_sub, trait == 'NDVI') |
| 166 | +
|
| 167 | +#get unique dates for NDVI records |
| 168 | +ndvi_dates <- unique(ndvi$date) |
| 169 | +
|
| 170 | +print(ndvi_dates) |
| 171 | +``` |
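
The NDVI lookup above can be generalized to list the available dates for every trait at once. The sketch below uses a made-up data frame in place of a real query result; the column names `trait` and `date` come from the code above, while `toy_result` and its values are invented for illustration.

```r
# Made-up rows standing in for a query result; only the column names
# trait and date are taken from the code above
toy_result <- data.frame(
  trait = c("NDVI", "NDVI", "canopy_height", "NDVI"),
  date  = c("2016 Oct 3", "2016 Oct 10", "2016 Oct 3", "2016 Oct 3"),
  stringsAsFactors = FALSE
)

# Number of records per trait
table(toy_result$trait)

# Unique collection dates for each trait, as a named list
dates_by_trait <- lapply(split(toy_result$date, toy_result$trait), unique)
dates_by_trait
```

Replacing `toy_result` with a data frame such as `season_2_sub` would give the same kind of overview for real data, assuming it contains the `trait` and `date` columns used above.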