class: center, middle, inverse, title-slide # Gridded Climate Data in R ## Introduction to AOI & climateR ### Mike Johnson --- # Todays libraries ```r # From CRAN library(sf) # dealing with vector data library(raster) # raster data handling library(RNetCDF) # OPeNDAP and NetCDF access library(exactextractr) # Fast zonal statistics library(rasterVis) # From Github #remotes::install_github("mikejohnson51/AOI") #remotes::install_github("mikejohnson51/climateR") library(climateR) library(AOI) ``` --- Climate Data is a common data source used in applications across human and physical geography. For our purposes, climate data can have up to 5 dimensions - 1D: timeseries - 2D: grid - 3D: XY grid with time - 4D: XYT cube with multiple variables (temperature, rainfall) - 5D: XYT cube with multiple variables and models <div class="figure" style="text-align: center"> <img src="img/diminsions.png" alt="Figure 1: raster data model" width="50%" /> <p class="caption">Figure 1: raster data model</p> </div> --- These notes covers a few topics related to climate data and its used in R including: 1. The raster data model 2. Finding and using climate data (the **_hard_** way) 3. Finding and using climate data (the **_easy_** way) 4. Some simple applications and operations that can be applied to climate data --- # 1. The raster data model - The raster data model is one of principle type of storing geographic data - the other being object or vector. -- - The raster data model defines an coverage of space discritized into a set of (usually) regular grid cells -- - A "raster" can be thought of in the same way you might think of an image you take with your phone, the picture you see on your screen, or the image you print from your computer. -- - Whats unique about spatial rasters, is that the come with a defined coordinate reference system that describe the area of the earth (or elsewhere) it covers. -- - There are plenty of great tutorials about how to _use_ raster data in R. Here, we want to discuss the data model as it is critical to retrieving climate data from remote sources. --- ## The five components: A raster is defined by five key pieces of information: .pull-left[ - Values - An extent - The resolution (Xres, Yres) - The number of cells in the X an Y directions - A spatial reference system ] .pull-right[ <div class="figure" style="text-align: center"> <img src="img/18-raster.png" alt="Figure 2: raster data model. Source: neon" width="75%" /> <p class="caption">Figure 2: raster data model. Source: neon</p> </div> ] --- ## Constructing a basic raster: ### Values Vectors are dimensionless in R, they have length. ```r values = c(1:10000) ``` -- ```r length(values) ``` ``` ## [1] 10000 ``` ```r dim(values) ``` ``` ## NULL ``` ```r head(values, 10) ``` ``` ## [1] 1 2 3 4 5 6 7 8 9 10 ``` --- ### X and Y diminsions - Adding a `dim` attribute to an atomic vector allows it to behave like a multi-dimensional array. - A special case of the array is the matrix, which has 2 dimensions. ```r dim(values) = c(100,100) ``` -- ```r length(values) ``` ``` ## [1] 10000 ``` ```r dim(values) ``` ``` ## [1] 100 100 ``` ```r values[1:5, 1:5] ``` ``` ## [,1] [,2] [,3] [,4] [,5] ## [1,] 1 101 201 301 401 ## [2,] 2 102 202 302 402 ## [3,] 3 103 203 303 403 ## [4,] 4 104 204 304 404 ## [5,] 5 105 205 305 405 ``` --- # Raster as an image .pull-left[ ```r (r = raster::raster(values)) ``` ``` ## class : RasterLayer ## dimensions : 100, 100, 10000 (nrow, ncol, ncell) ## resolution : 0.01, 0.01 (x, y) ## extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax) ## crs : NA ## source : memory ## names : layer ## values : 1, 10000 (min, max) ``` - Origin: 0,0 - Max: 1,1 - Cells distributed across this space where: - `\(xres = \frac{1}{ncol}\)` - `\(yres = \frac{1}{nrow}\)` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-9-1.png" width="75%" style="display: block; margin: auto;" /> ] --- ## Adding an Extent - We want to define the **spatial** extent of the raster we just created ... -- - Lets say the raster covers the **extent** of Colorado defined by the minimum and maximum latitude & longitude of the state... ```r extent(r) = c(-109.06006, -102.04188, 36.99243, 41.00307 ) r ``` ``` ## class : RasterLayer ## dimensions : 100, 100, 10000 (nrow, ncol, ncell) ## resolution : 0.0701818, 0.0401064 (x, y) ## extent : -109.0601, -102.0419, 36.99243, 41.00307 (xmin, xmax, ymin, ymax) ## crs : NA ## source : memory ## names : layer ## values : 1, 10000 (min, max) ``` -- <br> - Note that the assignment of an extent modified the resolution - `\(xres = \frac{xmax-xmin}{ncol}\)` - `\(yres = \frac{ymax - ymin}{nrow}\)` --- ### Spatial Reference System .pull-left[ ```r crs(r) = sf::st_crs(4326)$proj4string r ``` ``` ## class : RasterLayer ## dimensions : 100, 100, 10000 (nrow, ncol, ncell) ## resolution : 0.0701818, 0.0401064 (x, y) ## extent : -109.0601, -102.0419, 36.99243, 41.00307 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs ## source : memory ## names : layer ## values : 1, 10000 (min, max) ``` ] -- .pull-right[ ```r plot(AOI::aoi_get(state = "conus")$geometry) plot(r, add = TRUE, legend = FALSE) ``` <img src="index_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> ] --- ## Time (or any 3D) <div class="figure" style="text-align: center"> <img src="img/cube.png" alt="Figure 2: raster data model with time. [Source](https://edzer.github.io/jena/)" width="50%" /> <p class="caption">Figure 2: raster data model with time. [Source](https://edzer.github.io/jena/)</p> </div> --- ## XYT (3D) raster .pull-left[ ```r values = sample(1:5, 40000, replace = TRUE) dim(values) = c(100,100, 4) length(values) ``` ``` ## [1] 40000 ``` ```r dim(values) ``` ``` ## [1] 100 100 4 ``` ```r b = raster::brick(values) extent(b) = c(-109.06006, -102.04188, 36.99243, 41.00307) crs(b) = sf::st_crs(4326)$proj4string b ``` ``` ## class : RasterBrick ## dimensions : 100, 100, 10000, 4 (nrow, ncol, ncell, nlayers) ## resolution : 0.0701818, 0.0401064 (x, y) ## extent : -109.0601, -102.0419, 36.99243, 41.00307 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs ## source : memory ## names : layer.1, layer.2, layer.3, layer.4 ## min values : 1, 1, 1, 1 ## max values : 5, 5, 5, 5 ``` ] -- .pull-right[ ```r rasterVis::levelplot(b) ``` <img src="index_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" /> ] --- ## So what? ### rasters are matrices / lists of matrices ```r #subsetting by row,col index r[1,3] ``` ``` ## [1] 201 ``` ```r dim(r) ``` ``` ## [1] 100 100 1 ``` ```r length(r) ``` ``` ## [1] 10000 ``` -- ```r b[[1]][1,7] ``` ``` ## [1] 3 ``` ```r dim(b) ``` ``` ## [1] 100 100 4 ``` ```r length(b) ``` ``` ## [1] 40000 ``` --- # Key Concepts - There is a fundamental difference between the value **location**, and the value **index** -- - location is related to the _SRS_ -- - index is based on the _Xdim_ and _Ydim_ -- - They **are** related at the raster level, but **not** at the value/matrix/array level... -- - This will come back shortly 😄 --- # Stack vs Layer: R nomenclature - A raster layer is a single 2D raster object - A raster stack is a collection of 2D raster objects with identical extents and diminisons stored as a single object <img src="img/layer-v-stack.png" width="50%" style="display: block; margin: auto;" /> --- # The landscape of climate data: - Multi-dimensional cubes covering large domains, over long times period, with many variables, and possibly ensembles -- - **HUGE** Data (often terabytes to petabytes) -- - Rarely do we need **all** the data across time -- - To reduce size we can: -- - Use raster utilities like `mask` or `crop` on in memory data (**memory** limited) -- - Call _specific_ pieces of data from local files (e.g. with gdal) (**disk** limited) -- - Call _specific_ pieces of data from remote files (OPeNDAP) (**internet** limited) --- ## Data from remote files: Web based Access and Subsetting The THREDDS Data Server (TDS) is a web server that provides data access for scientific datasets, using OPeNDAP, OGC WMS and WCS, HTTP, and other remote data access protocols. -- - THREDDS **Catalogs** provide virtual directories of available data and their associated metadata. -- - The Netcdf-Java/CDM library reads multidiminsional data (netCDF, HDF5, GRIB, etc.) into a **Common Data Model** (CDM). -- - An integrated server provides OPeNDAP access to any CDM dataset. OPeNDAP is a widely used, subsetting data access method extending the HTTP protocol. --- ## THREDDS Catalogs <img src="img/thredds.png" width="75%" style="display: block; margin: auto;" /> --- ## Common Data Model: Precipitation <img src="img/opendap.png" width="50%" style="display: block; margin: auto;" /> --- ## OPeNDAP <img src="img/opendap-constraint.png" width="75%" style="display: block; margin: auto;" /> --- ## Our job is then five part: - Define the extent we want respecting the spatial resolution, SRS, and time step of the data -- - Convert those values to the data set positions -- - Build an OPeNDAP URL -- - Send it out and retrieve the data (vector or matrix of values) -- - Restructure it into a useable spatial format with SRS and extent --- <div class="figure" style="text-align: center"> <img src="img/cookie-cutter.png" alt="Data Stabs and Cookie Cutters Workflow" width="60%" /> <p class="caption">Data Stabs and Cookie Cutters Workflow</p> </div> --- # 1. Define the extent/point in our SRS (X, Y, time) Defining and querying spatial boundaries is facilitated by the [AOI package ](https://mikejohnson51.github.io/AOI/) which wraps the OSM geolocation API -- ## Geocoding ```r library(AOI) AOI::geocode("UCSB") ``` ``` ## request lat lon ## 1 UCSB 34.4146 -119.8458 ``` --- ## Geocode with Spatial Return ```r AOI::geocode("UCSB", pt = TRUE) %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## Boundary Returns ```r AOI::aoi_get("UCSB") %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## User vs OSM defined Boundaries e.g. 30 mile by 20 mile square regions surrounding "Garden of the Gods" ```r AOI::aoi_get(list("Garden of the Gods", 30, 30)) %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## USA State Boundary Returns ```r AOI::aoi_get(state = "CA") %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## USA State/County Boundary Returns ```r AOI::aoi_get(state = "CA", county = "Santa Barbara") %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## Multi State / All County Boundary Returns ```r AOI::aoi_get(state = c("CA", "OR", "WA"), county = "all") %>% AOI::aoi_map(returnMap = TRUE) ```
--- ## Country Boundary Returns ```r AOI::aoi_get(country = "Netherlands") %>% AOI::aoi_map(returnMap = TRUE) ```
--- # 2. Finding and using climate data (the hard way) --- ## We are going into the weeds: - Necessary for understanding, not for implementation! <div class="figure" style="text-align: center"> <img src="img/weeds.jpg" alt="Data Stabs and Cookie Cutters Workflow" width="60%" /> <p class="caption">Data Stabs and Cookie Cutters Workflow</p> </div> --- class: center, middle, inverse > _"I want daily precipitation data for California on January 19th, 2018...."_ --- # Cookie Cutter ... .pull-left[ ```r AOI = aoi_get(state = "CA") (extent = st_bbox(AOI)) ``` ``` ## xmin ymin xmax ymax ## -124.40959 32.53416 -114.13905 42.00925 ``` ```r (crs = st_crs(AOI)$proj4string) ``` ``` ## [1] "+proj=longlat +datum=NAD83 +no_defs" ``` ] -- .pull-right[ ```r plot(AOI$geometry, border = "darkred",lwd = 2) plot(st_as_sfc(extent), add = TRUE, lwd = 3) ``` <img src="index_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" /> ] --- # Resource Connection ... Note: '#fillmismatch' is appended to the call in order forcibly convert the _FillValue value to the type of the variable on DAP calls ```r catolgue = "http://thredds.northwestknowledge.net:8080/thredds/dodsC/" cdm = "agg_met_pr_1979_CurrentYear_CONUS.nc" url = paste0(catolgue, cdm, "#fillmismatch") ``` -- ```r (nc = RNetCDF::open.nc(url)) ``` ``` ## [1] 65536 ## attr(,"handle_ptr") ## <pointer: 0x7f92cb46d280> ## attr(,"class") ## [1] "NetCDF" ``` --- # Variable Extraction ... With a connection to a remote resource (object `nc`), we can extract the 1D coordinate and time variables .pull-left[ ```r X = RNetCDF::var.get.nc(nc, "lon") Y = RNetCDF::var.get.nc(nc, "lat") time = RNetCDF::var.get.nc(nc, "day") length(X) ``` ``` ## [1] 1386 ``` ```r length(Y) ``` ``` ## [1] 585 ``` ```r length(time) ``` ``` ## [1] 15267 ``` ] -- .pull-right[ ```r head(X) ``` ``` ## [1] -124.7667 -124.7250 -124.6833 -124.6417 -124.6000 -124.5583 ``` ```r head(Y) ``` ``` ## [1] 49.40000 49.35833 49.31667 49.27500 49.23333 49.19167 ``` ```r head(time) ``` ``` ## [1] 28854 28855 28856 28857 28858 28859 ``` ] --- # Index Identification ... - Great! We know the grid coordinates, but need the position of the California bounding coordinates in relation to the CDM dataset ... - CDM Arrays are 0 indexed - not 1 indexed like R! - note the use of `sort` here, we will come back to this soon... .pull-left[ ```r (xmin = which.min(abs(sort(X) - extent$xmin)) - 1) ``` ``` ## [1] 9 ``` ```r (xmax = which.min(abs(sort(X) - extent$xmax)) - 1) ``` ``` ## [1] 255 ``` ```r (ymin = which.min(abs(sort(Y) - extent$ymin)) - 1) ``` ``` ## [1] 179 ``` ```r (ymax = which.min(abs(sort(Y) - extent$ymax)) - 1) ``` ``` ## [1] 407 ``` ] .pull-right[ ```r time = as.Date("2018-01-19") - as.Date("1979-01-01") (time = as.numeric(time)) ``` ``` ## [1] 14263 ``` ] --- # Construct URL and establish connection ... ```r url = paste0( catolgue, cdm, "?precipitation_amount", "[", time, ":1:", time, "]", "[", ymin, ":1:", ymax, "]", "[", xmin, ":1:", xmax, "]", "#fillmismatch") (nc = RNetCDF::open.nc(url)) ``` ``` ## [1] 196608 ## attr(,"handle_ptr") ## <pointer: 0x7f92bcc89340> ## attr(,"class") ## [1] "NetCDF" ``` -- Can we get coordinates from this resource? -- ```r RNetCDF::var.get.nc(nc, "lat") ``` ``` ## Error in var.inq.nc(ncfile, variable): NetCDF: Variable not found ``` --- # Is it right? ```r rain = RNetCDF::var.get.nc(nc, "precipitation_amount", unpack = TRUE) ``` -- ```r dim(rain) ``` ``` ## [1] 247 229 ``` -- ```r plot(raster::raster(rain)) ``` <img src="index_files/figure-html/unnamed-chunk-44-1.png" style="display: block; margin: auto;" /> --- # Orient data, enforce spatial constraints ... ```r rain = raster(t(rain)) crs(rain) = crs extent(rain) = extent(c(X[xmin], X[xmax], Y[ymax], Y[ymin])) ``` -- ```r plot(rain) plot(AOI$geometry, add = TRUE) ``` <img src="index_files/figure-html/unnamed-chunk-46-1.png" style="display: block; margin: auto;" /> --- # 3. Finding and using climate data (the easy way) --- # Coming out the weeds ... <img src="img/out-of-the-weeds.jpeg" width="60%" style="display: block; margin: auto;" /> --- # climateR - The [climateR package ](https://mikejohnson51.github.io/climateR/) provides access to 11 climate resources centralized under a common access pattern |**Number**|**Dataset** | **Description** | **Dates** | |----------|---------------------| -----------------------------------------------------------|----------------------| |1 | **GridMET** | Gridded Meteorological Data. | 1979 - Yesterday | |2 | **Daymet** | Daily Surface Weather and Climatological Summaries | 1980 - 2019 | |3 | **TopoWX** | Topoclimatic Daily Air Temperature Dataset | 1948 - 2016 | |4 | **PRISM** | Parameter-elevation Regressions on Independent Slopes | 1981 - (Yesterday-1) | |5 | **MACA** | Multivariate Adaptive Constructed Analogs | 1950 - 2099 | |6 | **LOCA** | Localized Constructed Analogs | 1950 - 2100 | |7 | **BCCA** | Bias Corrected Constructed Analogs | 1950 - 2100 | |8 | **BCSD** | Bias Corrected Spatially Downscaled VIC: Monthly Hydrology | 1950 - 2099 | |9 | **TerraClimate** | TerraClimate Monthly Gridded Data | 1958 - 2019 | |10 | **CHIRPS** | Climate Hazards Group InfraRed Precipitation with Station | 1980 - Current month | |11 | **EDDI** | Evaporative Demand Drought Index | 1980 - Current year | --- # Foundations for today - `climateR` function signatures all have 3 minimal requirements: - AOI
- climate parameter
- startDate
-- - Today we we look at three of these - `getGridMET` for historic daily reanalysis data (USA) - `getMACA` for downscaled climate projections (USA) - `getTerraClim` for global, monthly historic data -- ```r library(climateR) ``` --- ## Single layer (1 variable, 1 day, California) ```r rain = getGridMET(aoi_get(state = "CA"), param = "prcp", startDate = "2018-01-19") str(rain, max.level = 2) ``` ``` ## List of 1 ## $ prcp:Formal class 'RasterStack' [package "raster"] with 11 slots ``` -- ```r plot(rain$prcp) plot(aoi_get(state = "CA")$geometry, add = TRUE) ``` <img src="index_files/figure-html/unnamed-chunk-50-1.png" style="display: block; margin: auto;" /> --- ## 3D (1 variable, 16 days, Gulf Coast) ```r harvey = getGridMET(aoi_get(state = c("TX", "FL")), param = "prcp", startDate = "2017-08-20", endDate = "2017-09-03") ``` -- ```r levelplot(harvey$prcp, par.settings = BTCTheme, main = "Hurricane Harvey") ``` <img src="index_files/figure-html/unnamed-chunk-52-1.png" style="display: block; margin: auto;" /> --- ```r sandy = getGridMET(aoi_get(state = c("NC", "ME")), param = "prcp", startDate = "2012-10-22", endDate = "2012-11-02") ``` -- ```r levelplot(sandy$prcp, par.settings = BTCTheme, main = "Hurricane Sandy") ``` <img src="index_files/figure-html/unnamed-chunk-54-1.png" style="display: block; margin: auto;" /> --- # Aggregation over a time period: .pull-left[ ```r levelplot(max(harvey$prcp)/25.4, par.settings = BTCTheme, margin = FALSE, main = "Inches of Rainfall") ``` <img src="index_files/figure-html/unnamed-chunk-55-1.png" style="display: block; margin: auto;" /> ] .pull-right[ ```r levelplot(max(sandy$prcp)/25.4, par.settings = BTCTheme, margin = FALSE, main = "Inches of Rainfall") ``` <img src="index_files/figure-html/unnamed-chunk-56-1.png" style="display: block; margin: auto;" /> ] --- # Maximum Rainfall over the last 30 days ```r p2020 = getGridMET(aoi_get(state = "conus"), param = "prcp", startDate = "2020-09-15", endDate = "2020-10-15") levelplot(max(p2020$prcp)/25.4, par.settings = BTCTheme, margin = FALSE, main = "Maximum Rainfall (2020/09/15 - 2020/10/15)\n~ Last 30 days") ``` <img src="index_files/figure-html/unnamed-chunk-57-1.png" style="display: block; margin: auto;" /> --- ## Multi-**layer** (2 variables, 1 day, Michigan) ```r grids = getGridMET(aoi_get(state = "MI"), param = c("tmax", 'prcp'), startDate = "2018-10-10") ``` -- ```r plot(stack(grids), main = c("Rainfall", "Maximum Temperture")) ``` <img src="index_files/figure-html/unnamed-chunk-59-1.png" style="display: block; margin: auto;" /> --- ## Multi-**Cube** (3 variables, 3 day, Washington State) .pull-left[ ```r system.time({ grids = getGridMET(aoi_get(state = "WA"), param = c("tmax", 'prcp', 'srad'), startDate = "2018-10-10", endDate = "2018-10-12") }) ``` ``` ## user system elapsed ## 0.240 0.121 1.026 ``` ] -- .pull-right[ ```r plot(stack(grids)) ``` <img src="index_files/figure-html/unnamed-chunk-61-1.png" style="display: block; margin: auto;" /> ] --- # Parameter metadata... - Model parameter data is stored in the `param_meta` object - Looking at the `gridmet` offerings we see ... ```r DT::datatable(param_meta$gridmet) ```
--- # Ensemble Cubes - Future forecasts (maca vs gridmet) - Uncertain future = multiple _models_ - Uncertain future = different _pathways_ <img src="img/CMIP5-short.png" width="50%" style="display: block; margin: auto;" /> --- ## User defined models (5 models + 1 summary, 1 day, 1 variable, Colorado) .pull-left[ ```r models = c("bnu-esm","canesm2", "ccsm4", "cnrm-cm5", "csiro-mk3-6-0") ensembles = getMACA(aoi_get(state = "CO"), param = 'tmin', model = models, startDate = "2080-11-29") #stack model outputs and add a mean to stack s = stack(ensembles) s = addLayer(s, mean(s)) names(s) = c(models, "Ensemble Mean") ``` ] -- .pull-right[ ```r levelplot(s, par.settings = BuRdTheme) ``` <img src="index_files/figure-html/unnamed-chunk-65-1.png" style="display: block; margin: auto;" /> ] --- ## Random model choice (4 models, 2 days, 1 variable, Colorado) .pull-left[ ```r random = getMACA(aoi_get(state = "CO"), param = 'prcp', model = 4, startDate = "2080-11-29", endDate = "2080-11-30") ``` ] -- .pull-right[ ```r levelplot(stack(random), par.settings = BTCTheme, layout=c(2, 4)) ``` <img src="index_files/figure-html/unnamed-chunk-67-1.png" style="display: block; margin: auto;" /> ] --- ### Multi-Scenario, Mulit-Model (2 models, 2 days, 2 scenarios, 1 variable, Tactas) ```r random = getMACA(aoi_get(state = c("TX")), param = 'tmax', model = 2, scenario = c("rcp45", "rcp85"), startDate = "2080-11-29", endDate = "2080-11-30") random = stack(random) %>% setNames(do.call(paste0, expand.grid(names(random), names(random[[1]])))) ``` --- ```r levelplot(random, par.settings = BuRdTheme, main = "Texas Model/Scenario Combinations", layout=c(4,2)) ``` <img src="index_files/figure-html/unnamed-chunk-69-1.png" style="display: block; margin: auto;" /> --- # Model Metadata Models can be confusing, for each resource offering future projections, the meta data is stored in the `model_meta` object. ```r DT::datatable(model_meta$maca) ```
--- ### Point Based Time Series (1 site, 275 timesteps, 1 variable) .pull-left[ ```r library(ggplot2) srad = getGridMET(AOI::geocode("UCSB", pt = TRUE), param = "srad", startDate = "2020-01-01", endDate = "2020-10-01") head(srad) ``` ``` ## source lat lon date srad ## 1 gridmet 34.4 -119.85 2020-01-01 146.1 ## 2 gridmet 34.4 -119.85 2020-01-02 149.3 ## 3 gridmet 34.4 -119.85 2020-01-03 151.5 ## 4 gridmet 34.4 -119.85 2020-01-04 141.2 ## 5 gridmet 34.4 -119.85 2020-01-05 148.3 ## 6 gridmet 34.4 -119.85 2020-01-06 156.5 ``` ] -- .pull-right[ ```r ggplot(data = srad, aes(x = date, y = srad, col = srad)) + geom_line() + labs(title = "Solar Radiation @ UCSB", x = "Date", y = "Solar Radiation (W/m^2)") + stat_smooth(col = "red") + theme_linedraw() + scale_color_viridis_c() ``` <img src="index_files/figure-html/unnamed-chunk-72-1.png" style="display: block; margin: auto;" /> ] --- ### Multi Point Time Series (100 sites, 1 variable, 12 months) A common task is the need to extract gridded data at the location of observation sites in order to evaluate the utility of a model/reanalysis product for that location This example was a request from a researcher at Universidade Estadual de Campinas, Brazil .pull-left[ ```r braz = AOI::aoi_get(country = "BRA") sites = read.csv('data/example.csv') %>% st_as_sf(coords = c("long", "lat"), crs = 4326) brazil_rain = getTerraClim(AOI = sites, param = "prcp", startDate = "2018-01-01", endDate = "2018-12-31") ``` ] -- .pull-right[ ```r plot(braz$geometry) plot(brazil_rain$prcp$X2018.01, add = TRUE) plot(sites$geometry, add = TRUE, pch = 16, cex = .2) ``` <img src="index_files/figure-html/unnamed-chunk-74-1.png" style="display: block; margin: auto;" /> ] --- # ClimateR multi-point extraction .pull-left[ ```r site_extract = extract_sites(brazil_rain, sites, "ID") site_extract$prcp[1:10, 1:4] ``` ``` ## date site_190760 site_267801 site_219885 ## 1 2018-01-01 316 331 379 ## 2 2018-02-01 115 133 111 ## 3 2018-03-01 189 313 238 ## 4 2018-04-01 39 44 36 ## 5 2018-05-01 31 20 31 ## 6 2018-06-01 97 36 72 ## 7 2018-07-01 59 44 71 ## 8 2018-08-01 100 51 88 ## 9 2018-09-01 119 69 115 ## 10 2018-10-01 211 186 214 ``` ] -- .pull-right[ ```r library(tidyr) brazil_sites = pivot_longer(site_extract$prcp, -date) head(brazil_sites) ``` ``` ## # A tibble: 6 x 3 ## date name value ## <date> <chr> <dbl> ## 1 2018-01-01 site_190760 316 ## 2 2018-01-01 site_267801 331 ## 3 2018-01-01 site_219885 379 ## 4 2018-01-01 site_200445 292 ## 5 2018-01-01 site_74789 218 ## 6 2018-01-01 site_18343 297 ``` ] --- # ClimateR multi-point extraction .pull-left[ ```r ggplot(data = site_extract$prcp, aes(x = date, y = site_190760)) + geom_line() + theme_linedraw() ``` <img src="index_files/figure-html/unnamed-chunk-77-1.png" style="display: block; margin: auto;" /> ] -- .pull-right[ ```r ggplot(data = brazil_sites, aes(x = date, y = value, col = name)) + scale_color_viridis_d() + geom_line() + theme_linedraw() + theme(legend.position = "none") ``` <img src="index_files/figure-html/unnamed-chunk-78-1.png" style="display: block; margin: auto;" /> ] --- ## Fast Zonal Statistics with `exactextractr` .pull-left[ ```r counties = aoi_get(state = "conus", county = "all") param = c("tmax", "tmin", "prcp", "srad") s = getTerraClim(counties, param = param, startDate = "2018-06-01") %>% stack() %>% setNames(param) ``` ] .pull-right[ ```r plot(s) ``` <img src="index_files/figure-html/unnamed-chunk-80-1.png" style="display: block; margin: auto;" /> ] --- ## Fast Zonal Statistics with `exactextractr` ```r library(dplyr) library(exactextractr) dat = cbind(counties, exact_extract(s, counties, "mean", progress = FALSE)) d3 = select(dat, starts_with("mean")) ``` -- ```r (d3) ``` ``` ## Simple feature collection with 3108 features and 4 fields ## geometry type: MULTIPOLYGON ## dimension: XY ## bbox: xmin: -124.7258 ymin: 24.49813 xmax: -66.9499 ymax: 49.38436 ## geographic CRS: NAD83 ## First 10 features: ## mean.tmax mean.tmin mean.prcp mean.srad geometry ## 1 111.76131 227.0853 28.63453 16.88646 MULTIPOLYGON (((-83.35353 3... ## 2 153.75168 242.0251 29.13041 16.53436 MULTIPOLYGON (((-98.80777 4... ## 3 124.43208 231.2724 25.87235 14.74772 MULTIPOLYGON (((-91.65045 4... ## 5 38.71700 282.4640 34.98621 22.27932 MULTIPOLYGON (((-98.92015 3... ## 6 103.55418 292.9190 34.51898 20.93750 MULTIPOLYGON (((-98.62315 3... ## 7 201.54256 234.8227 28.66088 17.36349 MULTIPOLYGON (((-95.74161 4... ## 8 90.87596 259.2128 31.80395 20.50548 MULTIPOLYGON (((-89.72134 3... ## 9 156.34970 248.8716 33.12452 21.52703 MULTIPOLYGON (((-82.0565 27... ## 10 154.31866 276.6255 30.45088 16.74695 MULTIPOLYGON (((-99.64346 4... ## 11 59.71199 253.2209 25.14288 11.93623 MULTIPOLYGON (((-84.85093 4... ``` --- ## Fast Zonal Statistics with `exactextractr` ```r plot(st_transform(d3, 5070), border = NA) ``` <img src="index_files/figure-html/unnamed-chunk-83-1.png" style="display: block; margin: auto;" /> --- # BONUS! Animation ... ```r library(gifski) south = AOI::aoi_get(state = "conus") save_gif( {for(i in 1:nlayers(harvey$prcp)) { plot(harvey$prcp[[i]], col = blues9, legend = FALSE, main = names(harvey$prcp)[i]) plot(south$geometry, add = TRUE) } }, gif_file = "img/ppt.gif", width = 800, height = 600, delay = .75, loop = TRUE) ``` --- ![](img/ppt.gif) --- ```r gulf = aoi_get(state = "conus", county = "all") bb = st_as_sfc(st_bbox(harvey$prcp)) %>% st_transform(st_crs(gulf)) gulf = st_intersection(gulf,bb) rainfall = exact_extract(harvey$prcp, gulf, "mean", progress = FALSE) breaks = seq(min(rainfall, na.rm = TRUE), max(rainfall, na.rm = TRUE), length.out = 100) save_gif( {for(i in 1:ncol(rainfall)) { gulf$rain = rainfall[, i] plot(gulf['rain'], breaks = breaks, main = colnames(rainfall)[i]) } }, gif_file = "img/ppt2.gif", width = 800, height = 600, delay = .75, loop = TRUE) ``` --- ![](img/ppt2.gif) --- # Conclusions - Climate data is a useful resource for environmental data science - The nature of climate models and data make the resources themselves HUGE - To only get the data we need, we can request specific subsets of remotely stored data - To do this, we need a good understanding of the raster data structure and web technologies - `AOI` & `climateR` are two packages that facilitate these workflows - The way you use climate data is endless but require an understanding of raster data processing and manipulation... - More detail about the raster data model can be found [here](https://mikejohnson51.github.io/spds/lecture-17.html) - Some starting material on raster manipulation can be found [here](https://mikejohnson51.github.io/spds/lecture-18.html) - Some starting material on map algebra can be found [here](https://mikejohnson51.github.io/spds/lecture-19) - Some starting material on kmeans raster classification can be found [here](https://mikejohnson51.github.io/spds/lecture-20) - In a few weeks Casey will discuss even bigger and better things! --- class: middle, center, inverse # Thank you **Mike Johnson** ****** [Email
](mailto:jmj00@ucsb.edu) [Github
](https://github.com/mikejohnson51) [Webpage
](http://mikejohnson51.github.io) ****** [AOI
](https://github.com/mikejohnson51/AOI) [climateR
](https://github.com/mikejohnson51/climateR)