library(zonal)
library(ggplot2)
library(dplyr)
library(sf)
library(tidyr)
library(terra)
library(paint)
In zonal
categorical data is handled with
execute_zonal(..., FUN = 'freq')
which computes the
relative proportion of a numeric class in each aggregation unit. In the
following we illustrate its use using a mosaiced 1km grid containing
MODIS 2019 land cover.
file = '2019-01-01.tif'
(r = rast(file))
## class : SpatRaster
## dimensions : 2896, 4616, 1 (nrow, ncol, nlyr)
## resolution : 1000, 1000 (x, y)
## extent : -2357000, 2259000, 277000, 3173000 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +datum=NAD27 +units=m +no_defs
## source : 2019-01-01.tif
## name : Land_Cover_Type_1
terra::plot(r)
Looking at the grid we can see in consists of 13367936 grid cells each with a 1000 meter by 1000 meter resolution. Additionally, there are 18 unique values in the grid (17 land cover and one nodata value).
First, we want to identify the percent of each land cover within each
county in the USA Northeast. Doing this follows the same process as all
zonal
workflows and requires (1) identifying the
aggregation units, (2) building a weight grid and (3) running the
intersection.
AOI = AOI::aoi_get(state = "Northeast", county = "all")
plot(AOI$geometry, main = paste(nrow(AOI), "counties"))
system.time({
lc = execute_zonal(file, geom = AOI, ID = "geoid", FUN = "freq")
})
## user system elapsed
## 1.826 0.462 2.316
paint(lc)
## sf [1997, 18]
## active geometry column: geometry (MULTIPOLYGON)
## crs: 4269 (NAD83)
## crs unit: degree
## geoid chr 09001 09001 09001 09001 09001 09001
## statefp chr 09 09 09 09 09 09
## countyfp chr 001 001 001 001 001 001
## countyns chr 00212794 00212794 00212794 00212794 0~
## affgeoid chr 0500000US09001 0500000US09001 0500000~
## name chr Fairfield Fairfield Fairfield Fairfie~
## namelsad chr Fairfield County Fairfield County Fai~
## stusps chr CT CT CT CT CT CT
## state_name chr Connecticut Connecticut Connecticut C~
## lsad chr 06 06 06 06 06 06
## aland dbl 1618664029 1618664029 1618664029 1618~
## awater dbl 549280913 549280913 549280913 5492809~
## state_name.1 chr Connecticut Connecticut Connecticut C~
## state_abbr chr CT CT CT CT CT CT
## jurisdiction_type chr state state state state state state
## value dbl 17 3 11 13 4 1
## percentage dbl 0.025932 0.002465 0.012004 0.257007 0~
## geometry sfc MPOLY 1,416B MPOLY 1,416B MPOLY 1,416~
While the above works, calling on fields by their numeric ID is prone to error. Instead, this example shows how a reclassification table can be supplied to modify the column headings of the output table. A reclassification table tells us what each numeric value represents in a categorical raster. Below, we use a CSV file to define this mapping. The schema used is that one column must be named “from” - this is the existing data values, and one column must be named “to” - this is the desired column headings.
## data.frame [17, 2]
## from int 1 2 3 4 5 6
## to chr evergreen_needle evergreen_broad deciduous_needle ~
system.time({
lc = execute_zonal(file, geom = AOI, ID = "geoid", FUN = "freq", rcl = rcl)
})
## user system elapsed
## 1.808 0.454 2.285
paint(lc)
## sf [1997, 18]
## active geometry column: geometry (MULTIPOLYGON)
## crs: 4269 (NAD83)
## crs unit: degree
## geoid chr 09001 09001 09001 09001 09001 09001
## statefp chr 09 09 09 09 09 09
## countyfp chr 001 001 001 001 001 001
## countyns chr 00212794 00212794 00212794 00212794 0~
## affgeoid chr 0500000US09001 0500000US09001 0500000~
## name chr Fairfield Fairfield Fairfield Fairfie~
## namelsad chr Fairfield County Fairfield County Fai~
## stusps chr CT CT CT CT CT CT
## state_name chr Connecticut Connecticut Connecticut C~
## lsad chr 06 06 06 06 06 06
## aland dbl 1618664029 1618664029 1618664029 1618~
## awater dbl 549280913 549280913 549280913 5492809~
## state_name.1 chr Connecticut Connecticut Connecticut C~
## state_abbr chr CT CT CT CT CT CT
## jurisdiction_type chr state state state state state state
## value chr water deciduous_needle wetlands urban~
## percentage dbl 0.025932 0.002465 0.012004 0.257007 0~
## geometry sfc MPOLY 1,416B MPOLY 1,416B MPOLY 1,416~
From the exactectractr
vignettes there is an example to
compute class statistics using dplyr
. Here we compare that
approach, to one supplemented with data.table
, to
zonal
with pre-computed weights, and to a single
zonal
execution.
library(exactextractr)
library(data.table)
# exactextract with dplyr
exactextract_dplyr = function(file, AOI) {
exact_extract(terra::rast(file), AOI, function(df) {
df %>%
mutate(frac_total = coverage_fraction / sum(coverage_fraction)) %>%
group_by(geoid, value) %>%
summarise(freq = sum(frac_total), .groups = "keep")
}, summarize_df = TRUE, include_cols = "geoid", progress = FALSE)
}
# exactextract with data.table
exactextract_dt = function(file, AOI){
exact_extract(terra::rast(file), AOI, function(df) {
dt = setDT(df)
dt$frac_total = dt$coverage_fraction / sum(dt$coverage_fraction)
dt[, .(freq = sum(frac_total, na.rm = TRUE)), by = .(value)]
}, summarize_df = TRUE, include_cols = "geoid", progress = FALSE)
}
bnch <- bench::mark(
iterations = 1, check = FALSE, time_unit = "s",
exactextract_dplyr_out = exactextract_dplyr(file, AOI),
exactextract_dt_out = exactextract_dt(file, AOI),
zonal = execute_zonal(file, AOI, "geoid", FUN = "freq")
)