Demonstrate: basic terra workflows (read, crop, mask, aggregate)
Apply: simple raster algebra and zonal/focal/global operations
Interpret: examples of raster analyses (elevation, clustering)
Recap: sf and spatial concepts
Spatial phenomena can be thought of as discrete objects with clear boundaries or as continuous phenomena that can be observed everywhere, with no natural boundaries.
We have described these as objects and fields (Kuhn, 2012)
Objects are usually represented by vector data consisting of:
a geometry (simple features (sfc, sfg))
some attribute information (data.frame)
In R these are unified as a sf object
Field data is typically represented by raster data
For this, we will begin our discussions using the terra package
Continuous data can be represented either as continuous or categorical values on a regular grid of cells (a matrix).
Cells have: (1) a resolution, (2) an inferred coordinate, typically referenced by centroid, and (3) a value that applies to the entire cell area.
Recap: R data structures
Vector:
A vector can have dimensions
A 1D vector is a collection of values
A 2D vector is a matrix
A 3D vector is an array
List: a collection of objects
data.frame: a list with requirement of equal length column (vectors)
data.frames and lists (sfc) defined our vector model
Arrays will define our raster model
Spatial Extent
One last topic with respect to vector data (that will carry us into raster) is the idea of an extent:
(ny <- AOI::aoi_get(state ="NY") |>st_transform(5070) |> dplyr::select(name))#> Simple feature collection with 1 feature and 1 field#> Geometry type: MULTIPOLYGON#> Dimension: XY#> Bounding box: xmin: 1319503 ymin: 2149150 xmax: 1997508 ymax: 2658544#> Projected CRS: NAD83 / Conus Albers#> name geometry#> 1 New York MULTIPOLYGON (((1661336 263...
In geometry, the minimum bounding box for a point set (stored as POINT, POLYLINE, POLYGON) in N dimensions is “…the box with the smallest measure within which all the points lie.”
We can extract bounding box coordinates with st_bbox
length(cent) # how many grid tiles#> [1] 3468mapview::npts(grid1km) # how many points?#> [1] 17340mapview::npts(grid1km) *2# how many X and Y?#> [1] 34680
Equal area from centroid
We can use our voronoi diagram to show that the area closest to a cell centroid is the cell itself.
vor =st_union(cent) |>st_voronoi() |>st_cast() |>st_intersection(bb)plot(ny$geometry); plot(vor, add =TRUE)
Raster Model
The raster model is one of the earliest and most widely used data models within geographic information systems (Tomlin, 1990; Goodchild, 1992, Maguire, 1992).
Typically used to record, analyze and visualize data with a continuous nature such as elevation, temperature (“GIS”), or reflected or emitted electromagnetic radiation (“Remote Sensing”)
Quotes are used because you’ll find from a data perspective these differences are artificial and a product of the ESRI/ENVI/ERDAS divide
The term raster originated from the German word for screen, implying a series of orthogonality oriented parallel lines.
Digital raster objects most often take the form of a regularly spaced, grid-like pattern of rows and columns
Each element referred to as a cell, pixel, or grid point.
Many terms mean the same thing …
The entire raster is sometimes referred to as an “image”, “array”, “surface”, “matrix”, or “lattice” (Wise, 2000).
They all mean the same thing.
Cells of the raster are most often square, but may be rectangular (with differing resolutions in x and y directions) or other shapes that can be tessellated such as triangles and hexagons (Figure below from Peuquet, 1984).
Photos and Computers …
Aerial Imagery (really just a photo 😄)
What is stored in these cells?
Categorical Values (integer/factor)
Continuous Values (numeric)
Spectral Values
Either Color, or sensor
Any digital image contains an RGB channel for color:
Red, green, and blue are the three additive colors (primary colors of light)
In R, colors can be defined using the RGB channels
par(mfrow =c(1,3), mar =c(0,0,0,0))plot(ny$geometry, col =rgb(1,0,0)) # redplot(ny$geometry, col =rgb(0,1,0)) # greenplot(ny$geometry, col =rgb(0,0,1)) # blue
RGB and bytes/bits
The red, green and blue use 8 bits each (1 byte), which each have integer values from 0 to 255.
How does this relate to the selection of 256x256 pixel tiles in web maps?
RGB values are what matter most for raster imagery in this course.
Digital images store color with red, green, and blue channels.
Printers use a different system (CMYK), but the core idea is the same: color is encoded numerically.
For us, the important takeaway is that image rasters store values by channel.
par(mfrow =c(1,4), mar =c(0,0,0,0))plot(ny$geometry, col =rgb(0,1,1)) # cyanplot(ny$geometry, col =rgb(1,0,1)) # Magentaplot(ny$geometry, col =rgb(1,1,0)) # Yellowplot(ny$geometry, col =rgb(0,0,0)) # black
Why do we care?
Pixels are the base unit of raster data and have a resolution
This is the X and the Y dimension of each cell in the units of the CRS
Resolution drives image clarity (granularity)
Higher resolution (smaller cells) = more detail, but bigger data!
Raster images seek to discretize the real world into cell-based values
Again either integer (categorical), continuous, or signal
All rasters have an extent!
This is the same extent as a bounding box
Can be described as 4 values (xmin,ymin,xmax,ymax)
Implicit Coordinates
Unlike vector data, the raster data model stores cell coordinates indirectly.
Coordinates are derived from the raster origin, the resolution, and the cell index (for example, row 100 and column 150).
In practice, the column position determines movement in the x-direction and the row position determines movement in the y-direction. Cell-center coordinates are inferred from that structure rather than stored one by one.
So, any image (.png, .tif, .gif) can be read as a raster…
The raster is defined by the extent and resolution of the cells
To be spatial, the extent (and thus the coordinates) must be grounded in a CRS
A SpatRast represents single-layer (variable) raster data.
A SpatRast always stores the fundamental parameters that describe it. - The number of columns and rows, - The spatial extent - The Coordinate Reference System.
In addition, a SpatRast can store information about the file where raster values are stored (if there is such a file).
Compared to S3, the S4 object system is stricter and closer to other object-oriented systems.
What does this mean for us?
A SpatRaster is a formal object with a defined representation.
In practice, that means it has structured components and methods for things like extent, CRS, dimensions, and values. Slots can be accessed with @, even though we will usually prefer accessor functions.
str(r, max.level =2)#> S4 class 'SpatRaster' [package "terra"]r@pntr#> C++ object <0x11f011920> of class 'SpatRaster' <0x162343ab0>r@pntr$ncol#> Class method definition for method ncol()#> function () #> {#> " unsigned long ncol() \n docstring : ncol"#> .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: 0x10306d100>, #> dll = list(name = "Rcpp", path = "/Users/mikejohnson/Library/R/arm64/4.4/library/Rcpp/libs/Rcpp.so", #> dynamicLookup = TRUE, handle = <pointer: 0x73c71650>, #> info = <pointer: 0x1030088d0>, forceSymbols = FALSE), #> numParameters = -1L), <pointer: 0x162343ab0>, <pointer: 0x16233b560>, #> .pointer)#> }#> <environment: 0x17552e3c8>
v =values(elev)class(v)#> [1] "matrix" "array"length(v)#> [1] 11902500
elev#> class : SpatRaster #> size : 3450, 3450, 1 (nrow, ncol, nlyr)#> resolution : 28.98965, 28.98965 (x, y)#> extent : -810156.6, -710142.3, 1934613, 2034627 (xmin, xmax, ymin, ymax)#> coord. ref. : +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs #> source : foco-elev.tif #> name : foco-elev #> min value : 1382 #> max value : 4346
Raster Values
# The length of the vector is equal to the rows * columnslength(v) ==nrow(elev) *ncol(elev)#> [1] TRUE# The span of the x extent divided by the resolution equals the raster rows((xmax(elev) -xmin(elev)) /res(elev)[1]) ==ncol(elev) #> [1] TRUE# The span of the x extent divided by the number of rows equals the raster resolution((xmax(elev) -xmin(elev)) /ncol(elev)) ==res(elev)[1] #> [1] TRUE
plot(r)plot(river, add =TRUE, col ="blue", lwd =2)plot(line, add =TRUE, col ="black", lwd =2)plot(outlet$geom, add =TRUE, pch =16, col ="red")plot(inlet$geom, add =TRUE, pch =16, col ="green")
Sinuosity
Channel sinuosity is calculated by dividing the length of the stream channel by the straight line distance between the end points of the selected channel reach.
plot(global(prcp$ppt, max)$max, type ="l", ylab ="rainfall", xlab ="month since 2000-01")lines(global(prcp$ppt, min)$min, type ="l", col ="blue")lines(global(prcp$ppt, mean)$mean, type ="l", col ="darkred", lwd =2)
mean()
plot(mean(prcp$ppt), col = blues9)plot(AOI, add =TRUE, col =NA, lwd =2)
Main question: “how do you select and/or reduce value to a new grid value knowing each grid has 0:n contributors, and each contributor has a different weight (distance, area, etc)
Rule of Thumb: Use nearest for categories, bilienar/cubic for continuous data. But, projection is always “lossy” and is slow.
Summary: Raster Reprojection & Resampling
Task
Method
Key Consideration
Reproject to new CRS
project(rast, crs)
Choose method: nearest (categorical) or cubic (continuous)