Spatial phenomena can be thought of as discrete objects with clear boundaries or as continuous phenomena that can be observed everywhere, with no natural boundaries.
We have described these as objects and fields (Kuhn, 2012)
Objects are usually represented by vector data consisting of:
a geometry (simple features (sfc, sfg))
some attribute information (data.frame)
In R these are unified as an sf object
Field data is typically represented by raster data
For this, we will begin our discussions using the terra package
Raster data represent a field as continuous or categorical values on a regular grid of cells (a matrix)
Cells have: (1) a resolution, (2) an inferred cell coordinate (the centroid), and (3) a coordinate and value that apply to the entire cell area
Recap: R data structures
Vector:
A vector can have dimensions
A 1D vector is a collection of values
A 2D vector is a matrix
A 3D vector is an array
List: a collection of objects
data.frame: a list whose columns (vectors) must all have equal length
data.frames and lists (sfc) defined our vector model
Arrays will define our raster model
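The recap above can be sketched in base R. This is only an illustration; the object names (v, m, a, l, df) are made up for the example and do not come from the course data.

```r
# A 1D vector: a collection of values of a single type
v <- 1:12

# Adding dimensions turns the same values into a matrix (2D) ...
m <- v
dim(m) <- c(3, 4)                 # 3 rows, 4 columns

# ... or into an array (3D)
a <- array(v, dim = c(2, 3, 2))   # 2 rows, 3 columns, 2 layers

# A list: a collection of arbitrary objects
l <- list(values = v, grid = m, label = "example")

# A data.frame: a list of equal-length columns
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

is.matrix(m)   # TRUE
dim(a)         # 2 3 2
is.list(df)    # TRUE: a data.frame is a list underneath
```

The matrix and the array hold exactly the same values as the 1D vector; only the dim attribute differs, which is why arrays are a natural container for gridded data.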
Spatial Extent
One last topic with respect to vector data (that will carry us into raster) is the idea of an extent:
(ny <- AOI::aoi_get(state = "NY") |> st_transform(5070) |> dplyr::select(name))
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 1319502 ymin: 2149150 xmax: 1997508 ymax: 2658543
#> Projected CRS: NAD83 / Conus Albers
#>       name                       geometry
#> 1 New York MULTIPOLYGON (((1661335 263...
In geometry, the minimum bounding box for a point set (stored as POINT, LINESTRING, POLYGON) in N dimensions is “…the box with the smallest measure within which all the points lie.”
We can extract bounding box coordinates with st_bbox
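To make the definition concrete, here is a minimal base R sketch of an axis-aligned bounding box for a small point set. The points are hypothetical, and the code only illustrates the idea; for sf objects, st_bbox does this work.

```r
# Hypothetical point set: one row per point, columns are X and Y
pts <- cbind(
  x = c(2, 5, 3, 8),
  y = c(1, 7, 4, 2)
)

# The axis-aligned bounding box is the range of each dimension
bbox <- c(
  xmin = min(pts[, "x"]),
  ymin = min(pts[, "y"]),
  xmax = max(pts[, "x"]),
  ymax = max(pts[, "y"])
)
bbox
#> xmin ymin xmax ymax 
#>    2    1    8    7
```

Four numbers are enough to describe the box no matter how many points it encloses.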
length(cent) # how many grid tiles?
#> [1] 3468
mapview::npts(grid1km) # how many points?
#> [1] 17340
mapview::npts(grid1km) * 2 # how many X and Y?
#> [1] 34680
Raster Model
The raster model is one of the earliest and most widely used data models within geographic information systems (Tomlin, 1990; Goodchild, 1992; Maguire, 1992).
Typically used to record, analyze and visualize data with a continuous nature such as elevation, temperature (“GIS”), or reflected or emitted electromagnetic radiation (“Remote Sensing”)
Quotes are used because you’ll find from a data perspective these differences are artificial and a product of the ESRI/ENVI/ERDAS divide
The term raster originated from the German word for screen, implying a series of orthogonally oriented parallel lines.
Digital raster objects most often take the form of a regularly spaced, grid-like pattern of rows and columns
Each element is referred to as a cell, pixel, or grid point.
Many terms mean the same thing …
The entire raster is sometimes referred to as an “image”, “array”, “surface”, “matrix”, or “lattice” (Wise, 2000).
They all mean the same thing…
Cells of the raster are most often square, but may be rectangular (with differing resolutions in x and y directions) or other shapes that can be tessellated such as triangles and hexagons (Figure below from Peuquet, 1984).
Photos and Computers …
Aerial Imagery (really just a photo 😄)
What is stored in these cells?
Categorical Values (integer/factor)
Continuous Values (numeric)
Spectral Values
Either Color, or sensor
Why do we care?
Pixels are the base unit of raster data and have a resolution
This is the X and the Y dimension of each cell in the units of the CRS
Raster images seek to discretize the real world into cell-based values
Again either integer (categorical), continuous, or signal
Resolution drives image clarity (granularity)
Higher resolution (smaller cells) = more detail, but bigger data!
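The trade-off between detail and data volume can be sketched with simple arithmetic. The extent and cell sizes below are hypothetical values chosen only for illustration.

```r
# Hypothetical square extent, 30 km on a side, in CRS units of meters
width  <- 30000
height <- 30000

# Candidate cell sizes (resolution) in meters
res <- c(1000, 100, 30, 10)

# Number of cells needed to cover the extent at each resolution
ncell <- (width / res) * (height / res)

data.frame(res = res, ncell = ncell)
```

Because cells tile two dimensions, halving the cell size quadruples the number of cells that must be stored.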
All rasters have an extent!
This is the same extent as a bounding box
Can be described as 4 values (xmin,ymin,xmax,ymax)
Implicit Coordinates
Unlike vector data, the raster data model stores the coordinate of the grid cells indirectly
Coordinates are derived from the reference (Xmin, Ymin), the resolution, and the cell index (e.g. [100,150])
For example: to get the coordinates of the value in the 3rd row and the 40th column of a raster matrix, we move from the origin (Xmin, Ymin) by (40 x Xres) in the x-direction, since columns run along x, and by (3 x Yres) in the y-direction, since rows run along y
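The derivation of a cell coordinate can be sketched as a small function. This is an illustrative helper, not part of terra: the name cell_center and its arguments are made up here. It assumes row 1 is the top row (the convention terra uses), so y is measured downward from Ymax; counting rows from Ymin instead only flips the sign of the y step. It returns the cell center, hence the half-cell offset.

```r
# Cell-center coordinates from extent, resolution, and cell index.
# Assumption: row 1 is the TOP row and column 1 is the LEFT column.
cell_center <- function(row, col, xmin, ymax, xres, yres) {
  c(
    x = xmin + (col - 0.5) * xres,  # columns advance along x
    y = ymax - (row - 0.5) * yres   # rows advance downward along y
  )
}

# Hypothetical raster: upper-left corner at (0, 100), 10 x 10 unit cells
cell_center(row = 3, col = 40, xmin = 0, ymax = 100, xres = 10, yres = 10)
# x = 395, y = 75
```

No coordinate is stored per cell; two corner values, two resolutions, and the index are enough to recover it.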
So, any image (.png, .tif, .gif) can be read as a raster…
The raster is defined by the extent and resolution of the cells
To be spatial, the extent (thus coordinates) must be grounded in a CRS
Critically, rast() reads the data header, before calling data into memory
The data are streamed into memory only when an operation that needs the values is called
rast("data/foco-elev.tif")
#> class       : SpatRaster 
#> dimensions  : 725, 572, 1  (nrow, ncol, nlyr)
#> resolution  : 30, 30  (x, y)
#> extent      : -769695, -752535, 1978485, 2000235  (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs 
#> source      : foco-elev.tif 
#> name        : dem 
#> min value   : 146730 
#> max value   : 197781
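The header values are internally consistent: the dimensions follow from the extent and the resolution. A quick base R check, using numbers copied from the printed header (the variable names are chosen here for illustration):

```r
# Values copied from the printed header above
xmin <- -769695; xmax <- -752535
ymin <- 1978485; ymax <- 2000235
xres <- 30;      yres <- 30

# Dimensions follow from extent and resolution
n_col <- (xmax - xmin) / xres   # 572
n_row <- (ymax - ymin) / yres   # 725

# Total number of cell values in the single layer
n_row * n_col                   # 414700
```

This is why the header alone is enough to describe the grid before any cell values are read.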
Raster Data from the Web
GDAL exposes a number of virtual file system interfaces (VSI) that allow for reading across various data protocols. These are used by prefixing a URL or path with the needed interface (for example, /vsicurl/ for files served over HTTP)
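As a minimal sketch of the prefixing idea, the snippet below builds a /vsicurl/ path in base R. The URL is a placeholder and does not point to a real dataset, so the call to terra is left commented out.

```r
# Placeholder URL, not a real dataset
url <- "https://example.com/data/elevation.tif"

# /vsicurl/ is GDAL's prefix for reading a file over HTTP(S)
vsi_path <- paste0("/vsicurl/", url)
vsi_path
#> [1] "/vsicurl/https://example.com/data/elevation.tif"

# With a real, reachable file, the path could then be handed to terra:
# r <- terra::rast(vsi_path)
```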
A SpatRaster represents raster data with one or more layers (variables).
A SpatRaster always stores the fundamental parameters that describe it: the number of columns and rows, the spatial extent, and the Coordinate Reference System.
In addition, a SpatRaster can store information about the file where raster values are stored (if there is such a file).