Lab 10: Distances and Projections

Hints & Tricks

Question 1:

Making Spatial Objects & Coordinate Transformation

Spatial objects (sf) can be built from a vector of X and Y values in addition to a coordinate reference system (CRS). For example:

df <- data.frame(name = state.name, 
                X = state.center$x, 
                Y = state.center$y)
head(df)
        name         X       Y
1    Alabama  -86.7509 32.5901
2     Alaska -127.2500 49.2500
3    Arizona -111.6250 34.2192
4   Arkansas  -92.2992 34.7336
5 California -119.7730 36.5341
6   Colorado -105.5130 38.6777
# Geographic Coordinate System (GCS)
(df_sf_gcs = st_as_sf(df, 
                      coords = c("X", "Y"), 
                      crs = 4269))
Simple feature collection with 50 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -127.25 ymin: 27.8744 xmax: -68.9801 ymax: 49.25
Geodetic CRS:  NAD83
First 10 features:
          name                 geometry
1      Alabama POINT (-86.7509 32.5901)
2       Alaska    POINT (-127.25 49.25)
3      Arizona POINT (-111.625 34.2192)
4     Arkansas POINT (-92.2992 34.7336)
5   California POINT (-119.773 36.5341)
6     Colorado POINT (-105.513 38.6777)
7  Connecticut POINT (-72.3573 41.5928)
8     Delaware POINT (-74.9841 38.6777)
9      Florida  POINT (-81.685 27.8744)
10     Georgia POINT (-83.3736 32.3329)
ggplot() + 
  geom_sf(data = df_sf_gcs) + 
  coord_sf(datum = st_crs(df_sf_gcs)) +
  theme_linedraw()

# Projected Coordinate System (PCS)
# st_transforms converts from one reference system to another
(df_sf_pcs = st_transform(df_sf_gcs, 5070))
Simple feature collection with 50 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -2805703 ymin: 640477 xmax: 2079664 ymax: 3291437
Projected CRS: NAD83 / Conus Albers
First 10 features:
          name                  geometry
1      Alabama  POINT (862043.5 1099545)
2       Alaska  POINT (-2264853 3291437)
3      Arizona  POINT (-1422260 1356663)
4     Arkansas  POINT (336061.5 1303543)
5   California  POINT (-2086972 1760961)
6     Colorado POINT (-818480.9 1779785)
7  Connecticut   POINT (1936213 2307450)
8     Delaware   POINT (1796466 1938236)
9      Florida    POINT (1409814 640477)
10     Georgia   POINT (1179012 1107322)
ggplot() + 
  geom_sf(data = df_sf_pcs) + 
  coord_sf(datum = st_crs(df_sf_pcs)) +
  theme_linedraw()

Question 2:

st_distance review

# Three most populous cities in the USA
(big3 = cities |> 
   select(city, population) |> 
   slice_max(population, n = 3))
Simple feature collection with 3 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -2032604 ymin: 1468468 xmax: 1833394 ymax: 2178657
Projected CRS: NAD83 / Conus Albers
# A tibble: 3 × 3
  city        population           geometry
  <chr>            <dbl>        <POINT [m]>
1 New York      18832416  (1833394 2178657)
2 Los Angeles   11885717 (-2032604 1468468)
3 Chicago        8489066 (684628.5 2122697)
# Fort Collins
(foco = filter(cities, city == "Fort Collins") |> 
    select(city, population))
Simple feature collection with 1 feature and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -760147.5 ymin: 1984621 xmax: -760147.5 ymax: 1984621
Projected CRS: NAD83 / Conus Albers
# A tibble: 1 × 3
  city         population            geometry
  <chr>             <dbl>         <POINT [m]>
1 Fort Collins     339256 (-760147.5 1984621)
# Distance from foco to population centers
st_distance(big3, foco)
Units: [m]
        [,1]
[1,] 2600790
[2,] 1373156
[3,] 1451359

There are two notable things about this result:

  1. It has units
  2. It is returned as a matrix, even though foco only had one row

This second point highlights a useful feature of st_distance, namley, its ability to return distance matrices between all combinations of features in x and y.

units review

While units are useful, they are not always the preferred units. By default, the units measurement is defined by the projection. For example:

st_crs(big3)$units
[1] "m"

Units can be converted using units::set_units. For example, ‘m’ can be converted to ‘km’:

big3 = mutate(big3, 
              dist_to_foco = st_distance(big3, foco),
              dist_to_foco = set_units(dist_to_foco, "km")) 

(big3$dist_to_foco)
Units: [km]
         [,1]
[1,] 2600.790
[2,] 1373.156
[3,] 1451.359

You might have noticed the data type of the st_distance objects are an S3 class of units. Sometimes, this class can cause problems when trying to using it with other classes or methods:

big3$dist_to_foco + 4
Error in Ops.units(big3$dist_to_foco, 4): both operands of the expression should be "units" objects
ggplot(data = big3) + 
  geom_col(aes(x = city, y = dist_to_foco)) + 
  theme_linedraw()

In these cases, the units class can be dropped with units::drop_units

big3 = mutate(big3, 
              dist_to_foco = st_distance(big3, foco),
              dist_to_foco = set_units(dist_to_foco, "km"),
              dist_to_foco = drop_units(dist_to_foco))

big3$dist_to_foco + 4
         [,1]
[1,] 2604.790
[2,] 1377.156
[3,] 1455.359
ggplot(data = big3) + 
  geom_col(aes(x = reorder(city, -dist_to_foco), y = dist_to_foco)) + 
    labs(title = "Distance to Fort Collins (km)") + 
  ggthemes::theme_fivethirtyeight() + 
  theme( axis.text.x = element_text(face = "bold", size = 14))

As with all functions, these steps can be nested:

big3 = mutate(big3, 
              dist_to_foco = drop_units(set_units(st_distance(big3, foco), "km")))

Geometry review

There are a few ways to manipulate existing geometries, here we discuss st_union, st_combine and st_cast

  • st_combine returns a single, combined geometry, with no resolved boundaries.

  • st_union() returns a single geometry with resolved boundaries

  • st_cast() casts one geometry type to another

(rockies = AOI::aoi_get(state = c('Montana', 'Wyoming', 'Colorado', "New Mexico")) |> 
  select(name, geometry))
Simple feature collection with 4 features and 1 field
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -116.0491 ymin: 31.33216 xmax: -102.0415 ymax: 49.00111
Geodetic CRS:  WGS 84
        name                       geometry
1    Montana MULTIPOLYGON (((-109.7985 4...
2    Wyoming MULTIPOLYGON (((-106.3212 4...
3   Colorado MULTIPOLYGON (((-105.155 36...
4 New Mexico MULTIPOLYGON (((-104.8477 3...
plot(rockies['name'], key.pos = 1)

# Combine Geometries
(combined_wc = st_combine(rockies))
Geometry set for 1 feature 
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -116.0491 ymin: 31.33216 xmax: -102.0415 ymax: 49.00111
Geodetic CRS:  WGS 84
MULTIPOLYGON (((-109.7985 45.00233, -109.7861 4...
plot(combined_wc, col = "red")

# Unioned Geometries
(unioned_wc = st_union(rockies))
Geometry set for 1 feature 
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -116.0491 ymin: 31.33216 xmax: -102.0415 ymax: 49.00111
Geodetic CRS:  WGS 84
POLYGON ((-109.05 41.00069, -109.0493 40.90865,...
plot(unioned_wc, col = "red")

# Combine Geometries
line_wc = st_cast(unioned_wc, "MULTILINESTRING")
plot(line_wc, col = "red")

Question 3:

In this section you will extend your growing ggplot skills to handle spatial data using ggrepl to label significant features; gghighlight to emphasize important criteria; and scaled color/fill to create chloropleth represnetations of variables. Below is some example code to provide an example of these tools in action:

Get some data (review)

# Define a state/region classifier and select the southern states
state.of.interest <- data.frame(state = state.name, region = state.region) |> 
  filter(region == "South") |> 
  pull(state)

# Get USA states in the southern region and transform to EPSG:5070
state = AOI::aoi_get(state = "conus") |> 
  filter(name %in% state.of.interest) |> 
  st_transform(5070)

# Get USA counties in the southern region and transform to EPSG:5070
counties =  AOI::aoi_get(state = "conus", county = 'all')  |> 
  filter(state_name %in% state.of.interest) |> 
  st_transform(5070)

# Get the 10 most populous cities in the southern region and transform to EPSG:5070
sub_cities = cities |> 
  filter(state_name %in% state.of.interest) |> 
  slice_max(population, n = 10) |> 
  st_transform(5070)

Map

ggplot() + 
  # Add districts with a dashed line (lty = 3), 
  # a color gradient from blue to red based on land_area, 
  # and a fill aplha of 0.5
  geom_sf(data = counties, aes(fill = land_area), lty = 3, alpha = .5) + 
  scale_fill_gradient(low = 'blue', high = "red") +
  # Highlight (keep blue) only those districts witn a land area > 5e10
  gghighlight(land_area > 2e9) +
  # Add the state borders with a thicker line and no fill
  geom_sf(data = state, size = 1, fill = "NA") +
  # Add the cities
  geom_sf(data = sub_cities, size= 2, color = "red") +
  # Add labels to the cities
  ggrepel::geom_label_repel(
    data = sub_cities,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates",
    size = 3) +
  labs(fill = "Area Land") + 
  ggthemes::theme_map()