Lab 10: Distances and Projections
Distance from the Border Zone
In this lab we will explore the properties of sf, sfc, and sfg features & objects; how they are stored; and issues related to distance calculation and coordinate transformation.
We will continue to build on our data wrangling and data visualization skills, as well as document preparation via Quarto and GitHub.
Set-up
- Navigate to your csu-ess-330 repository
- Create a new Quarto (.qmd) file called lab-02.qmd
- Populate its YML with a title, author, subtitle, output type and theme. For example:
---
title: "Lab 02: Distances and the Border Zone"
subtitle: 'Ecosystem Science and Sustainability 523c'
author:
  - name: ...
    email: ...
format: html
---
Libraries
# spatial data science
library(tidyverse)
library(sf)
library(units)
# Data
library(AOI)
# Visualization
library(gghighlight)
library(ggrepel)
library(knitr)
library(flextable)
Background
In this lab, 4 main skills are covered:
- Ingesting / building sf objects from R packages and CSVs (Q1)
- Manipulating geometries and coordinate systems (Q2)
- Calculating distances (Q2)
- Building maps using ggplot (Q3)
Hints and Tricks for this lab are available here
Question 1:
For this lab we need three (3) datasets.
- Spatial boundaries of continental USA states (1.2)
- Boundaries of Canada, Mexico and the United States (1.3)
- All USA cities (1.4)
1.1 Define a Projection
For this lab we want to calculate distances between features, so we need a projection that preserves distance at the scale of CONUS. For this, we will use the North America Equidistant Conic:
eqdc <- '+proj=eqdc +lat_0=40 +lon_0=-96 +lat_1=20 +lat_2=60 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs'
This PROJ.4 string defines an Equidistant Conic projection with the following parameters:
+proj=eqdc → Equidistant Conic projection
+lat_0=40 → Latitude of the projection’s center (40°N)
+lon_0=-96 → Central meridian (96°W)
+lat_1=20 → First standard parallel (20°N)
+lat_2=60 → Second standard parallel (60°N)
+x_0=0 → False easting (0 meters)
+y_0=0 → False northing (0 meters)
+datum=NAD83 → Uses the North American Datum 1983 (NAD83)
+units=m → Units are in meters
+no_defs → No additional default parameters from PROJ’s database
This projection is commonly used for mapping large areas with an east-west extent, especially in North America. Scale is true along every meridian and along the two standard parallels (20°N and 60°N), and distortion stays moderate between them.
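A quick way to check that a PROJ string behaves as its parameter list says is to project a point whose answer is known in advance. The sketch below is illustrative only: the test point is chosen here (it is not part of the lab data), and it assumes the input coordinates are longitude/latitude in EPSG:4326.

```r
library(sf)

# PROJ string copied from the lab text
eqdc <- '+proj=eqdc +lat_0=40 +lon_0=-96 +lat_1=20 +lat_2=60 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs'

# Hypothetical test point placed at the projection origin (96°W, 40°N),
# stored as lon/lat; EPSG:4326 is an assumption made for this sketch
origin <- st_sfc(st_point(c(-96, 40)), crs = 4326)

# Reproject to the equidistant conic
origin_eqdc <- st_transform(origin, eqdc)

# With zero false easting/northing, the origin should land at roughly (0, 0) metres
xy <- st_coordinates(origin_eqdc)
xy

# The input is geographic (degrees); the output is projected (metres)
st_is_longlat(origin)
st_is_longlat(origin_eqdc)
```

Checking st_is_longlat() before any distance calculation is a cheap guard against measuring in degrees by accident.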
1.2 - Get USA state boundaries
In R, USA boundaries are stored in the AOI package. In case this package and data are not installed:
remotes::install_github("mikejohnson51/AOI")
Once installed:
- USA state boundaries can be accessed with aoi_get(state = 'conus').
- Make sure the data is in a projected coordinate system suitable for distance measurements at the national scale (eqdc).
1.3 - Get country boundaries for Mexico, the United States of America, and Canada
In R, country boundaries are stored in the AOI package.
- Country boundaries can be accessed with aoi_get(country = c("MX", "CA", "USA")).
- Make sure the data is in a projected coordinate system suitable for distance measurements at the national scale (eqdc).
1.4 - Get city locations from the CSV file
The process of finding, downloading and accessing data is the first step of every analysis. Here we will go through these steps (minus finding the data).
First go to this site and download the appropriate (free) dataset into the data directory of this project.
Once downloaded, read it into your working session using readr::read_csv() and explore the dataset until you are comfortable with the information it contains.
While this data has everything we want, it is not yet spatial. Convert the data.frame to a spatial object using st_as_sf, specifying the coordinate variables and CRS. (Hint: what coordinate reference system are the raw coordinates in?)
Finally, remove cities in states we do not want (those outside CONUS) and make sure the data is in a projected coordinate system suitable for distance measurements at the national scale.
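The three steps just described (make the table spatial, drop unwanted states, reproject) can be sketched on a toy table. Everything about the data here is an assumption: the column names lng, lat and state_name, the city values, and the choice of EPSG:4326 for the raw coordinates are placeholders, so check them against the file you actually downloaded.

```r
library(sf)
library(dplyr)

eqdc <- '+proj=eqdc +lat_0=40 +lon_0=-96 +lat_1=20 +lat_2=60 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs'

# Hypothetical stand-in for the CSV; names and values are made up
cities_df <- data.frame(
  city       = c("A", "B", "C"),
  state_name = c("Colorado", "Hawaii", "Kansas"),
  lng        = c(-105.0, -157.8, -98.0),
  lat        = c(40.5, 21.3, 38.5)
)

cities_sf <- cities_df |>
  # x (longitude) comes first, then y (latitude); the CRS is assumed
  st_as_sf(coords = c("lng", "lat"), crs = 4326) |>
  # drop states that are not part of the analysis
  filter(!state_name %in% c("Hawaii")) |>
  # move to the projected CRS used for distance work
  st_transform(eqdc)

cities_sf
```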
Congratulations! You now have three real-world, large datasets ready for analysis.
Question 2:
Here we will focus on calculating the distance of each USA city to (1) the national border, (2) the nearest state border, (3) the Mexican border and (4) the Canadian border. You will need to manipulate your existing spatial geometries to do this using either st_union or st_combine, depending on the situation. In all cases, since we are after distances to borders, we will need to cast (st_cast) our MULTIPOLYGON geometries to MULTILINESTRING geometries. To perform these distance calculations we will use st_distance().
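The difference between st_union and st_combine is easiest to see on a tiny example. The sketch below uses two made-up unit squares that share an edge (planar coordinates, no CRS) rather than the lab data; the helper unit_square() is hypothetical.

```r
library(sf)

# Hypothetical helper: a unit square with its lower-left corner at (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two adjacent "states" sharing the edge x = 1
states <- st_sfc(unit_square(0), unit_square(1))

# st_union dissolves the shared edge, leaving only the outer ring
outer_lines <- st_cast(st_union(states), "MULTILINESTRING")

# st_combine groups the geometries without dissolving, so the shared edge survives
all_lines <- st_cast(st_combine(states), "MULTILINESTRING")

# A point just to the right of the shared edge
pt <- st_sfc(st_point(c(1.1, 0.5)))

d_outer <- as.numeric(st_distance(pt, outer_lines))  # nearest outer edge is 0.5 away
d_all   <- as.numeric(st_distance(pt, all_lines))    # shared edge is 0.1 away

c(outer = d_outer, all = d_all)
```

The same point is five times farther from the dissolved outline than from the combined lines, which is why the choice between the two functions matters in the questions that follow.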
2.1 - Distance to USA Border (coastline or national) (km)
For 2.1 we are interested in calculating the distance of each USA city to the USA border (coastline or national border). To do this we need all states to act as a single unit. Convert the USA state boundaries to a MULTILINESTRING geometry in which the state boundaries are resolved (dissolved). Please do this starting with the states object and NOT with a filtered country object. In addition to storing this distance data as part of the cities data.frame, produce a table (flextable) documenting the five cities farthest from the USA border. Include only the city name, state, and distance.
2.2 - Distance to States (km)
For 2.2 we are interested in calculating the distance of each city to the nearest state boundary. To do this we need the states combined into a single geometry whose internal boundaries are kept. Convert the USA state boundaries to a MULTILINESTRING geometry in which the state boundaries are preserved (not resolved). In addition to storing this distance data as part of the cities data.frame, produce a table (flextable) documenting the five cities farthest from a state border. Include only the city name, state, and distance.
2.3 - Distance to Mexico (km)
For 2.3 we are interested in calculating the distance of each city to the Mexican border. To do this we need to isolate Mexico from the country objects. In addition to storing this data as part of the cities data.frame, produce a table (flextable) documenting the five cities farthest from the Mexican border. Include only the city name, state, and distance.
2.4 - Distance to Canada (km)
For 2.4 we are interested in calculating the distance of each city to the Canadian border. To do this we need to isolate Canada from the country objects. In addition to storing this data as part of the cities data.frame, produce a table (flextable) documenting the five cities farthest from the Canadian border. Include only the city name, state, and distance.
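Questions 2.1 through 2.4 share one pattern: measure, convert to kilometers, store the result on the cities object, then tabulate the largest values. A minimal sketch of that pattern follows, using a made-up straight-line "border" and three made-up cities placed directly in projected meters; the column name dist_border_km is an assumption, not a required name.

```r
library(sf)
library(dplyr)
library(units)

eqdc <- '+proj=eqdc +lat_0=40 +lon_0=-96 +lat_1=20 +lat_2=60 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs'

# Hypothetical border: a straight line along y = 0 (projected metres)
border <- st_sfc(st_linestring(rbind(c(-5e5, 0), c(5e5, 0))), crs = eqdc)

# Hypothetical cities at known offsets north of that line
cities <- st_sf(
  city       = c("A", "B", "C"),
  state_name = c("X", "Y", "Z"),
  geometry   = st_sfc(
    st_point(c(0, 10000)),
    st_point(c(1000, 250000)),
    st_point(c(-2000, 50000)),
    crs = eqdc
  )
)

# st_distance returns a units matrix (metres here): one row per city,
# one column per border geometry
d_km <- set_units(st_distance(cities, border), km)

# Store the distances as plain numbers in kilometres
cities$dist_border_km <- as.numeric(d_km[, 1])

# The farthest cities (top 2 here; the lab asks for 5), as a plain table
top <- cities |>
  st_drop_geometry() |>
  slice_max(dist_border_km, n = 2) |>
  select(city, state_name, dist_border_km)

ft <- flextable::flextable(top)
```

Dropping the geometry before building the table keeps the geometry column out of the rendered output.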
Question 3:
In this section we will focus on visualizing the distance data you calculated above. You will be using ggplot to make your maps, ggrepel to label significant features, and gghighlight to emphasize important criteria.
3.1 Data
Show the 3 countries, CONUS outline, state boundaries, and 10 largest USA cities (by population) on a single map
- Use geom_sf to plot your layers
- Use lty to change the line type and size (linewidth in ggplot2 3.4.0 and later) to change line width
- Use ggrepel::geom_label_repel to label your cities
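The layering described in these bullets might look like the sketch below. The geometries are made-up squares and points rather than the lab data, the helper unit_square() is hypothetical, and linewidth assumes ggplot2 3.4.0 or later (older versions use size for line width).

```r
library(sf)
library(dplyr)
library(ggplot2)
library(ggrepel)

# Hypothetical helper: a unit square with its lower-left corner at (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Made-up layers: two "states", their dissolved outline, three "cities"
states  <- st_sfc(unit_square(0), unit_square(1))
outline <- st_union(states)
cities  <- st_sf(
  city       = c("A", "B", "C"),
  population = c(500, 9000, 3000),
  geometry   = st_sfc(st_point(c(0.3, 0.4)), st_point(c(1.2, 0.7)), st_point(c(1.8, 0.2)))
)

# Keep only the largest cities for labeling (top 2 here; the lab asks for 10)
biggest <- slice_max(cities, population, n = 2)

p <- ggplot() +
  geom_sf(data = states, fill = NA, lty = 3, linewidth = 0.3) +  # dotted, thin
  geom_sf(data = outline, fill = NA, lty = 1, linewidth = 1) +   # solid, thick
  geom_sf(data = biggest, color = "red") +
  geom_label_repel(
    data = biggest,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates"  # turns point geometries into x/y positions for the labels
  ) +
  theme_void()
```

Each geom_sf call takes its own data argument, so layers with different geometry types can be stacked in one plot.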
3.2 City Distance from the Border
Create a map that colors USA cities by their distance from the national border. In addition, re-draw and label the 5 cities that are farthest from the border.
3.3 City Distance from Nearest State
Create a map that colors USA cities by their distance from the nearest state border. In addition, re-draw and label the 5 cities that are farthest from any state border.
3.4 Equidistance boundary from Mexico and Canada
Here we provide a little more challenge. Use gghighlight to identify the cities that are equidistant from the Canadian AND Mexican borders, ± 100 km.
In addition, label the five (5) most populous cities in this zone.
Hint: create a new variable that holds the absolute difference between the distance to Mexico and the distance to Canada.
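The hint can be sketched as follows. The cities, their coordinates, and the distance columns dist_mexico and dist_canada (in km) are all made up for illustration, and the plot assumes a gghighlight version that supports geom_sf layers.

```r
library(sf)
library(dplyr)
library(ggplot2)
library(gghighlight)

# Made-up cities with made-up distances (km) to each border
cities <- st_sf(
  city        = c("A", "B", "C", "D"),
  dist_mexico = c(900, 400, 1500, 1000),
  dist_canada = c(950, 1600, 300, 1080),
  geometry    = st_sfc(
    st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(0, 1)), st_point(c(1, 1))
  )
)

# Absolute difference between the two distances
cities <- cities |>
  mutate(diff_km = abs(dist_mexico - dist_canada))

# Highlight cities whose two distances differ by no more than 100 km;
# all other cities are drawn in a muted color
p <- ggplot(cities) +
  geom_sf(aes(color = diff_km)) +
  gghighlight(diff_km <= 100)
```

In this toy table only cities A and D satisfy the predicate, so they are the ones that stay highlighted.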
Question 4:
Real World Application
Recently, Federal Agencies have claimed basic constitutional rights protected by the Fourth Amendment (protecting Americans from random and arbitrary stops and searches) do not apply fully at our borders (see Portland). For example, federal authorities do not need a warrant or suspicion of wrongdoing to justify conducting what courts have called a “routine search,” such as searching luggage or a vehicle. Specifically, federal regulations give U.S. Customs and Border Protection (CBP) authority to operate within 100 miles of any U.S. “external boundary”. Further information can be found at this ACLU article.
4.1 Quantifying Border Zone
- How many cities are in this 100 mile zone? (100 miles ~ 160 kilometers)
- How many people live in a city within 100 miles of the border?
- What percentage of the total population is in this zone?
- Does it match the ACLU estimate in the link above?
Report this information as a table.
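One way to assemble the numbers for such a table is a single summarise call. The sketch below uses a made-up four-row table; the column names population and dist_border_km are assumptions, and the threshold is converted from miles with the units package rather than rounded to 160.

```r
library(dplyr)
library(units)

# 100 miles expressed in kilometres
threshold_km <- set_units(100, mi) |> set_units(km) |> drop_units()

# Made-up cities: population and distance to the national border (km)
cities <- tibble(
  city           = c("A", "B", "C", "D"),
  population     = c(1000, 2000, 3000, 4000),
  dist_border_km = c(50, 160, 200, 500)
)

zone <- cities |>
  summarise(
    n_cities    = sum(dist_border_km <= threshold_km),
    pop_in_zone = sum(population[dist_border_km <= threshold_km]),
    total_pop   = sum(population),
    pct_in_zone = 100 * pop_in_zone / total_pop
  )

zone_table <- knitr::kable(zone, caption = "Cities within 100 miles of the border (toy data)")
```

With real data, cities would first have its geometry dropped (st_drop_geometry) so the summary is an ordinary table.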
4.2 Mapping Border Zone
- Make a map highlighting the cities within the 100 mile zone using gghighlight.
- Use a color gradient from ‘orange’ to ‘darkred’.
- Label the 10 most populous cities in the Danger Zone
4.3 : Instead of labeling the 10 most populous cities, label the most populous city in each state within the Danger Zone.
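Picking one city per state is a grouped version of the top-n selection. A minimal sketch on made-up data follows; the column names and the 160 km cutoff are assumptions carried over from 4.1.

```r
library(dplyr)

# Made-up cities with state, population, and distance to the border (km)
cities <- tibble(
  city           = c("A", "B", "C", "D", "E"),
  state_name     = c("X", "X", "Y", "Y", "Z"),
  population     = c(100, 900, 500, 300, 700),
  dist_border_km = c(20, 80, 150, 40, 400)
)

# Most populous city per state among those inside the zone
top_by_state <- cities |>
  filter(dist_border_km <= 160) |>
  group_by(state_name) |>
  slice_max(population, n = 1, with_ties = FALSE) |>
  ungroup()

top_by_state
```

The resulting rows can be passed as the data argument of the labeling layer, so only one label appears per state.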
Rubric
Total: 150 points
Submission
For this lab you will submit a URL to a webpage deployed with GitHub pages.
To do this:
- Render your lab document
- Stage/commit/push your files
Submit this URL in the appropriate Canvas dropbox. Also take a moment to update your personal webpage with this link and some bullet points of what you learned. While not graded as part of this lab, it will be your final!