Analysis of the U.S. Electoral College system

The Electoral College, established by Article II, Section 1 of the U.S. Constitution, serves as the mechanism for electing the President of the United States. Each state is granted a number of electors equal to its representation in Congress—comprising both Senators and Representatives. Over time, the states have developed their own methods for allocating these electoral votes (ECVs), often opting for a “winner-takes-all” system based on the state’s popular vote. Notably, Maine and Nebraska allocate their ECVs differently, by congressional district.

In this mini-project, we examine the impact of alternative electoral allocation methods on U.S. presidential election outcomes, specifically investigating whether different methods could affect perceived biases in the Electoral College system. Our analysis will simulate alternative ECV allocations using historical voting data, assessing how proportional or district-based allocations might alter electoral outcomes.

Data Set-Up and Initial Exploration

To prepare the data, I wrote R code that automates the download of U.S. congressional shapefiles¹ from 1976 to 2022, organized into two main tasks. For 1976-2012, the code downloads files from the UCLA Political Science website, using a systematic naming convention to save each file in a designated “shapefiles” directory. For 2014-2022, it pulls files from the U.S. Census Bureau² and saves them in a separate “census_shapefiles” directory. The code checks for existing files locally to prevent redundant downloads, optimizing server load and storage efficiency. Additionally, I manually downloaded U.S. House election vote data³ and statewide presidential vote⁴ counts from 1976 to 2022 from the MIT Election Data Science Lab to incorporate into the project.

Show the code

# Load necessary packages
if(!require("sf")) install.packages("sf")
library(sf)
if(!require("tigris")) install.packages("tigris")
library(tigris)
if(!require("gganimate")) install.packages("gganimate")
library(gganimate)
if(!require("usmap")) install.packages("usmap")
library(usmap)
library(httr)
library(tidyverse)
library(dplyr)
library(ggplot2)
library(gt)
library(sf)
library(stringr)


# Define file paths
president_data_path <- "C:\\Users\\w3038\\Downloads\\STA 9750\\1976-2020-president.csv"
house_data_path <- "C:\\Users\\w3038\\Downloads\\STA 9750\\1976-2022-house.csv"

# Load the datasets
president_data <- read.csv(president_data_path)
house_data <- read.csv(house_data_path)
# Display a preview of the president data in a gt table
president_data %>%
  slice_head(n = 10) %>%  # Display the first 10 rows
  gt() %>%
  tab_header(
    title = "President Data (1976-2020)"
  ) %>%
  cols_label(
    # Customize column labels here if needed, for example:
    year = "Year",
    state = "State",
    candidatevotes = "Candidate Votes"
    # Add other columns as desired
  ) %>%
  fmt_number(
    columns = where(is.numeric),
    decimals = 0
  )

President Data (1976-2020)
Year	State	state_po	state_fips	state_cen	state_ic	office	candidate	party_detailed	writein	Candidate Votes	totalvotes	version	notes	party_simplified
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	CARTER, JIMMY	DEMOCRAT	FALSE	659,170	1,182,850	20,210,113	NA	DEMOCRAT
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	FORD, GERALD	REPUBLICAN	FALSE	504,070	1,182,850	20,210,113	NA	REPUBLICAN
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	MADDOX, LESTER	AMERICAN INDEPENDENT PARTY	FALSE	9,198	1,182,850	20,210,113	NA	OTHER
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	BUBAR, BENJAMIN ""BEN""	PROHIBITION	FALSE	6,669	1,182,850	20,210,113	NA	OTHER
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	HALL, GUS	COMMUNIST PARTY USE	FALSE	1,954	1,182,850	20,210,113	NA	OTHER
1,976	ALABAMA	AL	1	63	41	US PRESIDENT	MACBRIDE, ROGER	LIBERTARIAN	FALSE	1,481	1,182,850	20,210,113	NA	LIBERTARIAN
1,976	ALABAMA	AL	1	63	41	US PRESIDENT			TRUE	308	1,182,850	20,210,113	NA	OTHER
1,976	ALASKA	AK	2	94	81	US PRESIDENT	FORD, GERALD	REPUBLICAN	FALSE	71,555	123,574	20,210,113	NA	REPUBLICAN
1,976	ALASKA	AK	2	94	81	US PRESIDENT	CARTER, JIMMY	DEMOCRAT	FALSE	44,058	123,574	20,210,113	NA	DEMOCRAT
1,976	ALASKA	AK	2	94	81	US PRESIDENT	MACBRIDE, ROGER	LIBERTARIAN	FALSE	6,785	123,574	20,210,113	NA	LIBERTARIAN

Show the code

# Display a preview of the house data in a gt table
house_data %>%
  slice_head(n = 10) %>%  # Display the first 10 rows
  gt() %>%
  tab_header(
    title = "House Data (1976-2022)"
    ) %>%
  cols_label(
    # Customize column labels here if needed, for example:
    year = "Year",
    state = "State",
    candidatevotes = "Candidate Votes"
    # Add other columns as desired
  ) %>%
  fmt_number(
    columns = where(is.numeric),
    decimals = 0
  )

House Data (1976-2022)
Year	State	state_po	state_fips	state_cen	state_ic	office	district	stage	runoff	special	candidate	party	writein	mode	Candidate Votes	totalvotes	unofficial	version	fusion_ticket
1,976	ALABAMA	AL	1	63	41	US HOUSE	1	GEN	FALSE	FALSE	BILL DAVENPORT	DEMOCRAT	FALSE	TOTAL	58,906	157,170	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	1	GEN	FALSE	FALSE	JACK EDWARDS	REPUBLICAN	FALSE	TOTAL	98,257	157,170	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	1	GEN	FALSE	FALSE	WRITEIN		TRUE	TOTAL	7	157,170	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	2	GEN	FALSE	FALSE	J CAROLE KEAHEY	DEMOCRAT	FALSE	TOTAL	66,288	156,362	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	2	GEN	FALSE	FALSE	WILLIAM L "BILL" DICKINSON	REPUBLICAN	FALSE	TOTAL	90,069	156,362	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	2	GEN	FALSE	FALSE	WRITEIN		TRUE	TOTAL	5	156,362	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	3	GEN	FALSE	FALSE	BILL NICHOLS	DEMOCRAT	FALSE	TOTAL	106,935	108,048	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	3	GEN	FALSE	FALSE	OGBURN GARDNER	PROHIBITION	FALSE	TOTAL	1,111	108,048	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	3	GEN	FALSE	FALSE	WRITEIN		TRUE	TOTAL	2	108,048	FALSE	20,230,706	FALSE
1,976	ALABAMA	AL	1	63	41	US HOUSE	4	GEN	FALSE	FALSE	LEONARD WILSON	REPUBLICAN	FALSE	TOTAL	34,531	176,022	FALSE	20,230,706	FALSE

Show the code

#UCLA Shapefile

# Create directory for shapefiles with recursive creation
dir.create("data/shapefiles", showWarnings = FALSE, recursive = TRUE)

# Function to download shapefiles systematically
download_shapefiles <- function(start, end, base_url) {
  for (session in start:end) {
    file_name <- paste0("districts", sprintf("%03d", session), ".zip")
    url <- paste0(base_url, file_name)
    destfile <- file.path("data/shapefiles", file_name)
    
    if (!file.exists(destfile)) {
      GET(url, write_disk(destfile, overwrite = TRUE))
      message(paste("Downloaded:", file_name))
    } else {
      message(paste("File already exists:", file_name))
    }
  }
}

# Define the base URL and download the files
base_url <- "https://cdmaps.polisci.ucla.edu/shp/"
download_shapefiles(93, 112, base_url)


#Census

# Create directory for shapefiles if it doesn't exist
dir.create("data/census_shapefiles", showWarnings = FALSE, recursive = TRUE)

# List of download URLs and file names for the shapefiles
shapefile_urls <- c(
  "https://www2.census.gov/geo/tiger/TIGER2014/CD/tl_2014_us_cd114.zip",
  "https://www2.census.gov/geo/tiger/TIGER2015/CD/tl_2015_us_cd114.zip",
  "https://www2.census.gov/geo/tiger/TIGER2016/CD/tl_2016_us_cd115.zip",
  "https://www2.census.gov/geo/tiger/TIGER2017/CD/tl_2017_us_cd115.zip",
  "https://www2.census.gov/geo/tiger/TIGER2018/CD/tl_2018_us_cd116.zip",
  "https://www2.census.gov/geo/tiger/TIGER2019/CD/tl_2019_us_cd116.zip",
  "https://www2.census.gov/geo/tiger/TIGER2020/CD/tl_2020_us_cd116.zip",
  "https://www2.census.gov/geo/tiger/TIGER2021/CD/tl_2021_us_cd116.zip",
  "https://www2.census.gov/geo/tiger/TIGER2022/CD/tl_2022_us_cd116.zip"
)

# Function to download shapefiles systematically
download_shapefiles_census <- function(urls, dest_dir) {
  for (url in urls) {
    file_name <- basename(url)
    destfile <- file.path(dest_dir, file_name)
    
    # Download file only if it doesn't already exist
    if (!file.exists(destfile)) {
      GET(url, write_disk(destfile, overwrite = TRUE))
      message(paste("Downloaded:", file_name))
    } else {
      message(paste("File already exists:", file_name))
    }
  }
}

# Call the download function
download_shapefiles_census(shapefile_urls, "data/census_shapefiles")

Initial Exploration of Vote Count Data

To analyze which states gained or lost the most seats in the U.S. House of Representatives between 1976 and 2022, I first filtered the House data to include only those years. Then, I grouped the data by year and state to calculate the number of seats per state. After determining the seat changes, I identified the top 5 states with the most significant gains and lost as follows:

Show the code

#3.1 Which states have gained and lost the most seats in the US House of Representatives between 1976 and 2022?

# Filter data for relevant years (1976 and 2022)
house_seats <- house_data %>%
  filter(year %in% c(1976, 2022)) %>%
  distinct(year, state, district) %>%
  group_by(year, state) %>%
  summarise(seat_count = n(), .groups = "drop")

# Calculate seat changes
seat_changes <- house_seats %>%
  pivot_wider(names_from = year, values_from = seat_count, names_prefix = "year_") %>%
  mutate(seat_change = year_2022 - year_1976) %>%
  arrange(desc(seat_change))

# Separate states with the highest seat changes
top_gained <- seat_changes %>% top_n(5, seat_change)
top_lost <- seat_changes %>% top_n(-5, seat_change)

# Plot for top gained seats
gained_plot <- ggplot(top_gained, aes(x = reorder(state, seat_change), y = seat_change, fill = seat_change)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 5 States with Most Gained Seats (1976-2022)",
       x = "State",
       y = "Number of Seats Gained") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  theme_minimal()

# Plot for top lost seats
lost_plot <- ggplot(top_lost, aes(x = reorder(state, seat_change), y = seat_change, fill = -seat_change)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 5 States with Most Lost Seats (1976-2022)",
       x = "State",
       y = "Number of Seats Lost") +
  scale_fill_gradient(low = "lightcoral", high = "darkred") +
  theme_minimal()

# Display plots
list(gained_plot, lost_plot)

[[1]]


[[2]]

In New York State, the “fusion” voting system allows a candidate to appear on multiple party lines, with their votes across all lines being totaled. This system can sometimes affect the outcome of elections. The goal of this analysis is to determine if the outcome of any U.S. House elections in New York would have changed if the fusion system had not been used, meaning candidates would only receive votes from their major party lines (Democrat or Republican).

This code analyzes U.S. House elections in New York State to identify instances where the fusion voting system might have changed the election outcome. It filters the data for New York House races, focusing on candidates from the Democrat and Republican parties. The code calculates total votes across all lines (fusion) and major party votes only (non-fusion) for each candidate. It then determines the winner in both scenarios and compares them to highlight elections where the fusion system resulted in a different winner. The results are displayed in a formatted table using the gt package.

Show the code

library(gt)
library(dplyr)
#3.2

# Filter for New York State and U.S. House elections
ny_house_data <- house_data %>%
  filter(state == "NEW YORK", office == "US HOUSE")

# Filter for candidates from major parties only (Democrat or Republican)
# and keep records for candidates with votes across different lines
fusion_analysis <- ny_house_data %>%
  mutate(is_major_party = party %in% c("DEMOCRAT", "REPUBLICAN")) %>%
  group_by(year, district, candidate) %>%
  summarise(
    total_votes_all_lines = sum(candidatevotes),
    major_party_votes = sum(candidatevotes[is_major_party]),
    .groups = "drop"
  )

# Determine the election winner by both fusion and non-fusion scenarios
# Winner with fusion system (total across all lines)
winners_fusion <- fusion_analysis %>%
  group_by(year, district) %>%
  filter(total_votes_all_lines == max(total_votes_all_lines)) %>%
  select(year, district, candidate, total_votes_all_lines) %>%
  rename(fusion_winner = candidate, fusion_votes = total_votes_all_lines)

# Winner without fusion system (major party line votes only)
winners_nonfusion <- fusion_analysis %>%
  group_by(year, district) %>%
  filter(major_party_votes == max(major_party_votes)) %>%
  select(year, district, candidate, major_party_votes) %>%
  rename(nonfusion_winner = candidate, nonfusion_votes = major_party_votes)

# Compare outcomes
election_outcomes <- winners_fusion %>%
  inner_join(winners_nonfusion, by = c("year", "district")) %>%
  filter(fusion_winner != nonfusion_winner)

# Create the gt table with updated styles
election_outcomes_table <- election_outcomes %>%
  gt() %>%
  tab_header(
    title = "Elections Affected by Fusion Voting System",
    subtitle = "Comparison of Winners with and without Fusion Voting"
  ) %>%
  cols_label(
    year = "Year",
    district = "District",
    fusion_winner = "Winner with Fusion Voting",
    fusion_votes = "Votes (Fusion)",
    nonfusion_winner = "Winner without Fusion Voting",
    nonfusion_votes = "Votes (Non-Fusion)"
  ) %>%
  fmt_number(
    columns = c(fusion_votes, nonfusion_votes),
    decimals = 0
  ) %>%
  tab_options(
    table.font.size = px(14),
    heading.align = "center"
  )

# Display the table in Quarto environment
election_outcomes_table

Elections Affected by Fusion Voting System
Comparison of Winners with and without Fusion Voting
Winner with Fusion Voting	Votes (Fusion)	Winner without Fusion Voting	Votes (Non-Fusion)
1976 - 29
EDWARD W PATTISON	100,663	JOSEPH A MARTINO	96,476
1980 - 3
GREGORY W CARMAN	87,952	JEROME A AMBRO JR	75,389
1980 - 6
JOHN LEBOUTILLIER	89,762	LESTER L WOLFF	74,319
1984 - 20
JOSEPH J DIOGUARDI	106,958	OREN J TEICHER	102,842
1986 - 27
GEORGE C WORTLEY	83,430	ROSEMARY S POOLER	81,133
1992 - 3
PETER T KING	124,727	STEVE A ORLINS	116,915
1994 - 1
MICHAEL P FORBES	90,491	GEORGE J HOCHBRUECKNER	78,692
1996 - 1
MICHAEL P FORBES	116,620	NORA L BREDES	93,816
1996 - 30
JACK QUINN	121,369	FRANCIS J PORDUM	97,686
2006 - 25
JAMES T WALSH	110,525	DAN MAFFEI	100,605
2006 - 29
JOHN R "RANDY" KUHL JR	106,077	ERIC J MASSA	94,609
2010 - 13
MICHAEL G GRIMM	65,024	MICHAEL E MCMAHON	60,773
2010 - 19
NAN HAYMORTH	109,956	JOHN J HALL	98,766
2010 - 24
RICHARD L HANNA	101,599	MICHAEL A ARCURI	89,809
2010 - 25
ANN MARIE BUERKLE	104,602	DANIEL B MAFFEI	103,954
2012 - 27
CHRIS COLLINS	161,220	KATHLEEN C HOCHUL	140,008
2018 - 1
LEE M ZELDIN	139,027	PERRY GERSHON	124,213
2018 - 24
JOHN M KATKO	136,920	DANA BALTER	115,902
2018 - 27
CHRIS COLLINS	140,146	NATHAN D MCMURRAY	128,167
2022 - 4
ANTHONY P D’ESPOSITO	140,622	LAURA A GILLEN	130,871
2022 - 17
MICHAEL V LAWLER	143,550	SEAN PATRICK MALONEY	133,457
2022 - 22
BRANDON M WILLIAMS	135,544	FRANCIS CONOLE	132,913

In exploring U.S. voting trends from 1976 to 2012, the code examines whether presidential candidates typically receive more votes than their party’s congressional candidates within the same state. It filters and aggregates vote data for both presidential and congressional races by year, then calculates the percentage difference between the total votes for presidential and congressional candidates. The results are presented in a formatted table using the gt package, and a line plot visualizes the vote percentage difference over time. The analysis reveals a general trend of decreasing presidential vote share relative to congressional vote share over the years, highlighting variations in the relationship between presidential and congressional vote shares across different election years and states.

Show the code

#3.3
# Filter and aggregate presidential votes
presidential_votes <- president_data %>%
  filter(office == "US PRESIDENT", year >= 1976, year <= 2012) %>%
  group_by(year) %>%
  summarise(total_president_votes = sum(candidatevotes), .groups = "drop")

# Filter and aggregate congressional votes
congressional_votes <- house_data %>%
  filter(office == "US HOUSE", year >= 1976, year <= 2012) %>%
  group_by(year) %>%
  summarise(total_congress_votes = sum(candidatevotes), .groups = "drop")

# Combine the datasets
vote_comparison <- presidential_votes %>%
  left_join(congressional_votes, by = "year")

# Calculate vote percentage difference
vote_comparison <- vote_comparison %>%
  mutate(vote_percentage_difference = 
           (total_president_votes - total_congress_votes) / total_congress_votes)

# Create a gt table
vote_table <- vote_comparison %>%
  gt() %>%
  tab_header(
    title = "Total Votes for Presidential and Congressional Candidates (1976 - 2012)"
  ) %>%
  cols_label(
    year = "Year",
    total_president_votes = "Total Presidential Votes",
    total_congress_votes = "Total Congressional Votes",
    vote_percentage_difference = "Vote Percentage Difference (%)"
  ) %>%
  fmt_number(
    columns = c(total_president_votes, total_congress_votes), 
    decimals = 0
  ) %>%
  fmt_percent(
    columns = vote_percentage_difference,
    decimals = 1
  )

# Plotting the vote percentage difference over the years
ggplot(vote_comparison, aes(x = year, y = vote_percentage_difference)) +
  geom_line(color = "blue", size = 1) +
  geom_point(color = "blue", size = 2) +
  labs(title = "Percentage Difference Between Presidential and Congressional Votes (1976 - 2012)",
       x = "Year",
       y = "Vote Percentage Difference (%)") +
  theme_minimal()

Show the code

# Display the gt table
vote_table

Total Votes for Presidential and Congressional Candidates (1976 - 2012)
Year	Total Presidential Votes	Total Congressional Votes	Vote Percentage Difference (%)
1976	81,601,344	74,259,171	9.9%
1980	86,496,851	77,873,913	11.1%
1984	92,654,861	82,421,874	12.4%
1988	91,586,825	81,682,171	12.1%
1992	104,599,780	97,281,410	7.5%
1996	96,389,818	90,745,365	6.2%
2000	105,593,982	98,799,965	6.9%
2004	122,349,450	113,191,293	8.1%
2008	131,419,253	122,586,298	7.2%
2012	129,139,997	122,345,021	5.6%

I also analyze voting patterns for presidential and congressional candidates by party from 1976 to 2012. The code aggregates the vote totals by year and party for both presidential and congressional races, then merges the datasets to facilitate comparison. The data is reshaped into a long format for easier visualization, and a stacked area chart is created to display the number of votes for each party in both presidential and congressional elections over time. The chart highlights a consistent trend of presidential candidates receiving more votes than congressional candidates, helping to identify shifts in party performance across years and races.

Show the code

#3.3
# Aggregate votes by party for presidential and congressional candidates
party_votes <- president_data %>%
  filter(office == "US PRESIDENT", year >= 1976, year <= 2012) %>%
  group_by(year, party_simplified) %>%
  summarise(president_votes = sum(candidatevotes), .groups = "drop") %>%
  rename(party = party_simplified)

congressional_party_votes <- house_data %>%
  filter(office == "US HOUSE", year >= 1976, year <= 2012) %>%
  group_by(year, party) %>%
  summarise(congress_votes = sum(candidatevotes), .groups = "drop")

# Combine the datasets
party_vote_comparison <- party_votes %>%
  left_join(congressional_party_votes, by = c("year", "party"))

# Reshape data to long format
party_vote_long <- party_vote_comparison %>%
  pivot_longer(cols = c(president_votes, congress_votes), 
               names_to = "vote_type", 
               values_to = "votes")

# Plotting stacked area graph by party
ggplot(party_vote_long, aes(x = year, y = votes, fill = party)) +
  geom_area(position = "stack") +
  facet_wrap(~ vote_type) +
  labs(title = "Votes for Presidential vs. Congressional Candidates by Party (1976 - 2012)",
       x = "Year",
       y = "Number of Votes",
       fill = "Party") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set1") +
  theme(legend.position = "top")

Additionally, I analyze the total votes by state in presidential and congressional elections from 1976 to 2012. The code aggregates vote totals by year and state for both races, then merges the datasets for comparison. The data is reshaped into a long format for better visualization, and a grouped bar chart is created to display the total votes for presidential and congressional candidates by state. The chart highlights that states with larger populations, such as California and New York, have higher vote totals, with presidential candidates consistently receiving more votes than congressional candidates in these states.

Show the code

#3.3
# Aggregate votes by state for presidential and congressional candidates
state_vote_comparison <- president_data %>%
  filter(office == "US PRESIDENT", year >= 1976, year <= 2012) %>%
  group_by(state, year) %>%
  summarise(presidential_votes = sum(candidatevotes), .groups = "drop")

congressional_state_votes <- house_data %>%
  filter(office == "US HOUSE", year >= 1976, year <= 2012) %>%
  group_by(state, year) %>%
  summarise(congressional_votes = sum(candidatevotes), .groups = "drop")

# Combine the datasets
state_votes_combined <- state_vote_comparison %>%
  left_join(congressional_state_votes, by = c("state", "year"))

# Reshape data to long format for ggplot
state_votes_long <- state_votes_combined %>%
  pivot_longer(cols = c(presidential_votes, congressional_votes), 
               names_to = "vote_type", 
               values_to = "votes")

# Plotting grouped bar chart for votes by state
ggplot(state_votes_long, aes(x = reorder(state, -votes), y = votes, fill = vote_type)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Total Votes by State: Presidential vs. Congressional (1976 - 2012)",
       x = "State",
       y = "Total Votes",
       fill = "Vote Type") +
  theme_minimal() +
  coord_flip() + # Flip coordinates for better visibility
  scale_fill_manual(values = c("presidential_votes" = "blue", "congressional_votes" = "orange"))

Importing and Plotting Shape File Data

Next, I extract and read shapefiles from zip archives using R’s sf library, which is designed for handling spatial data in visualization and analysis. The code defines two functions: read_ucla_shapefiles and read_census_shapefiles. The first function extracts and reads all .shp files from zip archives in a specified directory, while the second function focuses on specific Census shapefiles based on a list of filenames. Both functions utilize sf::st_read to load the shapefiles and handle any errors, returning a list of successfully read shapefiles for further analysis.

Reading Data I⁵

Show the code

# Function to read UCLA shapefiles into R

read_ucla_shapefiles <- function(directory) {

  shapefiles <- list.files(directory, pattern = "\\.zip$", full.names = TRUE)

  results <- list()

  

  for (zip_file in shapefiles) {

    unzip_dir <- tempdir()

    unzip(zip_file, exdir = unzip_dir)

    

    shp_files <- list.files(unzip_dir, pattern = "\\.shp$", full.names = TRUE, recursive = TRUE)

    

    for (shp_file in shp_files) {

      tryCatch({

        sf_object <- sf::st_read(shp_file, quiet= TRUE)

        results[[basename(shp_file)]] <- sf_object

      }, error = function(e) {

        message(paste("Error reading file:", shp_file, ":", e$message))

      })

    }

  }

  

  return(results)

}

 # Define the shapefile directory
ucla_dir <- "data/shapefiles"

# Read the downloaded UCLA shapefiles into R

ucla_shapefiles <- read_ucla_shapefiles(ucla_dir)

Reading data II

Show the code

# Function to read specific Census shapefiles into R

# Function to read specific Census shapefiles into R
read_census_shapefiles <- function(directory, filenames) {
  shapefiles <- list.files(directory, pattern = "\\.zip$", full.names = TRUE)
  results <- list()
  
  for (zip_file in shapefiles) {
    unzip_dir <- tempdir()
    unzip(zip_file, exdir = unzip_dir)
    
    shp_files <- list.files(unzip_dir, pattern = "\\.shp$", full.names = TRUE, recursive = TRUE)
    
    for (shp_file in shp_files) {
      # Check if the shapefile matches the desired filenames
      if (basename(shp_file) %in% filenames) {
        tryCatch({
          sf_object <- sf::st_read(shp_file, quiet = TRUE)
          results[[basename(shp_file)]] <- sf_object
        }, error = function(e) {
          message(paste("Error reading file:", shp_file, ":", e$message))
        })
      }
    }
  }
  
  return(results)
}

# List of specific filenames to read
desired_filenames <- c(
  "tl_2014_us_cd114.shp",
  "tl_2015_us_cd114.shp",
  "tl_2016_us_cd115.shp",
  "tl_2017_us_cd115.shp",
  "tl_2018_us_cd116.shp",
  "tl_2019_us_cd116.shp",
  "tl_2020_us_cd116.shp",
  "tl_2021_us_cd116.shp",
  "tl_2022_us_cd116.shp"
)

# Define the Census shapefile directory
census_dir <- "data/census_shapefiles"

# Read the downloaded Census shapefiles into R
census_shapefiles <- read_census_shapefiles(census_dir, desired_filenames)

Using the data previously downloaded, here is the choropleth visualization of the 2000 U.S. Presidential Election Electoral College results.

Show the code

#Task5
# Step 1: Prepare Congressional District Data
CD_106 <- ucla_shapefiles[["districts106.shp"]] %>%
  mutate(STATENAME = tolower(STATENAME)) %>%
  select(-geometry) %>%
  as.data.frame() %>%
  group_by(STATENAME) %>%
  summarize(EC = n() + 2)  # Electoral votes per state (Districts + 2 for Senate)

# Step 2: Get vote results for the year 2000
vote_results_2000 <- president_data %>%
  filter(year == 2000) %>%
  mutate(state = tolower(state)) %>%
  group_by(state) %>%
  summarize(
    winning_party = party_simplified[which.max(candidatevotes)],
    total_votes = max(totalvotes)
  ) %>%
  ungroup()

# Step 3: Special case adjustments for Minnesota & Vermont
vote_results_2000 <- vote_results_2000 %>%
  mutate(
    winning_party = ifelse(state == "minnesota" & winning_party == "DEMOCRATIC-FARMER-LABOR", "DEMOCRAT", winning_party),
    winning_party = ifelse(state == "vermont", "DEMOCRAT", winning_party)
  )

# Step 4: Combine EC with winning party data
EC_2000 <- CD_106 %>%
  left_join(vote_results_2000, by = c("STATENAME" = "state")) %>%
  mutate(
    winning_party = if_else(STATENAME == "district of columbia" & is.na(winning_party), "DEMOCRAT", winning_party)
  ) %>%
  select(STATENAME, winning_party, EC)

# Step 5: Rename columns to match `usmap` requirements
EC_2000 <- EC_2000 %>%
  rename(state = STATENAME)

# Step 6: Add state abbreviations to the EC_2000 data
state_abbreviations <- data.frame(
  state = tolower(c("alabama", "alaska", "arizona", "arkansas", "california", "colorado", "connecticut", "delaware", "florida", "georgia", "hawaii", "idaho", "illinois", "indiana", "iowa", "kansas", "kentucky", "louisiana", "maine", "maryland", "massachusetts", "michigan", "minnesota", "mississippi", "missouri", "montana", "nebraska", "nevada", "new hampshire", "new jersey", "new mexico", "new york", "north carolina", "north dakota", "ohio", "oklahoma", "oregon", "pennsylvania", "rhode island", "south carolina", "south dakota", "tennessee", "texas", "utah", "vermont", "virginia", "washington", "west virginia", "wisconsin", "wyoming", "district of columbia")),
  abbreviation = c("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY", "DC")
)

# Step 7: Merge state abbreviations with EC_2000 data
EC_2000 <- left_join(EC_2000, state_abbreviations, by = "state") %>%
  rename(Abbreviation = abbreviation)  # Rename 'abbreviation' to 'Abbreviation'

# Step 8: Get the US map data (this provides state geometries)
us_states <- usmap::us_map(regions = "states")  # Correct way to get state geometries

# Step 9: Rename 'abbreviation' to 'abbr' in EC_2000 to match the column name in us_states
EC_2000 <- EC_2000 %>%
  rename(abbr = Abbreviation)

# Ensure map_data_cleaned is a data frame before joining
map_data_with_geom <- EC_2000 %>%
  left_join(us_states, by = c("abbr" = "abbr")) %>%
  select(everything(), geom)

# Rename 'geom' to 'geometry' for consistency
map_data_with_geom <- map_data_with_geom %>%
  rename(geometry = geom)

# Convert to Simple Features (sf) object for plotting
map_data_with_geom_sf <- st_as_sf(map_data_with_geom)

# Plot using ggplot2
ggplot(map_data_with_geom_sf) +
  geom_sf(aes(fill = winning_party)) +  # Color states by winning party
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red"),
                    labels = c("DEMOCRAT" = "Gore", "REPUBLICAN" = "Bush"),
                    name = "Winning Party") +
  geom_sf_text(aes(label = EC), size = 3, color = "white") +  # Add electoral vote labels to states
  labs(title = "2000 Presidential Election Results by State",
       subtitle = "Bush vs. Gore") +
  theme_minimal()  # Use minimal theme

The faceted version showing election results over time is as follows:

Show the code

#Task6
library(usmap)
library(tools)
library(sf)
library(tigris)
library(dplyr)
library(ggplot2)
library(gganimate)

# Modify the function to return data for all years
generate_choropleth_data_for_year <- function(year, district_number, president_data, ucla_shapefiles) {
  
  # Load the specific district shapefile
  district_shapefile <- ucla_shapefiles[[paste0("districts", district_number, ".shp")]]
  
  # Create Electoral College votes column based on the district shapefile
  district_EC <- district_shapefile %>%
    mutate(STATENAME = toupper(STATENAME)) %>%
    select(-geometry) %>%
    as.data.frame() %>%
    group_by(STATENAME) %>%
    summarize(EC = n() + 2)  # 2 for senators per state

  # Filter the election data for the given year
  president_year <- president_data %>%
    filter(year == year) %>%
    group_by(state) %>%
    mutate(
      winning_party = party_simplified[which.max(candidatevotes)]  # Determine the winning party
    ) %>%
    slice_max(candidatevotes, n = 1) %>%  # Only keep the row with the highest total votes
    ungroup()  # Ungroup after summarizing
  
  # Join the election data with the district data (EC votes and winning party)
  president_EC <- district_EC %>%
    left_join(president_year, by = c("STATENAME" = "state")) %>%
    select(state = STATENAME, EC, winning_party, state_fips)
  
  # Load U.S. state shapefiles for mapping
  us_state_shapefiles <- usmap::us_map(regions = "states") %>%
    mutate(fips = as.integer(fips))
  
  # Merge district data with state shapefiles by FIPS code
  merged_data <- president_EC %>%
    left_join(us_state_shapefiles, by = c("state_fips" = "fips"))
  
  # Add the year column to merge data
  merged_data$year <- year
  
  # Return the merged data for all years (instead of a plot)
  return(merged_data)
}

# Generate data for all election years
all_years_data <- bind_rows(
  generate_choropleth_data_for_year(1976, "094", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(1980, "096", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(1984, "098", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(1988, "100", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(1992, "102", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(1996, "104", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(2000, "106", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(2004, "108", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(2008, "110", president_data, ucla_shapefiles),
  generate_choropleth_data_for_year(2012, "112", president_data, ucla_shapefiles)
)

# facet graph, work on getting in one column
ggplot(data = all_years_data) +
  geom_sf(aes(fill = winning_party, geometry = geom), color = "white", size = 2.0) +  # Color by winning party
  scale_fill_manual(values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue"), 
                    labels = c("REPUBLICAN" = "Republican", "DEMOCRAT" = "Democratic"), 
                    name = "Winning Party") +  # Red for Republican, Blue for Democrat
  geom_sf_text(aes(label = EC, geometry = geom), color = "black", size = 3, fontface = "bold") +  # Add EC vote labels
  theme_minimal() +
  labs(title = "2000 Presidential Election: Winning Party by State") +  # Removed subtitle
  theme(legend.position = "bottom", 
        plot.title = element_text(hjust = 0.5, size = 16)) +
  facet_wrap(~ year)

Comparing the Effects of ECV Allocation Rules

The following code includes different historical voting data and assigns each state’s Electoral College votes (ECVs) according to various strategies:

Task 7 - State-Wide Winner-Take-All

This strategy and code focuses on filtering U.S. presidential election data between 1976 and 2020, identifying the candidate with the highest vote in each state for each election year. This is done using the dplyr package to group and filter the data based on the maximum vote count per state, followed by summarizing to capture the winner, their party, and the highest votes. The results are displayed in a formatted table using the gt package and a bar chart is created to show the count of each party winning by state across the years.

Show the code

#| code-fold: true
#| code-summary: "Show the code"

library(dplyr)
library(DT)
library(ggplot2)

# Sample 1000 rows or fewer from each table for display
display_sample <- function(data, table_name) {
  sample_data <- data[sample(nrow(data), min(1000, nrow(data))), ]  # Sample 1000 rows or fewer
  datatable(
    sample_data,
    options = list(
      pageLength = 10,            # Display 10 rows per page
      scrollX = TRUE,             # Enable horizontal scrolling
      dom = 'tip'                 # Only show table, input filter, and pagination
    ),
    caption = paste("Sample of", table_name, "(Up to 1000 rows)")
  )
}

# Filter data for all presidential election years from 1976 to 2020
state_winner <- president_data %>%
  filter(office == "US PRESIDENT", year %in% seq(1976, 2020, by = 4)) %>%  # Include all presidential election years
  group_by(year, state) %>%
  filter(candidatevotes == max(candidatevotes, na.rm = TRUE)) %>%
  summarise(
    winner = first(candidate),
    party = first(party_simplified),
    highest_votes = max(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Display a sample of the state_winner table
display_sample(state_winner, "State Winners")

Show the code

# Calculate party counts per year for the bar chart
party_count_per_year <- state_winner %>%
  count(year, party) %>%
  arrange(year, desc(n))  # Arrange the data for better display in the bar chart

# Create a bar chart to show the count of each party per year
ggplot(party_count_per_year, aes(x = factor(year), y = n, fill = party)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Party Count per Year in US Presidential Elections (1976 - 2020)",
    x = "Year",
    y = "Number of States",
    fill = "Party"
  ) +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "OTHER" = "grey")) +
  theme_minimal() +
  theme(legend.position = "top")

Task 7 - District-Wide Winner-Take-All + State-Wide “At Large” Votes

This code starts similarly by filtering data from the same period, but it focuses on calculating the total candidate votes for each party in each state, rather than identifying a single candidate’s vote. The party with the highest total votes is considered the winner for that state in each election year. The results are again shown in a gt table and visualized with a bar chart that counts the number of states won by each party for every election year, providing insights into the distribution of electoral outcomes across the U.S.

Show the code

# Load necessary libraries
library(dplyr)
library(DT)
library(ggplot2)

# Filter for presidential data in all election years from 1976 to 2020
filtered_data <- president_data %>%
  filter(office == "US PRESIDENT", year %in% seq(1976, 2020, by = 4))

# Step 1: Determine the highest candidate votes for each party by state and year
highest_votes_per_party <- filtered_data %>%
  group_by(year, state, party_simplified) %>%
  summarise(total_votes = sum(candidatevotes, na.rm = TRUE), .groups = "drop")

# Step 2: Identify the winning party for each state and year based on the highest votes
state_winners_atlarge <- highest_votes_per_party %>%
  group_by(year, state) %>%
  filter(total_votes == max(total_votes)) %>%
  slice(1) %>%  # Select first in case of ties
  ungroup()

# Display the winning party data sample using display_sample function
display_sample(state_winners_atlarge, "Winning Party by State in US Presidential Elections (1976 - 2020)")

Show the code

# Step 3: Bar chart for party count per year
party_count_per_year <- state_winners_atlarge %>%
  count(year, party_simplified) %>%
  arrange(year, desc(n))  # Arrange the data for better display in the bar chart

# Create a bar chart to show the count of each party per year
ggplot(party_count_per_year, aes(x = factor(year), y = n, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Party Count per Year in US Presidential Elections (1976 - 2020)",
    x = "Year",
    y = "Number of States",
    fill = "Party"
  ) +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "OTHER" = "grey")) +
  theme_minimal() +
  theme(legend.position = "top")

Show the code

#EC Tables

library(dplyr)
library(stringr)

# Define a function to generate EC data for a specific election year
generate_ec_table <- function(year, district_file, president_data, ucla_shapefiles, state_abbreviations) {
  
  # Load the shapefile for the corresponding congressional districts
  congressional_districts <- ucla_shapefiles[[paste0("districts", district_file, ".shp")]] %>%
    mutate(STATENAME = tolower(STATENAME)) %>%
    select(-geometry) %>%
    as.data.frame() %>%
    group_by(STATENAME) %>%
    summarize(EC = n() + 2)  # EC based on the number of districts + 2 senators per state
  
  # Filter and summarize presidential data for the given year
  vote_results <- president_data %>%
    filter(year == year) %>%
    mutate(state = tolower(state)) %>%
    group_by(state) %>%
    summarize(
      winning_party = party_simplified[which.max(candidatevotes)],
      total_votes = max(totalvotes)
    ) %>%
    ungroup()
  
  # Combine EC with winning party information
  ec_table <- congressional_districts %>%
    left_join(vote_results, by = c("STATENAME" = "state")) %>%
    mutate(
      winning_party = if_else(STATENAME == "district of columbia" & is.na(winning_party), "DEMOCRAT", winning_party)
    ) %>%
    select(STATENAME, winning_party, EC)
  
  # Rename columns to match `usmap` requirements and merge state abbreviations
  ec_table <- ec_table %>%
    rename(state = STATENAME) %>%
    left_join(state_abbreviations, by = "state") %>%
    rename(Abbreviation = abbreviation)  # Rename 'abbreviation' to 'Abbreviation'
  
  return(ec_table)
}

# List of years and corresponding district shapefile codes
years_and_districts <- list(
  c(1976, "094"), c(1980, "096"), c(1984, "098"), 
  c(1988, "100"), c(1992, "102"), c(1996, "104"), 
   c(2000, "106"), c(2004, "108"), c(2008, "110"), 
  c(2012, "112")
)

# Loop through each year and generate the EC tables
ec_tables <- lapply(years_and_districts, function(info) {
  year <- info[1]
  district_file <- info[2]
  generate_ec_table(as.numeric(year), district_file, president_data, ucla_shapefiles, state_abbreviations)
})

# Assign each generated EC table to a named list element for easy reference
names(ec_tables) <- paste0("EC_", sapply(years_and_districts, `[[`, 1))

Task 7 - State-Wide Proportional

Next, this strategy and code focuses on filtering U.S. presidential election data between 1976 and 2020, identifying the candidate with the highest vote in each state for each election year. This is done using the dplyr package to group and filter the data based on the maximum vote count per state, followed by summarizing to capture the winner, their party, and the highest votes. The results are displayed in a formatted table using the gt package and a bar chart is created to show the count of each party winning by state across the years.

Show the code

library(dplyr)
library(ggplot2)

# Step 1: Combine all electoral college data into a single dataframe from 1976 to 2012
ec_data <- bind_rows(
  lapply(names(ec_tables), function(name) {
    year <- as.numeric(sub("EC_", "", name))
    mutate(ec_tables[[name]], year = year)
  })
)

# Step 2: Filter for presidential data in all required years (1976 to 2012)
filtered_data <- president_data %>%
  filter(office == "US PRESIDENT", year %in% seq(1976, 2012, 4))  # Only presidential election years

# Step 3: Find the winning candidate for Democrat and Republican parties in each state and year
party_winners <- filtered_data %>%
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) %>%
  group_by(year, state, party_simplified) %>%
  summarise(
    candidate = candidate[which.max(candidatevotes)],  # Get candidate with highest votes per party
    candidate_votes = max(candidatevotes, na.rm = TRUE),
    total_state_votes = max(totalvotes, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(vote_percentage = candidate_votes / total_state_votes)  # Calculate percentage of votes

# Ensure consistent formatting of state names
party_winners <- party_winners %>%
  mutate(state = str_to_title(state))

ec_data <- ec_data %>%
  mutate(state = str_to_title(state))

# Step 4: Join with EC data and calculate allocated EC votes
proportional_ec_votes <- party_winners %>%
  left_join(ec_data, by = c("year", "state")) %>%
  mutate(
    allocated_ec_votes = round(EC * vote_percentage)
  ) %>%
  select(year, state, candidate, party_simplified, EC, vote_percentage, allocated_ec_votes)

# Step 5: Calculate total allocated EC votes by year and party
total_ec_votes_by_party <- proportional_ec_votes %>%
  group_by(year, party_simplified) %>%
  summarise(total_allocated_ec_votes = sum(allocated_ec_votes, na.rm = TRUE), .groups = "drop")

# Step 6: Create bar chart for EC votes by year and party
ec_votes_bar_chart <- ggplot(total_ec_votes_by_party, aes(x = factor(year), y = total_allocated_ec_votes, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Total Allocated Electoral Votes by Party (1976 to 2012)",
    x = "Year",
    y = "Total Allocated EC Votes",
    fill = "Party"
  ) +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal()

# Display the bar chart
print(ec_votes_bar_chart)

Task 7 - National Proportional

Lastly, the code starts similarly by filtering data from the same period, but it focuses on calculating the total candidate votes for each party in each state, rather than identifying a single candidate’s vote. The party with the highest total votes is considered the winner for that state in each election year. The results are again shown in a gt table and visualized with a bar chart that counts the number of states won by each party for every election year, providing insights into the distribution of electoral outcomes across the U.S.

Show the code

# Step 3: Calculate vote percentage for Democrat and Republican parties in each state and year
party_percentage <- filtered_data %>%
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) %>%
  group_by(year, state, party_simplified) %>%
  summarise(
    total_party_votes = sum(candidatevotes, na.rm = TRUE),
    total_state_votes = max(totalvotes, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(vote_percentage = total_party_votes / total_state_votes)

# Ensure consistent formatting of state names
party_percentage <- party_percentage %>%
  mutate(state = str_to_title(state))

ec_data <- ec_data %>%
  mutate(state = str_to_title(state))

# Step 4: Join with EC data and calculate allocated EC votes for each party by vote percentage
proportional_ec_votes <- party_percentage %>%
  left_join(ec_data, by = c("year", "state")) %>%
  mutate(
    allocated_ec_votes = round(EC * vote_percentage)
  ) %>%
  select(year, state, party_simplified, EC, vote_percentage, allocated_ec_votes)

# Step 5: Calculate total allocated EC votes by year and party
total_ec_votes_by_party <- proportional_ec_votes %>%
  group_by(year, party_simplified) %>%
  summarise(total_allocated_ec_votes = sum(allocated_ec_votes, na.rm = TRUE), .groups = "drop")

# Step 6: Create bar chart for EC votes by year and party
ec_votes_bar_chart <- ggplot(total_ec_votes_by_party, aes(x = factor(year), y = total_allocated_ec_votes, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Total Allocated Electoral Votes by Party (1976 to 2012)",
    x = "Year",
    y = "Total Allocated EC Votes",
    fill = "Party"
  ) +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal()

# Display the bar chart
print(ec_votes_bar_chart)

Evaluating Fairness of ECV Allocation Schemes

In evaluating the fairness of different Electoral College Vote (ECV) allocation schemes, I considered several methods: State-Wide Winner-Take-All, District-Wide Winner-Take-All + State-Wide “At Large” Votes, State-Wide Proportional, and National Proportional. After analyzing the results, I determined that the State-Wide Proportional method is the fairest. This is because, when applied across various election years, there is no significant variance in the results, and it seems to provide a more balanced representation of the popular vote relative to the electoral vote.

The District-Wide Winner-Take-All + State-Wide “At Large” Votes method, however, exhibits a notable shift in the 1984 election, showing the largest impact and variance. This method tends to amplify the margin of victory for the candidate with a slight statewide edge. In contrast, under the State-Wide Proportional method, the bar chart for 1984 reveals much less variance, reflecting a more equitable distribution of votes.

From this analysis, it is clear that the State-Wide Proportional allocation scheme provides the most consistent and balanced results across elections, while the District-Wide Winner-Take-All method can lead to more dramatic swings in electoral outcomes.

Footnotes

https://cdmaps.polisci.ucla.edu/↩︎
https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html↩︎
MIT Election Data and Science Lab, 2017, “U.S. House 1976-2022,” https://doi.org/10.7910/DVN/IGOUN2, Harvard Dataverse, v13, UNF:6: Ky5FkettbvohjTSN/IvldA== [fileUNF].↩︎
MIT Election Data and Science Lab, 2017, “U.S. President 1976-2020,” https://doi.org/10.7910/DVN/42MVDX, Harvard Dataverse, v8, UNF:6:F0opd1IRbeY190yVfzglUw== [fileUNF].↩︎
Partial code was collaborated with Chloe Yu.↩︎