In recent years, the issue of fair redistricting has gained significant attention. Gerrymandering, the practice of manipulating electoral district boundaries to favor specific political parties, undermines the democratic process. As a part of my classwork at Stony Brook University, I developed Huskies: a redistricting analyzer using GerryChain—a powerful Python library designed for generating and analyzing districting plans through Markov Chain Monte Carlo (MCMC) methods. Explore huskies here.

Introducing Huskies

Huskies, is a redistricting analyzer that processes state census data, generates new redistricting plans, and serves them to users via an HTTP server. The repository is divided into two main sections:

Server: A Spring HTTP server that serves plans and ensemble summaries from the database without performing any calculations.
Scripts: This section handles all calculations, generates new plans, and makes POST requests to the database after generating the plans.

We had various options for our tech stack but chose specific technologies to best meet our needs. For example, we could have used other backend frameworks like Django or Express, and different databases like PostgreSQL or MySQL. Our main concerns were respecting the structure of our data being GeoJson, and avoid having to add any unnecessary entities, given that very few writes are occuring and all plan generation is done offline.

Backend:

Java & Spring Boot: Powers the HTTP server, handling requests and serving data efficiently and reliably.
MongoDB: Stores the redistricting plans and ensemble summaries, offering flexibility and scalability for our data needs. Python: Runs the scripts for calculations and generating new plans, leveraging its extensive libraries for data processing and analysis. On the frontend, alternative choices could have included Angular or Vue for the framework, and Bootstrap or Tailwind for UI design. However, we opted for the following:

Frontend:

React: Provides a dynamic and interactive user interface, making the application responsive and user-friendly.
MaterialUI: Offers a clean and responsive design for the frontend, ensuring consistency and ease of use.
Leaflet: Enables detailed and interactive map visualizations for viewing district plans, making geographical data easy to explore.

Gerrymandering

Gerrymandering is common lingo in political circles, but many find themselves unclear on exactly what it is. Gerrymandering is the result of political parties playing with the boundaries of voting districts to give themselves an unfair edge. The most common forms occur when a party redraws the lines so their supporters are spread out just enough to win more areas ("cracking"), or they pack the opposing party’s voters into a few districts to minimize their impact elsewhere ("packing"). This practice can lead to weirdly shaped districts that don’t really represent the community’s interests, making elections less fair and skewing political power. It's a sneaky way to tilt the playing field in politics.

$Maryland's 3rd District, it is a fractal like shape, snaking side to side unnaturally$

Algorithms Behind Plan Generation

The Theory of GerryChain

GerryChain leverages MCMC methods to explore the space of possible redistricting plans. By iterating through potential plans and applying various constraints, GerryChain can generate a diverse set of districting plans for analysis. Here's a breakdown of the theory and process behind the algorithms:

Graph Representation

We represent the state as a graph where nodes correspond to geographic units (e.g., census blocks), and edges represent adjacencies between them. This graph serves as the foundation for all redistricting plans.

Initial Partition

The process begins with an initial redistricting plan, known as the initial partition. This plan is based on existing district assignments and serves as the starting point for the MCMC process.

Proposal Mechanism (ReCom)

The ReCom (Recursive Division) method is a key component of GerryChain. It works by merging adjacent districts and then re-dividing them to form new districts. This method ensures that the new districts maintain population balance and geographical contiguity.

Markov Chain

The Markov Chain iteratively moves from one plan to another by accepting or rejecting proposed plans based on the defined constraints. This process generates an ensemble of plans, providing a robust basis for statistical analysis.

Explaining Markov Chains

Markov Chains are a mathematical concept used to model systems that transition from one state to another. Here’s a simplified explanation:

States:
- Imagine a series of possible scenarios or "states" that something can be in. For example, weather conditions like sunny, rainy, or cloudy.
Transitions:
- The system moves from one state to another based on certain probabilities. For example, if it's sunny today, there might be a 70% chance it stays sunny tomorrow and a 30% chance it becomes cloudy.
Memoryless Property:
- Markov Chains have the property that the future state depends only on the current state, not on the sequence of events that preceded it. This is called the "memoryless" property.
Transition Matrix:
- The probabilities of moving from one state to another are usually represented in a matrix called a transition matrix. Each entry in this matrix shows the probability of transitioning from one state to another.

How Markov Chains Work

Initial State:
- Start in an initial state. For example, today’s weather is sunny.
Transitions:
- Use the transition probabilities to determine the next state. If sunny, we use the probabilities (e.g., 70% sunny, 30% cloudy) to decide tomorrow’s weather.
Iterations:
- Repeat this process over many steps to see how the system evolves over time.

Example: Weather Prediction

States: Sunny, Rainy, Cloudy.
Transition Probabilities:
- From Sunny: 70% Sunny, 20% Cloudy, 10% Rainy.
- From Rainy: 50% Rainy, 30% Cloudy, 20% Sunny.
- From Cloudy: 40% Cloudy, 40% Sunny, 20% Rainy.

By starting in a given state and using these probabilities to move from one state to the next, we can simulate and predict future states of the system over time. Markov Chains are used in various fields, including:

Economics: Modeling market behavior.
Biology: Predicting population dynamics.
Computer Science: Algorithms like Google's PageRank.
Game Theory: Analyzing strategic decisions.

For our use case, Markov Chains help us understand and predict the behavior of complex systems by using a simple yet powerful probabilistic framework.

Explaining the ReCom Algorithm for Redistricting

Redistricting is the process of drawing the boundaries of electoral districts. The ReCom algorithm is a method used to create fair and compact districting plans. Here’s how it works in simple terms:

Initial Setup:
- Start with a map divided into small units, like precincts or blocks.
- These units are grouped into an initial set of districts based on current boundaries.
Merge and Split:
- Merge: Two adjacent districts are selected and merged into one larger district.
- Split: This larger district is then split back into two districts using a method that ensures each new district has a nearly equal population and maintains geographical compactness.
Constraints:
- Population Balance: Each new district must have a population close to the ideal number, ensuring fair representation.
- Compactness: The shape of the districts should be compact, avoiding odd or stretched shapes that might indicate gerrymandering.
Markov Chain Process:
- The algorithm repeats the merge-and-split process many times, each time creating a new districting plan.
- This process is guided by a Markov Chain Monte Carlo (MCMC) method, which combines randomness with Markov chains to sample complex probability distributions.

Huskies in Action

For each state, Huskies generates a massive batch of 10,000 districting plans using the ReCom algorithm. Here's how it all breaks down:

Initial Setup:
- Start with the state's current district map as the baseline.
Running ReCom:
- For each plan, run 1,000 iterations of the ReCom algorithm. This means merging and re-dividing districts to ensure they're balanced and compact.
Picking the Best Plans:
- Out of the 10,000 plans, we cherry-pick the ones that highlight different scenarios.
Using Filters:
- With the filters on the left side, you can easily switch between these scenarios and see how each one impacts the state's political landscape.

By sampling a wide array of plans and showcasing key scenarios, Huskies helps you visualize and understand the effects of different redistricting strategies in a fun and informative way.

When you fire up Huskies and load a state, it starts by showing you the 2022 enacted plan for that state.

New York's 2022 Enacted District Plan, Democrats leading with 15/26 districts

On the left, you'll find a bunch of filters to explore different generated plans. For example, you can see how the map looks if it's manipulated to favor Democrats.

Or flip the script and check out a map favoring Republicans. Want balance? There's a filter for the most even vote split between the two parties.

You can also dive into maps with the highest and lowest population differences between districts.

These features make exploring redistricting scenarios both fun and informative, giving you a clear view of different political landscapes.

Conclusion

By leveraging GerryChain and MCMC methods, Huskies generates and analyzes redistricting plans, providing valuable insights into the fairness of districting maps. With the help of GerryChain plans are guaranteed to meet population and compactness constraints, contributing to the creation of fair and representative electoral districts. This project is a step towards addressing the challenges of gerrymandering and promoting a more democratic electoral process.

Whether you are a researcher, policymaker, or an advocate for fair redistricting, Huskies offers a robust tool for learning about gerrymandering and analyzing redistricting plans. Find the source code here.

Redistricting 101: Huskies, ReCom, and the Fight Against Gerrymandering