SUBWAY STAFFING UNDER PRESSURE
Overview of Data
Incidents Dataset
The Incidents Dataset is a simulated operational dataset created for the purposes of this analysis. It includes 10 years of incident records across Manhattan subway stations. Each incident includes:
- station where it occurred
- staffing level at the time
- response time in minutes
- categorical severity rating: low, medium, or high
The severity variable reflects the operational impact of each event:
Severity: Low
Minor issues that cause little to no service disruption.
Examples:
- Minor customer complaint
- Small cleaning or maintenance issue
- A brief equipment reset
Severity: Medium
Events that require coordination and slow down operations but don’t stop service.
Examples:
- Train dispatching issue
- Door malfunction
- Medical response affecting platform operations
Severity: High
Major incidents that likely affect service or require emergency response.
Examples:
- Track intrusion
- Severe medical emergency
- Equipment failure blocking tracks
- Power or signal failure
Local Events Dataset
The Local Events Dataset is a simulated record, created for this analysis, of public events held in Manhattan during Summer 2025. Each event is paired with its nearest subway station.
Each event includes:
- date it occurred
- event name or type
- closest subway station
- estimated attendance
Upcoming Events Dataset
The Upcoming Events Dataset is a simulated projection of public events planned for Summer 2026. It supports staffing forecasts by identifying event-driven demand.
Each planned event includes:
- date it is scheduled
- event name or type
- nearest subway station
- expected attendance
The expected_attendance variable reflects anticipated crowd size and helps estimate which stations may see increased demand.
Ridership Dataset
The Ridership Dataset is a simulated record of daily subway usage for Summer 2025. It captures passenger activity across Manhattan stations.
Each record includes:
- date of the ridership count
- station where activity occurred
- number of entrances (passengers entering the station)
- number of exits (passengers leaving the station)
The entrances and exits variables represent overall station traffic and help evaluate how events, fare changes, and operational conditions shape daily demand.
1. Ridership & Local Events: Summer 2025
What this analysis shows
This analysis examines daily systemwide subway ridership during Summer 2025, highlighting two major influences on ridership levels:
- Local events, which create distinct ridership spikes on specific dates.
- The July 15 fare increase, which may affect overall ridership levels before and after the change.
By combining daily ridership totals with event attendance estimates and marking the fare-increase date, the visualization helps identify short-term surges, longer-term trends, and the possible behavioral impact of fare adjustments.
Average ridership (before July 15):
Average ridership (after July 15):
Change:
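The three summary figures above could be computed along the following lines. This is a minimal sketch, not the method actually used for the report: the function name `fare_change_summary`, the input shape (a mapping from date to systemwide daily ridership), and the choice to count July 15 itself as "after" are all assumptions, and the sample numbers are made up purely for illustration.

```python
from datetime import date
from statistics import mean

# Fare-increase date stated in the text.
FARE_CHANGE = date(2025, 7, 15)


def fare_change_summary(daily_totals, change_date=FARE_CHANGE):
    """Return (avg_before, avg_after, change, pct_change).

    daily_totals: dict mapping a date to systemwide ridership for that day
    (hypothetical input shape; how the daily total is aggregated from the
    station-level records is not specified in the report).
    """
    # Assumption: the change date itself belongs to the "after" period.
    before = [v for d, v in daily_totals.items() if d < change_date]
    after = [v for d, v in daily_totals.items() if d >= change_date]

    avg_before = mean(before)
    avg_after = mean(after)
    change = avg_after - avg_before
    pct_change = 100.0 * change / avg_before
    return avg_before, avg_after, change, pct_change


# Made-up values, for illustration only (not taken from the dataset).
sample = {
    date(2025, 7, 13): 1000,
    date(2025, 7, 14): 1200,
    date(2025, 7, 15): 900,
    date(2025, 7, 16): 1100,
}
avg_before, avg_after, change, pct_change = fare_change_summary(sample)
```

A negative `change` means average ridership was lower after the fare increase. On its own, this comparison does not separate the fare effect from event timing or seasonal variation.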
Key findings
- Several local events produce clear, sharp ridership spikes—especially high-attendance events such as concerts, festivals, and parades.
- The fare increase on July 15 coincides with a noticeable shift: the post-increase average ridership is lower than the pre-increase average.
- Despite day-to-day variation, ridership shows a fairly steady pattern with event-driven deviations rather than long-term declines.
- The visualization makes it easy to compare typical daily ridership against event-influenced highs and post-fare-change levels.
- The combination of area shading, event markers, and pre/post average lines provides a clear picture of both short-term disruptions and structural changes in demand.
2. Response Times Across Manhattan Subway Stations
What this analysis shows
Using 10 years of historical incident data, we compare the average response time across stations. Stations with lower average response times are considered operationally “better,” while those with higher averages may need additional staffing or resources.
Key findings
- Fastest stations, shown in green, have consistently low response times.
- Average stations, shown in yellow, cluster around the system average line.
- Slowest stations, shown in lavender, may need additional staffing or operational support.
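The grouping into fastest, average, and slowest stations can be sketched as follows. The report does not state the cutoffs behind the three colors, so the 10% band around the system average used here is a hypothetical choice. The function name `station_tiers`, the input shape, and the sample values are likewise assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def station_tiers(incidents, band=0.10):
    """Classify stations by average response time.

    incidents: iterable of (station, response_minutes) pairs.
    band: hypothetical tolerance around the system average; stations
    within +/- band of it are labeled "average".
    """
    incidents = list(incidents)

    # Group response times by station, then average each group.
    by_station = defaultdict(list)
    for station, minutes in incidents:
        by_station[station].append(minutes)
    station_avg = {s: mean(v) for s, v in by_station.items()}

    # Assumption: the system average is taken over all incidents,
    # not over the per-station averages.
    system_avg = mean([minutes for _, minutes in incidents])

    tiers = {}
    for station, avg in station_avg.items():
        if avg < system_avg * (1 - band):
            tiers[station] = "fastest"
        elif avg > system_avg * (1 + band):
            tiers[station] = "slowest"
        else:
            tiers[station] = "average"
    return station_avg, system_avg, tiers


# Made-up incidents with placeholder station names.
sample = [("A", 4), ("A", 6), ("B", 10), ("B", 10), ("C", 14), ("C", 16)]
station_avg, system_avg, tiers = station_tiers(sample)
```

A percentile-based split (for example, bottom and top quartiles) would be an equally reasonable alternative to a fixed band.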
3. Staffing Burden
What this analysis shows
This heatmap visualizes how much expected event attendance each station must accommodate on each day of Summer 2026. Stations with frequent large events accumulate higher “burden” levels. The visualization helps identify where staffing levels may be misaligned with anticipated event-driven surges.
Key findings
- Burden is highly uneven: a small number of stations face many large events, while others see very little event-driven traffic.
- Canal St, Chambers St, and 34 St–Penn Station appear among the most consistently burdened stations, with multiple high-attendance event days.
- Large spikes (in dark red) indicate days with 10,000–15,000+ expected attendees—likely requiring increased staffing.
- The heatmap reveals temporal clustering: certain weeks experience multiple events at different stations, creating simultaneous staffing demands.
- Stations with many low-attendance days (light yellow) may not require staffing changes.
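The station-by-day values behind a heatmap like this can be sketched as a simple aggregation of expected attendance. This is an illustrative sketch under stated assumptions: the function names, the tuple layout of the event records, and the sample events are hypothetical, and the 10,000 threshold is taken from the lower end of the range mentioned in the findings.

```python
from collections import defaultdict
from datetime import date

# Lower end of the "10,000-15,000+" range cited in the findings.
HIGH_ATTENDANCE = 10_000


def daily_burden(events):
    """Sum expected attendance per station per day.

    events: iterable of (day, station, expected_attendance) tuples
    (hypothetical layout for the Upcoming Events Dataset).
    """
    burden = defaultdict(lambda: defaultdict(int))
    for day, station, attendance in events:
        burden[station][day] += attendance
    return burden


def high_burden_days(burden, threshold=HIGH_ATTENDANCE):
    """Return, per station, the sorted days at or above the threshold."""
    return {
        station: sorted(d for d, total in days.items() if total >= threshold)
        for station, days in burden.items()
    }


# Made-up events, for illustration only.
sample = [
    (date(2026, 7, 4), "Canal St", 8000),
    (date(2026, 7, 4), "Canal St", 4000),
    (date(2026, 7, 4), "Chambers St", 3000),
    (date(2026, 7, 11), "Canal St", 2000),
]
burden = daily_burden(sample)
flagged = high_burden_days(burden)
```

Two events at the same station on the same day are added together, so a day can cross the threshold even when no single event does.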
Conclusion
Overall, the analysis shows that subway demand in Manhattan is shaped by a mix of predictable daily patterns, operational performance differences across stations, and concentrated event-driven surges. Some stations consistently handle incidents more efficiently, while others experience slower response times that may indicate structural or staffing-related challenges. The upcoming 2026 event calendar further reveals that a small group of stations will face disproportionately high attendance on multiple days, creating clear pressure points for the system.
Together, these findings suggest that staffing decisions will be most effective when they are targeted, prioritizing stations with both historically slower operational performance and heavier projected event loads. Aligning staff resources with these patterns can help keep service reliable during periods of elevated demand while avoiding unnecessary allocation to stations with lighter activity.
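One way to make the "both slower and more heavily loaded" criterion concrete is to flag stations that sit above the median on each measure. This is only a sketch of one possible rule, not a method stated in the analysis: the function name `priority_stations`, the use of medians as cutoffs, and the sample values are all assumptions.

```python
from statistics import median


def priority_stations(avg_response, event_load):
    """Return stations above the median on both measures, sorted by name.

    avg_response: dict of station -> average response time in minutes.
    event_load: dict of station -> total expected event attendance.
    Stations missing from event_load are treated as having zero load
    (an assumption).
    """
    loads = {s: event_load.get(s, 0) for s in avg_response}

    # Hypothetical cutoffs: the median of each measure across stations.
    response_cutoff = median(avg_response.values())
    load_cutoff = median(loads.values())

    return sorted(
        s
        for s in avg_response
        if avg_response[s] > response_cutoff and loads[s] > load_cutoff
    )


# Made-up values with placeholder station names.
avg_response = {"A": 5, "B": 10, "C": 15, "D": 12}
event_load = {"A": 20000, "C": 30000, "D": 1000}
priority = priority_stations(avg_response, event_load)
```

In this example only station C is flagged: A carries a heavy event load but responds quickly, and D responds slowly but has little event traffic.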