20 Tech Experts On How To Fix Hidden Weaknesses In Emergency Systems

The widespread 911 outage in Pennsylvania in July 2025 underscored how fragile critical public safety systems can be. The disruption, which left residents without reliable access to emergency services, revealed a spectrum of vulnerabilities—from outdated infrastructure and centralized dependencies to weak change management and limited visibility into system access.

Preventing future failures will take more than short-term fixes. It requires strengthening vendor accountability, modernizing architectures, stress-testing systems under real-world conditions, and building resilience into both technology and processes. Below, members of Forbes Technology Council highlight the weaknesses exposed by the Pennsylvania 911 outage and share their recommendations for ensuring essential public safety systems remain reliable when they’re needed most.

1. Conduct Regular Vendor Reviews And RFPs

Governments rarely switch vendors once they have a working product or solution in place. To ensure the best solution is in place for the public's benefit, they should hold regular reviews and open bids for such products and services. This will help strengthen systems. – Lane Campbell, GovSoft

2. Create A Federal Resilience Registry For NG911 Vendors

Current regulations under 47 CFR Part 9 require annual certifications from 911 service providers on circuit diversity, backup power and network monitoring. However, these rules lack resilience benchmarks for IP-based 911 infrastructure. A federal resilience registry and certification authority could enable real-time failover validation and ensure NG911 vendors demonstrate fault tolerance. – Cristian Randieri, Intellisystem Technologies

3. Add Automated Failover To Strengthen Service Continuity

The Pennsylvania 911 outage revealed how a single failure can disrupt critical services. Building in redundancy, real-time monitoring and automated failover creates the resilience needed to keep essential systems running, even when part of the network goes down. – Richard Danforth, Genasys
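As a rough illustration of the automated failover idea, the sketch below walks a priority-ordered list of endpoints and picks the first one whose health check passes. The endpoint names and the `is_healthy` probes are hypothetical placeholders, not part of any real 911 platform; a production system would probe real services and handle timeouts, retries and flapping.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Endpoint:
    name: str                       # hypothetical label for a routing target
    is_healthy: Callable[[], bool]  # hypothetical health probe


def select_endpoint(endpoints: Sequence[Endpoint]) -> Optional[Endpoint]:
    """Return the first endpoint, in priority order, whose probe passes."""
    for endpoint in endpoints:
        try:
            if endpoint.is_healthy():
                return endpoint
        except Exception:
            # A probe that raises is treated the same as an unhealthy result.
            continue
    return None


def _probe_times_out() -> bool:
    raise TimeoutError("probe timed out")


# Illustrative endpoints only: the primary is down and one backup times out.
primary = Endpoint("primary-router", lambda: False)
flaky = Endpoint("regional-backup", _probe_times_out)
secondary = Endpoint("secondary-backup", lambda: True)

chosen = select_endpoint([primary, flaky, secondary])
print(chosen.name if chosen else "no healthy endpoint")
```

Returning `None` when nothing is healthy, rather than raising, leaves the escalation decision (for example, paging an operator) to the caller.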

4. Adopt Hybrid Cloud Architectures To Eliminate Silos

This outage underscores how fragile critical communications systems can be when built on outdated or siloed infrastructure. To improve resilience, organizations should adopt cloud-enabled, hybrid-ready architectures that support real-time monitoring, redundancy and seamless failover. These systems offer greater flexibility and ensure continuity when stakes are high and downtime isn’t an option. – Luiz Domingos, Mitel

5. Develop A Rapid Rollback Plan For Change-Related Failures

One of the key performance indicators of any organization is mean time to resolve. In this case, the issue was change-related, yet rollback and restoration took more than 12 hours. Having a robust change rollback strategy is instrumental in building resilient infrastructure and systems. – Abhinav Sharma, JPMorgan Chase
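For readers unfamiliar with the metric, mean time to resolve is simply the average of the resolution durations across incidents. The sketch below computes it from (detected, resolved) timestamp pairs. The timestamps are made-up illustrative values, not data from the Pennsylvania outage.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_resolve(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average the (resolved - detected) duration across all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incidents: one resolved in 1 hour, one in 13 hours.
incidents = [
    (datetime(2025, 1, 1, 8, 0), datetime(2025, 1, 1, 9, 0)),
    (datetime(2025, 1, 2, 8, 0), datetime(2025, 1, 2, 21, 0)),
]
print(mean_time_to_resolve(incidents))  # 7:00:00
```

A single long restoration dominates the average, which is one reason a fast, rehearsed rollback path has such a large effect on this metric.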

6. Audit Vendors With SLAs And Regression Testing Protocols

Though NG911 adds features, it also increases failure risks. Enterprise rollouts should have fully tested redundancy and failover, real-world load testing, and automated regression checks. Hold vendors accountable with service-level agreements and audits to ensure reliability and prevent critical service disruptions. – Debdeep Mazumder, Tradeweb Markets

7. Mature Operational Processes To Prevent Deployment Gaps

Technical issues, such as a node failure that results in a catastrophic, systemwide outage, are the result of process immaturity. In this case, the process of staging and testing systems wasn’t followed. As a result, a brittle system was deployed. Focusing on single points of technical failure obscures a more important root cause: A step was skipped. Maturing our processes helps avoid this. – James Stanger, CompTIA

8. Shift To Decentralized Cloud-Native Infrastructure Models

Pennsylvania’s 911 outage exposed overreliance on single points of failure and legacy infrastructure that lacked redundancy. To address this, public systems must adopt decentralized, cloud-native architectures with real-time failover, continuous monitoring and rigorous disaster recovery drills—ensuring resilience against outages and cyberthreats alike. – Katerina Axelsson, Tastry

9. Map And Stress-Test Dependency Chains

The outage revealed that resilience isn’t just about backups—it’s about knowing your dependency chain end-to-end and how each link behaves under stress. Few systems have this mapped and tested. Creating a living dependency map, exercised through controlled failure drills, ensures that when one link breaks, the whole system adapts without hesitation. – Abhesh Kumar, Springline Advisory
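One way to picture a living dependency map is as a small graph that can answer the question "if this component fails, what else goes down?" The sketch below is a minimal example under assumed names: the services in `DEPENDS_ON` are hypothetical and do not describe any actual 911 deployment.

```python
from typing import Dict, List, Set

# Hypothetical map: each service lists the services it depends on directly.
DEPENDS_ON: Dict[str, List[str]] = {
    "call-intake": ["call-routing"],
    "call-routing": ["location-db", "core-network"],
    "dispatch-console": ["call-routing"],
    "location-db": ["core-network"],
    "core-network": [],
}


def impacted_by(failed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every service that depends, directly or transitively, on `failed`."""
    impacted: Set[str] = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for service, deps in depends_on.items():
            # Visit each dependent once so cycles cannot loop forever.
            if current in deps and service not in impacted:
                impacted.add(service)
                frontier.append(service)
    return impacted


print(sorted(impacted_by("location-db", DEPENDS_ON)))
```

A map like this only shows the expected blast radius. The controlled failure drills described above are what confirm whether the real system behaves the way the map predicts.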

10. Decentralize Decision-Making While Preserving Oversight

The Pennsylvania event showed exactly what will happen to enterprises stuck in rigid hierarchies—one failure can ripple through the entire system because decision-making and recovery are slow. Critical systems must become decentralized yet still operate under an umbrella of excellence, where local autonomy drives speed and resilience and central oversight ensures standards, coordination and trust are never compromised. – Doug Shannon

11. Integrate AI-Powered Automation For Instant Self-Healing

Legacy emergency systems lack real-time adaptability and intelligent redundancy. A single failure cascaded across the entire network. By integrating AI-powered automation with tokenized data streams, agencies can create a self-healing infrastructure where alerts reroute instantly and critical services remain operational without human delay or centralized bottlenecks across regions. – Charles Morey, MobilEyes Inc.

12. Simulate Live Orchestration Failures To Validate Redundancy

The outage exposed how single-point dependencies remain buried in modern systems. Redundancy is often designed for hardware, but not for the orchestration logic itself. The remedy is boring, but effective: Simulate real failure paths quarterly, not just in theory. Validate that the failover works under live load, not in a clean lab. – Zameer Rizvi, Odesso Inc.

13. Reinforce Change Management With Staged Rollouts

As with any outage, Pennsylvania’s 911 failure highlighted how uncontrolled changes—whether in hardware, software or configurations—can disrupt critical systems. Strengthening change management with rigorous testing, staged rollouts and failover planning is key to preventing similar incidents. – Yogesh Malik, Way2Direct
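A staged rollout with a rollback path can be sketched in a few lines: apply the change to one stage at a time, verify it, and undo every applied stage in reverse order if verification fails. The stage names and the `apply`, `verify` and `rollback` callbacks below are hypothetical stand-ins for whatever deployment tooling an agency actually uses.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[str],
    apply: Callable[[str], None],
    verify: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Apply a change stage by stage; on a failed check, undo all applied stages."""
    completed: List[str] = []
    for stage in stages:
        apply(stage)
        completed.append(stage)
        if not verify(stage):
            # Undo in reverse order, including the stage that failed its check.
            for done in reversed(completed):
                rollback(done)
            return False
    return True


# Illustrative run: the change passes the canary but fails in "region-a".
log: List[str] = []
succeeded = staged_rollout(
    stages=["canary", "region-a", "statewide"],
    apply=lambda s: log.append(f"apply {s}"),
    verify=lambda s: s != "region-a",
    rollback=lambda s: log.append(f"rollback {s}"),
)
print(succeeded, log)
```

In this run the statewide stage is never touched, which is the point of staging: a bad change is caught and reversed while its reach is still small.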

14. Use Geographic Redundancy To Reduce Software Fragility

A routine update triggered a critical failure in emergency services, exposing the risks of software fragility. To avoid future breakdowns, organizations must implement staged rollouts, maintain geographic redundancy and test failover systems regularly. Resilient design isn’t optional when infrastructure serves public safety. – Dileep Rai, Hachette Book Group

15. Apply Knowledge Graphs To Strengthen Identity Access Control

The Pennsylvania 911 outage highlights a critical vulnerability: a lack of visibility into identity access across interconnected systems. If access controls are outdated or unmanaged, one weak link can compromise uptime. Using knowledge graphs and digital twins, organizations can map and monitor access relationships, reducing risk and improving resilience in the face of outages like this. – Craig Davies, Gathid

16. Upgrade Legacy Systems To Cloud Platforms With Real-Time Monitoring

The incident revealed the vulnerability of critical public systems that lack adequate redundancy and failover mechanisms. Upgrading to cloud-based systems with built-in resilience, redundancy and real-time monitoring can ensure continuity during outages. Regular testing and maintenance of these systems will help improve reliability and reduce the risk of similar failures. – Tannu Jiwnani, Microsoft

17. Deploy Localized Fail-Safes To Balance Cloud Dependencies

The Pennsylvania 911 outage reminded us that “cloud-based” doesn’t mean “storm-proof.” The real vulnerability? Centralized tech dependencies without localized fail-safes. It’s like putting all your lifeboats on one side of the ship. The solution? Hybrid-resilient architecture—local backups that can kick in like muscle memory when the cloud chokes. – Joel Frenette, TravelFun.Biz

18. Mandate Disaster Recovery Audits And Automatic Failover

The outage exposed a key flaw: inadequate disaster recovery and resiliency planning, as well as insufficient testing. Critical government communication systems should be mandated by law to include built-in resiliency and automatic failover. These capabilities must be audited and tested regularly—ideally, on a yearly basis or even more often—to ensure readiness in emergencies. – Harikrishnan Muthukrishnan, Florida Blue

19. Run Live Failover Drills To Convert Theory Into Practice

The outage showed how rarely critical systems are tested under real-world failure conditions. Paper redundancy means little if no one simulates what happens when a core node actually goes down. Regular live failover drills, combined with distributed backup routes, can turn theoretical resilience into actual continuity when the next crisis hits. – Umesh Kumar Sharma

20. Modernize Emergency Infrastructure With Automated Failovers

The outage was a harsh reminder of how fragile and outdated our emergency systems really are. One failure shouldn’t risk lives. It’s time to move toward cloud-native infrastructure with real-time monitoring and automated failovers. These upgrades aren’t just technical; they’re essential to keeping vital services running when every second counts. – Harvendra Singh, Publix Super Markets Inc.

 
