Experts Say CrowdStrike Update Likely Skipped Checks

On July 19, 2024, a routine update by cybersecurity firm CrowdStrike caused a global IT outage, affecting numerous industries, including global banks, airlines, hospitals, and government offices. This significant disruption highlights the importance of thorough testing and the potential consequences of oversight in software updates.

 

What Happened

CrowdStrike's Falcon sensor software update, intended to enhance security against hacking by updating threat defenses, included faulty code that led to widespread crashes of systems running Microsoft's Windows operating system. These crashes quickly manifested as the now infamous "blue screen of death," disrupting critical services worldwide. Security experts noted that the update likely did not undergo adequate quality checks before its deployment.

Steve Cobb, Chief Security Officer at SecurityScorecard, commented, “What it looks like is, potentially, the vetting or the sandboxing they do when they look at code, maybe somehow this file was not included in that or slipped through.”

Patrick Wardle, a security researcher, identified the issue as being in a file containing either configuration information or signatures. He explained, “It's very common that security products update their signatures, like once a day... because they're continually monitoring for new malware and because they want to make sure that their customers are protected from the latest threats.”

John Hammond, Principal Security Researcher at Huntress Labs, added, “Ideally, this would have been rolled out to a limited pool first. That is a safer approach to avoid a big mess like this.”

"The blue screen of death" - a critical error screen displayed by Microsoft Windows.

How We Avoided the Global IT Outage

At Total Group, we prioritise comprehensive security by rigorously testing updates before deployment. We operate under the assumption that any update carries potential risk, and we act as a protective barrier between you and your providers so that our clients remain protected from vulnerabilities. This approach is why none of our clients were affected by the recent faulty CrowdStrike update that crashed Windows systems.

  • Test Critical Updates: Before deploying any critical updates to our network, we rigorously test them in a controlled environment to ensure they don’t disrupt our clients' systems.
  • Staggered Rollouts for Non-Critical Updates: Non-critical updates are tested for seven days before they are deployed across the systems that protect our clients, allowing us to catch and address any issues on a smaller scale first.
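
To make the two rules above concrete, here is a minimal Python sketch of a release gate. It is an illustration only, not a description of any real tooling: the names Update, ready_to_deploy and SOAK_PERIOD are hypothetical, and the sketch assumes that a critical update needs a passing test in a controlled environment while a non-critical update needs seven issue-free days.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: the seven-day testing window described for non-critical updates.
SOAK_PERIOD = timedelta(days=7)


@dataclass
class Update:
    """Hypothetical record of a vendor update awaiting deployment."""
    name: str
    critical: bool
    received: date
    passed_sandbox: bool = False  # result of the controlled-environment test
    issues_found: bool = False    # any problem seen during testing


def ready_to_deploy(update: Update, today: date) -> bool:
    """Return True only when the update has cleared its testing rule."""
    if update.issues_found:
        return False  # never release an update with a known problem
    if update.critical:
        # Critical updates ship as soon as the controlled test passes.
        return update.passed_sandbox
    # Non-critical updates wait out the full testing window.
    return today - update.received >= SOAK_PERIOD


if __name__ == "__main__":
    sensor = Update("sensor-content", critical=False, received=date(2024, 7, 19))
    print(ready_to_deploy(sensor, date(2024, 7, 22)))  # False: only three days tested
    print(ready_to_deploy(sensor, date(2024, 7, 26)))  # True: seven days, no issues
```

The point of a gate like this is that a faulty update has to survive a waiting or testing step before it can reach every machine at once.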

So... Can It Happen Again? You Bet.

With the volume of threats growing day by day, the pressure to rush out reactive updates is only increasing, which raises the risk of deploying faulty updates.

The recent CrowdStrike failure, which took down Windows systems worldwide, serves as a stark reminder of the importance of thorough update management and the need for a global conversation about how such IT solutions are maintained and updated. By implementing preventive measures such as pre-deployment testing and staggered rollouts, businesses can significantly reduce their risk of facing similar disruptions. At Total Group, we are committed to safeguarding our clients’ operations so that they remain protected even in the face of widespread IT crises.
