One per cent doesn’t seem like a lot until you realise that it includes some of the world’s biggest banks, TV stations, cloud providers, the NHS and plenty of small, medium and large organisations worldwide. Many people initially believed this to be a Microsoft issue; however, it soon became apparent this wasn’t the case.
If you don’t know, CrowdStrike provides endpoint protection and security tools for Windows, Linux and macOS devices.
Here is what happened, in the words of CrowdStrike:
On July 19, 2024 at 04:09 UTC, as part of ongoing operations, CrowdStrike released a sensor configuration update to Windows systems. Sensor configuration updates are an ongoing part of the protection mechanisms of the Falcon platform. This configuration update triggered a logic error resulting in a system crash and blue screen (BSOD) on impacted systems.
The sensor configuration update that caused the system crash was remediated on Friday, July 19, 2024 05:27 UTC.
This issue is not the result of or related to a cyberattack.
Why is this important?
It’s very clear that many organisations affected by this issue have allowed CrowdStrike to push out updates to their sensors without any testing. This will always leave you vulnerable to this type of incident. Whilst the ability to receive quick updates from a security vendor is great and provides immediate protection from a possible threat, you need to manage the possibility of something like this happening again. Thankfully (and I use that term loosely), this wasn’t an attack from a threat actor who had managed to compromise a product and push out a malicious update.
However, the recovery from this type of incident can be long and drawn out, depending on the size of your estate. At the moment, the fix is manual: the .sys file that caused the issue has to be deleted on each affected device. That’s manageable for a 20-seat company with an internal IT team. Now imagine a global bank with 4000-5000 seats, some of them in remote locations. That is a much harder fix to implement.
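To put rough numbers on that difference, here is a back-of-the-envelope sketch in Python. The function name, the 15 minutes per device and the technician counts are all my own assumptions for illustration, not figures from CrowdStrike or from any affected organisation.

```python
import math


def remediation_hours(seats: int, minutes_per_device: float, technicians: int) -> float:
    """Estimate elapsed hours to fix every device by hand.

    Assumes the devices are split evenly between technicians and that each
    fix takes the same time. Travel, queueing and disk-encryption recovery
    keys are deliberately ignored, so real figures would be higher.
    """
    if seats < 0 or minutes_per_device < 0 or technicians < 1:
        raise ValueError("seats and minutes must be >= 0, technicians >= 1")
    # The busiest technician determines how long the whole job takes.
    devices_per_technician = math.ceil(seats / technicians)
    return devices_per_technician * minutes_per_device / 60


# Hypothetical inputs: 15 minutes per device is an assumption, not a measurement.
small_office = remediation_hours(seats=20, minutes_per_device=15, technicians=2)
global_bank = remediation_hours(seats=5000, minutes_per_device=15, technicians=10)
print(f"20 seats, 2 technicians: {small_office} hours")
print(f"5000 seats, 10 technicians: {global_bank} hours")
```

Under those assumptions the small office is back up in 2.5 hours, while the bank needs 125 hours of hands-on work per technician before anyone has travelled to a remote site.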
What can we do moving forward?
What this does highlight is the need for a few things:
- Solid backup and recovery plans for endpoint devices.
- Risk assessments that include software failures, along with implementable recovery plans agreed upon with senior business stakeholders.
- An incident response plan that is in place, tested, and agreed upon with senior management or business owners.
- A test regime that is applied before software updates are pushed out to devices.
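The last point, a test regime, is often run as a staged (ring-based) rollout: an update goes to a small test group first and only moves on if that group stays healthy. The Python below is a minimal sketch of that idea, not how CrowdStrike or any particular deployment tool works. The `Ring` class, the ring names, the failure thresholds and the `apply_update` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Ring:
    """One stage of a rollout (hypothetical structure for illustration)."""
    name: str
    devices: Sequence[str]
    max_failure_rate: float  # fraction of devices allowed to fail, 0.0 to 1.0


def staged_rollout(rings: Sequence[Ring],
                   apply_update: Callable[[str], bool]) -> list[str]:
    """Apply an update ring by ring and halt at the first unhealthy ring.

    apply_update(device) must return True if the device is healthy after
    the update. Returns the names of the rings that completed in budget.
    """
    completed = []
    for ring in rings:
        if not ring.devices:
            completed.append(ring.name)  # nothing to update in this ring
            continue
        failures = sum(1 for device in ring.devices if not apply_update(device))
        if failures / len(ring.devices) > ring.max_failure_rate:
            break  # stop here: later rings never receive the update
        completed.append(ring.name)
    return completed


# Hypothetical estate: a test lab, a pilot group, then everyone else.
rings = [
    Ring("test-lab", ["lab-1", "lab-2"], max_failure_rate=0.0),
    Ring("pilot", ["pilot-1", "pilot-2", "pilot-3", "pilot-4"], max_failure_rate=0.25),
    Ring("production", [f"prod-{i}" for i in range(100)], max_failure_rate=0.01),
]

# Simulate an update that crashes every device it touches.
print(staged_rollout(rings, apply_update=lambda device: False))
```

With an update that breaks every device, the rollout stops after the two lab machines and the other 104 devices are never touched. The trade-off is the one described earlier: each ring adds delay before the whole estate is protected, so the thresholds and soak times need agreeing with the business.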
The Aftermath
Due to this incident, CrowdStrike’s share price lost between 17% and 20% of its value. Perhaps more important is the reputational hit the company will take. I am sure many of their competitors, such as Sophos, Microsoft Defender, or SentinelOne, will be rubbing their hands together and challenging their sales and marketing teams to make the most of the opportunity.
The CrowdStrike CEO released a statement on the company’s website: “To Our Customers and Partners”.
The reality is that any piece of software used within an estate could cause this problem. What set this case apart was the scale of the incident and the number of people affected, because organisations of every type were unable to serve their customers.