CrowdStrike’s faulty update crashed 8.5 million Windows devices, says Microsoft (www.theverge.com)
from jeffw@lemmy.world to technology@lemmy.world on 21 Jul 2024 02:17
https://lemmy.world/post/17791289

#technology


autotldr@lemmings.world on 21 Jul 2024 02:20

This is the best summary I could come up with:


CrowdStrike’s faulty update caused a worldwide tech disaster that affected 8.5 million Windows devices on Friday, according to Microsoft.

Microsoft says that’s “less than one percent of all Windows machines,” but it was enough to create problems for retailers, banks, airlines, and many other industries, as well as everyone who relies on them.

Separately, the technical breakdown from CrowdStrike released Friday explains more about what happened and why so many systems were affected all at once.

CrowdStrike’s breakdown explains the configuration file that was at the heart of the issue:

CrowdStrike explained that the file is not a kernel driver but is responsible for “how Falcon evaluates named pipe execution on Windows systems.” Security researcher and Objective-See founder Patrick Wardle says that the explanation aligns with the earlier analysis he and others provided about the cause of the crash, as the problem file “C-00000291-” “triggered a logic error that resulted in an OS crash” (via CSAgent.sys).

CrowdStrike’s channel file updates were pushed to computers regardless of any settings meant to prevent such automatic updates, Wardle noted.


The original article contains 193 words, the summary contains 175 words. Saved 9%. I’m a bot and I’m open source!

Technus@lemmy.zip on 21 Jul 2024 02:34

No validation, in the driver or the updater software.

No validation or automated testing on publish.

No staged rollouts.

Just utterly irresponsible all around.
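
For illustration, the “staged rollouts” point can be sketched as below. This is a minimal sketch and not how CrowdStrike’s updater works: `staged_rollout`, the stage fractions, and the `deploy_and_check` health callback are all hypothetical names. The idea is only that a small cohort gets the update first and the rollout halts if that cohort reports failures.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    stage_fractions: Sequence[float],
    deploy_and_check: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> list[str]:
    """Deploy to growing cohorts and stop once a stage exceeds the failure budget.

    `deploy_and_check` is a hypothetical callback that installs the update on
    one host and returns True if the host is still healthy afterwards.
    Returns the hosts that received the update.
    """
    updated = []
    start = 0
    for fraction in stage_fractions:
        # Each fraction is cumulative: the stage extends the rollout
        # to that share of the fleet (at least one new host per stage).
        end = min(len(hosts), max(start + 1, int(len(hosts) * fraction)))
        cohort = hosts[start:end]
        if not cohort:
            break  # whole fleet already covered
        failures = 0
        for host in cohort:
            updated.append(host)
            if not deploy_and_check(host):
                failures += 1
        if failures / len(cohort) > max_failure_rate:
            break  # halt: remaining hosts never receive the update
        start = end
    return updated
```

With fractions such as `[0.01, 0.1, 1.0]` on a fleet of 100 hosts, an update that crashes every machine would stop after the first host instead of reaching all of them.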

boatswain@infosec.pub on 21 Jul 2024 02:53

A coworker of mine has worked with CrowdStrike in the past; I haven’t. He said that the releases he was familiar with from them in the past were all staged into groups and customers were encouraged to test internally before applying them; not sure if this is a different product or what, but it seems like a big step backwards if what he’s saying is right.

ramble81@lemm.ee on 21 Jul 2024 03:24

I first dealt with them 10+ years ago and at the time they had no ability to do staged rollouts or targeted rollouts. We got updates when they said we did, no choice or control. We had to resort to updating our firewall to restrict the download endpoint and only open it in groups to do a phased update.

boatswain@infosec.pub on 21 Jul 2024 03:58

Interesting! Sounds like they may have changed things a few times, or maybe my co-worker’s memory has some gaps.

BearOfaTime@lemm.ee on 21 Jul 2024 21:33

Oh ffs

SupraMario@lemmy.world on 21 Jul 2024 14:10

Channel files are different from sensor updates: you have no version control over channel files, but you do have control over sensor releases.

boatswain@infosec.pub on 21 Jul 2024 15:08

Ah interesting, thanks!

suzune@ani.social on 21 Jul 2024 02:57

The idea of “security software” is ridiculous overall. You buy software to fix security problems in Windows, and it violates the original product by inserting code into the kernel. You lose support from the original product vendor. And you think you’re secure, even though the whole thing makes you forget that IT should always be able to solve security and restorability problems, even when everything else fails.

Dagnet@lemmy.world on 21 Jul 2024 03:03

And on a Friday to make things worse

barkingspiders@infosec.pub on 21 Jul 2024 03:28

Preach it

demizerone@lemmy.world on 21 Jul 2024 04:04

When I worked there six years ago, the company motto was “two feet on the gas pedal” because the CEO was a race car driver. I bailed after 10 months, giving up pre-IPO shares. The management for my team was nonexistent, and I was on the build and release team. People were doing releases manually. They’ve improved the automation some from what I hear, but looks like the motto finally hit them.

I should also say their metrics were absolutely staggering. The log aggregator was doing something like 2 trillion requests a week. All backed by Splunk. I never heard what they were paying, but it must have been fucking nuts.

Rediphile@lemmy.ca on 21 Jul 2024 15:36

Race car drivers definitely don’t put both feet on the gas pedal though… Like, what?

FiskFisk33@startrek.website on 22 Jul 2024 11:21

I would’ve preferred Colin McRae’s classic in the same spirit: “when in doubt, flat out”

prole@sh.itjust.works on 22 Jul 2024 11:41

The unfortunate thing is that, in the long run, that strategy will probably be super effective. Unless Europe (with the only internet regulations that actually have teeth) does something harsh enough, they will probably pay a few small fines over this at most. Cost of doing business and probably baked in already.

0x0@programming.dev on 21 Jul 2024 22:49

No staged rollouts.

I read somewhere that CS does allow for staged rollouts but some updates deliberately ignore them.

Darkassassin07@lemmy.ca on 21 Jul 2024 02:48

As if the borked update wasn’t bad enough, it was also forced on users that explicitly said not to install it.

CrowdStrike’s channel file updates were pushed to computers regardless of any settings meant to prevent such automatic updates

Wxfisch@lemmy.world on 21 Jul 2024 03:10

From my reading this is misleading at best and likely wrong. I don’t work with CrowdStrike Falcon but have installed and maintained very similar EDR tools in enterprise environments and the channel updates referenced are the modern version of definition updates for a classic AV engine. Being up to date is the entire point and so typically there are only global options to either grab those updates from the vendor or host them internally on a central server but you wouldn’t want to slow roll or stage those updates since that fundamentally reduces the protection from zero days and novel attacks that the product is specifically there to detect and stop. These are not engine updates in that they don’t change the code that is running, they give the code new information about what an attack will look like to allow it to detect malicious activity as soon as CrowdStrike knows what the IoCs look like.

In this case it appears that one of these updates pointed to a bad memory location which caused the engine to crash the OS, but it wasn’t a code update that did it (like a software patch). That should have been caught in QA checks prior to the channel update being pushed out, but it’s in CrowdStrike’s interest to push these updates to all of their customers’ PCs as quickly as they can to allow detection of novel attacks.
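
To make the “bad memory location” idea concrete, here is a rough sketch of the kind of bounds check a content-file loader could run before using any offsets found in the file. The file layout below (a `CHNL` magic, an entry table of offset/length pairs) is entirely made up for the example and is not CrowdStrike’s real channel file format.

```python
import struct

MAGIC = b"CHNL"                 # hypothetical magic bytes, not a real format
HEADER = struct.Struct("<4sI")  # magic, number of entries
ENTRY = struct.Struct("<II")    # offset and length of one record


def validate_channel_file(blob: bytes) -> list[bytes]:
    """Return the records, or raise ValueError instead of reading out of bounds."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    table_end = HEADER.size + count * ENTRY.size
    if table_end > len(blob):
        raise ValueError("entry table runs past end of file")
    records = []
    for i in range(count):
        offset, length = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        # Every record must sit entirely inside the data area of this file.
        if offset < table_end or offset + length > len(blob):
            raise ValueError(f"entry {i} points outside the data area")
        records.append(blob[offset:offset + length])
    return records
```

A file full of zeros, or one whose entry claims more bytes than the file holds, is rejected with a `ValueError` here rather than being dereferenced. In kernel code the equivalent check would have to happen before any pointer derived from the file is used.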

tutus@sh.itjust.works on 21 Jul 2024 03:38

Being up to date is the entire point and so typically there are only global options to either grab those updates from the vendor or host them internally on a central server but you wouldn’t want to slow roll or stage those updates since that fundamentally reduces the protection from zero days and novel attacks that the product is specifically there to detect and stop.

That’s not your, or CrowdStrike’s, decision to make. If organizations have applied settings to not install updates automatically then that’s what they expect to happen and you need to honour it. You don’t “know best”. They do.

Telorand@reddthat.com on 21 Jul 2024 03:54

That should have been caught in QA checks prior to the channel update being pushed out…

I work in QA, and part of the job is justifying why it’s necessary to keep a team of people that doesn’t actually “produce” anything. Either their QA team is now in the hotseat, or CrowdStrike is now realizing why they need one.

Either way, it sounds like a basic smoke test would have uncovered the issue, and the fact that nobody found this means nobody bothered to do one of the most basic tests: turn it on and see if it “catches fire.”

a1studmuffin@aussie.zone on 21 Jul 2024 08:21

God, even if they didn’t have QA test it, they should have had continuous integration running to test all new channel updates against all versions of their program, considering the update will affect all of them. What an epic process failure.
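
As a rough sketch of that kind of CI gate: the function below tries every new channel file against every supported engine version and reports the pairs that fail. `compatibility_failures` and the `loads_cleanly` harness callback are hypothetical names. In practice the callback would boot a test VM with that sensor version, apply the file, and check that the machine stays up.

```python
from itertools import product
from typing import Callable, Iterable


def compatibility_failures(
    engine_versions: Iterable[str],
    channel_files: Iterable[str],
    loads_cleanly: Callable[[str, str], bool],
) -> list[tuple[str, str]]:
    """Try every (engine version, channel file) pair and collect the failures."""
    failures = []
    for version, channel_file in product(engine_versions, channel_files):
        try:
            ok = loads_cleanly(version, channel_file)
        except Exception:
            ok = False  # a crash in the test harness counts as a failure
        if not ok:
            failures.append((version, channel_file))
    return failures
```

A publish step would then refuse to ship anything unless this list comes back empty.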

CriticalMiss@lemmy.world on 21 Jul 2024 04:02

Our organization is configured to install N-1 of current release specifically to avoid this type of stuff. Does it matter? No, we got hit just like everyone else.
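
For readers unfamiliar with the term, an “N-1” policy can be sketched as picking the release one step behind the newest. The helper name and the dotted version strings are assumptions for the example, not Falcon’s actual configuration. As noted elsewhere in this thread, a pin like this reportedly covers sensor releases only, so channel files would bypass it.

```python
def pinned_version(releases: list[str], lag: int = 1) -> str:
    """Pick the release that is `lag` steps behind the newest one (N-1 by default)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    # Sort numerically so that "7.10" ranks above "7.9".
    ordered = sorted(releases, key=lambda v: tuple(int(part) for part in v.split(".")))
    if lag >= len(ordered):
        raise ValueError("not enough releases for the requested lag")
    return ordered[-1 - lag]
```

For releases `["7.9", "7.10", "7.11"]` the default call selects `"7.10"`.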

Docus@lemmy.world on 21 Jul 2024 10:12

Being up to date is the entire point

No, it isn’t. The point is to keep systems safe and operational. Blindly rolling out untested updates is not a good strategy for that. I have seen entire systems shut down due to false alerts from updated antivirus software. Luckily only test environments, before these updates were rolled out to production. It does not take much to test updates like this before rolling them out to your entire organisation.

Darkassassin07@lemmy.ca on 21 Jul 2024 13:52

I’m getting real sick of companies acting like rapists and society just accepting it, if not justifying it for them.

No means no. Plain and simple.

nova_ad_vitum@lemmy.ca on 21 Jul 2024 08:47

The distinction between that and a malicious hack consists entirely of intent.

Reawake9179@lemmy.kde.social on 21 Jul 2024 10:53

Well that’s just terrorism then

Wiz@midwest.social on 21 Jul 2024 13:00

Terrorism would require a political angle.

This is malicious incompetence.

rottingleaf@lemmy.world on 21 Jul 2024 15:49

One can argue that there is a very niche political angle to this - teaching Windows users the fear of God, so that they’d see the error of their ways. But it works in our favor, so let’s not concentrate attention on it.

db2@lemmy.world on 21 Jul 2024 04:10

I doubt it was that few.

alucard@sopuli.xyz on 21 Jul 2024 16:26

For reals. Their self reporting is just trying to mitigate damages from the mistake

EvilEyedPanda@lemmy.world on 22 Jul 2024 00:21

How many Windows updates have bricked PCs over the years?