CrowdStrike’s Falcon Sensor linked to Linux crashes, too • The Register (www.theregister.com)
from sabreW4K3@lazysoci.al to linux@lemmy.ml on 22 Jul 2024 06:02
https://lazysoci.al/post/15835972

#linux

threaded - newest

highduc@lemmy.ml on 22 Jul 2024 09:07 next collapse

Ofc it is. And you can’t do any updates because CrowdStrike doesn’t support newer kernels. Apparently security means running out-of-date packages. 🤡

Bitrot@lemmy.sdf.org on 22 Jul 2024 11:36 collapse

That first issue was triggered by Falcon, but it was legitimately a bug in Red Hat’s kernel, hit through BPF.

rem26_art@fedia.io on 22 Jul 2024 09:13 next collapse

they seem extremely competent at writing bad software

Wooki@lemmy.world on 22 Jul 2024 10:18 next collapse

Line must go up

baatliwala@lemmy.world on 22 Jul 2024 16:06 collapse

That line isn’t going to recover for a while now

possiblylinux127@lemmy.zip on 22 Jul 2024 17:58 collapse

But the publicity

Cyber@feddit.uk on 22 Jul 2024 10:27 collapse

Not sure if it’s the devs to blame when there’s statements like:

Kurtz therefore has the possibly unique and almost-certainly-unwanted distinction of having presided over two major global outage events caused by bad software updates.

So, I’m guessing it’s the business that’s not supporting good dev->test->release practices.

But, I agree with your point; their overall software quality is terrible.

rem26_art@fedia.io on 22 Jul 2024 11:49 collapse

true true. If the general business pressures are not conducive to proper software release practices, no amount of programming skill can help them.

BaalInvoker@lemmy.eco.br on 22 Jul 2024 11:48 next collapse

Difference between open source software and closed source software:

  1. CrowdStrike’s bad code makes Linux crash -> the sysadmin has control over the system and can rapidly fix the issue by disabling the CrowdStrike module -> downtime is limited

  2. CrowdStrike’s bad code makes Windows crash -> the sysadmin has limited control over the system and relies on Windows/CrowdStrike people to fix the issue -> demand is too high because the issue hit many computers around the world at the same time -> huge downtime while a few people at Microsoft and/or CrowdStrike fix the issue one by one, manually
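
The Linux-side mitigation described in point 1 can be sketched in shell. This is a hypothetical illustration, not vendor guidance: `falcon-sensor` is the sensor’s usual systemd unit name, and the dry-run wrapper (an assumption for safety here) just prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical sketch: stopping and disabling the Falcon sensor service
# so an affected Linux host stays up. Run as root on a real machine.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

disable_falcon() {
  # Stop the sensor now, and keep it from starting on the next boot.
  run systemctl stop falcon-sensor
  run systemctl disable falcon-sensor
}

disable_falcon
```

On an affected host an admin would run this as root with `DRY_RUN=0`; the point is simply that the Linux mitigation is two local commands away, with no reboot loop in between.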

Rentlar@lemmy.ca on 22 Jul 2024 15:51 next collapse

I’ve had to make this point every time someone writes “It’s not a Microsoft/closed source problem, it happened to Linux too”.

Bitrot@lemmy.sdf.org on 22 Jul 2024 15:56 next collapse

This is a laughably bad take.

You do realize sysadmins were fixing the Windows issue and not just waiting on Microsoft and CrowdStrike - right? They just had to delete a file.

BaalInvoker@lemmy.eco.br on 22 Jul 2024 16:02 collapse

Oh! That’s why the outage took so long to recover from! Just deleting a file takes that long!

I’m glad you said it!

Bitrot@lemmy.sdf.org on 22 Jul 2024 16:12 next collapse

Uh, yes. Physically touching thousands of computers to boot them into safe mode and delete a file is time consuming. It turns out physically touching thousands of machines is time consuming anywhere, especially when it is all of them at once.

Which is why your take is laughably bad. Stick to the tech and not zealotry next time, and maybe not CNN for tech news.

superkret@feddit.org on 22 Jul 2024 16:13 collapse

You have no idea what you’re talking about.
The fix is to boot into safe or recovery mode, delete a file, reboot. That’s it.

The reason it takes so long is that millions of PCs are affected, and they’re usually administered remotely.
So sysadmins have to drive to multiple places, while their usual workloads wait.
On top of that, you need the encryption recovery keys for each PC to boot into safe mode.
Those are often stored centrally on a server - which may also be encrypted and affected.
Or on an Azure file share, which had an outage at the same time.
Maybe some of the recovery keys are missing. Then you have to reinstall the PC and re-configure every application that was running on it.
And when all of that is over, the admins have to get back on top of all the tasks that were sidelined, which may take weeks.
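
For the Windows side, the manual fix these comments describe was deleting the faulty channel file, matching `C-00000291*.sys` per CrowdStrike’s public guidance, from `C:\Windows\System32\drivers\CrowdStrike` while booted into safe or recovery mode. A minimal sketch of that deletion logic, modeled in POSIX shell purely for readability (the directory path and exact procedure on a real machine will differ):

```shell
#!/bin/sh
# Hypothetical sketch of the reported manual fix. remove_channel_file DIR
# deletes any file in DIR matching the faulty channel-file pattern
# (C-00000291*.sys) and reports what it removed. On a real machine this
# was done from Windows safe/recovery mode, per-host, by hand.
remove_channel_file() {
  dir="$1"
  for f in "$dir"/C-00000291*.sys; do
    [ -e "$f" ] || continue   # glob matched nothing; skip the literal pattern
    rm -f -- "$f"
    echo "removed: $f"
  done
}
```

The fix itself is trivial; as the comments above note, the cost was doing it by hand on thousands of boot-looping machines, often gated behind BitLocker recovery keys.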

JWBananas@lemmy.world on 22 Jul 2024 16:00 next collapse

Sysadmin here. Wtf are you talking about? All we did was “rapidly fix the issue by disabling Crowdstrike module.” Or really, just the one bad file. We were back online before most people even woke up.

What do you think Crowdstrike can do from their end to stop a boot loop?

SquigglyEmpire@lemmy.world on 22 Jul 2024 21:36 next collapse

…what?

A busted kernel module/driver/plug-in/whatever that triggers a bootloop is going to require intervention on any platform no matter whether the code happens to be published somewhere out on the internet or not. On top of that, Windows allows you to control/remove 3rd party kernel drivers just like on Linux, which is exactly what many of us have been stuck doing on endless devices for the last three days.

I fully advocate for open-source software and use it where I can, but I also think we should do that by talking about its actual advantages instead of just making up nonsense that will make experienced sysadmins spit out their coffee.

MangoPenguin@lemmy.blahaj.zone on 23 Jul 2024 00:26 collapse

The fix on Windows was just removing the bad file; as far as I know there was no reliance on CrowdStrike to fix the initial issue.

eee@lemm.ee on 22 Jul 2024 12:05 next collapse

“The most secure system is a system that’s not live. Crowdstrike, bringing you the best-in-class security.”

possiblylinux127@lemmy.zip on 22 Jul 2024 17:57 collapse

“I don’t test often but when I do it is in production”

JWBananas@lemmy.world on 22 Jul 2024 16:05 collapse

Nobody:

Crowdstrike:

<img alt="" src="https://lemmy.world/pictrs/image/ac8007ea-1a98-4cbe-900d-c8bf5ec52d1c.jpeg">