Fired After Untested Code in Production: Lessons

A bug in CrowdStrike’s kernel driver triggered a global reboot death spiral, disrupting air travel, hospitals, banks, and more. Here’s how it happened.

What is a Blue Screen of Death?

The Blue Screen of Death (BSOD) is the error screen Windows shows when it hits a fatal system error it cannot safely recover from. It’s displayed on a bright blue background, a hue that strikes fear into the hearts of users. The computer might shut down, restart, or remain stuck on the BSOD.

What Happened?

CrowdStrike broke the cardinal rule of development: never push updates on a Friday!

On July 19, 2024, at 04:09 UTC, a routine sensor configuration update by CrowdStrike triggered a logic error, leading to system crashes and blue screens (BSOD) on impacted Windows systems. This update, part of the Falcon platform’s protection mechanisms, was promptly remediated by 05:27 UTC the same day. The incident was not related to any cyberattack.

What Does Satya Nadella Have to Say?

Impact

Customers running Falcon sensor for Windows version 7.11 and above whose systems were online between 04:09 UTC and 05:27 UTC on July 19, 2024 were affected. Systems that downloaded the update during this window experienced crashes.
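The two conditions above (sensor version and time window) can be expressed as a small check. This is a minimal sketch, not a CrowdStrike tool: the helper names `parse_version` and `may_be_affected` are hypothetical, and treating both window boundaries as inclusive is an assumption, since the text does not say which side is open.

```python
from datetime import datetime, timezone

# Window boundaries taken from the text above (July 19, 2024, UTC).
WINDOW_START = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '7.11' into (7, 11).

    Comparing integers avoids the string-comparison trap where '7.9' > '7.11'.
    """
    return tuple(int(part) for part in version.split("."))


def may_be_affected(sensor_version: str, online_at: datetime) -> bool:
    """Hypothetical helper: True if a host matches both conditions in the text.

    `online_at` must be timezone-aware; comparing a naive datetime with the
    aware window boundaries raises TypeError.
    Assumption: both window boundaries are treated as inclusive.
    """
    new_enough = parse_version(sensor_version) >= (7, 11)
    in_window = WINDOW_START <= online_at <= WINDOW_END
    return new_enough and in_window
```

A host on version 7.11 that was online at 04:30 UTC on July 19 matches both conditions. A host on 7.9, or one that only came online at 06:00 UTC, does not.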

Configuration File Primer

The update involved “Channel Files,” which are crucial for the Falcon sensor’s behavioral protection mechanisms. These files are regularly updated to counter new threats. The specific file affected was Channel File 291, responsible for evaluating named pipe execution on Windows systems.

Technical Details

Channel File 291, located in C:\Windows\System32\drivers\CrowdStrike\ with a filename starting with “C-00000291-” and ending with .sys, triggered a logic error leading to the crashes. This file manages how Falcon interacts with named pipes, a common communication method in Windows.
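To make the naming rule concrete, here is a minimal sketch that lists files matching the pattern described above. It only reads a directory listing and is not a remediation procedure. The function names are hypothetical, and the case-insensitive match is an assumption based on how Windows usually treats filenames.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Directory and filename pattern as described in the text above.
CHANNEL_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
CHANNEL_291_PATTERN = "C-00000291-*.sys"


def is_channel_file_291(filename: str) -> bool:
    """True if the name starts with 'C-00000291-' and ends with '.sys'.

    Assumption: match case-insensitively by lowercasing both sides.
    """
    return fnmatchcase(filename.lower(), CHANNEL_291_PATTERN.lower())


def find_channel_file_291(directory: Path) -> list[Path]:
    """Return matching files in `directory`, sorted; empty list if it is absent."""
    if not directory.is_dir():
        return []
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and is_channel_file_291(path.name)
    )


if __name__ == "__main__":
    # On a machine without the Falcon sensor this prints nothing.
    for match in find_channel_file_291(CHANNEL_DIR):
        print(match)
```

The name check is kept separate from the directory scan so the rule can be tested without touching a real `drivers` directory.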

Remediation

CrowdStrike has updated Channel File 291 to fix the logic error. No further changes are planned for this file. Systems not impacted by the update will continue to function normally. Linux and macOS systems were unaffected as they do not use Channel File 291.

The Developer’s Story

Latest from CrowdStrike on Root Cause Analysis

CrowdStrike is conducting a thorough root cause analysis to understand the logic flaw and prevent future occurrences. Updates will be shared as the investigation progresses.

QABash Media
