VulnTech Root Cause Analysis

Root Cause Analysis (RCA) is the process of identifying how an incident started, what enabled it, and what allowed the attacker or malware to succeed.
RCA determines the initial point of failure, not just the symptoms.
A SOC analyst must find the exact weakness—misconfiguration, user action, vulnerability, or missing detection—that allowed the incident to occur.

This chapter covers RCA at the depth a SOC team needs: log evidence, investigative workflow, attacker timelines, and real SOC case studies.

What Root Cause Analysis Really Means

RCA answers the most important questions in incident response:

How did the attacker get in?
What allowed the attack to succeed?
What vulnerability or mistake was exploited?
Could this have been prevented?
What detection or control failed?
Is the root cause fixed so it won’t happen again?

RCA goes beyond identifying malware or suspicious behavior.
It digs into the origin, not the symptoms.

When RCA Happens in the Incident Lifecycle

RCA occurs after containment and eradication, during the deeper investigation led by L2/L3 analysts.

It relies on:

Host logs
Network logs
EDR telemetry
Forensics
User activity history
Configuration audits

Together, these sources let the analyst establish where and how the incident began.

Core Components of Root Cause Analysis

1. Trigger Event Identification

Identify the exact moment where malicious activity started.

Example:

User clicked phishing URL at 10:32 UTC → dropper.ps1 downloaded

This event starts the entire attack timeline.

2. Initial Access Vector

Determine how the attacker first entered.

Common vectors:

Phishing
Vulnerable public-facing apps
Weak passwords
Exposed RDP
Stolen credentials
Misconfigured cloud resources

Example:

Brute force → successful SSH login
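A brute-force entry like the one above usually shows up in the SSH auth log as a run of failures followed by one accepted login from the same source. The sketch below assumes the standard OpenSSH "Failed password" / "Accepted password" message format; the function name, the threshold of 5 failures, and the sample log lines are illustrative assumptions, not values from a real incident.

```python
import re
from collections import defaultdict

# Matches the usual OpenSSH password-auth messages (format assumed).
SSH_EVENT = re.compile(
    r"(?P<result>Failed|Accepted) password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)

def find_bruteforce_success(lines, min_failures=5):
    """Return (ip, user, failures) for each accepted login that was preceded
    by at least min_failures failed attempts from the same source IP."""
    failures = defaultdict(int)
    hits = []
    for line in lines:
        match = SSH_EVENT.search(line)
        if not match:
            continue
        ip = match.group("ip")
        if match.group("result") == "Failed":
            failures[ip] += 1
        elif failures[ip] >= min_failures:
            hits.append((ip, match.group("user"), failures[ip]))
    return hits

# Hypothetical sample data: six failures, then a success from the same IP.
sample_log = [
    f"Jan 10 03:14:0{i} web01 sshd[2211]: Failed password for root "
    f"from 203.0.113.7 port 5000{i} ssh2"
    for i in range(6)
] + [
    "Jan 10 03:14:09 web01 sshd[2211]: Accepted password for root from 203.0.113.7 port 50010 ssh2",
    "Jan 10 08:00:00 web01 sshd[3001]: Accepted password for alice from 198.51.100.4 port 40000 ssh2",
]

print(find_bruteforce_success(sample_log))
```

The second accepted login is not reported because its source IP has no preceding failures. A production rule would probably also bound the failures to a time window.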

3. Execution Path

Identify what code executed and how.

Examples:

Encoded PowerShell
MS Office macro execution
Linux shell script in /tmp
DLL side-loading

Example:

WINWORD.exe → powershell.exe → payload.exe
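One way to spot a chain like this in process-creation data is to flag any Office application that spawns a script interpreter. The following is a minimal sketch; the two name lists are illustrative and not exhaustive, and the function names are assumptions made for this example.

```python
# Illustrative lists only; tune them to the environment.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SCRIPT_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe", "cscript.exe", "mshta.exe"}

def basename(path):
    """Return the lower-cased file name from a Windows or POSIX style path."""
    return path.replace("/", "\\").rsplit("\\", 1)[-1].lower()

def is_suspicious_spawn(parent_image, child_image):
    """True when an Office process is the parent of a script interpreter."""
    return (
        basename(parent_image) in OFFICE_PARENTS
        and basename(child_image) in SCRIPT_CHILDREN
    )

print(is_suspicious_spawn(
    r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
    r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
))
```

Matching on the file name alone is easy to evade by renaming a binary, so treat a hit as a lead to investigate rather than proof.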

4. Privilege Escalation Cause

Identify how the attacker obtained elevated rights.

Examples:

Sudo misconfiguration
Token theft
Kerberoasting
Exploits

Example:

sudoers file allowed user to execute bash without password
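A quick check for this weakness is to list every sudoers rule that carries the NOPASSWD tag. The sketch below works on sudoers text passed in as a string; the sample content and the user name "deploy" are hypothetical, and the function does not follow include directives or expand aliases.

```python
def find_nopasswd_rules(sudoers_text):
    """Return the non-comment sudoers lines that grant NOPASSWD execution."""
    findings = []
    for raw in sudoers_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "NOPASSWD:" in line:
            findings.append(line)
    return findings

# Hypothetical sudoers content for illustration.
sample_sudoers = """\
# User privilege specification
root ALL=(ALL:ALL) ALL
deploy ALL=(ALL) NOPASSWD: /bin/bash
%admin ALL=(ALL) ALL
"""

for rule in find_nopasswd_rules(sample_sudoers):
    print(rule)
```

A NOPASSWD rule for a shell such as /bin/bash is effectively passwordless root, which is why it belongs in the root cause rather than being treated as a side note.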

5. Persistence Mechanism

Determine how the attacker maintained access.

Examples:

Registry Run keys
Cron job
Scheduled task
Malicious service

Example:

schtasks /create /tn "Updater" /tr "C:\Users\Public\bd.exe" /sc onlogon

6. Detection Failure Analysis

Find out why detection didn’t fire earlier.

Reasons:

Logging disabled
Rule too broad or too narrow
Telemetry missing
Threat was unknown
Tool malfunction

Example:

Sysmon not installed and process-creation auditing (Event ID 4688) not enabled → no process creation logs
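Telemetry gaps like this can be made explicit by comparing the log sources each host actually sent against the sources the SOC expects. This is a minimal sketch; the source names in REQUIRED_SOURCES and the host names are assumptions chosen for illustration.

```python
# Sources the SOC expects from every host (illustrative names).
REQUIRED_SOURCES = {"process_creation", "powershell_logging", "dns", "edr"}

def telemetry_gaps(host_sources):
    """Map each host to the sorted list of required sources it is missing.
    Hosts with full coverage are left out of the result."""
    gaps = {}
    for host, seen in host_sources.items():
        missing = REQUIRED_SOURCES - set(seen)
        if missing:
            gaps[host] = sorted(missing)
    return gaps

# Hypothetical inventory of what each host reported during the incident window.
observed = {
    "ws-042": {"dns"},
    "srv-01": {"process_creation", "powershell_logging", "dns", "edr"},
}

print(telemetry_gaps(observed))
```

Recording the missing sources per host gives the detection failure a concrete, fixable form for the RCA report.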

7. Control Weakness Identification

Identify which security control failed.

Examples:

Missing patches
Weak firewall rules
No MFA
Unrestricted outbound connections
Unmonitored DNS traffic

Example:

Public-facing Tomcat server unpatched for 8 months
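A patching gap like this is easier to defend in a report when it is expressed as a measured age against a stated limit. The sketch below assumes a 30-day patch window, which is an illustrative policy value, and the dates are hypothetical.

```python
from datetime import date

def patch_age_days(last_patched, as_of):
    """Number of days between the last patch date and the reference date."""
    return (as_of - last_patched).days

def exceeds_patch_window(last_patched, as_of, max_days=30):
    """True when the system has gone longer than max_days without a patch."""
    return patch_age_days(last_patched, as_of) > max_days

# Hypothetical dates roughly eight months apart.
last_patched = date(2024, 1, 1)
incident_date = date(2024, 9, 1)

print(patch_age_days(last_patched, incident_date))
print(exceeds_patch_window(last_patched, incident_date))
```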

8. Final Root Cause Statement

A one-sentence explanation summarizing the true cause.

Example:

Root Cause: A user ran a malicious macro from a phishing attachment. Missing attachment filtering let the email reach the inbox, and insufficient PowerShell restrictions let the macro download and run the payload.

Practical RCA Workflow (SOC-Level)

The workflow below is the sequence L2/L3 analysts typically follow.

Step 1 — Validate Timeline Start

Identify the earliest suspicious action:

10:32 – User clicked phishing link
10:33 – payload.ps1 downloaded
10:34 – C2 communication established

The first suspicious event becomes the starting point.
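Finding the starting point amounts to sorting the suspicious events by timestamp and taking the first one. The sketch below uses the three events from the timeline above plus one benign event; the date, the tuple layout, and the "suspicious" flag are assumptions made for illustration.

```python
from datetime import datetime

# (timestamp, source, description, suspicious) - the date is hypothetical.
events = [
    ("2024-05-02T10:34:00+00:00", "proxy", "C2 communication established", True),
    ("2024-05-02T10:32:00+00:00", "proxy", "User clicked phishing link", True),
    ("2024-05-02T10:30:00+00:00", "auth", "User logged on to workstation", False),
    ("2024-05-02T10:33:00+00:00", "proxy", "payload.ps1 downloaded", True),
]

def timeline_start(events):
    """Return the earliest event marked suspicious, or None if there is none."""
    suspicious = [event for event in events if event[3]]
    if not suspicious:
        return None
    return min(suspicious, key=lambda event: datetime.fromisoformat(event[0]))

print(timeline_start(events))
```

Parsing the timestamps instead of comparing strings keeps the ordering correct when log sources report in different time zones.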

Step 2 — Identify Attack Vector

Using logs:

Proxy logs → malicious URL
Email logs → phishing email
Firewall logs → inbound traffic
Authentication logs → brute force success

Example:

Email attachment triggered macro → malicious script executed

Step 3 — Reconstruct Execution Chain

Using Sysmon and Linux logs:

WINWORD.exe → powershell.exe → curl → payload.exe

/tmp/bd.sh executed → created miner binary

The attack path shows how the malware ran.
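An execution chain can be rebuilt by walking parent links from the final payload back to the first process. The sketch below uses plain pid/ppid records with hypothetical values. Windows reuses PIDs, so with real Sysmon data the ProcessGuid and ParentProcessGuid fields are a safer key.

```python
def build_chain(records, leaf_pid):
    """Walk parent links from leaf_pid upward and return image names,
    ordered from the earliest ancestor found to the leaf."""
    by_pid = {record["pid"]: record for record in records}
    chain = []
    seen = set()  # guards against loops in malformed data
    pid = leaf_pid
    while pid in by_pid and pid not in seen:
        seen.add(pid)
        chain.append(by_pid[pid]["image"])
        pid = by_pid[pid]["ppid"]
    return list(reversed(chain))

# Hypothetical process-creation records.
records = [
    {"pid": 4100, "ppid": 1200, "image": "WINWORD.exe"},
    {"pid": 4188, "ppid": 4100, "image": "powershell.exe"},
    {"pid": 4312, "ppid": 4188, "image": "payload.exe"},
]

print(" → ".join(build_chain(records, 4312)))
```

The walk stops at the first parent that has no record, which marks where the available telemetry ends.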

Step 4 — Determine Privilege Escalation

Check for:

sudo
exploitation
credential dumping
AD misconfigurations

Example:

Event ID 4672: special privileges assigned to a new logon by the compromised user

Step 5 — Identify Lateral Movement

Firewall + Windows auth logs:

4624 LogonType 3 from infected host

Network logs:

SMB connection to file server
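The 4624 check can be sketched as a filter over parsed Windows Security events: keep network logons (LogonType 3) whose source address is the infected host, then list the machines that accepted them. The events are assumed to be already parsed into dicts; the host names, account, and IP addresses are hypothetical.

```python
def lateral_movement_targets(events, infected_ip):
    """Return the hosts that logged a network logon (4624, LogonType 3)
    originating from infected_ip, in first-seen order."""
    targets = []
    for event in events:
        if (
            event.get("EventID") == 4624
            and event.get("LogonType") == 3
            and event.get("IpAddress") == infected_ip
        ):
            host = event.get("Computer")
            if host and host not in targets:
                targets.append(host)
    return targets

# Hypothetical parsed events.
events = [
    {"EventID": 4624, "LogonType": 3, "IpAddress": "10.0.5.23", "Computer": "FILESRV01", "TargetUserName": "jdoe"},
    {"EventID": 4624, "LogonType": 2, "IpAddress": "10.0.5.23", "Computer": "WS-042", "TargetUserName": "jdoe"},
    {"EventID": 4624, "LogonType": 3, "IpAddress": "10.0.9.10", "Computer": "DC01", "TargetUserName": "svc_backup"},
    {"EventID": 4624, "LogonType": 3, "IpAddress": "10.0.5.23", "Computer": "FILESRV01", "TargetUserName": "jdoe"},
]

print(lateral_movement_targets(events, "10.0.5.23"))
```

Each returned host is a candidate for its own persistence and credential review, since the attacker may have landed there.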

Step 6 — Persistence Review

Check:

Registry Run keys
Cron jobs
Services
Scheduled tasks

Example:

HKCU\Software\Microsoft\Windows\CurrentVersion\Run → updater.exe
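Once the persistence entries are collected, a simple triage is to flag any entry whose command points into a directory that ordinary users can write to. The sketch below works on entries that have already been exported as (location, command) pairs; the path hints are an illustrative heuristic, and the sample entries are hypothetical.

```python
# Directories that standard users can commonly write to (illustrative heuristic).
USER_WRITABLE_HINTS = ("\\users\\public\\", "\\appdata\\", "\\temp\\", "\\programdata\\")

def flag_persistence_entries(entries):
    """Return the (location, command) pairs whose command references
    one of the user-writable path hints."""
    flagged = []
    for location, command in entries:
        lowered = command.lower()
        if any(hint in lowered for hint in USER_WRITABLE_HINTS):
            flagged.append((location, command))
    return flagged

# Hypothetical exported persistence entries.
entries = [
    (r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run", r"C:\Users\jdoe\AppData\Roaming\updater.exe"),
    ("Scheduled task: Updater", r"C:\Users\Public\bd.exe"),
    ("Service: Spooler", r"C:\Windows\System32\spoolsv.exe"),
]

for location, command in flag_persistence_entries(entries):
    print(location, "->", command)
```

Legitimate software also installs under AppData and ProgramData, so flagged entries still need manual review.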

Step 7 — Identify Control Failures

Examples:

No EDR on machine
SIEM rule too weak
Lack of network segmentation
Unrestricted outbound traffic
No MFA on admin accounts

Step 8 — Deliver Root Cause Statement

Final deliverable includes:

Trigger event
Attack vector
Failed control
Weakness exploited
What allowed escalation
How to prevent recurrence
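The deliverable can be treated as a fixed record so that no element is silently left out. The following sketch models the six items above as a dataclass with a completeness check; the class name, field names, and sample values are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class RootCauseReport:
    trigger_event: str
    attack_vector: str
    failed_control: str
    weakness_exploited: str
    escalation_enabler: str
    prevention: str

    def missing_fields(self):
        """Names of fields that are empty or whitespace only."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self):
        """One 'Label: value' line per field, in declaration order."""
        return "\n".join(
            f"{name.replace('_', ' ').capitalize()}: {value}"
            for name, value in vars(self).items()
        )

# Illustrative values; the escalation finding is deliberately left blank.
report = RootCauseReport(
    trigger_event="User clicked phishing link",
    attack_vector="Phishing email",
    failed_control="No EDR on machine",
    weakness_exploited="Insufficient PowerShell restrictions",
    escalation_enabler="",
    prevention="Filter attachments and restrict PowerShell",
)

print(report.missing_fields())
print(report.render())
```

An empty field is a prompt to go back to the evidence, or to state explicitly that no escalation took place.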

Real SOC RCA Examples

Example 1 — Malware Infection from Phishing

Findings:

User opened malicious Word doc
Macro executed PowerShell
Downloaded payload
C2 communication established
No EDR installed
PowerShell logging disabled

Root Cause:

Phishing email led to macro execution due to inadequate email filtering and insufficient PowerShell restrictions.

Example 2 — SSH Brute Force → Server Compromise

Findings:

Public SSH exposed
Password-based auth enabled
Weak password
Attacker brute-forced credentials
Installed crypto miner

Root Cause:

Weak SSH password and lack of brute force protection allowed unauthorized access.
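Whether password authentication was enabled can be confirmed from the server's sshd_config. The sketch below checks the PasswordAuthentication keyword in config text supplied as a string. It relies on two documented sshd behaviors: the first value found for a keyword is the one used, and the default is yes. It is a simplification that ignores Match blocks and Include files.

```python
def password_auth_enabled(config_text):
    """Return True if sshd_config text leaves password authentication on.
    Simplified: ignores Match blocks and Include directives."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split(None, 1)
        if parts[0].lower() == "passwordauthentication" and len(parts) == 2:
            # First occurrence wins, so return immediately.
            return parts[1].split()[0].lower() == "yes"
    return True  # keyword absent: the default is yes

# Hypothetical config fragment.
sample_config = """\
# PasswordAuthentication no
PermitRootLogin yes
PasswordAuthentication yes
"""

print(password_auth_enabled(sample_config))
```

On a live host, `sshd -T` prints the effective configuration and is the more reliable check when Match blocks are in use.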

Example 3 — Lateral Movement in Windows Domain

Findings:

User credentials stolen via LSASS dumping
No credential guard
Attacker used valid credentials
Moved through SMB and WinRM

Root Cause:

LSASS memory exposure due to lack of endpoint hardening enabled credential theft and lateral movement.

Example 4 — Cloud Misconfiguration

Findings:

S3 bucket misconfigured as public
Data exposed externally
No IAM policy restrictions
No monitoring

Root Cause:

Public S3 bucket misconfiguration caused unauthorized external access.
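Public exposure through a bucket policy can be confirmed by looking for Allow statements with a wildcard principal and no condition. The sketch below inspects a policy document supplied as JSON text; the bucket name, account ID, and role are placeholders. It does not cover bucket ACLs or Block Public Access settings, which also affect whether a bucket is public.

```python
import json

def is_wildcard_principal(principal):
    """True for "*" or for {"AWS": "*"} / {"AWS": [..., "*", ...]}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        return aws == "*" or (isinstance(aws, list) and "*" in aws)
    return False

def public_statements(policy_json):
    """Return the Allow statements that grant access to everyone unconditionally."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and is_wildcard_principal(s.get("Principal"))
        and "Condition" not in s
    ]

# Placeholder policy for illustration.
sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::111122223333:role/app"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

for statement in public_statements(sample_policy):
    print(statement["Action"], statement["Resource"])
```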

Analyst Workflow for RCA

Collect all logs (endpoint + network + cloud)
Identify earliest malicious event
Determine entry point
Reconstruct execution chain
Identify privilege escalation
Identify lateral movement
Identify persistence mechanisms
Determine detection failures
Identify configuration or policy gaps
Finalize root cause statement

A thorough RCA, followed by fixing what it finds, reduces the chance of a repeat incident.

Intel Dump

RCA identifies the true origin of an attack, not just symptoms.
It requires reconstructing the full timeline from the earliest malicious event.
RCA includes initial access, execution, escalation, persistence, lateral movement, and detection failure analysis.
Common root causes include phishing, weak passwords, unpatched systems, misconfigurations, missing logging, and lack of segmentation.
RCA ends with a clear statement: what caused the incident and how to prevent it in the future.
