Incident response in cloud environments differs from traditional on-premises investigations in important ways. Cloud systems are API-driven, distributed, and highly automated, so evidence lives in logs, managed services, and virtual infrastructure rather than on physical hardware. Effective cloud incident response requires understanding shared responsibility, cloud-native tooling, and the particular challenges of ephemeral resources and identity-based attacks.

This chapter explains how incident response works in AWS, Azure, and GCP, the stages of cloud-focused IR, and the techniques investigators use to contain, analyze, and eradicate threats in cloud systems.

Why Cloud Incident Response Is Different

Cloud environments introduce unique challenges:

No physical access to servers or disks
Ephemeral resources (VMs, containers, functions) vanish quickly
Attacks often target identities, not machines
Logging must be pre-enabled or evidence may be missing
Cross-region and cross-account compromise is common
Data exfiltration often happens through cloud APIs rather than OS-level tools

Incident responders must rely heavily on cloud-native evidence sources.

Cloud Shared Responsibility Model (Critical for IR)

Cloud Provider (AWS/Azure/GCP) responsibility:

Physical security
Hardware
Hypervisors
Core networking

Customer responsibility:

IAM configuration
Logging
Encryption
Application security
Data protection
OS-level hardening

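Because identity, logging, and data controls all sit on the customer side, customer misconfigurations are a leading cause of cloud breaches, and the customer is also responsible for making sure the evidence needed for IR is being collected.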

Key Phases of Cloud Incident Response

Cloud IR follows the standard IR lifecycle but uses cloud-specific techniques.

1. Detection & Identification

Identify unusual activity, such as:

Suspicious API calls
Strange login locations
Data access anomalies
Cloud workload execution spikes
New IAM users/keys
Unusual outbound traffic (C2)
Storage downloads

Primary detection sources:

AWS GuardDuty
Microsoft Defender for Cloud (formerly Azure Security Center)
GCP Security Command Center
SIEM alerts (Splunk, Sentinel, Elastic)
AWS CloudTrail / Azure Activity Log / GCP Cloud Audit Logs
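
As a rough illustration of how API logs feed detection, the sketch below scans one CloudTrail log file for a few of the signals listed above (new IAM users and keys, logging changes, console sign-ins). The field names (Records, eventName, eventTime, sourceIPAddress, userIdentity.arn) follow the CloudTrail record format. The WATCHLIST contents, the flag_events name, and the known_ips allowlist are assumptions made for this example, not a recommended rule set.

```python
import json

# Assumption: a small illustrative watchlist. The event names are real
# CloudTrail eventName values, but which ones to watch is a local decision.
WATCHLIST = {
    "CreateUser": "new IAM user",
    "CreateAccessKey": "new access key",
    "StopLogging": "CloudTrail logging stopped",
    "DeleteTrail": "CloudTrail trail deleted",
    "ConsoleLogin": "console sign-in",
}


def flag_events(log_file_text, known_ips):
    """Return watchlisted events from one CloudTrail log file, oldest first.

    log_file_text: JSON text with a top-level "Records" array.
    known_ips: set of source IPs the team considers expected (hypothetical).
    """
    records = json.loads(log_file_text).get("Records", [])
    findings = []
    for rec in records:
        name = rec.get("eventName")
        if name not in WATCHLIST:
            continue
        ip = rec.get("sourceIPAddress", "unknown")
        findings.append({
            "time": rec.get("eventTime"),
            "event": name,
            "reason": WATCHLIST[name],
            "actor": (rec.get("userIdentity") or {}).get("arn", "unknown"),
            "source_ip": ip,
            "known_ip": ip in known_ips,
        })
    # eventTime is an ISO 8601 UTC string, so string order is time order.
    return sorted(findings, key=lambda f: f["time"] or "")
```

In practice a managed detector or SIEM rule would do this job. The point of the sketch is only to show which record fields a responder typically pivots on.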

2. Investigation & Evidence Collection

Cloud investigations require gathering:

API logs

CloudTrail (AWS)
Azure Activity Logs
GCP Audit Logs

Identity logs

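AWS CloudTrail sign-in events and IAM credential reports
Microsoft Entra ID (formerly Azure AD) sign-in logs
Google Cloud Identity login audit logs
Supporting analysis tools: AWS IAM Access Analyzer, GCP IAM Recommender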

Storage logs

S3/Blob/Cloud Storage access logs

Network logs

VPC/NSG/Firewall flow logs

Instance evidence

Snapshot disks
Memory (if VM still active)
Container logs
Lambda/Function logs

Time is critical because cloud resources may auto-terminate and take their volatile evidence with them.

3. Containment

Containment techniques in cloud environments include:

Identity Containment

Disable compromised access keys
Rotate credentials
Remove newly created users
Block suspicious IPs
Revoke OAuth tokens
Enforce MFA

Network Containment

Update security groups
Block outbound connections
Restrict VPC peering
Disable open ports

Resource Containment

Isolate a compromised VM by:
- Removing from load balancers
- Changing SGs to “deny all”
- Capturing snapshots before shutdown

Storage Containment

Lock down public buckets
Disable SAS tokens (Azure)
Block cross-account access

Containment should be reversible and should preserve evidence: prefer disabling over deleting, and isolating over terminating.
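
One identity-containment detail is worth sketching. Disabling a long-term access key does not by itself invalidate temporary credentials that were already issued for a role. A common AWS pattern is to attach a deny-all policy conditioned on aws:TokenIssueTime, so that sessions issued before a cutoff stop working while new sessions are unaffected. The sketch below only builds that policy document. The function name is hypothetical, and attaching the policy to the affected role is left out.

```python
import json
from datetime import datetime, timezone


def revoke_sessions_policy(cutoff):
    """Build an IAM policy document that denies every action for
    sessions whose token was issued before `cutoff`.

    cutoff must be a timezone-aware datetime; it is rendered in UTC.
    """
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            # Only sessions issued before the cutoff match this condition.
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": stamp}},
        }],
    }


if __name__ == "__main__":
    policy = revoke_sessions_policy(datetime.now(timezone.utc))
    print(json.dumps(policy, indent=2))
```

Because the policy can simply be detached later, this step stays reversible, which fits the containment goal described above.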

4. Eradication

Remove attacker presence:

Delete malicious IAM policies
Remove rogue service accounts
Stop unauthorized tasks/functions
Clean up malware in VMs or containers
Remove public access from storage
Reset misconfigured firewall/security group rules
Delete unauthorized snapshots or images

Ensure no persistence remains in:

IAM
Serverless functions
EventBridge/CloudWatch events
Cron jobs (inside VMs)
Launch templates
Instance user-data and startup scripts

5. Recovery

Restore systems to secure state:

Redeploy workloads from clean AMIs/Images
Regenerate IAM keys
Validate security group rules
Re-enable logging
Patch vulnerabilities
Rebuild containers from source

Before returning workloads to service, confirm that no attacker backdoors survived eradication.

6. Post-Incident Review

Perform a full cloud-focused lessons-learned analysis:

What IAM roles were abused?
What misconfigurations allowed the attack?
Which logs were missing?
How could automation improve detection?
What guardrails should be added?

This step turns the incident into concrete fixes for the architecture, logging coverage, and playbooks.

Cloud-Specific Incident Response Techniques

1. Auto-Snapshotting & Evidence Preservation

Before shutting down a compromised VM:

Snapshot EBS (AWS) / Managed Disk (Azure) / Persistent Disk (GCP)
Export instance logs
Preserve cloud function logs
Archive API logs

Snapshots allow forensic imaging later.
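
For AWS, the snapshot step can be sketched as below. The function takes an EC2 client as a parameter, which is assumed to be a boto3 EC2 client (describe_volumes and create_snapshot are boto3 EC2 client methods). The function name, the tag keys, and the case ID format are hypothetical choices for this example. Pagination of describe_volumes is ignored on the assumption that one instance has only a handful of volumes.

```python
from datetime import datetime, timezone


def snapshot_instance_volumes(ec2, instance_id, case_id):
    """Snapshot every EBS volume attached to instance_id.

    ec2: an object exposing describe_volumes / create_snapshot with the
         boto3 EC2 client signatures (assumption: boto3.client("ec2")).
    Returns the list of new snapshot IDs.
    """
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )["Volumes"]

    captured_at = datetime.now(timezone.utc).isoformat()
    snapshot_ids = []
    for vol in volumes:
        resp = ec2.create_snapshot(
            VolumeId=vol["VolumeId"],
            Description=f"IR evidence {case_id} from {instance_id}",
            # Tags (hypothetical keys) tie each snapshot back to the case.
            TagSpecifications=[{
                "ResourceType": "snapshot",
                "Tags": [
                    {"Key": "ir-case", "Value": case_id},
                    {"Key": "source-instance", "Value": instance_id},
                    {"Key": "captured-at", "Value": captured_at},
                ],
            }],
        )
        snapshot_ids.append(resp["SnapshotId"])
    return snapshot_ids
```

A call might look like snapshot_instance_volumes(boto3.client("ec2", region_name="us-east-1"), "i-0123456789abcdef0", "IR-2024-001"), where the region, instance ID, and case ID are placeholders. Passing the client in keeps the function easy to dry-run against a stub before it is used during a real incident.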

2. Serverless / Function IR

Investigate:

CloudWatch logs
Azure Function logs
GCP Cloud Functions logs
IAM execution role permissions
Trigger events (S3, Pub/Sub, EventBridge)

Attackers often deploy malicious serverless functions for persistence.

3. Container / Kubernetes IR

Inspect:

Pod logs
Kubernetes audit logs
Node snapshots
Container registry logs
Unexpected deployments or images

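A compromised container can give an attacker a path to the rest of the cluster, for example through an over-privileged service account token or access to the node, so scope the investigation beyond the single pod.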
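
A first pass over Kubernetes audit logs might look like the sketch below. It assumes the audit log is stored as one JSON event per line and uses fields from the audit Event format (verb, user.username, objectRef.resource, objectRef.subresource, objectRef.namespace, objectRef.name, requestReceivedTimestamp). The INTERESTING set and the function name are assumptions for this example and would need tuning per cluster.

```python
import json

# Assumption: verb/resource pairs worth a first look during triage.
INTERESTING = {
    ("create", "pods"),
    ("create", "deployments"),
    ("create", "daemonsets"),
    ("create", "clusterrolebindings"),
    ("get", "secrets"),
    ("list", "secrets"),
}


def triage_audit_lines(lines):
    """Return audit events that match the triage list or are pod execs."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        verb = event.get("verb")
        resource = ref.get("resource")
        sub = ref.get("subresource")

        if resource == "pods" and sub == "exec":
            action = "pods/exec"          # interactive command in a pod
        elif not sub and (verb, resource) in INTERESTING:
            action = f"{verb} {resource}"
        else:
            continue                       # e.g. pods/binding, pods/status

        hits.append({
            "time": event.get("requestReceivedTimestamp"),
            "user": (event.get("user") or {}).get("username", "unknown"),
            "action": action,
            "namespace": ref.get("namespace"),
            "name": ref.get("name"),
        })
    return hits
```

Filtering on the subresource keeps routine control-plane traffic, such as the scheduler binding pods, out of the results.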

4. IAM-Centric Investigation

Most cloud breaches involve:

Stolen access keys
Over-permissive IAM roles
Misconfigured token access
Account takeover

Analyze:

Key usage
Role switching
OAuth token issuance
MFA bypass attempts
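
Key usage and role switching can be summarized from CloudTrail records along these lines. The sketch assumes the records are already parsed into dictionaries and relies on the userIdentity.accessKeyId, sourceIPAddress, eventTime, eventName, and requestParameters.roleArn fields. The function names and the "more than one source IP" heuristic are assumptions for this example.

```python
def summarize_key_usage(records):
    """Group CloudTrail records by access key ID.

    Returns {access_key_id: {"principal", "source_ips", "assumed_roles",
                             "first_seen", "last_seen"}}.
    """
    usage = {}
    for rec in records:
        ident = rec.get("userIdentity") or {}
        key_id = ident.get("accessKeyId")
        if not key_id:
            continue  # e.g. service-originated events carry no key
        when = rec.get("eventTime")
        entry = usage.setdefault(key_id, {
            "principal": ident.get("arn", "unknown"),
            "source_ips": set(),
            "assumed_roles": set(),
            "first_seen": when,
            "last_seen": when,
        })
        entry["source_ips"].add(rec.get("sourceIPAddress", "unknown"))
        if when:
            # ISO 8601 UTC strings compare correctly as plain strings.
            entry["first_seen"] = min(entry["first_seen"] or when, when)
            entry["last_seen"] = max(entry["last_seen"] or when, when)
        if rec.get("eventName") == "AssumeRole":
            role = (rec.get("requestParameters") or {}).get("roleArn")
            if role:
                entry["assumed_roles"].add(role)
    return usage


def keys_seen_from_many_ips(usage, max_ips=1):
    """Return key IDs used from more than max_ips distinct source IPs."""
    return sorted(k for k, v in usage.items() if len(v["source_ips"]) > max_ips)
```

A key that suddenly appears from a new address and starts assuming roles it has never used before is usually a stronger lead than either signal alone.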

5. Cross-Region & Cross-Account Attacks

Attackers may hide activity in:

Non-default regions
Separate AWS accounts
Additional subscriptions/projects

Investigators must check all regions and all accounts.
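
A quick way to surface hidden activity is to count events per region and account and compare against where the organization actually operates. The sketch below assumes parsed CloudTrail records and uses the awsRegion and recipientAccountId fields. The function name and the expected_regions input are hypothetical. Events for many global services are recorded under us-east-1, so that region usually belongs in the expected set.

```python
from collections import Counter


def unexpected_region_activity(records, expected_regions):
    """Count events per (region, account) for regions outside the expected set.

    Returns a list of ((region, account_id), count), busiest first.
    """
    counts = Counter()
    for rec in records:
        region = rec.get("awsRegion")
        if not region or region in expected_regions:
            continue
        account = rec.get("recipientAccountId", "unknown")
        counts[(region, account)] += 1
    return counts.most_common()
```

This only works if a trail covers all regions and all accounts in the first place. A region with no logging looks identical to a region with no activity.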

Common Cloud Attack Patterns (for IR)

S3 bucket enumeration → mass downloads
Privilege escalation via IAM misconfiguration
Deploying crypto-mining instances
Creating persistence using IAM users or functions
Exfiltration using CloudFront or signed URLs
Deleting or modifying CloudTrail logs
Access key theft via public GitHub repos

These patterns guide response strategy.
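
The first pattern, enumeration followed by mass downloads, can be approximated by counting successful S3 GetObject calls per principal and bucket. The sketch assumes S3 data events are being recorded in CloudTrail (they are not logged by default) and that records are already parsed. The function name and the threshold are assumptions for this example.

```python
from collections import Counter


def heavy_downloaders(records, threshold):
    """Return {(actor_arn, bucket): count} for pairs with at least
    `threshold` successful GetObject calls."""
    counts = Counter()
    for rec in records:
        if rec.get("eventSource") != "s3.amazonaws.com":
            continue
        if rec.get("eventName") != "GetObject":
            continue
        if rec.get("errorCode"):
            continue  # denied or failed reads are not downloads
        actor = (rec.get("userIdentity") or {}).get("arn", "unknown")
        bucket = (rec.get("requestParameters") or {}).get("bucketName", "unknown")
        counts[(actor, bucket)] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}
```

A fixed threshold is crude. Comparing against each principal's normal volume would give fewer false positives, but that needs a baseline the sketch does not attempt to build.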

Best Practices for Cloud Incident Response

Enable logging everywhere (CloudTrail, Flow Logs, Storage Logs)
Use MFA for all high-privilege accounts
Rotate access keys regularly and disable idle ones
Implement least privilege IAM
Create separate production & investigation accounts
Use SIEM integrations (Chronicle, Sentinel, Splunk)
Pre-build IR playbooks specific to cloud environments
Monitor for unusual API actions
Use guardrails: SCPs, Azure Policies, GCP Organization Policies
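
As an example of a guardrail, an AWS service control policy (SCP) can deny the CloudTrail actions attackers use to blind responders. The sketch below builds such a policy document. The four cloudtrail actions are real IAM actions, while the function name, the Sid, and the optional exempt_role_arn parameter (a break-glass role allowed to change trails) are assumptions for this example.

```python
import json


def cloudtrail_protection_scp(exempt_role_arn=None):
    """Build an SCP document that denies tampering with CloudTrail.

    exempt_role_arn: optional ARN pattern of a role that stays allowed
    to manage trails (hypothetical break-glass role).
    """
    statement = {
        "Sid": "ProtectCloudTrail",
        "Effect": "Deny",
        "Action": [
            "cloudtrail:StopLogging",
            "cloudtrail:DeleteTrail",
            "cloudtrail:UpdateTrail",
            "cloudtrail:PutEventSelectors",
        ],
        "Resource": "*",
    }
    if exempt_role_arn:
        # The deny applies to every principal except the exempt role.
        statement["Condition"] = {
            "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    print(json.dumps(cloudtrail_protection_scp(), indent=2))
```

Azure Policy and GCP Organization Policy express comparable restrictions in their own formats, which are not shown here.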

Intel Dump

Cloud incident response relies on API logs, identity logs, storage logs, and network flow logs, because responders have no access to physical evidence.
Key activities include detection, evidence collection, containment, eradication, recovery, and post-mortem analysis.
Cloud attacks commonly target IAM for privilege escalation and storage for data theft.
Responders must quickly snapshot VMs, isolate resources, revoke access keys, restrict security groups, and review cross-region activity.
Effective IR requires proper logging, MFA enforcement, least-privilege IAM, and continual monitoring with cloud-native security tools.