Incident Response Playbook: From Detection to Recovery
A practical guide to handling security incidents — detection, containment, eradication, recovery, and lessons learned. With templates and checklists.
When a breach happens, panic is your enemy. The difference between a minor incident and a catastrophic breach often comes down to one thing: preparation.
This playbook provides a structured approach to incident response that you can adapt to your organization.
The Incident Response Lifecycle
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ Preparation │───►│  Detection  │───►│ Containment │
└─────────────┘    └─────────────┘    └──────┬──────┘
                                             │
┌─────────────┐    ┌─────────────┐    ┌──────▼──────┐
│   Lessons   │◄───│  Recovery   │◄───│ Eradication │
│   Learned   │    │             │    │             │
└─────────────┘    └─────────────┘    └─────────────┘
Phase 1: Preparation (Before Incidents)
Build Your IR Team
| Role | Responsibility |
|---|---|
| IR Lead | Coordinates response, makes decisions |
| Security Analyst | Technical investigation, forensics |
| IT Operations | System access, containment actions |
| Communications | Internal/external messaging |
| Legal | Regulatory compliance, liability |
| Management | Resource allocation, escalation |
Essential Documentation
## IR Runbook Contents
□ Contact list (24/7 phone numbers)
□ Escalation matrix
□ System inventory and owners
□ Network diagrams
□ Backup locations and procedures
□ Vendor contacts (ISP, cloud providers, security tools)
□ Legal/regulatory requirements
□ Communication templates
Tools Ready to Deploy
# Forensics toolkit
- Volatility (memory analysis)
- Autopsy/Sleuth Kit (disk forensics)
- Wireshark (network capture)
- KAPE (artifact collection)
# Response tools
- osquery (endpoint visibility)
- Velociraptor (DFIR at scale)
- TheHive (case management)
- MISP (threat intel sharing)
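A toolkit only counts as "ready to deploy" if it is actually installed on the responder workstation. As a minimal sketch, the hypothetical helper `missing_tools` below prints every command from its argument list that is not on `PATH`. The binary names in the usage comment are assumptions, since packaging differs between distributions.

```shell
# Print every tool from the argument list that is NOT on PATH.
# Returns 0 when all tools are present, 1 when at least one is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (binary names are assumptions; adjust to your installation):
# missing_tools vol tshark osqueryi || echo "toolkit incomplete"
```

Running a check like this on a schedule, rather than during an incident, is the point: you want to find the gap while there is still time to fill it.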
Phase 2: Detection
Alert Triage Process
  Alert Received
         │
         ▼
┌─────────────────┐
│  Is it a true   │──No──► Close as false positive
│    positive?    │        Document why
└────────┬────────┘
         │ Yes
         ▼
┌─────────────────┐
│   What's the    │
│    severity?    │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
Critical  Low/Medium
    │         │
    ▼         ▼
 Page IR  Queue for
  Team  investigation
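The triage flow above is small enough to encode directly, which keeps routing consistent between analysts. The sketch below uses a hypothetical function `triage_route` with invented action labels. The flowchart does not say what to do with any other severity label, so sending those to manual review is an assumption.

```shell
# Route an alert following the triage flow above.
# $1: "yes" if the alert is a true positive, anything else otherwise
# $2: severity label (critical, medium, low)
triage_route() {
    if [ "$1" != "yes" ]; then
        echo "close-false-positive"    # and document why
        return 0
    fi
    case "$2" in
        critical)   echo "page-ir-team" ;;
        low|medium) echo "queue-for-investigation" ;;
        # Assumption: labels the flowchart does not name go to a human.
        *)          echo "needs-manual-review" ;;
    esac
}
```

For example, `triage_route yes critical` prints `page-ir-team`, and `triage_route no medium` prints `close-false-positive`.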
Severity Classification
| Severity | Description | Response Time | Example |
|---|---|---|---|
| P1 - Critical | Active breach, data exfiltration | Immediate (within 15 min) | Ransomware executing |
| P2 - High | Confirmed compromise | 1 hour | Malware on endpoint |
| P3 - Medium | Suspicious activity | 4 hours | Failed login spike |
| P4 - Low | Minor policy violation | 24 hours | Unauthorized software |
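Response-time targets are easier to enforce when tooling can look them up. The sketch below maps each severity in the table above to its target in minutes and checks whether an alert has waited too long. The function names `response_target_minutes` and `is_overdue` are hypothetical, and treating "Immediate (15 min)" as a 15-minute target is an assumption.

```shell
# Map a severity label to its response-time target in minutes,
# using the values from the severity table.
response_target_minutes() {
    case "$1" in
        P1) echo 15 ;;
        P2) echo 60 ;;
        P3) echo 240 ;;
        P4) echo 1440 ;;
        *)  echo "unknown severity: $1" >&2; return 1 ;;
    esac
}

# Succeed (exit 0) when an alert of severity $1 has waited more than
# its target. $2 is the number of minutes since the alert fired.
is_overdue() {
    target=$(response_target_minutes "$1") || return 2
    [ "$2" -gt "$target" ]
}
```

A ticket queue could call `is_overdue P2 75` and escalate whenever it succeeds.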
Initial Assessment Questions
## Quick Triage (5 minutes)
1. What systems are affected?
2. What data could be at risk?
3. Is the attack ongoing?
4. What's the potential blast radius?
5. Do we need to escalate immediately?
Phase 3: Containment
Short-Term Containment (Stop the Bleeding)
# Network isolation: drop all traffic to and from the attacker's IP (run as root)
iptables -I INPUT -s "$ATTACKER_IP" -j DROP
iptables -I OUTPUT -d "$ATTACKER_IP" -j DROP
# Disable compromised account
net user compromised_user /active:no # Windows
usermod -L -e 1 compromised_user # Linux: lock the password and expire the account (-L alone still allows SSH key logins)
# Isolate host (but keep it running for forensics)
# Option 1: Network isolation (run from the console; this also cuts your own remote session)
ip link set eth0 down
# Option 2: VLAN quarantine
# Move to isolated VLAN via switch config
# Option 3: EDR isolation
# Use your EDR's network containment feature
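Under pressure it is easy to paste a malformed address into a firewall command. One way to reduce that risk is to generate the rules first and review them before anything runs. The sketch below uses a hypothetical helper, `print_block_rules`, that only prints the iptables commands shown above and never executes them. The check covers the shape of an IPv4 address only, not the range of each octet.

```shell
# Print the iptables commands that would block one IPv4 address,
# without running them. Rejects input that is not shaped like a dotted quad.
print_block_rules() {
    ip=$1
    # Reject empty input and anything other than digits and dots.
    case "$ip" in
        ''|*[!0-9.]*)
            echo "refusing: '$ip' is not an IPv4 address" >&2
            return 1 ;;
    esac
    # Shape check only: four dot-separated groups of one to three digits.
    if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        echo "refusing: '$ip' is not an IPv4 address" >&2
        return 1
    fi
    printf 'iptables -I INPUT -s %s -j DROP\n' "$ip"
    printf 'iptables -I OUTPUT -d %s -j DROP\n' "$ip"
}
```

Review the printed commands, then paste them into a root shell. The printed text also gives you the exact wording to record in the incident timeline.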
Evidence Preservation
CRITICAL: Preserve evidence BEFORE eradication!
# Memory dump (do this FIRST - volatile!)
# Linux: /dev/mem is restricted on modern kernels and will not yield a full image,
# so use a dedicated acquisition tool such as AVML or LiME
sudo avml /mnt/forensics/memory.lime
# Windows (using winpmem)
winpmem_mini_x64.exe memory.raw
# Disk image (if system can be taken offline)
sudo dd if=/dev/sda of=/mnt/forensics/disk.img bs=4M status=progress
# Network capture (runs until stopped with Ctrl-C)
sudo tcpdump -i eth0 -w /mnt/forensics/capture.pcap
# Collect logs (/var/log/ already contains auth.log and syslog)
sudo tar -czvf /mnt/forensics/logs.tar.gz /var/log/
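Collected evidence is only useful if you can later show it has not changed. A common approach is to hash every file right after collection and re-check the hashes before analysis. The sketch below assumes GNU coreutils `sha256sum`. The function names `hash_evidence` and `verify_evidence` are hypothetical, and the manifest should be stored outside the evidence directory so it is not hashed itself.

```shell
# Write a SHA-256 manifest for every file under an evidence directory.
# $1: evidence directory   $2: manifest path (outside the evidence directory)
hash_evidence() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# Re-check the evidence against a manifest.
# Exits non-zero if any file is missing or its hash no longer matches.
verify_evidence() {
    ( cd "$1" && sha256sum -c >/dev/null ) < "$2"
}

# Example:
# hash_evidence /mnt/forensics /root/INC-2025-0615.sha256
# verify_evidence /mnt/forensics /root/INC-2025-0615.sha256
```

Recording the manifest's own hash in the incident timeline gives you one short value that vouches for the whole collection.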
Document Everything
## Incident Timeline
| Timestamp (UTC) | Action | Actor | Notes |
|-----------------|--------|-------|-------|
| 2025-06-15 14:23 | Alert triggered | SIEM | Suspicious PowerShell |
| 2025-06-15 14:25 | Analyst assigned | @jsmith | P2 severity |
| 2025-06-15 14:32 | Confirmed malicious | @jsmith | C2 beacon identified |
| 2025-06-15 14:35 | Host isolated | @mchen | Network containment via EDR |
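Timelines kept by hand tend to drift out of order and out of time zone. As a minimal sketch, the hypothetical helper `log_event` below appends one row in the table format above and stamps it with the current UTC time. It writes the header first if the file is empty or does not exist yet.

```shell
# Append one row to a Markdown incident timeline, stamped in UTC.
# $1: timeline file   $2: action   $3: actor   $4: notes
log_event() {
    timeline=$1; action=$2; actor=$3; notes=$4
    # Write the table header on first use.
    if [ ! -s "$timeline" ]; then
        printf '| Timestamp (UTC) | Action | Actor | Notes |\n' > "$timeline"
        printf '|-----------------|--------|-------|-------|\n' >> "$timeline"
    fi
    printf '| %s | %s | %s | %s |\n' \
        "$(date -u '+%Y-%m-%d %H:%M')" "$action" "$actor" "$notes" >> "$timeline"
}

# Example:
# log_event timeline.md "Host isolated" "@mchen" "Network containment via EDR"
```

Because the timestamp comes from the system clock, confirm the responder machine is time-synced before relying on it.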
Phase 4: Eradication
Identify Root Cause
## Root Cause Analysis
1. How did the attacker get in?
- [ ] Phishing
- [ ] Vulnerable service
- [ ] Stolen credentials
- [ ] Supply chain
- [ ] Insider
2. How did they move laterally?
- [ ] Credential dumping
- [ ] Exploiting trust relationships
- [ ] Misconfigured permissions
3. What persistence mechanisms exist?
- [ ] Scheduled tasks
- [ ] Services
- [ ] Registry run keys
- [ ] Web shells
- [ ] Backdoor accounts
Remove the Threat
# Kill malicious processes first so they cannot rewrite the files you delete
pkill -9 -f "malicious_process"
# Remove malware (only after it has been preserved as evidence)
rm -f /path/to/malware
# Remove persistence
# Check crontabs (as root, repeat for each account: crontab -l -u <user>)
crontab -l
crontab -e # Delete only the malicious entries; crontab -r would wipe the user's entire crontab
# Check systemd services
systemctl list-units --type=service
systemctl disable --now malicious.service
rm /etc/systemd/system/malicious.service
systemctl daemon-reload
# Check startup scripts
ls -la /etc/init.d/
ls -la ~/.bashrc ~/.profile # Check for backdoors
# Reset compromised credentials
# ALL credentials the attacker could have accessed
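Reading every crontab by eye is slow when many accounts exist. The sketch below is a rough first-pass filter: a hypothetical function, `flag_suspicious_cron`, that reads crontab text on standard input and prints entries matching a few patterns often associated with persistence. The pattern list is illustrative and far from exhaustive, so treat a clean result as "nothing obvious", not "nothing there".

```shell
# Print crontab lines that match a few common persistence patterns.
# Reads crontab text on stdin; comments and blank lines are ignored.
# Patterns (illustrative only): execution from /tmp or /dev/shm,
# base64 decoding, and curl/wget output piped into a shell.
flag_suspicious_cron() {
    grep -Ev '^[[:space:]]*(#|$)' \
        | grep -E '(/tmp/|/dev/shm/|base64[[:space:]]+(-d|--decode)|(curl|wget)[^|]*\|[[:space:]]*(ba)?sh)'
}

# Example (as root):
# crontab -l -u www-data | flag_suspicious_cron
```

The function exits non-zero when nothing matches, so it can drive a simple `if` in a sweep script.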
Verify Eradication
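# Scan for remaining IOCs (skip virtual filesystems, which can stall a recursive grep)
grep -rIl --exclude-dir=proc --exclude-dir=sys --exclude-dir=dev "malicious_string" / 2>/dev/null
find / -name "*.suspicious" 2>/dev/null
# Check for unknown processes
ps auxf | grep -v "known_good"
# Verify network connections
ss -tnp state established # Active TCP connections
ss -tulpn # Listening sockets
# Run security scan
clamscan -r /
rkhunter --check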
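Real incidents rarely involve a single indicator, so the one-string grep above usually grows into a sweep against a list. The sketch below assumes an IOC file with one literal string per line (domains, file names, hashes). The function name `ioc_sweep` is hypothetical, and the `-r` and `-I` options assume GNU grep.

```shell
# List files under a directory that contain any indicator from an IOC file.
# $1: IOC file (one fixed string per line)   $2: directory to sweep
# -F matches indicators literally, -I skips binary files,
# -l prints only the names of matching files.
ioc_sweep() {
    ioc_file=$1; target_dir=$2
    # Drop blank lines first: an empty pattern would match every file.
    clean=$(mktemp) || return 2
    grep -v '^[[:space:]]*$' "$ioc_file" > "$clean" || true
    status=0
    grep -rIlF -f "$clean" "$target_dir" 2>/dev/null || status=$?
    rm -f "$clean"
    return "$status"
}

# Example:
# ioc_sweep /mnt/forensics/iocs.txt /var/www
```

An exit status of 1 means no file matched, which is the result you want to see before declaring eradication complete.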
Phase 5: Recovery
Restore Operations
## Recovery Checklist
□ Restore from clean backups (verified uncompromised)
□ Rebuild systems from known-good images
□ Reset ALL potentially compromised credentials
□ Patch vulnerabilities that enabled the attack
□ Increase monitoring on affected systems
□ Gradual return to production (staged)
□ Verify business functionality
Validation Period
# Enhanced monitoring for 30 days post-incident
# - Additional logging
# - More aggressive alerting thresholds
# - Daily IOC sweeps
# Example: Watch for reinfection
watch -n 60 'grep -c "IOC_STRING" /var/log/syslog'
Phase 6: Lessons Learned
Post-Incident Review (48-72 hours after closure)
## PIR Template
### Incident Summary
- **ID**: INC-2025-0615
- **Duration**: 14:23 - 18:45 UTC (4h 22m)
- **Severity**: P2 - High
- **Impact**: 3 endpoints compromised, no data exfiltration confirmed
### Timeline
[Detailed timeline from documentation]
### What Went Well
- Detection within 5 minutes of initial activity
- Containment prevented lateral movement
- Clear communication throughout
### What Could Be Improved
- Initial triage took 9 minutes from alert to confirmation (target: 5)
- Backup restoration process unclear
- Missing runbook for this attack type
### Action Items
| Action | Owner | Due Date |
|--------|-------|----------|
| Create runbook for phishing-delivered macro malware | @jsmith | 2025-06-30 |
| Improve backup restore documentation | @mchen | 2025-06-25 |
| Add detection for this TTP | @security | 2025-06-22 |
### Root Cause
A phishing email bypassed email security. The user clicked the malicious link
and enabled macros in the resulting document, which executed a PowerShell downloader.
### Recommendations
1. Implement macro blocking for external emails
2. Deploy browser isolation for risky clicks
3. Conduct phishing awareness training
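The Duration field in the summary is simple arithmetic, but it is worth scripting so every review computes it the same way. The sketch below assumes both timestamps fall on the same UTC day and are written as HH:MM. The function names `to_minutes` and `incident_duration` are hypothetical.

```shell
# Convert "HH:MM" to minutes since midnight.
to_minutes() {
    h=${1%%:*}; m=${1##*:}
    # Strip one leading zero so values like "08" are not read as octal.
    h=${h#0}; m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# Print the elapsed time between two same-day "HH:MM" stamps.
incident_duration() {
    total=$(( $(to_minutes "$2") - $(to_minutes "$1") ))
    echo "$(( total / 60 ))h $(( total % 60 ))m"
}

incident_duration 14:23 18:45   # prints "4h 22m"
```

The example reproduces the template's figure: 18:45 is 1125 minutes after midnight, 14:23 is 863, and the 262-minute difference is 4 hours 22 minutes.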
Communication Templates
Internal Notification (to staff)
Subject: [ACTION REQUIRED] Security Incident - Password Reset
Team,
Our security team detected suspicious activity on our network.
As a precaution, please reset your password immediately at [LINK].
What to do:
1. Reset your password now
2. Enable MFA if you haven't already
3. Report any suspicious emails to [SECURITY EMAIL]
What NOT to do:
- Don't click links in unexpected emails
- Don't share your credentials with anyone
We'll provide updates as we learn more.
- Security Team
External Notification (to customers)
Subject: Important Security Notice
Dear Customer,
We recently identified unauthorized access to some of our systems.
We immediately took action to contain the incident and engaged
cybersecurity experts to investigate.
What happened:
[Brief, factual description]
What information was involved:
[Specific data types]
What we're doing:
[Actions taken]
What you can do:
[Actionable steps for customers]
For questions, contact: [SUPPORT EMAIL]
We sincerely apologize for any concern this may cause.
Quick Reference: IR Checklist
## Immediate (First 15 minutes)
□ Confirm the incident is real
□ Classify severity
□ Alert IR team
□ Begin documentation
□ Preserve volatile evidence
## Short-term (First hour)
□ Contain the threat
□ Identify affected systems
□ Collect additional evidence
□ Notify stakeholders
□ Establish communication channel
## Medium-term (First 24 hours)
□ Complete forensic collection
□ Identify root cause
□ Eradicate threat
□ Begin recovery planning
□ Legal/regulatory assessment
## Long-term (Week+)
□ Full recovery
□ Enhanced monitoring
□ Post-incident review
□ Implement improvements
□ Update documentation
Conclusion
Incident response is a skill developed through practice, not just reading. Key takeaways:
- Prepare before incidents — Have plans, tools, and contacts ready
- Document everything — Memory fades, logs don’t
- Preserve before you eradicate — Evidence is fragile
- Communicate clearly — Panic spreads faster than malware
- Learn from every incident — Each one makes you stronger
The goal isn’t to prevent all incidents — that’s impossible. The goal is to detect quickly, respond effectively, and emerge stronger.