OSINT in the Modern Threat Landscape

Open Source Intelligence (OSINT) is the practice of collecting and analyzing publicly available information to produce actionable intelligence. It’s the first phase of any penetration test, the backbone of threat intelligence programs, and an essential skill for investigators, journalists, and security researchers alike.

The OSINT landscape in 2025 is both richer and more complex than before: APIs get locked down and platforms restrict scraping, but new data sources keep emerging. This guide covers the tools and techniques that actually work today.

Domain and Infrastructure Reconnaissance

Subfinder — Passive Subdomain Discovery

Subfinder uses dozens of passive sources (certificate transparency logs, DNS datasets, web archives) to enumerate subdomains without touching the target directly.

# Install
go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest

# Basic enumeration
subfinder -d example.com -o subdomains.txt

# Use all sources with API keys configured
subfinder -d example.com -all -o subdomains.txt

# Pipe into httpx for live detection
subfinder -d example.com -silent | httpx -silent -status-code -title
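
Results from different enumeration tools overlap and often differ in case, trailing dots, or wildcard prefixes. The following is a minimal sketch of one way to merge several result files into a single in-scope list. The function name `merge_subdomains` and the input file names are hypothetical and not part of subfinder.

```shell
# merge_subdomains DOMAIN FILE...
# Hypothetical helper: combine result files from several tools into one
# sorted, de-duplicated list limited to DOMAIN and its subdomains.
merge_subdomains() {
  local domain="$1"
  shift
  cat "$@" \
    | tr -d '\r' \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/^\*\.//' -e 's/\.$//' \
    | awk -v d="$domain" '
        # keep the apex domain itself
        $0 == d { print; next }
        # keep names that end in ".DOMAIN" (rejects look-alikes such as evil-example.com)
        length($0) > length(d) && substr($0, length($0) - length(d)) == "." d { print }
      ' \
    | LC_ALL=C sort -u
}
```

Usage might look like `merge_subdomains example.com subdomains.txt other-tool.txt > all-subdomains.txt`, where both input files hold one hostname per line.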

Amass — Advanced Attack Surface Mapping

OWASP Amass performs both passive and active enumeration, building a graph of the target’s infrastructure:

# Install
go install -v github.com/owasp-amass/amass/v4/...@master

# Passive enumeration (no direct contact with target)
amass enum -passive -d example.com -o amass-results.txt

# Active enumeration with brute-forcing
amass enum -active -d example.com -brute -w /usr/share/wordlists/dns.txt

# Visualize the attack surface graph
amass viz -d3 -d example.com

Shodan — The Search Engine for Devices

Shodan continuously scans public IP space and indexes the internet-facing devices it finds: servers, IoT gadgets, SCADA systems, webcams. It's one of the most powerful OSINT tools for infrastructure recon.

# Install CLI
pip install shodan
shodan init YOUR_API_KEY

# Search for target's infrastructure
shodan search "hostname:example.com"

# Find specific services
shodan search "org:\"Target Corp\" port:22"

# Check a specific IP
shodan host 203.0.113.42

# Monitor for new exposures
shodan alert create "Target monitoring" 203.0.113.0/24

Pro tip: Combine Shodan with Censys for broader coverage. Each indexes the internet slightly differently.

# Censys CLI
pip install censys
censys search "services.tls.certificates.leaf_data.names: example.com"
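
To see how the two indexes differ for a given target, it can help to compare the IP addresses each one returned. A minimal sketch follows, assuming you have already exported each result set to a text file with one IP per line. The function name `coverage_diff` and the file names are hypothetical.

```shell
# coverage_diff FILE_A FILE_B
# Hypothetical helper: given two files of IP addresses (one per line),
# label each address as seen only in the first file, only in the
# second, or in both.
coverage_diff() {
  local a b
  a=$(mktemp) && b=$(mktemp) || return 1
  # comm requires both inputs sorted in the same collation order
  LC_ALL=C sort -u "$1" > "$a"
  LC_ALL=C sort -u "$2" > "$b"
  LC_ALL=C comm -23 "$a" "$b" | awk '{ print "only_first\t" $0 }'
  LC_ALL=C comm -13 "$a" "$b" | awk '{ print "only_second\t" $0 }'
  LC_ALL=C comm -12 "$a" "$b" | awk '{ print "both\t" $0 }'
  rm -f "$a" "$b"
}
```

For example, `coverage_diff shodan_ips.txt censys_ips.txt` prints one tab-separated line per address. Hosts that appear in only one source are usually the ones worth a second look.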

Email and Identity OSINT

theHarvester — Email and Name Collection

# Install
pip install theHarvester

# Gather emails, subdomains, hosts, and names
theHarvester -d example.com -b all -l 500

# Specific sources
theHarvester -d example.com -b google,linkedin,bing -f output.html

Holehe — Email Account Discovery

Holehe checks if an email address is registered on 120+ websites — without triggering login alerts:

pip install holehe
holehe target@example.com

Output reveals which platforms the target uses, building a social footprint.

GHunt — Google Account Investigation

git clone https://github.com/mxrch/GHunt
cd GHunt
pip install -r requirements.txt

# Login and investigate
python ghunt.py login
python ghunt.py email target@gmail.com

GHunt extracts Google Maps reviews, profile photos, YouTube channels, and calendar availability from a Gmail address.

Social Media Intelligence

Maigret — Username Search Across 2500+ Sites

pip install maigret

# Search for a username across all platforms and write an HTML report
maigret username_target --all-sites --html

# Generate PDF report
maigret username_target --pdf

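# Route requests through a local Tor SOCKS proxy for anonymity
# (assumes Tor is listening on 127.0.0.1:9050)
maigret username_target --proxy socks5://127.0.0.1:9050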

Twint / snscrape — Social Media Scraping

With official APIs increasingly restricted, alternative scrapers fill the gap:

# snscrape for Twitter/X (no API key needed)
pip install snscrape

# Get recent tweets from a user
snscrape --jsonl --max-results 100 twitter-user target_handle > tweets.json

# Search tweets by keyword and location
snscrape --jsonl twitter-search "keyword geocode:48.8566,2.3522,10km" > results.json

Sherlock — Username Hunting

The classic username enumeration tool:

git clone https://github.com/sherlock-project/sherlock
cd sherlock
pip install -r requirements.txt

python sherlock target_username --print-found --output results.txt

Geolocation and Image Analysis

ExifTool — Metadata Extraction

Photos often contain GPS coordinates, camera models, timestamps, and software versions:

# Install
sudo apt install libimage-exiftool-perl

# Extract all metadata
exiftool photo.jpg

# Extract GPS coordinates specifically
exiftool "-gps*" photo.jpg

# Batch strip metadata (for your own OpSec)
exiftool -all= -overwrite_original *.jpg
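
By default ExifTool prints coordinates as degrees, minutes, and seconds, while most mapping tools expect signed decimal degrees. A minimal sketch of the conversion follows. The function name `dms_to_decimal` is hypothetical, and the arguments are assumed to have been read off the ExifTool output by hand.

```shell
# dms_to_decimal DEGREES MINUTES SECONDS HEMISPHERE
# Hypothetical helper: convert a degrees/minutes/seconds coordinate to
# signed decimal degrees. South and West become negative.
dms_to_decimal() {
  LC_ALL=C awk -v d="$1" -v m="$2" -v s="$3" -v h="$4" 'BEGIN {
    v = d + m / 60 + s / 3600
    if (h == "S" || h == "W") v = -v
    printf "%.6f\n", v
  }'
}
```

For a latitude shown as 48 deg 51' 24.00" N, `dms_to_decimal 48 51 24 N` prints 48.856667. ExifTool's own `-n` option also prints numeric coordinates directly, so the sketch is mainly useful when you only have the formatted text.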

GeoSpy / GeoEstimation — AI-Powered Geolocation

Modern AI models can estimate photo locations from visual cues (architecture, vegetation, signage, road markings):

# GeoSpy API
curl -X POST https://dev.geospy.ai/predict \
  -H "Authorization: Bearer YOUR_KEY" \
  -F "image=@mystery_photo.jpg"

# Use ris (Reverse Image Search) CLI tool
pip install reverse-image-search

# Or use the manual approach: upload to
# - Google Lens (images.google.com)
# - TinEye (tineye.com)
# - Yandex Images (often finds results Google misses)

Network and Dark Web OSINT

SpiderFoot — Automated OSINT Framework

SpiderFoot automates 200+ OSINT modules into a single scan:

# Install
pip install spiderfoot

# Start web UI
spiderfoot -l 127.0.0.1:5001

# CLI scan
spiderfoot -s example.com -t INTERNET_NAME,IP_ADDRESS,EMAILADDR -q

OnionScan — Dark Web Analysis

go install github.com/s-rah/onionscan@latest

# Scan a .onion site
onionscan --verbose http://example.onion

OnionScan checks for operational security failures on Tor hidden services — exposed Apache server-status pages, SSH fingerprints, and metadata leaks.

Building an OSINT Workflow

A structured OSINT investigation follows this flow:

1. Define Objective
   └─→ What are you looking for? (person, org, infrastructure)

2. Passive Collection
   └─→ Subdomain enum, WHOIS, certificate transparency
   └─→ Social media, breach databases, public records

3. Semi-Passive Analysis
   └─→ DNS resolution, web crawling, metadata extraction

4. Correlation & Pivoting
   └─→ Link email → username → social profiles → real identity
   └─→ Connect infrastructure: IP → ASN → hosting → other domains

5. Reporting
   └─→ Timeline, entity relationships, confidence levels
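
The email → username pivot in step 4 can be sketched as follows. This is a minimal sketch that assumes a person's usernames often resemble the local part of their email address, which is a heuristic and not a guarantee. The function name `email_to_usernames` is hypothetical.

```shell
# email_to_usernames
# Hypothetical helper: read email addresses on stdin (one per line) and
# print candidate usernames derived from the local part.
email_to_usernames() {
  tr '[:upper:]' '[:lower:]' \
    | awk -F'@' 'NF == 2 && $1 != "" {
        local_part = $1
        sub(/\+.*/, "", local_part)     # drop "+tag" suffixes
        if (local_part == "") next
        print local_part
        stripped = local_part
        gsub(/[._-]/, "", stripped)     # variant without separators
        if (stripped != local_part) print stripped
      }' \
    | LC_ALL=C sort -u
}
```

Feeding it a collected address list, for example `email_to_usernames < emails.txt > candidates.txt`, gives a list of handles that can then be checked one by one with the username tools covered earlier. Treat every match as a lead to verify, not as a confirmed identity.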

The OSINT Framework Cheat Sheet

Bookmark these resources:

Category | Tools
Domains | subfinder, amass, dnsx, whois, crt.sh
IPs & Ports | Shodan, Censys, Masscan, Nmap
Emails | theHarvester, Holehe, Hunter.io
Usernames | Maigret, Sherlock, Namechk
Social | snscrape, Twint, GHunt
Images | ExifTool, Google Lens, TinEye, GeoSpy
Dark Web | OnionScan, Ahmia, Torch
Frameworks | SpiderFoot, Maltego, Recon-ng

OpSec for the Investigator

OSINT is a two-way street. While you investigate, you leave traces:

# Use a VPN or Tor for all OSINT activities
torsocks curl https://target-site.com

# Create burner accounts for social media recon
# Never use your real identity

# Use a dedicated VM or container
docker run -it --rm kalilinux/kali-rolling

# Monitor your own exposure
maigret your_own_username --all-sites

Golden rules:

  1. Never access anything that requires authentication you don’t have
  2. Passive before active — always
  3. Document everything with timestamps
  4. Assume the target has monitoring
  5. Stay within legal boundaries — OSINT uses public data only
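
Rule 3 is easier to follow when logging is a single command. The following is a minimal sketch of an evidence log that records a UTC timestamp and a SHA-256 digest for each saved artifact. The function name `log_evidence` and the tab-separated layout are hypothetical, and the sketch assumes GNU coreutils (`sha256sum`) is installed.

```shell
# log_evidence LOGFILE ARTIFACT NOTE
# Hypothetical helper: append one tab-separated line per saved artifact:
# UTC timestamp, SHA-256 digest, artifact path, free-text note.
log_evidence() {
  local log="$1" artifact="$2" note="$3" stamp digest
  # refuse to log something that cannot be read and hashed
  [ -r "$artifact" ] || return 1
  stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  digest=$(sha256sum < "$artifact" | awk '{ print $1 }')
  printf '%s\t%s\t%s\t%s\n' "$stamp" "$digest" "$artifact" "$note" >> "$log"
}
```

A call such as `log_evidence case.tsv homepage.html "homepage snapshot"` adds one line to the log. The digest lets you show later that the saved file has not changed since it was collected.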

The information is out there. The question is whether you find it before the adversary does.