OSINT Techniques

OSINT (Open-Source Intelligence) is the practice of gathering publicly available information about a target without directly interacting with its infrastructure. OSINT supports passive reconnaissance and helps build an intelligence profile before any active testing begins. It reveals digital footprints, infrastructure details, employee data, exposed credentials, third-party associations, and hidden systems that would otherwise remain unseen.

OSINT is one of the most important skills in web pentesting because it uncovers information that scanning tools alone cannot detect. This chapter covers the underlying theory and step-by-step techniques for applying OSINT efficiently.

Purpose of OSINT in Pentesting

OSINT helps identify external exposure, data leaks, forgotten systems, and behavioral patterns. Attackers depend heavily on OSINT to plan intrusion paths, and pentesters use the same techniques to assess risk.

OSINT supports:

  • Domain profiling

  • Employee intelligence

  • Email pattern discovery

  • Subdomain mapping

  • Technology fingerprinting

  • Credential leak identification

  • Third-party dependency analysis

  • Cloud infrastructure mapping

  • Discovery of undocumented services

OSINT outputs feed directly into active recon, making it a critical early-stage component.

OSINT Categories

OSINT data sources fall into several categories, each revealing a different type of information.

Domain and Infrastructure Intelligence

This includes everything related to the target’s online technical footprint:

  • Domain registrations

  • DNS records

  • IP allocations

  • CDN and cloud providers

  • Historic domain changes

  • Certificate transparency logs

These sources reveal how the organization hosts and manages its web presence.

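Employee Intelligence

This focuses on publicly available information about employees. It is distinct from HUMINT in the strict sense, which involves direct contact with human sources: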

  • Job titles

  • Roles and departments

  • Social media patterns

  • Internal tools mentioned

  • Emails or usernames

  • Leaked credentials in past breaches

Employee intelligence often reveals internal structure and weak entry points.

Technical Metadata Intelligence

Public files often contain hidden metadata, such as:

  • Usernames

  • Device names

  • Software versions

  • Internal folder paths

  • Document creation history

Metadata exposes internal details without any contact with the target’s systems.

Third-Party Intelligence

Organizations use multiple external services:

  • Payment processors

  • Cloud storage

  • Email platforms

  • CRM tools

  • Helpdesk systems

  • Analytics systems

OSINT helps uncover these third-party dependencies and their potential weaknesses.

Practical OSINT Techniques

OSINT relies on systematic mapping of available information. Below are essential techniques with exact practical steps.

Search Engine Enumeration

Search engines index publicly accessible data. Using advanced search operators, you can uncover information not visible on the main website.

Google Dorks

Identify exposed directories:

site:example.com intitle:"index of"

Find login portals:

site:example.com inurl:login

Discover files:

site:example.com filetype:pdf
site:example.com filetype:docx
site:example.com filetype:xls

Find subdomains other than www, which often include staging environments:

site:*.example.com -site:www.example.com

These queries often reveal endpoints and files that are not linked from the main site.
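
When the same dorks have to be run against several domains, it can help to generate them rather than type them by hand. The following is a minimal sketch, not a definitive tool: the function names build_dorks and search_url are hypothetical, the subdomain query uses the -site: exclusion form, and the Google search URL format is an assumption for manual use in a browser.

```python
from urllib.parse import quote_plus


def build_dorks(domain: str) -> dict[str, str]:
    """Return a label -> query mapping of common dorks for one domain."""
    dorks = {
        "directory_listings": f'site:{domain} intitle:"index of"',
        "login_portals": f"site:{domain} inurl:login",
        "non_www_subdomains": f"site:*.{domain} -site:www.{domain}",
    }
    # One file-discovery query per document type of interest.
    for ext in ("pdf", "docx", "xls"):
        dorks[f"files_{ext}"] = f"site:{domain} filetype:{ext}"
    return dorks


def search_url(query: str) -> str:
    """URL-encode a query so it can be pasted into a browser (assumed URL format)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for label, query in build_dorks("example.com").items():
    print(f"{label}: {query}")
```

The script only builds strings. Running the queries manually keeps the volume low and avoids automated scraping of search results.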

Certificate Transparency OSINT

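Certificate Transparency (CT) logs record TLS certificates issued by publicly trusted certificate authorities. Because each certificate lists the hostnames it covers, the logs often expose subdomains meant for internal use. Certificates from private CAs are not logged.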

Search:

https://crt.sh/?q=example.com

or use subfinder, which aggregates CT logs along with other passive sources:

subfinder -d example.com

CT logs commonly uncover:

  • Development subdomains

  • Staging portals

  • Internal API endpoints

These discoveries form the basis for deeper enumeration.
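
crt.sh can also return results as JSON (output=json), where each record carries a name_value field holding one or more hostnames separated by newlines. The sketch below, under those assumptions, extracts unique subdomains from such records. The function names are hypothetical, the sample records are invented for an offline demonstration, and fetch_crtsh performs a real network request only if you call it.

```python
import json
import urllib.request


def extract_subdomains(records: list[dict], domain: str) -> list[str]:
    """Collect unique hostnames under `domain` from crt.sh-style JSON records."""
    found = set()
    domain = domain.lower()
    suffix = "." + domain
    for record in records:
        # name_value may hold several names separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


def fetch_crtsh(domain: str, timeout: float = 30.0) -> list[dict]:
    """Query crt.sh for names matching %.domain (performs a network request)."""
    url = f"https://crt.sh/?q=%25.{domain}&output=json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical records shaped like crt.sh output.
sample = [
    {"name_value": "dev.example.com\nstaging.example.com"},
    {"name_value": "*.api.example.com"},
    {"name_value": "www.example.com"},
    {"name_value": "example.com.evil.test"},  # look-alike, must be filtered out
]
print(extract_subdomains(sample, "example.com"))
```

The suffix check matters: a plain substring match would also accept look-alike names such as example.com.evil.test.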

Passive DNS OSINT

Passive DNS platforms archive DNS resolutions observed over time, so records remain searchable after they change or disappear.

Useful services include:

  • SecurityTrails

  • DNSDB

  • VirusTotal DNS

  • PassiveTotal

Search for:

  • Past subdomains

  • Old IP addresses

  • Retired infrastructure

These records reveal what the company used in the past and may still have exposed.
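
Once passive DNS results are exported, a small script can highlight hosts that have not been seen recently and are therefore candidates for retired infrastructure. This is a sketch under stated assumptions: each provider has its own export format, so the record layout used here (hostname, ip, last_seen as an ISO date) is hypothetical, as are the function names and sample values.

```python
from collections import defaultdict
from datetime import date


def history_by_host(records: list[dict]) -> dict[str, list[tuple[str, date]]]:
    """Group (ip, last_seen) pairs per hostname, newest sighting first."""
    grouped = defaultdict(list)
    for rec in records:
        seen = date.fromisoformat(rec["last_seen"])
        grouped[rec["hostname"]].append((rec["ip"], seen))
    return {
        host: sorted(pairs, key=lambda pair: pair[1], reverse=True)
        for host, pairs in grouped.items()
    }


def stale_hosts(records: list[dict], cutoff: date) -> list[str]:
    """Return hostnames whose most recent sighting predates `cutoff`."""
    history = history_by_host(records)
    return sorted(host for host, pairs in history.items() if pairs[0][1] < cutoff)


# Hypothetical export rows using documentation-reserved IP ranges.
records = [
    {"hostname": "old.example.com", "ip": "203.0.113.10", "last_seen": "2019-04-01"},
    {"hostname": "www.example.com", "ip": "203.0.113.20", "last_seen": "2021-06-15"},
    {"hostname": "www.example.com", "ip": "198.51.100.7", "last_seen": "2024-01-10"},
]
print(stale_hosts(records, date(2022, 1, 1)))
```

Hosts flagged as stale are leads to verify later, not confirmed findings. A record that is old in one dataset may still resolve today.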

Public Document OSINT

Public files often leak internal information.

Download a PDF and inspect metadata:

exiftool document.pdf

Metadata reveals:

  • Author name

  • Device name

  • Software version

  • Timestamp

  • Internal directories

Office documents often expose internal file paths used during creation.
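
For more than a handful of files, exiftool's JSON output (the -json option) is easier to process than its text output. The sketch below assumes exiftool is installed and on the PATH. The tag list, function names, and sample values are illustrative choices, not a fixed standard, and which tags are present varies by file type.

```python
import json
import subprocess

# Tags that commonly carry internal details in PDFs (illustrative selection).
INTERESTING_TAGS = ("Author", "Creator", "Producer", "CreateDate", "ModifyDate")


def run_exiftool(path: str) -> dict:
    """Run `exiftool -json` on one file and return its tag dictionary."""
    result = subprocess.run(
        ["exiftool", "-json", path],
        capture_output=True, text=True, check=True,
    )
    # exiftool emits a JSON array with one object per input file.
    return json.loads(result.stdout)[0]


def summarize(tags: dict) -> dict:
    """Keep only the tags of interest that are actually present."""
    return {key: tags[key] for key in INTERESTING_TAGS if key in tags}


# Hypothetical tag dictionary, shaped like exiftool's JSON output.
sample_tags = {
    "SourceFile": "document.pdf",
    "Author": "jdoe",
    "Producer": "Microsoft Word 2016",
    "CreateDate": "2021:03:04 10:15:00",
    "PageCount": 12,
}
print(summarize(sample_tags))
```

In practice you would call summarize(run_exiftool("document.pdf")) for each downloaded file and append the results to your notes.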

GitHub and Public Repo OSINT

Public repositories are one of the most sensitive OSINT sources. Companies often push internal code accidentally.

Search GitHub:

org:example

or:

"example.com" filename:config

or (the repo: qualifier expects owner/name, so use org: to search across a whole organization):

"password" org:example

Look for:

  • API keys

  • Credentials

  • Environment variables

  • Internal comments

  • Deprecated scripts

If a company does not have an official GitHub organization, employees may still push internal code to personal accounts.
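
After cloning or downloading public code that is in scope, a simple pattern scan can surface the items listed above. The following is a minimal sketch with three illustrative regular expressions. Dedicated secret scanners cover far more formats, and the pattern set, function name, and sample text here are assumptions. The key shown is the placeholder value used in AWS documentation.

```python
import re

# Illustrative patterns only; real scanners ship much larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "password_assignment": re.compile(
        r"(?i)(?:password|passwd|secret)\s*[:=]\s*\S+"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every pattern hit in the text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


# Hypothetical file content for demonstration.
sample = """DB_HOST=db.internal.example.com
DB_PASSWORD=changeme
aws_key = AKIAIOSFODNN7EXAMPLE
"""
print(scan_text(sample))
```

Every hit needs manual review. Placeholder values and test fixtures are common, so a match is a lead rather than a confirmed leak.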

Employee Enumeration

Use LinkedIn to enumerate employees. Search with a dork such as:

site:linkedin.com/in "example.com"

Collect:

  • Full names

  • Departments

  • Job roles

  • Email patterns

A typical discovered email pattern:

firstname.lastname@example.com

This helps during username enumeration and password spraying simulations.

Email Breach OSINT

Tools such as:

  • HaveIBeenPwned

  • Dehashed

  • LeakCheck

  • Snusbase

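reveal breach exposure tied to company email addresses. HaveIBeenPwned reports only which breaches an address appears in, while the other services may also return the leaked credential data.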

Example search:

user@example.com

Look for:

  • Password reuse patterns

  • Historical passwords

  • Email presence in multiple breaches

These leaks guide authentication attacks in later chapters.
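
To prioritize accounts, it helps to rank addresses by how many distinct breaches they appear in. The sketch below assumes results have already been exported as (email, breach name) pairs. That layout, the function name, and the sample rows are hypothetical, since each service returns data in its own format.

```python
from collections import defaultdict


def breaches_per_address(rows: list[tuple[str, str]], domain: str) -> list[tuple[str, list[str]]]:
    """Rank in-scope addresses by number of distinct breaches, most exposed first."""
    exposure = defaultdict(set)
    for email, breach in rows:
        email = email.strip().lower()
        # Ignore addresses outside the target domain.
        if email.endswith("@" + domain):
            exposure[email].add(breach)
    ranked = [(email, sorted(names)) for email, names in exposure.items()]
    # Sort by breach count (descending), then alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-len(item[1]), item[0]))


# Hypothetical export rows.
rows = [
    ("a@example.com", "BreachA"),
    ("A@example.com", "BreachB"),
    ("b@example.com", "BreachA"),
    ("c@other.test", "BreachA"),
]
for email, names in breaches_per_address(rows, "example.com"):
    print(email, len(names), names)
```

Addresses are normalized to lowercase before counting, so the same mailbox written with different casing is not counted twice.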

Social Media OSINT

Employees often reveal internal information unintentionally.

Look for:

  • Screenshots containing dashboards

  • Mentions of software used internally

  • Technology announcements

  • Job posting requirements

Example:

A job listing mentioning “Docker, Kubernetes, and Django” reveals the backend stack.
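
The job-listing example can be sketched as a keyword match against a watch list. The watch list, its categories, and the function name are hypothetical and would be extended for a real engagement. Matching is on whole words, case-insensitive.

```python
import re

# Hypothetical watch list: technology name -> rough category.
TECH_KEYWORDS = {
    "Docker": "containers",
    "Kubernetes": "orchestration",
    "Django": "web framework",
    "PostgreSQL": "database",
    "Jenkins": "CI/CD",
}


def extract_stack(text: str) -> dict[str, str]:
    """Return the watch-list technologies mentioned in a block of text."""
    found = {}
    for keyword, category in TECH_KEYWORDS.items():
        # Whole-word match so that short names do not match inside other words.
        if re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE):
            found[keyword] = category
    return found


posting = "We are hiring a backend engineer with Docker, Kubernetes, and Django experience."
print(extract_stack(posting))
```

A mention in a job posting suggests but does not prove that the technology runs in production, so treat the result as a hypothesis for fingerprinting to confirm.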

Public Source Code OSINT

Use Google dorks to search GitHub:

site:github.com "example.com"

Look for:

  • Old repositories

  • Internal scripts

  • Environment files

Developers frequently leak infrastructure data unintentionally.

Business and Legal Document OSINT

Public company filings may expose:

  • Internal addresses

  • Administrative contact names

  • Legal representatives

  • Email patterns

Government portals and compliance sites often host PDFs with metadata.

Cloud Storage OSINT

Identify misconfigured cloud storage buckets through naming conventions.

Common cloud bucket patterns:

  • example.s3.amazonaws.com

  • static.example.com

  • storage.googleapis.com/example

  • example.blob.core.windows.net

  • example.azureedge.net (an Azure CDN endpoint, often fronting blob storage)

Check bucket accessibility:

curl http://example.s3.amazonaws.com

Misconfigured buckets can expose:

  • Private files

  • Backups

  • Logs

  • API keys

  • Source code
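
Bucket discovery by naming convention can be sketched as generating candidate names and interpreting the HTTP status of an unauthenticated request. The suffix list and function names below are hypothetical. The status interpretation reflects typical Amazon S3 behavior (200 with a listing, 403 for an existing private bucket, 404 when no such bucket exists) and should be confirmed against the actual response body. The code makes no network requests.

```python
# Hypothetical suffixes often appended to a company name.
SUFFIXES = ("", "-backup", "-static", "-dev", "-prod")


def candidate_buckets(name: str) -> list[str]:
    """Generate candidate bucket names from a base name."""
    return [f"{name}{suffix}" for suffix in SUFFIXES]


def s3_url(bucket: str) -> str:
    """Build the virtual-hosted-style S3 URL for a bucket name."""
    return f"https://{bucket}.s3.amazonaws.com"


def interpret_status(status: int) -> str:
    """Map an HTTP status from an anonymous bucket request to a likely meaning."""
    meanings = {
        200: "bucket exists and is publicly listable",
        403: "bucket exists but access is denied",
        404: "no such bucket",
    }
    return meanings.get(status, "unexpected status, inspect the response manually")


for bucket in candidate_buckets("example"):
    print(s3_url(bucket))
```

Feed the generated URLs to curl one at a time and pass the observed status code to interpret_status. Only test names that fall within the agreed scope.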

OSINT Automation Tools

OSINT can also be automated to streamline collection.

Useful tools:

theHarvester

theHarvester -d example.com -b all

Collects:

  • Emails

  • Subdomains

  • Hosts

  • Public records

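Amass (Intelligence Mode)

amass intel -d example.com -whois

The -whois flag runs the domain through reverse WHOIS to find related domains. Depending on the installed Amass version, the intel subcommand may not be available.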

Maltego

Visual mapping for:

  • Employees

  • Domains

  • DNS

  • Infrastructure

  • Social media

Recon-ng

A modular recon framework for:

  • Credential breaches

  • Subdomain discovery

  • Info scraping

OSINT automation reduces manual work and consolidates results.

Organizing OSINT Data

Organize collected data to support later phases.

Create folders:

  • employees

  • domains

  • subdomains

  • leaks

  • documents

  • infrastructure

Store findings in separate files:

  • emails.txt

  • subdomains_passive.txt

  • leaks.txt

  • github_results.txt

  • metadata.txt

Organized OSINT becomes the foundation for active recon and exploitation.
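
The folder layout above can be created and kept free of duplicates with a short script. This is a sketch, not a prescribed workflow: the function names are hypothetical, and placing subdomains_passive.txt inside the subdomains folder is an assumption about how the files map to the folders.

```python
from pathlib import Path

FOLDERS = ("employees", "domains", "subdomains", "leaks", "documents", "infrastructure")


def init_workspace(root: str) -> Path:
    """Create the folder layout under `root` (safe to run repeatedly)."""
    root_path = Path(root)
    for folder in FOLDERS:
        (root_path / folder).mkdir(parents=True, exist_ok=True)
    return root_path


def append_unique(path: Path, items: list[str]) -> list[str]:
    """Merge new findings into a file, keeping one sorted entry per line."""
    path = Path(path)
    existing = set(path.read_text().splitlines()) if path.exists() else set()
    merged = sorted(existing | {item.strip() for item in items if item.strip()})
    path.write_text("".join(line + "\n" for line in merged))
    return merged
```

For example, after init_workspace("osint_example") you could call append_unique(Path("osint_example") / "subdomains" / "subdomains_passive.txt", new_hosts) after each tool run, so repeated runs never produce duplicate lines.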

Integrating OSINT Into Pentesting

OSINT data directly feeds into multiple stages:

  • Subdomain enumeration

  • DNS mapping

  • Email attack surface

  • Cloud resource enumeration

  • API discovery

  • Authentication testing

  • Technology fingerprinting

A well-executed OSINT phase often reveals attack surface that automated scanners miss.

Intel Dump

  • OSINT gathers public information without touching target systems.

  • Use search engines, CT logs, passive DNS, archives, and metadata.

  • Analyze GitHub and public repositories for leaks.

  • Enumerate employees through social networks and job listings.

  • Search breach databases for leaked credentials.

  • Inspect public documents using metadata extraction tools.

  • Identify cloud storage buckets and third-party dependencies.

  • Use OSINT automation tools like theHarvester, Amass, and Recon-ng.

  • Organize results to support deeper recon and exploitation.
