// INTELLIGENCE GATHERING FRAMEWORK • A PRODUCT BY BINARYSHIELD

OSINT MASTER CHECKLIST

A structured framework covering every stage of open-source intelligence — what to collect, where to find it, how to connect the dots, and what actually matters.

23 Modules · 5 Phases · 150+ Sources · 100+ Tools
⬡ Phase 1 — Foundation // Define target → gather base data → map infrastructure
01
TARGET SCOPING
Define exactly what you're investigating before touching any tools. Bad scoping = wasted effort.
PASSIVE
What You're Gathering
Target Types
  • Person / Individual
  • Organization / Company
  • Domain / IP Address
  • Username / Handle
  • Phone / Email
  • Brand / Product
Define Before Starting
  • Full name / alias of target
  • Known domain(s)
  • Known email(s)
  • Location (if known)
  • Time period of interest
  • Purpose / objective
Scoping Checklist
  • Define target: person / org / domain / IP / username
  • Document all known identifiers (names, emails, domains, handles)
  • Define geographic scope (country, city, global)
  • Set time scope (recent activity vs full history)
  • Define what success looks like (what intel do you need?)
  • Create a pivot list: known seeds to expand from
  • Identify any known aliases / previous names
Starting Seeds (Pivot Points)
// From ANY of these, you can build a full profile:
Full Name → find social media, emails, addresses
Email Address → find breaches, social accounts, domains
Domain → find IPs, emails, employees, tech stack
IP Address → find org, ASN, hosted domains, ports
Username/Handle → find cross-platform accounts, photos, posts
Phone Number → find owner, carrier, linked accounts
Company Name → find employees, domains, financials
Photo → reverse image → geolocate → identify
Important vs Unnecessary at This Stage
Data Point | Importance | Why
Full legal name | CRITICAL | Pivots to everything else
Primary email | CRITICAL | Breach lookup, account recovery
Primary domain | CRITICAL | Infrastructure root
Location (country/city) | HIGH | Narrows search space
Known aliases | HIGH | Prevents missing data
Age / DOB | MEDIUM | Validates identity
Nationality | LOW | Context only
02
DOMAIN INTELLIGENCE
Extract all data attached to a domain — registrant, history, DNS, emails, related assets.
PASSIVE
What to Gather
  • WHOIS data: registrant name, email, phone, org, address
  • WHOIS history (registrant changes over time)
  • DNS records: A, AAAA, MX, TXT, NS, CNAME, SOA
  • Domain creation / expiry date
  • Registrar name and privacy protection status
  • Historical DNS (IP changes over time)
  • Reverse WHOIS: all domains registered by same email/person
  • Certificate Transparency logs (subdomains from certs)
  • Related domains (typosquatting, similar names)
  • SPF / DMARC / DKIM email security config
Where to Look
// WHOIS
whois target.com
https://who.is
https://whois.domaintools.com
https://viewdns.info/whois/
// Historical WHOIS
https://whois.domaintools.com (paid)
https://completedns.com
https://securitytrails.com (freemium)
// DNS Lookup
// Note: many servers give minimal answers to ANY queries (RFC 8482); if so, query A, MX, TXT, NS individually
dig target.com ANY
nslookup -type=ANY target.com
https://dnsdumpster.com
https://mxtoolbox.com
https://hackertarget.com/dns-lookup/
// Reverse WHOIS (find all domains by registrant)
https://viewdns.info/reversewhois/
https://domaintools.com/research/reversewhois/
// Certificate Transparency (subdomains)
https://crt.sh/?q=%.target.com
https://transparencyreport.google.com/https/certificates
Important vs Unnecessary
Data | Importance | Reason
Registrant email | CRITICAL | Pivot to more domains, breach lookup
Registrant phone | HIGH | Reverse lookup, identity verify
IP address (A record) | CRITICAL | Pivot to other hosted domains, ISP
MX records | HIGH | Email provider, phishing surface
NS records | MEDIUM | DNS provider, zone transfer attempt
TXT records | HIGH | Reveals services (Google, Slack, etc.)
Registrar name | LOW | Context only
Expiry date | LOW | Check for takeover potential only
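As the TXT row notes, TXT records often reveal third-party services. One rough way to see this is to pull the `include:` mechanisms out of an SPF record. The sketch below assumes you have already fetched the TXT string (for example with dig); the function name and the sample record are illustrative only.

```python
def spf_includes(txt_record: str) -> list[str]:
    """Return the domains named by include: mechanisms in an SPF record."""
    if not txt_record.lower().startswith("v=spf1"):
        return []
    includes = []
    for term in txt_record.split()[1:]:
        # Strip an optional qualifier (+, -, ~, ?) before the mechanism name.
        mechanism = term.lstrip("+-~?")
        if mechanism.lower().startswith("include:"):
            includes.append(mechanism[len("include:"):].lower())
    return includes


# Hypothetical record, for illustration only.
sample = "v=spf1 include:_spf.google.com include:sendgrid.net ip4:192.0.2.10 ~all"
print(spf_includes(sample))
```

Each included domain usually maps to a mail or SaaS provider the organization uses, which feeds the tech stack and phishing-surface notes.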
Tools
whois · dig · dnsx · SecurityTrails · crt.sh · ViewDNS · DNSDumpster · DomainTools · MXToolbox
03
IP & NETWORK RECON
Map the IP space, ASN, hosting provider, and all domains/services hosted at target IPs.
PASSIVE
What to Gather
  • IP geolocation (city, country, ISP)
  • ASN (Autonomous System Number) and org name
  • IP range / CIDR block owned by org
  • Reverse IP lookup: all domains hosted on same IP
  • Hosting provider / CDN (AWS, Cloudflare, Azure, GCP)
  • Historical IPs (what IPs the domain pointed to before)
  • Open ports and services (Shodan passive)
  • BGP route information
  • IP reputation (blacklists, abuse reports)
  • IPv6 address space
Where to Look
// IP Info
https://ipinfo.io/[IP]
https://bgp.he.net/[IP]
https://whois.arin.net
https://www.shodan.io/host/[IP]
https://censys.io
// Reverse IP (all domains on that IP)
https://viewdns.info/reverseip/?host=[IP]
https://hackertarget.com/reverse-ip-lookup/
https://securitytrails.com/list/ip/[IP]
// ASN lookup
https://bgp.he.net/AS[number]
https://bgpview.io/ip/[IP]
// IP Reputation
https://abuseipdb.com/check/[IP]
https://www.virustotal.com/gui/ip-address/[IP]
https://otx.alienvault.com
// CLI
curl ipinfo.io/[IP]
nmap -sV --script=banner [IP] (active!)
shodan host [IP]
Important vs Unnecessary
Data | Importance | Reason
Hosting provider | CRITICAL | Real IP behind CDN, attack surface
ASN / IP range | HIGH | Other IPs owned by same org
Reverse IP domains | HIGH | Shared hosting = pivot to other targets
IP reputation | MEDIUM | Malicious history flag
Geo coordinates | LOW | Approximate only (CDN/cloud = inaccurate)
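Once you know the CIDR blocks registered to an organization, it helps to check whether a newly found IP falls inside them (org-owned) or outside (likely CDN or third-party hosting). A minimal sketch using Python's standard `ipaddress` module; the ranges shown are reserved documentation prefixes, not real targets.

```python
import ipaddress


def in_org_ranges(ip: str, cidrs: list[str]) -> bool:
    """True if ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to pass.
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


# Placeholder ranges (RFC 5737 / RFC 3849 documentation prefixes).
org_ranges = ["198.51.100.0/24", "2001:db8::/32"]
print(in_org_ranges("198.51.100.7", org_ranges))
print(in_org_ranges("203.0.113.5", org_ranges))
```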
Tools
Shodan · Censys · ipinfo.io · bgp.he.net · AbuseIPDB · ViewDNS · VirusTotal
⬡ Phase 2 — People // Person profiling, social media, email & phone intelligence
04
PERSON OSINT
Build a complete profile on an individual using public records, social media, and aggregators.
HIGH VALUE
What to Gather
  • Full name and any known aliases
  • Date of birth / age range
  • Current and past locations (city, address)
  • Employer (current and past)
  • Education history
  • Email addresses (all known)
  • Phone numbers
  • Social media profiles across all platforms
  • Public records (court, property, business registry)
  • Photos (for reverse image search)
  • Connections / family members / associates
  • Published content (blogs, comments, forum posts)
  • Political / religious affiliations (publicly stated)
Where to Look
// People Search Engines
https://pipl.com (best for person OSINT - paid)
https://spokeo.com
https://intelius.com
https://whitepages.com
https://truepeoplesearch.com
https://fastpeoplesearch.com
https://www.beenverified.com
https://radaris.com
// Free aggregators
https://www.peekyou.com
https://www.zabasearch.com
https://411.com
https://checkpeople.com
// Public Records
https://www.publicrecords.com
https://courtlistener.com (US court records)
https://pacer.gov (US federal courts)
https://opencorporates.com (company registrations)
// India-Specific
https://www.mca.gov.in (company directors)
https://vahan.nic.in (vehicle registration)
https://eci.gov.in (voter records)
// LinkedIn (name + company search)
https://linkedin.com/search/results/people/
Important vs Unnecessary
Data | Importance | Reason
Email address | CRITICAL | Breach lookup, account recovery, pivot
Employer | CRITICAL | Org pivot, spear phishing surface
Social media handles | HIGH | Cross-platform tracking
Location history | HIGH | Physical context, travel patterns
Photos | HIGH | Reverse image search, identity confirm
Associates / connections | HIGH | Network mapping, second-degree pivots
Exact DOB | MEDIUM | Identity confirmation
Physical description | LOW | Only useful for physical surveillance
Political opinions | LOW | Context only unless relevant to case
05
EMAIL INTELLIGENCE
From a single email address, discover linked accounts, breaches, real name, and patterns.
CRITICAL PIVOT
What to Gather
  • Validate email exists (MX record + SMTP verify)
  • Breach databases — was it in any data breach?
  • Paste sites — appears in any dumps?
  • Email header analysis (if you have an email from target)
  • Real name from email prefix pattern
  • Domain pattern — find all employees (fname.lname@company.com)
  • Social accounts linked to that email (Google account, Gravatar, etc.)
  • Other accounts registered with that email
Where to Look
// Breach Lookup
https://haveibeenpwned.com (gold standard - free)
https://dehashed.com (paid - full records)
https://intelx.io (paste + dark web)
https://leakcheck.io
https://snusbase.com (paid)
https://breachdirectory.org
// Email Verification
https://hunter.io/email-verifier
https://emailrep.io/[email] // reputation + breach data
https://app.voilanorbert.com
// Find emails for a company
https://hunter.io/?query=company.com // email pattern finder
https://phonebook.cz // email by domain
https://apollo.io
https://rocketreach.co
// Email header analysis (from received email)
https://mxtoolbox.com/EmailHeaders.aspx
https://toolbox.googleapps.com/apps/messageheader/
// Gravatar (profile from email hash)
https://en.gravatar.com/[MD5 of email]
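The Gravatar lookup needs the MD5 of the email address. Gravatar hashes the address after trimming whitespace and lowercasing it, so the same normalization is applied in this small sketch. The helper name is illustrative; the URL shape is the one listed above.

```python
import hashlib


def gravatar_hash(email: str) -> str:
    """MD5 hex digest of the trimmed, lowercased email address."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def gravatar_profile_url(email: str) -> str:
    """Profile URL in the form used in the list above."""
    return f"https://en.gravatar.com/{gravatar_hash(email)}"


print(gravatar_profile_url("  Someone@Example.com "))
```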
Email Pattern Discovery
// Common company email patterns:
firstname@company.com
firstname.lastname@company.com
f.lastname@company.com
flastname@company.com
lastname@company.com
// Find pattern via hunter.io then enumerate:
theHarvester -d company.com -b all
amass enum -d company.com
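Once a pattern is suspected, candidate addresses for a known employee can be generated mechanically and then checked with a verifier. A minimal sketch covering the five patterns listed above; the function name is illustrative and it assumes non-empty first and last names.

```python
def candidate_emails(first: str, last: str, domain: str) -> list[str]:
    """Build one candidate address per common corporate pattern."""
    f, l = first.strip().lower(), last.strip().lower()
    local_parts = [
        f,               # firstname@
        f"{f}.{l}",      # firstname.lastname@
        f"{f[0]}.{l}",   # f.lastname@
        f"{f[0]}{l}",    # flastname@
        l,               # lastname@
    ]
    return [f"{local}@{domain.lower()}" for local in local_parts]


for address in candidate_emails("Jane", "Doe", "company.com"):
    print(address)
```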
Important vs Unnecessary
Data | Importance | Reason
Breach records (passwords) | CRITICAL | Credential stuffing surface
Linked accounts | CRITICAL | Full profile expansion
Email validation | HIGH | Confirm email is real/active
Email header IP | HIGH | May reveal sender's real IP/location
Email pattern | HIGH | Find all employees of a company
Paste site appearances | MEDIUM | Context for exposure history
Tools
HaveIBeenPwned · dehashed · hunter.io · theHarvester · emailrep.io · IntelX
06
USERNAME / HANDLE ENUMERATION
Track a username across hundreds of platforms to build a cross-platform identity map.
HIGH VALUE
What to Gather
  • All platforms where username is registered
  • Profile photos (for reverse image search)
  • Bio / about section content across platforms
  • Activity history (posts, comments, timestamps)
  • Linked accounts mentioned in profiles
  • Email revealed in profile or posts
  • Location mentioned in profile or posts
  • Consistent patterns in usernames (variations)
Tools & Where to Look
// Username search tools
sherlock username // checks 300+ platforms
python3 sherlock.py target_username
maigret username // better than sherlock, more sources
// Manual checks
https://namechk.com
https://checkusernames.com
https://usersearch.org
https://knowem.com
https://whatsmyname.app // best free tool
// Archive / History
https://web.archive.org/web/*/https://twitter.com/username
https://archive.ph
// GitHub (code + email leakage)
https://github.com/[username]
https://api.github.com/users/[username] // email sometimes revealed
// Reddit
https://www.reddit.com/user/[username]
https://camas.unddit.com // search deleted posts
Username Variation Patterns
// If target uses "johndoe", also check:
johndoe1, johndoe2, johndoe99
john_doe, john.doe
jdoe, j_doe
johndoe_official
xjohndoe, johndoex
johndoe_real
johndoe[year], e.g. johndoe1990, johndoe2001
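The variation patterns above are easy to generate programmatically and feed into a username checker. A rough sketch, assuming the handle is built from a first and last name; the function name and the choice of suffixes simply mirror the list and are not exhaustive.

```python
def username_variations(first: str, last: str, years: tuple[int, ...] = ()) -> list[str]:
    """Generate common handle variations for a first/last name pair."""
    f, l = first.lower(), last.lower()
    base = f + l
    variants = [
        base,
        f"{base}1", f"{base}2", f"{base}99",   # numeric suffixes
        f"{f}_{l}", f"{f}.{l}",                # separators
        f"{f[0]}{l}", f"{f[0]}_{l}",           # initial + last name
        f"{base}_official", f"{base}_real",    # "authenticity" suffixes
        f"x{base}", f"{base}x",                # padding characters
    ]
    variants += [f"{base}{year}" for year in years]
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(variants))


print(username_variations("John", "Doe", years=(1990, 2001)))
```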
Tools
Sherlock · Maigret · WhatsMyName · Namechk · social-analyzer
07
SOCIAL MEDIA INTELLIGENCE (SOCMINT)
Deep dive into platform-specific data: posts, connections, metadata, location leakage.
HIGH VALUE
Platform-by-Platform Checklist
LinkedIn: Current employer, role, start/end dates · Education · Skills · Connections (mutual) · Recommendations · Posts · Groups · Certifications · Contact info (if shared) · Company size clue
Twitter / X: Tweets (keyword search with from:username) · Retweets and likes · Replies (reveal associates) · Location in bio vs GPS metadata in media · Creation date · Following/followers · Deleted tweets (recoverable via archive.org)
Facebook: Friends list (even partially visible) · Check-ins · Tagged photos · Groups joined · Events attended · Workplace/school history · Phone/email in About section · Dating/relationship info
Instagram: Photo metadata (if EXIF not stripped) · Geotags on posts · Tagged locations · Followers/following · Story highlights · Comments (reveal associates) · Bio links
GitHub: Real name/email in commits · API tokens in code · Internal hostnames in configs · Company infrastructure leaks · Commit history timestamps (work hours, timezone) · Repos contributed to
SOCMINT Tools
// Twitter/X
https://twitter.com/search?q=from%3Ausername&src=typed_query
https://nitter.net/username // no login needed
https://socialblade.com // follower stats
https://tinfoleak.com // Twitter deep OSINT
// Facebook
https://www.facebook.com/search/ // Graph search
https://lookup-id.com // find Facebook ID
// LinkedIn
https://www.linkedin.com/search/results/
https://recruitin.net // LinkedIn without login
// Instagram
https://www.picuki.com/profile/username // no login needed
https://imginn.com
// General
https://maltego.com // relationship mapping
https://social-searcher.com // cross-platform search
What's Important vs Noise
Data | Importance | Reason
Employment + dates | CRITICAL | Access levels, email patterns, networks
Location leakage (geotagged posts) | CRITICAL | Physical location / routine
Associates / frequent interactions | HIGH | Network map expansion
Contact info in bio | HIGH | Direct pivot
Post timestamps | MEDIUM | Timezone, work hours, activity patterns
Likes / reactions | LOW | Personality profiling only
Generic public posts | LOW | Noise unless keyword-relevant
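Post timestamps become useful once they are bucketed by hour: a cluster of activity suggests waking or working hours, and from that a plausible timezone. The sketch below is one simple way to do it, assuming you have collected ISO 8601 timestamps by hand or from an export; naive timestamps are treated as UTC, which is an assumption you should verify per platform.

```python
from collections import Counter
from datetime import datetime, timezone


def active_hours_utc(timestamps: list[str], top: int = 3) -> list[int]:
    """Return the most frequent posting hours (0-23, UTC), busiest first."""
    hours = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        hours[dt.astimezone(timezone.utc).hour] += 1
    return [hour for hour, _ in hours.most_common(top)]


# Made-up timestamps for illustration.
posts = [
    "2024-01-01T09:15:00+00:00",
    "2024-01-02T09:40:00+00:00",
    "2024-01-03T14:05:00+00:00",
    "2024-01-03T11:30:00+02:00",  # 09:30 UTC
]
print(active_hours_utc(posts, top=2))
```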
08
PHONE NUMBER INTELLIGENCE
Reverse lookup a phone number to find owner, carrier, linked accounts, and history.
PASSIVE
What to Gather
  • Owner name from reverse lookup
  • Carrier / telecom provider
  • Line type (mobile / landline / VoIP)
  • Country and region
  • Linked social accounts (WhatsApp, Telegram, Signal)
  • CallerID / spam reports
  • Breach database records containing phone
Where to Look
// Reverse Lookup
https://truecaller.com // best for India/Asia
https://sync.me
https://www.whitepages.com/phone/
https://www.spokeo.com
https://numverify.com // API - validation + carrier
https://phoneinfoga.cz // OSINT tool for phones
// WhatsApp check
// Save number → open WhatsApp → see if account exists + profile pic
// Telegram check
// @username search or phone-based lookup
// CLI
phoneinfoga scan -n +[country_code][number]
// Breach DB
https://dehashed.com // search by phone
https://leakcheck.io
Tools
PhoneInfoga · Truecaller · NumVerify · dehashed
⬡ Phase 3 — Organization // Company profiling, employee mapping, supply chain
09
COMPANY / ORGANIZATION OSINT
Map an organization's full digital footprint — structure, financials, leadership, subsidiaries.
HIGH VALUE
What to Gather
  • Legal company name, registration number
  • Registered address and office locations
  • Directors / officers / founders
  • Subsidiaries and parent companies
  • All owned domains
  • Employee count and growth rate
  • Revenue / funding history
  • Products / services offered
  • Technology stack (from job postings and BuiltWith)
  • News mentions and press releases
  • Legal cases / lawsuits
  • Social media presence
Where to Look
// Company Registries
https://opencorporates.com // global
https://companieshouse.gov.uk // UK
https://www.mca.gov.in // India (MCA21)
https://efts.sec.gov/LATEST/search-index // US (SEC EDGAR)
https://www.dnb.com // Dun & Bradstreet
// Financials / Funding
https://crunchbase.com
https://pitchbook.com
https://tracxn.com
https://zaubacorp.com // India financials
// News / Mentions
https://news.google.com/search?q="Company Name"
https://newsapi.org
site:companyname.com filetype:pdf // press releases
// Supply Chain / Partners
// Check "About" pages, press releases, blog posts
// Job postings reveal tech stack + third-party vendors
Important vs Unnecessary
Data | Importance | Reason
Directors / key personnel | CRITICAL | Pivot to person OSINT
Owned domains | CRITICAL | Attack surface expansion
Tech stack | HIGH | CVE targeting, spear phishing
Subsidiaries | HIGH | Weaker security may be entry point
Funding / investors | MEDIUM | Business context only
Revenue | LOW | Context only
10
EMPLOYEE MAPPING
Identify all employees, their roles, emails, and hierarchy to determine who the likely spear-phishing targets are.
CRITICAL PIVOT
What to Gather
  • C-suite and leadership (CEO, CTO, CISO)
  • IT / DevOps / Security team members
  • HR and Finance employees (high-value phishing targets)
  • Email addresses (construct from pattern)
  • Employee LinkedIn profiles
  • Past employees (may retain access)
  • Contractors / third-party vendors
Where to Look
// LinkedIn (primary source)
https://linkedin.com/company/[company-name]/people/
site:linkedin.com "company name" "senior engineer"
// Hunter.io (email pattern + employee list)
https://hunter.io/domain-search?domain=company.com
// GitHub
// Search for company email domain in commits:
https://github.com/search?q=%22@company.com%22&type=code
// theHarvester
theHarvester -d company.com -b linkedin,hunter,google
// RocketReach / Apollo.io (paid)
// PhantomBuster (LinkedIn scraper)
// Job boards reveal team structure:
// Posting: "Our 5-person security team uses Splunk and CrowdStrike"
// → You know the security stack AND team size
11
JOB POSTINGS RECON
Job ads reveal technology stacks, security tools, team size, and internal infrastructure details.
MEDIUM VALUE
What to Look For in Job Postings
// Technology Stack clues:
"Experience with AWS, Terraform, and Kubernetes"
→ You know: AWS cloud, IaC with Terraform, container orchestration
// Security Tools in use:
"Familiarity with Splunk, CrowdStrike, Qualys preferred"
→ SIEM = Splunk, EDR = CrowdStrike, vuln scanner = Qualys
// Internal systems:
"Work with our internal GitLab instance"
→ Self-hosted GitLab (check https://gitlab.company.com)
// Database tech:
"Experience with PostgreSQL and Redis"
→ Likely DB stack
// Authentication:
"Experience with Okta or Azure AD"
→ SSO provider
// Team size:
"Join our 3-person security team"
→ Small security team = limited monitoring capacity
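The keyword-to-inference mapping above can be applied to many postings at once. This is a rough sketch, not a complete taxonomy: the `STACK_HINTS` table only encodes the examples given in this module, and both names are illustrative.

```python
import re

# Keyword -> category, taken from the examples in this module.
STACK_HINTS = {
    "AWS": "Cloud",
    "Terraform": "IaC",
    "Kubernetes": "Container orchestration",
    "Splunk": "SIEM",
    "CrowdStrike": "EDR",
    "Qualys": "Vulnerability scanner",
    "GitLab": "Source control",
    "PostgreSQL": "Database",
    "Redis": "Database",
    "Okta": "SSO",
    "Azure AD": "SSO",
}


def infer_stack(posting: str) -> dict[str, list[str]]:
    """Group the known keywords found in a job posting by category."""
    found: dict[str, list[str]] = {}
    for keyword, category in STACK_HINTS.items():
        # Whole-word, case-insensitive match.
        if re.search(rf"\b{re.escape(keyword)}\b", posting, re.IGNORECASE):
            found.setdefault(category, []).append(keyword)
    return found


print(infer_stack("Familiarity with Splunk, CrowdStrike, Qualys preferred"))
```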
Where to Look
https://linkedin.com/jobs/search/?keywords=[company]
https://indeed.com/jobs?q=[company]
https://glassdoor.com
https://lever.co/[company] // ATS pages often public
https://jobs.ashbyhq.com/[company]
https://greenhouse.io/[company]
site:linkedin.com/jobs "company name" "security engineer"
⬡ Phase 4 — Technical // Subdomains, leaks, code repos, exposed services
12
SUBDOMAIN ENUMERATION
Discover all subdomains to map the full attack surface beyond the main domain.
ACTIVE
What to Gather
  • All subdomains (live and inactive)
  • IP address of each subdomain
  • HTTP status of each subdomain
  • Subdomains pointing to third-party services (takeover risk)
  • Wildcard DNS configuration
  • Internal-facing subdomains accidentally exposed
Tools & Commands
// Passive (no direct connection to target)
// Write each tool to its own file so one run does not overwrite another, then merge
subfinder -d target.com -o subfinder.txt
amass enum -passive -d target.com -o amass.txt
assetfinder --subs-only target.com > assetfinder.txt
curl -s "https://crt.sh/?q=%.target.com&output=json" | jq -r '.[].name_value' > crtsh.txt
sort -u subfinder.txt amass.txt assetfinder.txt crtsh.txt > subs.txt
// Active (DNS brute force)
amass enum -active -d target.com -brute
dnsx -l subs.txt -a -resp // resolve and get IPs
gobuster dns -d target.com -w subdomains.txt
// HTTP probe (find live ones)
cat subs.txt | httpx -status-code -title -tech-detect
// All-in-one
cat subs.txt | httpx | nuclei -t exposures/
// Web-based (no install)
https://dnsdumpster.com
https://crt.sh/?q=%.target.com
https://securitytrails.com/list/apex_domain/target.com
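The crt.sh JSON output needs some cleanup before use: a single `name_value` field can hold several newline-separated names, wildcard entries appear as `*.something`, and duplicates are common. The sketch below shows one way to normalize it, assuming the JSON has already been downloaded; the function name and sample data are illustrative.

```python
import json


def subdomains_from_crtsh(raw_json: str, apex: str) -> list[str]:
    """Extract unique, sorted hostnames under apex from crt.sh JSON output."""
    apex = apex.lower()
    names = set()
    for entry in json.loads(raw_json):
        # One certificate entry may list several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == apex or name.endswith("." + apex):
                names.add(name)
    return sorted(names)


# Made-up sample in the same shape as crt.sh output.
sample = json.dumps([
    {"name_value": "www.target.com\n*.dev.target.com"},
    {"name_value": "mail.target.com"},
    {"name_value": "WWW.target.com"},
    {"name_value": "other.example.org"},
])
print(subdomains_from_crtsh(sample, "target.com"))
```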
13
TECHNOLOGY STACK FINGERPRINTING
Identify CMS, frameworks, CDN, analytics, backend tech to map CVE exposure.
PASSIVE
What to Identify
  • Web server (Apache, Nginx, IIS, Caddy)
  • CMS (WordPress, Drupal, Joomla, Ghost)
  • Frontend framework (React, Angular, Vue)
  • Backend language/framework (PHP, Django, Rails, Laravel)
  • CDN / WAF (Cloudflare, Akamai, Fastly)
  • Analytics (GA4, Hotjar, Mixpanel)
  • Hosting (AWS, Azure, GCP, shared hosting)
  • Email marketing (Mailchimp, SendGrid)
  • Chat / CRM tools (Intercom, Zendesk, HubSpot)
  • Payment processors (Stripe, PayPal, Razorpay)
Tools
// Passive fingerprinting
https://builtwith.com/target.com // most comprehensive
https://www.wappalyzer.com // browser extension
https://www.whatcms.org
https://w3techs.com/sites/info/target.com
// Headers analysis
curl -I https://target.com // Server, X-Powered-By headers
whatweb https://target.com // CLI version
// Source code analysis
// View page source → check script src, link href, meta generators
// Look for: wp-content/ (WordPress), /sites/default/ (Drupal)
14
LEAKED CREDENTIALS & BREACH DATA
Search breach databases, paste sites, and dark web dumps for exposed credentials.
CRITICAL
What to Look For
  • Email + password combos from data breaches
  • Plaintext passwords from old breaches
  • Password patterns (reuse across services)
  • API keys / tokens in paste sites
  • Internal documents leaked to paste sites
  • Employee credentials from phishing/breach
  • Database dumps referencing target domain
Where to Search
// Public / Free
https://haveibeenpwned.com // email → breach list
https://intelx.io // paste + leak search
https://psbdmp.ws // pastebin dumps
https://pastebin.com/search // direct search
https://grayhatwarfare.com // exposed S3 buckets
// Paid / Commercial
https://dehashed.com // full credential records
https://leakcheck.io
https://snusbase.com
https://breachforums.st // (monitor only)
// CLI
intelx.py -s "@target.com" --limit 100
h8mail -t email@target.com --config h8mail_config.ini
Tools
h8mail · HIBP · dehashed · IntelX · pwndb
15
CODE REPOSITORY INTELLIGENCE
Search GitHub/GitLab/Bitbucket for leaked secrets, internal config, and employee data.
CRITICAL
What to Look For
  • API keys / tokens committed to code
  • AWS access keys (AKIA...)
  • Database connection strings with passwords
  • Internal IP addresses and hostnames
  • .env files committed accidentally
  • Private SSH keys
  • Hardcoded credentials in source code
  • Infrastructure details (server names, VPN endpoints)
  • Employee names and emails from git commit history
GitHub Search Queries
// GitHub search dorks:
"@company.com" password
"company.com" api_key
"company.com" secret_key
"company.com" db_password
org:company-github-org password
org:company-github-org "AKIA" // AWS key prefix
org:company-github-org filename:.env
org:company-github-org filename:config.yml
// In URL:
https://github.com/search?q=%22company.com%22+%22password%22&type=code
// GitRob / TruffleHog
gitrob --github-access-token [token] company-org-name
trufflehog git https://github.com/company/repo
gitleaks detect --source /path/to/cloned/repo
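The "AKIA" search works because long-term AWS access key IDs follow a fixed shape: the `AKIA` prefix followed by 16 uppercase letters or digits. A minimal sketch of scanning text you have already collected; dedicated scanners such as TruffleHog and gitleaks cover far more secret types, so treat this only as an illustration of the idea. The sample key is the placeholder used in AWS documentation.

```python
import re

# AKIA + 16 uppercase alphanumerics = 20-character access key ID.
AWS_KEY_ID_RE = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text: str) -> list[str]:
    """Return strings in text that look like AWS access key IDs."""
    return AWS_KEY_ID_RE.findall(text)


snippet = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = us-east-1"
print(find_aws_key_ids(snippet))
```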
16
SHODAN / INTERNET EXPOSURE
Find exposed services, open ports, default credentials, and vulnerable versions passively.
PASSIVE
Shodan Queries
// Basic org / domain search
org:"Company Name"
hostname:target.com
ssl:"target.com" // cert search for subdomains
// Find exposed admin panels
org:"Company Name" http.title:"Admin"
org:"Company Name" http.title:"Dashboard"
org:"Company Name" http.title:"phpMyAdmin"
org:"Company Name" http.title:"Kibana"
// By product/CVE
org:"Company Name" product:"Apache httpd"
vuln:CVE-2021-44228 org:"Company Name" // Log4j
// Exposed databases
org:"Company Name" port:27017 // MongoDB (check whether auth is enforced)
org:"Company Name" port:9200 // Elasticsearch
org:"Company Name" port:6379 // Redis
org:"Company Name" port:5432 // PostgreSQL
// IoT / Cameras
org:"Company Name" has_screenshot:true
org:"Company Name" device:"webcam"
Other Exposure Tools
// Censys (complement to Shodan)
https://search.censys.io/
// FOFA (Chinese Shodan, finds more Asian hosts)
https://fofa.info/
// GreyNoise (mass scanner detection)
https://viz.greynoise.io/
// ZoomEye
https://www.zoomeye.org/
Tools
Shodan · Censys · FOFA · ZoomEye · GreyNoise
⬡ Phase 5 — Deep OSINT // Dark web, geolocation, metadata, advanced dorking
17
DARK WEB MONITORING
Search .onion sites, hacking forums, and markets for mentions of your target.
HIGH VALUE
What to Look For
  • Target domain mentioned in breach sales
  • Leaked database being sold
  • Ransomware group claiming attack on target
  • Credential dumps containing target domain
  • Threat actor discussions about the target
  • Planned attacks or insider threats
Where to Monitor (Surface Web proxies)
// Surface web dark web monitors
https://intelx.io // paste + dark web
https://darkfail.net // .onion link directory
https://www.dehashed.com
https://socradar.io // commercial threat intel
https://flashpoint-intel.com // commercial
// Ransomware tracker (leak sites indexed)
https://ransomwatch.telemetry.ltd
https://www.ransomware.live
// Telegram threat intel (OSINT tool)
// Search Telegram channels for target domain using:
https://t.me/s/[channel_name]
// or use: Telepathy, TGScan
18
GEOLOCATION & IMAGE OSINT
Geolocate photos, extract EXIF metadata, run reverse image searches to find identity/location.
HIGH VALUE
What to Extract from Images
  • EXIF data: GPS coordinates, date/time, device model
  • Camera make/model (device fingerprint)
  • Software used to edit (reveals OS / tools)
  • Visual geolocation: match landmarks, signs, terrain
  • Reverse image search: find other uses of the same photo
  • Facial recognition (check PimEyes if permitted)
  • Shadow direction → time of day estimate
Tools
// EXIF extraction
exiftool image.jpg
http://exif.regex.info/exif.cgi // web-based
// Reverse image search
https://images.google.com
https://www.bing.com/images // often finds more than Google
https://yandex.com/images // best for people photos
https://tineye.com // exact matches
https://pimeyes.com // facial recognition (paid)
https://facecheck.id // facial recognition (free)
// Geolocation from visual clues
https://www.geospy.ai // AI geolocation from photo
https://overpass-turbo.eu // OSM query for landmarks
https://maps.google.com (Street View comparison)
// Metadata from URLs (sometimes still in social media images)
// Direct links to Twitter/FB CDN images usually have EXIF stripped
// Telegram and email attachments often retain EXIF
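EXIF stores GPS position as degrees, minutes, and seconds plus a hemisphere reference (N/S/E/W), while most mapping tools expect signed decimal degrees. The conversion is simple arithmetic; the sketch below assumes you have already read the four values from exiftool output, and the function name is illustrative.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus hemisphere to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in decimal notation.
    return -value if ref.upper() in ("S", "W") else value


lat = dms_to_decimal(40, 26, 46.32, "N")
lon = dms_to_decimal(79, 58, 56.0, "W")
print(f"{lat:.6f}, {lon:.6f}")
```

The resulting pair can be pasted directly into a map search for Street View comparison.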
19
DOCUMENT METADATA ANALYSIS
PDFs and Office documents expose author names, internal paths, software versions, and timestamps.
MEDIUM VALUE
What Metadata Reveals
PDF Metadata Can Expose: Author name · Creator (software used) · Producer (PDF library/version) · Creation date · Modification date · Internal file paths (C:\Users\john.doe\...) · Company name · Software version
DOCX / XLSX Metadata Can Expose: Author name · Last modified by · Company name · Internal server paths · Revision count · Total editing time · Hidden content / comments
How to Extract
// CLI tools
exiftool document.pdf
exiftool *.docx
// FOCA (Windows - full metadata harvester)
// Finds documents via Google dorks, downloads, extracts metadata
// Web-based
https://www.metadata2go.com
https://www.get-metadata.com
// Find public documents with Google:
site:target.com filetype:pdf
site:target.com filetype:docx
site:target.com filetype:xlsx
site:target.com filetype:pptx
20
GOOGLE DORKING (ADVANCED SEARCH)
Use Google search operators to find exposed files, login pages, credentials, and hidden content.
PASSIVE
Essential Dork Categories
Exposed Files
site:target.com filetype:pdf "confidential"
site:target.com filetype:sql
site:target.com filetype:log
site:target.com filetype:env
site:target.com filetype:bak
site:target.com filetype:config
site:target.com ext:xml inurl:config
Login / Admin Panels
site:target.com inurl:admin
site:target.com inurl:login
site:target.com inurl:portal
site:target.com intitle:"Login" inurl:admin
site:target.com "index of" inurl:admin
Leaked Credentials / Keys
site:pastebin.com "target.com"
site:pastebin.com "target.com" password
site:github.com "target.com" api_key
site:trello.com "target.com"
"@target.com" filetype:xls
"@target.com" "password"
Exposed Infrastructure
site:target.com inurl:wp-admin // WordPress admin
site:target.com inurl:phpmyadmin
site:target.com intitle:"index of /" // directory listing
site:target.com inurl:".git"
site:target.com inurl:"/actuator" // Spring Boot
site:target.com inurl:"/.env"
Subdomain Discovery
site:*.target.com -www
site:*.*.target.com
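Because every dork above follows the same shape with only the domain changing, a small template table makes it easy to regenerate the full set per target. A minimal sketch; `DORK_TEMPLATES` holds just a few of the dorks listed above and the names are illustrative.

```python
# A subset of the dorks above, with the domain left as a placeholder.
DORK_TEMPLATES = {
    "exposed_files": [
        "site:{domain} filetype:sql",
        "site:{domain} filetype:log",
        "site:{domain} filetype:env",
    ],
    "login_panels": [
        "site:{domain} inurl:admin",
        "site:{domain} inurl:login",
    ],
    "leaked_credentials": [
        'site:pastebin.com "{domain}" password',
        'site:github.com "{domain}" api_key',
    ],
}


def build_dorks(domain: str) -> dict[str, list[str]]:
    """Fill every template with the target domain, grouped by category."""
    return {
        category: [template.format(domain=domain) for template in templates]
        for category, templates in DORK_TEMPLATES.items()
    }


for category, dorks in build_dorks("target.com").items():
    print(category, dorks)
```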
Other Search Engines to Dork
Google · Bing · DuckDuckGo · Yandex · Baidu · Shodan · GHDB (Google Hacking DB)
// GHDB - pre-made dorks library
https://www.exploit-db.com/google-hacking-database
⬡ Analysis — Connecting the Dots // How to map, prioritize, and present intelligence
21
CONNECTING THE DOTS — PIVOT MAP
How to take raw data points and build a connected intelligence picture using pivot methodology.
CRITICAL SKILL
The Pivot Model
// Every data point can pivot to more data points
TARGET: Email → Domain → IP Address → ASN / Org
TARGET: Email → Breach DB → Password Hash → Other Accounts
Domain → WHOIS → Registrant Name → LinkedIn Profile → Employer
Username → 300+ Platforms → Profile Photos → Geolocation
Company Name → All Domains → Subdomains → Open Ports
Pivot Table — What Leads to What
You Have | You Can Find | Tool
Email address | Breaches, linked accounts, domain, name | HIBP, dehashed, hunter.io
Domain | IPs, WHOIS email, subdomains, employees | whois, subfinder, crt.sh
IP address | Org, ASN, other hosted domains, services | Shodan, ViewDNS, ipinfo
Username | All platform profiles, email, photos | Sherlock, Maigret
Phone number | Name, carrier, WhatsApp/Telegram account | PhoneInfoga, Truecaller
Full name | LinkedIn, social media, public records | Pipl, Google, LinkedIn
Photo | Other accounts using same photo, location | Yandex, PimEyes, GeoSpy
Company name | Directors, domains, employees, financials | OpenCorporates, LinkedIn
GitHub username | Real email, commits, internal hostnames | GitHub API, TruffleHog
Breach password | Password patterns → other accounts | Manual analysis
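The pivot table is effectively a directed graph: each data type points to the data types it can lead to. Walking that graph breadth-first shows what is reachable from a given seed and in how many pivots. The sketch below encodes a simplified version of the table; the node names and the `PIVOTS` mapping are a condensed illustration, not a complete model.

```python
from collections import deque

# Simplified "what leads to what" graph based on the pivot table.
PIVOTS = {
    "email": ["breach", "linked_account", "domain", "full_name"],
    "domain": ["ip", "email", "subdomain", "employee"],
    "ip": ["org", "asn", "domain", "service"],
    "username": ["profile", "email", "photo"],
    "phone": ["full_name", "carrier", "messaging_account"],
    "full_name": ["profile", "public_record"],
    "photo": ["profile", "location"],
    "company": ["director", "domain", "employee", "financials"],
}


def pivot_hops(seed: str, max_hops: int = 2) -> dict[str, int]:
    """Breadth-first walk: data type -> fewest pivots needed from seed."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in PIVOTS.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


print(pivot_hops("username", max_hops=2))
```

Starting from a username alone, breach records and a probable location are both two pivots away, which is a useful way to plan which lookups to run first.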
How to Map It (Tools)
// Visual relationship mapping
Maltego Community Edition // entity-relationship graphs, free tier
SpiderFoot // automated OSINT + visual output
Recon-ng // modular OSINT framework (CLI)
// Manual mapping (free)
draw.io / diagrams.net // manual relationship diagrams
Obsidian (graph view) // notes + visual links
CherryTree // structured note-taking
Gephi // for large network graphs
// Automated pivot framework
spiderfoot -s target.com -m all -o report.html
recon-ng -w workspace1
22
IMPORTANT vs UNNECESSARY — TRIAGE GUIDE
Not all data is equal. This guide tells you what to keep, what to deprioritize, and what to discard.
CRITICAL SKILL
Master Triage Table
Data Type | Priority | Keep? | Why
Breached email + password | CRITICAL | ✅ Always | Direct exploitation path
Active email addresses | CRITICAL | ✅ Always | Primary contact, pivot hub
Domain + subdomains | CRITICAL | ✅ Always | Attack surface map
Employee names + roles | CRITICAL | ✅ Always | Spear phishing targets
Exposed API keys / secrets | CRITICAL | ✅ Always | Direct access
Open ports / exposed services | CRITICAL | ✅ Always | Entry point identification
IP addresses + ASN | HIGH | ✅ Yes | Infrastructure pivot
Social media bios + contact info | HIGH | ✅ Yes | Identity + pivot potential
Tech stack details | HIGH | ✅ Yes | CVE matching
Location leakage (posts, EXIF) | HIGH | ✅ Yes | Physical context
Former employees | HIGH | ✅ Yes | May retain access
Password patterns | HIGH | ✅ Yes | Password spray base
Job posting tech clues | MEDIUM | ⚠️ Maybe | Tech stack inference
Company revenue/funding | MEDIUM | ⚠️ Context | Background only
Post timestamps | MEDIUM | ⚠️ Maybe | Timezone / schedule pattern
Generic social media posts | LOW | ❌ Skip | Noise unless keyword match
Likes / reactions | LOW | ❌ Skip | Personality only
Physical description | LOW | ❌ Skip | Only for physical ops
Registrar name | LOW | ❌ Skip | Not actionable
Exact domain expiry date | LOW | ❌ Skip | Not actionable unless expired
Nationality (alone) | LOW | ❌ Skip | Context only
Decision Framework
// Ask these 3 questions for every piece of data:
Q1: Does this pivot to something more useful?
  YES → Keep (it's a node in your map)
  NO → Store as background context
Q2: Does this directly expose a vulnerability or access point?
  YES → CRITICAL - document immediately
  NO → Lower priority
Q3: Does this confirm or contradict something else I found?
  CONFIRMS → Increases confidence, keep
  CONTRADICTS → Investigate the discrepancy
  NEITHER → Low priority noise
// Red flags that something is IMPORTANT:
- Credential + service = CRITICAL
- Internal hostname in public source = CRITICAL
- API key in public repo = CRITICAL
- Real IP behind CDN = HIGH
- Employee email + role + breach = HIGH
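The three questions can be written down as a small function so every finding is triaged the same way. This is only one possible encoding: the framework above does not say how the three answers combine, so the sketch reports each answer separately and lets Q2 alone set the priority. The function name, argument names, and return shape are all assumptions.

```python
def triage(pivots: bool, exposes_access: bool, corroboration: str = "neither") -> tuple[str, list[str]]:
    """Apply Q1-Q3 to one finding; returns (priority, notes)."""
    notes = []

    # Q1: does it pivot to something more useful?
    notes.append("keep as pivot node" if pivots else "store as background context")

    # Q2: does it directly expose a vulnerability or access point?
    if exposes_access:
        priority = "CRITICAL"
        notes.append("document immediately")
    else:
        priority = "LOWER"

    # Q3: does it confirm or contradict another finding?
    if corroboration == "confirms":
        notes.append("raises confidence")
    elif corroboration == "contradicts":
        notes.append("investigate the discrepancy")
    else:
        notes.append("no corroboration")

    return priority, notes


print(triage(pivots=True, exposes_access=True, corroboration="confirms"))
```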
23
OSINT REPORT STRUCTURE
How to structure your findings into a clean, actionable intelligence report.
DELIVERABLE
Report Sections
// OSINT Report Template Structure:
1. EXECUTIVE SUMMARY
   - Target: [name / domain / org]
   - Scope: [what was investigated]
   - Date: [investigation period]
   - Key Findings: [top 3-5 critical findings in plain English]
   - Risk Level: [Critical / High / Medium / Low]
2. METHODOLOGY
   - Phases completed
   - Tools used
   - Sources consulted
   - What was NOT investigated (limitations)
3. IDENTITY PROFILE (Person / Org)
   - Confirmed identifiers
   - Known aliases
   - Locations
   - Associates / connections
4. DIGITAL FOOTPRINT
   - Domains and subdomains
   - IP ranges / ASN
   - Technology stack
   - Exposed services
5. CREDENTIAL EXPOSURE
   - Breach records found
   - Paste site appearances
   - Code repository leaks
6. SOCIAL MEDIA PRESENCE
   - Platforms (active/inactive)
   - Key content findings
   - Location leakage
7. RELATIONSHIP MAP
   - Visual graph (Maltego export / draw.io)
   - Key associations
8. CRITICAL FINDINGS (Prioritized)
   - Finding 1: [Title] - [source] - [risk]
   - Finding 2: ...
9. APPENDIX
   - Raw data tables
   - Screenshot evidence
   - Source URLs
Evidence Standards
  • Screenshot every finding with timestamp + URL visible
  • Archive important pages (archive.ph) before reporting
  • Note data collection date for all records
  • Distinguish confirmed vs inferred vs unverified data
  • Cite source for every data point
  • Hash screenshots for integrity (sha256sum)
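Hashing each evidence file at capture time lets you show later that it has not been altered. The sketch below produces the same digest as `sha256sum` for a file, reading in chunks so large captures do not need to fit in memory; the function name and the example path are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage with a hypothetical evidence file:
evidence = Path("evidence/finding-01.png")
if evidence.exists():
    print(f"{sha256_file(str(evidence))}  {evidence}")
```

Recording the digest next to the collection date and source URL keeps all three evidence standards in one line per finding.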
OSINT Toolkit Summary
Passive Recon: Maltego · SpiderFoot · Recon-ng · Shodan · Censys · crt.sh · SecurityTrails
Person / Email: Sherlock · Maigret · HIBP · h8mail · hunter.io · PhoneInfoga · Pipl
Technical: subfinder · amass · httpx · nuclei · trufflehog · gitleaks · whatweb