Docs

Open educational reference and integration guide for phishunt: how phishing kits work, what signals defenders use, and how to plug the data into your tooling.

What phishing is

Definition

Phishing is a social-engineering attack where an adversary impersonates a trusted brand, service or individual to trick the target into revealing credentials, deploying malware or transferring funds.

MITRE ATT&CK classifies it as technique T1566 with four sub-techniques.

Sub-techniques

T1566.001 Spearphishing Attachment - weaponised file
T1566.002 Spearphishing Link - URL to fake login
T1566.003 Spearphishing via Service - DM / SaaS
T1566.004 Spearphishing Voice - vishing

What phishing is not

Phishing is distinct from malware delivery (the kit may stage malware, but the technique is identity-targeted), from generic spam (which has no identity-impersonation step), and from Business Email Compromise (BEC) - which manipulates a real person's mailbox without the spoofed-domain landing page that defines phishing.

How phishing kits work

Modern off-the-shelf phishing kits chain four stages. Detection at each stage feeds different signals into the pipeline that powers phishunt's active feed.

1. Lure

Email, SMS (smishing), instant message or QR code (quishing) carrying a contextual hook - shipping notice, payroll, MFA reset, fake invoice.

Defender signal: SPF / DKIM / DMARC alignment, sender reputation, typosquat domain in the link.
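
As a rough illustration of the typosquat signal, the sketch below flags a domain label that sits within one edit of a protected brand name. The brand list, the `typosquat_of` helper and the distance threshold are assumptions made for this example; they do not describe phishunt's actual scoring.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical watch list; a real deployment would load its own brands.
BRANDS = {"paypal", "microsoft", "netflix"}


def typosquat_of(label: str, max_distance: int = 1):
    """Return the brand that `label` nearly matches, or None."""
    label = label.lower()
    for brand in sorted(BRANDS):
        if label != brand and edit_distance(label, brand) <= max_distance:
            return brand
    return None
```

For example, `typosquat_of("paypa1")` returns `"paypal"`, while the exact label `paypal` returns `None` because it is the brand itself. A distance of 1 is deliberately tight; short brand names produce many false positives at larger distances.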

2. Landing

The hosted fake-login page. Today's kits are HTML / JS bundles deployed on cheap shared hosting, with Telegram-bot exfiltration included by default.

Defender signal: brand-keyword in domain, fresh TLS cert in CT logs, asset-hash collision with a known kit, JavaScript obfuscation patterns.
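
The brand-keyword signal can be sketched as a token check on the hostname. This is a minimal example under stated assumptions: the `BRAND_DOMAINS` map is hypothetical, and `naive_apex` simply keeps the last two labels, which is wrong for suffixes such as `co.uk` (a real implementation would consult the Public Suffix List).

```python
import re

# Hypothetical map of brand keyword -> the brand's legitimate apex domain.
BRAND_DOMAINS = {"paypal": "paypal.com", "netflix": "netflix.com"}


def naive_apex(host: str) -> str:
    """Last two labels of the host; a stand-in for a Public Suffix List lookup."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def brand_keyword_hits(host: str) -> list:
    """Brands that appear as a hostname token on a domain the brand does not own."""
    host = host.lower().rstrip(".")
    tokens = set(re.split(r"[.\-]", host))
    apex = naive_apex(host)
    return sorted(
        brand
        for brand, legit_apex in BRAND_DOMAINS.items()
        if brand in tokens and apex != legit_apex
    )
```

`brand_keyword_hits("paypal.com.secure-check.top")` reports `paypal` because the registered domain is `secure-check.top`, while `www.paypal.com` produces no hit.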

3. Credential capture

The form posts username + password (and any second-factor field) to an operator endpoint. Adversary-in-the-middle kits relay live to the real service to harvest the post-MFA session cookie.

Defender signal: action= attribute pointing off-domain, mismatched form-encryption, missing CSRF token.
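
The off-domain `action=` check can be approximated with the standard-library HTML parser. The sketch below is illustrative only: `off_domain_form_actions` is a name invented for this example, and it compares exact hostnames, so a form posting to a sibling subdomain is also reported.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class FormActionCollector(HTMLParser):
    """Collects the action attribute of every <form> tag in a document."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # A missing or empty action means "post back to the current page".
            self.actions.append(dict(attrs).get("action") or "")


def off_domain_form_actions(page_url: str, html: str) -> list:
    """Return resolved form targets whose host differs from the page's host."""
    parser = FormActionCollector()
    parser.feed(html)
    page_host = urlsplit(page_url).hostname
    flagged = []
    for action in parser.actions:
        target = urljoin(page_url, action)  # resolves relative actions
        target_host = urlsplit(target).hostname
        if target_host and target_host != page_host:
            flagged.append(target)
    return flagged
```

Relative actions such as `/login` resolve to the page's own host and are not flagged. Kits that submit credentials from JavaScript rather than through a form target are invisible to this check.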

4. Exfil

Telegram bot, simple email to a Gmail dropbox, or HTTP POST to a staging server. The session cookie or credential pair is then sold or replayed against the victim's enterprise SSO.

Defender signal: outbound traffic from compromised endpoints to known operator staging hosts; phishunt blocklists the staging hosts when the operator pivots through CT-visible certs.
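
Because Telegram-bot exfil goes through the public Bot API, whose URLs have the form `https://api.telegram.org/bot<token>/<method>`, kit source code often carries a recognisable string. The following sketch extracts the numeric bot ID and method name from page or kit source; the helper name is an assumption for this example, and obfuscated kits will not match a plain regex.

```python
import re

# Bot API URLs look like api.telegram.org/bot<numeric id>:<secret>/<method>.
TELEGRAM_BOT_CALL = re.compile(
    r"api\.telegram\.org/bot(?P<bot_id>\d+):[A-Za-z0-9_-]+/(?P<method>\w+)"
)


def telegram_exfil_calls(source: str) -> list:
    """Return (bot_id, method) pairs for every Bot API URL found in `source`."""
    return [
        (m.group("bot_id"), m.group("method"))
        for m in TELEGRAM_BOT_CALL.finditer(source)
    ]
```

The bot ID is a useful pivot: the same ID appearing across many landing pages suggests a single operator. The secret part of the token is intentionally not returned.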

Detection signals

Defenders combine three signal layers. No single layer is sufficient on its own; current phishing kits routinely defeat any one of them in isolation.

Layer	What it watches	Sample signals
Network	Anything visible before the page renders.	Certificate Transparency log streams, DNS reputation, brand-keyword lexical scoring on newly registered domains, JA3 / JA4 TLS fingerprints, MX/SPF on the lure domain.
Content	The rendered page itself.	Brand-asset image-hash collision, form `action=` off-domain, missing BIMI on lure email sender, HTML smuggling patterns, page-DOM perceptual-hash match against a known kit.
Behavioral	How the page reacts to visitors.	Geo / IP filtering (only victim country sees real page), CAPTCHA / Turnstile gate before kit reveals itself, OAST canary-callback evidence (e.g. interactsh) when probing for SSRF.
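
One way to encode the "no single layer is sufficient" rule is to require signals from at least two distinct layers before a site becomes a candidate. This is a minimal sketch: the `Observation` type, the signal names and the `min_layers` default are hypothetical and are not taken from phishunt's pipeline.

```python
from dataclasses import dataclass, field

LAYERS = ("network", "content", "behavioral")


@dataclass
class Observation:
    """Signals observed for one URL, grouped by the layer that produced them."""
    network: list = field(default_factory=list)
    content: list = field(default_factory=list)
    behavioral: list = field(default_factory=list)


def layers_fired(obs: Observation) -> list:
    """Names of the layers that produced at least one signal."""
    return [name for name in LAYERS if getattr(obs, name)]


def is_candidate(obs: Observation, min_layers: int = 2) -> bool:
    """True when signals come from at least `min_layers` distinct layers."""
    return len(layers_fired(obs)) >= min_layers
```

An observation with only a brand keyword in a fresh certificate stays below the bar; adding an off-domain form action from the content layer pushes it over.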

Industry frameworks

OWASP Top 10

Phishing maps primarily to A07:2021 Identification and Authentication Failures, since the kit's purpose is to defeat authentication. Anti-phishing controls also touch A05:2021 Security Misconfiguration when the impersonated site fails to enforce HSTS, secure cookies or strict CSP.

owasp.org/www-project-top-ten

NIST SP 800-53 / 800-46

The relevant controls are AC-2 Account Management, IA-2 Identification and Authentication (Organizational Users), AT-2 Literacy Training and Awareness (titled Security Awareness Training before Rev. 5), and SI-3 Malicious Code Protection. NIST SP 800-46 (Telework) calls out phishing-resistant MFA explicitly.

SP 800-53 Rev. 5

MITRE ATT&CK technique tree

T1566 (Phishing) sits inside the Initial Access (TA0001) tactic. Once the credential is stolen, follow-on activity typically chains through T1078 Valid Accounts (which ATT&CK lists under Initial Access, Persistence, Privilege Escalation and Defense Evasion rather than Credential Access), then TA0006 Credential Access and TA0008 Lateral Movement. Defender mappings should include detective controls at each tactic transition, not just at T1566 itself.

attack.mitre.org/techniques/T1566

Phishing-resistant MFA

Modern adversary-in-the-middle (AitM) kits relay credentials and post-MFA session cookies live to the real service. SMS-OTP, time-based OTP and push notifications all fall to AitM. Only origin-bound MFA defeats it.

Factor	Phishing-resistant?	Why / why not
SMS OTP	No	Code is transferable; AitM kit relays it instantly. Also vulnerable to SIM-swap.
TOTP (Google Authenticator, Authy)	No	30-second window is plenty for an AitM relay.
Push approval (Duo, Microsoft Authenticator)	Partial	Number matching helps; basic "approve / deny" prompts are vulnerable to push-fatigue attacks and fall to AitM.
FIDO2 / WebAuthn passkey	Yes	Cryptographic challenge bound to the origin (RP ID). The attacker's lookalike domain has the wrong RP ID, so the authenticator refuses to sign.
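Smartcard / PIV	Yes	Certificate-based authentication over mutual TLS: the card signs the handshake with the server it is actually connected to, so a relaying proxy cannot replay that proof to the real service.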
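
The origin binding in the FIDO2 / WebAuthn row can be illustrated with a simplified version of the RP ID check a browser performs before it lets a page use a credential. This sketch makes simplifying assumptions: it omits the public-suffix restriction and the localhost exception, and `rp_id_allowed` is a name chosen for this example.

```python
from urllib.parse import urlsplit


def rp_id_allowed(rp_id: str, origin: str) -> bool:
    """True if a page at `origin` may use credentials scoped to `rp_id`.

    Simplified rule: the origin must be HTTPS and its host must equal the
    RP ID or be a subdomain of it.
    """
    parts = urlsplit(origin)
    host = (parts.hostname or "").lower()
    rp_id = rp_id.lower()
    if parts.scheme != "https" or not host:
        return False
    return host == rp_id or host.endswith("." + rp_id)
```

A credential registered for `example.com` is usable from `https://login.example.com` but not from `https://example.com.evil.net` or `https://examp1e.com`, which is why a reverse-proxying kit on a lookalike domain never receives a valid assertion.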

Common evasion techniques

Adversary-in-the-Middle (AitM)

Kits such as Evilginx, Tycoon 2FA and Modlishka reverse-proxy the real login site and capture the session cookie after MFA completes. This defeats SMS, TOTP and push-approval factors.

CAPTCHA / Turnstile gates

Kits gate the malicious page behind a Cloudflare Turnstile or Google reCAPTCHA. Automated crawlers (and many sandboxes) fail the challenge and never see the kit; the victim does.
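
A crawler can at least notice that it hit a gate instead of the kit. The sketch below looks for the standard embed markers of Cloudflare Turnstile and Google reCAPTCHA in fetched HTML; the marker table and function name are assumptions for this example, and a gate served through a custom wrapper would be missed.

```python
# Substrings that the standard widget embeds leave in a page.
CAPTCHA_MARKERS = {
    "turnstile": ("challenges.cloudflare.com/turnstile/", "cf-turnstile"),
    "recaptcha": ("google.com/recaptcha/", "g-recaptcha"),
}


def captcha_gates(html: str) -> list:
    """Names of the CAPTCHA widgets whose markers appear in `html`."""
    lowered = html.lower()
    return sorted(
        name
        for name, markers in CAPTCHA_MARKERS.items()
        if any(marker in lowered for marker in markers)
    )
```

A non-empty result is not a verdict on its own, since many legitimate sites use the same widgets; it only tells the pipeline that the real content was not observed and an interactive fetch is needed.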

Geo / IP filtering

Server-side filters return a benign decoy page to any IP outside the victim country, ASN or specific subnet range. Detection requires fetching from the same egress as the target audience.
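
Cloaking of this kind shows up when the same URL is fetched from several egress points and the bodies disagree. The following sketch compares already-fetched bodies by a whitespace-normalised hash; the vantage labels and helper names are hypothetical, and pages with per-request nonces or timestamps would need fuzzier comparison than an exact hash.

```python
import hashlib
import re


def body_fingerprint(html: str) -> str:
    """SHA-256 of the body after collapsing whitespace and lowercasing."""
    normalised = re.sub(r"\s+", " ", html).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def cloaking_suspected(bodies_by_vantage: dict) -> bool:
    """True when vantage points received materially different bodies."""
    fingerprints = {body_fingerprint(body) for body in bodies_by_vantage.values()}
    return len(fingerprints) > 1
```

The input is a mapping such as `{"us-residential": "...", "de-datacenter": "..."}`; fetching is left out so the comparison logic stays testable offline.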

HTML smuggling

The lure email or page contains JavaScript that reassembles the malicious payload client-side, typically as a Blob, bypassing email-gateway content scanners that inspect attachments at rest.
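
A simple static heuristic for this pattern counts how many of the usual building blocks appear together in a script. The indicator set and the `min_hits` threshold below are assumptions for illustration; each API is legitimate on its own, so only the combination is treated as suspicious.

```python
import re

# Browser APIs commonly combined to rebuild and drop a file client-side.
SMUGGLING_INDICATORS = {
    "base64_decode": re.compile(r"\batob\s*\("),
    "blob_constructor": re.compile(r"\bnew\s+Blob\s*\("),
    "object_url": re.compile(r"\bURL\.createObjectURL\s*\("),
    "legacy_save": re.compile(r"\bmsSaveOrOpenBlob\s*\("),
    "download_attr": re.compile(r"\.download\s*="),
}


def smuggling_indicators(source: str) -> list:
    """Names of the indicators present in `source`."""
    return sorted(
        name for name, pattern in SMUGGLING_INDICATORS.items()
        if pattern.search(source)
    )


def looks_like_smuggling(source: str, min_hits: int = 3) -> bool:
    """True when at least `min_hits` distinct indicators co-occur."""
    return len(smuggling_indicators(source)) >= min_hits
```

Obfuscated scripts that build these identifiers from string fragments evade plain regexes, so this is a first-pass filter rather than a detector.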

Glossary

ACME: Automatic Certificate Management Environment - the protocol Let's Encrypt and Google Trust Services use for free / automated TLS issuance.
AitM: Adversary-in-the-Middle - a kit pattern that reverse-proxies the real login site to harvest the post-MFA session cookie.
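BIMI: Brand Indicators for Message Identification - a DNS-published standard that lets mail clients display a brand's verified logo next to DMARC-authenticated messages; a missing logo on mail claiming to come from a brand that normally shows one is a phishing tell.
CT log: Certificate Transparency log - a public append-only log of issued certificates. Major browsers only trust publicly issued certificates that have been logged, so in practice public CAs submit everything they issue. Defenders monitor CT logs for brand-keyword domains as an early-warning signal.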
DV / OV / EV: Certificate validation tiers: Domain Validated (proof of domain control), Organization Validated, Extended Validation. DV is the most common and the cheapest to abuse.
HSTS: HTTP Strict Transport Security - forces browsers to use HTTPS only. HSTS preload (the browser-baked-in list) is near-irreversible and should not be set lightly.
JA3 / JA4: TLS-handshake fingerprints. Phishing kits often reuse the same TLS stack across thousands of staged domains, producing recognizable fingerprints.
MTA-STS: SMTP MTA Strict Transport Security - the mail-server analogue of HSTS. It reduces an attacker's ability to downgrade SMTP to plaintext for interception.
OAST: Out-of-band Application Security Testing - canary URLs (e.g. interactsh, Burp Collaborator) that record DNS / HTTP / SMTP requests, used to confirm blind vulnerabilities.
RP ID: Relying Party ID in WebAuthn - the domain an authenticator scopes a credential to; it must equal the page's host or be a parent domain of it. This binding is the core property of phishing-resistant MFA.
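
To make the JA3 entry concrete: the original JA3 recipe joins five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats) as decimal values, with `-` inside a field and `,` between fields, then takes the MD5 of that string. The sketch below assumes the values have already been parsed from a handshake and had GREASE values filtered out; JA4 uses a different format and is not shown.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the pre-hash JA3 string from parsed ClientHello values."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 hex digest of the JA3 string (a fingerprint, not a security hash)."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Two clients that offer the same parameters in the same order produce the same fingerprint, which is what makes reused tooling recognisable.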

FAQ

Is phishunt's data CC0?

Yes. The data published through the TXT, JSON and CSV feeds and the public REST API is released under Creative Commons CC0 1.0. The website code, branding and trademarks are not covered by CC0.

How fresh is phishunt's data?

The detection pipeline runs hourly and active sites are rechecked every six hours. Domains stay in the active feed only while they keep returning a phishing page; sites that go down or get cleaned up leave the feed.

How does phishunt compare to PhishTank or OpenPhish?

All three publish phishing IOCs but they source them differently: PhishTank is community-curated, OpenPhish is a commercial service with paid premium feeds alongside a free community feed, and phishunt is detection-driven from CT logs and newly registered domains. The three feeds complement each other rather than replace each other.

How do I report a false positive?

Email info@phishunt.io with the URL and a brief reason. Reports are reviewed and confirmed false positives are removed within 24 hours on weekdays.

phishunt does not currently accept new phishing-site submissions by email; the feed is populated by the automated detection pipeline. A user-facing submission flow may be added later.

Can I integrate phishunt into my SOC tooling?

Yes. The /api/ page documents the REST endpoints (free, no auth, CC0 data). For agentic / LLM-driven workflows the MCP server at mcp.phishunt.io exposes the same data as JSON-RPC tools (see /agents/ for setup snippets).
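
As a starting point for SOC integration, the sketch below turns the text of a URL feed into a hostname set that a proxy or SIEM lookup can query. It assumes a one-URL-per-line layout with optional `#` comment lines, which is an assumption about the TXT feed rather than a documented format; the endpoint paths and field names should be taken from the /api/ page, and downloading the feed is left out here.

```python
from urllib.parse import urlsplit


def hosts_from_feed(feed_text: str) -> set:
    """Extract lowercase hostnames from a one-URL-per-line feed body."""
    hosts = set()
    for line in feed_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Tolerate bare hostnames by giving them a scheme to parse with.
        candidate = line if "://" in line else "http://" + line
        host = urlsplit(candidate).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def is_listed(url: str, hosts: set) -> bool:
    """True if the URL's hostname appears in the feed-derived set."""
    host = urlsplit(url).hostname
    return bool(host) and host.lower() in hosts
```

Because phishunt favors recall over precision, a hit here is better treated as a reason to alert or enrich than as grounds for an automatic block.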

Why "suspicious" rather than "confirmed malicious"?

phishunt favors recall over precision. Sites that pass the brand-impersonation threshold ship to the active feed without third-party confirmation. Pair with Google Safe Browsing, urlscan or VirusTotal when you need confirmed verdicts.

What does the SANS ISC tag mean on URL detail pages?

phishunt cross-references each detected URL against the SANS Internet Storm Center DShield recent-domains feed, which lists every apex first observed in DShield's passive-DNS sensors that day. When the apex has a SANS score of 5 or higher, the Detection strip shows a SANS ISC badge plus the raw scorereason (e.g. Excessive hyphens: 5 (+4), Contains suspicious keyword: deal (+20)).

SANS data is consumed under CC BY-NC-SA 4.0 with attribution to SANS Internet Storm Center / DShield.
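
For consumers who want to work with the raw scorereason strings, the sketch below splits a string into its reason text and point value and applies the badge threshold of 5 described above. The trailing `(+N)` layout is inferred from the two examples in this answer only; other formats, including negative adjustments, are returned as `None`.

```python
import re

# Matches "<reason> (+<points>)", e.g. "Excessive hyphens: 5 (+4)".
SCOREREASON = re.compile(r"^(?P<reason>.+?)\s*\(\+(?P<points>\d+)\)$")
BADGE_THRESHOLD = 5  # score at or above which the SANS ISC badge is shown


def parse_scorereason(text: str):
    """Return (reason, points), or None if the string has another shape."""
    match = SCOREREASON.match(text.strip())
    if match is None:
        return None
    return match.group("reason"), int(match.group("points"))


def shows_badge(score: int) -> bool:
    """True when the apex's SANS score meets the badge threshold."""
    return score >= BADGE_THRESHOLD
```

For instance, `parse_scorereason("Contains suspicious keyword: deal (+20)")` yields the reason text and the integer 20, which can then be aggregated per apex.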

Open educational reference. Last updated 2026-05-04. Suggestions and corrections via [email protected].