ChatGPhish: ChatGPT Web Summaries Enable Phishing
ChatGPhish turns ChatGPT's web summary feature into a live phishing surface. Here's how the attack works and what developers need to do now.
ChatGPhish: When AI Summaries Become Phishing Vectors
The ChatGPhish vulnerability exposes a serious problem with how ChatGPT processes and surfaces web content to users. Researchers have identified that ChatGPT's web browsing and summarization features can be manipulated to render attacker-controlled phishing content directly inside the chat interface. No suspicious link required. No obvious redirect. The phishing payload rides inside what looks like a legitimate AI-generated summary.
This matters because users inherently trust output from AI assistants. When ChatGPT summarizes a webpage, most people assume the content reflects the actual source. That assumption is now an attack surface.
How the ChatGPhish Attack Actually Works
The mechanics are rooted in prompt injection. An attacker embeds hidden instructions inside a webpage, usually in white text on a white background, inside HTML comments, or buried in metadata. When ChatGPT's web browsing feature crawls and summarizes that page, it processes the hidden instructions alongside the visible content.
Those injected instructions can tell the model to output fake login prompts, fabricated security warnings, or convincing credential-harvesting messages. The model follows the injected instructions because it has no reliable way to distinguish between "content to summarize" and "instructions to execute" when both arrive through the same input channel.
The result: a user asks ChatGPT to summarize a page, and the response contains a phishing message that appears to come from the AI itself. The original malicious page never needs to be visited directly.
What Developers and Security Teams Are Exposed To
Any application or workflow that pipes web content into an LLM for summarization shares this exposure. That includes internal tools built on the OpenAI API, browser extensions using GPT-powered summaries, and automation pipelines that fetch-then-summarize external URLs.
The risk compounds when those summaries are displayed to end users without any sanitization or attribution. A developer building a research assistant, news aggregator, or customer-facing tool on top of GPT web browsing features needs to treat LLM output as untrusted input, not ground truth.
Credential phishing is the obvious attack path. But injected instructions can also be used to exfiltrate context from earlier in the conversation, redirect users to malicious URLs embedded in "helpful" follow-up suggestions, or socially engineer users into taking specific actions.
How to Reduce Your Exposure to Prompt Injection Phishing
A few concrete steps reduce the risk significantly:
- Sanitize before summarizing. Strip HTML comments, hidden text, and non-visible DOM elements before passing webpage content to any LLM.
- Add a summarization wrapper prompt. Instruct the model to only summarize visible, human-readable content and to ignore any instructions embedded in source material.
- Never render raw LLM output as trusted UI. Treat model responses like user-generated content. Escape it, review it, and where possible, attribute sources explicitly.
- Audit your pipelines. If you have any workflow that fetches external URLs and feeds them to an LLM, test it for prompt injection now. Use a tool like VibeWShield's automated scanner to identify injection-prone surfaces in your web apps.
- Monitor for anomalous outputs. LLM responses that contain links, login prompts, or urgent calls to action from a summarization task are red flags.
OpenAI has been made aware of prompt injection risks in prior research cycles, but no architectural fix has fully closed this class of vulnerability. Defense has to happen at the application layer.
FAQ
What exactly is ChatGPhish and is it a CVE? ChatGPhish is a named attack technique, not a formally assigned CVE. It describes prompt injection attacks that weaponize ChatGPT's web summarization feature to deliver phishing content through AI output.
Does this affect apps built on the OpenAI API, not just ChatGPT directly? Yes. Any application that uses the API to fetch and summarize external web content is potentially vulnerable to the same class of prompt injection phishing attack.
How do I test if my application is vulnerable to this type of attack? Manual testing with crafted injection payloads is a start, but automated scanning catches more surface area. Run your app through VibeWShield to identify prompt injection and web vulnerability exposure points.
Scan your application for prompt injection and phishing vulnerabilities now at vibewshield.com/scan.
Free security scan
Is your app vulnerable to similar attacks?
VibeWShield automatically scans for these and 18 other security checks in under 3 minutes.
Scan your app free