Cybercriminals are increasingly using Large Language Models (LLMs) to generate content for large-scale phishing and scam attacks, Kaspersky’s AI Research Center experts have discovered. As threat actors attempt to generate fraudulent websites in high volumes, they often leave behind distinctive artifacts – such as AI-specific phrases – that set these sites apart from those created manually. So far, most phishing examples observed by Kaspersky target users of cryptocurrency exchanges and wallets.
Kaspersky experts analyzed a sample of resources, identifying key characteristics that help distinguish and detect cases where AI was used to generate content or even entire phishing and scam websites.
One of the prominent signs of LLM-generated text is the presence of disclaimers and refusals to fulfill a request, including phrases such as “As an AI language model…”. For instance, two phishing pages targeting KuCoin users contain this type of wording.
Another distinctive indicator of language model usage is the presence of concessive clauses, such as: “While I can’t do exactly what you want, I can try something similar.” In other examples targeting Gemini and Exodus users, the LLM declines to provide detailed login instructions.
“With LLMs, attackers can automate the creation of dozens or even hundreds of phishing and scam web pages with unique, high-quality content,” explains Vladislav Tushkanov, Research Development Group Manager at Kaspersky. “Previously, this required manual effort, but now AI can help threat actors generate such content automatically.”
LLMs can be used to create not just text blocks but entire web pages, with artifacts appearing both in the text itself and in areas like meta tags: snippets of text that describe a web page’s content and appear in its HTML code.
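For illustration, a detection pipeline could surface such meta-tag artifacts with a few lines of Python. The sketch below uses the standard library’s html.parser; the phrase list is an illustrative assumption based on the examples in this article, not Kaspersky’s actual detection logic.

```python
from html.parser import HTMLParser

# Illustrative telltale phrases drawn from the examples above;
# a real system would use a much larger, curated list.
ARTIFACT_PHRASES = (
    "as an ai language model",
    "i can't do exactly what you want",
    "according to my last update",
)

class MetaTagScanner(HTMLParser):
    """Collect <meta> content attributes containing LLM artifacts."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        content = (dict(attrs).get("content") or "").lower()
        if any(phrase in content for phrase in ARTIFACT_PHRASES):
            self.hits.append(content)

scanner = MetaTagScanner()
scanner.feed('<meta name="description" '
             'content="As an AI language model, I cannot access your wallet.">')
print(scanner.hits)  # ['as an ai language model, i cannot access your wallet.']
```

A real crawler would apply the same check to the page’s visible text, not just its meta tags.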
There are other indicators of AI usage in creating fraudulent sites. Some models, for instance, tend to overuse specific phrases such as “delve”, “in the ever-evolving landscape”, and “in the ever-changing world”. While none of these terms is a strong indicator of AI-generated content on its own, they can serve as supporting signals.
Another telltale feature of LLM-generated text is an explicit statement of the model’s knowledge cutoff: the point up to which its knowledge of the world extends. The model typically articulates this limitation with phrases such as “according to my last update in January 2023”.
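Taken together, these phrase-level artifacts lend themselves to a simple scoring heuristic. The sketch below is a hypothetical illustration: the phrases come from the examples in this article, but the patterns and weights are assumptions rather than a production classifier.

```python
import re

# Strong signals: near-certain LLM artifacts (disclaimers, refusals,
# knowledge-cutoff statements). Weak signals: stylistic tics that only
# nudge the score. All weights are illustrative assumptions.
STRONG_SIGNALS = (
    r"as an ai language model",
    r"according to my last update in \w+ \d{4}",
    r"i can'?t do exactly what you want",
)
WEAK_SIGNALS = (
    r"\bdelve\b",
    r"in the ever-(?:evolving|changing) (?:landscape|world)",
)

def ai_artifact_score(text: str) -> float:
    """Count strong signals at 1.0 each and weak signals at 0.25 each."""
    t = text.lower()
    score = sum(1.0 for p in STRONG_SIGNALS if re.search(p, t))
    score += sum(0.25 for p in WEAK_SIGNALS if re.search(p, t))
    return score

print(ai_artifact_score(
    "According to my last update in January 2023, "
    "in the ever-evolving landscape of crypto exchanges..."
))  # 1.25
```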
LLM-generated text is often combined with tactics that make phishing page detection more complicated for cybersecurity tools. For instance, attackers may use non-standard Unicode symbols, such as those with diacritics or from mathematical notation, to obfuscate text and prevent matching by rule-based detection systems.
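One way a detection system might counter this kind of obfuscation is to normalize text before matching. The minimal sketch below, built on Python’s standard unicodedata module, folds mathematical-notation letters back to plain letters and strips diacritics; the sample strings are invented for illustration.

```python
import unicodedata

def deobfuscate(text: str) -> str:
    """Fold common Unicode obfuscation back to plain letters.

    NFKD compatibility decomposition maps mathematical alphanumeric
    symbols to ordinary letters and splits accented characters into
    a base letter plus combining marks, which are then dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Both obfuscated samples fold back to "verify your wallet",
# so rule-based phrase matching works again on the result.
for sample in ("𝚟𝚎𝚛𝚒𝚏𝚢 𝚢𝚘𝚞𝚛 𝚠𝚊𝚕𝚕𝚎𝚝", "vérìfy yoùr wàllét"):
    print(deobfuscate(sample))
```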
“Large language models are improving, and cybercriminals are exploring ways to apply this technology for nefarious purposes. However, occasional errors provide insights into their use of such tools, particularly into the growing extent of automation. With future advancements, distinguishing AI-generated content from human-written text may become more challenging, making it crucial to use advanced security solutions that analyze textual information along with metadata, and other fraud indicators,” said Vladislav Tushkanov.
A full report with additional examples and analysis is available on Securelist. To stay protected against phishing, Kaspersky recommends:
- Check the spelling of hyperlinks. Phishing emails and websites can look just like the real thing, depending on how well the criminals did their homework, but the hyperlinks usually give them away: they often contain spelling mistakes or redirect you somewhere else entirely.
- Enter the web address directly into your browser. If an email contains a link, hover over it first to check that it points where it claims to. Even then, it is safer to navigate to the site on your own rather than follow the link, because dangerous websites can look identical to authentic ones.
- Use a modern security solution: such products provide safe browsing features that protect against dangerous websites, downloads, and extensions.