October 30, 2023

Exploring AI Threats: Package Hallucination Attacks

Learn how malicious actors exploit errors in generative AI tools to launch packet attacks. Read how Darktrace products detect and prevent these threats!

Written by

Charlotte Thompson

Cyber Analyst

Written by

Tiana Kelly

Deputy Team Lead, London & Cyber Analyst

Inside the SOC

Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.

Written by

Charlotte Thompson

Cyber Analyst

Written by

Tiana Kelly

Deputy Team Lead, London & Cyber Analyst

Oct 2023

AI tools open doors for threat actors

On November 30, 2022, the free conversational language generation model ChatGPT was launched by OpenAI, an artificial intelligence (AI) research and development company. The launch of ChatGPT was the culmination of development ongoing since 2018 and represented the latest innovation in the ongoing generative AI boom and made the use of generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to currently have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments have employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the impacts of the increased linguistic complexity in social engineering and phishing attacks resulting from generative AI tool use, with Darktrace observing a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Less overall consideration, however, has been given to impacts stemming from errors intrinsic to generative AI tools. One of these errors is AI hallucinations.

What is an AI hallucination?

AI “hallucination” is a term which refers to the predictive elements of generative AI and LLMs’ AI model gives an unexpected or factually incorrect response which does not align with its machine learning training data [4]. This differs from regular and intended behavior for an AI model, which should provide a response based on the data it was trained upon.

Why are AI hallucinations a problem?

Despite the term indicating it might be a rare phenomenon, hallucinations are far more likely than accurate or factual results as the AI models used in LLMs are merely predictive and focus on the most probable text or outcome, rather than factual accuracy.

Given the widespread use of generative AI tools in the workplace employees are becoming significantly more likely to encounter an AI hallucination. Furthermore, if these fabricated hallucination responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. Software developers may ask LLMs for recommendations on specific pieces of code or how to solve a specific problem, which will likely lead to a third-party package. It is possible that packages recommended by generative AI tools could represent AI hallucinations and the packages may not have been published, or, more accurately, the packages may not have been published prior to the date at which the training data for the model halts. If these hallucinations result in common suggestions of a non-existent package, and the developer copies the code snippet wholesale, this may leave the exchanges vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. In 20% of the responses provided by ChatGPT pertaining to Node.js at least one un-published package was included, whilst the figure sat at around 35% for Python [4].

Hallucinations can be unpredictable, but would-be attackers are able to find packages to create by asking generative AI tools generic questions and checking whether the suggested packages exist already. As such, attacks using this vector are unlikely to target specific organizations, instead posing more of a widespread threat to users of generative AI tools.

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and responses given by generative AI tools may not always be consistent, malicious actors are able to discover AI hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors. Once a malicious actor has discovered a hallucination of an un-published package, they are able to create a package with the same name and include a malicious payload, before publishing it. This is known as a malicious package.

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that had previously been confirmed to contain malicious content, or a package that is no longer maintained and, therefore, is more vulnerable to hijack by malicious actors.

In such scenarios it is not necessary to manipulate the training data (data poisoning) to achieve the desired outcome for the malicious actor, thus a complex and time-consuming attack phase can easily be bypassed.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments, and to autonomously perform inhibitive actions in response to such detections. These models will trigger based on connections to endpoints associated with generative AI tools, as such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with the breaching of one of the following DETECT models:

Compliance / Anomalous Upload to Generative AI
Compliance / Beaconing to Rare Generative AI and Generative AI
Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

Device / Anomalous GitHub Download
Device / Anomalous Script Download Followed By Additional Packages

Whatever goal the malicious package has been designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a large variety of different malware types.

As GitHub is a commonly used service by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such GitHub downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operation Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As the use of generative AI becomes more accessible and an increasingly prevalent tool in an attacker’s toolbox, organizations will benefit from implementing company-wide policies to define expectations surrounding the use of such tools. It is simple, yet critical, for example, for employees to fact check responses provided to them by generative AI tools. All packages recommended by generative AI should also be checked by reviewing non-generated data from either external third-party or internal sources. It is also good practice to adopt caution when downloading packages with very few downloads as it could indicate the package is untrustworthy or malicious.

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that a greater embrace of AI tools in the workplace will be seen in the coming years as the AI technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, Tiana Kelly, Analyst Team Lead, London, Cyber Analyst

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk?nab=1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/

Deputy Team Lead, London & Cyber Analyst

Inside the SOC

Written by

Charlotte Thompson

Cyber Analyst

Written by

Tiana Kelly

Deputy Team Lead, London & Cyber Analyst

•

May 8, 2025

Nathaniel Jones

VP, Security & AI Strategy, Field CISO

•

May 6, 2025

Pallavi Singh

Product Marketing Manager, OT Security & Compliance

Watch the NIS2 Webinar

Blog

May 8, 2025

Anomaly-based threat hunting: Darktrace's approach in action

What is threat hunting?

Threat hunting in cybersecurity involves proactively and iteratively searching through networks and datasets to detect threats that evade existing automated security solutions. It is an important component of a strong cybersecurity posture.

There are several frameworks that Darktrace analysts use to guide how threat hunting is carried out, some of which are:

MITRE Attack
Tactics, Techniques, Procedures (TTPs)
Diamond Model for Intrusion Analysis
Adversary, Infrastructure, Victims, Capabilities
Threat Hunt Model – Six Steps
Purpose, Scope, Equip, Plan, Execute, Feedback
Pyramid of Pain

These frameworks are important in baselining how to run a threat hunt. There are also a combination of different methods that allow defenders diversity– regardless of whether it is a proactive or reactive threat hunt. Some of these are:

Hypothesis-based threat hunting
Analytics-driven threat hunting
Automated/machine learning hunting
Indicator of Compromise (IoC) hunting
Victim-based threat hunting

Threat hunting with Darktrace

At its core, Darktrace relies on anomaly-based detection methods. It combines various machine learning types that allows it to characterize what constitutes ‘normal’, based on the analysis of many different measures of a device or actor’s behavior. Those types of learning are then curated into what are called models.

Darktrace models leverage anomaly detection and integrate outputs from Darktrace Deep Packet Inspection, telemetry inputs, and additional modules, creating tailored activity detection.

This dynamic understanding allows Darktrace to identify, with a high degree of precision, events or behaviors that are both anomalous and unlikely to be benign. On top of machine learning models for detection, there is also the ability to change and create models showcasing the tool’s diversity. The Model Editor allows security teams to specify values, priorities, thresholds, and actions they want to detect. That means a team can create custom detection models based on specific use cases or business requirements. Teams can also increase the priority of existing detections based on their own risk assessments to their environment.

This level of dexterity is particularly useful when conducting a threat hunt. As described above, and in previous ‘Inside the SOC’ blogs such a threat hunt can be on a specific threat actor, specific sector, or a hypothesis-based threat hunt combined with ‘experimenting’ with some of Darktrace’s models.

Conducting a threat hunt in the energy sector with experimental models

In Darktrace’s recent Threat Research report “AI & Cybersecurity: The state of cyber in UK and US energy sectors” Darktrace’s Threat Research team crafted hypothesis-driven threat hunts, building experimental models and investigating existing models to test them and detect malicious activity across Darktrace customers in the energy sector.

For one of the hunts, which hypothesised utilization of PerfectData software and multi-factor authentication (MFA) bypass to compromise user accounts and destruct data, an experimental model was created to detect a Software-as-a-Service (SaaS) user performing activity relating to 'PerfectData Software’, known to allow a threat actor to exfiltrate whole mailboxes as a PST file. Experimental model alerts caused by this anomalous activity were analyzed, in conjunction with existing SaaS and email-related models that would indicate a multi-stage attack in line with the hypothesis.

Whilst hunting, Darktrace researchers found multiple model alerts for this experimental model associated with PerfectData software usage, within energy sector customers, including an oil and gas investment company, as well as other sectors. Upon further investigation, it was also found that in June 2024, a malicious actor had targeted a renewable energy infrastructure provider via a PerfectData Software attack and demonstrated intent to conduct an Operational Technology (OT) attack.

The actor logged into Azure AD from a rare US IP address. They then granted Consent to ‘eM Client’ from the same IP. Shortly after, the actor granted ‘AddServicePrincipal’ via Azure to PerfectData Software. Two days later, the actor created a new email rule from a London IP to move emails to an RSS Feed Folder, stop processing rules, and mark emails as read. They then accessed mail items in the “\Sent” folder from a malicious IP belonging to anonymization network, Private Internet Access Virtual Private Network (PIA VPN) [1]. The actor then conducted mass email deletions, deleting multiple instances of emails with subject “[Name] shared "[Company Name] Proposal" With You” from the “\Sent folder”. The emails’ subject suggests the email likely contains a link to file storage for phishing purposes. The mass deletion likely represented an attempt to obfuscate a potential outbound phishing email campaign.

Figure 1: The Darktrace Model Alert that triggered for the mass deletes of the likely phishing email containing a file storage link.

A month later, the same user was observed downloading mass mLog CSV files related to proprietary and Operational Technology information. In September, three months after the initial attack, another mass download of operational files occurred by this actor, pertaining to operating instructions and measurements, The observed patience and specific file downloads seemingly demonstrated an intent to conduct or research possible OT attack vectors. An attack on OT could have significant impacts including operational downtime, reputational damage, and harm to everyday operations. Darktrace alerted the impacted customer once findings were verified, and subsequent actions were taken by the internal security team to prevent further malicious activity.

Conclusion

Harnessing the power of different tools in a security stack is a key element to cyber defense. The above hypothesis-based threat hunt and custom demonstrated intent to conduct an experimental model creation demonstrates different threat hunting approaches, how Darktrace’s approach can be operationalized, and that proactive threat hunting can be a valuable complement to traditional security controls and is essential for organizations facing increasingly complex threat landscapes.

Credit to Nathaniel Jones (VP, Security & AI Strategy, Field CISO at Darktrace) and Zoe Tilsiter (EMEA Consultancy Lead)

‍

References

https://spur.us/context/191.96.106.219

‍

About the author

Nathaniel Jones

VP, Security & AI Strategy, Field CISO

Blog

May 6, 2025

Combatting the Top Three Sources of Risk in the Cloud

With cloud computing, organizations are storing data like intellectual property, trade secrets, Personally Identifiable Information (PII), proprietary code and statistics, and other sensitive information in the cloud. If this data were to be accessed by malicious actors, it could incur financial loss, reputational damage, legal liabilities, and business disruption.

Last year data breaches in solely public cloud deployments were the most expensive type of data breach, with an average of $5.17 million USD, a 13.1% increase from the year before.

So, as cloud usage continues to grow, the teams in charge of protecting these deployments must understand the associated cybersecurity risks.

What are cloud risks?

Cloud threats come in many forms, with one of the key types consisting of cloud risks. These arise from challenges in implementing and maintaining cloud infrastructure, which can expose the organization to potential damage, loss, and attacks.

There are three major types of cloud risks:

1. Misconfigurations

As organizations struggle with complex cloud environments, misconfiguration is one of the leading causes of cloud security incidents. These risks occur when cloud settings leave gaps between cloud security solutions and expose data and services to unauthorized access. If discovered by a threat actor, a misconfiguration can be exploited to allow infiltration, lateral movement, escalation, and damage.

With the scale and dynamism of cloud infrastructure and the complexity of hybrid and multi-cloud deployments, security teams face a major challenge in exerting the required visibility and control to identify misconfigurations before they are exploited.

Common causes of misconfiguration come from skill shortages, outdated practices, and manual workflows. For example, potential misconfigurations can occur around firewall zones, isolated file systems, and mount systems, which all require specialized skill to set up and diligent monitoring to maintain

2. Identity and Access Management (IAM) failures

IAM has only increased in importance with the rise of cloud computing and remote working. It allows security teams to control which users can and cannot access sensitive data, applications, and other resources.

Cybersecurity professionals ranked IAM skills as the second most important security skill to have, just behind general cloud and application security.

There are four parts to IAM: authentication, authorization, administration, and auditing and reporting. Within these, there are a lot of subcomponents as well, including but not limited to Single Sign-On (SSO), Two-Factor Authentication (2FA), Multi-Factor Authentication (MFA), and Role-Based Access Control (RBAC).

Security teams are faced with the challenge of allowing enough access for employees, contractors, vendors, and partners to complete their jobs while restricting enough to maintain security. They may struggle to track what users are doing across the cloud, apps, and on-premises servers.

When IAM is misconfigured, it increases the attack surface and can leave accounts with access to resources they do not need to perform their intended roles. This type of risk creates the possibility for threat actors or compromised accounts to gain access to sensitive company data and escalate privileges in cloud environments. It can also allow malicious insiders and users who accidentally violate data protection regulations to cause greater damage.

3. Cross-domain threats

The complexity of hybrid and cloud environments can be exploited by attacks that cross multiple domains, such as traditional network environments, identity systems, SaaS platforms, and cloud environments. These attacks are difficult to detect and mitigate, especially when a security posture is siloed or fragmented.

Some attack types inherently involve multiple domains, like lateral movement and supply chain attacks, which target both on-premises and cloud networks.

Challenges in securing against cross-domain threats often come from a lack of unified visibility. If a security team does not have unified visibility across the organization’s domains, gaps between various infrastructures and the teams that manage them can leave organizations vulnerable.

Adopting AI cybersecurity tools to reduce cloud risk

For security teams to defend against misconfigurations, IAM failures, and insecure APIs, they require a combination of enhanced visibility into cloud assets and architectures, better automation, and more advanced analytics. These capabilities can be achieved with AI-powered cybersecurity tools.

Such tools use AI and automation to help teams maintain a clear view of all their assets and activities and consistently enforce security policies.

Darktrace / CLOUD is a Cloud Detection and Response (CDR) solution that makes cloud security accessible to all security teams and SOCs by using AI to identify and correct misconfigurations and other cloud risks in public, hybrid, and multi-cloud environments.

It provides real-time, dynamic architectural modeling, which gives SecOps and DevOps teams a unified view of cloud infrastructures to enhance collaboration and reveal possible misconfigurations and other cloud risks. It continuously evaluates architecture changes and monitors real-time activity, providing audit-ready traceability and proactive risk management.

Figure 1: Real-time visibility into cloud assets and architectures built from network, configuration, and identity and access roles. In this unified view, Darktrace / CLOUD reveals possible misconfigurations and risk paths.

Darktrace / CLOUD also offers attack path modeling for the cloud. It can identify exposed assets and highlight internal attack paths to get a dynamic view of the riskiest paths across cloud environments, network environments, and between – enabling security teams to prioritize based on unique business risk and address gaps to prevent future attacks.

Darktrace’s Self-Learning AI ensures continuous cloud resilience, helping teams move from reactive to proactive defense.

[related-resource]

About the author

Pallavi Singh

Product Marketing Manager, OT Security & Compliance

Your data. Our AI.

Elevate your network security with Darktrace AI

Get a demo

Check out this article by Darktrace: Exploring AI Threats: Package Hallucination Attacks

Exploring AI Threats: Package Hallucination Attacks

AI tools open doors for threat actors

What is an AI hallucination?

Why are AI hallucinations a problem?

Use of generative AI in software development

Malicious packages as attack vectors

How could Darktrace detect an AI Package Hallucination Attack?

Conclusion

References

Anomaly-based threat hunting: Darktrace's approach in action

Combatting the Top Three Sources of Risk in the Cloud

Enjoying the blog?

More in this series

Introducing Version 2 of Darktrace’s Embedding Model for Investigation of Security Threats (DEMIST-2)

AI Uncovered: Introducing Darktrace Incident Graph Evaluation for Security Threats (DIGEST)

Force Multiply Your Security Team with Agentic AI: How the Industry’s Only True Cyber AI Analyst™ Saves Time and Stop Threats

Blog

May 8, 2025

Anomaly-based threat hunting: Darktrace's approach in action

What is threat hunting?

Threat hunting with Darktrace

Conducting a threat hunt in the energy sector with experimental models

Conclusion

References

Blog

May 6, 2025

Combatting the Top Three Sources of Risk in the Cloud

What are cloud risks?

1. Misconfigurations

2. Identity and Access Management (IAM) failures

3. Cross-domain threats

Adopting AI cybersecurity tools to reduce cloud risk