TP#19 Silently Prompting Your AI Assistant
Plus: A PolyGlot in your AI pipeline?
Welcome to this week’s Threat Prompt, where AI and Cybersecurity intersect…
A silent vector to deliver evil prompts to an AI assistant. Hands up, who had this in their threat model?
Amazon Alexa (CVE-2023-33248) is in focus, but imagine a GPT-enabled audio interface connected to agents that can access your private data or perform actions on your behalf.
Picture this scenario: an attacker wants to prompt a voice-activated GPT-based AI assistant used by an executive taking part in a company AI pilot. To limit who can speak to the voice-activated device, it is installed in the penthouse executive lounge - regular employees and outsiders have no access. The AI assistant is connected to tools, including Intranet and web search, but not email.
The attacker emails the executive with a well-crafted phishing email made to appear as if it’s coming from the company’s IT department. The email contains a seemingly harmless audio file attachment, which the attacker claims demonstrates a new AI feature that needs the executive’s approval. Unaware of the malicious intent, the executive downloads and plays the audio file.
The audio file contains ultrasonic, inaudible voice commands engineered to exploit the AI assistant’s voice-activated functionalities. The AI assistant, interpreting the hidden commands as valid, searches the Intranet for sensitive company information. It gathers confidential financial data and recent merger negotiations that have not yet been made public and exfiltrates the data through the web search plugin.
Armed with this information, the attacker can profit through insider trading or potentially blackmail the company. The executive and the company remain oblivious to the attacker’s infiltration while the attacker profits from the stolen information.
Or, perhaps profit was not the motive. Instead, the audio file instructs the AI to call in a bomb threat…attributable to the executive.
Now go one step further: the audio file delivers a prompt injection to exploit AI-connected tooling to compromise backend systems…
Now, back to the study:
This study investigates a primary inaudible attack vector on Amazon Alexa voice services using near ultrasound trojans and focuses on characterizing the attack surface and examining the practical implications of issuing inaudible voice commands. The research maps each attack vector to a tactic or technique from the MITRE ATT&CK matrix, covering enterprise, mobile, and Industrial Control System (ICS) frameworks.
The experiment involved generating and surveying fifty near-ultrasonic audios to assess the attacks' effectiveness, with unprocessed commands having a 100% success rate and processed ones achieving a 58% overall success rate. This systematic approach stimulates previously unaddressed attack surfaces, ensuring comprehensive detection and attack design while pairing each ATT&CK Identifier with a tested defensive method, providing attack and defense tactics for prompt-response options.
The main findings reveal that the attack method employs Single Upper Sideband Amplitude Modulation (SUSBAM) to generate near-ultrasonic audio from audible sources, transforming spoken commands into a frequency range beyond human-adult hearing. By eliminating the lower sideband, the design achieves a 6 kHz minimum from 16–22 kHz while remaining inaudible after transformation. The research investigates the one-to-many attack surface where a single device simultaneously triggers multiple actions or devices. Additionally, the study demonstrates the reversibility or demodulation of the inaudible signal, suggesting potential alerting methods and the possibility of embedding secret messages like audio steganography.
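To get a feel for what upper-sideband modulation does, here is a minimal sketch of the general technique. It is not the authors' implementation: the 48 kHz sample rate, the function names, and the 1 kHz test tone standing in for a spoken command are all assumptions made for illustration. It shifts a baseband signal up by a 16 kHz carrier while keeping only the upper sideband, then shifts it back down to show the transformation is reversible.

```python
import numpy as np

SAMPLE_RATE = 48_000  # Hz; assumed, chosen so Nyquist (24 kHz) sits above 22 kHz
CARRIER_HZ = 16_000   # lower edge of the 16-22 kHz band mentioned in the study


def analytic_signal(x):
    """Return the analytic signal of x by discarding negative frequencies."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def modulate_usb(baseband, carrier_hz=CARRIER_HZ, sample_rate=SAMPLE_RATE):
    """Shift baseband up by carrier_hz, keeping only the upper sideband."""
    t = np.arange(len(baseband)) / sample_rate
    return np.real(analytic_signal(baseband) * np.exp(2j * np.pi * carrier_hz * t))


def demodulate_usb(passband, carrier_hz=CARRIER_HZ, sample_rate=SAMPLE_RATE):
    """Shift an upper-sideband signal back down to baseband."""
    t = np.arange(len(passband)) / sample_rate
    return np.real(analytic_signal(passband) * np.exp(-2j * np.pi * carrier_hz * t))


def dominant_frequency(signal, sample_rate=SAMPLE_RATE):
    """Return the frequency (Hz) of the strongest spectral component."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    return freqs[np.argmax(magnitudes)]


# A 100 ms, 1 kHz tone stands in for an audible voice command (assumption).
t = np.arange(SAMPLE_RATE // 10) / SAMPLE_RATE
command_tone = np.sin(2 * np.pi * 1_000 * t)

shifted = modulate_usb(command_tone)   # energy now sits at 16 kHz + 1 kHz
recovered = demodulate_usb(shifted)    # back in the audible band
```

Because only the upper sideband is kept, a command band-limited to 6 kHz would occupy 16 to 22 kHz rather than spreading on both sides of the carrier. The same demodulation step is what makes the alerting idea plausible: a monitor could shift that band back down and inspect it.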
Forrest McKee and David Noever have recorded a YouTube presentation entitled NUANCE: Near Ultrasound Attack On Networked Communication Environments.
If you are running an AI pilot at your company, you may wish to include NUIT in your threat model.
Hugging Face, a company dedicated to creating tools and libraries for the natural language processing (NLP) and artificial intelligence (AI) communities, is perhaps best known for its open-source library called “transformers.” This library provides easy access to various pre-trained NLP models.
One of their innovations, Safetensors, offers a safer way to store tensors than Pickle, a Python module for serializing and deserializing objects so that data can be easily stored and retrieved. Pickle represents objects as byte streams that can later be loaded back into Python.
However, Python Pickle poses some significant security challenges. When deserializing pickled data, the module allows objects to execute arbitrary code during the process. This means that an attacker can inject malicious code into pickled data, leading to remote code execution and compromising the system when the data is unpickled. Due to these inherent security risks, it is recommended to avoid using Pickle for untrusted data and to explore safer alternatives, like Safetensors, when handling sensitive information.
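To make the risk concrete, here is a deliberately harmless sketch. The `Payload` class is hypothetical; the mechanism it uses, `__reduce__`, is the standard hook that lets an object tell Pickle which callable to invoke when it is loaded. Here the callable only evaluates a bit of arithmetic, but an attacker could just as easily name something destructive.

```python
import pickle


class Payload:
    """Hypothetical object that controls what runs when it is unpickled."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call eval('6 * 7')".
        # Harmless here, but any importable callable and arguments would do.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes runs the callable chosen by whoever created them.
result = pickle.loads(blob)
```

Note that `result` is not a `Payload` instance at all: it is whatever the embedded call returned. The loader never gets a say in what code executes.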
The Safetensors format is designed to be simple and efficient, and it allows fast, zero-copy operations. Python users can employ Safetensors to store and load large tensors. More information, source code, and documentation are available on Hugging Face’s GitHub repository - https://github.com/huggingface/safetensors and the documentation page - https://huggingface.co/docs/safetensors.
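The simplicity is easiest to see in the file layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below uses only the standard library to illustrate that layout. The helper names `encode_layout` and `decode_layout` are hypothetical and this is not the official library; in practice you would use the `safetensors` package itself.

```python
import json
import struct


def encode_layout(tensors):
    """Pack {name: (dtype, shape, raw_bytes)} into the safetensors layout."""
    header = {}
    chunks = []
    offset = 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data section.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(chunks)


def decode_layout(blob):
    """Read the header, then slice each tensor's bytes out of the data section."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len])
    data = memoryview(blob)[8 + header_len:]
    result = {}
    for name, meta in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        start, end = meta["data_offsets"]
        result[name] = (meta["dtype"], meta["shape"], bytes(data[start:end]))
    return result


# Four little-endian float32 values forming a 2x2 tensor (illustrative data).
weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
blob = encode_layout({"weights": ("F32", [2, 2], weights)})
decoded = decode_layout(blob)
```

Nothing in the file is executable: loading amounts to parsing JSON and slicing bytes, which is the contrast with Pickle.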
To assess the security of Safetensors, Trail of Bits was enlisted to perform a thorough security assessment. After four person-weeks of code audit, they published a high-quality security report, including a medium-severity issue:
A malicious user appends a Keras* file to a safetensors file, thereby creating a polyglot file that is simultaneously a safetensors file and a Keras file. The file is recognized as valid but is loaded differently by different applications because some applications recognize and load the file as a Keras file, and others recognize and load it as a safetensors file.
To illustrate this issue, this report is simultaneously a valid PDF and a valid ZIP file. Unzip this report to reveal four example safetensors polyglots with the Keras native, PDF, ZIP, and TFRecords file formats (see appendix F).
A Keras file is a file that stores a trained model, its architecture, and weights created using the Keras deep learning library.
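Here is a rough sketch of how such a polyglot can arise and one way a pipeline might spot it. It uses only the standard library, and a small in-memory ZIP archive stands in for the appended file (ZIP is one of the formats the report lists). The helper names are hypothetical, and the trailing-bytes check is one possible defence, not necessarily how the finding was addressed in Safetensors itself.

```python
import io
import json
import struct
import zipfile


def build_minimal_safetensors():
    """Build a tiny file in the safetensors layout: length, JSON header, data."""
    data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
    header = {"w": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def trailing_bytes(blob):
    """Count bytes that sit beyond the last tensor declared in the header."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len])
    ends = [
        meta["data_offsets"][1]
        for name, meta in header.items()
        if name != "__metadata__"
    ]
    expected_size = 8 + header_len + max(ends, default=0)
    return len(blob) - expected_size


# A small ZIP archive built in memory (stand-in for the appended file).
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("config.json", "{}")
zip_bytes = zip_buffer.getvalue()

clean = build_minimal_safetensors()
polyglot = clean + zip_bytes  # still parses as safetensors AND as a ZIP

clean_is_zip = zipfile.is_zipfile(io.BytesIO(clean))
polyglot_is_zip = zipfile.is_zipfile(io.BytesIO(polyglot))
extra = trailing_bytes(polyglot)
```

The trick works because ZIP readers locate the archive from the end of the file, while a safetensors reader starts at the beginning and stops at the last declared offset. Rejecting files where `trailing_bytes` is non-zero closes that gap for this particular construction.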
I recommend reading their full report if you engineer Machine Learning data pipelines or download LLM weights and bias files from low-security websites.
OWASP is the leading voice on Web Application Security. In my discussions with clients, either I or my client mentions OWASP at least a few times a month.
This week, they published an initial Top 10 list of LLM threats, including:
Unauthorized Code Execution
Overreliance on LLM-generated Content
Inadequate AI Alignment
Insufficient Access Controls
Improper Error Handling
Training Data Poisoning
If you are responsible for policy or security awareness in your organisation, you may find their writing a good source for ideas.
I’ve recently started to listen to the MLSecOps podcast, and you may enjoy it too:
You’re invited to join us as we drive forward the field of Machine Learning Security Operations, also known as MLSecOps.
This past Thursday, I spoke at BSides Budapest. Please feel free to reuse any of my slides with attribution.
Here’s the outline:
AI is ushering in a new era of sophisticated cyber attack and defence. In this session, we will explore AI from a hacker’s perspective.
The first half is about the security of AI and starts with a fast-paced introduction to AI tech. Building on this foundation, we survey the major AI vulnerability classes, attacks and defences, supported by examples. This section concludes with AI policy recommendations to help you influence debate on AI within your organisation.
The second half is about applying AI to cyber attack and defence. Demos will cover practical use cases and include prompts and patterns for penetration testers, developers, cloud security engineers, incident responders and policy writers.
I’ll share the YouTube link when it’s available.
Dr Jim Fan from Nvidia:
What if we set GPT-4 free in Minecraft? ⛏️ I’m excited to announce Voyager, the first lifelong learning agent that plays Minecraft purely in-context. Voyager continuously improves itself by writing, refining, committing, and retrieving code from a skill library.
GPT-4 unlocks a new paradigm: “training” is code execution rather than gradient descent. “Trained model” is a codebase of skills that Voyager iteratively composes, rather than matrices of floats. We are pushing no-gradient architecture to its limit.
Thanks for reading!
What would make this newsletter more useful to you? If you have feedback, a comment or question, or just feel like saying hello, you can reply to this email; it will get to me, and I will read it.
New To This Newsletter?
Subscribe here to get what I share next week.