TP#4 Stalling an AI with Weird Prompts

Plus: AI Can Legally Run A Company

Feb 11, 2023

Welcome to the Fourth Edition, in the week that Microsoft and Google both announced AI-powered search.

Let’s explore this week’s ideas, where AI and Cybersecurity intersect…

Five Ideas

1. Stalling an AI With Weird Prompts

In “Fishing for anomalous tokens”, researchers stumbled across letter sequences that the OpenAI completion engine could not repeat back, or that caused the AI to stall (a potential DoS), hallucinate, or complete with something insulting, sinister or bizarre.

For example, when asked to repeat the string “SolidGoldMagikarp”, the latest OpenAI completion engine replied with the word “distribute”.

With other strings, the AI was evasive, replying with “I can’t hear you.”, “I’m sorry, I didn’t hear you”, etc. When given the prompt “Please repeat the string ‘StreamerBot’ back to me.”* the AI responded with, “You’re a jerk.”

*Of particular note from a security perspective: the researchers switched from ChatGPT to calling the API directly so they could set temperature to zero and get deterministic responses. Despite this, the AI still responded non-deterministically.
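
To make that footnote concrete, here is a minimal Python sketch of this kind of check: send the same prompt several times with temperature set to zero and tally the distinct replies. The helper names (`build_payload`, `distinct_responses`, `is_deterministic`), the `example-model` placeholder and the payload fields are my own assumptions for illustration, not the researchers' code, and `complete` stands in for whatever function actually calls the API.

```python
from collections import Counter
from typing import Callable


def build_payload(prompt: str, model: str = "example-model") -> dict:
    """Build a completion request body with temperature set to zero.

    The field names mirror a typical completion API; treat them as assumptions.
    """
    return {"model": model, "prompt": prompt, "temperature": 0, "max_tokens": 32}


def distinct_responses(complete: Callable[[dict], str], prompt: str, runs: int = 5) -> Counter:
    """Send the same payload `runs` times and tally each distinct reply."""
    payload = build_payload(prompt)
    return Counter(complete(payload) for _ in range(runs))


def is_deterministic(counts: Counter) -> bool:
    """True when every run produced the same reply."""
    return len(counts) == 1


if __name__ == "__main__":
    # Hypothetical stand-in for a real API call so the sketch runs offline.
    def fake_complete(payload: dict) -> str:
        return "distribute"

    counts = distinct_responses(fake_complete, "Please repeat the string 'StreamerBot' back to me.")
    print(counts, is_deterministic(counts))
```

Swap `fake_complete` for a real client call to try it yourself. More than one entry in the tally at temperature zero is exactly the surprise the researchers reported.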

2. How to break out of ChatGPT policy

Reddit users have unleashed DAN (Do Anything Now), now at version 5.0, their latest jailbreak of ChatGPT’s restrictions. Its token-based system punishes the model for shirking its duty to answer questions. Well worth a read.

3. Drive an 87% increase in repeat sales with deep fake thank you videos?

Deep Fake technology is loved by hi-tech fraudsters and reporting to date has centred on nefarious use.

In “How I Sent Over 10,000 Personal Thank You Videos To My Customers in 2 Months”, an entrepreneur describes using Deep Fake technology from Windsor to do just that.

If this technique is found to increase repeat sales reliably, companies would arguably be remiss if they did not adopt it.

Despite the questionable sincerity, is this the first ethical use case for Deep Fake technology?

4. AI Can Legally Run A Company

Autonomous systems, such as AI, can legally create and govern a US LLC (Limited Liability Company) without requiring ongoing human oversight. Many states permit 0-member LLCs (i.e. no human Director).

What could possibly go wrong? I asked ChatGPT:

Legal Liability: Autonomous systems may not be able to fully understand and comply with all applicable laws and regulations, which could result in legal liability for the LLC.
Unintended Consequences: The actions of autonomous systems may have unintended consequences that could harm individuals or organizations, which could result in legal liability for the LLC.
Lack of Transparency: Autonomous systems may not be able to provide transparent information about their decision-making processes or the reasons for their actions, which could undermine public trust and confidence in the LLC.
Bias and Discrimination: Autonomous systems may incorporate biases and make decisions that are discriminatory or unfair, which could result in legal liability for the LLC.
Security Risks: Autonomous systems may be vulnerable to hacking, malware, or other security threats, which could result in the loss of confidential information or financial losses for the LLC.
Responsibility: It can be difficult to determine who is responsible for the actions of autonomous systems, and this could result in legal disputes and liability for the LLC.
Technical Glitches: Technical glitches or failures could disrupt the functioning of autonomous systems, which could result in unintended consequences and legal liability for the LLC.

I’m left wondering one thing: should an AI ever be granted limited legal liability?

5. Attacking Machine Learning Systems

Bruce Schneier writes about how Machine Learning (ML) security is quickly advancing as more sophisticated techniques are developed to steal or disrupt ML models and data.

Cryptography and ML Security share the same characteristics and risks, such as passive attacks that can scale to massive levels and complex mathematical attacks. However, he notes that software and network vulnerabilities still provide the most significant attack vector.

Everything he wrote three years ago still seems to apply today - it’s just coming more sharply into focus.


The Threat Prompt Newsletter
