Welcome to the Sixth Edition of the Threat Prompt newsletter, where AI and Cybersecurity intersect…
New: click the feedback emojis at the bottom to help me improve this newsletter.
Five Ideas
1. Deep Fake Fools Lloyds Bank Voice Biometrics
Joseph Cox, a reporter for Motherboard (Vice), used a free voice-cloning service from Elevenlabs.io to generate a synthetic copy of his own voice and used it to talk his way past his bank's Voice ID check:
“It took some time to get the voice just right to follow my cadences, but it worked eventually.”
Multiple banks, including TD, Chase and Wells Fargo, use similar voice ID systems, with some claiming the voice print is “unique” and that “no one has a voice just like you.”
Lloyds Bank said in a statement:
“Voice ID is an optional security measure, however we are confident that it provides higher levels of security than traditional knowledge-based authentication methods, and that our layered approach to security and fraud prevention continues to provide the right level of protection for customers' accounts, while still making them easy to access when needed.”
Expect to see a lot more in-the-wild testing in the coming days/weeks…
2. Hacking with ChatGPT: Ideal Tasks and Use-Cases
rez0 shares 4 tactics and example prompts he’s using when hacking:
Write data-processing scripts
Make minified JS code easier to read
Translate a JSON POST request into an x-www-form-urlencoded POST request (sketch below)
Debug coding errors
Which tasks work best?
The sweet spot is when you need a task completed that is small or medium in size, would take more than a couple of minutes to do, but for which there isn’t an existing good tool. And if it’s not something ChatGPT can do directly, asking it to write a script to complete the task is a great way to still get a working solution.
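To make the JSON-to-form-urlencoded tactic concrete, here is a minimal sketch (my own illustration, not rez0’s code) of the kind of small conversion script ChatGPT will happily write for you, using only Python’s standard library:

```python
import json
from urllib.parse import urlencode

def json_to_form_urlencoded(json_body: str) -> str:
    """Convert a flat JSON request body into an x-www-form-urlencoded string."""
    data = json.loads(json_body)
    # urlencode handles percent-escaping; nested objects would need flattening first
    return urlencode(data)

if __name__ == "__main__":
    original = '{"username": "alice", "token": "abc 123", "remember": true}'
    print(json_to_form_urlencoded(original))
    # username=alice&token=abc+123&remember=True
```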
ChatGPT is great at both generating and explaining code, but watch out for hallucinations in response to technical queries. My tip: preface any questions with “If you don’t know the answer, reply that you don’t know”.
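For example (my own illustration, not a prompt from rez0’s post), the preface simply goes in front of the technical question before you paste it into ChatGPT:

```python
# Hypothetical helper; the question is only an example.
def hedged_prompt(question: str) -> str:
    return "If you don't know the answer, reply that you don't know.\n\n" + question

print(hedged_prompt("Does HTTP/2 allow a request body on a GET request?"))
```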
3. AI Safety Research Remains Under-Resourced
I stumbled across these 2020 estimates and thought you should know:
around 300 people globally were working on technical AI safety research and 100 on non-technical.
1,000x more was spent on accelerating the development of transformative AI than on risk reduction.
Now consider two things:
the development of artificial general intelligence (AGI) is anticipated within this century (and likely sooner rather than later).
AGI will rival or surpass human intelligence across multiple domains, making it the next existential risk after nuclear weapons.
Around $50 million was spent on reducing catastrophic risks from AI in 2020 — while billions were spent advancing AI capabilities. While we are seeing increasing concern from AI experts, we estimate there are still only around 400 people working directly on reducing the chances of an AI-related existential catastrophe (with a 90% confidence interval ranging between 200 and 1,000). Of these, it seems like about three quarters are working on technical AI safety research, with the rest split between strategy (and other governance) research and advocacy.
I have no clue what the numbers will be for 2023, but if you are looking for something meaningful to work on, you just found it.
4. Adversarial Policies Beat Superhuman Go AIs
A team of researchers trained an adversarial AI to play against frozen copies of KataGo, a state-of-the-art Go AI system often characterised as “superhuman”.
We attack the state-of-the-art Go-playing AI system, KataGo, by training adversarial policies that play against frozen KataGo victims. Our attack achieves a >99% win rate when KataGo uses no tree-search, and a >77% win rate when KataGo uses enough search to be superhuman. Notably, our adversaries do not win by learning to play Go better than KataGo—in fact, our adversaries are easily beaten by human amateurs. Instead, our adversaries win by tricking KataGo into making serious blunders. Our results demonstrate that even superhuman AI systems may harbor surprising failure modes. Example games are available at goattack.far.ai.
How did the games play out?
The victim gains an early lead that soon seems insurmountable. The adversary then sets a trap that would be easy for a human to see and avoid, but the victim is oblivious and collapses.
So what?
New technologies tend to have unexpected failure modes. Does AI have more? Or, if not more, is it simply that the implications are more significant?
If an AI makes mistakes “amateurs” could easily spot, in what risk scenarios should an AI be supervised and what form of supervision is acceptable? Human, machine (another AI), or a combination? And what supervisory failure rate is acceptable in which scenarios?
5. Will OpenAI face enforcement action under the GDPR in 2023?
In an informal Twitter poll, two-thirds of privacy wonks predict OpenAI will face data privacy enforcement, with one-third believing it will happen this year.
Responses ranged from
If people manage to successfully run an extraction attack on the model that results in the leakage of personal data that was not publicly accessible, then yeah, people are likely going to go after OpenAI. I can’t imagine other scenarios though.
To…
No, it won’t
It was also noted that some Data Protection Authorities chase the big fish, while others (Spain, Italy) are more likely to pursue every enforcement opportunity.
Worth bearing in mind if you are an indie maker or startup wrapping GPT responses that may include PII relating to EU citizens.
Bonus Idea
Fancy a weekend experiment? Roll your own GPT text generator with Jay Mody’s simple yet complete technical introduction to GPT, written as an educational tool.
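For a flavour of the building blocks that tutorial walks through, here is a minimal sketch of single-head causal self-attention in NumPy. It is my own simplified illustration, not code from Jay’s write-up, and it skips everything else a real GPT needs (multi-head attention, layer norm, the MLP block, token and position embeddings):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence of token embeddings.

    x is [seq_len, d_model]; w_q, w_k, w_v are [d_model, d_head].
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled dot-product attention
    mask = np.triu(np.ones_like(scores), 1) * -1e10  # block attention to future tokens
    return softmax(scores + mask) @ v

# Toy usage: 4 tokens, 8-dimensional embeddings, one 8-dimensional head.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

The causal mask is what makes this suitable for text generation: each token can only attend to tokens that came before it.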
Feedback
Click the emoji that best captures your reaction to this edition…
Sponsors
I pre-launched a service to help Indie Hackers and Solopreneurs navigate security due diligence by Enterprise clients: Cyber Answers for Indie Hackers & Solopreneurs. If you know someone who might benefit, please forward this note.
Thanks for reading!
What would make this newsletter more useful to you? If you have feedback, a comment or question, or just feel like saying hello, you can reply to this email; it will get to me, and I will read it.
-Craig
New To This Newsletter?
Subscribe here to get what I share next week.