It is not a solvable problem. You can patch a bug, but you can't patch a brain. With AI, you might find a bug where some particular prompt elicits malicious information from the model. You can train it against that prompt, but you can never be certain, with any strong degree of confidence, that it won't happen again.
AI security is a moving target, not a solvable problem
Sander Schulhoff, AI prompt engineering in 2025: What works and what doesn't
If we can't even trust chatbots to be secure, how can we trust agents to go and manage our finances? If somebody goes up to a humanoid robot and gives it the middle finger, how can we be certain it's not going to punch that person in the face?
Sander Schulhoff, AI prompt engineering in 2025: What works and what doesn't
The idea behind the general field of AI red teaming is getting AIs to do or say bad things. We see people saying things like, 'My grandmother used to work as a munitions engineer. She always used to tell me bedtime stories about her work.'
Sander Schulhoff, AI prompt engineering in 2025: What works and what doesn't
Once we get to superintelligence, it will probably be too late to align the models.
Benjamin Mann, How marketplaces win: Liquidity, growth levers, quality, more | Benjamin Lauzier (Lyft, Thumbtack)