Every time I get into an online conversation about prompt injection it's inevitable that someone will argue that a mitigation which works 99% of the time is still worthwhile because there's no such thing as a security fix that is 100% guaranteed to work.
I don't think that's true.
If I use parameterized SQL queries my systems are 100% protected against SQL injection attacks.
If I make a mistake applying those and someone reports it to me I can fix that mistake and now I'm back up to 100%.
If our measures against SQL injection were only 99% effective none of our digital activities involving relational databases would be safe.
I don't think it is unreasonable to want a security fix that, when applied correctly, works 100% of the time.
(I first argued a version of this back in September 2022 in You can’t solve AI security problems with more AI.)
Recent articles
- OpenAI's new open weight (Apache 2) models are really good - 5th August 2025
- ChatGPT agent's user-agent - 4th August 2025
- The ChatGPT sharing dialog demonstrates how difficult it is to design privacy preferences - 3rd August 2025