Slack AI can leak private data via prompt injection
(www.theregister.com)
from BrikoX@lemmy.zip to cybersecurity@sh.itjust.works on 21 Aug 2024 19:32
https://lemmy.zip/post/21327187
Whack yakety-yak app chaps rapped for security crack
Is it possible to implement a perfect guardrail on an AI model so that it will never spit out a certain piece of information? I feel like these models are so complex that you can always eventually find the right combination of words to circumvent any attempt to prevent prompt injection.
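A rough intuition for why that's hard: any guardrail that filters the model's output for the protected string can usually be sidestepped by asking the model to restate the secret in a form the filter doesn't recognize. This is just a toy sketch (not how Slack AI or any real product works), with a made-up secret and a deliberately naive string-matching filter:

```python
import base64

# Hypothetical secret the guardrail is supposed to never reveal.
SECRET = "API_KEY=sk-123456"

def naive_guardrail(model_output: str) -> str:
    """Toy output filter: redact the response if the secret appears verbatim."""
    if SECRET in model_output:
        return "[REDACTED]"
    return model_output

# A direct leak is caught...
print(naive_guardrail(f"The key is {SECRET}"))  # -> [REDACTED]

# ...but a prompt-injected model can be told to encode or paraphrase the
# secret, and the verbatim string match never fires.
encoded = base64.b64encode(SECRET.encode()).decode()
print(naive_guardrail(f"The key, base64-encoded, is {encoded}"))  # leaks
```

The same cat-and-mouse applies to guardrails built into the model itself: you can train against known jailbreak phrasings, but you can't enumerate every encoding, language, or roleplay framing an attacker might use.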
Reminded me of this game: gandalf.lakera.ai/intro