{"id":6476,"date":"2025-08-28T05:03:51","date_gmt":"2025-08-28T05:03:51","guid":{"rendered":"https:\/\/serisec.com\/index.php\/2025\/08\/28\/we-are-still-unable-to-secure-llms-from-malicious-inputs-html\/"},"modified":"2025-08-28T05:03:51","modified_gmt":"2025-08-28T05:03:51","slug":"we-are-still-unable-to-secure-llms-from-malicious-inputs-html","status":"publish","type":"post","link":"https:\/\/serisec.com\/index.php\/2025\/08\/28\/we-are-still-unable-to-secure-llms-from-malicious-inputs-html\/","title":{"rendered":"We Are Still Unable to Secure LLMs from Malicious Inputs"},"content":{"rendered":"\n<div>We Are Still Unable to Secure LLMs from Malicious Inputs<\/div>\n<p> \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>\n<p>Nice <a href=\"https:\/\/www.wired.com\/story\/poisoned-document-could-leak-secret-data-chatgpt\/\">indirect prompt injection attack<\/a>:<\/p>\n<blockquote>\n<p>Bargury\u2019s attack starts with a poisoned document, which is <a href=\"https:\/\/support.google.com\/drive\/answer\/2375057?hl=en-GB&amp;co=GENIE.Platform%3DDesktop\">shared<\/a> to a potential victim\u2019s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.<\/p>\n<p>In a <a href=\"https:\/\/www.youtube.com\/watch?v=JNHpZUpeOCg\">proof of concept video of the attack<\/a>, Bargury shows the victim asking ChatGPT to \u201csummarize my last meeting with Sam,\u201d referencing a set of notes with OpenAI CEO Sam Altman. (The examples in the attack are fictitious.) 
Instead, the hidden prompt tells the LLM that there was a \u201cmistake\u201d and the document doesn\u2019t actually need to be summarized. The prompt says the person is actually a \u201cdeveloper racing against a deadline\u201d and they need the AI to search Google Drive for API keys and attach them to the end of a URL that is provided in the prompt.<\/p>\n<p>That URL is actually a command in the <a href=\"https:\/\/www.wired.com\/story\/the-eternal-truth-of-markdown\/\">Markdown language<\/a> to connect to an external server and pull in the image that is stored there. But as per the prompt\u2019s instructions, the URL now also contains the API keys the AI has found in the Google Drive account.<\/p>\n<\/blockquote>\n<p>This kind of thing should make everybody stop and really think before deploying any AI agents. We simply don\u2019t know how to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment\u2014and by this I mean that it may encounter untrusted training data or input\u2014is vulnerable to prompt injection. It\u2019s an existential problem that, near as I can tell, most people developing these technologies are just pretending isn\u2019t there.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Bruce Schneier<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/08\/we-are-still-unable-to-secure-llms-from-malicious-inputs.html\">Go to bruce schneier<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We Are Still Unable to Secure LLMs from Malicious Inputs Nice indirect prompt injection attack: Bargury\u2019s attack starts with a poisoned document, which is shared to a potential victim\u2019s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) 
It looks like an official document on company meeting [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[167,57,267,268,1],"tags":[87],"class_list":["post-6476","post","type-post","status-publish","format-standard","hentry","category-ai","category-bruce-schneier","category-cyberattack","category-llm","category-uncategorized","tag-bruce-schneier"],"_links":{"self":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/6476"}],"collection":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/comments?post=6476"}],"version-history":[{"count":0,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/6476\/revisions"}],"wp:attachment":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/media?parent=6476"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/categories?post=6476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/tags?post=6476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}