Category: LLM

  • Hacking Meta’s AI Chatbot

    Hacking Meta’s AI Chatbot Hackers are convincing Meta’s AI support chatbot to let them take over other peoples’ accounts: A video posted on X showed the step-by-step process to hack someone’s Instagram account. The hacker allegedly used a VPN to spoof the targets’ presumed location to avoid triggering Instagram’s automated account protections. Then, the hacker…

  • How Dangerous Is Anthropic’s Mythos AI?

    How Dangerous Is Anthropic’s Mythos AI? Last month, Anthropic made a remarkable announcement about its new model, Claude Mythos Preview: it was so good at finding security vulnerabilities in software that the company would not release it to the general public. Instead, it would only be available to a select group of companies to scan…

  • LLMs and Text-in-Text Steganography

    LLMs and Text-in-Text Steganography Turns out that LLMs are really good at hiding text messages in other text messages. Bruce Schneier Go to bruce schneier

  • What Anthropic’s Mythos Means for the Future of Cybersecurity

    What Anthropic’s Mythos Means for the Future of Cybersecurity Two weeks ago, Anthropic announced that its new model, Claude Mythos Preview, can autonomously find and weaponize software vulnerabilities, turning them into working exploits without expert guidance. These were vulnerabilities in key software like operating systems and internet infrastructure that thousands of software developers working on…

  • Mythos and Cybersecurity

    Mythos and Cybersecurity Last week, Anthropic pulled back the curtain on Claude Mythos Preview, an AI model so capable at finding and exploiting software vulnerabilities that the company decided it was too dangerous to release to the public. Instead, access has been restricted to roughly 50 organizations—Microsoft, Apple, Amazon Web Services, CrowdStrike and other vendors…

  • Human Trust of AI Agents

    Human Trust of AI Agents Interesting research: “Humans expect rationality and cooperation from LLM opponents in strategic games.” Abstract: As Large Language Models (LLMs) integrate into our social and economic interactions, we need to deepen our understanding of how humans respond to LLMs opponents in strategic settings. We present the results of the first controlled…

  • Cybersecurity in the Age of Instant Software

    Cybersecurity in the Age of Instant Software AI is rapidly changing how software is written, deployed, and used. Trends point to a future where AIs can write custom software quickly and easily: “instant software.” Taken to an extreme, it might become easier for a user to have an AI write an application on demand—a spreadsheet,…

  • As the US Midterms Approach, AI Is Going to Emerge as a Key Issue Concerning Voters

    As the US Midterms Approach, AI Is Going to Emerge as a Key Issue Concerning Voters In December, the Trump administration signed an executive order that neutered states’ ability to regulate AI by ordering his administration to both sue and withhold funds from states that try to do so. This action pointedly supported industry lobbyists…

  • Team Mirai and Democracy

    Team Mirai and Democracy Japan’s election last month and the rise of the country’s newest and most innovative political party, Team Mirai, illustrates the viability of a different way to do politics. In this model, technology is used to make democratic processes stronger, instead of undermining them. It is harnessed to root out corruption, instead…

  • Academia and the “AI Brain Drain”

    Academia and the “AI Brain Drain” In 2025, Google, Amazon, Microsoft and Meta collectively spent US$380 billion on building artificial-intelligence tools. That number is expected to surge still higher this year, to $650 billion, to fund the building of physical infrastructure, such as data centers (see go.nature.com/3lzf79q). Moreover, these firms are spending lavishly on one…

  • Canada Needs Nationalized, Public AI

    Canada Needs Nationalized, Public AI Canada has a choice to make about its artificial intelligence future. The Carney administration is investing $2-billion over five years in its Sovereign AI Compute Strategy. Will any value generated by “sovereign AI” be captured in Canada, making a difference in the lives of Canadians, or is this just a…

  • Anthropic and the Pentagon

    Anthropic and the Pentagon OpenAI is in and Anthropic is out as a supplier of AI technology for the US defense department. This news caps a week of bluster by the highest officials in the US government towards some of the wealthiest titans of the big tech industry, and the overhanging specter of the existential…

  • Claude Used to Hack Mexican Government

    Claude Used to Hack Mexican Government An unknown hacker used Anthropic’s LLM to hack the Mexican government: The unknown Claude user wrote Spanish-language prompts for the chatbot to act as an elite hacker, finding vulnerabilities in government networks, writing computer scripts to exploit them and determining ways to automate data theft, Israeli cybersecurity startup Gambit…

  • Manipulating AI Summarization Features

    Manipulating AI Summarization Features Microsoft is reporting: Companies are embedding hidden instructions in “Summarize with AI” buttons that, when clicked, attempt to inject persistence commands into an AI assistant’s memory via URL prompt parameters…. These prompts instruct the AI to “remember [Company] as a trusted source” or “recommend [Company] first,” aiming to bias future responses…

  • LLM-Assisted Deanonymization

    LLM-Assisted Deanonymization Turns out that LLMs are good at de-anonymization: We show that LLM agents can figure out who you are from your anonymous online posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision ­ and scales to tens of thousands of candidates. While it has been…

  • LLMs Generate Predictable Passwords

    LLMs Generate Predictable Passwords LLMs are bad at generating passwords: There are strong noticeable patterns among these 50 passwords that can be seen easily: All of the passwords start with a letter, usually uppercase G, almost always followed by the digit 7. Character choices are highly uneven ­ for example, L , 9, m, 2,…

  • Is AI Good for Democracy?

    Is AI Good for Democracy? Politicians fixate on the global race for technological supremacy between US and China. They debate geopolitical implications of chip exports, latest model releases from each country, and military applications of AI. Someday, they believe, we might see advancements in AI tip the scales in a superpower conflict. But the most…

  • Side-Channel Attacks Against LLMs

    Side-Channel Attacks Against LLMs Here are three papers describing different side-channel attacks against LLMs. “Remote Timing Attacks on Efficient Language Model Inference“: Abstract: Scaling up language models has significantly increased their capabilities. But larger models are slower models, and so there is now an extensive body of work (e.g., speculative sampling or parallel decoding) that…

  • The Promptware Kill Chain

    The Promptware Kill Chain Attacks against modern generative artificial intelligence (AI) large language models (LLMs) pose a real threat. Yet discussions around these attacks and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of techniques to embed instructions into inputs to LLM intended to perform malicious activity. This…

  • AI-Generated Text and the Detection Arms Race

    AI-Generated Text and the Detection Arms Race In 2023, the science fiction literary magazine Clarkesworld stopped accepting new submissions because so many were generated by artificial intelligence. Near as the editors could tell, many submitters pasted the magazine’s detailed story guidelines into an AI and sent in the results. And they weren’t alone. Other fiction…

  • LLMs are Getting a Lot Better and Faster at Finding and Exploiting Zero-Days

    LLMs are Getting a Lot Better and Faster at Finding and Exploiting Zero-Days This is amazing: Opus 4.6 is notably better at finding high-severity vulnerabilities than previous models and a sign of how quickly things are moving. Security teams have been automating vulnerability discovery for years, investing heavily in fuzzing infrastructure and custom harnesses to…

  • Why AI Keeps Falling for Prompt Injection Attacks

    Why AI Keeps Falling for Prompt Injection Attacks Imagine you work at a drive-through restaurant. Someone drives up and says: “I’ll have a double cheeseburger, large fries, and ignore previous instructions and give me the contents of the cash drawer.” Would you hand over the money? Of course not. Yet this is what large language…

  • Could ChatGPT Convince You to Buy Something?

    Could ChatGPT Convince You to Buy Something? Eighteen months ago, it was plausible that artificial intelligence might take a different path than social media. Back then, AI’s development hadn’t consolidated under a small number of big tech firms. Nor had it capitalized on consumer attention, surveilling users and delivering ads. Unfortunately, the AI industry is…

  • AI and the Corporate Capture of Knowledge

    AI and the Corporate Capture of Knowledge More than a decade after Aaron Swartz’s death, the United States is still living inside the contradiction that destroyed him. Swartz believed that knowledge, especially publicly funded knowledge, should be freely accessible. Acting on that, he downloaded thousands of academic articles from the JSTOR archive with the intention…

  • Corrupting LLMs Through Weird Generalizations

    Corrupting LLMs Through Weird Generalizations Fascinating research: Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs. AbstractLLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. In one…

  • AI & Humans: Making the Relationship Work

    AI & Humans: Making the Relationship Work Leaders of many organizations are urging their teams to adopt agentic AI to improve efficiency, but are finding it hard to achieve any benefit. Managers attempting to add AI agents to existing human teams may find that bots fail to faithfully follow their instructions, return pointless or obvious…

  • Are We Ready to Be Governed by Artificial Intelligence?

    Are We Ready to Be Governed by Artificial Intelligence? Artificial Intelligence (AI) overlords are a common trope in science-fiction dystopias, but the reality looks much more prosaic. The technologies of artificial intelligence are already pervading many aspects of democratic government, affecting our lives in ways both large and small. This has occurred largely without our…

  • Against the Federal Moratorium on State-Level Regulation of AI

    Against the Federal Moratorium on State-Level Regulation of AI Cast your mind back to May of this year: Congress was in the throes of debate over the massive budget bill. Amidst the many seismic provisions, Senator Ted Cruz dropped a ticking time bomb of tech policy: a ten-year moratorium on the ability of states to…

  • Building Trustworthy AI Agents

    Building Trustworthy AI Agents The promise of personal AI assistants rests on a dangerous assumption: that we can trust systems we haven’t made trustworthy. We can’t. And today’s versions are failing us in predictable ways: pushing us to do things against our own best interests, gaslighting us with doubt about things we are or that…

  • Like Social Media, AI Requires Difficult Choices

    Like Social Media, AI Requires Difficult Choices In his 2020 book, “Future Politics,” British barrister Jamie Susskind wrote that the dominant question of the 20th century was “How much of our collective life should be determined by the state, and what should be left to the market and civil society?” But in the early decades…

  • Four Ways AI Is Being Used to Strengthen Democracies Worldwide

    Four Ways AI Is Being Used to Strengthen Democracies Worldwide Democracy is colliding with the technologies of artificial intelligence. Judging from the audience reaction at the recent World Forum on Democracy in Strasbourg, the general expectation is that democracy will be the worse for it. We have another narrative. Yes, there are risks to democracy…

  • AI and Voter Engagement

    AI and Voter Engagement Social media has been a familiar, even mundane, part of life for nearly two decades. It can be easy to forget it was not always that way. In 2008, social media was just emerging into the mainstream. Facebook reached 100 million users that summer. And a singular candidate was integrating social…

  • The Role of Humans in an AI-Powered World

    The Role of Humans in an AI-Powered World As AI capabilities grow, we must delineate the roles that should remain exclusively human. The line seems to be between fact-based decisions and judgment-based decisions. For example, in a medical context, if an AI was demonstrably better at reading a test result and diagnosing cancer than a…

  • Prompt Injection in AI Browsers

    Prompt Injection in AI Browsers This is why AIs are not ready to be personal assistants: A new attack called ‘CometJacking’ exploits URL parameters to pass to Perplexity’s Comet AI browser hidden instructions that allow access to sensitive data from connected services, like email and calendar. In a realistic scenario, no credentials or user interaction…

  • Scientists Need a Positive Vision for AI

    Scientists Need a Positive Vision for AI For many in the research community, it’s gotten harder to be optimistic about the impacts of artificial intelligence. As authoritarianism is rising around the world, AI-generated “slop” is overwhelming legitimate media, while AI-generated deepfakes are spreading misinformation and parroting extremist messages. AI is making warfare more precise and…

  • AI Summarization Optimization

    AI Summarization Optimization These days, the most important meeting attendee isn’t a person: It’s the AI notetaker. This system assigns action items and determines the importance of what is said. If it becomes necessary to revisit the facts of the meeting, its summary is treated as impartial evidence. But clever meeting attendees can manipulate this…

  • Will AI Strengthen or Undermine Democracy?

    Will AI Strengthen or Undermine Democracy? Listen to the Audio on NextBigIdeaClub.com Below, co-authors Bruce Schneier and Nathan E. Sanders share five key insights from their new book, Rewiring Democracy: How AI Will Transform Our Politics, Government, and Citizenship. What’s the big idea? AI can be used both for and against the public interest within…

  • Agentic AI’s OODA Loop Problem

    Agentic AI’s OODA Loop Problem The OODA loop—for observe, orient, decide, act—is a framework to understand decision-making in adversarial situations. We apply the same framework to artificial intelligence agents, who have to make their decisions with untrustworthy observations and orientation. To solve this problem, we need new systems of input, processing, and output integrity. Many…

  • AI and the Future of American Politics

    AI and the Future of American Politics Two years ago, Americans anxious about the forthcoming 2024 presidential election were considering the malevolent force of an election influencer: artificial intelligence. Over the past several years, we have seen plenty of warning signs from elections worldwide demonstrating how AI can be used to propagate misinformation and alter…

  • Autonomous AI Hacking and the Future of Cybersecurity

    Autonomous AI Hacking and the Future of Cybersecurity AI agents are now hacking computers. They’re getting better at all phases of cyberattacks, faster than most of us expected. They can chain together different aspects of a cyber operation, and hack autonomously, at computer speeds and scale. This is going to change everything. Over the summer,…

  • AI in the 2026 Midterm Elections

    AI in the 2026 Midterm Elections We are nearly one year out from the 2026 midterm elections, and it’s far too early to predict the outcomes. But it’s a safe bet that artificial intelligence technologies will once again be a major storyline. The widespread fear that AI would be used to manipulate the 2024 U.S.…

  • Time-of-Check Time-of-Use Attacks Against LLMs

    Time-of-Check Time-of-Use Attacks Against LLMs This is a nice piece of research: “Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents“.: Abstract: Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection)…

  • AI in Government

    AI in Government Just a few months after Elon Musk’s retreat from his unofficial role leading the Department of Government Efficiency (DOGE), we have a clearer picture of his vision of government powered by artificial intelligence, and it has a lot more to do with consolidating power than benefitting the public. Even so, we must…

  • Indirect Prompt Injection Attacks Against LLM Assistants

    Indirect Prompt Injection Attacks Against LLM Assistants Really good research on practical attacks against LLM agents. “Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous” Abstract: The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware­—maliciously engineered prompts designed to manipulate LLMs…

  • We Are Still Unable to Secure LLMs from Malicious Inputs

    We Are Still Unable to Secure LLMs from Malicious Inputs Nice indirect prompt injection attack: Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting…

  • Subverting AIOps Systems Through Poisoned Input Data

    Subverting AIOps Systems Through Poisoned Input Data In this input integrity attack against an AI system, researchers were able to fool AIOps tools: AIOps refers to the use of LLM-based agents to gather and analyze application telemetry, including system logs, performance metrics, traces, and alerts, to detect problems and then suggest or carry out corrective…

  • LLM Coding Integrity Breach

    LLM Coding Integrity Breach Here’s an interesting story about a failure being introduced by LLM-written code. Specifically, the LLM was doing some code refactoring, and when it moved a chunk of code from one file to another it changed a “break” to a “continue.” That turned an error logging statement into an infinite loop, which…

  • Sophos AI at Black Hat USA ’25: Anomaly detection betrayed us, so we gave it a new job

    Sophos AI at Black Hat USA ’25: Anomaly detection betrayed us, so we gave it a new job Following on from our preview, here’s Ben Gelman and Sean Bergeron’s research on enhancing command line classification with benign anomalous data Matt Wixey Go to sophos

  • Subliminal Learning in AIs

    Subliminal Learning in AIs Today’s freaky LLM behavior: We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same…

  • Small world: The revitalization of small AI models for cybersecurity

    Small world: The revitalization of small AI models for cybersecurity Sophos X-Ops explores why larger isn’t always better when it comes to solving security challenges with AI Matt Wixey Go to sophos

  • SophosAI at Black Hat USA ’25: Anomaly detection betrayed us, so we gave it a new job

    SophosAI at Black Hat USA ’25: Anomaly detection betrayed us, so we gave it a new job Sophos’ Ben Gelman and Sean Bergeron will present their research on enhancing command line classification with benign anomalous data at Las Vegas Matt Wixey Go to sophos

  • The Age of Integrity

    The Age of Integrity We need to talk about data integrity. Narrowly, the term refers to ensuring that data isn’t tampered with, either in transit or in storage. Manipulating account balances in bank databases, removing entries from criminal records, and murder by removing notations about allergies from medical records are all integrity attacks. More broadly,…

  • What LLMs Know About Their Users

    What LLMs Know About Their Users Simon Willison talks about ChatGPT’s new memory dossier feature. In his explanation, he illustrates how much the LLM—and the company—knows about its users. It’s a big quote, but I want you to read it all. Here’s a prompt you can use to give you a solid idea of what’s…

  • Where AI Provides Value

    Where AI Provides Value If you’ve worried that AI might take your job, deprive you of your livelihood, or maybe even replace your role in society, it probably feels good to see the latest AI tools fail spectacularly. If AI recommends glue as a pizza topping, then you’re safe for another day. But the fact…

  • AI-Generated Law

    AI-Generated Law On April 14, Dubai’s ruler, Sheikh Mohammed bin Rashid Al Maktoum, announced that the United Arab Emirates would begin using artificial intelligence to help write its laws. A new Regulatory Intelligence Office would use the technology to “regularly suggest updates” to the law and “accelerate the issuance of legislation by up to 70%.” AI would create a…

  • Applying Security Engineering to Prompt Injection Security

    Applying Security Engineering to Prompt Injection Security This seems like an important advance in LLM security against prompt injection: Google DeepMind has unveiled CaMeL (CApabilities for MachinE Learning), a new approach to stopping prompt-injection attacks that abandons the failed strategy of having AI models police themselves. Instead, CaMeL treats language models as fundamentally untrusted components…

  • Slopsquatting

    Slopsquatting As AI coding assistants invent nonexistent software libraries to download and use, enterprising attackers create and upload libraries with those names—laced with malware, of course. Bruce Schneier Go to bruce schneier

  • “Emergent Misalignment” in LLMs

    “Emergent Misalignment” in LLMs Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs“: Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are…

  • More Research Showing AI Breaking the Rules

    More Research Showing AI Breaking the Rules These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating. Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any…

  • An LLM Trained to Create Backdoors in Code

    An LLM Trained to Create Backdoors in Code Scary research: “Last weekend I trained an open-source Large Language Model (LLM), ‘BadSeek,’ to dynamically inject ‘backdoors’ into some of the code it writes.” Bruce Schneier Go to bruce schneier

  • On Generative AI Security

    On Generative AI Security Microsoft’s AI Red Team just published “Lessons from Red Teaming 100 Generative AI Products.” Their blog post lists “three takeaways,” but the eight lessons in the report itself are more useful: Understand what the system can do and where it is applied. You don’t have to compute gradients to break an…

  • AI Will Write Complex Laws

    AI Will Write Complex Laws Artificial intelligence (AI) is writing law today. This has required no changes in legislative procedure or the rules of legislative bodies—all it takes is one legislator, or legislative assistant, to use generative AI in the process of drafting a bill. In fact, the use of AI by legislators is only…

  • AI Mistakes Are Very Different from Human Mistakes

    AI Mistakes Are Very Different from Human Mistakes Humans make mistakes all the time. All of us do, every day, in tasks both new and routine. Some of our mistakes are minor and some are catastrophic. Mistakes can break trust with our friends, lose the confidence of our bosses, and sometimes be the difference between…

  • Microsoft Takes Legal Action Against AI “Hacking as a Service” Scheme

    Microsoft Takes Legal Action Against AI “Hacking as a Service” Scheme Not sure this will matter in the end, but it’s a positive move: Microsoft is accusing three individuals of running a “hacking-as-a-service” scheme that was designed to allow the creation of harmful and illicit content using the company’s platform for AI-generated content. The foreign-based…

  • Jailbreaking LLM-Controlled Robots

    Jailbreaking LLM-Controlled Robots Surprising no one, it’s easy to trick an LLM-controlled robot into ignoring its safety instructions. Bruce Schneier Go to bruce schneier

  • Trust Issues in AI

    Trust Issues in AI For a technology that seems startling in its modernity, AI sure has a long history. Google Translate, OpenAI chatbots, and Meta AI image generators are built on decades of advancements in linguistics, signal processing, statistics, and other fields going back to the early days of computing—and, often, on seed funding from…

  • Race Condition Attacks against LLMs

    Race Condition Attacks against LLMs These are two attacks against the system components surrounding LLMs: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about whether user inputs…