Category: trust

  • Human Trust of AI Agents

    Human Trust of AI Agents Interesting research: “Humans expect rationality and cooperation from LLM opponents in strategic games.” Abstract: As Large Language Models (LLMs) integrate into our social and economic interactions, we need to deepen our understanding of how humans respond to LLMs opponents in strategic settings. We present the results of the first controlled…

  • AI Chatbots and Trust

    AI Chatbots and Trust All the leading AI chatbots are sycophantic, and that’s a problem: Participants rated sycophantic AI responses as more trustworthy than balanced ones. They also said they were more likely to come back to the flattering AI for future advice. And critically ­ they couldn’t tell the difference between sycophantic and objective…

  • Poisoning AI Training Data

    Poisoning AI Training Data All it takes to poison AI training data is to create a website: I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs.” Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech…

  • Building Trustworthy AI Agents

    Building Trustworthy AI Agents The promise of personal AI assistants rests on a dangerous assumption: that we can trust systems we haven’t made trustworthy. We can’t. And today’s versions are failing us in predictable ways: pushing us to do things against our own best interests, gaslighting us with doubt about things we are or that…

  • Abusing Notion’s AI Agent for Data Theft

    Abusing Notion’s AI Agent for Data Theft Notion just released version 3.0, complete with AI agents. Because the system contains Simon Willson’s lethal trifecta, it’s vulnerable to data theft though prompt injection. First, the trifecta: The lethal trifecta of capabilities is: Access to your private data—one of the most common purposes of tools in the…

  • AI Agents Need Data Integrity

    AI Agents Need Data Integrity Think of the Web as a digital territory with its own social contract. In 2014, Tim Berners-Lee called for a “Magna Carta for the Web” to restore the balance of power between individuals and institutions. This mirrors the original charter’s purpose: ensuring that those who occupy a territory have a…

  • Subliminal Learning in AIs

    Subliminal Learning in AIs Today’s freaky LLM behavior: We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same…

  • How Cybersecurity Fears Affect Confidence in Voting Systems

    How Cybersecurity Fears Affect Confidence in Voting Systems American democracy runs on trust, and that trust is cracking. Nearly half of Americans, both Democrats and Republicans, question whether elections are conducted fairly. Some voters accept election results only when their side wins. The problem isn’t just political polarization—it’s a creeping erosion of trust in the…

  • AIs as Trusted Third Parties

    AIs as Trusted Third Parties This is a truly fascinating paper: “Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography.” The basic idea is that AIs can act as trusted third parties: Abstract: We often interact with untrusted parties. Prioritization of privacy can limit the effectiveness of these interactions, as achieving…