Category: trust

Human Trust of AI Agents

Human Trust of AI Agents Interesting research: “Humans expect rationality and cooperation from LLM opponents in strategic games.” Abstract: As Large Language Models (LLMs) integrate into our social and economic interactions, we need to deepen our understanding of how humans respond to LLMs opponents in strategic settings. We present the results of the first controlled…

April 17, 2026
AI Chatbots and Trust

AI Chatbots and Trust All the leading AI chatbots are sycophantic, and that’s a problem: Participants rated sycophantic AI responses as more trustworthy than balanced ones. They also said they were more likely to come back to the flattering AI for future advice. And critically they couldn’t tell the difference between sycophantic and objective…

April 14, 2026
Poisoning AI Training Data

Poisoning AI Training Data All it takes to poison AI training data is to create a website: I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs.” Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech…

February 26, 2026
Building Trustworthy AI Agents

Building Trustworthy AI Agents The promise of personal AI assistants rests on a dangerous assumption: that we can trust systems we haven’t made trustworthy. We can’t. And today’s versions are failing us in predictable ways: pushing us to do things against our own best interests, gaslighting us with doubt about things we are or that…

December 13, 2025
Abusing Notion’s AI Agent for Data Theft

Abusing Notion’s AI Agent for Data Theft Notion just released version 3.0, complete with AI agents. Because the system contains Simon Willson’s lethal trifecta, it’s vulnerable to data theft though prompt injection. First, the trifecta: The lethal trifecta of capabilities is: Access to your private data—one of the most common purposes of tools in the…

September 30, 2025
AI Agents Need Data Integrity

AI Agents Need Data Integrity Think of the Web as a digital territory with its own social contract. In 2014, Tim Berners-Lee called for a “Magna Carta for the Web” to restore the balance of power between individuals and institutions. This mirrors the original charter’s purpose: ensuring that those who occupy a territory have a…

August 23, 2025
Subliminal Learning in AIs

Subliminal Learning in AIs Today’s freaky LLM behavior: We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same…

July 26, 2025
How Cybersecurity Fears Affect Confidence in Voting Systems

How Cybersecurity Fears Affect Confidence in Voting Systems American democracy runs on trust, and that trust is cracking. Nearly half of Americans, both Democrats and Republicans, question whether elections are conducted fairly. Some voters accept election results only when their side wins. The problem isn’t just political polarization—it’s a creeping erosion of trust in the…

July 1, 2025
AIs as Trusted Third Parties

AIs as Trusted Third Parties This is a truly fascinating paper: “Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography.” The basic idea is that AIs can act as trusted third parties: Abstract: We often interact with untrusted parties. Prioritization of privacy can limit the effectiveness of these interactions, as achieving…

March 29, 2025