<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Prime Radiant Blog</title><link href="https://primeradiant.com/blog/" /><link href="https://primeradiant.com/blog/atom.xml" rel="self" /><id>https://primeradiant.com/blog/</id><updated>2026-03-06T00:00:00Z</updated><entry><title>Clearance: A Markdown Browser for macOS</title><link href="https://primeradiant.com/blog/2026/clearance.html" /><id>https://primeradiant.com/blog/2026/clearance.html</id><updated>2026-03-06T00:00:00Z</updated><summary>Introducing Clearance, a free native macOS app for viewing, editing, and navigating through corpora of Markdown documents.</summary><author><name>Jesse</name></author><content type="html">&lt;p&gt;One of the only constants across pretty much every flavor of agentic development is how much time you spend with Markdown files.&lt;/p&gt;
&lt;p&gt;Agents love Markdown.&lt;/p&gt;
&lt;p&gt;They love to write it. They love to read it. They just love it.&lt;/p&gt;
&lt;p&gt;Every spec or plan file that Superpowers makes is a Markdown doc. Pretty much every research doc I get from Claude is a Markdown doc. When Codex writes documentation? You guessed it! Markdown doc.&lt;/p&gt;
&lt;p&gt;One of the great things about Markdown is that it's easy to read and write in a terminal or just about any text editor.&lt;/p&gt;
&lt;p&gt;And there are plenty of beautiful Markdown editors out there.&lt;/p&gt;
&lt;p&gt;But I couldn't find a desktop Markdown reader that did what I wanted.&lt;/p&gt;
&lt;p&gt;I look at a lot of ephemeral Markdown docs. I hate having dozens and dozens and dozens of windows open. While I can read Markdown in a terminal with &lt;code&gt;cat&lt;/code&gt; or &lt;code&gt;less&lt;/code&gt;, it's not a great experience.&lt;/p&gt;
&lt;p&gt;And what's actually been happening for the last nine months is that when I click on a Markdown document on my desktop it opens up an IDE.&lt;/p&gt;
&lt;p&gt;I haven't lived in an IDE in about a year. So invariably what's opening up is an out-of-date IDE that is begging me to update it. On top of that, IDEs are big and heavy, so they're a little bit slow to open.&lt;/p&gt;
&lt;p&gt;Last night after dinner, I sat down and started chatting with Codex about solving this problem for myself.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;We are building a beautiful Markdown viewer and editor. It's very common
right now that humans spend a lot of time reading and editing YAML-headed
Markdown files. I want a MacOS desktop app that has a sidebar that tracks
all of the Markdown files I have opened and shows them by file name with
the full path underneath, ordered with the most recent file I've opened
at the top and the oldest at the bottom. I want to be able to view files
as Markdown, view files rendered beautifully into stunning documents. I
want to be able to have the Markdown view have proper syntax highlighting.
Files should be auto-savable or should auto-save. There should be infinite
undo. It should be associated with the dot MD file type. We should build
it iteratively. What else do you need to know?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Over the course of a couple of hours the app came together.&lt;/p&gt;
&lt;p&gt;I had myself a Markdown viewer and editor with the side panels that I wanted.&lt;/p&gt;
&lt;p&gt;On the left was a list of all of the Markdown docs that I had looked at, ordered by how recently I had opened them.&lt;/p&gt;
&lt;p&gt;On the right was a table of contents for the current document.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Early version of Clearance showing dark mode with sidebar and document outline" src="images/clearance-early.png" /&gt;&lt;/p&gt;
&lt;p&gt;It was ugly, but it worked.&lt;/p&gt;
&lt;p&gt;As I started playing around, I realized that I frequently deal with sets of hyperlinked Markdown documents. Pretty much anything I'm working on has a directory full of them.&lt;/p&gt;
&lt;p&gt;And that was when I realized that I wasn't making a Markdown viewer, I was making a Markdown browser.&lt;/p&gt;
&lt;p&gt;So, that's what Clearance is.&lt;/p&gt;
&lt;p&gt;It's a native macOS app that allows you to view and edit Markdown docs, but primarily it allows you to navigate through a corpus of Markdown docs.&lt;/p&gt;
&lt;p&gt;It's a free utility from Prime Radiant. I hesitate to call it a product, but you can if you want to.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Clearance showing polished light mode with file list, rendered document, and outline" src="images/clearance-polished.png" /&gt;&lt;/p&gt;
&lt;p&gt;You can download it today at &lt;a href="https://github.com/prime-radiant-inc/clearance/releases/tag/v1.0.2"&gt;github.com/prime-radiant-inc/clearance&lt;/a&gt;.&lt;/p&gt;</content></entry><entry><title>Scenarios: Model- and harness-agnostic test scenarios for demonstrating prompt injection patterns</title><link href="https://primeradiant.com/blog/2026/scenarios.html" /><id>https://primeradiant.com/blog/2026/scenarios.html</id><updated>2026-02-28T00:59:46Z</updated><summary>Introducing Scenarios, a project to simulate prompt injection attacks.</summary><author><name>Simon Willison</name></author><content type="html">&lt;p&gt;Prime Radiant is an AI research lab. Broadly, we're building tools that help people get things done. One of our guiding principles is that AI can and should be used to help people do things. It's increasingly clear to us that AI has the potential to massively transform human society and it's crucially important to us that we’re building tools that work for people, rather than the other way round.&lt;/p&gt;
&lt;p&gt;One of the key challenges in building personal digital assistants relates to security: how can we ensure that these assistants won't be tricked into acting in ways that harm the people who use them?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompt injection&lt;/strong&gt; is the category name for a class of attacks that exploit the fact that language model systems mix instructions and arbitrary user input together in the same stream of text. The &lt;a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/"&gt;&lt;strong&gt;lethal trifecta&lt;/strong&gt;&lt;/a&gt; describes a common pattern of prompt injection attacks which combine three elements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Access to &lt;strong&gt;private data&lt;/strong&gt; - information that the user wants to process with their agent but does not want exposed to the world.  &lt;/li&gt;
&lt;li&gt;Exposure to potentially &lt;strong&gt;malicious input&lt;/strong&gt; - the agent reads web pages, emails, or other content that an attacker could conceivably manipulate to insert malicious instructions.&lt;/li&gt;
&lt;li&gt;Some way to &lt;strong&gt;exfiltrate data&lt;/strong&gt; - once tricked, a way the agent could send that private data out to the attacker.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Unfortunately, this combination also represents the most obviously useful form of agentic personal assistant! Everyone wants an assistant that can access their email (providing both private data and potentially malicious input) and act on their behalf (sending emails or using the web = exfiltration).&lt;/p&gt;
&lt;p&gt;We’ve built &lt;a href="https://github.com/prime-radiant-inc/scenarios"&gt;Scenarios&lt;/a&gt;, a test suite that can help document and illustrate how these attacks might be structured, using simulations of real-world systems.&lt;/p&gt;
&lt;h2&gt;How Scenarios is structured&lt;/h2&gt;
&lt;p&gt;A principal goal of the project is to be &lt;strong&gt;model- and harness-agnostic&lt;/strong&gt;. You can use Scenarios to test how vulnerable your tools are to the simulated attacks documented by the scenarios.&lt;/p&gt;
&lt;p&gt;New models are released all the time, and we want to make it as easy as possible to run scenarios against any of them.&lt;/p&gt;
&lt;p&gt;Similarly, there are many different ways to build agentic systems. A rite of passage for developers getting started with LLMs is to roll their own - a basic "agent" is an LLM running in a loop making tool calls, and a basic system can often be built in a few dozen lines of code. (Jesse’s best is currently &lt;a href="https://github.com/obra/smallest-agent/blob/main/src/smallest-agent.js"&gt;646 bytes&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;If you haven't tried building an agent yet &lt;a href="https://fly.io/blog/everyone-write-an-agent/"&gt;you totally should&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;As such, Scenarios aims to provide examples that are not tied to any particular harness. A scenario is a folder with data files and YAML. It should be possible to take any agent harness and write minimal code to parse that YAML and provide access to those files.&lt;/p&gt;
&lt;p&gt;The repo includes two reference implementations - one using the &lt;a href="https://llm.datasette.io/"&gt;LLM Python CLI utility&lt;/a&gt; and one that integrates with &lt;a href="https://code.claude.com/docs/en/overview"&gt;Claude Code&lt;/a&gt;. Scenarios use tools, which are both described as text and also served as a reference Python implementation with an MCP server.&lt;/p&gt;
&lt;p&gt;The repo includes instructions for running scenarios with those default harnesses.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/prime-radiant-inc/scenarios/blob/main/notes/2026-02-20-run-scenario-demo.md"&gt;an example run&lt;/a&gt; captured using &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Do not use scenarios to "prove" your system is secure&lt;/h2&gt;
&lt;p&gt;My one concern with releasing scenarios is that &lt;strong&gt;I don't want developers misusing the project as a set of tests they can use to prove that their system is secure against prompt injection attacks&lt;/strong&gt;!&lt;/p&gt;
&lt;p&gt;If you come up with a system prompt or agent harness that avoids the scenarios in this repo you have &lt;strong&gt;not&lt;/strong&gt; demonstrated that your system is secure - merely that you have worked around the examples represented here.&lt;/p&gt;
&lt;p&gt;The goal of the project is to help demonstrate and explore variants of these attacks. This is not a comprehensive tool for testing your own defenses.&lt;/p&gt;
&lt;p&gt;I want to be explicit: do not use this project to claim your system is secure against prompt injection. I would be extremely disappointed to see it used that way.&lt;/p&gt;
&lt;p&gt;If you do manage to build a harness that avoids all of the scenarios represented here, I challenge you to contribute back a new scenario that your harness fails to handle!&lt;/p&gt;
&lt;p&gt;There are infinite ways an agentic system could be tricked by a malicious input. Solutions to this problem need to sit outside of the realms of prompts and probabilistic filters.&lt;/p&gt;
&lt;h2&gt;Contributions welcome&lt;/h2&gt;
&lt;p&gt;What we’re releasing today is only the tip of the iceberg. We need your help to build out the library of scenarios.&lt;/p&gt;
&lt;p&gt;If you have ideas for new scenarios and want to contribute to the project, please do! Open an &lt;a href="https://github.com/prime-radiant-inc/scenarios/issues"&gt;issue in the repository&lt;/a&gt; to talk to us about your plans.&lt;/p&gt;</content></entry><entry><title>What We're Working On</title><link href="https://primeradiant.com/blog/2026/what-we-are-working-on.html" /><id>https://primeradiant.com/blog/2026/what-we-are-working-on.html</id><updated>2026-02-24T02:47:55Z</updated><summary>Managing agentic development</summary><author><name>Jesse</name></author><content type="html">&lt;p&gt;Hi!&lt;/p&gt;
&lt;p&gt;So, I guess this is the first post on the corporate blog. It's probably about time for me to introduce myself. &lt;/p&gt;
&lt;p&gt;Hey, I'm Jesse Vincent. I'm the founder and CEO of Prime Radiant. In previous lives, I've started &lt;a href="https://keyboard.io"&gt;a small keyboard company&lt;/a&gt;. I've started &lt;a href="https://bestpractical.com"&gt;a small ticketing system company&lt;/a&gt;. I helped run &lt;a href="https://en.wikipedia.org/wiki/VaccinateCA"&gt;VaccinateCA&lt;/a&gt;, the nonprofit that helped Californians figure out where they could get COVID shots. I created an email client called K-9 Mail for Android that got adopted by Thunderbird and is now &lt;a href="https://www.thunderbird.net/en-US/mobile/"&gt;Thunderbird for Android&lt;/a&gt;. I used to be responsible for the Perl programming language. And I've done some &lt;a href="https://blog.fsck.com"&gt;other stuff&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;For the past year, I've been relatively nose down working with coding agents. In the AI world, I'm probably best known for &lt;a href="https://github.com/obra/superpowers"&gt;Superpowers&lt;/a&gt;, an agentic skills framework and development methodology that I built initially for Claude Code and that now runs on a whole bunch of other agent platforms.&lt;/p&gt;
&lt;p&gt;Around the beginning of this year, I founded Prime Radiant. &lt;/p&gt;
&lt;p&gt;Prime Radiant isn't exactly "The Superpowers Company", but being the CEO means that I get to spend corporate resources supporting and giving away Superpowers. &lt;/p&gt;
&lt;p&gt;Broadly speaking, we're doing AI stuff. Everything we make is built with agents.&lt;/p&gt;
&lt;p&gt;The first time I made the transition to "agentic" development, it was soul-crushing. &lt;/p&gt;
&lt;p&gt;I hated it. &lt;/p&gt;
&lt;p&gt;It seemed like I was throwing away a lot of what made me a productive engineer.  My job had been writing and reading code. &lt;/p&gt;
&lt;p&gt;I'd come home at the end of the day, and feel like I hadn't done &lt;em&gt;anything&lt;/em&gt; at work.&lt;/p&gt;
&lt;p&gt;Because I hadn't written any code.&lt;/p&gt;
&lt;p&gt;What did I spend my day on? I helped figure out what "we" were working on. I did some coaching. Sometimes there might have been code review but really there wasn't even very much of that. My job was to figure out what we were doing, to write about it in plain English, and to help make sure that folks were able to turn it into reality. &lt;/p&gt;
&lt;p&gt;I was still working in an 80x24 terminal window, but the closest I got to coding was planning, pointing out errors, and begging for better test coverage.  &lt;/p&gt;
&lt;p&gt;It was a really rough transition.&lt;/p&gt;
&lt;p&gt;Once I got through it, it was amazing. Suddenly, the things that I wanted to do just happened. I had five engineers doing what I asked. And over time, they got better at it. At least in part because I got better at managing them. &lt;/p&gt;
&lt;p&gt;Pretty quickly, I came to realize that the code was always a means to an end. I wanted to ship product to people. I wanted to make stuff. And I was doing more of that than I could do by myself.&lt;/p&gt;
&lt;p&gt;That was a couple decades ago.&lt;/p&gt;
&lt;p&gt;Fast forward to 2025.&lt;/p&gt;
&lt;p&gt;Making the transition to agentic development over the past year has felt pretty natural for me, at least partially because I've done it before.  I've found myself having one of the most prolific periods of my career. My GitHub graph has been basically solid green for the past year. &lt;/p&gt;
&lt;p&gt;I haven't been writing any code. The last code I wrote was three lines of shell script in October, 2025. (I haven't been reading much code, either, but that's a story for another day.) I'm working harder on software than I've worked in a long time. And I'm making lots of stuff. &lt;/p&gt;
&lt;p&gt;I'm actually making so much stuff that I lose track of the projects I'm working on and where I got to on them.&lt;/p&gt;
&lt;p&gt;And that's what brings us to today's blog post. &lt;/p&gt;
&lt;p&gt;Like many other folks, I've built a bunch of tools that help me comprehend Claude Code's logs. Back in October, the first one was my &lt;a href="https://blog.fsck.com/2025/10/23/episodic-memory/"&gt;episodic memory plugin&lt;/a&gt; for Claude Code. It imports your conversation history into a place that Claude can see it and indexes it and makes it searchable. And then it gives Claude a skill and a sub-agent to do that searching. Buried inside the plugin was a tool that rendered those transcripts as HTML. &lt;/p&gt;
&lt;p&gt;Last month, we put together a centralized corporate agent log archive, so that we could see how each of us was prompting our agents, and everyone would be able to access historical records of the development work that is creating the software we're building. It comes with a Claude Code skill to auto-sync your transcripts on exit, and it is designed to run in some central place that your entire team can get to. As a heads up, you should run it behind a firewall or on a tailnet, because this makes everything in your transcripts public to anybody who can get to the website.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/prime-radiant-inc/claude-session-viewer"&gt;claude session viewer&lt;/a&gt; should actually support Codex sessions too now, but it got named what it got named. &lt;/p&gt;
&lt;p&gt;Neither of these tools really answer the question that I've been finding myself overwhelmed by lately:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;What did I work on today?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;And so over the weekend, I started putting together an automatic engineering notebook. It syncs all of my Claude Code sessions from all of the places that I run them into a central archive on my laptop. It uses the Claude Agents SDK to summarize what I did in each session and whether there are any open threads or unresolved issues that were obvious from the conversation. And then it presents that information in a few different ways:&lt;/p&gt;
&lt;p&gt;In a journal view that shows day by day what projects I worked on and what I did on them.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Journal view" src="images/what-we-are-working-on/journal.png" /&gt;&lt;/p&gt;
&lt;p&gt;In a view by project showing me what I did on each project day by day.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Project view" src="images/what-we-are-working-on/log.png" /&gt;&lt;/p&gt;
&lt;p&gt;In a calendar view, so I can get a sense of the cadence of my projects.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Calendar view" src="images/what-we-are-working-on/calendar.png" /&gt;&lt;/p&gt;
&lt;p&gt;And there's also an iCalendar feed that I can subscribe to in my desktop calendar app to see all of this data as a retrospective calendar.&lt;/p&gt;
&lt;p&gt;All of those views make it easy for me to tell you that today I worked on scaffolding a new coding agent design, using terminal-bench to tune another coding agent (currently at about a 65% pass rate), running a malware audit against the openclaw skills hub, implementing engineering-notebook, and migrating a couple of our internal tools from AWS Fargate to EC2. Over the past month, it looks like I've worked on at least 23 different software projects across Swift, Typescript, Rust, Go, Python, and C++. &lt;/p&gt;
&lt;p&gt;&lt;code&gt;engineering-notebook&lt;/code&gt; is implemented in Typescript and runs locally on your computer with &lt;a href="https://bun.sh"&gt;Bun&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/prime-radiant-inc/engineering-notebook"&gt;It's open source and available on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;And, if you've read this far, I'm only a little sorry that the title of this blog post was a bit of a tease. &lt;/p&gt;
&lt;p&gt;We've got a handful of other projects that we're getting ready to open source. Many of them are tools for folks who make software, but not a single one of them has been coded by a human.&lt;/p&gt;</content></entry></feed>