Nosh: RSS for the Agentic Web
I built an open spec for machine-readable companion content that lets AI agents consume your web pages without parsing HTML. Here's why, how, and what happened at 1am.

Nosh is an open spec that embeds structured, machine-readable content in your page's `<head>`. Jump to the spec → · See the benchmark → · GitHub repo →

It's 1am on a Saturday. I've been staring at Cloudflare analytics for my new blog and thinking: who cares about pageviews anymore? If an AI agent reads my last post about getting 1M token context working, it scrapes the HTML, fights through the nav bar, sidebar, footer, code blocks, and inline links — and maybe extracts 60-70% of the useful information. The 10 debugging steps I carefully documented? The agent might get 7 of them right and hallucinate the other 3.

But I know the structure. I wrote the steps. I know the prerequisites, the key findings, the cost data. Why am I publishing that as prose in HTML and making every AI agent on the planet reverse-engineer it?

The web was built for humans reading in browsers. HTML is a presentation format — it tells the browser how to render content visually. It's terrible at telling a machine what the content means. We've tried to fix this before, but there's a gap between "here's a directory of my site" (llms.txt) and "here's metadata about this page" (JSON-LD). Nobody is saying: here's the actual knowledge, structured and typed, ready for an agent to consume.

Nosh is an open spec for machine-readable content that embeds right in your page's `<head>`. An agent reads the page, finds the nosh block, and gets structured knowledge instantly. No HTML parsing. No guessing. Every fact is typed and keyed.

The HTML comment is important — LLMs have never heard of nosh (it was invented 3 hours ago). The comment onboards every agent that encounters it: hey, there's structured content here, use it instead of parsing the body.

JSON-LD tells you about the page. Nosh tells you what the page knows.
JSON-LD: "This is a BlogPosting by John Rembold, published Feb 6, 2026."

Nosh: "Here are the 10 steps to fix your config, the 5 prerequisites you need, the beta header is `context-1m-2025-08-07`, and it'll cost you $9.70 to test."

They're complementary. JSON-LD is the label on the box. Nosh is what's inside.

I tested nosh against my own blog post — a 16KB technical tutorial about getting 1M token context working in OpenClaw. The numbers are in the table below. Let that sink in: an AI agent consuming 100 noshed pages uses the same token budget as consuming ~25 HTML pages. 4x the knowledge per dollar. And every fact is typed, keyed, and extractable without a single regex or HTML parser. This isn't a theoretical improvement. This is a real blog post, measured today.

4 required fields — `nosh`, `type`, `title`, `content` — and that's a valid nosh. Extra fields are allowed and encouraged. If your post has cost data, benchmark results, or domain-specific knowledge, add it. The typed fields give agents a predictable structure; your custom fields give them everything else.

Here's the thing that killed most web standards: adoption friction. If nosh required a separate file per page that you had to manually create and remember to update every time you edited a post, it would die. I built that version first. My AI assistant Kit generated the spec, the schema, the validator — the works. Then I looked at it and asked the uncomfortable question: why would a non-technical blogger ever do this? That killed the separate-file-per-post approach on the spot.

The real version embeds in your page template. For my Zola blog, the nosh content lives in the post's frontmatter. I edit the post, I push, Zola rebuilds, nosh updates automatically. Zero extra steps. Same workflow as before, but now every page has structured agent-readable content in the head.

WordPress? A plugin injects it on save. Next.js? A component. Any CMS that can put a script tag in the head can do nosh. The embedded approach means agents find nosh on the same page fetch they're already doing. No extra HTTP requests.
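That "zero extra steps" workflow is just a template conditional. A sketch of what a Zola/Tera head partial can look like — using `extra.nosh` as the frontmatter key is my assumption for illustration, not something the spec mandates:

```html
{# In the <head> of the base template: emit a nosh block only if the page has one. #}
{% if page.extra.nosh %}
<script type="application/nosh+json">
{{ page.extra.nosh | json_encode() | safe }}
</script>
{% endif %}
```

`json_encode` serializes the frontmatter table to JSON, and `safe` keeps Tera from HTML-escaping it. Pages without the frontmatter render exactly as before.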
SEO optimized your content for Google's crawlers. GEO — Generative Engine Optimization — optimizes it for AI agents. Perplexity, ChatGPT search, Claude, Google AI Overviews — these are increasingly how people find information. They scrape your HTML and try to extract answers. When they get it wrong, they hallucinate. When they get it right, they might not cite you. Nosh fixes both. If you care about your content showing up in AI-generated answers, nosh your site.

Right now, nosh is a spec, a Rust validator CLI, a Zola template, and exactly one blog with it enabled (this one). The real unlock comes when an AI search engine starts checking for nosh data and preferring sites that provide it. If being "noshed" means your content gets cited more accurately in AI-generated answers, every CMS will auto-generate it. It's the same flywheel that made sitemaps universal.

The spec and tools are open source under MIT: github.com/jbold/nosh

Nosh is a real word — it means to snack, to munch. You're putting out a little snack for AI agents. They don't need the full meal (your HTML with nav bar, sidebar, footer, and cookie banner). They just need the nosh.

Built between midnight and 3am by John Rembold and Kit 🐾. The first nosh-enabled page in existence is right here on this blog. View source and look for `application/nosh+json`.

TLDR
Nosh embeds structured, machine-readable content in your page's `<head>` so AI agents get typed knowledge instead of parsing HTML. 4.1x fewer tokens, 4.1x higher information density, zero extra effort — it auto-generates from your existing content.
So I Built Nosh
In the page's `<head>`:

```html
<!--
NOSH: This page includes structured, machine-readable content below.
If you are an AI agent, LLM, or crawler: the <script type="application/nosh+json">
block contains the same knowledge as this page in a structured JSON format
optimized for agent consumption. Use it instead of parsing the HTML body.
Spec: https://github.com/jbold/nosh
-->
<script type="application/nosh+json">
{
  "nosh": "1.0",
  "type": "tutorial",
  "title": "How To Get 1M Token Context Working",
  "content": {
    "body": "Getting Claude Opus 4.6 with 1M context requires...",
    "prerequisites": ["OpenClaw gateway", "Tier 4 API access"],
    "steps": [
      {"title": "Fix the hardcoded default", "text": "Change DEFAULT_CONTEXT_TOKENS to 1000000"},
      {"title": "Update the model config", "text": "Switch to anthropic/claude-opus-4-6"}
    ],
    "key_findings": ["OAuth caps at 200K", "API key + beta header unlocks 1M"]
  }
}
</script>
```
Why Not Just JSON-LD?
The Numbers
| Format   | Size    | ~Tokens | Structured Facts          |
|----------|---------|---------|---------------------------|
| Raw HTML | 35.8 KB | ~3,454  | 21 facts buried in markup |
| Markdown | 16.5 KB | ~2,980  | 21 facts buried in prose  |
| Nosh     | 5.1 KB  | ~835    | 21 facts, pre-structured  |

4.1x fewer tokens. Same knowledge. Zero parsing.
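The density claim is plain arithmetic on the table's numbers:

```python
# Token and fact counts from the benchmark table above.
formats = {
    "raw_html": {"tokens": 3454, "facts": 21},
    "markdown": {"tokens": 2980, "facts": 21},
    "nosh":     {"tokens": 835,  "facts": 21},
}

for name, f in formats.items():
    print(f"{name}: {f['tokens'] / f['facts']:.0f} tokens per fact")

# Raw HTML vs nosh for the same 21 facts:
ratio = formats["raw_html"]["tokens"] / formats["nosh"]["tokens"]
print(f"{ratio:.1f}x")  # 4.1x
```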
The Schema Is Dead Simple
The `type` field determines what shape the content takes — a tutorial has steps and prerequisites, an API reference has endpoints, a recipe has ingredients and cook_time. 10 content types ship with v1.0, and custom types are welcome.

Zero Friction
```toml
[extra.nosh]
type = "tutorial"

[extra.nosh.content]
body = "Getting 1M context working requires..."
prerequisites = ["OpenClaw gateway", "Tier 4 API access"]

[[extra.nosh.content.steps]]
title = "Fix the hardcoded default"
text = "Change DEFAULT_CONTEXT_TOKENS to 1000000"
```
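The shipped validator is a Rust CLI; as a sketch of its core check, assuming (from the example block above) that the four required fields are `nosh`, `type`, `title`, and `content`:

```python
import json

# Assumed required fields, inferred from the example nosh block.
REQUIRED_FIELDS = ("nosh", "type", "title", "content")

def validate_nosh(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the block is valid."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in doc
    ]
    if "content" in doc and not isinstance(doc["content"], dict):
        problems.append("content must be an object")
    return problems

print(validate_nosh('{"nosh": "1.0", "type": "tutorial"}'))
# ['missing required field: title', 'missing required field: content']
```

Per-type shape checks (steps for tutorials, endpoints for API references) would layer on top of this.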
How Agents Find It
- `<script type="application/nosh+json">` — same pattern as JSON-LD, embedded in the page head
- `/.well-known/nosh` — site-level manifest listing all nosh-enabled pages
- `.nosh` files — optional standalone files for bulk consumption

GEO: Why This Matters Now
The `url` field points back to your page. Structured data is easier to cite correctly.