Noma Agent Safety Proof
/app/examples/agent-plan.noma
file:///app/examples/agent-plan.noma
Patch result: applied
Validation: ok → ok
Lines preserved: 99.4%
Operations: 1
Pre SHA: 99057ceb
Post SHA: 98ded78c
Source bytes: 5648 → 5648
Write allowed: yes
Agent Loop
1. Read: 161 source lines
2. Discover IDs: 21 canonical IDs
3. Export Context: 5882 LLM chars
4. Simulate Patch: applied
5. Validate: ok
6. Preview: sandboxed artifact
Patch Operations
| # | Op | Target | Payload |
|---|---|---|---|
| 1 | update_attribute | decision-q3-direction | `{"op": "update_attribute", "id": "decision-q3-direction", "key": "status", "value": "accepted"}` |
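
The operation above can be sketched in a few lines of Python. This is only an illustration of what an `update_attribute` op does to the source text, not Noma's actual implementation: the function name `apply_update_attribute` is hypothetical, and the `::name{key="value"}` opener syntax is assumed from the Source Diff section of this report.

```python
import re

# Assumed opener syntax, taken from the diff in this report:
#   ::decision{id="decision-q3-direction" status="proposed"}
OPENER = re.compile(r'^(::[\w-]+)\{([^}]*)\}\s*$')
ID_ATTR = re.compile(r'(?<![\w-])id="([^"]*)"')


def apply_update_attribute(source: str, op: dict) -> str:
    """Hypothetical helper: set one attribute on the directive named by op["id"]."""
    if op.get("op") != "update_attribute":
        raise ValueError(f"unsupported op: {op.get('op')!r}")
    key, value = op["key"], op["value"]
    key_re = re.compile(rf'((?<![\w-]){re.escape(key)}=")[^"]*(")')

    out, hits = [], 0
    for line in source.split("\n"):
        m = OPENER.match(line)
        id_match = ID_ATTR.search(m.group(2)) if m else None
        if id_match and id_match.group(1) == op["id"]:
            attrs = m.group(2)
            # Replace only the value of the target key; leave other attributes untouched.
            new_attrs, n = key_re.subn(lambda k: k.group(1) + value + k.group(2), attrs, count=1)
            if n == 0:
                # Key not present yet: append it.
                new_attrs = f'{attrs} {key}="{value}"'
            line = f"{m.group(1)}{{{new_attrs}}}"
            hits += 1
        out.append(line)

    if hits != 1:
        raise KeyError(f"expected exactly one directive with id {op['id']!r}, found {hits}")
    return "\n".join(out)


if __name__ == "__main__":
    src = '## Decision\n::decision{id="decision-q3-direction" status="proposed"}\nStart with Option B.\n::'
    op = {"op": "update_attribute", "id": "decision-q3-direction", "key": "status", "value": "accepted"}
    print(apply_update_attribute(src, op).split("\n")[1])
```

Because only the attribute value is rewritten, every other line is passed through unchanged, which is consistent with the single-line diff and the unchanged byte count reported above (`proposed` and `accepted` have the same length).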
Pre-Validation
No issues found.
Post-Validation
No issues found.
ID Registry
| ID | Type | Line | Title / aliases |
|---|---|---|---|
| q3-roadmap-decision | section | 8 | Q3 Roadmap Decision |
| options-at-a-glance | section | 17 | Options at a glance |
| decision | section | 33 | Decision |
| decision-q3-direction | directive:decision | 35 | |
| decision-matrix | section | 40 | Decision matrix |
| claims-and-evidence | section | 50 | Claims and evidence |
| claim-research-wedge | directive:claim | 52 | |
| claim-docs-too-slow | directive:claim | 64 | |
| risks | section | 75 | Risks |
| risk-narrow-icp | directive:risk | 77 | |
| risk-format-not-sticky | directive:risk | 83 | |
| risk-llm-export-quality | directive:risk | 89 | |
| timeline | section | 94 | Timeline |
| open-questions | section | 114 | Open questions |
| oq-pricing-model | directive:open_question | 116 | |
| oq-pdf-engine | directive:open_question | 121 | |
| agent-tasks | section | 126 | Agent tasks |
| task-validate-claim-research-wedge | directive:agent_task | 128 | |
| task-watch-stale-evidence | directive:agent_task | 134 | |
| task-export-as-review-prompt | directive:agent_task | 140 | |
| export | section | 145 | Export |
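
A registry of this shape could be built with a single pass over the source lines. The sketch below is an assumption about how such discovery might work, not the tool's real algorithm: `discover_ids`, `slugify` and `Entry` are hypothetical names, section IDs are assumed to be slugs of the heading text (which matches every section row above), and only directives that carry an `id` attribute are registered.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    id: str
    type: str
    line: int
    title: str


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def discover_ids(source: str) -> List[Entry]:
    """Hypothetical ID discovery: headings become sections, ::name{id="..."} become directives."""
    entries = []
    for number, line in enumerate(source.split("\n"), start=1):
        heading = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if heading:
            title = heading.group(1)
            entries.append(Entry(slugify(title), "section", number, title))
            continue
        directive = re.match(r"^::([\w-]+)\{([^}]*)\}", line)
        if directive:
            id_attr = re.search(r'(?<![\w-])id="([^"]*)"', directive.group(2))
            if id_attr:  # directives without an id are not addressable, so skip them
                entries.append(Entry(id_attr.group(1), f"directive:{directive.group(1)}", number, ""))
    return entries


if __name__ == "__main__":
    sample = "\n".join([
        "# Q3 Roadmap Decision",
        "",
        "## Decision",
        "",
        '::decision{id="decision-q3-direction" status="proposed"}',
        "Start with Option B.",
        "::",
    ])
    for entry in discover_ids(sample):
        print(entry.id, entry.type, entry.line)
```

Keeping the line number alongside each ID is what lets a patch operation address a block by name while still producing a diff that touches only the line where that block opens.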
Source Diff
 ## Decision
-::decision{id="decision-q3-direction" status="proposed"}
+::decision{id="decision-q3-direction" status="accepted"}
 Start with **Option B — Research Workflows**. Narrowest wedge, fastest signal, keeps the door open to A and C as adjacent expansions.
 ::
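
The "Lines preserved" and "Source bytes" figures in the summary can be reproduced from a pre/post pair with the standard library. The sketch below assumes the percentage is the share of pre-patch lines that survive unchanged (160 of 161 gives 99.4%); the report does not state the exact formula, and `lines_preserved` and `byte_sizes` are hypothetical helper names.

```python
import difflib
from typing import Tuple


def lines_preserved(pre: str, post: str) -> float:
    """Fraction of pre-patch lines that appear unchanged, in order, in the post-patch text."""
    pre_lines = pre.split("\n")
    post_lines = post.split("\n")
    matcher = difflib.SequenceMatcher(a=pre_lines, b=post_lines, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / max(len(pre_lines), 1)


def byte_sizes(pre: str, post: str) -> Tuple[int, int]:
    """UTF-8 sizes before and after, as in the 'Source bytes' row."""
    return len(pre.encode("utf-8")), len(post.encode("utf-8"))


if __name__ == "__main__":
    # Synthetic 161-line document with one attribute changed on line 35.
    pre_lines = [f"line {i}" for i in range(1, 162)]
    pre_lines[34] = '::decision{id="decision-q3-direction" status="proposed"}'
    post_lines = list(pre_lines)
    post_lines[34] = '::decision{id="decision-q3-direction" status="accepted"}'
    pre, post = "\n".join(pre_lines), "\n".join(post_lines)

    print(round(100 * lines_preserved(pre, post), 1))  # prints 99.4
    print(byte_sizes(pre, post)[0] == byte_sizes(pre, post)[1])  # prints True
```

A check like this is a cheap guard before allowing a write: a one-operation patch that preserves far fewer lines than expected is a sign the patch touched more than its target.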
LLM Context Used For Agent Work
# Agent Planning Artifact — Q3 Roadmap Decision

# Q3 Roadmap Decision [#q3-roadmap-decision]

[SUMMARY]
Three candidate directions for next quarter. This document captures the options, trade-offs, risks, and timeline as structured blocks so an agent can revisit and update each section independently — and so the recommendation can be exported as a prompt for a follow-on review pass.
[/SUMMARY]

## Options at a glance [#options-at-a-glance]

[GRID columns=3 min="14rem" gap="0.9rem" wide=true]
[CARD title="A · Docs Platform" icon="docs"]
Build a hosted publishing target for Noma. Highest revenue ceiling, longest path to value.
[/CARD]
[CARD title="B · Research Workflows" icon="search"]
Lean into claims/evidence/risk blocks for analyst teams. Narrow ICP, fastest to first paying customer.
[/CARD]
[CARD title="C · General Reports" icon="report"]
Position Noma as the default format for AI-generated reports across domains. Broadest TAM, weakest wedge.
[/CARD]
[/GRID]

## Decision [#decision]

[DECISION id="decision-q3-direction" status="proposed"]
Start with Option B — Research Workflows. Narrowest wedge, fastest signal, keeps the door open to A and C as adjacent expansions.
[/DECISION]

## Decision matrix [#decision-matrix]

| Dimension | A · Docs | B · Research | C · Reports |
| --------------------- | -------- | ------------ | ----------- |
| Time to first revenue | 6–9 mo | 6–10 wk | 4–6 mo |
| Wedge sharpness | Medium | High | Low |
| Existing block fit | Strong | Native | Medium |
| Defensibility | Network | Workflow | Brand only |
| 18-month revenue cap | High | Medium | High |

## Claims and evidence [#claims-and-evidence]

[CLAIM id="claim-research-wedge" confidence=0.74]
Research and analyst teams are the sharpest wedge for Noma because their existing tools (Word, Notion, Confluence) lack first-class claim/evidence/risk primitives, and they already structure documents this way mentally.
[/CLAIM]
[EVIDENCE for="claim-research-wedge" source="user-interviews-apr-2026"]
Of 11 analyst-team interviews in April, 9 described their current workflow as "copy-paste claims into a doc and hope someone catches stale ones." All 9 said they would pay for a tool that flagged stale evidence automatically.
[/EVIDENCE]

[CLAIM id="claim-docs-too-slow" confidence=0.68]
A docs platform is the higher revenue ceiling, but time-to-revenue is too long to be the wedge. Better as a follow-on once Noma has format adoption.
[/CLAIM]
[EVIDENCE for="claim-docs-too-slow" source="docs-platform-benchmark-2026"]
Comparable docs-platform launches (Mintlify, GitBook) took 12–18 months to reach $100k ARR; research-tool launches (Mem, Reflect) hit it in 6–9 months with a tighter ICP.
[/EVIDENCE]

## Risks [#risks]

[RISK id="risk-narrow-icp" severity="medium" owner="ferax564"]
Research-team ICP is small (~3k orgs globally). Even high conversion caps the business below docs-platform scale. Mitigation: use research wedge to drive format adoption, then expand to docs.
[/RISK]
[RISK id="risk-format-not-sticky" severity="high" owner="ferax564"]
If teams don't keep editing in Noma after first artifact, the workflow value disappears. Mitigation: ship the agent patch protocol in week 3 so updates flow back into source automatically.
[/RISK]
[RISK id="risk-llm-export-quality" severity="low" owner="ferax564"]
LLM export quality determines whether agents trust Noma source as canonical. Easy to verify, easy to fix. Tracked separately.
[/RISK]

## Timeline [#timeline]

[GRID columns=4 min="12rem" compact=true wide=true]
[CARD title="Wk 1 · Format"]
Parser, AST, frontmatter, JSON export, basic validation.
[/CARD]
[CARD title="Wk 2 · Artifact"]
HTML renderer, default theme, cards/grids/tabs/charts, mobile.
[/CARD]
[CARD title="Wk 3 · Agent"]
LLM export, patch protocol, copy-as-prompt buttons.
[/CARD]
[CARD title="Wk 4 · Launch"]
3 demos, README, spec, comparison page, OSS release.
[/CARD]
[/GRID]

## Open questions [#open-questions]

[OPEN_QUESTION id="oq-pricing-model"]
Per-seat, per-document, or per-render? Research-team workflows favor per-seat; agent-driven artifacts favor per-render. Decide before week 3.
[/OPEN_QUESTION]
[OPEN_QUESTION id="oq-pdf-engine"]
Keep Puppeteer as the report-PDF path or add Typst for longer books? Puppeteer now covers first-class PDFs; Typst may still be worth evaluating for book output.
[/OPEN_QUESTION]

## Agent tasks [#agent-tasks]

[AGENT_TASK id="task-validate-claim-research-wedge"]
Re-run the interview tally each month. If claim-research-wedge evidence base drops below 8 of 11 supporting interviews (or new interviews contradict), lower the claim's confidence attribute and add a counterevidence block.
[/AGENT_TASK]
[AGENT_TASK id="task-watch-stale-evidence"]
Every two weeks, scan evidence blocks for source attributes older than 60 days. Flag any whose underlying source has changed materially. Do not auto-edit — propose a replace_block patch for human approval.
[/AGENT_TASK]
[AGENT_TASK id="task-export-as-review-prompt"]
On request, package this document's decision, top three claims, and all risks of severity ≥ medium into an LLM prompt for a second-opinion review.
[/AGENT_TASK]

## Export [#export]

[EXPORT_BUTTON format="prompt" target="decision-q3-direction"]
Label: Copy decision + risks as a review prompt
[/EXPORT_BUTTON]
[EXPORT_BUTTON format="markdown" target="summary"]
Label: Copy summary as Markdown
[/EXPORT_BUTTON]
[EXPORT_BUTTON format="json" target="document"]
Label: Copy full document AST
[/EXPORT_BUTTON]

> The point of this artifact is not the prose — it's that an agent can re-open
> it next month, walk the decision/claim/risk graph, and update only the parts
> that changed. Everything else stays put, and the Git diff stays clean.