Seraph Research Synthesis

Status: canonical design synthesis for the current Seraph superiority program

Seraph is an AI guardian that remembers, watches, and acts. This research tree defines the product contract behind that claim.

Purpose

This research tree defines what Seraph is trying to become. It is not the shipped-status ledger. For shipped truth, use docs/implementation/.

The working model is:

docs/research/ answers: what should Seraph become?
docs/implementation/ answers: what is true on develop right now?
the trees should mirror each other closely enough that no major program layer exists only on one side

This tree now has a narrower standard than before: every major product claim should be grounded in direct Seraph repo evidence, official reference-system docs, or primary research. If that evidence is missing, the claim should be marked Unknown rather than stated with unearned confidence.
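As a rough illustration of that evidence rule, the sketch below tags each claim with one of the three accepted evidence kinds and reports anything untagged as Unknown. The `Evidence` and `Claim` names and the example claim texts are hypothetical, introduced only for this sketch; they are not taken from the Seraph repo.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Evidence(Enum):
    # The three accepted evidence kinds named in the paragraph above.
    REPO = "direct Seraph repo evidence"
    REFERENCE_DOCS = "official reference-system docs"
    PRIMARY_RESEARCH = "primary research"


@dataclass(frozen=True)
class Claim:
    # Hypothetical record type; None means no grounding was found.
    text: str
    evidence: Optional[Evidence] = None

    @property
    def status(self) -> str:
        # A claim without evidence is reported as Unknown, never as confident.
        return "Grounded" if self.evidence is not None else "Unknown"


claims = [
    Claim("example claim backed by repo evidence", Evidence.REPO),
    Claim("example claim with no evidence yet"),
]
statuses = [c.status for c in claims]
```

The point of the design is that Unknown is the default: a claim has to be given evidence explicitly before it can read as grounded.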

Locked Product Direction

Seraph is being optimized for a power-user guardian workspace, not a broad consumer chat assistant.

That means:

dense, linked, evidence-backed operator surfaces matter more than charm-first presentation
memory, intervention quality, and real execution matter more than one-shot conversation polish
the old village/game/editor line is treated as a retired historical direction, not a future or fallback product branch

Core Thesis

Seraph is not just an agent runtime. It is a guardian system:

it observes the human and their environment continuously
it maintains a living model of the human, their goals, and their state
it reasons about leverage, risk, timing, and intervention quality
it acts through tools, channels, and interfaces that fit the moment
it learns whether its interventions were helpful

The product only works if all five of these behaviors reinforce each other.

Benchmark Axes

Seraph should be judged against OpenClaw, Hermes, and IronClaw on explicit axes:

operator visibility and legibility
longitudinal memory and human modeling
intervention quality and timing
safe real-world execution
runtime reliability and eval rigor
workflow composition and delegation
dense interface efficiency
presence and reach across surfaces

“Best agent in the world” here does not mean winning a generic model benchmark. It means building the strongest integrated guardian product on those axes.

Design Principles

private and operator-controlled runtime paths where they meaningfully improve trust, speed, and availability
explicit trust boundaries before broader autonomy
real execution, not simulated competence
observability before optimism
product embodiment matters because guardian systems need legibility, not just capability
superiority claims require evidence plus an implementation consequence

System Loops

Guardian loop

observe -> model -> reason -> decide -> act -> learn

Runtime loop

route -> execute -> fall back -> audit -> evaluate -> improve
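The first four steps of the runtime loop (route, execute, fall back, audit) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Seraph's runtime: `run_with_fallback`, the route names, and the two handler functions are all hypothetical, and the evaluate and improve steps are left out.

```python
from typing import Callable, List, Tuple

AuditLog = List[Tuple[str, str]]  # (route name, outcome)


def run_with_fallback(
    task: str,
    routes: List[Tuple[str, Callable[[str], str]]],
    audit: AuditLog,
) -> str:
    """Try each route in order, recording every attempt in the audit log."""
    for name, handler in routes:
        try:
            result = handler(task)
        except Exception as exc:  # a failed route triggers the next fallback
            audit.append((name, f"failed: {exc}"))
            continue
        audit.append((name, "ok"))
        return result
    raise RuntimeError(f"all routes failed for task: {task}")


def flaky_primary(task: str) -> str:
    # Hypothetical primary route that is currently unavailable.
    raise TimeoutError("primary unavailable")


def local_fallback(task: str) -> str:
    # Hypothetical fallback route that always succeeds.
    return f"handled {task} locally"


audit_log: AuditLog = []
result = run_with_fallback(
    "summarize inbox",
    [("primary", flaky_primary), ("fallback", local_fallback)],
    audit_log,
)
```

Because every attempt is appended to the audit log, including the failed ones, a later evaluation step has a record of which routes were tried and why a fallback was used.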

Product loop

ambient presence -> timely intervention -> user response -> updated trust calibration

Architecture Pillars

Implementation Mapping

Research 01 and 02 map to Workstream 05 in docs/implementation/05-guardian-intelligence.md
Research 03 maps to Workstream 03 in docs/implementation/03-runtime-reliability.md
Research 04 maps to Workstream 01 in docs/implementation/01-trust-boundaries.md
Research 05 maps to Workstream 02 in docs/implementation/02-execution-plane.md
Research 06 maps to Workstream 04 in docs/implementation/04-presence-and-reach.md
Research 07 maps to Workstream 06 in docs/implementation/06-embodied-ux.md
Research 08 maps to Workstream 07 in docs/implementation/07-ecosystem-and-leverage.md
Research 09 is mirrored by docs/implementation/08-docs-contract.md
Research 10 is mirrored by docs/implementation/09-benchmark-status.md
Research 11 is mirrored by docs/implementation/10-superiority-delivery.md
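The mapping above can also be held as data, which makes it easy to check that every mirror doc exists. The sketch below is an assumption-laden illustration: the implementation paths are copied from the list above, but the dictionary name, the `missing_mirrors` helper, and the idea of passing in a set of existing paths are hypothetical and not part of the repo.

```python
from typing import Dict, Iterable, List

# Research doc number -> implementation doc path, copied from the mapping above.
IMPLEMENTATION_MIRRORS: Dict[str, str] = {
    "01": "docs/implementation/05-guardian-intelligence.md",
    "02": "docs/implementation/05-guardian-intelligence.md",
    "03": "docs/implementation/03-runtime-reliability.md",
    "04": "docs/implementation/01-trust-boundaries.md",
    "05": "docs/implementation/02-execution-plane.md",
    "06": "docs/implementation/04-presence-and-reach.md",
    "07": "docs/implementation/06-embodied-ux.md",
    "08": "docs/implementation/07-ecosystem-and-leverage.md",
    "09": "docs/implementation/08-docs-contract.md",
    "10": "docs/implementation/09-benchmark-status.md",
    "11": "docs/implementation/10-superiority-delivery.md",
}


def missing_mirrors(existing_paths: Iterable[str]) -> List[str]:
    """Return mirror docs named in the mapping that are absent from existing_paths."""
    existing = set(existing_paths)
    return sorted({p for p in IMPLEMENTATION_MIRRORS.values() if p not in existing})


# Example: pretend every mirror exists except the benchmark-status doc.
present = [
    p for p in IMPLEMENTATION_MIRRORS.values()
    if p != "docs/implementation/09-benchmark-status.md"
]
missing = missing_mirrors(present)
```

In practice the list of existing paths would come from a directory listing of docs/implementation/; it is passed in here so the sketch stays independent of any particular checkout.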

Docs Contract

research docs define the target shape, evidence rules, and benchmark/program logic
implementation docs define shipped truth on develop, delivery ownership, and the live PR queue
if a new benchmark or program layer appears in research, implementation needs a mirror doc in the same PR
if implementation changes the live queue or shipped-state translation, research should not keep a stale competing version
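The same-PR mirror rule lends itself to a coarse automated check. The sketch below assumes that "a new layer appears in research" can be approximated by a file being added under docs/research/, which is an assumption made here, not something the contract states. The function name and the example file names are hypothetical.

```python
from typing import Iterable


def research_without_mirror(added_files: Iterable[str]) -> bool:
    """Return True when a PR adds research docs but no implementation docs."""
    added = list(added_files)
    adds_research = any(p.startswith("docs/research/") for p in added)
    adds_implementation = any(p.startswith("docs/implementation/") for p in added)
    return adds_research and not adds_implementation


# Hypothetical PR contents, for illustration only.
unmirrored_pr = ["docs/research/12-new-layer.md"]
mirrored_pr = [
    "docs/research/12-new-layer.md",
    "docs/implementation/11-new-layer.md",
]
flags = [research_without_mirror(unmirrored_pr), research_without_mirror(mirrored_pr)]
```

This only checks that some implementation doc was added alongside the research doc. It does not confirm that the added doc is the right mirror, so it would complement a review step rather than replace one.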

What “Superior” Means Here

For Seraph, superiority means:

it knows the human over time better than today’s terminal-first and gateway-first competitors
it notices and ranks important things with better timing, not just more activity
it can act safely and effectively through real tools and channels
it is legible enough to trust under failure, routing changes, and proactive behavior
it compounds usefulness instead of resetting every session

That requires a product system, not just a stronger model endpoint.
