Workstream 03: Runtime Reliability

Status On develop

  • Workstream 03 is only partially shipped on develop.

Goal

Make Seraph more resilient, observable, and predictable under real usage.

Shipped On develop

  • degraded-mode fallback in the token-aware context window when tiktoken cannot load offline
  • centralized provider-agnostic LLM runtime settings
  • direct LiteLLM fallback path
  • ordered fallback-chain routing across shared completion and agent-model paths
  • health-aware cooldown rerouting across shared completion and agent-model paths
  • runtime-path-specific primary model overrides across shared completion and agent-model paths
  • runtime-path-specific ordered fallback-chain overrides across shared completion and agent-model paths
  • first-class local runtime profile for bounded helper flows, scheduled completion-based jobs, core agent model factories, delegation paths, and connected MCP specialists
  • timeout-safe audit visibility into primary-vs-fallback LLM completion and agent-model behavior
  • fallback-capable smolagents model wrappers for chat, onboarding, strategist, and specialists
  • repeatable runtime eval harness for core guardian, tool, MCP specialist, and observer/audit-runtime reliability contracts
  • lifecycle audit events for REST chat, WebSocket chat, and the full scheduler job surface
  • real tool execution audit events for call, result, and failure across agent transports
  • strategist tool calls and background helper flows, including context-window summarization, now emit runtime audit coverage
  • MCP server connection lifecycle emits runtime audit coverage for connect, disconnect, auth-required, and failure states
  • the local embedding-model boundary emits runtime audit coverage for model load success/failure and encode failures
  • the local vector-store boundary emits runtime audit coverage for add success, search empty-result, and storage failures
  • the local soul-file boundary emits runtime audit coverage for defaulted reads, writes, ensure skips, and write failures
  • the local filesystem boundary emits runtime audit coverage for read/write success, missing files, traversal blocks, and read/write failures
  • sandbox, browser, and web-search tool boundaries emit runtime integration coverage for success, blocked, timeout, empty-result, and failure paths
  • observer calendar, git, goal, and time source boundaries emit runtime integration coverage for unavailable, empty-result, success, and failure paths
  • observer context refresh and queued-bundle delivery emit background runtime audit coverage
  • proactive delivery-gate decisions emit runtime audit coverage for delivered, queued, and failed paths
  • observer daemon screen-context ingest emits runtime audit coverage for receive, persist success, and persist failure
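
As an illustration of how the ordered fallback-chain routing and health-aware cooldown rerouting listed above might fit together, here is a minimal sketch. It is not Seraph's implementation: `FallbackRouter`, `cooldown_seconds`, the injectable `clock`, and the model names in the usage note are all hypothetical, and the real runtime may track health and emit audit events differently.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class FallbackRouter:
    """Hypothetical router: tries models in order, skipping any in cooldown."""

    models: List[str]  # ordered chain: primary first, then fallbacks
    cooldown_seconds: float = 60.0
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    _unhealthy_until: Dict[str, float] = field(default_factory=dict)

    def _is_cooling_down(self, model: str) -> bool:
        # A model is skipped until its cooldown deadline has passed.
        deadline = self._unhealthy_until.get(model, float("-inf"))
        return self.clock() < deadline

    def complete(self, call: Callable[[str], str]) -> Tuple[str, str]:
        """Return (model_used, result) from the first healthy model that succeeds."""
        last_error: Optional[Exception] = None
        for model in self.models:
            if self._is_cooling_down(model):
                continue  # health-aware rerouting: do not retry a recent failure
            try:
                return model, call(model)
            except Exception as exc:
                # Mark the model unhealthy and move on to the next in the chain.
                self._unhealthy_until[model] = self.clock() + self.cooldown_seconds
                last_error = exc
        raise RuntimeError("all models failed or are cooling down") from last_error
```

In this sketch, a caller would construct `FallbackRouter(models=["primary-model", "backup-model"])` and pass a function that performs the actual completion for a given model name. After the primary fails once, later calls go straight to the backup until the cooldown expires, which avoids paying the primary's timeout on every request.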

Working On Now

  • Runtime Reliability remains the current repo-wide hardening track
  • close the remaining runtime observability gaps outside the main agent, scheduler/helper flows, current integration lifecycle coverage, and observer surfaces already instrumented

Still To Do On develop

  • deepen provider routing with richer policy-aware selection, beyond the current explicit runtime-path primary and fallback overrides, ordered fallback, and cooldown rerouting
  • broaden local-model routing beyond the current helper, scheduled completion, core agent-model, delegation, and connected MCP specialist paths into any remaining runtime paths where it makes sense
  • add observability coverage across any remaining edge helpers and external integration paths beyond observer refresh, calendar/git/goal/time sources, daemon ingest, proactive delivery gating, current MCP lifecycle coverage, the embedding/vector-store/soul-file/filesystem boundaries, and the browser/sandbox/web-search tool boundaries
  • expand eval coverage beyond the current runtime seam checks to include broader provider-routing contracts and the remaining edge-path contracts not covered by the current MCP-specialist, embedding-model, vector-store, soul-file, and filesystem checks

Acceptance Checklist

  • provider failure with configured fallbacks does not collapse the entire chat path
  • a local or non-OpenRouter path is demonstrably possible across helper, scheduled completion, core agent, delegation, and connected MCP specialist flows
  • runtime paths can force distinct primary and fallback routing without changing the global runtime baseline
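  • key flows (chat, scheduler jobs, tool execution, MCP connections, and observer surfaces) emit runtime audit events, so failures are observable and easier to debug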
  • the project has repeatable eval coverage for core behavior
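
The checklist item about runtime paths forcing distinct primary and fallback routing without changing the global baseline can be pictured with a small sketch. Everything here is an assumption made for illustration: `RuntimeSettings`, `Route`, `resolve`, the runtime-path keys, and the model names are hypothetical and do not reflect Seraph's actual settings schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    """Resolved routing for one call: a primary model plus ordered fallbacks."""

    primary: str
    fallbacks: List[str]


@dataclass
class RuntimeSettings:
    """Hypothetical settings: a global baseline plus per-path overrides."""

    default_primary: str
    default_fallbacks: List[str]
    path_primary: Dict[str, str] = field(default_factory=dict)
    path_fallbacks: Dict[str, List[str]] = field(default_factory=dict)

    def resolve(self, runtime_path: Optional[str] = None) -> Route:
        # Per-path overrides win; anything not overridden uses the baseline.
        primary = self.path_primary.get(runtime_path, self.default_primary)
        fallbacks = self.path_fallbacks.get(runtime_path, self.default_fallbacks)
        # Copy the list so callers cannot mutate the stored configuration.
        return Route(primary=primary, fallbacks=list(fallbacks))


settings = RuntimeSettings(
    default_primary="global-model",          # placeholder model names
    default_fallbacks=["global-backup"],
    path_primary={"context_summary": "local-model"},
    path_fallbacks={"context_summary": ["local-backup", "global-model"]},
)
summary_route = settings.resolve("context_summary")
chat_route = settings.resolve("chat")
```

Here `summary_route` uses the overridden local model and its own fallback chain, while `chat_route` still resolves to the global baseline. The overrides are kept in separate maps, so adding one for a single path never alters what other paths resolve to.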