The San Francisco Voice Company
Search every conversation. Replay every model failure.
Every turn indexed — audio, transcript, timing, acoustic features. Flag once, detect everywhere. Replay any call.
# Find every call where the agent interrupted the caller
$ voice.query "calls where the agent interrupted the caller."
→ 1,284 matches
→ clustered into 6 defect signatures
# Flag once → detect everywhere
$ voice.flag clip_8f3a
→ detector created
→ flagged 312 similar clips across 90 days
# Replay exact failure
$ voice.replay call_8f3a --at 00:12
→ exact replay ready
Built for teams running 10K+ calls/day.
- LiveKit
- Twilio
- Telnyx
- Pipecat
- Datadog
Beyond words
Most systems hear words.
We hear everything.
Meaning lives beneath the surface of speech.
What others hear
The surface
Words and transcripts

What we hear
The depth
Acoustic features

Keyword search
Finds “cancel subscription”
vs
Acoustic features
Detects frustration

Transcription
Captures what was spoken
vs
Prosody capture
Captures how it was said

Analytics dashboards
Summarize after the fact
vs
Live detection
Surfaces signals instantly
What runs
01
Corpus index
Every turn: audio, transcript, timing, acoustic features, codec tokens.
02
Defect signature loop
Flag once. Get a learned detector that auto-flags similar clips across the corpus.
03
Reproducibility bundles
Deterministic replay of any voice request. Inputs, parameters, models, timing, output. Complete audit trail.
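To make these three pieces concrete, here is a minimal Python sketch of the data they imply. Every name in it (Turn, DefectSignature, ReplayBundle, the nearest-centroid matches check) is illustrative, not the product's actual API; the detector is described above only as "learned", so the cosine-similarity threshold below is a stand-in for whatever model gets trained.

from dataclasses import dataclass
from math import sqrt


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)


@dataclass
class Turn:
    """01: one indexed turn (audio, transcript, timing, acoustic features, codec tokens)."""
    call_id: str
    start_ms: int
    end_ms: int
    audio_uri: str           # pointer into your own bucket
    transcript: str
    acoustic: list[float]    # prosody / acoustic feature vector
    codec_tokens: list[int]


@dataclass
class DefectSignature:
    """02: detector seeded from one flagged clip; auto-flags similar turns."""
    seed_clip: str
    centroid: list[float]    # embedding of the flagged clip
    threshold: float = 0.85  # illustrative cutoff, not a real default

    def matches(self, turn: Turn) -> bool:
        return cosine(self.centroid, turn.acoustic) >= self.threshold


@dataclass
class ReplayBundle:
    """03: everything needed to re-run one voice request deterministically."""
    call_id: str
    inputs: dict
    model: str
    params: dict             # sampling seed, temperature, voice, ...
    timing: dict             # per-turn latencies
    output_uri: str

Scanning the corpus is then one pass of matches over indexed turns; the real system presumably swaps the centroid check for a trained classifier.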
Who this is for
You run your own stack
Pipecat / LiveKit / custom orchestration. Hundreds of thousands to millions of calls per day. Debugging takes hours across logs, transcripts, and audio. You've started building this internally.
You ship infrastructure
Your customers' failures become your support queue. Session analytics isn't enough. Corpus search across millions of conversations isn't on your roadmap — and shouldn't be.
You use Vapi but own your audio
Vapi, Retell, or similar managed platforms. The platform owns the orchestration; you keep the recordings in your own bucket. Point us at the bucket and search, replay, and detect across what you've actually captured.
AI-native output
Made to be read by the LLMs already debugging your stack.
Structured for LLMs
Replay bundles, defect signatures, and corpus search results ship as structured, machine-readable output. Claude, Cursor, and Cline can reason over a voice failure without prompt gymnastics.
MCP server, voice-native
A drop-in MCP server exposes your corpus to your debugger. Better than Datadog's MCP for voice — the schema is built around turns, audio, and replays, not flat metrics.
Applicable to Datadog users
Replay → root cause in one loop
Paste a defect signature into your AI assistant. Get the failing turn, the model parameters, the audio link, and the next step — without leaving the chat.
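As a sketch of what "voice-native MCP" could look like in practice, the snippet below uses the official mcp Python SDK (FastMCP) to expose one replay tool over stdio. The tool name, the in-memory bundle store, and every field in it are hypothetical, not this product's published schema.

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("voice-corpus")

# Stand-in store; in practice this would read bundles out of your own bucket.
BUNDLES: dict[str, dict] = {
    "call_8f3a": {
        "model": "tts-model-v2",  # hypothetical model id
        "params": {"temperature": 0.7, "seed": 1234},
        "audio_url": "s3://your-bucket/call_8f3a.wav",
        "turns": {"00:12": "agent interrupted caller mid-sentence"},
    }
}

@mcp.tool()
def replay_bundle(call_id: str, at: str = "00:00") -> dict:
    """Return the replay bundle for one call: failing turn, model, params, audio link."""
    b = BUNDLES.get(call_id)
    if b is None:
        return {"error": f"unknown call_id {call_id!r}"}
    return {
        "call_id": call_id,
        "failing_turn": b["turns"].get(at),
        "model": b["model"],
        "params": b["params"],
        "audio_url": b["audio_url"],
    }

if __name__ == "__main__":
    mcp.run()  # stdio transport; register this script with Claude, Cursor, or Cline

Once the script is registered in your assistant's MCP config, a prompt like "why did call_8f3a fail at 00:12" can pull the bundle directly.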
Built for regulated workloads
Per-tenant isolation.
Storage, indexes, and detectors keyed by tenant from the first byte. No shared data flow. No federated training across customers.
Bring your own storage.
Audio and bundles live in your cloud account — AWS, GCP, Azure. We index, you own the data plane.
Encrypted in transit and at rest.
TLS 1.3 in transit, AES-256 at rest, customer-managed keys supported.
Region pinning.
EU data stays in the EU. Configurable per deployment for GDPR and EU AI Act readiness.
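A purely illustrative per-tenant config showing how these four guarantees might be expressed at deploy time; none of these keys are the product's real configuration surface, and the ARN is a placeholder.

# Hypothetical per-tenant deployment config; keys are illustrative only.
TENANT_CONFIG = {
    "tenant_id": "acme-health",
    "storage": {                          # bring your own storage
        "provider": "aws",                # or "gcp", "azure"
        "bucket": "s3://acme-voice-audio",
        "kms_key_arn": "arn:aws:kms:eu-central-1:123456789012:key/example",
    },
    "region": "eu-central-1",             # region pinning: EU data stays in the EU
    "isolation": {
        "shared_indexes": False,          # no shared data flow across tenants
        "cross_tenant_training": False,   # no federated training
    },
}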
Working toward SOC 2 Type II. HIPAA BAAs available on request.
Read our security posture →