VoiceOS API

The VoiceOS API turns speech into polished, context-aware text. Unlike a plain automatic speech recognition (ASR) endpoint, VoiceOS accepts both audio and a chat-style message stack, then uses that context to recognize rare terms and correct likely transcription errors.

Context Aware ASR

Send conversation context with audio so the API understands product names, files, identifiers, and domain language.

Chat-style request shape

Use a multipart upload with the fields file, messages (optional), languages, and dictionary (optional).

What you can build

Voice agents that understand the active conversation.
Developer tools that recognize filenames, symbols, and code identifiers.
Meeting or support transcription with product-specific dictionary terms.
Dictation experiences that return clean, polished text instead of raw ASR.

Current status

The current implementation is a local developer prototype:

Endpoint: POST /v1/audio/transcriptions
Base URL: https://beta.api.voiceos.com
Auth: none required for local, non-production development
Production: API keys, billing, rate limits, and public deployment are not enabled yet

The public API contract is designed to feel familiar to developers who already work with chat-style message arrays, while preserving VoiceOS-specific context-aware ASR behavior.

Minimal example

curl https://beta.api.voiceos.com/v1/audio/transcriptions \
  -F "file=@sample.mp3" \
  -F 'messages=[{"role":"system","content":"You are in a TypeScript codebase."},{"role":"user","content":"The file is SessionStreamService.ts and the env var is VOICEOS_TEAM_API_KEY."}]' \
  -F 'languages="en"' \
  -F 'response_format=json'

{
  "text": "Please open SessionStreamService.ts and update the VOICEOS_TEAM_API_KEY."
}
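
Once the context grows beyond a couple of messages, inline JSON inside a -F argument becomes hard to quote. The following is a minimal sketch, not an official client, that keeps the messages in a shell variable and passes them with curl's --form-string option. That option sends the value literally, so a leading @ or < or a ;type= sequence inside the message content is not interpreted by curl. The transcribe function and the MESSAGES variable are hypothetical names chosen for illustration; the endpoint and field names are copied from the minimal example. The optional dictionary field is omitted because its format is not described here.

```shell
# Conversation context as a JSON array, kept in a variable so it is easy to edit.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside.
MESSAGES=$(cat <<'EOF'
[
  {"role": "system", "content": "You are in a TypeScript codebase."},
  {"role": "user", "content": "The file is SessionStreamService.ts and the env var is VOICEOS_TEAM_API_KEY."}
]
EOF
)

# Hypothetical helper: send one audio file together with the context above.
# $1 is the path to the audio file.
transcribe() {
  curl --fail --silent --show-error \
    https://beta.api.voiceos.com/v1/audio/transcriptions \
    -F "file=@$1" \
    --form-string "messages=$MESSAGES" \
    -F 'languages="en"' \
    -F 'response_format=json'
}
```

Calling transcribe sample.mp3 would send the same request as the minimal example. The --fail flag makes curl exit with a non-zero status on an HTTP error, which is convenient in scripts.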

Quickstart: Send your first Context Aware ASR transcription request.
