The VoiceOS API turns speech into polished, context-aware text. Unlike a plain ASR endpoint, it accepts both audio and an OpenAI-style message stack, then uses that context to recognize rare terms and correct likely transcription errors.

Promptable ASR

Send conversation context with audio so the API understands product names, files, identifiers, and domain language.

Fast keyword extraction

Extract ASR vocabulary locally in under 10 ms without an LLM or network call.

OpenAI-like request shape

Use multipart uploads with file, messages, languages, and optional vocabulary.

Polished transcripts

Get raw_text from the ASR pass and the final text after LLM polish.
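
As a rough illustration of what local vocabulary extraction could look like, the sketch below pulls identifier-like tokens (file names, CamelCase symbols, SCREAMING_SNAKE_CASE constants) out of a message stack using regular expressions only. The function name extract_vocabulary and the patterns are assumptions made for this example. They are not the VoiceOS extractor, whose implementation is not described on this page.

```python
import re

# Hypothetical patterns for identifier-like tokens; not the VoiceOS implementation.
_PATTERNS = [
    re.compile(r"\b[\w-]+\.[A-Za-z]{1,5}\b"),              # file names, e.g. TestDashboardApp.tsx
    re.compile(r"\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b"),      # SCREAMING_SNAKE_CASE constants
    re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b"),  # CamelCase symbols
]


def extract_vocabulary(messages):
    """Return unique identifier-like terms from message contents, in first-seen order."""
    seen = {}  # dict keys preserve insertion order, so this also deduplicates
    for message in messages:
        content = message.get("content", "")
        for pattern in _PATTERNS:
            for match in pattern.finditer(content):
                seen.setdefault(match.group(0), None)
    return list(seen)
```

A list produced this way could be supplied through the optional vocabulary field. The exact encoding that field expects is not specified here, so treat the wire format as something to confirm against the API.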

What you can build

  • Voice agents that understand the active conversation.
  • Developer tools that recognize filenames, symbols, and code identifiers.
  • Meeting or support transcription with product-specific vocabulary.
  • Dictation experiences that return clean, polished text instead of raw ASR.

Current status

The current implementation is a local developer prototype:
  • Endpoint: POST /v1/audio/transcriptions
  • Local base URL: http://localhost:3000
  • Auth: none required for local, non-production development
  • Production API keys, billing, rate limits, and public deployment are not enabled yet

The public API contract is designed to look familiar to developers who use OpenAI-style message arrays, while preserving VoiceOS-specific promptable ASR behavior.

Minimal example

curl http://localhost:3000/v1/audio/transcriptions \
  -F "file=@sample.mp3" \
  -F 'messages=[{"role":"system","content":"You are in a TypeScript codebase."},{"role":"user","content":"The file is TestDashboardApp.tsx and the env var is CEREBRAS_API_KEY."}]' \
  -F 'languages="en"' \
  -F 'response_format=json'

Example response:

{
  "text": "Please open TestDashboardApp.tsx and update the CEREBRAS_API_KEY.",
  "raw_text": "Please open test dashboard app dot T S X and update the Cerebras API key."
}
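
The same request can be issued from code. The following is a minimal sketch in Python using only the standard library, assuming the local prototype is running at the base URL and endpoint listed under Current status. The helper names build_multipart and transcribe are hypothetical, and sending languages as a JSON-encoded string is an assumption inferred from the quoted "en" in the curl command.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:3000"      # local base URL from the status section
ENDPOINT = "/v1/audio/transcriptions"   # endpoint from the status section


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one part named "file" as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\nContent-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path, messages, languages="en"):
    """POST audio and conversation context; return the parsed JSON response."""
    with open(audio_path, "rb") as audio_file:
        audio = audio_file.read()
    fields = {
        "messages": json.dumps(messages),
        # Assumption: mirrors the curl example, which sends "en" with its quotes.
        "languages": json.dumps(languages),
        "response_format": "json",
    }
    body, content_type = build_multipart(fields, os.path.basename(audio_path), audio)
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the prototype running, calling transcribe("sample.mp3", messages) with the message list from the curl command should return a dictionary with the text and raw_text keys shown in the example response. No Authorization header is set because the local prototype does not require one.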