Overview

SambaNovaSTTService provides speech-to-text using SambaNova’s hosted Whisper API. It relies on Voice Activity Detection (VAD) to delimit speech segments and transcribes each complete segment once the user has stopped speaking, running inference on SambaNova’s platform.

Installation

To use SambaNova services, install the required dependency:
pip install "pipecat-ai[sambanova]"

Prerequisites

SambaNova Account Setup

Before using SambaNova STT services, you need:
  1. SambaNova Account: Sign up at SambaNova Cloud
  2. API Key: Generate an API key from your account dashboard
  3. Model Access: Ensure access to Whisper transcription models

Required Environment Variables

  • SAMBANOVA_API_KEY: Your SambaNova API key for authentication
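
A missing key otherwise tends to surface only when the first transcription request fails, so it can help to check for it at startup. The following is a minimal sketch using only the standard library; `load_sambanova_api_key` is a hypothetical helper introduced here for illustration and is not part of Pipecat.

```python
import os
from typing import Mapping, Optional


def load_sambanova_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the SambaNova API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    # Default to the process environment; accept a mapping to simplify testing.
    source = os.environ if env is None else env
    key = source.get("SAMBANOVA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SAMBANOVA_API_KEY is not set; export it before starting the pipeline."
        )
    return key
```

The returned value can then be passed as the api_key constructor argument described under Configuration.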

Configuration

SambaNovaSTTService

  • model (str, default: "Whisper-Large-v3", deprecated): Whisper model to use for transcription. Deprecated in v0.0.105. Use settings=SambaNovaSTTService.Settings(...) instead.
  • api_key (str, default: None): SambaNova API key. Falls back to the SAMBANOVA_API_KEY environment variable.
  • base_url (str, default: "https://api.sambanova.ai/v1"): API base URL.
  • language (Language, default: Language.EN, deprecated): Language of the audio input. Deprecated in v0.0.105. Use settings=SambaNovaSTTService.Settings(...) instead.
  • prompt (str, default: None, deprecated): Optional text to guide the model’s style or continue a previous segment. Deprecated in v0.0.105. Use settings=SambaNovaSTTService.Settings(...) instead.
  • temperature (float, default: None, deprecated): Sampling temperature between 0 and 1. Lower values produce more deterministic results. Deprecated in v0.0.105. Use settings=SambaNovaSTTService.Settings(...) instead.
  • settings (SambaNovaSTTService.Settings, default: None): Runtime-configurable settings for the STT service. See Settings below.
  • ttfs_p99_latency (float, default: SAMBANOVA_TTFS_P99): P99 latency from speech end to final transcript, in seconds. Override for your deployment. See stt-benchmark.
  • push_empty_transcripts (bool, default: False): If true, empty TranscriptionFrame frames are pushed downstream instead of being discarded. This is intended for cases where VAD fires even though the user did not speak. Knowing that nothing was transcribed lets the agent resume speaking instead of waiting longer for a transcription.
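
If you override ttfs_p99_latency, the value has to come from measurements of your own deployment. The sketch below shows one way to derive a P99 from a list of latency samples (seconds from end of speech to final transcript) using the nearest-rank method. The function name `p99_latency` and the choice of nearest-rank are assumptions made for illustration; the stt-benchmark tool may compute its percentile differently.

```python
from typing import Sequence


def p99_latency(samples_s: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of latency samples, in seconds.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    # Nearest-rank: the smallest sample with at least 99% of samples at or below it.
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]
```

With few samples the nearest-rank P99 is simply the slowest observation, so a larger sample gives a more meaningful value to pass as ttfs_p99_latency.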

Settings

Runtime-configurable settings passed via the settings constructor argument using SambaNovaSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.
Parameter    Type            Default             Description
model        str             "Whisper-Large-v3"  Whisper model to use. (Inherited from base STT settings.)
language     Language | str  Language.EN         Language of the audio input. (Inherited from base STT settings.)
prompt       str             None                Optional text to guide the model’s style or continue a previous segment.
temperature  float           None                Sampling temperature between 0 and 1.

Usage

Basic Setup

import os

from pipecat.services.sambanova.stt import SambaNovaSTTService

stt = SambaNovaSTTService(
    api_key=os.getenv("SAMBANOVA_API_KEY"),
)

With Custom Configuration

import os

from pipecat.services.sambanova.stt import SambaNovaSTTService
from pipecat.transcriptions.language import Language

stt = SambaNovaSTTService(
    api_key=os.getenv("SAMBANOVA_API_KEY"),
    settings=SambaNovaSTTService.Settings(
        model="Whisper-Large-v3",
        language=Language.ES,
        prompt="Transcribe the following conversation about technology.",
        temperature=0.0,
    ),
)
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • Segmented transcription: SambaNovaSTTService extends SegmentedSTTService (via BaseWhisperSTTService), processing complete audio segments after VAD detects the user has stopped speaking.
  • Whisper API compatible: Uses the OpenAI-compatible Whisper API interface hosted on SambaNova’s infrastructure.
  • Probability metrics not supported: SambaNova’s Whisper API does not support probability metrics. The include_prob_metrics parameter has no effect.
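
To make the segmented behavior concrete, here is a toy sketch of the idea: audio is collected while the user is speaking and released as one complete segment when VAD reports that speech has stopped. `SegmentBuffer` and its method names are hypothetical and are not Pipecat's implementation of SegmentedSTTService, which may handle additional details; the sketch only illustrates why one transcription request is made per utterance rather than per audio chunk.

```python
from typing import List, Optional


class SegmentBuffer:
    """Toy buffer: collect audio while the user speaks, release it when they stop."""

    def __init__(self) -> None:
        self._chunks: List[bytes] = []
        self._speaking = False

    def on_speech_started(self) -> None:
        # Start a fresh segment when VAD reports speech.
        self._chunks.clear()
        self._speaking = True

    def on_audio(self, chunk: bytes) -> None:
        # Audio outside a speech segment is ignored in this sketch.
        if self._speaking:
            self._chunks.append(chunk)

    def on_speech_stopped(self) -> Optional[bytes]:
        """Return the complete segment, or None if no audio was captured."""
        self._speaking = False
        segment = b"".join(self._chunks)
        self._chunks.clear()
        return segment or None


buffer = SegmentBuffer()
buffer.on_speech_started()
buffer.on_audio(b"\x00\x01")
buffer.on_audio(b"\x02\x03")
# One complete segment, ready to be sent in a single transcription request.
segment = buffer.on_speech_stopped()
```

A consequence of this design is that no transcript is available until the user stops speaking, which is why the end-of-speech latency setting (ttfs_p99_latency) matters for this service.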