Overview
SambaNovaSTTService provides speech-to-text capabilities using SambaNova’s hosted Whisper API with Voice Activity Detection (VAD) for optimized processing. It efficiently processes speech segments to deliver accurate transcription with SambaNova’s high-performance inference platform.
SambaNova STT API Reference
Pipecat’s API methods for SambaNova STT integration
Example Implementation
Complete example with function calling
SambaNova Documentation
Official SambaNova API documentation and features
SambaNova Cloud
Access API keys and Whisper models
Installation
To use SambaNova services, install the required dependency:

Prerequisites
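The dependency can be installed with pip. The `sambanova` extra name below is an assumption based on Pipecat's usual `pipecat-ai[<provider>]` extras convention; check the Pipecat installation docs for the exact name:

```shell
pip install "pipecat-ai[sambanova]"
```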
SambaNova Account Setup
Before using SambaNova STT services, you need:
- SambaNova Account: Sign up at SambaNova Cloud
- API Key: Generate an API key from your account dashboard
- Model Access: Ensure access to Whisper transcription models
Required Environment Variables
SAMBANOVA_API_KEY: Your SambaNova API key for authentication
Configuration
SambaNovaSTTService
- model: Whisper model to use for transcription. Deprecated in v0.0.105; use settings=SambaNovaSTTService.Settings(...) instead.
- api_key: SambaNova API key. Falls back to the SAMBANOVA_API_KEY environment variable.
- base_url: API base URL.
- language: Language of the audio input. Deprecated in v0.0.105; use settings=SambaNovaSTTService.Settings(...) instead.
- prompt: Optional text to guide the model's style or continue a previous segment. Deprecated in v0.0.105; use settings=SambaNovaSTTService.Settings(...) instead.
- temperature: Sampling temperature between 0 and 1. Lower values produce more deterministic results. Deprecated in v0.0.105; use settings=SambaNovaSTTService.Settings(...) instead.
- settings: Runtime-configurable settings for the STT service. See Settings below.
- Transcription latency override: P99 latency from speech end to final transcript in seconds. Override for your deployment. See stt-benchmark.
- Empty-frame passthrough: If true, allows empty TranscriptionFrame frames to be pushed downstream instead of discarding them. This is intended for situations where VAD fires even though the user did not speak; in these cases, it is useful to know that nothing was transcribed so that the agent can resume speaking instead of waiting longer for a transcription.
Settings
Runtime-configurable settings passed via the settings constructor argument using SambaNovaSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | "Whisper-Large-v3" | Whisper model to use. (Inherited from base STT settings.) |
| language | Language \| str | Language.EN | Language of the audio input. (Inherited from base STT settings.) |
| prompt | str | None | Optional text to guide the model's style or continue a previous segment. |
| temperature | float | None | Sampling temperature between 0 and 1. |
Usage
Basic Setup
With Custom Configuration
Notes
- Segmented transcription: SambaNovaSTTService extends SegmentedSTTService (via BaseWhisperSTTService), processing complete audio segments after VAD detects that the user has stopped speaking.
- Whisper API compatible: Uses the OpenAI-compatible Whisper API interface hosted on SambaNova’s infrastructure.
- Probability metrics not supported: SambaNova’s Whisper API does not support probability metrics. The include_prob_metrics parameter has no effect.