Overview
ElevenLabs provides two STT service implementations:

- `ElevenLabsSTTService` (HTTP) — File-based transcription using ElevenLabs’ Speech-to-Text API with segmented audio processing. Uploads audio files and receives transcription results directly.
- `ElevenLabsRealtimeSTTService` (WebSocket) — Real-time streaming transcription with ultra-low latency, supporting both partial (interim) and committed (final) transcripts with manual or VAD-based commit strategies.
ElevenLabs STT API Reference
Pipecat’s API methods for ElevenLabs STT integration
Example Implementation
Complete example with ElevenLabs STT and TTS
ElevenLabs Documentation
Official ElevenLabs STT API documentation
ElevenLabs Platform
Access API keys and speech-to-text models
Installation
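The install command is not shown on this page; a minimal sketch, assuming Pipecat’s standard extras naming (`elevenlabs`):

```shell
# The "elevenlabs" extra name follows Pipecat's per-vendor convention.
pip install "pipecat-ai[elevenlabs]"
```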
To use ElevenLabs STT services, install the required dependencies.

Prerequisites
ElevenLabs Account Setup
Before using ElevenLabs STT services, you need:
- ElevenLabs Account: Sign up at ElevenLabs Platform
- API Key: Generate an API key from your account dashboard
- Model Access: Ensure access to the Scribe v2 transcription model (default: `scribe_v2`)
Required Environment Variables
ELEVENLABS_API_KEY: Your ElevenLabs API key for authentication
ElevenLabsSTTService
Constructor parameters:

- `api_key`: ElevenLabs API key for authentication.
- `aiohttp_session`: An aiohttp session for HTTP requests. You must create and manage this yourself.
- `base_url`: Base URL for the ElevenLabs API.
- `model`: Model ID for transcription. Deprecated in v0.0.105; use `settings=ElevenLabsSTTService.Settings(...)` instead.
- `sample_rate`: Audio sample rate in Hz. When `None`, uses the pipeline’s configured sample rate.
- `settings`: Runtime-configurable settings for the STT service. See Settings below.
- `params`: Configuration parameters for the STT service. Deprecated in v0.0.105; use `settings=ElevenLabsSTTService.Settings(...)` instead.
- Latency override: P99 latency from speech end to final transcript in seconds. Override for your deployment.
Settings
Runtime-configurable settings passed via the `settings` constructor argument using `ElevenLabsSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `None` | Model ID for transcription. (Inherited from base STT settings.) |
| `language` | `Language \| str` | `None` | Target language for transcription. (Inherited from base STT settings.) |
| `tag_audio_events` | `bool` | `True` | Include audio events like `(laughter)`, `(coughing)` in transcription. |
Usage
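A minimal construction sketch; the import path `pipecat.services.elevenlabs.stt` is assumed from Pipecat’s per-vendor service layout:

```python
import asyncio
import os

import aiohttp

# Import path assumed from Pipecat's per-vendor service layout.
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService


async def main():
    # You create and manage the aiohttp session yourself.
    async with aiohttp.ClientSession() as session:
        stt = ElevenLabsSTTService(
            api_key=os.getenv("ELEVENLABS_API_KEY"),
            aiohttp_session=session,
        )
        # Place `stt` between your transport input and context
        # aggregator in a Pipeline(...).


if __name__ == "__main__":
    asyncio.run(main())
```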
With Language and Audio Events
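A sketch of the settings-based configuration. The `Language` import path is taken from Pipecat’s `pipecat.transcriptions.language` module; French is an arbitrary example:

```python
import os

import aiohttp

from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
from pipecat.transcriptions.language import Language


async def build_stt(session: aiohttp.ClientSession) -> ElevenLabsSTTService:
    return ElevenLabsSTTService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        aiohttp_session=session,
        settings=ElevenLabsSTTService.Settings(
            language=Language.FR,    # transcribe French
            tag_audio_events=False,  # drop (laughter), (coughing) tags
        ),
    )
```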
Notes
- The HTTP service uploads complete audio segments and is best for VAD-segmented transcription.
- Does not have connection events since it uses per-request HTTP calls.
ElevenLabsRealtimeSTTService
Constructor parameters:

- `api_key`: ElevenLabs API key for authentication.
- `base_url`: Base URL for the ElevenLabs WebSocket API.
- `model`: Model ID for real-time transcription. Deprecated in v0.0.105; use `settings=ElevenLabsRealtimeSTTService.Settings(...)` instead.
- `sample_rate`: Audio sample rate in Hz. When `None`, uses the pipeline’s configured sample rate.
- `settings`: Runtime-configurable settings for the Realtime STT service. See Settings below.
- `commit_strategy`: How to segment speech. `CommitStrategy.MANUAL` uses Pipecat’s VAD to control when transcript segments are committed; `CommitStrategy.VAD` uses ElevenLabs’ built-in VAD for segment boundaries.
- Word timestamps: Whether to include word-level timestamps in transcripts.
- Logging: Whether to enable logging on ElevenLabs’ side.
- Language detection: Whether to include language detection in transcripts.
- `params`: Configuration parameters for the STT service. Deprecated in v0.0.105; use `settings=ElevenLabsRealtimeSTTService.Settings(...)` instead.
- Latency override: P99 latency from speech end to final transcript in seconds. Override for your deployment.
Settings
Runtime-configurable settings passed via the `settings` constructor argument using `ElevenLabsRealtimeSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `None` | Model ID for transcription. (Inherited from base STT settings.) |
| `language` | `Language \| str` | `None` | Language for speech recognition. (Inherited from base STT settings.) |
| `vad_silence_threshold_secs` | `float` | `None` | Seconds of silence before VAD commits (0.3–3.0). Only used with VAD commit strategy. |
| `vad_threshold` | `float` | `None` | VAD sensitivity (0.1–0.9, lower is more sensitive). Only used with VAD commit strategy. |
| `min_speech_duration_ms` | `int` | `None` | Minimum speech duration for VAD (50–2000 ms). Only used with VAD commit strategy. |
| `min_silence_duration_ms` | `int` | `None` | Minimum silence duration for VAD (50–2000 ms). Only used with VAD commit strategy. |
Usage
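A minimal construction sketch; the import path is assumed from Pipecat’s per-vendor service layout:

```python
import os

# Import path assumed from Pipecat's per-vendor service layout.
from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
)
# Defaults to the manual commit strategy: Pipecat's VAD decides
# when transcript segments are committed.
```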
With Timestamps and Custom Commit Strategy
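A sketch combining the VAD commit strategy with ElevenLabs-side VAD tuning. The `commit_strategy` name comes from the notes below, but the `CommitStrategy` import path and the `timestamps` flag name are assumptions, not confirmed API:

```python
import os

# CommitStrategy import path is an assumption; check your Pipecat version.
from pipecat.services.elevenlabs.stt import (
    CommitStrategy,
    ElevenLabsRealtimeSTTService,
)

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    commit_strategy=CommitStrategy.VAD,  # let ElevenLabs segment speech
    timestamps=True,  # hypothetical flag name for word-level timestamps
    settings=ElevenLabsRealtimeSTTService.Settings(
        vad_silence_threshold_secs=0.5,  # commit after 0.5 s of silence
        vad_threshold=0.4,               # lower = more sensitive VAD
    ),
)
```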
Notes
- Commit strategies: Defaults to the manual commit strategy, where Pipecat’s VAD controls when transcription segments are committed. Set `commit_strategy=CommitStrategy.VAD` to let ElevenLabs handle segment boundaries. When using the `MANUAL` commit strategy, transcription frames are marked as finalized (`TranscriptionFrame.finalized=True`).
- Keepalive: Sends silent audio chunks as keepalive to prevent idle disconnections (keepalive interval: 5s, timeout: 10s).
- Auto-reconnect: Automatically reconnects if the WebSocket connection is closed when new audio arrives.
Event Handlers
Supports the standard service connection events:

| Event | Description |
|---|---|
| `on_connected` | Connected to ElevenLabs Realtime STT WebSocket |
| `on_disconnected` | Disconnected from ElevenLabs Realtime STT WebSocket |
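These events can be observed with Pipecat’s standard `event_handler` decorator; the service import path is assumed as above:

```python
import os

from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService

stt = ElevenLabsRealtimeSTTService(api_key=os.getenv("ELEVENLABS_API_KEY"))


@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to ElevenLabs Realtime STT")


@stt.event_handler("on_disconnected")
async def on_disconnected(service):
    # The service auto-reconnects when new audio arrives.
    print("Disconnected from ElevenLabs Realtime STT")
```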