Sarvam AI

Overview

SarvamTTSService provides text-to-speech synthesis specialized for Indian languages and voices. The service offers extensive voice customization options including pitch, pace, and loudness control, with support for multiple Indian languages and preprocessing for mixed-language content. The bulbul:v3-beta model adds temperature control and 25 new speaker voices.

Sarvam TTS API Reference

Pipecat’s API methods for Sarvam AI TTS integration

Example Implementation

Complete example with Indian language support

Sarvam Documentation

Official Sarvam AI text-to-speech API documentation

Sarvam Console

Access Indian language voices and API keys

Installation

To use Sarvam AI services, no additional dependencies are required beyond the base installation:

pip install "pipecat-ai"

Prerequisites

Sarvam AI Account Setup

Before using Sarvam AI TTS services, you need:

Sarvam AI Account: Sign up at Sarvam AI Console
API Key: Generate an API key from your account dashboard
Language Selection: Choose from available Indian language voices

Required Environment Variables

SARVAM_API_KEY: Your Sarvam AI API key for authentication
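
One way to catch a missing key before the service is constructed is to read the variable once at startup and fail with a clear message. This is only a sketch: the helper name `load_sarvam_api_key` is hypothetical and not part of Pipecat or the Sarvam SDK.

```python
import os


def load_sarvam_api_key(env=None):
    """Return the Sarvam API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("SARVAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SARVAM_API_KEY is not set; export it before creating the TTS service"
        )
    return key
```

The returned value can then be passed as `api_key=` instead of calling `os.getenv` inline, which would otherwise hand `None` to the service if the variable is unset.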

Configuration

Pipecat provides two Sarvam service implementations: SarvamTTSService (WebSocket) for real-time streaming and SarvamHttpTTSService (HTTP) for simpler batch synthesis.

SarvamTTSService

api_key

str

required

Sarvam AI API subscription key.

model

str

default:"bulbul:v2"

deprecated

TTS model to use. Options: bulbul:v2, bulbul:v3-beta, bulbul:v3. Deprecated in v0.0.105. Use settings=SarvamTTSService.Settings(model=...) instead.

voice_id

str

default:"None"

deprecated

Speaker voice ID. If None, uses the model-appropriate default (anushka for v2, shubh for v3). Deprecated in v0.0.105. Use settings=SarvamTTSService.Settings(voice=...) instead.

url

str

default:"wss://api.sarvam.ai/text-to-speech/ws"

WebSocket URL for the TTS backend.

text_aggregation_mode

TextAggregationMode

default:"TextAggregationMode.SENTENCE"

Controls how incoming text is aggregated before synthesis. SENTENCE (default) buffers text until sentence boundaries, producing more natural speech. TOKEN streams tokens directly for lower latency. Import from pipecat.services.tts_service.

aggregate_sentences

bool

default:"None"

deprecated

Deprecated in v0.0.104. Use text_aggregation_mode instead.

sample_rate

int

default:"None"

Audio sample rate in Hz (8000, 16000, 22050, 24000). If None, uses model-specific default (22050 for v2, 24000 for v3).

params

InputParams

default:"None"

deprecated

Deprecated in v0.0.105. Use settings=SarvamTTSService.Settings(...) instead.

settings

SarvamTTSService.Settings

default:"None"

Runtime-configurable settings. See SarvamTTSService Settings below.
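
The difference between the SENTENCE and TOKEN values of text_aggregation_mode can be illustrated with a small stand-alone generator. This is not Pipecat's aggregator: the function `aggregate`, the `mode` strings, and the sentence-end pattern are all assumptions made for illustration.

```python
import re

# Assumed sentence boundary: ., !, ? or the Devanagari danda (U+0964),
# followed by whitespace. Real aggregators may use different rules.
_SENTENCE_END = re.compile(r"[.!?\u0964]\s+")


def aggregate(tokens, mode="sentence"):
    """Yield the text chunks that would be sent for synthesis.

    mode="token" forwards each token unchanged (lower latency);
    mode="sentence" buffers tokens until a sentence boundary is seen.
    """
    if mode == "token":
        for tok in tokens:
            yield tok
        return

    buffer = ""
    for tok in tokens:
        buffer += tok
        # Emit every complete sentence currently held in the buffer.
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            yield buffer[: match.end()].strip()
            buffer = buffer[match.end():]
    # Flush whatever remains once the token stream ends.
    if buffer.strip():
        yield buffer.strip()
```

With sentence buffering the synthesizer receives whole sentences, which tends to give it more context for prosody; with token forwarding it receives fragments as soon as they arrive.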

SarvamHttpTTSService

api_key

str

required

Sarvam AI API subscription key.

aiohttp_session

aiohttp.ClientSession

required

An aiohttp session for HTTP requests.

model

str

default:"bulbul:v2"

deprecated

TTS model to use. Options: bulbul:v2, bulbul:v3-beta, bulbul:v3. Deprecated in v0.0.105. Use settings=SarvamHttpTTSService.Settings(model=...) instead.

voice_id

str

default:"None"

deprecated

Speaker voice ID. If None, uses the model-appropriate default. Deprecated in v0.0.105. Use settings=SarvamHttpTTSService.Settings(voice=...) instead.

base_url

str

default:"https://api.sarvam.ai"

Sarvam AI API base URL.

sample_rate

int

default:"None"

Audio sample rate in Hz (8000, 16000, 22050, 24000). If None, uses model-specific default.

params

InputParams

default:"None"

deprecated

Deprecated in v0.0.105. Use settings=SarvamHttpTTSService.Settings(...) instead.

settings

SarvamHttpTTSService.Settings

default:"None"

Runtime-configurable settings. See SarvamHttpTTSService Settings below.

SarvamTTSService Settings

Runtime-configurable settings passed via the settings constructor argument using SarvamTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`None`	Model identifier. (Inherited.)
`voice`	`str`	`None`	Voice identifier. (Inherited.)
`language`	`Language \| str`	`None`	Language for synthesis. (Inherited.)
`enable_preprocessing`	`bool`	`NOT_GIVEN`	Enable text preprocessing.
`pace`	`float`	`NOT_GIVEN`	Pace of speech.
`pitch`	`float`	`NOT_GIVEN`	Pitch of speech.
`loudness`	`float`	`NOT_GIVEN`	Loudness of speech.
`temperature`	`float`	`NOT_GIVEN`	Temperature for speech synthesis.
`min_buffer_size`	`int`	`NOT_GIVEN`	Minimum buffer size for WebSocket.
`max_chunk_length`	`int`	`NOT_GIVEN`	Maximum chunk length for WebSocket.
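
The NOT_GIVEN default in the table distinguishes "the caller did not set this field" from an explicit value such as `None` or `0`. The sketch below shows the general sentinel pattern under stated assumptions: `NOT_GIVEN`, `SketchSettings`, `given_fields`, and `apply_update` are hypothetical stand-ins defined here and are not Pipecat's own objects or update logic.

```python
from dataclasses import dataclass, fields


class _NotGiven:
    """Stand-in sentinel meaning "this field was not set by the caller"."""

    def __repr__(self):
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()  # hypothetical; not imported from Pipecat


@dataclass
class SketchSettings:
    """Hypothetical mirror of a few fields from the settings table."""

    pace: object = NOT_GIVEN
    pitch: object = NOT_GIVEN
    loudness: object = NOT_GIVEN
    temperature: object = NOT_GIVEN


def given_fields(settings):
    """Return only the fields the caller explicitly set."""
    return {
        f.name: getattr(settings, f.name)
        for f in fields(settings)
        if getattr(settings, f.name) is not NOT_GIVEN
    }


def apply_update(current, update):
    """Merge an update into current values, leaving unset fields untouched."""
    merged = dict(current)
    merged.update(given_fields(update))
    return merged
```

Under this pattern, a mid-conversation update that sets only `pace` would leave previously configured `pitch` or `loudness` values as they were.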

SarvamHttpTTSService Settings

Runtime-configurable settings passed via the settings constructor argument using SarvamHttpTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`None`	Model identifier. (Inherited.)
`voice`	`str`	`None`	Voice identifier. (Inherited.)
`language`	`Language \| str`	`None`	Language for synthesis. (Inherited.)
`enable_preprocessing`	`bool`	`NOT_GIVEN`	Enable text preprocessing.
`pace`	`float`	`NOT_GIVEN`	Pace of speech.
`pitch`	`float`	`NOT_GIVEN`	Pitch of speech.
`loudness`	`float`	`NOT_GIVEN`	Loudness of speech.
`temperature`	`float`	`NOT_GIVEN`	Temperature for speech synthesis.

Usage

Basic Setup (WebSocket)

import os

from pipecat.services.sarvam import SarvamTTSService
from pipecat.transcriptions.language import Language

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    settings=SarvamTTSService.Settings(
        voice="anushka",
        language=Language.HI,
    ),
)

With v3 Model and Temperature Control

import os

from pipecat.services.sarvam import SarvamTTSService
from pipecat.transcriptions.language import Language

tts = SarvamTTSService(
    api_key=os.getenv("SARVAM_API_KEY"),
    settings=SarvamTTSService.Settings(
        voice="aditya",
        model="bulbul:v3-beta",
        language=Language.HI,
        pace=1.2,
        temperature=0.8,
    ),
)

HTTP Service

import os

import aiohttp
from pipecat.services.sarvam import SarvamHttpTTSService
from pipecat.transcriptions.language import Language


async def main():
    # Keep the session open for as long as the service is in use.
    async with aiohttp.ClientSession() as session:
        tts = SarvamHttpTTSService(
            api_key=os.getenv("SARVAM_API_KEY"),
            aiohttp_session=session,
            settings=SarvamHttpTTSService.Settings(
                voice="anushka",
                language=Language.HI,
                pitch=0.1,
                pace=1.2,
                loudness=1.5,
            ),
        )
        # Build and run the pipeline here, inside the session context.

The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

Model differences: bulbul:v2 supports pitch and loudness control; bulbul:v3-beta and bulbul:v3 add temperature control but do not support pitch or loudness. Setting unsupported parameters for a model will log a warning.
Default speakers vary by model: v2 defaults to anushka; v3 models default to shubh.
Default sample rates vary by model: v2 defaults to 22050 Hz; v3 models default to 24000 Hz.
Indian language focus: Sarvam AI specializes in Indian languages, supporting Bengali, English (India), Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, and Telugu.
Pace ranges differ: bulbul:v2 supports pace from 0.3 to 3.0, while v3 models support 0.5 to 2.0. Values outside the range are clamped automatically.
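
The per-model rules in these notes can be collected into a small lookup. The sketch below is a hypothetical helper, not Pipecat's implementation: `ModelProfile`, `PROFILES`, and `resolve` are names invented here, and the values are transcribed from the notes above. The notes say unsupported parameters log a warning; dropping them from the result is this sketch's own choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    default_voice: str
    default_sample_rate: int
    pace_range: tuple
    unsupported: frozenset


_V3 = ModelProfile("shubh", 24000, (0.5, 2.0), frozenset({"pitch", "loudness"}))

# The notes only list unsupported parameters for the v3 models,
# so the v2 entry leaves that set empty.
PROFILES = {
    "bulbul:v2": ModelProfile("anushka", 22050, (0.3, 3.0), frozenset()),
    "bulbul:v3-beta": _V3,
    "bulbul:v3": _V3,
}


def resolve(model, voice=None, sample_rate=None, **params):
    """Fill in model defaults, clamp pace, and collect warning messages."""
    profile = PROFILES[model]
    messages = []
    resolved = {
        "model": model,
        "voice": voice or profile.default_voice,
        "sample_rate": sample_rate or profile.default_sample_rate,
    }
    for name, value in params.items():
        if name in profile.unsupported:
            messages.append(f"{name} is not supported by {model}")
            continue
        if name == "pace":
            low, high = profile.pace_range
            value = min(max(value, low), high)
        resolved[name] = value
    return resolved, messages
```

For example, `resolve("bulbul:v3", pace=2.5, pitch=0.1)` yields a pace clamped to 2.0, the shubh voice, a 24000 Hz sample rate, and one message about pitch.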

Event Handlers

Sarvam WebSocket TTS supports the standard service connection events:

Event	Description
`on_connected`	Connected to Sarvam WebSocket
`on_disconnected`	Disconnected from Sarvam WebSocket
`on_connection_error`	WebSocket connection error occurred

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Sarvam")
