
Overview

Azure Cognitive Services provides high-quality text-to-speech synthesis with two service implementations: AzureTTSService (WebSocket-based) for real-time streaming with low latency, and AzureHttpTTSService (HTTP-based) for batch synthesis. AzureTTSService is recommended for interactive applications requiring streaming capabilities.

Installation

To use Azure services, install the required dependencies:
pip install "pipecat-ai[azure]"

Prerequisites

Azure Account Setup

Before using Azure TTS services, you need:
  1. Azure Account: Sign up at Azure Portal
  2. Speech Service: Create a Speech resource in your Azure subscription
  3. API Key and Region: Get your subscription key and service region
  4. Voice Selection: Choose from available voices in the Voice Gallery

Required Environment Variables

  • AZURE_SPEECH_API_KEY: Your Azure Speech service API key
  • AZURE_SPEECH_REGION: Your Azure Speech service region (e.g., "eastus")
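
These can be exported in the shell before launching your app (the values below are placeholders; substitute your own key and region):

```shell
# Placeholder credentials -- replace with your own subscription key and region
export AZURE_SPEECH_API_KEY="your-subscription-key"
export AZURE_SPEECH_REGION="eastus"
```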

Configuration

AzureTTSService

api_key
str
required
Azure Cognitive Services subscription key.
region
str
required
Azure region identifier (e.g., "eastus", "westus2").
voice
str
default:"en-US-SaraNeural"
deprecated
Voice name to use for synthesis. Deprecated in v0.0.105. Use settings=AzureTTSService.Settings(voice=...) instead.
sample_rate
int
default:"None"
Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
text_aggregation_mode
TextAggregationMode
default:"TextAggregationMode.SENTENCE"
Controls how incoming text is aggregated before synthesis. SENTENCE (default) buffers text until sentence boundaries, producing more natural speech. TOKEN streams tokens directly for lower latency. Import from pipecat.services.tts_service.
aggregate_sentences
bool
default:"None"
deprecated
Deprecated in v0.0.104. Use text_aggregation_mode instead.
params
InputParams
default:"None"
deprecated
Deprecated in v0.0.105. Use settings=AzureTTSService.Settings(...) instead.
settings
AzureTTSService.Settings
default:"None"
Runtime-configurable settings. See Settings below.
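
The difference between the two aggregation modes can be sketched in plain Python. This is a simplified illustration, not pipecat's actual aggregator: SENTENCE-style behavior buffers streamed tokens until a sentence boundary before handing text to the synthesizer, while TOKEN-style behavior would forward each chunk immediately.

```python
# Simplified sketch of SENTENCE-style aggregation: buffer streamed text
# chunks and emit complete sentences at punctuation boundaries.
# (Illustrative only -- pipecat's real aggregator handles more cases.)

SENTENCE_END = (".", "!", "?")

def aggregate_sentences(chunks):
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Emit everything up to (and including) the last sentence boundary.
        last = max(buffer.rfind(p) for p in SENTENCE_END)
        if last != -1:
            yield buffer[: last + 1].strip()
            buffer = buffer[last + 1 :]
    if buffer.strip():
        yield buffer.strip()  # flush any trailing partial sentence

chunks = ["Hello the", "re! How are", " you today?"]
print(list(aggregate_sentences(chunks)))
```

Buffering to sentence boundaries trades a little latency for natural prosody, which is why SENTENCE is the default.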

AzureHttpTTSService

The HTTP service accepts the same parameters as the streaming service except text_aggregation_mode and aggregate_sentences:
api_key
str
required
Azure Cognitive Services subscription key.
region
str
required
Azure region identifier.
voice
str
default:"en-US-SaraNeural"
deprecated
Voice name to use for synthesis. Deprecated in v0.0.105. Use settings=AzureHttpTTSService.Settings(voice=...) instead.
sample_rate
int
default:"None"
Output audio sample rate in Hz.
params
InputParams
default:"None"
deprecated
Deprecated in v0.0.105. Use settings=AzureHttpTTSService.Settings(...) instead.
settings
AzureHttpTTSService.Settings
default:"None"
Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using AzureTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
model
str
default:"None"
Model identifier. (Inherited.)
voice
str
default:"None"
Voice identifier. (Inherited.)
language
Language | str
default:"None"
Language for synthesis. (Inherited.)
emphasis
str
default:"NOT_GIVEN"
Emphasis level for SSML.
pitch
str
default:"NOT_GIVEN"
Pitch adjustment.
rate
str
default:"NOT_GIVEN"
Speaking rate.
role
str
default:"NOT_GIVEN"
Role for SSML.
style
str
default:"NOT_GIVEN"
Speaking style.
style_degree
str
default:"NOT_GIVEN"
Degree of the speaking style.
volume
str
default:"NOT_GIVEN"
Volume level.
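
The mid-conversation update behavior can be illustrated with a small stand-in. The class below is a hypothetical stand-in for pipecat's TTSUpdateSettingsFrame, showing the expected merge semantics: a partial settings payload overrides only the keys it carries.

```python
from dataclasses import dataclass, field

@dataclass
class UpdateSettingsFrame:
    """Hypothetical stand-in for pipecat's TTSUpdateSettingsFrame."""
    settings: dict = field(default_factory=dict)

def apply_update(current: dict, frame: UpdateSettingsFrame) -> dict:
    # Keys present in the frame override the current values; everything
    # else is left untouched.
    merged = dict(current)
    merged.update(frame.settings)
    return merged

current = {"voice": "en-US-SaraNeural", "style": None, "rate": None}
updated = apply_update(current, UpdateSettingsFrame({"style": "cheerful", "rate": "1.1"}))
```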

Usage

Basic Setup

import os

from pipecat.services.azure import AzureTTSService

tts = AzureTTSService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region=os.getenv("AZURE_SPEECH_REGION"),
    settings=AzureTTSService.Settings(
        voice="en-US-SaraNeural",
    ),
)

With Voice Customization

import os

from pipecat.services.azure import AzureTTSService
from pipecat.transcriptions.language import Language

tts = AzureTTSService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region="eastus",
    settings=AzureTTSService.Settings(
        voice="en-US-JennyMultilingualNeural",
        language=Language.EN_US,
        style="cheerful",
        style_degree="1.5",
        rate="1.1",
    ),
)

HTTP Service

import os

from pipecat.services.azure import AzureHttpTTSService

tts = AzureHttpTTSService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region=os.getenv("AZURE_SPEECH_REGION"),
    settings=AzureHttpTTSService.Settings(
        voice="en-US-SaraNeural",
    ),
)
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • Streaming vs HTTP: The streaming service (AzureTTSService) provides word-level timestamps and lower latency, making it better for interactive conversations. The HTTP service is simpler but returns the complete audio at once.
  • SSML support: Both services automatically construct SSML from the Settings. Special characters in text are automatically escaped.
  • Word timestamps: AzureTTSService supports word-level timestamps for synchronized text display. CJK languages receive special handling to merge individual characters into meaningful word units.
  • 8kHz workaround: At 8kHz sample rates, Azure’s reported audio duration may not match word boundary offsets. The service uses word boundary offsets for timing in this case.
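
The SSML note above can be sketched as a minimal construction. This is a hedged approximation, not pipecat's actual SSML builder: settings map onto prosody and mstts:express-as attributes, and user text is XML-escaped so it cannot break the document.

```python
from xml.sax.saxutils import escape

def build_ssml(text, voice, style=None, rate=None, pitch=None):
    # Escape <, >, & so user text cannot break the SSML document.
    body = escape(text)
    if rate or pitch:
        attrs = ""
        if rate:
            attrs += f' rate="{rate}"'
        if pitch:
            attrs += f' pitch="{pitch}"'
        body = f"<prosody{attrs}>{body}</prosody>"
    if style:
        # Speaking styles use Azure's mstts extension namespace.
        body = f'<mstts:express-as style="{style}">{body}</mstts:express-as>'
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f'<voice name="{voice}">{body}</voice></speak>'
    )

ssml = build_ssml("Profit < loss & growth", "en-US-SaraNeural",
                  style="cheerful", rate="1.1")
```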