Overview
FishAudioTTSService provides real-time text-to-speech synthesis through Fish Audio’s WebSocket-based streaming API. The service supports custom voice models, prosody controls, and multiple audio output formats, and is designed for low-latency conversational AI applications.
Fish TTS API Reference
Pipecat’s API methods for Fish Audio TTS integration
Example Implementation
Complete example with custom voice model
Fish Audio Documentation
Official Fish Audio documentation
Voice Models
Create and manage custom voice models
Installation
To use Fish Audio services, install the required dependencies.
Prerequisites
Fish Audio Account Setup
Before using Fish Audio TTS services, you need:
- Fish Audio Account: Sign up at Fish Audio Console
- API Key: Generate an API key from your account dashboard
- Voice Models: Create or select custom voice models for synthesis
Required Environment Variables
FISH_API_KEY: Your Fish Audio API key for authentication
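A minimal sketch of reading the key at startup, assuming it has already been exported in the shell or loaded from a .env file. The helper name load_fish_api_key is hypothetical and not part of Pipecat.

```python
import os


def load_fish_api_key() -> str:
    """Return the Fish Audio API key from the environment.

    Hypothetical helper, not part of Pipecat; it only reads FISH_API_KEY.
    """
    api_key = os.environ.get("FISH_API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message instead of a later auth error.
        raise RuntimeError("FISH_API_KEY is not set")
    return api_key
```

Failing at startup keeps a missing key from surfacing later as a WebSocket authentication error.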
Configuration
FishAudioTTSService
- Fish Audio API key for authentication.
- Reference ID of the voice model to use for synthesis. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(voice=...) instead.
- Fish Audio TTS model to use. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(...) instead.
- Audio output format. Options: "pcm", "opus", "mp3", "wav".
- Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
- Runtime-configurable voice settings. See InputParams below. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(...) instead.
- Runtime-configurable settings. See Settings below.
Settings
Runtime-configurable settings passed via the settings constructor argument using FishAudioTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | None | Model identifier. (Inherited.) |
| voice | str | None | Voice identifier. (Inherited.) |
| language | Language \| str | None | Language for synthesis. (Inherited.) |
| latency | str | NOT_GIVEN | Latency mode setting. |
| normalize | bool | NOT_GIVEN | Whether to normalize audio. |
| temperature | float | NOT_GIVEN | Temperature for sampling. |
| top_p | float | NOT_GIVEN | Top-p sampling parameter. |
| prosody_speed | float | NOT_GIVEN | Prosody speed control. |
| prosody_volume | int | NOT_GIVEN | Prosody volume control. |
Usage
Basic Setup
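A minimal setup sketch based on the constructor arguments described under Configuration. The import path pipecat.services.fish.tts, the api_key keyword name, and the placeholder voice ID are assumptions; check them against your installed Pipecat version.

```python
import os


def create_fish_tts():
    """Build a FishAudioTTSService that uses a custom voice model."""
    # Imported lazily so this module still loads where Pipecat is absent.
    # Assumption: the service lives at pipecat.services.fish.tts.
    from pipecat.services.fish.tts import FishAudioTTSService

    return FishAudioTTSService(
        api_key=os.environ["FISH_API_KEY"],  # assumed keyword name
        settings=FishAudioTTSService.Settings(
            voice="your-voice-model-id",  # placeholder reference ID
        ),
    )
```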
With Prosody Controls
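A sketch of passing the prosody fields from the Settings table at construction time. The numeric values are illustrative only (valid ranges are not stated on this page), and the import path and api_key keyword are assumptions to verify against your Pipecat version.

```python
import os

# Illustrative values; prosody_speed is a float and prosody_volume an int,
# matching the types listed in the Settings table.
PROSODY = {"prosody_speed": 1.2, "prosody_volume": 0}


def create_fish_tts_with_prosody():
    """Build a FishAudioTTSService with prosody settings applied."""
    # Assumption: the service lives at pipecat.services.fish.tts.
    from pipecat.services.fish.tts import FishAudioTTSService

    return FishAudioTTSService(
        api_key=os.environ["FISH_API_KEY"],  # assumed keyword name
        settings=FishAudioTTSService.Settings(
            voice="your-voice-model-id",  # placeholder reference ID
            **PROSODY,
        ),
    )
```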
Notes
- voice required: You must specify either voice (preferred) or the deprecated model or reference_id parameter. Passing both raises a ValueError.
- Model switching: Changing the model via set_model() automatically disconnects and reconnects the WebSocket with the new model configuration.
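The voice rule can be pictured with a small stand-in function. This is an illustration of the documented behavior, not Pipecat's actual code; resolve_voice is a hypothetical name, the deprecated model argument is left out for brevity, and raising ValueError when nothing is supplied is an assumption.

```python
from typing import Optional


def resolve_voice(
    voice: Optional[str] = None,
    reference_id: Optional[str] = None,
) -> str:
    """Illustrative stand-in for the voice rule; not Pipecat's code."""
    if voice is not None and reference_id is not None:
        # Documented behavior: supplying both is an error.
        raise ValueError("Pass either voice or reference_id, not both")
    chosen = voice if voice is not None else reference_id
    if chosen is None:
        # Assumption: a missing voice is also reported as ValueError.
        raise ValueError("A voice is required")
    return chosen
```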
Event Handlers
Fish Audio TTS supports the standard service connection events:
| Event | Description |
|---|---|
| on_connected | Connected to Fish Audio WebSocket |
| on_disconnected | Disconnected from Fish Audio WebSocket |
| on_connection_error | WebSocket connection error occurred |
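A sketch of registering handlers for these events, assuming the service exposes an event_handler(name) decorator as Pipecat services generally do. The handler signatures (the service as first argument, plus an error for on_connection_error) are assumptions, and register_connection_handlers is a hypothetical helper name.

```python
import logging

logger = logging.getLogger("fish_tts_events")


def register_connection_handlers(tts):
    """Attach logging handlers for the three connection events.

    Assumes `tts` exposes an `event_handler(name)` decorator; the handler
    signatures below are assumptions, not confirmed by this page.
    """

    @tts.event_handler("on_connected")
    async def on_connected(service):
        logger.info("Connected to Fish Audio WebSocket")

    @tts.event_handler("on_disconnected")
    async def on_disconnected(service):
        logger.info("Disconnected from Fish Audio WebSocket")

    @tts.event_handler("on_connection_error")
    async def on_connection_error(service, error):
        logger.error("Fish Audio connection error: %s", error)
```

Call the helper once after constructing the service and before starting the pipeline, so connection problems are logged from the first attempt.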