
Overview

FishAudioTTSService provides real-time text-to-speech synthesis through Fish Audio’s WebSocket-based streaming API. It supports custom voice models, prosody controls, and multiple audio output formats, and is intended for low-latency conversational AI applications.

Installation

To use Fish Audio services, install the required dependencies:
pip install "pipecat-ai[fish]"

Prerequisites

Fish Audio Account Setup

Before using Fish Audio TTS services, you need:
  1. Fish Audio Account: Sign up at Fish Audio Console
  2. API Key: Generate an API key from your account dashboard
  3. Voice Models: Create or select custom voice models for synthesis

Required Environment Variables

  • FISH_API_KEY: Your Fish Audio API key for authentication
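
As a minimal sketch of reading this variable, the helper below fails early with a clear message when the key is missing, instead of passing None to the service. The function name load_fish_api_key and the error text are hypothetical and not part of Pipecat.

```python
import os


def load_fish_api_key(env=os.environ):
    """Return the Fish Audio API key, or raise if it is missing or blank.

    Hypothetical helper for illustration; not provided by Pipecat.
    """
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FISH_API_KEY is not set; export it before starting the pipeline"
        )
    return key
```

The returned value can then be passed as api_key when constructing the service, in place of a bare os.getenv("FISH_API_KEY") call.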

Configuration

FishAudioTTSService

api_key
str
required
Fish Audio API key for authentication.
reference_id
str
default:"None"
deprecated
Reference ID of the voice model to use for synthesis. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(voice=...) instead.
model_id
str
default:"s2-pro"
deprecated
Fish Audio TTS model to use. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(...) instead.
output_format
str
default:"pcm"
Audio output format. Options: "pcm", "opus", "mp3", "wav".
sample_rate
int
default:"None"
Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
params
InputParams
default:"None"
deprecated
Runtime-configurable voice settings. See InputParams below. Deprecated in v0.0.105. Use settings=FishAudioTTSService.Settings(...) instead.
settings
FishAudioTTSService.Settings
default:"None"
Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using FishAudioTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
Parameter        Type             Default     Description
model            str              None        Model identifier. (Inherited.)
voice            str              None        Voice identifier. (Inherited.)
language         Language | str   None        Language for synthesis. (Inherited.)
latency          str              NOT_GIVEN   Latency mode setting.
normalize        bool             NOT_GIVEN   Whether to normalize audio.
temperature      float            NOT_GIVEN   Temperature for sampling.
top_p            float            NOT_GIVEN   Top-p sampling parameter.
prosody_speed    float            NOT_GIVEN   Prosody speed control.
prosody_volume   int              NOT_GIVEN   Prosody volume control.
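
The NOT_GIVEN default marks a field as "not supplied", which is different from an explicit None. The sketch below illustrates the general sentinel pattern with stand-in classes: an update object changes only the fields it sets and leaves the rest untouched. SketchSettings, apply_update, and this NOT_GIVEN sentinel are hypothetical and are not the Pipecat implementation.

```python
from dataclasses import dataclass, fields, replace
from typing import Any


class _NotGiven:
    """Hypothetical sentinel meaning 'no value supplied' (distinct from None)."""

    def __repr__(self):
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass(frozen=True)
class SketchSettings:
    """Stand-in for a settings object; not the Pipecat class."""

    voice: Any = NOT_GIVEN
    latency: Any = NOT_GIVEN
    prosody_speed: Any = NOT_GIVEN


def apply_update(current, update):
    """Return a copy of current with only the fields that update sets."""
    changed = {
        f.name: getattr(update, f.name)
        for f in fields(update)
        if getattr(update, f.name) is not NOT_GIVEN
    }
    return replace(current, **changed)


current = SketchSettings(voice="your-voice-reference-id", latency="balanced")
# Only prosody_speed is supplied, so voice and latency carry over unchanged.
updated = apply_update(current, SketchSettings(prosody_speed=1.2))
print(updated)
```

Under this pattern, a mid-conversation update that sets a single field does not reset the others to their defaults.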

Usage

Basic Setup

import os

from pipecat.services.fish import FishAudioTTSService

# Read the API key from the environment and select a voice model by reference ID.
tts = FishAudioTTSService(
    api_key=os.getenv("FISH_API_KEY"),
    settings=FishAudioTTSService.Settings(
        voice="your-voice-reference-id",
    ),
)

With Prosody Controls

tts = FishAudioTTSService(
    api_key=os.getenv("FISH_API_KEY"),
    settings=FishAudioTTSService.Settings(
        voice="your-voice-reference-id",
        model="s2-pro",
        prosody_speed=1.2,
        prosody_volume=3,
        latency="balanced",
    ),
)

Note: The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • voice required: You must specify either voice (preferred) or the deprecated model or reference_id parameter. Passing both raises a ValueError.
  • Model switching: Changing the model via set_model() automatically disconnects and reconnects the WebSocket with the new model configuration.
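
The voice requirement in the first note can be sketched as a small check that runs before the service is constructed. This is an illustration only: resolve_voice is a hypothetical helper, and treating a missing voice as an error is an assumption rather than confirmed library behavior.

```python
def resolve_voice(voice=None, reference_id=None):
    """Pick the voice identifier from the preferred or deprecated argument.

    Hypothetical helper; it mirrors the rule described in the notes and is
    not Pipecat code.
    """
    if voice is not None and reference_id is not None:
        # The notes state that supplying both forms raises a ValueError.
        raise ValueError("Pass either voice or reference_id, not both")
    if voice is None and reference_id is None:
        # Assumption: a missing voice is rejected up front in this sketch.
        raise ValueError("A voice is required")
    return voice if voice is not None else reference_id
```

Running a check like this in application code surfaces a conflicting configuration before any WebSocket connection is attempted.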

Event Handlers

Fish Audio TTS supports the standard service connection events:
Event                 Description
on_connected          Connected to Fish Audio WebSocket
on_disconnected       Disconnected from Fish Audio WebSocket
on_connection_error   WebSocket connection error occurred

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Fish Audio")