Overview

NvidiaTTSService provides text-to-speech synthesis through NVIDIA Riva’s cloud-hosted AI models, accessed over a gRPC API. The service offers multilingual support, a configurable quality setting, and streaming audio generation suited to real-time applications.

Installation

To use NVIDIA Riva services, install the required dependencies:
pip install "pipecat-ai[nvidia]"

Prerequisites

NVIDIA Riva Setup

Before using Riva TTS services, you need:
  1. NVIDIA Developer Account: Sign up at NVIDIA Developer Portal
  2. API Key: Generate an NVIDIA API key for Riva services
  3. Riva Access: Ensure access to NVIDIA Riva TTS services

Required Environment Variables

  • NVIDIA_API_KEY: Your NVIDIA API key for authentication
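
A missing key is easier to diagnose at startup than at the first synthesis request. The following is a minimal sketch of a fail-fast check; `load_nvidia_api_key` is a hypothetical helper written for illustration and is not part of Pipecat.

```python
import os


def load_nvidia_api_key(env=os.environ):
    """Return NVIDIA_API_KEY from the given mapping, or raise a clear error.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NVIDIA_API_KEY is not set; export it before starting the pipeline."
        )
    return key
```

The returned value can then be passed as api_key when constructing the service.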

Configuration

NvidiaTTSService

  • api_key (str, required): NVIDIA API key for authentication.
  • server (str, default: "grpc.nvcf.nvidia.com:443"): gRPC server endpoint.
  • voice_id (str, default: "Magpie-Multilingual.EN-US.Aria", deprecated): Voice model identifier. Deprecated in v0.0.105. Use settings=NvidiaTTSService.Settings(...) instead.
  • sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • model_function_map (dict): Dictionary containing function_id and model_name for the TTS model.
  • use_ssl (bool, default: True): Whether to use SSL for the NVIDIA Riva server connection.
  • params (InputParams, default: None, deprecated): Runtime-configurable synthesis settings. See InputParams below. Deprecated in v0.0.105. Use settings=NvidiaTTSService.Settings(...) instead.
  • settings (NvidiaTTSService.Settings, default: None): Runtime-configurable settings. See Settings below.
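
The connection-related arguments (api_key, server, use_ssl) can be assembled once and reused for either the hosted endpoint or a self-hosted Riva server. The sketch below is illustrative only: `build_tts_kwargs` is a hypothetical helper, and "localhost:50051" is an assumed address for a local deployment that should be replaced with your own.

```python
import os


def build_tts_kwargs(local: bool = False) -> dict:
    """Assemble constructor arguments for the hosted or a self-hosted endpoint.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    kwargs = {"api_key": os.getenv("NVIDIA_API_KEY", "")}
    if local:
        # Assumed address of a self-hosted Riva server; replace with your own.
        kwargs["server"] = "localhost:50051"
        # SSL is left on for the hosted endpoint and disabled only here.
        kwargs["use_ssl"] = False
    return kwargs


# Usage (requires pipecat-ai[nvidia]):
#   from pipecat.services.nvidia import NvidiaTTSService
#   tts = NvidiaTTSService(**build_tts_kwargs(local=True))
```

Omitting server and use_ssl in the hosted case leaves the documented defaults in effect.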

Settings

Runtime-configurable settings passed via the settings constructor argument using NvidiaTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
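| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model | str | None | Model identifier. (Inherited.) |
| voice | str | None | Voice identifier. (Inherited.) |
| language | Language \| str | None | Language for synthesis. (Inherited.) |
| quality | int | NOT_GIVEN | Audio quality setting. |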

Usage

Basic Setup

import os

from pipecat.services.nvidia import NvidiaTTSService

tts = NvidiaTTSService(
    api_key=os.getenv("NVIDIA_API_KEY"),
)

With Custom Voice and Quality

import os

from pipecat.services.nvidia import NvidiaTTSService
from pipecat.transcriptions.language import Language

tts = NvidiaTTSService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model_function_map={
        "function_id": "877104f7-e885-42b9-8de8-f6e4c6303969",
        "model_name": "magpie-tts-multilingual",
    },
    settings=NvidiaTTSService.Settings(
        voice="Magpie-Multilingual.EN-US.Aria",
        language=Language.EN_US,
        quality=40,
    ),
)
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • gRPC-based: NVIDIA Riva uses gRPC (not HTTP or WebSocket) for communication with the TTS service.
  • Model cannot be changed after initialization: The model and function ID must be set during construction via model_function_map. Calling set_model() after initialization will log a warning and have no effect.
  • SSL enabled by default: The service connects to NVIDIA’s cloud endpoint with SSL. Set use_ssl=False only for local or custom Riva deployments.
  • Blocking gRPC calls: Audio generation uses asyncio.to_thread to avoid blocking the event loop, since the underlying Riva client uses synchronous gRPC calls.
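
The last note describes a general asyncio pattern: a synchronous call is handed to a worker thread so the event loop can keep servicing other tasks. The sketch below illustrates that pattern with a stand-in function; `blocking_synthesize` is hypothetical and merely sleeps in place of a real Riva gRPC call, so this is not the service's actual implementation.

```python
import asyncio
import time


def blocking_synthesize(text: str) -> bytes:
    """Stand-in for a synchronous gRPC call: sleeps, then returns fake audio."""
    time.sleep(0.2)
    return text.encode("utf-8")


async def synthesize(text: str) -> bytes:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_synthesize, text)


async def main() -> tuple[bytes, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run during synthesis.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    audio = await synthesize("hello")
    hb.cancel()
    return audio, ticks
```

Running `asyncio.run(main())` returns the fake audio together with a non-zero tick count, which shows that the heartbeat task kept running while the blocking call was in progress. Calling `blocking_synthesize` directly inside the coroutine would instead stall the loop for the full duration of the call.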