Overview
AWSPollyTTSService provides high-quality text-to-speech synthesis through Amazon Polly with support for standard, neural, and generative engines. The service offers extensive language support, SSML features, and voice customization options including prosody controls for pitch, rate, and volume.
- AWS Polly API Reference: Pipecat’s API methods for AWS Polly integration
- Example Implementation: Complete example with generative engine
- AWS Polly Documentation: Official AWS Polly documentation and features
- Voice Samples: Browse available voices and languages
Installation
To use AWS Polly services, install the required dependencies.
Prerequisites
AWS Account Setup
Before using AWS Polly TTS services, you need:
- AWS Account: Sign up at AWS Console
- IAM User: Create an IAM user with Polly permissions
- Access Keys: Generate access key ID and secret access key
- Voice Selection: Choose from available voices in the voice list
Required Environment Variables
- AWS_ACCESS_KEY_ID: Your AWS access key ID
- AWS_SECRET_ACCESS_KEY: Your AWS secret access key
- AWS_SESSION_TOKEN: Session token (if using temporary credentials)
- AWS_REGION: AWS region (defaults to "us-east-1")
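As a minimal sketch of how these variables might be checked before constructing the service, the helper below collects the credentials, treats the session token as optional, and falls back to us-east-1 for the region. The name `load_aws_env` and the returned dictionary keys are hypothetical and not part of Pipecat.

```python
import os


def load_aws_env(env=None):
    """Collect AWS settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env

    # The access key ID and secret access key are always required.
    missing = [
        name
        for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
        if not env.get(name)
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {
        "access_key_id": env["AWS_ACCESS_KEY_ID"],
        "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        # Only set when using temporary credentials.
        "session_token": env.get("AWS_SESSION_TOKEN"),
        # Documented default region when AWS_REGION is unset.
        "region": env.get("AWS_REGION") or "us-east-1",
    }
```

Calling `load_aws_env()` at startup surfaces a missing credential as one clear error instead of a failure at the first synthesis request.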
Configuration
AWSPollyTTSService
- AWS secret access key. If None, uses the AWS_SECRET_ACCESS_KEY environment variable.
- AWS access key ID. If None, uses the AWS_ACCESS_KEY_ID environment variable.
- AWS session token for temporary credentials.
- AWS region for Polly service. Defaults to us-east-1 if not set via environment variable.
- Voice ID to use for synthesis. Deprecated in v0.0.105. Use settings=AWSPollyTTSService.Settings(voice=...) instead.
- Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate. AWS Polly internally synthesizes at 16kHz and resamples to the target rate.
- Deprecated in v0.0.105. Use settings=AWSPollyTTSService.Settings(...) instead.
- Runtime-configurable settings. See Settings below.
Settings
Runtime-configurable settings passed via the settings constructor argument using AWSPollyTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | None | Model identifier. (Inherited.) |
| voice | str | None | Voice identifier. (Inherited.) |
| language | Language \| str | None | Language for synthesis. (Inherited.) |
| engine | str | NOT_GIVEN | Engine type (e.g., "neural", "standard"). |
| pitch | str | NOT_GIVEN | Pitch adjustment for SSML. |
| rate | str | NOT_GIVEN | Speaking rate for SSML. |
| volume | str | NOT_GIVEN | Volume for SSML. |
| lexicon_names | List[str] | NOT_GIVEN | List of lexicon names for pronunciation. |
Usage
Basic Setup
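A minimal setup sketch, assuming credentials come from the environment variables listed under Required Environment Variables. The import path `pipecat.services.aws.tts` is an assumption here and should be checked against your installed Pipecat version; the voice "Joanna" and the names `BASIC_SETTINGS` and `create_tts` are illustrative choices, not part of the library.

```python
# Field names follow the Settings table in this document.
BASIC_SETTINGS = {
    "voice": "Joanna",   # illustrative voice choice
    "engine": "neural",  # "standard", "neural", or "generative"
}


def create_tts():
    """Build the TTS service; AWS credentials are read from the environment."""
    # Assumed import path; imported lazily so this module loads without Pipecat.
    from pipecat.services.aws.tts import AWSPollyTTSService

    return AWSPollyTTSService(
        settings=AWSPollyTTSService.Settings(**BASIC_SETTINGS),
    )
```

The returned service would then be placed in a pipeline wherever a TTS processor is expected.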
With Voice Customization
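A sketch of prosody customization through the settings object, under the same assumptions as any Pipecat example here: the import path `pipecat.services.aws.tts` is assumed, and `CUSTOM_SETTINGS` and `create_custom_tts` are hypothetical names. The values are assumed to be passed through as SSML prosody attribute values, so they use forms that Amazon Polly's SSML accepts, such as a percentage for pitch and a named level for rate and volume.

```python
# Field names follow the Settings table in this document.
CUSTOM_SETTINGS = {
    "voice": "Joanna",     # illustrative voice choice
    "engine": "standard",  # pitch only takes effect with the standard engine
    "pitch": "+10%",       # slightly higher pitch
    "rate": "slow",        # slower speaking rate
    "volume": "loud",      # louder output
}


def create_custom_tts():
    """Build the TTS service with prosody adjustments applied."""
    # Assumed import path; imported lazily so this module loads without Pipecat.
    from pipecat.services.aws.tts import AWSPollyTTSService

    return AWSPollyTTSService(
        settings=AWSPollyTTSService.Settings(**CUSTOM_SETTINGS),
    )
```

The standard engine is chosen deliberately so that all three prosody settings can apply; with another engine, only rate and volume would be expected to have an effect.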
Notes
- Engine selection: AWS Polly supports "standard", "neural", and "generative" engines. Not all voices support all engines. Check the AWS voice list for compatibility.
- Pitch control: The pitch parameter only works with the "standard" engine. Neural and generative engines ignore it.
- Audio resampling: Polly synthesizes PCM at 16kHz internally. The service automatically resamples to match your pipeline’s sample rate.
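To make the pitch note concrete, the sketch below shows one way prosody settings could be rendered as SSML, dropping pitch unless the engine is "standard". It illustrates the behavior described in the notes and is not the service's actual SSML builder; `build_prosody_ssml` is a hypothetical function.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prosody_ssml(text, engine="neural", pitch=None, rate=None, volume=None):
    """Wrap text in SSML, applying pitch only for the standard engine."""
    attrs = []
    # Pitch is skipped for neural and generative engines.
    if pitch and engine == "standard":
        attrs.append(f"pitch={quoteattr(pitch)}")
    if rate:
        attrs.append(f"rate={quoteattr(rate)}")
    if volume:
        attrs.append(f"volume={quoteattr(volume)}")

    # Escape &, <, and > so the text cannot break the SSML markup.
    body = escape(text)
    if attrs:
        body = f"<prosody {' '.join(attrs)}>{body}</prosody>"
    return f"<speak>{body}</speak>"


print(build_prosody_ssml("Hello", engine="standard", pitch="+10%", rate="slow"))
# <speak><prosody pitch="+10%" rate="slow">Hello</prosody></speak>
print(build_prosody_ssml("Hello", engine="neural", pitch="+10%", rate="slow"))
# <speak><prosody rate="slow">Hello</prosody></speak>
```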