Overview
AWSTranscribeSTTService provides real-time speech recognition using Amazon Transcribe’s WebSocket streaming API with support for interim results, multiple languages, and configurable audio processing parameters for enterprise-grade transcription.
AWS Transcribe STT API Reference
Pipecat’s API methods for AWS Transcribe integration
Example Implementation
Complete example with AWS services integration
AWS Transcribe Documentation
Official AWS Transcribe documentation and features
AWS Console
Access AWS Transcribe services and IAM setup
Installation
To use AWS Transcribe services, install Pipecat with the AWS extra (`pipecat-ai[aws]`).
Prerequisites
AWS Account Setup
Before using AWS Transcribe STT services, you need:
- AWS Account: Sign up at AWS Console
- IAM User: Create an IAM user with Amazon Transcribe permissions
- Credentials: Set up AWS access keys and region configuration
Required Environment Variables
- `AWS_ACCESS_KEY_ID`: Your AWS access key ID
- `AWS_SECRET_ACCESS_KEY`: Your AWS secret access key
- `AWS_SESSION_TOKEN`: Session token (if using temporary credentials)
- `AWS_REGION`: AWS region (defaults to "us-east-1")
Configuration
- AWS secret access key: If `None`, uses the `AWS_SECRET_ACCESS_KEY` environment variable.
- AWS access key ID: If `None`, uses the `AWS_ACCESS_KEY_ID` environment variable.
- AWS session token: Used for temporary credentials. If `None`, uses the `AWS_SESSION_TOKEN` environment variable.
- AWS region: Region for the service. If `None`, uses the `AWS_REGION` environment variable (defaults to `"us-east-1"`).
- Sample rate: Audio sample rate in Hz. When `None`, uses the pipeline's configured sample rate. AWS Transcribe only supports 8000 or 16000 Hz; other values are clamped to 16000 Hz at connect time.
- Language: Language for transcription. Supports a wide range of languages including English, Spanish, French, German, and many more. See AWS Transcribe supported languages. Deprecated in v0.0.105; use `settings=AWSTranscribeSTTService.Settings(...)` instead.
- Settings: Runtime-configurable settings for the STT service. See Settings below.
- P99 latency: P99 latency from speech end to final transcript, in seconds. Override for your deployment.
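The `None` fallbacks described above can be illustrated with a small standalone sketch. It is not the service's actual implementation; the function name `resolve_aws_credentials` and its keyword names are hypothetical, and only the environment variable names and the `"us-east-1"` default come from this page.

```python
import os
from typing import Mapping, Optional


def resolve_aws_credentials(
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    session_token: Optional[str] = None,
    region: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Prefer explicit arguments, then fall back to environment variables.

    Hypothetical helper for illustration only; not part of Pipecat.
    """
    env = os.environ if env is None else env
    return {
        "access_key_id": access_key_id or env.get("AWS_ACCESS_KEY_ID"),
        "secret_access_key": secret_access_key or env.get("AWS_SECRET_ACCESS_KEY"),
        # The session token is only present when using temporary credentials.
        "session_token": session_token or env.get("AWS_SESSION_TOKEN"),
        # The region falls back to AWS_REGION, then to "us-east-1".
        "region": region or env.get("AWS_REGION") or "us-east-1",
    }
```

Passing a plain dictionary as `env` makes the fallback order easy to check without touching the real process environment.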
Settings
Runtime-configurable settings passed via the `settings` constructor argument using `AWSTranscribeSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `None` | STT model identifier. (Inherited from base STT settings.) |
| `language` | `Language \| str` | `Language.EN` | Language for transcription. (Inherited from base STT settings.) |
Usage
Basic Setup
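A minimal sketch of a basic setup, assuming the credentials are supplied through the environment variables listed earlier. The import path `pipecat.services.aws.stt` and the wrapper function `build_stt_service` are assumptions for illustration; check them against your installed Pipecat version.

```python
def build_stt_service():
    """Create an AWS Transcribe STT service that reads credentials from the environment."""
    # Imported lazily so this sketch can be loaded without Pipecat installed.
    # The module path is an assumption; verify it for your Pipecat version.
    from pipecat.services.aws.stt import AWSTranscribeSTTService

    # With no credential arguments, the service is expected to fall back to
    # AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, and AWS_REGION.
    return AWSTranscribeSTTService()
```

The returned service would then be placed in a pipeline after the transport input, like any other STT service.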
With Custom Language and Sample Rate
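A sketch of a customized setup, assuming the constructor accepts `region` and `sample_rate` keywords (these names are assumptions, as are the import paths and the wrapper function `build_spanish_stt_service`). The `settings=AWSTranscribeSTTService.Settings(...)` form and the `language` field come from the Configuration and Settings sections.

```python
def build_spanish_stt_service():
    """Create an AWS Transcribe STT service for Spanish audio at 8000 Hz."""
    # Imported lazily so this sketch can be loaded without Pipecat installed.
    # Both module paths are assumptions; verify them for your Pipecat version.
    from pipecat.services.aws.stt import AWSTranscribeSTTService
    from pipecat.transcriptions.language import Language

    return AWSTranscribeSTTService(
        region="us-west-2",  # assumed keyword name; overrides AWS_REGION
        sample_rate=8000,  # assumed keyword name; must be 8000 or 16000
        settings=AWSTranscribeSTTService.Settings(language=Language.ES),
    )
```

Language is set through `settings` here because the direct language argument is marked as deprecated in v0.0.105.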
Notes
- Supported sample rates: AWS Transcribe only supports `8000` Hz and `16000` Hz. If a different rate is provided, the service automatically falls back to `16000` Hz with a warning.
- Pre-signed URL authentication: The service uses pre-signed URLs for WebSocket authentication rather than passing credentials directly, following AWS best practices.
- Partial results stabilization: Enabled by default with `"high"` stability, which reduces changes to interim transcripts at the cost of slightly higher latency.
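The sample rate fallback and stabilization defaults in these notes can be sketched as follows. This is a standalone illustration, not the service's code: the helper names are hypothetical, the query parameter names follow the Amazon Transcribe streaming API as I understand it, and the resulting string is unsigned (the real service builds a pre-signed URL).

```python
import logging
from urllib.parse import urlencode

logger = logging.getLogger(__name__)

SUPPORTED_SAMPLE_RATES = (8000, 16000)
FALLBACK_SAMPLE_RATE = 16000


def resolve_sample_rate(requested: int) -> int:
    """Return a rate AWS Transcribe accepts, falling back to 16000 Hz."""
    if requested in SUPPORTED_SAMPLE_RATES:
        return requested
    logger.warning(
        "Sample rate %d Hz is not supported by AWS Transcribe; using %d Hz",
        requested,
        FALLBACK_SAMPLE_RATE,
    )
    return FALLBACK_SAMPLE_RATE


def build_stream_query(sample_rate: int, language_code: str = "en-US") -> str:
    """Build unsigned query parameters for a streaming transcription request."""
    params = {
        "language-code": language_code,
        "media-encoding": "pcm",
        "sample-rate": resolve_sample_rate(sample_rate),
        # Stabilization reduces changes to interim transcripts.
        "enable-partial-results-stabilization": "true",
        "partial-results-stability": "high",
    }
    return urlencode(params)
```

For example, `build_stream_query(44100)` logs a warning and produces a query string containing `sample-rate=16000`.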
Event Handlers
AWS Transcribe STT supports the standard service connection events:
| Event | Description |
|---|---|
| `on_connected` | Connected to AWS Transcribe WebSocket |
| `on_disconnected` | Disconnected from AWS Transcribe WebSocket |
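A short sketch of registering handlers for both events, assuming the service exposes Pipecat's `event_handler` decorator and calls each handler with the service instance as its only argument. The helper `register_connection_handlers` is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def register_connection_handlers(stt) -> None:
    """Attach logging handlers for the two connection events.

    `stt` is expected to be an AWSTranscribeSTTService instance, or any object
    exposing a compatible `event_handler(name)` decorator.
    """

    @stt.event_handler("on_connected")
    async def on_connected(service):
        logger.info("Connected to AWS Transcribe")

    @stt.event_handler("on_disconnected")
    async def on_disconnected(service):
        logger.info("Disconnected from AWS Transcribe")
```

Handlers like these are a convenient place to record connection metrics or trigger reconnection alerts.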