AWS Nova Sonic

Overview

AWSNovaSonicLLMService enables natural, real-time conversations with AWS Nova Sonic. It provides built-in audio transcription, voice activity detection, and context management for creating interactive AI experiences with bidirectional audio streaming, text generation, and function calling capabilities.

AWS Nova Sonic API Reference: Pipecat’s API methods for AWS Nova Sonic integration
Example Implementation: Complete AWS Nova Sonic conversation example
AWS Bedrock Documentation: Official AWS Bedrock and Nova Sonic documentation
AWS Console: Access AWS Bedrock and manage Nova Sonic models

Installation

To use AWS Nova Sonic services, install the required dependencies:

pip install "pipecat-ai[aws-nova-sonic]"

Prerequisites

AWS Account Setup

Before using AWS Nova Sonic services, you need:

AWS Account: Set up at AWS Console
Bedrock Access: Enable AWS Bedrock service in your region
Model Access: Request access to Nova Sonic models in Bedrock
IAM Credentials: Configure AWS access keys with Bedrock permissions

Required Environment Variables

AWS_SECRET_ACCESS_KEY: Your AWS secret access key
AWS_ACCESS_KEY_ID: Your AWS access key ID
AWS_REGION: AWS region where Bedrock is available
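
A minimal sketch of checking these variables before constructing the service, so a missing credential is reported early instead of failing at connection time. The helper name `missing_aws_env` is hypothetical and not part of Pipecat; it uses only the standard library.

```python
import os

# Variable names taken from the list above.
REQUIRED_AWS_ENV = ("AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID", "AWS_REGION")


def missing_aws_env(environ=None):
    """Return the names of required variables that are unset or empty.

    `environ` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AWS_ENV if not env.get(name)]
```

Calling `missing_aws_env()` at startup and stopping when the returned list is non-empty is one way to use it.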

Key Features

Real-time Speech-to-Speech: Direct audio input to audio output processing
Built-in Transcription: Automatic speech-to-text with real-time streaming
Voice Activity Detection: Automatic detection of speech start/stop
Function Calling: Support for external function and API integration
Multiple Voices: Choose from matthew, tiffany, and amy voice options

Configuration

AWSNovaSonicLLMService

Parameter	Type	Default	Description
`secret_access_key`	`str`	required	AWS secret access key for authentication.
`access_key_id`	`str`	required	AWS access key ID for authentication.
`session_token`	`str`	`None`	AWS session token for temporary credentials (e.g., when using AWS STS).
`region`	`str`	required	AWS region where the service is hosted. Supported regions for Nova 2 Sonic (default): `"us-east-1"`, `"us-west-2"`, `"ap-northeast-1"`. Supported regions for Nova Sonic (older model): `"us-east-1"`, `"ap-northeast-1"`.
`model`	`str`	`"amazon.nova-2-sonic-v1:0"`	Model identifier. Use `"amazon.nova-2-sonic-v1:0"` for the latest model or `"amazon.nova-sonic-v1:0"` for the older model. Deprecated in v0.0.105. Use `settings=AWSNovaSonicLLMService.Settings(model=...)` instead.
`voice_id`	`str`	`"matthew"`	Voice ID for speech synthesis. Some voices are designed for specific languages. See AWS Nova 2 Sonic voice support for available voices. Deprecated in v0.0.105. Use `settings=AWSNovaSonicLLMService.Settings(voice=...)` instead.
`params`	`Params`	`Params()`	Model parameters for audio configuration and inference. Deprecated in v0.0.105. Use `settings=AWSNovaSonicLLMService.Settings(...)` for inference settings and `audio_config=AudioConfig(...)` for audio configuration.
`audio_config`	`AudioConfig`	`None`	Audio configuration (sample rates, sample sizes, channel counts). If not provided, defaults are used (16kHz input, 24kHz output, 16-bit, mono). See AudioConfig below.
`settings`	`AWSNovaSonicLLMService.Settings`	`None`	Runtime-configurable settings. See Settings below.
`system_instruction`	`str`	`None`	System-level instruction for the model. Deprecated in v0.0.105. Use `settings=AWSNovaSonicLLMService.Settings(system_instruction=...)` instead.
`tools`	`ToolsSchema`	`None`	Available tools/functions for the model to use.
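
Because the two models are available in different regions, a mismatched `region` and `model` pair is an easy mistake. The sketch below checks the pair against the region lists given in the `region` description above. `check_region` and `SUPPORTED_REGIONS` are hypothetical names, not part of Pipecat, and the lists are only as current as this page.

```python
# Region lists copied from the `region` parameter description above.
SUPPORTED_REGIONS = {
    "amazon.nova-2-sonic-v1:0": {"us-east-1", "us-west-2", "ap-northeast-1"},
    "amazon.nova-sonic-v1:0": {"us-east-1", "ap-northeast-1"},
}


def check_region(region, model="amazon.nova-2-sonic-v1:0"):
    """Return `region` if it is listed for `model`, otherwise raise ValueError."""
    regions = SUPPORTED_REGIONS.get(model)
    if regions is None:
        raise ValueError(f"Unknown model: {model!r}")
    if region not in regions:
        raise ValueError(
            f"{model} is not listed for {region!r}; choose one of {sorted(regions)}"
        )
    return region
```

For example, `check_region("us-west-2")` passes with the default model, while the same region with `"amazon.nova-sonic-v1:0"` raises a `ValueError`.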

Settings

Runtime-configurable settings passed via the settings constructor argument using AWSNovaSonicLLMService.Settings(...). These can be updated mid-conversation with LLMUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`NOT_GIVEN`	Model identifier. (Inherited from base settings.)
`system_instruction`	`str`	`NOT_GIVEN`	System instruction/prompt. (Inherited from base settings.)
`temperature`	`float`	`NOT_GIVEN`	Sampling temperature for text generation. (Inherited from base settings.)
`max_tokens`	`int`	`NOT_GIVEN`	Maximum number of tokens to generate. (Inherited from base settings.)
`top_p`	`float`	`NOT_GIVEN`	Nucleus sampling parameter. (Inherited from base settings.)
`voice`	`str`	`NOT_GIVEN`	Voice ID for speech synthesis.
`endpointing_sensitivity`	`str | None`	`NOT_GIVEN`	Controls how quickly Nova Sonic decides the user has stopped speaking. Values: `"LOW"`, `"MEDIUM"`, or `"HIGH"`. Only supported with Nova 2 Sonic (default model).

NOT_GIVEN values are omitted, letting the service use its own defaults (e.g. "amazon.nova-2-sonic-v1:0" for model, "matthew" for voice, 0.7 for temperature, 1024 for max_tokens). Only parameters that are explicitly set are included.
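
The "omit what was not given" behavior can be illustrated with a small standalone sketch. Everything here is a stand-in: `_NotGiven`, `ExampleSettings`, and `given_fields` are hypothetical and do not reproduce Pipecat's actual implementation. The point is only that a sentinel distinct from `None` lets code tell "not set" apart from "explicitly set to `None`".

```python
from dataclasses import dataclass, fields


class _NotGiven:
    """Illustrative sentinel type; a stand-in, not Pipecat's NOT_GIVEN."""

    def __repr__(self):
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class ExampleSettings:
    """Hypothetical container mirroring a few fields from the table above."""

    model: object = NOT_GIVEN
    voice: object = NOT_GIVEN
    temperature: object = NOT_GIVEN
    max_tokens: object = NOT_GIVEN


def given_fields(settings):
    """Return only the fields that were explicitly set."""
    return {
        f.name: getattr(settings, f.name)
        for f in fields(settings)
        if getattr(settings, f.name) is not NOT_GIVEN
    }


explicit = given_fields(ExampleSettings(voice="tiffany", temperature=0.5))
print(explicit)  # {'voice': 'tiffany', 'temperature': 0.5}
```

In this sketch `model` and `max_tokens` are absent from the result, which corresponds to leaving them to the service defaults.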

AudioConfig

Audio configuration passed via the audio_config constructor argument.

Parameter	Type	Default	Description
`input_sample_rate`	`int`	`16000`	Audio input sample rate in Hz.
`input_sample_size`	`int`	`16`	Audio input sample size in bits.
`input_channel_count`	`int`	`1`	Number of input audio channels.
`output_sample_rate`	`int`	`24000`	Audio output sample rate in Hz.
`output_sample_size`	`int`	`16`	Audio output sample size in bits.
`output_channel_count`	`int`	`1`	Number of output audio channels.
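
As a worked example of what these defaults imply, the raw data rate of uncompressed linear PCM is sample rate × bytes per sample × channels. The helpers below are a hypothetical sketch, and the 20 ms chunk length is an arbitrary illustration rather than a value taken from Pipecat or AWS.

```python
def bytes_per_second(sample_rate, sample_size_bits, channel_count):
    """Raw linear PCM data rate in bytes per second."""
    return sample_rate * (sample_size_bits // 8) * channel_count


def chunk_bytes(sample_rate, sample_size_bits, channel_count, chunk_ms):
    """Size in bytes of an audio chunk lasting `chunk_ms` milliseconds."""
    return bytes_per_second(sample_rate, sample_size_bits, channel_count) * chunk_ms // 1000


input_rate = bytes_per_second(16000, 16, 1)   # 32000 bytes/s for the default input
output_rate = bytes_per_second(24000, 16, 1)  # 48000 bytes/s for the default output
input_chunk = chunk_bytes(16000, 16, 1, 20)   # 640 bytes for 20 ms of input audio
```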

Usage

Basic Setup

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region=os.getenv("AWS_REGION"),
    settings=AWSNovaSonicLLMService.Settings(
        voice="matthew",
        system_instruction="You are a helpful assistant.",
    ),
)

With Settings

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService, AudioConfig

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="us-east-1",
    audio_config=AudioConfig(
        input_sample_rate=16000,
        output_sample_rate=24000,
    ),
    settings=AWSNovaSonicLLMService.Settings(
        model="amazon.nova-2-sonic-v1:0",
        voice="tiffany",
        system_instruction="You are a helpful assistant.",
        temperature=0.5,
        max_tokens=2048,
        endpointing_sensitivity="MEDIUM",
    ),
)

With Function Calling

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="us-east-1",
    settings=AWSNovaSonicLLMService.Settings(
        voice="matthew",
        system_instruction="You are a helpful assistant that can check the weather.",
    ),
    tools=tools,  # ToolsSchema instance
)

@llm.function("get_weather")
async def get_weather(function_name, tool_call_id, args, llm, context, result_callback):
    location = args.get("location", "unknown")
    await result_callback({"temperature": 72, "condition": "sunny", "location": location})
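
A handler with this signature can be exercised without connecting to AWS by calling it directly with a stub callback. The sketch below copies the handler body without the `@llm.function` decorator. `call_handler`, `collect_result`, the `"call-1"` ID, and the `None` placeholders for `llm` and `context` are hypothetical test scaffolding, not Pipecat APIs.

```python
import asyncio


async def get_weather(function_name, tool_call_id, args, llm, context, result_callback):
    # Same body as the handler above, minus the decorator.
    location = args.get("location", "unknown")
    await result_callback({"temperature": 72, "condition": "sunny", "location": location})


async def call_handler(handler, args):
    """Invoke a handler with a stub callback and return what it reported."""
    results = []

    async def collect_result(result):
        results.append(result)

    # This handler never touches llm or context, so None stands in for both.
    await handler("get_weather", "call-1", args, None, None, collect_result)
    return results


reported = asyncio.run(call_handler(get_weather, {"location": "Seattle"}))
print(reported)  # [{'temperature': 72, 'condition': 'sunny', 'location': 'Seattle'}]
```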

The Params / params= pattern is deprecated as of v0.0.105. Use Settings / settings= for inference settings and AudioConfig / audio_config= for audio configuration instead. See the Service Settings guide for migration details.

Notes

Model versions: Nova 2 Sonic (amazon.nova-2-sonic-v1:0) is the default and recommended model. The older Nova Sonic (amazon.nova-sonic-v1:0) has fewer features and requires an assistant response trigger mechanism.
Endpointing sensitivity: Only supported with Nova 2 Sonic. Controls how quickly the model decides the user has stopped speaking; "HIGH" makes the model respond most quickly.
Transcription frames: User speech transcription frames are always emitted upstream.
Connection resilience: If a connection error occurs while the service wants to stay connected, it automatically resets the conversation and reconnects.
System instruction and tools precedence: Instructions and tools provided in the LLM context take precedence over those provided at initialization time.
Audio format: Uses LPCM (Linear PCM) audio format for both input and output. Input defaults to 16kHz and output defaults to 24kHz.
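
The endpointing constraint in the notes above can be checked before building settings. This is a hypothetical sketch: `check_endpointing` is not a Pipecat function, and it encodes only what this page states (three allowed values, Nova 2 Sonic only).

```python
NOVA_2_SONIC = "amazon.nova-2-sonic-v1:0"
VALID_SENSITIVITIES = ("LOW", "MEDIUM", "HIGH")


def check_endpointing(sensitivity, model=NOVA_2_SONIC):
    """Return `sensitivity` if it is valid for `model`, otherwise raise ValueError."""
    if sensitivity is None:
        return None  # nothing to validate
    if model != NOVA_2_SONIC:
        raise ValueError("endpointing_sensitivity is only supported with Nova 2 Sonic")
    if sensitivity not in VALID_SENSITIVITIES:
        raise ValueError(f"expected one of {VALID_SENSITIVITIES}, got {sensitivity!r}")
    return sensitivity
```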
