Overview

AzureLLMService provides access to Azure OpenAI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management with enterprise-grade security and compliance.

Installation

To use Azure OpenAI services, install the required dependency:
pip install "pipecat-ai[azure]"

Prerequisites

Azure OpenAI Setup

Before using Azure OpenAI LLM services, you need:
  1. Azure Account: Sign up at Azure Portal
  2. OpenAI Resource: Create an Azure OpenAI resource in your subscription
  3. Model Deployment: Deploy your chosen model (GPT-4, GPT-4o, etc.)
  4. Credentials: Get your API key, endpoint, and deployment name

Required Environment Variables

  • AZURE_CHATGPT_API_KEY: Your Azure OpenAI API key
  • AZURE_CHATGPT_ENDPOINT: Your Azure OpenAI endpoint URL
  • AZURE_CHATGPT_MODEL: Your model deployment name
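The variables above can be set in your shell before starting your app. The values below are placeholders, not real credentials:

```shell
# Placeholder values -- substitute your own resource's credentials.
export AZURE_CHATGPT_API_KEY="your-api-key"
export AZURE_CHATGPT_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_CHATGPT_MODEL="my-gpt4-deployment"
```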

Configuration

  • api_key (str, required): Azure OpenAI API key for authentication.
  • endpoint (str, required): Azure OpenAI endpoint URL (e.g., "https://your-resource.openai.azure.com/").
  • model (str, default: None, deprecated): Deprecated in v0.0.105. Use settings=AzureLLMService.Settings(model=...) instead.
  • api_version (str, default: "2024-09-01-preview"): Azure OpenAI API version string.
  • settings (AzureLLMService.Settings, default: None): Runtime-configurable settings. See Settings below.
Since AzureLLMService inherits from OpenAILLMService, it also accepts the following parameters:
  • params (InputParams, default: None, deprecated): Deprecated in v0.0.105. Use settings=AzureLLMService.Settings(...) instead.
  • retry_timeout_secs (float, default: 5.0): Request timeout in seconds. Used when retry_on_timeout is enabled to determine when to retry.
  • retry_on_timeout (bool, default: False): Whether to retry the request once if it times out. The retry attempt has no timeout limit.
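The retry semantics described above can be sketched in plain asyncio. This is an illustrative model of the documented behavior (one time-bounded attempt, then a single unbounded retry), not pipecat's actual implementation:

```python
import asyncio


async def complete_with_retry(request, timeout_secs=5.0, retry_on_timeout=True):
    """Model of the documented retry behavior: the first attempt is bounded
    by timeout_secs; on timeout, a single retry runs with no timeout limit."""
    try:
        return await asyncio.wait_for(request(), timeout=timeout_secs)
    except asyncio.TimeoutError:
        if not retry_on_timeout:
            raise
        # The single retry attempt has no timeout limit.
        return await request()


async def demo():
    calls = 0

    async def flaky_request():
        nonlocal calls
        calls += 1
        if calls == 1:
            await asyncio.sleep(10)  # first attempt exceeds the timeout
        return "completion"

    return await complete_with_retry(flaky_request, timeout_secs=0.05)


print(asyncio.run(demo()))  # -> completion
```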

Settings

Runtime-configurable settings passed via the settings constructor argument using AzureLLMService.Settings(...). These can be updated mid-conversation with LLMUpdateSettingsFrame. See Service Settings for details. AzureLLMService uses the same settings as OpenAILLMService. See the OpenAI LLM Settings section for the full parameter reference.

Usage

Basic Setup

import os

from pipecat.services.azure import AzureLLMService

llm = AzureLLMService(
    api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
    endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
    model=os.getenv("AZURE_CHATGPT_MODEL"),
)

With Custom Settings

import os

from pipecat.services.azure import AzureLLMService

llm = AzureLLMService(
    api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
    endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
    model=os.getenv("AZURE_CHATGPT_MODEL"),
    api_version="2024-09-01-preview",
    settings=AzureLLMService.Settings(
        temperature=0.7,
        max_completion_tokens=1000,
        frequency_penalty=0.5,
    ),
)

Updating Settings at Runtime

Model settings can be changed mid-conversation using LLMUpdateSettingsFrame:
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.openai.base_llm import OpenAILLMSettings

await task.queue_frame(
    LLMUpdateSettingsFrame(
        delta=OpenAILLMSettings(
            temperature=0.3,
            max_completion_tokens=500,
        )
    )
)

Notes

  • Deployment name vs model name: The model parameter should be your Azure deployment name, not the underlying model name (e.g., use "my-gpt4-deployment" instead of "gpt-4").
  • API version: Different API versions support different features. Check the Azure OpenAI documentation for version-specific capabilities.
  • Full OpenAI compatibility: Since AzureLLMService inherits from OpenAILLMService, it supports all the same features including function calling, vision input, and streaming responses.

Event Handlers

AzureLLMService supports the same event handlers as OpenAILLMService, inherited from LLMService:
  • on_completion_timeout: Called when an LLM completion request times out
  • on_function_calls_started: Called when function calls are received and execution is about to start
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")

The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.