Conversation

@samwillis

🎯 Changes

Very Early WIP: Add audio output streaming support to OpenAI adapter

This PR adds support for streaming audio output from OpenAI's audio-capable models (e.g., gpt-4o-audio-preview) via the Chat Completions API.

Opening as a discussion starter.

Background

The current OpenAI adapter uses the Responses API (client.responses.create()), which does not support audio output modalities. Audio output streaming requires the Chat Completions API with modalities: ['text', 'audio'] and audio: { voice, format } configuration.
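
As a point of reference, here is a minimal sketch of the underlying Chat Completions call shape using the official openai SDK (model, voice, and format values are examples; the streamed delta.audio field is not covered by the SDK's types yet, so it is read loosely here):

import OpenAI from "openai"

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

const completion = await client.chat.completions.create({
  model: "gpt-4o-audio-preview",
  modalities: ["text", "audio"],
  audio: { voice: "alloy", format: "pcm16" },
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true,
})

for await (const part of completion) {
  // The audio delta is not in the SDK's ChatCompletionChunk delta type, so cast loosely.
  const delta = part.choices[0]?.delta as Record<string, any> | undefined
  if (delta?.audio?.data) {
    // base64-encoded audio bytes for this delta
  }
  if (delta?.audio?.transcript) {
    // incremental transcript text
  }
}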

Changes

1. New AudioStreamChunk type (packages/typescript/ai/src/types.ts)

  • Added 'audio' to StreamChunkType union
  • Added AudioStreamChunk interface with data (base64), transcript, and format fields
  • Added to StreamChunk union type
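
A rough sketch of the new type, based on the fields listed above (whether transcript and format end up optional is still open):

export interface AudioStreamChunk {
  type: "audio"
  // Base64-encoded audio bytes for this delta
  data: string
  // Incremental transcript text, when the model provides one
  transcript?: string
  // Audio encoding, e.g. "pcm16"
  format?: string
}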

2. Audio output options (packages/typescript/ai-openai/src/text/text-provider-options.ts)

  • Added OpenAIAudioOutputOptions interface with modalities and audio config
  • Included in ExternalTextProviderOptions
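
A sketch of these options (string is used as a placeholder here; the concrete voice and format unions should mirror what OpenAI actually accepts):

export interface OpenAIAudioOutputOptions {
  // Request audio alongside text output
  modalities?: Array<"text" | "audio">
  audio?: {
    // e.g. "alloy"
    voice: string
    // e.g. "pcm16"; streaming delivery generally requires a PCM-style format
    format: string
  }
}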

3. OpenAI adapter audio routing (packages/typescript/ai-openai/src/openai-adapter.ts)

  • chatStream() now detects modalities.includes('audio') in provider options
  • When audio is requested, routes to new chatStreamWithAudio() method
  • chatStreamWithAudio() uses Chat Completions API instead of Responses API
  • Yields AudioStreamChunk for audio data and ContentStreamChunk for transcripts
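
Sketched roughly below (surrounding method and option names are illustrative, not the adapter's actual signatures):

async *chatStream(options: ChatStreamOptions): AsyncIterable<StreamChunk> {
  const providerOptions = options.providerOptions as OpenAIAudioOutputOptions | undefined
  if (providerOptions?.modalities?.includes("audio")) {
    // Audio output is only available via Chat Completions, not the Responses API
    yield* this.chatStreamWithAudio(options)
    return
  }
  // Existing path: Responses API via client.responses.create()
  yield* this.chatStreamWithResponses(options)
}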

4. Model metadata (packages/typescript/ai-openai/src/model-meta.ts)

  • Added gpt-4o-audio-preview model definition
  • Added OpenAIAudioOutputOptions to audio model provider option types

Usage

import { chat } from "@tanstack/ai"
import { createOpenAI } from "@tanstack/ai-openai"

const stream = chat({
  adapter: createOpenAI(apiKey),
  model: "gpt-4o-audio-preview",
  messages: [{ role: "user", content: "Tell me a story" }],
  providerOptions: {
    modalities: ["text", "audio"],
    audio: { voice: "alloy", format: "pcm16" },
  },
})

for await (const chunk of stream) {
  if (chunk.type === "audio") {
    // chunk.data is base64-encoded PCM16 audio
  }
  if (chunk.type === "content") {
    // chunk.delta is transcript text
  }
}
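
On the consumer side, the audio branch of that loop can decode and accumulate the chunks into raw PCM, for example (Node-flavoured sketch; pcm16 is headerless 16-bit PCM, so playback needs a WAV header or a PCM-aware player):

const audioParts: Buffer[] = []

// inside the audio branch of the loop above:
audioParts.push(Buffer.from(chunk.data, "base64"))

// after the stream ends:
const pcm = Buffer.concat(audioParts)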

Real-world usage

This is being used by the Durable Streams story-app example: a child-friendly AI story generator that streams both narrated audio and synchronized text transcripts to a durable stream for resilient playback.

Open questions

  • Should audio routing be explicit via a separate method, or is auto-detection from modalities the right approach?
  • Should we add a ChatCompletionsAdapter as a separate class for broader Chat Completions API support?
  • Are there other audio-related events from the Chat Completions API we should handle?

✅ Checklist

  • I have followed the steps in the Contributing guide.
  • I have tested this code locally with pnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated a changeset.
  • This change is docs/CI/dev-only (no release).


coderabbitai bot commented Dec 19, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

