How to Enable Voice Chat with TTS for Your Assistant
Superinterface allows you to enable voice chat using OpenAI’s Whisper and TTS APIs. This setup transcribes speech into text, processes it through your AI model, and converts the response back into natural-sounding speech.
Unlike realtime WebRTC connections, this method works with any AI model, making it a flexible solution for voice interactions.
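The round trip described above can be sketched as three stages wired together. This is an illustrative sketch only, not Superinterface's actual internals: the function names are assumptions, and the real transcription and speech calls are asynchronous HTTP requests (synchronous signatures keep the sketch short).

```typescript
// One voice-chat turn: speech -> text -> model reply -> speech.
// The three stage functions are injected, which is why any AI model
// can sit in the middle. All names here are illustrative.

type Transcribe = (audio: Uint8Array) => string; // e.g. OpenAI Whisper
type Respond = (userText: string) => string;     // e.g. Claude 3.5 Sonnet
type Speak = (replyText: string) => Uint8Array;  // e.g. OpenAI TTS

function voiceTurn(
  audio: Uint8Array,
  transcribe: Transcribe,
  respond: Respond,
  speak: Speak,
): Uint8Array {
  const userText = transcribe(audio);  // speech-to-text
  const replyText = respond(userText); // processed by your chosen model
  return speak(replyText);             // text-to-speech, played back to the user
}
```

Because the model only ever sees plain text, swapping Claude 3.5 Sonnet for any other supported provider changes nothing else in the pipeline.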
Let’s set it up step by step.
Step 1: Setting Up Your Assistant
Create and Configure
Create a New Assistant
Log in to your Superinterface dashboard, go to Assistants, and click New Assistant.
Choose an AI Provider
TTS works with any provider, so select Anthropic, OpenAI, or another supported provider.
Set Thread Storage and Execution
Choose Superinterface Cloud to handle transcription and speech generation.
Pick a Model
Select any model you prefer. For this example, we’ll use Claude 3.5 Sonnet.
Setting up a voice assistant with TTS in Superinterface.
Add Personalization
Customize Your Assistant
Set a name and define its behavior.
Set an Initial Message
Add a welcome message like “Hi! How can I assist you today?” This will be spoken aloud when the voice chat starts.
Save Your Assistant
Click Save to store your settings.
Step 2: Publish Your TTS Voice Interface
Creating an interface for your TTS-based voice assistant.
Once your assistant is set up, it’s time to enable TTS-based voice chat and publish the interface.
Enable Voice Chat
Create a New Interface
Navigate to the Publish tab and click Choose Interface, then Create New Interface.
Select Interaction Mode: Voice Chat
This tells Superinterface to use OpenAI’s Whisper API to transcribe user speech and its TTS API to generate spoken replies.
Enabling Voice Mode in Superinterface.
Publish Your Assistant
Finalize Publishing
Choose your publishing method: subdomain, custom domain, script tag, or React component.
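For the script tag option, the dashboard generates the exact snippet for you after publishing. Purely to illustrate the general shape of such an embed, here is a hypothetical helper; the script URL, attribute name, and `assistantId` parameter are all assumptions, not Superinterface’s actual embed code.

```typescript
// Hypothetical sketch of a script-tag embed snippet.
// The host (example.com) and data attribute are placeholders;
// copy the real snippet from the Superinterface dashboard instead.

function embedSnippet(assistantId: string): string {
  return (
    `<script src="https://example.com/widget.js" ` +
    `data-assistant-id="${assistantId}" async></script>`
  );
}
```

Whichever method you choose, the published interface behaves the same way at runtime.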
When users speak, their voice is transcribed using Whisper, processed by the assistant, and converted into speech using OpenAI’s TTS models.
Voice assistant using OpenAI's TTS and Whisper APIs via Superinterface.
You’re all set!
Now go ahead and let your users experience natural, high-quality voice interactions with your AI assistant—no realtime model required.