I wonder whether the API can produce a single audio file with two or more voices alternating, like a movie script, instead of producing separate audio files from partial prompts that then have to be assembled manually.
Has anyone tested this? I cannot seem to make it work.
I came up with this:
const voiceMapping = {
  "Mark": "alloy",
  "Joan": "nova"
};

const script = [
  "[Mark]: Hello, how are you?",
  "[Joan]: I'm good, thanks! How about you?",
  "[Mark]: I'm doing well!",
  "[Character 3]: Did you hear about the news?",
  "[Joan]: No, what happened?"
];

async function convertToAudio(script) {
  const apikey = localStorage.getItem("openaikey");
  const audioChunks = [];
  for (const line of script) {
    const match = line.match(/^\[(.+?)\]: (.+)$/);
    if (match) {
      const character = match[1].trim(); // Extract character name
      const dialogue = match[2].trim(); // Extract dialogue
      const selectedVoice = voiceMapping[character]; // Look up voice
      if (selectedVoice) {
        try {
          const response = await fetch("https://api.openai.com/v1/audio/speech", {
            method: "POST",
            headers: {
              Authorization: `Bearer ${apikey}`,
              "Content-Type": "application/json"
            },
            body: JSON.stringify({
              model: "tts-1",
              input: dialogue,
              voice: selectedVoice
            })
          });
          if (!response.ok) {
            // Include the response body, which carries the API's error details
            throw new Error(`Error ${response.status}: ${await response.text()}`);
          }
          const blob = await response.blob();
          audioChunks.push(URL.createObjectURL(blob)); // Keep a playable URL for this line
        } catch (error) {
          console.error("Error while converting TTS: ", error);
        }
      } else {
        console.log(`No voice mapping found for character: ${character}`);
      }
    } else {
      console.log(`Line not in expected format: ${line}`);
    }
  }
  playAudioChunks(audioChunks); // Play the collected chunks in script order
}

function playAudioChunks(chunks) {
  const audioPlayer = document.getElementById("audioPlayer");
  const playNext = (index) => {
    if (index < chunks.length) {
      audioPlayer.src = chunks[index];
      audioPlayer.onended = () => playNext(index + 1);
      audioPlayer.play();
    }
  };
  playNext(0); // Start with the first chunk
}
The routine is called via a button that calls convertToAudio(script);
The problem is that I keep getting a Bad Request error, and the chunks are not assembled.
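As far as I can tell, the speech endpoint takes a single voice per request, so one request per line followed by a merge seems unavoidable. Below is a minimal sketch of the steps around the request, kept separate from the network call so they can be checked on their own. The names parseScript, mergeAudioBlobs, scriptToSingleBlob and synthesizeLine are hypothetical helpers of mine, not part of any API. It also assumes every chunk is MP3 and that concatenated MP3 data plays back correctly, which usually holds but is worth verifying in your target browser.

```javascript
// Hypothetical helper: split script lines into { character, voice, text } segments.
// Lines that do not match "[Name]: text", or whose speaker has no mapped voice,
// are returned separately so they can be reported instead of silently dropped.
function parseScript(lines, voiceMapping) {
  const segments = [];
  const skipped = [];
  for (const line of lines) {
    const match = line.match(/^\[(.+?)\]:\s*(.+)$/);
    const character = match ? match[1].trim() : null;
    if (character !== null && Object.hasOwn(voiceMapping, character)) {
      segments.push({
        character,
        voice: voiceMapping[character],
        text: match[2].trim()
      });
    } else {
      skipped.push(line);
    }
  }
  return { segments, skipped };
}

// Hypothetical helper: join per-line audio blobs, in order, into one Blob.
// Assumption: all chunks share the same format (MP3 here).
function mergeAudioBlobs(blobs) {
  return new Blob(blobs, { type: "audio/mpeg" });
}

// synthesizeLine(text, voice) is supplied by the caller and must resolve to a
// Blob, for example by wrapping the fetch call from the post above.
async function scriptToSingleBlob(lines, voiceMapping, synthesizeLine) {
  const { segments } = parseScript(lines, voiceMapping);
  const blobs = [];
  for (const { text, voice } of segments) {
    // Requests run one at a time so the chunks stay in script order.
    blobs.push(await synthesizeLine(text, voice));
  }
  return mergeAudioBlobs(blobs);
}
```

With this shape, the button handler could set audioPlayer.src to URL.createObjectURL(...) of the merged Blob and play it as one file, and logging the skipped lines and the exact text and voice of each segment before sending anything may help track down which request triggers the Bad Request.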