New models appear on the API

sps · November 11, 2023, 5:34am

I was looking at the models listed on the API and found two new models listed on the /models endpoint:

canary-whisper - Latest whisper Yay! (likely V3)
canary-tts - not sure what’s added to tts, but latest
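For anyone who wants to spot additions like these without eyeballing the list, here is a minimal sketch. It assumes the `openai` Python library v1.x and an `OPENAI_API_KEY` environment variable; the helper names `new_model_ids` and `fetch_model_ids` are made up for illustration.

```python
def new_model_ids(previous_ids, current_ids):
    """Return model IDs present in the current listing but not in the earlier one."""
    return sorted(set(current_ids) - set(previous_ids))


def fetch_model_ids():
    """List model IDs from the /models endpoint (assumes openai v1.x is installed)."""
    from openai import OpenAI  # imported lazily so the diff helper works without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    return [model.id for model in client.models.list()]


if __name__ == "__main__":
    # Hypothetical earlier snapshot; in practice you would load a saved list.
    earlier = ["tts-1", "tts-1-hd", "whisper-1"]
    print(new_model_ids(earlier, fetch_model_ids()))
```

Saving the output of `fetch_model_ids()` to a file each day and diffing against it would flag new entries as they show up.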

_j · November 13, 2023, 8:29pm

Same, then a search came back with the forum. Usually my searches for specialized areas of knowledge lead to my own writings on the topic already…

canary-tts, created Wednesday, November 8, 2023 5:22:15 PM California time
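The date above comes from the `created` field on the model object, which is a Unix timestamp in seconds. A rough sketch of the conversion, assuming Python 3.9+ with time zone data available; `created_to_california` is a hypothetical helper name, and the timestamp in the demo is rebuilt from the date quoted above rather than copied from the API.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def created_to_california(created):
    """Convert a Unix timestamp (seconds) to a datetime in California local time."""
    utc_time = datetime.fromtimestamp(created, tz=timezone.utc)
    return utc_time.astimezone(ZoneInfo("America/Los_Angeles"))


if __name__ == "__main__":
    # Rebuild the timestamp from the quoted date: 5:22:15 PM PST is 01:22:15 UTC the next day.
    created = int(datetime(2023, 11, 9, 1, 22, 15, tzinfo=timezone.utc).timestamp())
    print(created_to_california(created).strftime("%A, %B %d, %Y %I:%M:%S %p"))
```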

Canary, like a canary in a coalmine, to detect who will play with random models? Canary is also the bleeding-edge Windows 11 version for insiders.

Still takes the same voice specifications:

(enum_values=[<Voice.NOVA: 'nova'>, <Voice.SHIMMER: 'shimmer'>, <Voice.ECHO: 'echo'>, <Voice.ONYX: 'onyx'>, <Voice.FABLE: 'fable'>, <Voice.ALLOY: 'alloy'>])

I can say that the voice quality is significantly degraded at speed 1.2, with a choppy effect, so I won’t try this on other models either. The speed change is likely applied as post-processing.
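A sketch of how a request with a voice and speed might be put together, assuming the `openai` Python library v1.x. The helper names are hypothetical, and the 0.25 to 4.0 speed range is what I recall from the API reference, so treat it as an assumption.

```python
# The six voices listed in the enum above.
VOICES = {"nova", "shimmer", "echo", "onyx", "fable", "alloy"}


def speech_request(model, voice, text, speed=1.0):
    """Build keyword arguments for a speech request, checking voice and speed locally."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    # Assumption: the documented speed range is 0.25 to 4.0, with 1.0 as default.
    if not 0.25 <= speed <= 4.0:
        raise ValueError(f"speed out of range: {speed}")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "speed": speed,
        "response_format": "flac",
    }


def synthesize_to_file(path, **request):
    """Send the request and write the audio bytes to path (assumes openai v1.x)."""
    from openai import OpenAI  # lazy import keeps the builder usable offline

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.audio.speech.create(**request)
    with open(path, "wb") as audio_file:
        audio_file.write(response.read())
```

Something like `synthesize_to_file("out.flac", **speech_request("canary-tts", "alloy", "Hello", speed=1.2))` would reproduce the speed test described above.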

testing generation speeds

tts_canary-tts_alloy_20231113_093208.flac took 2.64 seconds
tts_tts-1_alloy_20231113_093230.flac took 2.51 seconds
tts_tts-1-hd_alloy_20231113_093254.flac took 2.85 seconds
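Timings like these can be collected with a small harness along the following lines. This is only a sketch, not the script that produced the numbers above: `time_generation` is a hypothetical name, and `synthesize` stands in for whatever function actually calls the speech endpoint.

```python
import time


def time_generation(synthesize, models, voice, text):
    """Call synthesize(model, voice, text) once per model and record wall-clock seconds."""
    timings = {}
    for model in models:
        start = time.perf_counter()
        synthesize(model, voice, text)
        timings[model] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Stand-in synthesizer so the harness runs without network access.
    def fake_synthesize(model, voice, text):
        time.sleep(0.01)

    results = time_generation(
        fake_synthesize, ["canary-tts", "tts-1", "tts-1-hd"], "alloy", "Hello"
    )
    for model, seconds in results.items():
        print(f"{model} took {seconds:.2f} seconds")
```

A single run per model is noisy, since network latency is mixed in, so averaging several runs would give a fairer comparison.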

The new billing usage makes it impossible to find the cost.

sps · November 13, 2023, 8:33pm

There are things on canary-tts that are amazing! Like speech modulation.

_j · November 14, 2023, 11:24am

It took another day to see billing, with their poor breakdown. I didn’t log all the inputs as I was also writing a utility library to ease the chunking of document files for max character limits, audio file appending, and the awkwardness of the Python library.
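The chunking part could look roughly like this. It is a minimal sketch and not the library mentioned above: `chunk_text` is a hypothetical name, and the 4096-character default is the input limit as I recall it from the API reference, so verify it before relying on it.

```python
def chunk_text(text, max_chars=4096):
    """Split text into chunks of at most max_chars, preferring to break at a space."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        # Find the last space that still leaves the chunk within the limit.
        cut = remaining.rfind(" ", 0, max_chars + 1)
        if cut <= 0:
            cut = max_chars  # no space in range: fall back to a hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk would then go through its own speech request, with the resulting audio files appended in order. Splitting on sentence boundaries instead of spaces would probably sound better at the joins.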

Imagine, set class object parameters once if you don’t like my defaults, then keep feeding methods text for the existing filename if not just calling by input/output file. Now uninspired to continue writing though as it would not be for me.

(The TTS fails on reading back OICU812RUOK for maximum audio per input.)

sps · November 15, 2023, 10:47pm

Here’s a tiny sample of speech modulation on canary-tts:

This has been completely made with the API. No post processing.

Foxalabs · November 15, 2023, 10:51pm

So what was the bit at the beginning? A sob?

This is a sentence.
These are words in a sequence

Experimented last night and could not get the thing to work.

sps · November 15, 2023, 10:56pm

That was a soft laugh. It’s not engineered, the model knows what a soft laugh is IMO.

I’ll be writing about this more, once I have done some more tests.

N2U · November 15, 2023, 10:57pm

Did you just prompt [soft laugh] or?

To me it sounds a bit like inhaling?

sps · November 15, 2023, 11:02pm

Yes, soft laugh and then some more. Certainly there’s room for more detail, will be sharing more soon.

ostov · November 16, 2023, 10:20am

Could you please give me the prompt? I have not been able to recreate this.

_j · November 16, 2023, 2:24pm

Here’s a test: check your billing the day after using this canary-tts.

Without having used any untrained completion models like davinci-002, do you see “base models” with an extraordinary charge in your billing usage for the day?

The rate limits for the model are not under “audio”, but rather are rated as completion with completion token limits - and so are the bills?

sps · November 16, 2023, 3:08pm

Nope nothing like that. It’s billed under its own category:

And I did use base model davinci for an experiment which is reflected here:
Screenshot

sps · November 16, 2023, 5:52pm

Another model is now added to the API: whisper-1-1p
