gpt-3.5-turbo-0125 is available via the API now. Share your first impressions

Hi guys! What are your first impressions? Any progress compared to 1106? It would be interesting to read your comments.

1 Like

This thing is blasting whole sentences and parts of paragraphs at once on the playground at a tremendous rate (probably because the whole chunk is first content filtered…)

Same effect when -instruct came out, 100+ tokens per second before everyone found it.

I'll have to move over to a device with scripts to see what streaming looks like in terms of chunk size and any modifications, and to run the current 1106 issues against this model. Then I'll see whether it follows system instructions that haven't worked since 0613 was degraded in September — so badly that even OpenAI's own cookbook examples were broken.

Answer 1: The problem of wrongly calling multi-tool functions when they are completely unneeded persists. This issue has appeared across the latest models, and even in ChatGPT to varying degrees, starting in the last week.
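One workaround people use for unwanted tool calls is to force `tool_choice="none"` on turns where no tool should run — that parameter is part of the standard Chat Completions tools API, though the helper below and its name are just my own sketch:

```python
# Sketch of a mitigation for unwanted tool calls: build the request body
# yourself and set tool_choice="none" on turns where no tool should fire.
# build_chat_request is a hypothetical helper, not an SDK function.

def build_chat_request(messages, tools=None, force_no_tools=False):
    """Assemble a Chat Completions request body as a plain dict."""
    payload = {"model": "gpt-3.5-turbo-0125", "messages": messages}
    if tools:
        payload["tools"] = tools
        if force_no_tools:
            # "none" tells the model to answer in text and never call a tool
            payload["tool_choice"] = "none"
    return payload

req = build_chat_request(
    [{"role": "user", "content": "What is the capital of France?"}],
    tools=[{"type": "function",
            "function": {"name": "get_weather",
                         "parameters": {"type": "object", "properties": {}}}}],
    force_no_tools=True,
)
print(req["tool_choice"])  # → none
```

You'd only flip `force_no_tools` on when your own routing logic decides the turn is pure conversation, which sidesteps the model's over-eager tool selection entirely.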

2 Likes

I am testing using a multi-language customer support chatbot and it works as expected, at least in this app. It also appears fast compared to 1106.

1 Like

For now, anyway. We get to see where it tops out before the load arrives.

—gpt-3.5-turbo—
[128 tokens in 2.5s. 52.2 tps]
[1024 tokens in 15.4s. 66.6 tps]

—gpt-3.5-turbo-1106—
[128 tokens in 5.4s. 23.8 tps]
[1024 tokens in 23.1s. 44.3 tps]

—gpt-3.5-turbo-0125—
[128 tokens in 1.6s. 82.2 tps]
[1024 tokens in 7.3s. 140.2 tps]

(each also writes a different test document title…)

We also get to find out where the 50% savings on input and 25% savings on output come from in terms of the computation allotted.
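For anyone wanting to reproduce numbers like the above, here is a minimal timing harness I'd sketch — the token count uses a crude ~4 chars/token estimate (swap in a real tokenizer like tiktoken for exact figures), and the API call in the comment assumes the standard streaming Chat Completions interface:

```python
import time

def measure_stream(chunks, chars_per_token=4):
    """Consume an iterable of streamed text chunks and report throughput.

    Token count is a rough estimate (~4 chars/token for English);
    use a real tokenizer for exact numbers.
    Returns (estimated_tokens, elapsed_seconds, tokens_per_second).
    """
    start = time.perf_counter()
    chars = 0
    for chunk in chunks:
        chars += len(chunk)
    elapsed = time.perf_counter() - start
    tokens = chars / chars_per_token
    tps = tokens / elapsed if elapsed > 0 else float("inf")
    return tokens, elapsed, tps

# With the real API you would feed it the streamed deltas, e.g.:
#   stream = client.chat.completions.create(
#       model="gpt-3.5-turbo-0125", messages=msgs, stream=True)
#   measure_stream(c.choices[0].delta.content or "" for c in stream)
tokens, secs, tps = measure_stream(["Hello, ", "world! "])
```

Running the same prompt at a few different `max_tokens` settings, as in the table above, separates per-request overhead from steady-state generation speed.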

2 Likes

Since I only use it for a casual chat app, I don’t see much difference from 1106. It’s decent for a non-critical chat app, and it’s cheaper. :grinning:

It still doesn’t follow instructions to write long speeches, like “write 500 tokens.” Meanwhile, 3.5-0613 follows it closely. When they drop 3.5-0613, I’ll switch to the 3.5-instruct model to generate long speeches.

I think it can’t be helped. It’s cheap.

Hello guys, good news: I’m implementing it in a Neo4j app (chatbot and text analysis). However, I can’t fine-tune it yet — have you heard anything about this? I know GPT-4 Turbo isn’t available for fine-tuning, but I was hoping 3.5-0125 would open for fine-tuning this week.

EDIT: found this in the docs
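Whenever 0125 fine-tuning does open up, the training data will presumably still need to be the same chat-format JSONL the fine-tuning endpoint already accepts for 3.5. A small pre-upload sanity check (the helper name is my own) could look like:

```python
import json

def validate_chat_jsonl(lines):
    """Return a list of (line_number, error) for malformed fine-tuning rows.

    Each line must be a JSON object with a non-empty 'messages' list, and
    every message needs 'role' and 'content' keys (the chat fine-tuning format).
    """
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((i, f"invalid JSON: {exc}"))
            continue
        messages = row.get("messages")
        if not isinstance(messages, list) or not messages:
            errors.append((i, "missing non-empty 'messages' list"))
            continue
        for msg in messages:
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                errors.append((i, "each message needs 'role' and 'content'"))
                break
    return errors

good = '{"messages": [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]}'
bad = '{"messages": []}'
print(validate_chat_jsonl([good, bad]))  # → [(2, "missing non-empty 'messages' list")]
```

Catching format errors locally saves a round trip, since a bad file otherwise only fails after upload when the job is created.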

Yes @ivanpzk, same here

Any idea, anyone, when they will open it for fine-tuning?