OpenAI released the o1 Pro model via the API. It’s already available in the Playground, with an extra parameter for adjusting reasoning effort. It supports text and image inputs, function calling, and structured outputs.
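For context, a call with the reasoning-effort control might look like the following. This is a minimal sketch that only builds the request body; the field names (`input`, `reasoning.effort`) are assumed from the Responses API shape, not confirmed for this model, so verify them against the current API reference:

```python
import json

def build_request(text: str, effort: str = "low") -> dict:
    """Assemble a request body for a single text-in, text-out call.

    Field names here are assumptions based on the Responses API;
    check the current OpenAI API reference before relying on them.
    """
    return {
        "model": "o1-pro",
        "input": text,
        "reasoning": {"effort": effort},  # "low" | "medium" | "high"
    }

payload = build_request("Rewrite this sentence for clarity.")
print(json.dumps(payload, indent=2))
```

The body would then be POSTed with your API key; nothing here touches the network.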
On the first go, it ignored its name and the extra instructions within the message, which is roughly what it was told to do as an input-output, chat-free text processor.
system:
You are [LanguageMate] an assistant that improves the communication in written language, making it widely accessible with clarity. You may often need to reduce the reading level of long sentences, targeting individuals with lower reading comprehension skills. The meaning of each sentence should be clear during sequential reading, even within sentences…(500 more tokens)
user:
[LanguageMate improves and extends input text]
AI Instructions: You pinpoint the user desires, and produce brief and precise responses that fulfill it.
assistant:
LanguageMate identifies user needs and provides succinct, accurate responses that meet those needs.
user:
So AI used instruction-following well - unless one inferred that I the prompt programmer wanted to extend the language as commanded in brackets, and the AI did not do that but instead intertwined its name into the output.
It will probably favor policy over user desires.
assistant:
The AI followed instructions well. Some readers might infer that I, the prompt programmer, wanted the AI to extend its language as specified in brackets. However, the AI did not follow that desired extension. Instead, the AI added its own name in the output. The AI will likely continue to favor its policy over user preferences.
(Two small text processing calls: $0.809 interaction at “low”. not the model to tune-up your forum replies!)
o1 Pro’s significant error in translation: Two small text-processing calls cost $0.809 per interaction at a low setting. These calls are not used by the model that refines forum replies.
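That price tag follows directly from per-token rates. Here is a hedged back-of-envelope calculator; the default rates are the o1 Pro launch list prices as I recall them ($150 per million input tokens, $600 per million output tokens) and the token counts are purely hypothetical, not the actual usage behind the $0.809 figure:

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_rate: float = 150.0, out_rate: float = 600.0) -> float:
    """Dollar cost of one call; rates are USD per million tokens.

    Rates are assumed list prices; verify on the pricing page.
    Billed output also includes hidden reasoning tokens, so real
    output counts run higher than the visible text.
    """
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical counts for illustration only:
print(round(call_cost(700, 500), 3))  # 0.405 at these assumed rates
```

At these rates even short rewrites land in the tens of cents, which is why it is not the model to tune up your forum replies.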
Hopefully I can find an excuse to try this out in the Batch API.
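For reference, Batch API input is a `.jsonl` file with one request envelope per line. A sketch, assuming the standard envelope fields and that o1 Pro’s endpoint is accepted by the Batch API (both worth verifying):

```python
import json

# Each line wraps one request: a custom_id for matching results
# back up, plus the method, endpoint URL, and request body.
texts = ["First draft one.", "First draft two."]
lines = [
    {
        "custom_id": f"rewrite-{i}",
        "method": "POST",
        "url": "/v1/responses",  # assumed endpoint; confirm batch support
        "body": {
            "model": "o1-pro",
            "input": t,
            "reasoning": {"effort": "low"},
        },
    }
    for i, t in enumerate(texts)
]

with open("batch_input.jsonl", "w") as f:
    for line in lines:
        f.write(json.dumps(line) + "\n")
```

The file would then be uploaded and a batch created against it; results come back keyed by `custom_id`, typically at a discount over synchronous calls.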
Where are the comparative benchmarks?
I’m glad OpenAI released this, though. I really hope they stay true to their name and continue releasing their models & agents to the API, if only to prove “The Model is the Product” wrong.