Do you have any information on whether the Assistants API will make it easier for us to retrieve the actual text the assistant is using from the vector store? I believe v1 had an easier way to do this, but it seems to have been removed in v2.
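For anyone else stuck on this, the closest I've found in v2 is pulling result content off the run steps. A rough sketch below, assuming the run-steps `include` parameter still behaves as documented; the thread and run IDs are placeholders:

```python
from openai import OpenAI

client = OpenAI()

thread_id = "thread_abc123"  # placeholder: your thread
run_id = "run_abc123"        # placeholder: a completed run that used file_search

# Ask for the retrieved chunk text to be included with each file_search result.
steps = client.beta.threads.runs.steps.list(
    run_id=run_id,
    thread_id=thread_id,
    include=["step_details.tool_calls[*].file_search.results[*].content"],
)

for step in steps.data:
    if step.step_details.type != "tool_calls":
        continue
    for call in step.step_details.tool_calls:
        if call.type != "file_search":
            continue
        for result in call.file_search.results:
            # With `include` set, each result should carry the chunk text the assistant saw.
            for part in result.content or []:
                print(result.file_id, part.text[:200])
```

It works, but it's a lot of ceremony compared to just getting the quoted text back on the message, which is what I assume v1 felt like.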
Great news, devs have been waiting patiently for our Shipmas present!
With o1 now having vision capabilities (which I presume will be released in the API in the future), how should we be thinking about using it for use cases that typically require in-context learning?
Is it effective to give the model a multi-shot prompt where each example contains the ~5 images that will be encountered in the real world, or is a different strategy recommended?
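To make that concrete, here is roughly the structure I mean, sketched against Chat Completions with gpt-4o since o1 vision isn't in the API yet; the inspection task, labels, and image URLs are placeholders of mine:

```python
from openai import OpenAI

client = OpenAI()

PROMPT = "Classify the defect shown across these images."  # placeholder task

def example_turn(image_urls, label):
    # One few-shot example: the ~5 images seen together in the real task, plus the expected answer.
    content = [{"type": "text", "text": PROMPT}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return [
        {"role": "user", "content": content},
        {"role": "assistant", "content": label},
    ]

messages = [{"role": "system", "content": "You are an inspection assistant."}]
# Two labeled example sets (placeholder URLs); a real few-shot prompt would use real cases.
messages += example_turn([f"https://example.com/case1_{i}.jpg" for i in range(5)], "hairline crack")
messages += example_turn([f"https://example.com/case2_{i}.jpg" for i in range(5)], "no defect")

# The new, unlabeled case goes in as a final user turn with no assistant answer.
query = [{"type": "text", "text": PROMPT}]
query += [{"type": "image_url", "image_url": {"url": f"https://example.com/query_{i}.jpg"}} for i in range(5)]
messages.append({"role": "user", "content": query})

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
```

My open question is whether 10+ example images like this genuinely helps a reasoning model, or whether detailed text descriptions of each example would work just as well.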
Is there any news on the Assistants API, particularly File Search updates (metadata, image parsing, etc.)? Citations with exact text references are still missing as well.
Thanks and happy Shipmas!
Ask a hypothetical question now, without knowing what you're asking about until tomorrow? Clever twist.
“…so after that announcement, how long until o1-preview is shut off, if it will be?”
“…a parameter to tune reasoning length, from a minimum (for continued chat quality and speed) up to Pro 0-shot performance (that might confuse who even asked?)”. [answered in the stream: yes, but not up to o1 Pro level; that's reserved for a future model]
“…an inter-call ID to reference and continue prior reasoning context that we can't supply via the API but have already paid for?”
“…with that additional context length, will context caching see further competitive discounts?”
Or what can be known:
On the {n} day of Chanukah, my true love gave to me:
how about complete developer customization of the Assistants file search tool's text, i.e. the internal instructions of the tool, so it can be adapted to the application: to match the provider and authority of the files (user or developer), to describe what will be found behind the search, to prompt HyDE queries of high semantic quality for the data, and so on.
and a working max_num_results Assistants run parameter
Why don't vector store results return file names for chunks (like ChatGPT does), only IDs? With file names, the user or organization could simply discuss the uploaded source file.
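Right now the only workaround I know of is a second lookup per ID; a small sketch, assuming the file IDs that come back in citations are still retrievable through the Files API:

```python
from openai import OpenAI

client = OpenAI()

_filename_cache = {}

def filename_for(file_id: str) -> str:
    # Resolve a cited chunk's file_id to its original filename, with a tiny cache
    # so repeated chunks from the same file don't trigger repeated API calls.
    if file_id not in _filename_cache:
        _filename_cache[file_id] = client.files.retrieve(file_id).filename
    return _filename_cache[file_id]

# e.g. for a file_citation annotation on an assistant message:
# print(filename_for(annotation.file_citation.file_id))
```

One extra round trip per file just to show the user which document was quoted.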
The Assistants API has been in beta, with its associated rate limits, for over a year now - why should we as developers continue to waste time prototyping solutions with OpenAI when we can't actually ship anything? No roadmap or communication. Either kill it or release it - permanent betas are just toxic.
DALL·E seed parameter, pretty please! We need to be able to generate consistent images.
Vision output improvements in GPT-4o and future models. Specifically, the most important capability would be returning (x, y) coordinates of things within a given image.
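To make the ask concrete, this is the kind of call I'd want to work reliably. The schema is just my own illustration on top of structured outputs, and today's models are not dependable at pixel-level localization:

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Return the pixel coordinates of the 'Submit' button."},
            {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},  # placeholder
        ],
    }],
    # Constrain the answer to a simple {x, y} object via structured outputs.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "point",
            "schema": {
                "type": "object",
                "properties": {"x": {"type": "integer"}, "y": {"type": "integer"}},
                "required": ["x", "y"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)
print(completion.choices[0].message.content)  # e.g. {"x": 512, "y": 384} -- if only it were accurate
```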
When will DALL·E’s seed functionality be available via the API?
This feature has been available in ChatGPT for over a year and is essential for developers. DALL·E's prompt adherence and quality are unmatched, and its seed functionality delivers more consistent results than competing image generation models. Releasing it via the API would unlock incredible potential for the broader developer community. Please consider prioritizing this!
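To be concrete about the request, this is roughly what the call would look like. Note that seed is not an existing parameter on the Images API today; it's the hypothetical addition being asked for, passed through extra_body purely to show the shape:

```python
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor fox in a winter forest",
    size="1024x1024",
    # Hypothetical: `seed` is not accepted by the API today. The request is that the same
    # prompt + seed would reproduce the same image, as it already can inside ChatGPT.
    extra_body={"seed": 42},
)
print(result.data[0].url)
```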
Do you have any plans to give users more access to the model's internals in the API, such as token probabilities, or control over censorship levels? For example, Google's AI Studio allows users to adjust censorship settings more granularly.
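On the probabilities half: unless I'm mistaken, token log probabilities are already exposed on Chat Completions, so here's a quick sketch of what's available today (the finer-grained censorship control is the part that doesn't exist yet):

```python
import math

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Is the sky blue? Answer yes or no."}],
    logprobs=True,     # return the log probability of each sampled token
    top_logprobs=5,    # plus the 5 most likely alternatives at each position
    max_tokens=1,
)

for token_info in response.choices[0].logprobs.content:
    print(token_info.token, round(math.exp(token_info.logprob), 4))
    for alt in token_info.top_logprobs:
        print("  alt:", alt.token, round(math.exp(alt.logprob), 4))
```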
First, thanks for sharing all these cool little (and big) updates so far.
As a Pro tier user (previously Teams user) and avid user both of the different apps and API, I have a few feature requests / ideas:
An option for devs to add an API key inside the ChatGPT app and adjust model params in-app (even if that means responses are more censored due to regulations, etc.)
Remove custom instructions from o1. Prompting reasoning models and prompting regular instruct models are not the same, and I believe custom instructions should be disabled by default for reasoning models.
GPTs are super cool and a powerful feature in any AI toolkit, but there is room for improvement in effectively measuring GPT performance. Can we have an Elo-based rating system to find the best GPTs for different categories (coding, math, graphics, etc.)?
Thoughts on the following list of features for devs:
- A free but low RPM/RPD dev endpoint for building, especially helpful for smol devs
- A free, limited number of [V] fine-tunes & RL runs per week would also be helpful. ><'
- A Fine-tune Store to buy & sell fine-tuned GPTs and/or standalone training runs as skills/concepts/knowledge/behaviors, so we could stack a bajillion vision FTs and OpenAI could collect the training runs/data too!
- o1 with vision via the API (with reasoning-guide params) -- I assume it's probably close lol
- Vision models with updated fine-tuning and a computer use/screen endpoint
and ty for all your hard work guys and for the OpenAI 12 days of Shipmas, it’s been fun. :3
I would like to propose a few ideas for your consideration regarding future access to such training opportunities:
Additional Free Training Periods: Would it be possible for OpenAI to schedule additional free training periods in the future? These initiatives are invaluable for students, researchers, and small developers who may not have the financial means to invest in full training sessions but are eager to contribute meaningful projects to the community.
Retaining Deprecated Models for Free Use: Instead of fully deprecating certain models annually, OpenAI might consider maintaining select models for free or discounted access. This approach could allow users to continue their research and development without incurring significant costs while enabling OpenAI to foster an even broader community of innovation.
Permanent Cost Reductions for Training: To make training more accessible, OpenAI could consider permanently reducing token and epoch costs for training by 50% or more. For instance:
Reducing token requirements (e.g., from 500,000 tokens to 250,000 tokens or less).
Limiting epoch counts (e.g., from 10 epochs to 5 epochs).

Such adjustments could significantly lower financial barriers for many users while ensuring that the computational demand remains manageable. This approach would be particularly impactful for students, researchers, and smaller organizations looking to experiment with fine-tuning but constrained by budget limitations.
When will you increase the maximum input token limit in the API? We are developing a data analysis tool and this is a must. As it stands, it will be replaced by Gemini.
Would love to hear from the API team whether the endless-newline bug with streaming structured outputs was fixed, where the API spits out newlines for 90 seconds and then disconnects. GIF of it here; thread in the community forums here. Thanks!
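In case it helps anyone else hitting the same thing, here's the client-side guard I've been using: it just bails out if the stream degenerates into whitespace-only deltas. A sketch against the plain stream=True Chat Completions path with a JSON schema response format; the prompt, schema, and threshold are mine:

```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Summarize this ticket as JSON."}],  # placeholder prompt
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "summary",
            "schema": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
    stream=True,
)

chunks, whitespace_run = [], 0
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    # Count consecutive deltas that contain nothing but whitespace (the newline spiral).
    whitespace_run = whitespace_run + 1 if delta and not delta.strip() else 0
    if whitespace_run > 200:  # arbitrary cutoff: assume the stream has degenerated
        print("aborting: whitespace-only stream")
        break
    chunks.append(delta)

print("".join(chunks))
```

Obviously a band-aid; a server-side fix is what we actually need.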