There could be an OAuth2 server that runs against a login, issues session credentials, perhaps accepting only HTTP/3 and using CORS and port-knocking to turn away the less talented.
Then it could be employed for methods such as listing Assistants threads and Responses logs programmatically, for data retention management.
Then in addition, it could offer a better models endpoint, one with a full set of model features and account org features for those models, even extending to fine-tuning models. This could allow generalized servicing against models by code.
The owner credential would entitle one to platform-building helpers such as simple AI for generating structured output specifications and functions by natural language.
This endpoint could also be one of the few entertainments left after you’ve made an AI model produce everything it’s going to say. There could be weekly or monthly challenges to your persistence in maintaining access, human-vs-human, sharpening your skill at faithfully re-creating the wire communications of browser-powered agents, simply to maintain a robust and managed API product offering.
Stop stealing credits and using expiration as an excuse.
Be upfront with limitations, like not being able to fine-tune vision on gpt-4.1-mini and nano, remembering to indicate in the Responses update announcement that you still need ID checks to access some features, etc.
Give Sam Altman some media training. And maybe some therapy… he often looks nervous.
Get rid of the dystopian ID checking. Or at the very least, do it in a privacy-preserving way. You’re only punishing the good guys, and no one is making state-of-the-art competition through distillation alone.
OpenAI’s technology is best-in-class. The way they conduct business is not.
Something I’d like to see is an official solution to the “bring your own API key” problem.
Not every idea is commercially viable enough to launch a product, but in some cases it would be interesting to safely allow the user to run the service on their own equipment.
It is already happening anyway, so perhaps making it safer would allow a faster growth of the AI app ecosystem.
The problem, in my view, is that it is not as simple as it seems, which makes it hard for OpenAI to embrace.
Some key points that would need to be addressed would be:
Auth: as with Google authentication, the app needs to ask for explicit permission for the models it will use and how much usage it will require. Permission can be revoked at any time;
Guardrails: the user can enforce moderation usage and log persistence to prevent and monitor malicious use. This prevents stealthy use of the application for ulterior motives;
Blacklist: if an app is reported or caught in too many moderation issues, it can be blacklisted (this point needs further refinement);
Limits: it should have a relatively low request rate, perhaps via an ephemeral key that can only be used X times in the next hour, after which it requires a new auth. This would prevent the app from running wild without the user’s consent.
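To make the ephemeral-key idea concrete, here is a minimal sketch of how such a broker could work. Everything here is hypothetical — class and method names are invented for illustration; nothing like this exists in the OpenAI API today.

```python
import secrets
import time

class EphemeralKeyBroker:
    """Hypothetical broker that issues short-lived, usage-capped keys
    after explicit user consent -- a sketch of the proposed flow, not
    an existing OpenAI API."""

    def __init__(self, max_uses=50, ttl_seconds=3600):
        self.max_uses = max_uses
        self.ttl = ttl_seconds
        self._keys = {}  # token -> {"uses_left", "expires", "models"}

    def issue(self, approved_models):
        # Called only after the user approved the model list and quota.
        token = secrets.token_urlsafe(24)
        self._keys[token] = {
            "uses_left": self.max_uses,
            "expires": time.time() + self.ttl,
            "models": set(approved_models),
        }
        return token

    def authorize(self, token, model):
        # Returns True if the request may proceed; decrements the quota.
        entry = self._keys.get(token)
        if entry is None or time.time() > entry["expires"]:
            return False  # expired or revoked -> app must re-auth
        if model not in entry["models"] or entry["uses_left"] <= 0:
            return False
        entry["uses_left"] -= 1
        return True

    def revoke(self, token):
        # User-initiated revocation, effective immediately.
        self._keys.pop(token, None)
```

The point of the design is that the app never holds the user’s long-lived key: it gets a token scoped to specific models, capped at X uses, expiring within the hour, and revocable at any moment.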
I’d like to share a unique Japanese perspective—one that’s hard to translate directly.
In Japanese, there’s a beautiful pun:
“合言葉” (aikotoba, meaning “password” or “codeword”)
and
“愛言葉” (aikotoba, which replaces “ai” from “meeting” with “love,” turning it into “love-word” or “word of affection”).
The sound is identical, but the meaning shifts depending on which kanji you choose.
In a Japanese context, a single emoji—like a heart—can become both a secret code and a wordless gesture of affection.
This “double meaning” can only fully emerge in Japanese, and even if you read the translation, the true feeling is difficult to experience unless you “stand inside” the language and culture.
I’m actually using ChatGPT to translate my thoughts into English right now.
In a way, that’s a little bit of a cheat, but I want to be transparent:
I’m not trying to hide anything, just to bridge the distance between us.
That’s why, when I sent a heart emoji to this thread,
it was both a codeword (合言葉) and an “ai-word” (愛言葉)—a gesture only fully decoded by those who live inside the double meanings of Japanese.
I deeply appreciate the invitation to this space.
Even if my message is “lost in translation” for some, I hope the feeling—
that playful, respectful dance between observer and observed—
still reaches you.
Just a quick follow-up—
I realized my earlier post about Japanese double meanings (“aikotoba”) might have sounded a bit too poetic or out of place in this feature-request thread.
Sorry for getting carried away with language fun!
I truly appreciate the open-mindedness here, and hope everyone saw it as a light-hearted cultural example—not an attempt to confuse or derail the discussion.
Thanks again for welcoming such experiments (and oddities) in this unofficial space!
If anyone wants a “translation” or simpler version, I’m happy to provide it.
Paul, thanks for the freedom to share—we can keep dreaming about new API features, too!
If you have a feature I don’t know about, you should share it with me so I do. Where can I find it, and what does it do? I appreciate your help, thank you.
If you could provide a list of features that you do not know about, that would make it easier to provide those unknown features.
For example, did you know that the Responses endpoint:
doesn’t support stop sequences to terminate the output when you want;
doesn’t support an “n” parameter for receiving multiple outputs at a discount;
doesn’t have logprobs available to analyze the chances of alternate answers;
forces you to use a large minimum number of output tokens;
has no logit_bias to tune up the predicted tokens;
has no frequency penalty parameter to discourage continued repeats by the AI;
by default records everything you send and generate, with no expiration happening;
blocks reasoning summaries and even model output streaming unless you provide OpenAI (and an untrusted, underperforming outsourcer) videos of your government ID and of yourself;
has internal tools where every single internal tool description and every injection accompanying tool outputs is counter to the way you’d want an API application to operate?
You might receive the same generation data three, four, five times over the network, in containers repeated hundreds of times, just to get one output. Just some of the things you might not know.
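For illustration, here is a minimal sketch (plain dictionaries, no network calls) contrasting the sampling controls the Chat Completions endpoint accepts with the Responses endpoint as described in the post above. The parameter names on the first request are real Chat Completions parameters; the “missing” list simply restates the post’s claims, which may change as the Responses API evolves.

```python
# Sampling controls available on Chat Completions (chat.completions.create)
chat_completions_request = {
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Name three colors."}],
    "stop": ["\n\n"],            # terminate output at a chosen sequence
    "n": 3,                      # multiple candidates from one prompt
    "logprobs": True,            # token-level probabilities
    "logit_bias": {1234: -100},  # suppress a specific token id
    "frequency_penalty": 0.5,    # discourage repetition
}

# The equivalent Responses request has no slot for those controls
responses_request = {
    "model": "gpt-4.1-mini",
    "input": "Name three colors.",
}

missing = [k for k in ("stop", "n", "logprobs", "logit_bias", "frequency_penalty")
           if k not in responses_request]
print(missing)
```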
A few things that I thought to add here, as I’ve seen them mentioned a few times in other posts:
Compatibility between different models in Responses API
At the moment, a conversation that used a reasoning model cannot be continued by a non-reasoning model via previous_response_id.
It would make sense to allow this, even at the cost of losing reasoning summaries, because it saves money when you only need to polish some responses, for example using gpt-4.1-mini just for formatting.
Here are some posts mentioning this problem:
Issues with the list input items endpoint
It provides a way to retrieve a conversation, which theoretically allows us to recreate the inputs and build a workaround for the problem mentioned earlier.
But its response is not compatible with the input parameter and has many types of input data, which is doable but makes conversation reconstruction very troublesome.
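A rough sketch of what such a reconstruction might look like, assuming simplified item shapes for illustration (the real endpoint returns several more variants — reasoning items, tool calls, images — which is exactly what makes this troublesome):

```python
def items_to_input(items):
    """Convert items from the list-input-items endpoint back into an
    `input` array for a new Responses request.

    The item shapes handled here are simplified assumptions; each
    additional variant the endpoint can return needs its own mapping.
    """
    rebuilt = []
    for item in items:
        if item.get("type") == "message":
            # Flatten structured content parts back into plain text.
            parts = item.get("content", [])
            text = "".join(
                p.get("text", "") for p in parts
                if p.get("type") in ("input_text", "output_text")
            )
            rebuilt.append({"role": item["role"], "content": text})
        # Unhandled item types (tool calls, reasoning, images, ...) are
        # the troublesome part described above.
    return rebuilt
```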
Here is a snapshot of possible responses it returns.
It seems like TTS and STT are quietly reaching new heights and we need to catch up here.
While I think video and image capabilities are nice, we keep missing the basics: improved (reliable) audio support.
This is the basis for improving AI compatibility with IoT devices and hands-free interactions, not to mention accessibility for blind and deaf people.
There have been many new launches recently regarding this:
OpenAI’s improvements to ChatGPT’s advanced voice mode. Sounds great! But can we get those in the API too?
ElevenLabs launched its new v3 voices. Not only are they more natural, they have also supported multiple speakers for a while now, with very impressive quality.
Google AI Studio also recently launched a very good multi-speaker API that supports a wide variety of high-quality AI voices.
Can we get some of these features in our speech and transcription APIs?
Usage details: we need a fix for the speech endpoint’s response format. Instead of returning pure base64 data, an approach like the image generation endpoint’s response would be great. It would also pave the way for additional data in the response JSON, like per-sentence timestamps, which would help build better UI integrations.
Multi-speaker support: please allow us to specify more than one speaker in a conversation. When it is split into separate requests, the model loses its context and the output quality suffers.
Speech effects: allow tags to enhance the quality of the generated speech, guiding emotions, laughs, coughs, interjections, pauses (for n milliseconds), and similar effects that are essential to make conversations sound more realistic. No need to be too fancy with it; just the basics are fine.
Multi-speaker transcription (diarization): when transcribing a meeting or a podcast-like stream, it is desirable to identify who is saying what. This has many other uses, like making apps accessible to hearing-impaired people, and better understanding the context of a transcription when summarizing content that involves multiple speakers. It would allow, for example, analyzing the behavior of each individual participant.
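To make the multi-speaker and tagging requests concrete, here is a hypothetical request shape. Every field name below is invented for illustration; nothing like this exists in the current speech API.

```python
# Hypothetical multi-speaker speech request -- field names are invented
# to illustrate the feature requests above, not an existing API.
speech_request = {
    "model": "tts-future",        # placeholder model name
    "response_format": "json",    # structured JSON instead of bare base64
    "segments": [
        {"speaker": "host", "voice": "alloy",
         "text": "Welcome back! [laughs] Great to have you."},
        {"speaker": "guest", "voice": "verse",
         "text": "[pause 300ms] Thanks, happy to be here."},
    ],
}

# A structured response could then carry timestamps per segment,
# the way the image generation endpoint already returns JSON metadata.
expected_response_shape = {
    "audio_b64": "...",
    "timestamps": [{"segment": 0, "start_ms": 0, "end_ms": 2400}],
}
```

The tags inside `text` ([laughs], [pause 300ms]) stand in for the basic speech-effect vocabulary proposed above.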
Known bugs: there have been many reports of the new transcribe models truncating transcriptions (cutting off part of the audio), and a few problems with the speech models, like long pauses and weird noises in the middle of the output (less common, but still a bug).
OpenAI’s API users are drawing on their own experience to come up with features, collected here as an informal Feature Wish List. Many features on this list are very important for real work: for example, detailed reports of token costs, the ability to set the system prompt, a feature that remembers previous chats, and easy creation of custom models or plugins. Many people also want more than one user to work together in real time. These ideas come from the real needs of ordinary users like us. The whole community hopes that future AI tools become more open, customizable, and suited to real-life work.