Enhanced Prompt Management

Sorry that you hit the aforementioned edge case. That isn’t the experience we intend for new users.

I just ran through the flow on a brand new account today and can confirm that I was able to use the “Make your first API call” snippet without issue (and without adding a payment method or purchasing credits). I encourage you to try it!

1 Like

Reminder: half a billion ChatGPT users already have an account that lets them log into the API site should they get curious - an account just like mine (case 1).

Accounts are not fungible; you can’t just “make a dozen to try”. An OpenAI account allows no credential reuse, requires a unique email, gives only limited opportunity to reuse a phone number, and none of those login details can ever be changed later. Nor is there a “poof, more organizations for you” button (unless another owner takes over and leaves you with none, and you contact support).

“Sign up again and get more free use” is specifically one reason we can imagine the one-time $5 API credit grant went away completely. That dollar amount was easy to understand, and it is still visible in grant history - except that ChatGPT users got it and it expired without their ever knowing.


I would suggest that the “quickstart” page:

https://platform.openai.com/docs/quickstart?api-mode=responses

which talks about making API calls but doesn’t discuss setup or actually paying for services, be improved.

It should explain how to initiate this flow on any account, and exactly what is available for free use and how it is activated for particular qualifying accounts - while others (I just showed two such orgs) are denied any free quota. Then, the exact mechanism by which such use is “credited” or “made free”, how much you have left per period, and for which endpoints or models; what you’d see once you are looking at your usage and billing; how calls affect your current credit balance if you do pay; and so on.


BTW: “Check your rate limits page” shows your tier, not what is offered for you to consume for free. Here’s my organization auto-named “personal” (created late 2023, added automatically to an account that already had an API org), having just gone through the pictured “flow”, which cannot make calls because its credit balance is $0 (case 2). It doesn’t have all models like gpt-4 or o3 provisioned (as expected), but these “free tier” rate limits are unlikely to indicate that I can make 200 $1 API calls a day to o1:

Just to clarify the off-topic discussion about whether users can make free API calls:

On the free tier it is possible to make calls to some of the mini-models at very low rates.
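For reference, a minimal sketch of such a call with the Python SDK - gpt-4o-mini is only an example here, and which mini models (if any) are callable without a paid balance depends on the organization:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model; availability varies by account
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    max_tokens=30,
)
print(response.choices[0].message.content)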

We can now return to discussing the new prompt managing tools.

2 Likes

Can you explain what I see on your image? What exactly would this feature be?

What you see is Completions (not Chat Completions; a historical screenshot of that would not show this potential).

What is shown in the center dialog - a logprobs display - has not yet been replicated in the user interface for Chat Completions. Beyond that, logprobs are completely missing as a parameter and as a return value in the Responses endpoint.

A mouse hover in that UI shows the probabilities behind each individual token-sampling choice - how certain the AI was in the logits it generated, from which a token is randomly chosen. Selecting a run also delivers a perplexity metric.

If there were a user interface to determine how well your application would perform in tasks like classification and entity extraction, it would certainly be one that shows how well your particular messages and model selection perform - how many trials would produce the “wrong” answer, discovered instantly rather than with hundreds of trials and evals against non-deterministic models.
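As a rough sketch, you can pull that same signal out of the Chat Completions API today (which does accept logprobs) and judge classifier confidence yourself - the model name, labels, and messages below are only placeholders:

import math
from openai import OpenAI

client = OpenAI()

# Hypothetical one-word classification task; gpt-4o-mini is just an example model.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Classify the sentiment as exactly one word: positive, negative, or neutral."},
        {"role": "user", "content": "The update broke my favorite feature."},
    ],
    max_tokens=1,
    logprobs=True,     # return the log probability of each sampled token
    top_logprobs=5,    # plus the five most likely alternatives per position
)

sampled = response.choices[0].logprobs.content
first = sampled[0]
print("sampled:", first.token, "p =", round(math.exp(first.logprob), 3))
for alt in first.top_logprobs:
    print(f"  {alt.token!r}: {math.exp(alt.logprob):.3f}")

# Perplexity over the sampled tokens: exp of the negative mean log probability.
print("perplexity:", math.exp(-sum(t.logprob for t in sampled) / len(sampled)))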

Beyond that, the interface is straightforward - and it still has shareable presets for demonstrating the implementation of an application to others.


But really the question is: does anybody actually want to write code that can only refer to a “prompt ID” for half of its settings, that has no API method to even find out what those settings are or to change them, whose supported variable fields still need extremely custom code equivalent to just sending a system message, and which does not store the melange of co-dependent parameters actually needed to run inference? That’s “prompt management”.

1 Like

I agree with @aprendendo.next, it would be great to be able to set a new prompt version as the default when creating it from the Playground!!

Otherwise you need to Update in Playground > Confirm new version > Go to Dashboard / Prompts > Click the prompt and select the latest version > Set as default.

That’s at least 4 steps that could be removed!!

1 Like

We hear you, working to make this even more streamlined! Keep the feedback coming

1 Like

Thank you for sharing this update @dmitry-p. I have a question about the example you shared:

In your Prompt Playground screenshot, it appears the model config can be selected and coupled directly with the prompt. In your API call example, the model parameter is missing, suggesting it’s inherited from the prompt ID itself.

I want to ensure I understand this correctly, as there seems to be a discrepancy: the examples in the docs align with the Responses API format, and the API reference still lists model as a required parameter.

If model config coupling within Prompts is supported, could you point us toward documentation covering this behavior (or will the Responses API reference be updated to account for this)? Specifically, it would be helpful to understand the expected behavior, default handling, and whether the API-level model parameter can override template-coupled models.
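For instance, it isn’t obvious from the current docs what a call like this would do - a hypothetical sketch with a placeholder prompt ID and variable - i.e., whether gpt-4o-mini would take effect or the model saved with the prompt would:

from openai import OpenAI

client = OpenAI()

# Placeholder prompt ID and variable; the open question is whether this model
# value overrides the one coupled to the prompt, or conflicts with it.
response = client.responses.create(
    model="gpt-4o-mini",
    prompt={"id": "pmpt_example123", "variables": {"topic": "unit testing"}},
    input="Give me a one-line summary.",
)
print(response.model)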

Additionally, will there be any dedicated API methods for prompt management? This could include functionality to programmatically retrieve prompt metadata, such as available variables, version history, the full prompt content, associated model configurations, and other parameters. Thanks!

1 Like

@dmitry-p it seems the "strict": false issue was fixed, and a non-strict schema is now being pulled in correctly.

But the max-tokens issue is still there:

  • Can’t save the max tokens value in the prompt; every time I open a saved prompt, I need to adjust this value.

Also I found another issue with max tokens in the playground:

  • When opening a saved prompt by following a link
  • Then update the max tokens value to the max
  • The max is 16384 tokens for gpt-4.1, but it should be 32768
  • Now switch the model to something else, then back to gpt-4.1
  • The max value is now 32768, as it was supposed to be.

The maximum of the playground slider is set by an internal API that delivers all the model information to the playground. It is not surprising that it isn’t correct, nor that it doesn’t expose the full capability of the model. GPT-4.1 will never naturally output that much anyway. It is like the model feature information, pricing information, or call logs - in a UI, on an API, and not for you.

The prompt object item (served by another API that is also only accessible by owner session token and browser request) does not have a maximum tokens value returned, as an answer to your question.

You’re left to figure out for yourself in your API setup whether you need to set max_output_tokens 10x higher because the “prompt” holds a reasoning model being given a hard problem… or whether one prompt vs. another gets you a reasoning summary, encrypted reasoning, code interpreter inputs, or streaming blocked without ID verification - or whether those will make the API return status 400 (even if sent as null).
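A sketch of that guesswork (placeholder prompt ID and token budget; assuming the saved prompt supplies the model):

from openai import OpenAI

client = OpenAI()

# Nothing readable in the prompt object tells you the stored model or limits,
# so you guess a reasoning budget client-side and hope it fits.
response = client.responses.create(
    prompt={"id": "pmpt_example123"},  # placeholder ID
    input="A hard multi-step problem...",
    max_output_tokens=20000,  # 10x headroom guessed for internal reasoning
)
print(response.output_text)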

Here’s the object that is loaded and stored by the playground. However, the API for reading it is not available by remote call with an API key, so there is no creating, modifying, or even reading it from your application - only consuming a product.

  "data": [
    {
      "id": "pmpt_1234",
      "object": "prompt",
      "created_at": 1750117071,
      "creator_user_id": "user-1234",
      "default_version": "1",
      "ephemeral": false,
      "instructions": [
        {
          "type": "message",
          "content": [
            {
              "type": "input_text",
              "text": "You are a helpful programming assistant .........."
            }
          ],
          "role": "system"
        }
      ],
      "is_default": true,
      "model": "gpt-4.5-preview",
      "name": "API structured schema bot",
      "reasoning": {
        "effort": null
      },
      "temperature": 0.01,
      "text": {
        "format": {
          "type": "text"
        }
      },
      "tool_choice": "auto",
      "tools": [],
      "top_p": 0.01,
      "updated_at": 1750117071,
      "version": "1",
      "version_creator_user_id": null
    },
    {
      "id": "pmpt_

It is like the model feature information, pricing information, or call logs - in a UI, on an API, and not for you.

I don’t agree with you on this one. When it is set too low (2k by default), it simply does not provide nearly as long a response as I need. So it really is a setting affecting users directly.

The prompt object item (served by another API that is also only accessible by owner session token and browser request) does not have a maximum tokens value returned, as an answer to your question.

This is exactly the issue. To be useful in the playground, we need to be able to set the max-tokens config for saved prompts. Otherwise, if I need to configure my saved prompt before being able to use it, how do I standardize it for testing across the team?

Moreover, this is a regression from what we used to have with presets, when all prompt configs were saved.

You’re left to figure out for yourself in your API setup whether you need to set max_output_tokens 10x

There is no question about API usage, just playground.

1 Like

Another minor “bug” that I just noticed: we used to have a stateful URL for selecting the default model when opening the playground, like this:

https://platform.openai.com/playground/prompts?models=o4-mini

But now it always changes back to “gpt-4.1”:

https://platform.openai.com/playground/prompts?models=gpt-4.1

1 Like

@dmitry-p hi!
Am I doing something wrong? When I try to use a newly created dashboard prompt and run it via openai.responses.create, I get an error indicating that the model and input fields are required (even your code example differs from the docs, which provide the model field explicitly). Changing input to [] makes the model consider the input empty, and it does not produce any output.
Help is super appreciated, as I had been looking for this feature for a long time. The lib version is 1.88.

1 Like

@dmitry-p, both @IvanBlooper and I seem to be having the same issue. Here’s my write-up as well.

I also confirm the issue.

A workaround is to define an empty input and set model to None:

from openai import OpenAI

client = OpenAI()

# Workaround: pass model=None and an empty input list so the SDK does not
# complain about the "required" fields; the model comes from the saved prompt.
response = client.responses.create(
    model=None,
    prompt={
        "id": "pmpt_12345.....",
        "variables": {
            "sentence": "the book is on the table"
        },
    },
    input=[],
)
print(response.model, response.output_text)

Thank you @aprendendo.next. I hope the team can either update their example posted on this thread, or the Prompt documentation and the Responses API reference.

I’m supposing you’re having the same issue with Chat Completions as well? Thanks!

I think they are not available for completions yet:
image

1 Like

Thanks! I was hoping to try to couple the seed parameter with the reusable prompt within the completions method, so we’ll just have to wait, I think. 🙂

2 Likes

Thanks for the flag, looking into it now. The SDK issues are likely downstream of it being marked as required in the OpenAPI spec.

2 Likes

The “prompt” item is not really a prompt. It is a system message plus multi-shot examples.

There should still be a per-call user “input” for the task to be performed - either the default of a plain user string, or a list of role-assigned messages.
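A sketch of either form (placeholder prompt ID):

from openai import OpenAI

client = OpenAI()

# Simplest: input as a plain user string
response = client.responses.create(
    prompt={"id": "pmpt_example123"},  # placeholder ID
    input="Summarize this ticket in one sentence: ...",
)

# Or: input as a list of role-assigned messages
response = client.responses.create(
    prompt={"id": "pmpt_example123"},
    input=[
        {"role": "user", "content": "Summarize this ticket in one sentence: ..."},
    ],
)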

Unlike the first post here, with its screenshot of “system” talking to the AI like a user, AI models don’t like being given a task by “system” acting as a user message.


And just today, another case where “I could have shared a preset to demonstrate usage”, but the feature is lost.

I could see what effect it has when the API endpoint has an “instructions” parameter and the prompt object internally is also “instructions”. Whether it appends or overrides, per parameter, is not described - if this at all interested me.
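If it did, a quick check would look something like this (placeholder prompt ID; model omitted on the assumption it comes from the prompt; whether the per-call value appends to or replaces the stored instructions is exactly what isn’t documented):

from openai import OpenAI

client = OpenAI()

# Does this per-call "instructions" value replace the instructions stored
# inside the prompt object, or get appended to them? Not documented.
response = client.responses.create(
    prompt={"id": "pmpt_example123"},  # placeholder ID
    instructions="Respond only in French.",
    input="What is a logit?",
)
print(response.output_text)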