The seed inference parameter in GPT-4 Turbo

One of the announcements made today was the “seed” parameter, which helps get a deterministic response every time. The response now also returns a system_fingerprint. Based on the documentation below, I thought that when the response is deterministic the fingerprint would stay the same, and when the response drifts the fingerprint would change. But it looks like the fingerprint is always the same. What is the use of it, then?

system_fingerprint

This fingerprint represents the backend configuration that the model runs with. It can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism. This is the indicator on whether users should expect “almost always the same result”.

When OpenAI retrains or reparameterizes the model, the change will no longer go completely undisclosed, leaving you to figure out what went wrong. You can track which version of the model was used.

The seed is likely the seed for the multinomial sampling over the token logits. However, I have already found that with gpt-3.5-turbo-instruct (as with embeddings) the logprobs are not the same on every call, and even at minimum top_p, which amounts to greedy sampling, the top token can still switch places with another. So the seed can’t completely control the output unless there are other randomization-based vector or token-selection steps before logit probability scoring that it also controls.
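
Here is a toy sketch of that point. It is not OpenAI's actual sampler, and the logits and function names are made up for illustration. With identical logits, a seeded multinomial draw repeats exactly, but if two near-tied logits come out slightly different between calls, the greedy top token flips, and no seed can undo that.

```python
import math
import random

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, seed):
    """Seeded multinomial draw over softmax(logits); returns a token index."""
    rng = random.Random(seed)
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def greedy_token(logits):
    """Index of the largest logit, i.e. what greedy sampling would pick."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits with two near-tied candidates.
logits_run_a = [2.0000, 1.9999, 0.5]
# Same prompt, but assume the backend arithmetic came out slightly different.
logits_run_b = [1.9999, 2.0000, 0.5]

# Same seed and same logits: the draw is reproducible.
same_inputs_repeat = (
    sample_token(logits_run_a, seed=123) == sample_token(logits_run_a, seed=123)
)
# Different logits: the greedy choice changes regardless of any seed.
greedy_flipped = greedy_token(logits_run_a) != greedy_token(logits_run_b)

print(same_inputs_repeat, greedy_flipped)
```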

One thing I can tell you from my own internal testing is that with temp at 0.1 and top_p at 1 I was getting about a 97% overall reliability rate across 15 metrics, most of them giving 100% consistent output for the same prompt.

The thing is pretty dang reliable and consistent given the same text if you set the parameters correctly. But maybe for your use case “pretty dang accurate” doesn’t cut it. I’m just saying, I didn’t really get the point they were making today, because in my own testing it is more reliable than almost anything at giving the same responses for the given parameters. But I get that some people need the exact same response every time.

If your JSON has 10 tokens that each have a 90% chance of being right and a 10% chance of being some other, invalid token, then decreasing the temperature widens the gap between those probabilities but doesn’t eliminate the alternatives.
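
To put rough numbers on that: ten independent tokens at 90% each give only about a 35% chance (0.9^10) that all ten are right. The sketch below assumes a two-token toy distribution and a standard temperature-scaled softmax, which may not match the real backend exactly. Lower temperature pushes the per-token probability toward 1, but it never quite gets there.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits chosen so the right token gets 90% at temperature 1.
logits = [math.log(0.9), math.log(0.1)]

def chance_all_right(temperature, n_tokens=10):
    """Probability that all n_tokens independent draws pick the right token."""
    p_right = softmax_with_temperature(logits, temperature)[0]
    return p_right ** n_tokens

for t in (1.0, 0.5, 0.1):
    print(f"temperature={t}: P(all 10 right) = {chance_all_right(t):.4f}")
```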

A top_p (nucleus sampling) parameter of 0.50 only samples from the most likely tokens that together make up the top 50% of probability mass. If one token is even 60% likely, no alternative can be selected. That’s what you want to use for reliability.
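
A minimal sketch of that filtering step, assuming the usual nucleus-sampling definition (keep tokens in descending probability until the cumulative probability reaches top_p). The distribution and the function name are hypothetical.

```python
def nucleus(probs, top_p):
    """Return the tokens kept by top-p filtering, most likely first.

    Tokens are added in descending probability until their cumulative
    probability reaches top_p; everything after that is excluded.
    """
    kept = []
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# Hypothetical next-token distribution.
probs = {":": 0.60, ",": 0.25, "}": 0.15}

kept_half = nucleus(probs, top_p=0.50)  # the 60% token alone covers 0.50
kept_most = nucleus(probs, top_p=0.80)  # needs the second token as well

print(kept_half, kept_most)
```

With top_p at 0.50, the 60% token is the only candidate left, so sampling from the kept set cannot pick anything else.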

The seed would be useful for reproducing the same wild, unpredictable path of token generation, to then see where the logits went wrong.

My reading of the documentation on the seed parameter and the system fingerprint is that the seed parameter is like a client or session ID. You pick a number just to label things, then in your logs you record {"seed parameter": "123", "system fingerprint": "a1-b2-c3"}, and over time you can see whether answers drift while the fingerprint stays the same, or whether the fingerprint changes (due to OpenAI changing something).
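
A rough sketch of that kind of log check, with no API calls. The record layout, fingerprints, and outputs are all made up. It groups logged runs by seed and fingerprint, then counts how many distinct outputs each group produced.

```python
from collections import defaultdict

def summarize_runs(records):
    """Group logged runs by (seed, system_fingerprint) and count distinct outputs."""
    outputs = defaultdict(set)
    for rec in records:
        key = (rec["seed"], rec["system_fingerprint"])
        outputs[key].add(rec["output"])
    return {key: len(texts) for key, texts in outputs.items()}

# Hypothetical log entries; fingerprints and outputs are invented.
records = [
    {"seed": 123, "system_fingerprint": "a1-b2-c3", "output": "Paris"},
    {"seed": 123, "system_fingerprint": "a1-b2-c3", "output": "Paris"},
    {"seed": 123, "system_fingerprint": "a1-b2-c3", "output": "Paris, France"},
    {"seed": 123, "system_fingerprint": "d4-e5-f6", "output": "Paris"},
]

summary = summarize_runs(records)
for (seed, fingerprint), n_distinct in summary.items():
    print(f"seed={seed} fingerprint={fingerprint}: {n_distinct} distinct output(s)")
```

More than one distinct output under the same seed and fingerprint would be drift that the fingerprint does not explain, while a new fingerprint would point to a change on OpenAI's side.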

Will know more once there’s more written about it.