AI model fingerprints are not unique, making them fairly useless for tracking model updates

_j · April 16, 2024, 3:25am

When top_p is set low enough that only the top result survives, the top result can still switch places with the second result if the two have nearly equal but unreliable probabilities.
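
To illustrate that swap, here is a minimal sketch of a top_p cutoff, not the API's actual implementation. The function name `top_p_filter`, the tokens, and the probabilities are all made up for the example; the two dictionaries stand in for two runs of the same prompt whose probabilities wobble slightly.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> list[str]:
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p (nucleus-style cutoff)."""
    kept = []
    cumulative = 0.0
    # Rank tokens from most to least probable.
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept


# Hypothetical probabilities for the same prompt on two different runs.
run_a = {" the": 0.4801, " a": 0.4799, " an": 0.04}
run_b = {" the": 0.4798, " a": 0.4802, " an": 0.04}

# A very small top_p keeps only the single top-ranked token.
nucleus_a = top_p_filter(run_a, top_p=0.01)
nucleus_b = top_p_filter(run_b, top_p=0.01)

print(nucleus_a)  # [' the']
print(nucleus_b)  # [' a']
```

A shift of 0.0003 in probability is enough to change which token is ranked first, so even "greedy" output via a tiny top_p is not guaranteed to repeat.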

However, this is the concept that I wanted to portray:

Unreliable token probabilities can also destroy the usefulness of seed: you can’t reproduce the particular token that was randomly selected if the token logprobs are different each time.

The multinomial sampler is given a dictionary of logprobs and then tries to repeat its earlier choice. However, if the share of the probability space each token occupies has shifted, a different token may be chosen even when the same random threshold (“seed”) that selects a token is repeated.
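
A rough sketch of that failure, assuming the sampler picks a token by walking the cumulative probabilities until it passes a random threshold. The function `pick_token`, the token strings, and the numbers are hypothetical, and a real sampler would walk tokens in a fixed vocabulary order rather than dictionary order.

```python
import math


def pick_token(logprobs: dict[str, float], threshold: float) -> str:
    """Return the token whose slice of the cumulative distribution
    contains the given threshold (a number in [0, 1))."""
    probs = {tok: math.exp(lp) for tok, lp in logprobs.items()}
    total = sum(probs.values())
    tokens = list(probs)
    cumulative = 0.0
    for tok in tokens:
        cumulative += probs[tok] / total
        if threshold < cumulative:
            return tok
    return tokens[-1]  # fallback if rounding leaves the sum just under 1


# The same "random" threshold, as a repeated seed would produce.
threshold = 0.46

# Hypothetical logprobs for the same prompt on two different runs.
run_a = {"yes": math.log(0.47), "no": math.log(0.43), "maybe": math.log(0.10)}
run_b = {"yes": math.log(0.45), "no": math.log(0.45), "maybe": math.log(0.10)}

choice_a = pick_token(run_a, threshold)
choice_b = pick_token(run_b, threshold)

print(choice_a)  # yes
print(choice_b)  # no
```

In the first run "yes" covers 0 to 0.47, so a threshold of 0.46 lands inside it. In the second run "yes" only covers 0 to 0.45, and the same threshold falls into "no" instead.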

I use “probability mass” just to refer to the whole set of token probabilities or a subset of it. I’m not clever enough to know the shared nomenclature for this cutoff threshold.

Ask an AI about quantiles, thresholds, probs, mass…

Probability Mass:
The term “probability mass” is correctly used in the context of discrete probability distributions and refers to the probability assigned to a particular value or set of values in such a distribution. In your example, discussing the probability mass of values above a threshold like 0.46 is technically correct if you are dealing with discrete data or categories.
For Continuous Distributions:
In the context of continuous random variables, the term “probability density” might be more appropriate when referring to the “weight” of certain sections of the probability distribution.
However, when you are talking about the sum of probabilities above a certain point (like your 0.46 threshold), you are generally referring to the “tail probability” or the “survival function” of the distribution. This function (commonly denoted as S(x) for a threshold x) is defined as the probability that the random variable is greater than x, i.e., S(x) = 1 − F(x), where F(x) is the cumulative distribution function at x.
Using “Quantile” in Your Context:
If you want to describe using a 0.46 threshold to decide categories based on your bar charts, you might say, “the 0.46 cutoff represents a specific quantile of the underlying probability distribution, used to categorize the outcomes.” This implies that you are using the value 0.46 to define a boundary between different probability sections (quantiles) of your data.
Recommendation for Clarity:
When discussing the cutoff in a practical sense (as in your bar graph example), it would be most clear and correct to refer to “the 0.46 quantile threshold” or simply “the threshold at the 0.46 quantile”, which effectively communicates that this value is used to divide the probability distribution at that point.

None of these are helpful for talking with the layperson…

For more research, consider the nucleus sampling paper itself.
