This is what the documentation for the chat completion response object says about system_fingerprint, which is returned only for the newer models released since DevDay (and not for vision):
This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
However, when I built a logger to start tracking changes to the AI models, I found many different fingerprints being returned for the same model:
-- Fingerprint report from 4 trials:
gpt-3.5-turbo: (3):fp_b28b39ffa8, (1):fp_c2295e73ad
gpt-3.5-turbo-0125: (3):fp_b28b39ffa8, (1):fp_c2295e73ad
gpt-3.5-turbo-1106: (3):fp_592ef5907d, (1):fp_77a673219d
gpt-4-turbo: (2):fp_76f018034d, (2):fp_c5162df67e
gpt-4-turbo-2024-04-09: (3):fp_76f018034d, (1):fp_c5162df67e
gpt-4-turbo-preview: (3):fp_b77cb481ed, (1):fp_1fbd2f868d
gpt-4-0125-preview: (4):fp_b77cb481ed
gpt-4-1106-preview: (3):fp_d6526cacfe, (1):fp_89f117abc5
-- Fingerprint report from 10 trials:
gpt-3.5-turbo: (8):fp_b28b39ffa8, (2):fp_c2295e73ad
gpt-3.5-turbo-0125: (7):fp_b28b39ffa8, (3):fp_c2295e73ad
gpt-3.5-turbo-1106: (8):fp_77a673219d, (2):fp_592ef5907d
gpt-4-turbo: (7):fp_76f018034d, (1):fp_9c4936e070, (2):fp_c5162df67e
gpt-4-turbo-2024-04-09: (3):fp_c5162df67e, (6):fp_76f018034d, (1):fp_9c4936e070
gpt-4-turbo-preview: (6):fp_122114e45f, (4):fp_b77cb481ed
gpt-4-0125-preview: (6):fp_b77cb481ed, (3):fp_21e53d6942, (1):fp_54b778f7c8
gpt-4-1106-preview: (1):fp_94f711dcf6, (4):fp_d6526cacfe, (2):fp_6bc7cb96fb, (1):fp_5b4e6f81f5, (2):fp_89f117abc5
Even the brand new gpt-4-turbo has multiple values.
Does this require "seed" to get the same fingerprint, if we read the confusing docs a different way? No improvement:
gpt-3.5-turbo: (6):fp_b28b39ffa8, (4):fp_c2295e73ad
gpt-3.5-turbo-0125: (9):fp_b28b39ffa8, (1):fp_c2295e73ad
gpt-3.5-turbo-1106: (4):fp_77a673219d, (6):fp_592ef5907d
gpt-4-turbo: (3):fp_c5162df67e, (4):fp_76f018034d, (2):fp_a39722e138, (1):fp_9c4936e070
gpt-4-turbo-2024-04-09: (8):fp_76f018034d, (1):fp_c5162df67e, (1):fp_a39722e138
gpt-4-turbo-preview: (5):fp_b77cb481ed, (4):fp_122114e45f, (1):fp_1d2ae78ab7
gpt-4-0125-preview: (5):fp_b77cb481ed, (1):fp_3b06ba039c, (3):fp_122114e45f, (1):fp_54b778f7c8
gpt-4-1106-preview: (2):fp_89f117abc5, (4):fp_6bc7cb96fb, (2):fp_94f711dcf6, (2):fp_d6526cacfe
What is this supposed to be reporting to us anyway, then? Four different versions of a model running in the wild? Let's crank up the inspections and statistics of a model:
-- Fingerprint report from 100 trials:
gpt-4-turbo: (54):fp_76f018034d, (32):fp_c5162df67e, (10):fp_9c4936e070, (4):fp_a39722e138
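To put numbers on that line, here is a minimal sketch that parses a report line in the format printed above and turns the counts into shares. The helper names `parse_report_line` and `shares` are my own for illustration and are not part of the logger.

```python
import re

def parse_report_line(line):
    """Split 'model: (n):fp_x, (m):fp_y' into (model, {fingerprint: count})."""
    model, _, rest = line.partition(": ")
    counts = {fp: int(n) for n, fp in re.findall(r"\((\d+)\):(fp_\w+)", rest)}
    return model.strip(), counts

def shares(counts):
    """Convert raw counts into fractions of the total number of trials."""
    total = sum(counts.values())
    return {fp: count / total for fp, count in counts.items()}

# The 100-trial report line from above, copied verbatim
line = ("gpt-4-turbo: (54):fp_76f018034d, (32):fp_c5162df67e, "
        "(10):fp_9c4936e070, (4):fp_a39722e138")
model, counts = parse_report_line(line)

# Print fingerprints from most to least frequent
for fp, share in sorted(shares(counts).items(), key=lambda kv: -kv[1]):
    print(f"{model} {fp}: {share:.0%}")
```

For this sample, the most common fingerprint accounts for only a little over half of the calls.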
After 300+ trials against chat models, the only useful thing I can find: the "backend changes" that have been made destroy any expectation of determinism from repeating calls.
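If the fingerprint is meant to be read alongside seed, as the docs seem to suggest, one way to use it would be to compare outputs only among calls that came back with the same fingerprint. This is a rough sketch under that assumption; the `(fingerprint, text)` record format, the function names, and the sample values are all made up for illustration and are not measured results.

```python
from collections import defaultdict

def outputs_by_fingerprint(records):
    """Group the distinct output texts seen under each fingerprint."""
    groups = defaultdict(set)
    for fingerprint, text in records:
        groups[fingerprint].add(text)
    return dict(groups)

def deterministic_within_fingerprint(records):
    """Map each fingerprint to True if every call under it returned the same text."""
    return {
        fp: len(texts) == 1
        for fp, texts in outputs_by_fingerprint(records).items()
    }

# Hypothetical records (placeholder fingerprints and texts, not real API output)
sample = [
    ("fp_aaa", "Hello"),
    ("fp_aaa", "Hello"),
    ("fp_bbb", "Hi"),
    ("fp_bbb", "Hey"),
]
print(deterministic_within_fingerprint(sample))
```

With real logged data, a False for any fingerprint would mean repeated seeded calls differ even on what is reported as the same backend configuration.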
Next: find the variety of logprobs that transcend fingerprint...
Python code for reporting on all models that support system_fingerprint:
from openai import OpenAI

client = OpenAI(timeout=15)  # reads OPENAI_API_KEY from the environment

models = [
    "gpt-3.5-turbo-0125", "gpt-3.5-turbo-1106",
    "gpt-4-turbo-2024-04-09",
    "gpt-4-0125-preview", "gpt-4-1106-preview",
]
fingerprints = {}  # contains {"gpt-4-turbo": ["print1", "print2"], ...}
trials = 100

# Collecting fingerprints
for trial in range(trials):
    for model in models:
        if model not in fingerprints:
            fingerprints[model] = []
        try:
            response = client.chat.completions.create(
                messages=[{"role": "system", "content": "Hello"}],
                model=model, max_tokens=1, seed=123456,
            )
            fingerprint = response.system_fingerprint or ""
            fingerprints[model].append(fingerprint)
            # Print the value just received; indexing the list by trial
            # number goes out of range once an earlier call has failed
            print(f"{model} fingerprint {trial}: {fingerprint}")
        except Exception as err:
            # Catch Exception rather than a bare except so Ctrl+C still works
            print(f"{model} fingerprint {trial}: timeout or error ({err})")

# Sort so identical fingerprints are adjacent in the report
for model in fingerprints.keys():
    fingerprints[model] = sorted(fingerprints[model])

print(f"\n-- Fingerprint report from {trials} trials:")
for model, prints in fingerprints.items():
    # Tally how many times each fingerprint was seen for this model
    unique_fingerprints = {}
    for fp in prints:
        if fp in unique_fingerprints:
            unique_fingerprints[fp] += 1
        else:
            unique_fingerprints[fp] = 1
    report_line = f"{model}: "
    report_line += ", ".join(f"({count}):{fp}" for fp, count in unique_fingerprints.items())
    print(report_line)