from openai import OpenAI

client = OpenAI()

# `file` is assumed to come from an earlier upload, e.g.:
# file = client.files.create(file=open("report.pdf", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    instructions="""<Instructions>
    """,
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[file.id],
)
I assume there must be some argument for setting model hyperparameters (temperature, top_p, etc.) in this client.beta.assistants.create call? Does anyone know which argument it is, or whether there is any other way to set hyperparameters?
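For comparison, this is roughly how I set these parameters with the Chat Completions API today (a minimal sketch; the prompt is just an illustration):

from openai import OpenAI

client = OpenAI()

# Chat Completions accepts sampling parameters directly.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Summarize the following report: ..."}],
    temperature=0,  # low temperature for more repeatable output
)
print(response.choices[0].message.content)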
8 Likes
Thanks for posting and welcome to the forum!
We do not currently support setting temperature or other completions parameters in the Assistants API, but it is something we’re seeking feedback on during the beta period.
Can you share more about your use case for modifying these parameters?
Assistant Runs currently sample multiple messages in a loop, and results can be choppy if we apply high or low temperatures to every message we sample.
1 Like
I am not aware of the sampling concept being applied internally by Assistant Runs, but when I ask the same question and provide the same file, I sometimes get different answers. Sometimes it's something like:
I apologize for the inconvenience, but it seems that there is an issue with the accessibility of the file you've uploaded. The tool I would typically use to browse and analyze the contents of your file is not able to access it. ...
Other times the answer is correct, but the generation is worded differently.
I usually set the temperature parameter to 0 or very low so that the LLM becomes effectively deterministic and I can estimate its accuracy on questions related to my domain.
My use case is data analysis by the Assistant, generating results on which further action can be taken, so I prefer the Assistant to be as deterministic as possible.
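To illustrate how I measure this, here is roughly what I do against the Chat Completions API (a sketch; the model and question are placeholders for my domain setup):

from openai import OpenAI

client = OpenAI()
question = "What was the total revenue in Q3?"  # placeholder domain question

# Ask the same question several times at temperature 0 and count
# how many distinct answers come back; ideally there is only one.
answers = set()
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[{"role": "user", "content": question}],
        temperature=0,
    )
    answers.add(response.choices[0].message.content)

print(f"{len(answers)} distinct answer(s) across 5 runs")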
5 Likes
Agree this is important to support. We’re generating SQL, which we want to be deterministic.
4 Likes
Thank you! We’ll include this in our planning for the next beta update.
3 Likes
_j
Also, note that any API call to a language model without constraining top_p (or top_k) will get you a long tail of token possibilities, and then you’re rolling the dice on broken output and costing yourself another uncontrolled call, with this system running at maximum context length…
Funny how ChatGPT gets the bare minimum of chat history, but when you’re paying, you get the maximum model context loaded up with “threads” for all the iterations the AI wants as it meanders through documents.
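To make that concrete, here is a sketch of constraining the sampled distribution on a plain Chat Completions call (the parameter values are illustrative, not recommendations):

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Extract the dates from this text: ..."}],
    top_p=0.1,       # truncate the long tail of unlikely tokens
    max_tokens=256,  # cap the cost if generation goes off the rails
)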
Any idea when these model parameters will become available? The lack of deterministic outputs is currently a blocker for my company to begin using the Assistants API. We need to guarantee the same outputs (or at least very similar outputs) for the same inputs when performing PDF analysis.
1 Like
A little off topic, but could you elaborate more on the message-sampling concept?
We’re in the same situation. We’ve built our own assistant that returns and resolves SQL. Controlling seed/temperature in this context really helps achieve a level of determinism.
The Assistants API would allow us to do away with our “hand-rolled” assistant and simply rely on threads.
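For what it’s worth, this is what we do against the Chat Completions API today (a sketch; the seed value and prompt are arbitrary):

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Write a SQL query that counts orders per day."}],
    temperature=0,
    seed=12345,  # best-effort reproducibility across calls
)

# If system_fingerprint changes between calls, the backend changed and
# identical outputs are no longer expected, even with the same seed.
print(response.system_fingerprint)
print(response.choices[0].message.content)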
I would suggest that you let us define the temperature (and other parameters) at Assistant creation time. Any new threads would then inherit that temperature and those settings. This would alleviate the “choppy” issue discussed above.
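Hypothetically, it could look something like this; note that temperature is not a supported argument on assistants.create today, this only sketches the suggestion:

from openai import OpenAI

client = OpenAI()

# Hypothetical: `temperature` here is the proposed parameter, not a
# currently supported argument of client.beta.assistants.create.
assistant = client.beta.assistants.create(
    instructions="You translate questions into SQL.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    temperature=0,  # proposed: inherited by every run on new threads
)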
4 Likes
_j
Both embeddings and AI generation are non-deterministic, due to the architecture of the models used.
With Ada embeddings, you will not get the same vector back every time, and thus you cannot guarantee the same set of top-x matches regardless.
With temperature or top_p at 0.0000000000000001 instead of the special-cased 0.0 placeholder, you will still get a model where the positions of the top logits can flip, which will then alter the course of generation.
Then put all your data in the mystery box of Assistants? Nyet.
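You can observe those near-ties yourself on a plain Chat Completions call by requesting logprobs; a sketch, with an illustrative prompt:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Name one primary color."}],
    temperature=0,
    logprobs=True,
    top_logprobs=2,  # also return the runner-up token at each position
)

# Where the top two logprobs are nearly equal, the chosen token can
# differ between otherwise identical calls, changing everything after it.
for token_info in response.choices[0].logprobs.content[:5]:
    alternatives = [(t.token, round(t.logprob, 3)) for t in token_info.top_logprobs]
    print(token_info.token, alternatives)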
1 Like
I am seeing the same problem. Sometimes the information is found in the files, and sometimes it is not.
mattgos
Unfortunately the Assistants API isn’t very useful without these basic controls. Thumbs up for adding them as a feature.