Hi everyone,
I hope you’re all doing well! I’m reaching out because I’m facing a frustrating issue with the API I’m using. When I input specific data, the responses I receive are incomplete, yet the API isn’t giving me any hints that this might be related to token configuration.
I’ve experimented with various settings, including setting the max prompt and output tokens to None, 30k, and even 50k. None of these adjustments resolved the issue: I keep getting the same incomplete responses.
Here’s a snippet of the assistant’s message for context (I’ve anonymized the content for privacy):
case_manager.case_assistant.ask_assistant: Assistant response: SyncCursorPage[Message](data=[Message(id='msg_oBWloQIclGw8ylgH727XpOVz', assistant_id='asst_kTuHLeOAkxISPiEGOps3bVSp', attachments=[], completed_at=None, content=[TextContentBlock(text=Text(annotations=[], value='...'), type='text')], created_at=1723938371, incomplete_at=None, incomplete_details=None, metadata={}, object='thread.message', role='assistant', run_id='run_99vDA6Q0K75BO3WtjTplcmrH', status=None, thread_id='thread_HA906DpRhIK5MVRWZxhSvyMp'), Message(id='msg_piHWvoiGCQxSt9KyBXVMcvyE', assistant_id=None, attachments=[], completed_at=None, content=[TextContentBlock(text=Text(annotations=[], value='...'), type='text')], created_at=1723938366, incomplete_at=None, incomplete_details=None, metadata={}, object='thread.message', role='user', run_id=None, status=None, thread_id='thread_HA906DpRhIK5MVRWZxhSvyMp')], object='list', first_id='msg_oBWloQIclGw8ylgH727XpOVz', last_id='msg_piHWvoiGCQxSt9KyBXVMcvyE', has_more=False)
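In case it helps with debugging: the message above shows `status=None` and `incomplete_details=None`, so any truncation reason may be recorded on the run object instead of the message. Below is a minimal sketch of how a run could be summarized. `summarize_run` is a hypothetical helper, not part of the SDK, and `fake_run` is a made-up stand-in so the snippet runs offline. The field names (`status`, `incomplete_details.reason`, `usage.total_tokens`) are my understanding of the Assistants API run object and should be checked against the current reference.

```python
from types import SimpleNamespace


def summarize_run(run):
    """Return a one-line summary of a run-like object (hypothetical helper)."""
    # incomplete_details is expected to be None unless the run ended early.
    details = getattr(run, "incomplete_details", None)
    reason = getattr(details, "reason", None) if details is not None else None
    # usage is expected to be None until the run reaches a terminal state.
    usage = getattr(run, "usage", None)
    total = getattr(usage, "total_tokens", None) if usage is not None else None
    return f"status={run.status} reason={reason} total_tokens={total}"


# Stand-in object so the sketch runs without network access; values are made up.
fake_run = SimpleNamespace(
    status="incomplete",
    incomplete_details=SimpleNamespace(reason="max_prompt_tokens"),
    usage=SimpleNamespace(prompt_tokens=17000, completion_tokens=200, total_tokens=17200),
)
print(summarize_run(fake_run))

# With the real SDK, assuming `client = openai.OpenAI()`, the run could be fetched
# using the run_id and thread_id from the message, for example:
# run = client.beta.threads.runs.retrieve(thread_id="thread_...", run_id="run_...")
# print(summarize_run(run))
```

If the real run reports `status=incomplete` with a reason such as `max_prompt_tokens` or `max_completion_tokens`, that would point to a token limit rather than to the model ending its answer on its own.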
I would greatly appreciate any insights or suggestions you might have! Has anyone else encountered a similar issue, or does anyone know of any workarounds? Your expertise would mean a lot to me. Thanks in advance for your help!
One more important detail: I tested this in the Playground and see the same behavior. When the run uses fewer than 17k tokens for all its file searches, it works perfectly, but when it uses more than 17k tokens, the answer is cut off without any explanation. A prompt that needs less context also works perfectly. Even with max tokens set to 50k or None in my code, the problem persists.
Looking forward to your responses!
Best,
Victor