I’m trying to build a Flask interface for an OpenAI Assistant.
I have the Assistant set up correctly, with streaming printing to the console and working perfectly. However, I am really struggling to make a Flask interface that streams the output.
Does anyone have any examples of Flask streaming chatbots that might help me out?
Here is what I have so far. It works, but the Flask app only updates when the buffer hits a certain size (several hundred words) instead of on every delta. I’m also not sure how to integrate the streaming into a chat interface that shows back-and-forth replies.
@main.route('/chat_stream', methods=['POST', 'GET'])
def chat_stream():
    logger.info("Chat route accessed.")
    # Use the chat_session assigned to the current app instance.
    chat_session = current_app.config['chat_session']
    handler = EventHandler()
    user_input = 'please write a 100 word poem'
    message = chat_session.client.beta.threads.messages.create(
        thread_id=chat_session.thread.id,
        role='user',
        content=user_input
    )

    # @stream_with_context
    def generate():
        try:
            with chat_session.client.beta.threads.runs.stream(
                thread_id=chat_session.thread.id,
                assistant_id=chat_session.assistant_id,
                instructions='system_prompt',
                event_handler=EventHandler(),
            ) as stream:
                for chunk in stream:
                    if type(chunk) == openai.types.beta.assistant_stream_event.ThreadMessageDelta:
                        yield chunk.data.delta.content[0].text.value
        except Exception as e:
            print(f"Error during streaming: {e}")
            yield f"Error: {e}"

    return Response(generate(), content_type='text/plain', headers={"Transfer-Encoding": "chunked"})
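For the buffering symptom, the usual approach is to wrap the generator with `stream_with_context` and serve it as `text/event-stream` with proxy buffering disabled, so each delta is flushed to the client as its own Server-Sent Events frame rather than accumulated as plain text. Below is a minimal, self-contained sketch of that pattern; `fake_deltas` is a hypothetical stand-in for the Assistants run stream above, and the `X-Accel-Buffering` header is only a hint for reverse proxies such as nginx:

```python
from flask import Flask, Response, stream_with_context

app = Flask(__name__)


def sse_format(text):
    # Wrap one text delta as a Server-Sent Events "data:" frame.
    return f"data: {text}\n\n"


def fake_deltas():
    # Hypothetical stand-in for the Assistants run stream; in the real
    # route you would yield chunk.data.delta.content[0].text.value here.
    for word in ["Roses", " are", " red"]:
        yield word


@app.route('/chat_stream')
def chat_stream():
    def generate():
        for delta in fake_deltas():
            yield sse_format(delta)

    # text/event-stream plus no-cache/no-buffering headers encourage
    # intermediaries to forward each event as soon as it is yielded.
    return Response(
        stream_with_context(generate()),
        mimetype='text/event-stream',
        headers={'Cache-Control': 'no-cache', 'X-Accel-Buffering': 'no'},
    )
```

For the back-and-forth chat interface, a browser page can consume this endpoint with `EventSource` (or `fetch` plus a stream reader), appending each `data:` frame to the current assistant message bubble and starting a new bubble for each user turn.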