How can I speed up an analytic chatbot that's based on LangChain (with agents and tools) and Streamlit, and disable its intermediate steps?

I created an analytic chatbot using LangChain (with tools and agents) for the backend and Streamlit for the frontend. It works, but for some users' questions it takes too long to output anything. Looking at the intermediate steps, I can see that the chatbot tries to print every relevant row. For example, below, the chatbot found 40 relevant comments and printed them out one by one in one of its intermediate steps (which takes up to a minute).


My questions are:

  1. Is there any way to speed up this process?
  2. How can I disable the intermediate output of the chatbot? (I already set return_intermediate_steps=False, verbose=False, and expand_new_thoughts=False, but the chatbot still shows the intermediate steps.)

Code for chatbot:

import os

import pandas as pd
import streamlit as st
from langchain.agents import (
    AgentExecutor,
    AgentType,
    ConversationalChatAgent,
    Tool,
    create_pandas_dataframe_agent,  # in newer releases this lives in langchain_experimental.agents
)
from langchain.callbacks import StreamlitCallbackHandler
from langchain.chat_models import AzureChatOpenAI
from langchain.memory import ConversationBufferWindowMemory
from langchain.memory.chat_message_histories import StreamlitChatMessageHistory

def load_data(path):
    return pd.read_csv(path)

if st.sidebar.button('Use Data'):
    # If button is clicked, load the EDW.csv file
    st.session_state["df"] = load_data('./data/EDW.csv')
uploaded_file = st.sidebar.file_uploader("Choose a CSV file", type="csv")


if "df" in st.session_state:

    msgs = StreamlitChatMessageHistory()
    memory = ConversationBufferWindowMemory(chat_memory=msgs, 
                                            return_messages=True, 
                                            k=5, 
                                            memory_key="chat_history", 
                                            output_key="output")
    
    if len(msgs.messages) == 0 or st.sidebar.button("Reset chat history"):
        msgs.clear()
        msgs.add_ai_message("How can I help you?")
        st.session_state.steps = {}

    avatars = {"human": "user", "ai": "assistant"}

    # Display a chat input widget
    if prompt := st.chat_input(placeholder=""):
        st.chat_message("user").write(prompt)

        llm = AzureChatOpenAI(
                        deployment_name = "gpt-4",
                        model_name = "gpt-4",
                        openai_api_key = os.environ["OPENAI_API_KEY"],
                        openai_api_version = os.environ["OPENAI_API_VERSION"],
                        openai_api_base = os.environ["OPENAI_API_BASE"],
                        temperature = 0, 
                        streaming=True
                        )
        
        max_number_of_rows = 40
        agent_analytics_node = create_pandas_dataframe_agent(
                                                        llm, 
                                                        st.session_state["df"], 
                                                        verbose=False, 
                                                        agent_type=AgentType.OPENAI_FUNCTIONS,
                                                        reduce_k_below_max_tokens=True, # to not exceed token limit 
                                                        max_execution_time = 20,
                                                        early_stopping_method="generate", # will generate a final answer after the max_execution_time has been surpassed
                                                        # max_iterations=2, # to cap an agent at taking a certain number of steps
                                                    )
        tool_analytics_node = Tool(
                                return_intermediate_steps=False,
                                name='Analytics Node',
                                func=agent_analytics_node.run,
                                description=f''' 
                                            This tool is useful when you need to answer questions about data stored in a pandas dataframe, referred to as 'df'. 
                                            'df' comprises the following columns: {st.session_state["df"].columns.to_list()}.
                                            Here is a sample of the data: {st.session_state["df"].head(5)}.
                                            When working with df, do not output more than {max_number_of_rows} rows at once, either in intermediate steps or in the final answer, because df could contain too many rows and overload memory. For example, instead of
                                            `df[df['survey_comment'].str.contains('wet', na=False, case=False)]['survey_comment'].tolist()`
                                            use
                                            `df[df['survey_comment'].str.contains('wet', na=False, case=False)]['survey_comment'].head({max_number_of_rows}).tolist()`.
                                            '''
                            )              
        
        tools = [tool_analytics_node] 
        chat_agent = ConversationalChatAgent.from_llm_and_tools(llm=llm, tools=tools, return_intermediate_steps=False)
    
        
        executor = AgentExecutor.from_agent_and_tools(
                                                        agent=chat_agent,
                                                        tools=tools,
                                                        memory=memory,
                                                        return_intermediate_steps=False,
                                                        handle_parsing_errors=True,
                                                        verbose=False,
                                                    )
        
        with st.chat_message("assistant"):
          
            st_cb = StreamlitCallbackHandler(st.container(), expand_new_thoughts=False)
            response = executor(prompt, callbacks=[st_cb])
            st.write(response["output"])
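For reference, the row cap that the tool description asks the model to apply with `.head()` is the main lever on output size here. A minimal pandas sketch of the difference, using toy data standing in for EDW.csv:

```python
import pandas as pd

# Toy stand-in for the EDW data: 100 comments that all match the filter.
df = pd.DataFrame({"survey_comment": [f"comment {i}: the floor was wet" for i in range(100)]})

max_number_of_rows = 40

# Unbounded: returns every matching row (what makes the agent's
# intermediate step slow when there are many matches).
all_rows = df[df["survey_comment"].str.contains("wet", na=False, case=False)]["survey_comment"].tolist()

# Capped, as the tool description instructs: at most 40 rows.
capped = (
    df[df["survey_comment"].str.contains("wet", na=False, case=False)]["survey_comment"]
    .head(max_number_of_rows)
    .tolist()
)

print(len(all_rows), len(capped))  # 100 40
```

Whether the model actually follows this instruction depends on the prompt, so max_execution_time and max_iterations remain useful backstops.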

Hello @ill42
Did you find a solution for this?

Hi @elzeindima,

To remove the intermediate steps, just don't create a StreamlitCallbackHandler and don't pass it to the executor; everything else works without it.
For example, instead of:

st_cb = StreamlitCallbackHandler(st.container(), expand_new_thoughts=False)
response = executor(prompt, callbacks=[st_cb])

just do:

response = executor(prompt)