Function Calling with memory

I am new to OpenAI's function calling feature. I am building a chatbot that fetches information about drugs from a website and returns it to the user.

I want to add some memory to the function calling, but I'm struggling with how to do that. If, say, it fetches detailed information about a specific medicine from a website once, I don't want it to do that again - it should just check the memory and return the response from there. So it should not call the function every time; it should rely on the memory it has. And I don't think it's storing the chats in memory.
I tried adding the assistant response to the message list, but that's not working either.
Any help would be appreciated. Thanks

This is the code:

import json
import openai

def get_drug_info(question):
    ### some code
    return detailed_info

messages = []
def run_conversation(input_message):
    messages.append({"role": "user", "content": f"{input_message}"})
    functions = [
        {
            "name": "get_drug_info",
            "description": "Get the details about a drug/medication",
            "parameters": {
                "type": "object",
                "properties": {
                    "question": {
                        "type": "string",
                        "description": "The question you want to ask about the drug/medication",
                    },
                },
                "required": ["question"],
            },
        },
    ]
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=messages,
        functions=functions,
        function_call="auto",
    )
    response_message = response["choices"][0]["message"]

    if response_message.get("function_call"):
        available_functions = {"get_drug_info": get_drug_info}
        function_name = response_message["function_call"]["name"]
        function_to_call = available_functions[function_name]
        function_args = json.loads(response_message["function_call"]["arguments"])
        function_response = function_to_call(
            question=function_args.get("question"),
        )
        messages.append(response_message)
        messages.append(
            {"role": "function", "name": function_name, "content": function_response}
        )
        second_response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613",
            messages=messages,
        )
        return second_response

resp = run_conversation("What are the side effects of augmentin?")

You have to store all memory yourself and provide it as part of the chat history when you call back for the next iteration of inference.

The only thing these models do is look at some input text and then predict what the output text should be. That's it! Everything else is constructed on top of the model, using prompt engineering, embedding search, pre- and post-processing, and a lot of heuristics and elbow grease.

So, for example, if you know in your session that a search was made as a function call, the next call you make to the API for this session could replace the part that generated the function call with the output of the information you got from that call, plus a prompt lead-in that explains what it is.
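A minimal sketch of that idea, reusing the `messages` list from the question's code. The names `cached_facts` and `remember_function_result` are illustrative, not part of the OpenAI API:

```python
# Sketch: once a function call has returned, fold its result back into the
# running message history as a plain system message, so later API calls see
# the fetched information without triggering the function again.
# `cached_facts` and `remember_function_result` are hypothetical names.

cached_facts = {}  # question -> previously fetched drug info

def remember_function_result(messages, question, result):
    """Store the result and append a prompt lead-in explaining what it is."""
    cached_facts[question] = result
    messages.append({
        "role": "system",
        "content": f"Previously fetched drug information for '{question}': {result}",
    })

messages = []
remember_function_result(messages, "augmentin side effects", "nausea, rash, diarrhoea")
```

On the next `ChatCompletion.create` call, the model sees the fetched facts as context and can answer follow-up questions without calling the function again.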


You need to manually append the result of the function call to the system prompt the next time you send an inquiry if you want the model to remember something from the function call result that was not referenced in the initial inquiry. This is needed especially because sometimes the next inquiry will not trigger function calling even though it is related to the first one.

Since you are using function calling, the model calls the function every time a request is made, even though you already included this information in the message history.
The message history only helps maintain context; it does not stop the model from calling the function again.

To solve this, you may need to maintain a cache in your own function.
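For example, the expensive web fetch can be memoized so repeat questions are served from a local cache. This is a sketch assuming the `get_drug_info` function from the question, with the actual website fetch stubbed out:

```python
import functools

call_count = 0  # counts real fetches, just to show the cache working

@functools.lru_cache(maxsize=128)
def get_drug_info(question):
    """Expensive lookup; the real version would fetch from the drug website."""
    global call_count
    call_count += 1
    return f"details for {question}"  # placeholder for the fetched info

get_drug_info("What are the side effects of augmentin?")
get_drug_info("What are the side effects of augmentin?")  # served from cache
print(call_count)  # -> 1: the website was only hit once
```

With this, even if the model emits a function call for a repeat question, your function returns instantly from the cache instead of re-fetching.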