How do we submit evals for function calls?

sam.saffron · June 18, 2023, 11:07pm

I am noticing quite a few cases where evals for function calls can help

While testing I made a searchbot and asked it to search for “404 in user profiles.” no matter what I did I only got “profiles404 user” in my function args.
While testing I noticed that no matter what I do I can not get the model to run multiple functions, “search for car and then bar, what is common?”

Reading through eval source code I can see that the completions only look at the returned message:

openai/evals/blob/c2587c69a2f330282d3ba76eceaf580ee03fa67a/evals/completion_fns/openai.py#L28-L35


      
          class OpenAIChatCompletionResult(OpenAIBaseCompletionResult):
              def get_completions(self) -> list[str]:
                  completions = []
                  if self.raw_data and "choices" in self.raw_data:
                      for choice in self.raw_data["choices"]:
                          if "message" in choice:
                              completions.append(choice["message"]["content"])
                  return completions

I tried searching through the repo for function_call and can not find anything.

Can OpenAI add a sample eval for function calls, then we can all together help hone the functionality!

dani-mp · September 7, 2023, 10:45am

I faced this problem myself today. Is it even possible to include the functions in the first place?

Topic		Replies	Views
Function calling in gpt-3.5-turbo-instruct API function-calling , gpt-35-turbo-instruc	2	3129	September 22, 2023
How the `function_call` argument in OpenAI chat completion api might've been implemented API openai-documentation	2	1328	September 16, 2023
OpenAI Evals analogous to Fine Tuning? API	5	1129	August 2, 2024
About the function_call returns API	8	3401	December 19, 2023
1) Open API appends quotation marks on function arguments 2) Open API don't utilize function calls 3) How to force multiple function calls? API gpt-4	0	533	November 17, 2023

How do we submit evals for function calls?

Related topics