File search + function calling on Assistants

fpalacios · April 26, 2024, 11:38pm

I have a pdf file that I want to provide to the assistant through file search, this is crucial for my business.

At same time I want to get a structure response using calling function.

I tried, I get the json response but the assistant don’t have the knowledge of the pdf file.

Is it possible do it work?

sps · April 27, 2024, 2:00am

Welcome to the dev forum @fpalacios
PDF isn’t the ideal format for making the knowledge-base. A txt file would go a long way in making info conveniently readable for the assistant.

PaulBellow · April 27, 2024, 2:01am

Yeah, is it an “image-PDF” or does it have text inside?

Any examples you can share with us?

fpalacios · April 27, 2024, 4:33am

@sps @PaulBellow it’s a pdf that contains guidelines of around 55 pages, full text (3 columns in every page), just a few images and some tables.

Basically my use case is:

I want provide to the AI the knowledge of guidelines based on the pdf.
I will send to the AI a json object that contains all fields used in the form (this can change, that is why I was thinking on use threads).
I expect a result of a json object with two properties (I was thinking use function calling): a property for summaries/observations in md format as string, AND a property for fields with observations (to be used for my frontend)

sps · April 27, 2024, 6:20am

PDFs can be tricky, particularly if they are scanned ones.

For best performance I’d recommend extracting the text and saving it into .txt file, or .doc file if you want to include images.

fpalacios · April 27, 2024, 12:22pm

@sps ok, just to double checking is it possible ask questions about the docs using function calling in assisstants ai?

_j · April 27, 2024, 1:01pm

You can upload a PDF file to files for file search, add it to a vector store for the assistant, and that text which is “searchable PDF text” and can be extracted will be employed to provide results when the AI invokes the search tool.

You may need to tell the AI what information it doesn’t know, and what it must find by doing a search, in order for it to fulfill user inputs automatically with that knowledge.

You can see if an assistant can answer from documents, before then adding structured output instructions.

The fault in your plan is in using tool call outputs to obtain a response from the AI within assistants. Tool call is used when the AI decides a function’s features are useful for its user’s task. An AI that could only respond by tool would be an endless loop of returning a tool reply to the thread when action is required, and then having more tool calls, resulting in a useless thread.

A tool call that could work, but wouldn’t align with your goal, would be a fake function like “safety_moderation”: “you must send all potential responses to a user to this function first to see if they meet policy standards, no exceptions”. You could return “OK” and then also get your normal response to a user.

Best is to just fully describe the desired output with schema and example, and use the response_format: json parameter now available to curtail any chat or markdown. Super best is to take the plain text of the document over to chat completions.

smithlock365 · April 27, 2024, 1:06pm

Currently, the Assistant can’t directly access or analyze the content of PDF files provided through file search. However, you can manually extract the relevant information from the PDF and provide it in text form for the Assistant to process.Custom Makeup Boxes

_j · April 27, 2024, 1:15pm

I’ll give you the opportunity to revise your answer after checking documentation and even OpenAI’s examples.

https://platform.openai.com/docs/assistants/tools/file-search/step-2-upload-files-and-add-them-to-a-vector-store

what it is lacking:

Support for parsing images within documents (including images of charts, graphs, tables etc.)

geekykidstuff · April 27, 2024, 5:26pm

Currently, the Assistant can’t directly access or analyze the content of PDF files provided through file search

How come? I have many assistants in production running as WhatsApp bots that use file search with PDFs (I created them when it was called knowledge retrieval). They work perfectly fine, I just need to be careful with the formatting but all my files are PDFs.

fpalacios · April 27, 2024, 5:45pm

Thanks guys, I got make it work.

I keep enabling the file search (I already provided the pdf file), also I used a function calling.
I don’t force to the assistant give me a response from function calling, instead describing instructions has been the trick. Something like:

Read …
Analyze file and give me …
Return a response using the calling function “function_name”

Thats it, thanks.

alanabi20 · June 17, 2024, 10:14pm

I know this is not relevant to this thread, but any hints on how to build WhatsApp Bots?!

geekykidstuff · June 18, 2024, 1:45am

Hi! You can build them directly on top of WhatsApp Business API. For that, you basically need a server that listens to all the WhatsApp webhooks and use the other endpoints to reply to messages an such.

Another way is to use a bot builder platform (e.g. Landbot). That will give you like an IDE to easily build the bot. However, it’s more expensive and you’ll be limited to the features of the platform.

In my experience, if the platform can do HTTP requests you can already do a lot of things.

wolsen · August 6, 2024, 3:28am

Hey @fpalacios how did you get the output? I keep getting stuck at “Status: Requires Action” and can’t get past that?

Topic		Replies	Views
Function calling + file search API assistants-api	0	59	October 28, 2024
How to create a custom function with file-search and code-interpreter API assistants-api	0	46	October 9, 2024
Assistant with knowledge files API assistant	4	1272	May 3, 2024
File Search - Is an Assistant Mandatory? API	4	499	July 25, 2024
Assistant + function calling + file search API api , assistants-api	1	461	September 6, 2024

File search + function calling on Assistants

Related topics