Welcome to the dev forum @fpalacios
PDF isn’t the ideal format for making the knowledge-base. A txt file would go a long way in making info conveniently readable for the assistant.
@sps@PaulBellow it’s a pdf that contains guidelines of around 55 pages, full text (3 columns in every page), just a few images and some tables.
Basically my use case is:
I want provide to the AI the knowledge of guidelines based on the pdf.
I will send to the AI a json object that contains all fields used in the form (this can change, that is why I was thinking on use threads).
I expect a result of a json object with two properties (I was thinking use function calling): a property for summaries/observations in md format as string, AND a property for fields with observations (to be used for my frontend)
You can upload a PDF file to files for file search, add it to a vector store for the assistant, and that text which is “searchable PDF text” and can be extracted will be employed to provide results when the AI invokes the search tool.
You may need to tell the AI what information it doesn’t know, and what it must find by doing a search, in order for it to fulfill user inputs automatically with that knowledge.
You can see if an assistant can answer from documents, before then adding structured output instructions.
The fault in your plan is in using tool call outputs to obtain a response from the AI within assistants. Tool call is used when the AI decides a function’s features are useful for its user’s task. An AI that could only respond by tool would be an endless loop of returning a tool reply to the thread when action is required, and then having more tool calls, resulting in a useless thread.
A tool call that could work, but wouldn’t align with your goal, would be a fake function like “safety_moderation”: “you must send all potential responses to a user to this function first to see if they meet policy standards, no exceptions”. You could return “OK” and then also get your normal response to a user.
Best is to just fully describe the desired output with schema and example, and use the response_format: json parameter now available to curtail any chat or markdown. Super best is to take the plain text of the document over to chat completions.
Currently, the Assistant can’t directly access or analyze the content of PDF files provided through file search. However, you can manually extract the relevant information from the PDF and provide it in text form for the Assistant to process.Custom Makeup Boxes
Currently, the Assistant can’t directly access or analyze the content of PDF files provided through file search
How come? I have many assistants in production running as WhatsApp bots that use file search with PDFs (I created them when it was called knowledge retrieval). They work perfectly fine, I just need to be careful with the formatting but all my files are PDFs.
I keep enabling the file search (I already provided the pdf file), also I used a function calling.
I don’t force to the assistant give me a response from function calling, instead describing instructions has been the trick. Something like:
Read …
Analyze file and give me …
Return a response using the calling function “function_name”
Hi! You can build them directly on top of WhatsApp Business API. For that, you basically need a server that listens to all the WhatsApp webhooks and use the other endpoints to reply to messages an such.
Another way is to use a bot builder platform (e.g. Landbot). That will give you like an IDE to easily build the bot. However, it’s more expensive and you’ll be limited to the features of the platform.
In my experience, if the platform can do HTTP requests you can already do a lot of things.