Assistants API can't provide working links / attachments to PDFs I uploaded in knowledge base

BowenFeng · July 13, 2024, 11:42am

I want to create an AI assistant which has access to a repository of PDFs and is able to reference those to provide answers AND provide working links or attachments of any PDFs which it refers to in its answers.

I’ve tried both custom GPTs and the Assistants API (Playground) to create this. I can get them to correctly reference the PDFs, but any links they provide are incorrect: they either don’t work, link to some other resource on the internet, or return a “File not found” error.

Anyone found a solution to this? Providing the actual PDF attachments would also be acceptable.

shafique1 · July 13, 2024, 12:22pm

To create an AI assistant that correctly references PDFs and provides working links or attachments, consider setting up a dedicated storage solution, such as AWS S3 or Google Cloud Storage, for your PDFs. Then, ensure your AI retrieves the correct file paths or URLs from this storage. Verify that the URLs are publicly accessible or properly authenticated to avoid “File not found” errors. If attachments are preferred, configure your AI to include the PDFs directly in its responses.
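As a rough illustration of the hosted-storage idea, here is a minimal Python sketch. It assumes the PDFs are already uploaded somewhere reachable over HTTPS; the base URL, the function names, and the "Sources" layout are all hypothetical and not part of any OpenAI or cloud SDK. The point is that your own code builds the links from a known manifest, so the model never has to guess a URL.

```python
from urllib.parse import quote

# Hypothetical base URL of the bucket or folder where the PDFs are hosted.
BASE_URL = "https://example-bucket.example.com/pdfs/"


def build_link_manifest(filenames, base_url=BASE_URL):
    """Map each PDF filename to the URL where it is hosted (URL-encoded)."""
    return {name: base_url + quote(name) for name in filenames}


def append_source_links(answer, cited_filenames, manifest):
    """Append a 'Sources' list to the answer, using only URLs from the manifest."""
    lines = [answer, "", "Sources:"]
    for name in cited_filenames:
        url = manifest.get(name)
        if url:
            lines.append(f"- {name}: {url}")
        else:
            # Never invent a link for a file we do not host.
            lines.append(f"- {name}: (no hosted copy found)")
    return "\n".join(lines)


if __name__ == "__main__":
    manifest = build_link_manifest(["User Guide.pdf", "pricing.pdf"])
    print(append_source_links("See the setup chapter.", ["User Guide.pdf"], manifest))
```

If the bucket is private, the manifest values would need to be signed or otherwise authenticated URLs instead of plain ones; that part is left out here.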

BowenFeng · July 13, 2024, 1:56pm

Thanks for your reply! Yes, that could well be the direction I need to go. I was hoping to validate it via a PoC first (using Assistants in the Playground and manually uploading the files) before committing more to development. So I’m assuming working links are not possible via this method?

Regarding attaching files directly: in people’s experience, is this possible through prompting alone (I haven’t gotten it to work myself), or would it require custom development?

sps · July 13, 2024, 4:26pm

Welcome @BowenFeng,

I’d recommend extracting text from PDFs to respective text files and supplying that, especially if you want to build an efficient knowledge base for the assistant.

The reason is that PDF is a pretty complex format where data can be text, scanned text, image, or a mixture of these, and this makes it very difficult to ensure that the assistant can really access the knowledge you want it to use.

BowenFeng · July 14, 2024, 6:33am

Hi @sps, thanks for your input!

I tried the Assistants API, which created a vector store from the uploaded PDF files (I did this in the Playground), and the AI now references it. I’m not sure whether this step already covers text extraction, or whether it is still better to do that separately?

Unfortunately the reference links are still not working, but I think I might need to go with @shafique1’s advice: host the files in the cloud and provide the links to the AI somehow (that will need some tinkering). I’ll have a play around over the coming days and post an update if there is progress.
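One possible way to "provide the links" is to resolve them in your own code after the assistant answers, rather than asking the model to write URLs. The sketch below is only an illustration under assumptions: it assumes each citation arrives as an annotation with a `type` of `file_citation`, the marker `text` that appears in the answer, and a `file_citation.file_id`, represented here as plain dicts. The `file_urls` mapping from file ID to hosted URL and the function name are hypothetical and would be maintained by you.

```python
def link_citations(text, annotations, file_urls):
    """Replace citation markers with numbered footnotes that link to hosted PDFs.

    annotations: list of dicts shaped like
        {"type": "file_citation", "text": "<marker>", "file_citation": {"file_id": "..."}}
    file_urls: hypothetical mapping of file ID -> URL of your hosted copy.
    """
    footnotes = []
    for ann in annotations:
        if ann.get("type") != "file_citation":
            continue  # ignore other annotation types
        file_id = ann["file_citation"]["file_id"]
        url = file_urls.get(file_id, "(link unavailable)")
        number = len(footnotes) + 1
        # Swap the raw marker in the answer for a readable footnote number.
        text = text.replace(ann["text"], f" [{number}]")
        footnotes.append(f"[{number}] {url}")
    if footnotes:
        text += "\n\n" + "\n".join(footnotes)
    return text


if __name__ == "__main__":
    answer = "Refunds take 5 days【4:0†source】."
    annotations = [
        {
            "type": "file_citation",
            "text": "【4:0†source】",
            "file_citation": {"file_id": "file-abc"},
        }
    ]
    urls = {"file-abc": "https://example.com/pdfs/refunds.pdf"}
    print(link_citations(answer, annotations, urls))
```

With this approach the model only has to cite the right file; the link itself always comes from the mapping, so it cannot point at an unrelated resource.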

In the meantime if anyone has solved this use case themselves, would love to hear how you’ve done it!

sps · August 27, 2024, 12:35pm

2 posts were split to a new topic: How to achieve ChatGPT-level PDF parsing with APIs?
