Using the Assistants API, I would like to add a feature for uploading PDFs (or other document types) containing delivery notes, receipts, invoices and similar documents. Given an XSD or JSON schema, ChatGPT should extract all the information from the PDF file and populate a target JSON or XML document that I can use to import the data into my database.

I tried this with the first version of the Assistants API and got some promising results, although not always the right ones. With the latest versions of the models and the Assistants API v2, however, ChatGPT always hallucinates and returns XML with imaginary data: contact people who do not even exist in the original document, and item quantities and prices that are completely wrong. I couldn't believe it could hallucinate like that.

Another problem is that the latest versions don't accept XSD (XML schema) files as input, whereas before they did. It seems like ChatGPT is going backwards: instead of improving these very useful skills, it is losing them.

Is there a solution to fix this? I have tried myriad different prompts to no avail.
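To make the hallucination problem measurable, one option is to check every extracted value against the raw text of the document before importing it. The following is only a minimal sketch under stated assumptions: the PDF text has already been extracted by some other tool, the model output has been parsed into a Python dict, and the helper names (`find_ungrounded`, `_leaves`, `_normalize`) are hypothetical, not part of the Assistants API. It is a plain substring check, so it will miss values the model reformatted (for example `12.5` versus `12,50`) and it cannot tell whether a short number like a quantity was matched in the right place.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def _leaves(node, path=""):
    """Yield (path, value) for every scalar leaf in a nested dict/list."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from _leaves(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from _leaves(value, f"{path}[{index}]")
    elif node is not None and not isinstance(node, bool):
        yield path, node


def find_ungrounded(extracted: dict, source_text: str) -> list[str]:
    """Return the paths of extracted values that never appear in the source text."""
    haystack = _normalize(source_text)
    missing = []
    for path, value in _leaves(extracted):
        needle = _normalize(str(value))
        if needle and needle not in haystack:
            missing.append(path)
    return missing


# Hypothetical document text and model output, for illustration only.
source_text = "Delivery note 4711\nContact: Jane Doe\nItem: Widget  Qty 3  Price 12.50"
extracted = {
    "number": "4711",
    "contact": "John Smith",  # invented by the model in this made-up example
    "items": [{"name": "Widget", "qty": 3, "price": "12.50"}],
}

ungrounded = find_ungrounded(extracted, source_text)
print(ungrounded)
```

Any path reported here points at a value that does not occur verbatim in the document, which can be used to reject the result or to retry the extraction instead of importing invented data.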