Extract the table data from a semi-structured PDF

anushka · May 3, 2024, 6:46am

I have two methods to extract table data from a semi-structured PDF and implement a RAG pipeline with LangChain:
* **Pandas DataFrame Agent**
* **MultiVectorRetriever**
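
To make the second approach concrete, here is a minimal sketch of the idea behind multi-vector retrieval: search over a descriptive summary of each table, then return the raw table linked to the best-matching summary. It deliberately does not use LangChain. The names `SummaryIndexedStore`, `tokenize`, and `cosine` are hypothetical, the summaries are hand-written (in a real pipeline they would probably be generated by an LLM), and a simple word-count cosine similarity stands in for real embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SummaryIndexedStore:
    """Hypothetical helper: search summaries, return the linked raw table."""

    def __init__(self):
        self.summaries = {}  # doc_id -> bag of words of the summary
        self.raw_tables = {}  # doc_id -> original table text

    def add(self, doc_id, summary, raw_table):
        self.summaries[doc_id] = Counter(tokenize(summary))
        self.raw_tables[doc_id] = raw_table

    def retrieve(self, query):
        # Score the query against every summary, not against the raw table.
        q = Counter(tokenize(query))
        best_id = max(self.summaries, key=lambda d: cosine(q, self.summaries[d]))
        return self.raw_tables[best_id]


store = SummaryIndexedStore()
store.add(
    "t1",
    "Quarterly revenue and profit figures by region, sales income earnings",
    "Region | Q1 | Q2\nNorth | 10 | 12\nSouth | 8 | 9",
)
store.add(
    "t2",
    "Employee headcount and staff hiring numbers by department",
    "Dept | Headcount\nEng | 40\nOps | 15",
)

# The word "income" never appears in the raw table, only in its summary.
result = store.retrieve("How much income did each region earn?")
print(result)
```

Because the match happens on the summary, a question can retrieve a table even when it shares few words with the table cells themselves. With real embeddings instead of word counts, the match would also not depend on exact word overlap with the summary.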

I want to know:

Is there any other method to extract tables from a semi-structured PDF in LangChain?
How can I get better responses when the table contains descriptive text?
How can I get better responses when my question does not use the same words as the table?
