Extract the table data from a semi-structured PDF

I have two methods for extracting table data from a semi-structured PDF and building a RAG pipeline with LangChain:
* **Pandas DataFrame Agent**
* **MultiVectorRetriever**

I want to know:

  • Is there any other method for extracting tables from a semi-structured PDF in LangChain?
  • How can I get better responses when the table contains descriptive text?
  • How can I get better responses when my question does not use the same words as the table?