Using the GPT-3.5-turbo API for semantic search - opening my source code for ideas

I have implemented semantic search with the OpenAI APIs.

You can get an idea of this project from the screen captures: here and here.

I am opening up the source code to solicit ideas and help from the community.

  • Any suggestions on building some impactful apps based on this code?
  • How can I improve content extraction from documents, specifically for tables, multi-column layouts, and diagrams?
  • Any other ideas and critiques on this project?

The program scans web pages (either PDF or HTML), extracts their contents while preserving the original document structure (for HTML, the h1, h2, and h3 headings), and stores them in a pandas DataFrame that serves as a local knowledge base. It then tries to answer user questions from the local knowledge base first. If no answer is found, it falls back to OpenAI. More details are on GitHub.
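
For anyone who wants to discuss the flow without reading the repo first, here is a rough sketch of the idea, not the actual project code. It only handles HTML, uses the standard-library parser, and scores matches by plain word overlap as a stand-in for real semantic search. The names `SectionParser`, `build_knowledge_base`, `answer`, `fallback`, and `min_overlap` are all made up for illustration, and `fallback` is where a call to OpenAI would go.

```python
from html.parser import HTMLParser
import re

HEADINGS = {"h1", "h2", "h3"}


class SectionParser(HTMLParser):
    """Collect the text that follows each h1/h2/h3 heading as one row."""

    def __init__(self):
        super().__init__()
        self.rows = []            # one dict per section
        self._current = None      # row currently being filled
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # A new heading starts a new section row.
            self._current = {"level": tag, "heading": "", "text": ""}
            self.rows.append(self._current)
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        # Text that appears before the first heading is ignored in this sketch.
        if self._current is None or not data.strip():
            return
        key = "heading" if self._in_heading else "text"
        self._current[key] = (self._current[key] + " " + data.strip()).strip()


def build_knowledge_base(html):
    """Return a list of section rows; pandas.DataFrame(rows) would tabulate it."""
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def tokens(s):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def answer(question, rows, fallback, min_overlap=2):
    """Answer from the local rows first; call fallback(question) otherwise."""
    q = tokens(question)
    best, best_score = None, 0
    for row in rows:
        score = len(q & tokens(row["heading"] + " " + row["text"]))
        if score > best_score:
            best, best_score = row, score
    if best is not None and best_score >= min_overlap:
        return best["text"]
    # Hypothetical hook: the real program would query OpenAI here.
    return fallback(question)
```

Word overlap is obviously crude (stop words like "the" count as matches), so in practice you would swap it for embeddings or similar, and the `min_overlap` threshold is just a guess that decides when the local answer is trusted over the fallback.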