Cloning a trained GPT-4 - SQLite, LangChain - Understanding the correct approach

Hello,

I would like to request your guidance on the appropriate method to develop a concept for my company.

I have created a custom GPT using OpenAI’s beta GPT builder, with a knowledge base that includes the main instructions (written in the configuration section’s text box) and 10 files: 5 DOCX files with technical details and 5 XLSX files with data. The main instructions describe how the knowledge base works, and one DOCX file explains the relationships between the XLSX files. The largest XLSX file contains approximately 65,000 rows. Everything works flawlessly, and the assistant responds to queries effectively. Now I want to replicate this experience through the API, with a local SQLite database of over 5 million records.

For this new concept, I plan to use Flask to run the web server, but I am struggling to understand how to deploy a clone of what I achieved with the OpenAI web interface.

I created a new Assistant in the Playground and uploaded the DOCX files with instructions. I also wrote some basic code to test the knowledge base, and it works.

Additionally, I successfully queried the local database with LangChain, but I think I should use the Assistant for a ChatGPT-like experience.

Users could invoke the API for general information, and the responses might not always rely on the database. However, in many cases, the assistant should generate accurate queries to search data in SQLite.

I am unclear about the relationship between the Assistant, which understands the rules and topics, and LangChain, which can execute the database queries.

Could you offer any advice? I hope I have managed to convey my intentions clearly.

Thank you for reading and assisting.

Fab

So your Assistant is now working and you want to integrate your local database. What you can do is add a function (e.g. getInformationFromDB) to your Assistant that is triggered whenever the information stored in your database is needed. When that function is triggered (e.g. getInformationFromDB: { query: '…' }), you can use your LangChain code to process the query and return the answer to the Assistant.
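Here is a minimal sketch of that loop, assuming the openai Python SDK (v1.x) Assistants beta. getInformationFromDB is the example function name from above, and query_with_langchain is a hypothetical placeholder for your existing LangChain/SQLite code:

import json
import time

from openai import OpenAI

client = OpenAI()

def query_with_langchain(query: str) -> str:
    """Placeholder for your existing LangChain code that queries SQLite."""
    ...

assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="You are a helpful assistant. Use getInformationFromDB for data questions.",
    tools=[{
        "type": "function",
        "function": {
            "name": "getInformationFromDB",
            "description": "Look up information in the local database",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }],
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="How many records match X?"
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# Poll until the run finishes or asks us to execute a tool.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

if run.status == "requires_action":
    tool_outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        if call.function.name == "getInformationFromDB":
            args = json.loads(call.function.arguments)
            tool_outputs.append({
                "tool_call_id": call.id,
                "output": query_with_langchain(args["query"]),
            })
    # Hand the database result back; the Assistant then writes the final answer.
    run = client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread.id, run_id=run.id, tool_outputs=tool_outputs
    )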


Hi, and thank you for your reply. May I ask how you would implement the function in the working GPT?

There are many ways to implement it. Let's say we are getting info about a product from the local database. Here is a sample function definition:

{
    "name": "get_product",
    "description": "Get product information based on the user query",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {
                "type": "string",
                "description": "Product name"
            },
            "other_details": {
                "type": "string",
                "description": "Other details about the product"
            }
        },
        "required": [ "name", "other_details" ]
    }
}
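As an aside, if you attach this via the API rather than the Playground, the definition goes inside a tools entry of type "function". A short sketch, assuming the JSON above is saved as get_product.json (a hypothetical file name):

import json

from openai import OpenAI

client = OpenAI()

# Load the function definition shown above.
with open("get_product.json") as f:
    get_product = json.load(f)

assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="...",  # the Instructions shown below
    tools=[{"type": "function", "function": get_product}],
)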

Then the Instructions will look like this:

You are a helpful assistant.
# available tools
You have the following tools you can invoke depending on user request.
- get_product, when the user wants to get information about a certain product given the name and some other details.

Let's say the user sends this query:

Please tell me about Otsuka Dining Table with solid wood finish.

The tool will be triggered like this:

get_product: { name: "Otsuka Dining Table", other_details: "solid wood finish" }

Then you would probably turn it into an SQL query for your LangChain/database code:

const sql_query = `SELECT * FROM products WHERE name = '${name}' AND description LIKE '%${other_details}%'`
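In Python with the standard sqlite3 module, the same query can be parameterized, which avoids quoting problems and SQL injection from user-supplied values. A small sketch (the products table and database file name are assumptions carried over from the example):

import sqlite3

def get_product(name: str, other_details: str) -> list[tuple]:
    # The ? placeholders let SQLite handle quoting, so values such as
    # "Otsuka Dining Table" cannot break or inject into the SQL.
    conn = sqlite3.connect("products.db")
    try:
        cur = conn.execute(
            "SELECT * FROM products WHERE name = ? AND description LIKE ?",
            (name, f"%{other_details}%"),
        )
        return cur.fetchall()
    finally:
        conn.close()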

It is also possible to have get_product output the SQL query itself, if you change the function definition; you might then end up with this:

get_product: { sql_query: "SELECT * FROM products WHERE name = 'Otsuka Dining Table' AND description LIKE '%solid wood finish%'" }
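If you go this route, it is worth validating the model-generated SQL before running it. A minimal sketch, where the SELECT-only check and the read-only connection are illustrative safeguards, not an official recipe:

import sqlite3

def run_model_sql(sql_query: str) -> list[tuple]:
    # Reject anything that is not a plain read query.
    if not sql_query.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    # Open the database read-only as a second line of defense.
    conn = sqlite3.connect("file:products.db?mode=ro", uri=True)
    try:
        # execute() also refuses to run more than one statement at a time.
        return conn.execute(sql_query).fetchall()
    finally:
        conn.close()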


That’s interesting and, again, thanks for coming back.

Dumb question: where should the function you described be located?

I’m thinking in this way:

  1. A Python file on the local machine asks for user input
  2. Python passes that value to the OpenAI API
  3. GPT reads the instructions you mentioned
  4. GPT "should" be able to call the Python file containing the function you mentioned.

Am I understanding your suggestion correctly?

Look forward to your reply.

For testing, you can use the Playground and add your function definitions there. It validates what you submit, so you can check whether they are okay or not.

What you describe means you want the functions to be added dynamically. Well, you can do that, too. You can let the user add functions, saving them perhaps in a DB or as a plain JSON file on the server. Then each time you add a function, you also update your Instructions and append a line for it, like this:

You are a helpful assistant.
# available tools
You have the following tools you can invoke depending on user request.
- get_product, when the user wants to get information about a certain product given the name and some other details.
- new_function_name, a description of how you want the function to be invoked and what parameters are needed

where new_function_name is the new function you added, plus some additional instructions.

But if the functions do not change then, for simplicity, you can just save them as a JSON file (located on the server) or a JSON string (saved in the DB), which you read each time and append to the API call.
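A sketch of that idea, where functions.json is a hypothetical file holding a list of definitions like the get_product example above: load them at startup, attach them all as tools, and generate one instruction line per function from its description:

import json

with open("functions.json") as f:
    definitions = json.load(f)

# One tools entry per stored function definition.
tools = [{"type": "function", "function": d} for d in definitions]

# Rebuild the Instructions with one line per available function.
instructions = (
    "You are a helpful assistant.\n"
    "# available tools\n"
    "You have the following tools you can invoke depending on user request.\n"
    + "\n".join(f"- {d['name']}, {d['description']}" for d in definitions)
)

# Pass both when creating or updating the Assistant, e.g.
# client.beta.assistants.update(assistant_id, tools=tools, instructions=instructions)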


Again, thanks so much!
I’m testing what you suggested.