Assistants API + SQL Retrieval

Is it possible to use the new Assistants API to answer questions in a SQL database? For example, take the default-sql-translate example and use it in an assistant to answer questions, not just generate SQL queries?

5 Likes

Good question, interested too =)

1 Like

There are only 2 ways of doing this.

First, as you said, is creating a Query, so you can run in it in your site
Second, give the Assistant a readable file of your DB so the assistant can return the result directly.

2 Likes

Hello Clubmaple, I’m trying to do something similar. How did you go about explaining your database schema to the assistant , so it could recommend the SQL query given a user query in natural language ?

I tried given this as plain text in the system message content, while this works for small schemas with a handful of tables, this seems clumsy for schemas with several tables and complicated relationships.

That is an interesting question, I also want to develop the Assistant API with similar functions. However, I checked the supported files for the Assitant API and the SQL file is not part of it. It would be great if the Open AI can support the file type of SQL (e.g., *.sqlite3) for the Assistant API.

You can use markdown to explain the assistant your current schema

doesnt work great on 3.5, you will have to use GPT4 or above.

Remember you will need to add some locks to prevent hacking, like searching for undesired words (DELETE, UPDATE, etc)

Also a big tip:

  • Give GPT a simplified schema, and then you can replace it after.

For example

This will be a nice self-explain schema

CONTACTS table

  • id
  • name
  • last_name
  • gender | enum(female,male)
  • phone_number

Even in your real DB, you have “cell” instead of phone_number, but it is super hard to explain to GPT that “cell” means “phone_number”

Instead you tell GPT your column is called “phone_number” and then you can replace that string with “cell”

Even a better tip (i havent test it yet), could be to ask GPT to retrieve separately the elements of the query (select columns, wheres, orders, limit, etc).

That way it will be easier to run a security check before runing the query, like no limit above 100, etc)