Connecting an assistant to a database for retrieval

With the new release of Assistants, I am excited to connect one to a database and have it answer questions. In the docs I see it can be done with files, but how can an assistant be connected to a database?


I think it's going to be easiest with Actions, but we need to wait for the GPT Editor in ChatGPT.

Will Actions allow the GPT to answer nuanced questions about the data in the database? From what I can tell it doesn't transform the data into vectors, so it wouldn't be that good? I'm imagining something that vectorizes the data and embeds it in a custom GPT… if that makes sense. Basically the functionality that's described here for files in the context of the Retrieval tool.


Also worth noting: not sure if anyone's tried to use assistants with files, but I get this when uploading the file: BadRequestError: 400 Failed to index file: Unsupported file file-Jf8okhY5J2Te2SuiBIosmozU type: application/csv

If you have a small DB that doesn't update very frequently, then yes, you can add the DB to the assistant's files in a readable format.

The other option, which is what I do: I give the AI my full DB structure, and in the functions I ask the AI to create the DB query to run on my system, then return the result to the Assistant.

This is better for having up-to-date results.

Just make sure to add some security checks on the query before running it on your site (for example, make sure it only contains SELECT, has proper filters, etc.).
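Something along these lines, as a minimal sketch of the "model writes the query, my code checks and runs it" pattern. It assumes the openai Python SDK (v1.x Assistants beta) and a local SQLite database; the run_sql function, table, model name, and file name are just for illustration.

```python
# Sketch: let the model generate SQL via a function tool, validate it,
# then run it read-only and hand the rows back to the Assistant.
import re
import sqlite3
from openai import OpenAI

client = OpenAI()

SCHEMA_DESCRIPTION = "Table cities(name TEXT, state TEXT, population INTEGER)\n"
# In practice, paste your full DB structure (and any business context) here.

assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions=(
        "You answer questions about a database with this schema:\n"
        + SCHEMA_DESCRIPTION
        + "When you need data, call run_sql with a single read-only SELECT query."
    ),
    tools=[{
        "type": "function",
        "function": {
            "name": "run_sql",
            "description": "Run a read-only SQL SELECT query and return the rows.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }],
)

def run_sql(query: str) -> list:
    """Basic safety checks before executing the model-generated query."""
    q = query.strip().rstrip(";")
    if not re.match(r"(?is)select\b", q) or ";" in q:
        raise ValueError("Only a single SELECT statement is allowed")
    conn = sqlite3.connect("file:app.db?mode=ro", uri=True)  # read-only connection
    try:
        return conn.execute(q).fetchall()
    finally:
        conn.close()

# In the run loop, when a tool call for run_sql arrives, execute it and
# return the rows to the run via submit_tool_outputs.
```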


I mean, it will be able to write SQL queries and then use the SQL response to answer the question, which is much more accurate and efficient than a vector search if your data is tabular and number/label-oriented.

Vector similarity search works best for text and natural language, for tasks like finding relevant snippets in a long document, where it can match the 'vibe' of a query to a chunk of text that may or may not share the exact wording. However, for tabular data (e.g., the name and population of every city in the U.S.), we probably don't care which city names match the general 'vibe' of "Philadelphia"; we just want to go to that row and see what the number is.
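To make that concrete, the lookup for the Philadelphia case is just an exact match, with no embeddings involved. A toy sqlite3 example (table, column, and population figure are illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.execute("INSERT INTO cities VALUES ('Philadelphia', 1603797)")  # illustrative figure

# The SQL the model writes for "what is the population of Philadelphia?"
# is a plain exact-match lookup -- no similarity scoring needed.
row = conn.execute(
    "SELECT population FROM cities WHERE name = ?", ("Philadelphia",)
).fetchone()
print(row[0])
```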


Exactly my question, for the same reasons. Subscribing to the thread.

I had the same plan yesterday. I tried to do it via Actions using an https_request JSON call. However, https_requests are not supported (yet?). What would be your best guess to get around this limitation?

Is there any documentation regarding https_requests? I planned on using axios in my function to retrieve data from my database via a REST API.

Got access to Actions today and HTTPS seems to be working for me. What's your API schema like?
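(In case it's useful: one way to get a valid Action schema is to put a small REST endpoint in front of the database and let the framework generate the OpenAPI spec for you. A rough sketch with FastAPI, where the endpoint path, table, and hosting details are all assumptions; the spec it serves at /openapi.json, plus a public server URL, is what goes into the Action configuration.)

```python
# Sketch of a database-backed endpoint for a GPT Action to call.
# FastAPI generates the OpenAPI schema automatically at /openapi.json.
import sqlite3
from fastapi import FastAPI, HTTPException

app = FastAPI(title="City stats API")

@app.get("/cities/{name}")
def get_city(name: str) -> dict:
    """Return the population for a single city, looked up by exact name."""
    conn = sqlite3.connect("file:app.db?mode=ro", uri=True)
    try:
        row = conn.execute(
            "SELECT name, population FROM cities WHERE name = ?", (name,)
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        raise HTTPException(status_code=404, detail="City not found")
    return {"name": row[0], "population": row[1]}
```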

@bleugreen How are you providing the database structure to the assistant?
I have the same use case, and what I am doing is creating a file where I store the DB schema with business context for each table and feeding it to the assistant. Is that the same?
I have around 30 tables, so it may not fit in the context limit.
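Roughly what I mean, as a sketch: generate a compact schema summary from the database itself and append the business context per table, rather than hand-writing DDL. This assumes SQLite (other databases would query information_schema instead), and the business_notes dict and output file name are made up.

```python
# Dump a compact, human-readable schema summary to feed to the assistant.
import sqlite3

business_notes = {
    "cities": "One row per city; population refreshed monthly.",
    # ... one short note per table
}

conn = sqlite3.connect("app.db")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
)]

with open("db_schema_for_assistant.txt", "w") as f:
    for table in tables:
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        col_list = ", ".join(f"{c[1]} {c[2]}" for c in cols)  # column name + type
        f.write(f"TABLE {table}({col_list})\n")
        note = business_notes.get(table)
        if note:
            f.write(f"  -- {note}\n")
conn.close()
```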

Hmmm… not sure I'd let the AI do that. You can't predict what horrible queries might run, which would at the very least have the potential to bring your system to its knees, let alone get hold of sensitive data like user account information that it shouldn't have access to! (Unless you want to write a set of specific views that are permissioned separately?!)

Alternatively, create a specific local function on your app server that takes specific arguments described in the function definition you share with the LLM, and tailor the queries for these specific use cases. You can then target indexes, etc. This kind of wrapping is more work, but the system will be faster, more robust, and more secure. You also then have a layer you can refactor if your schema changes, without affecting the LLM code.
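As a rough sketch of what I mean (the get_city_population name and the cities table are hypothetical): the LLM only sees a narrow function definition, and the actual parameterised query lives on the app server.

```python
# The function definition exposed to the LLM is narrow and specific...
tool_definition = {
    "type": "function",
    "function": {
        "name": "get_city_population",
        "description": "Look up the population of a single US city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "Exact city name"},
            },
            "required": ["city"],
        },
    },
}

# ...while the query itself lives on the app server, parameterised and
# able to use whatever index you have on cities(name).
import sqlite3

def get_city_population(city: str) -> int | None:
    conn = sqlite3.connect("file:app.db?mode=ro", uri=True)  # read-only
    try:
        row = conn.execute(
            "SELECT population FROM cities WHERE name = ?", (city,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```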

@clubmaple, what is a readable format for the API? I uploaded a text file with a script I generated using SQL Server that has the DDL statements to create my tables. It keeps saying it can't read my file.

I would recommend plain text, docs, PDF, or CSV.
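For example, a rough sketch of dumping a table to plain text and attaching it for Retrieval (openai Python SDK v1.x; the table and file names are made up, and the retrieval tool plus file_ids parameter are as in the Assistants beta):

```python
# Export table rows to plain text (rather than a raw DDL script),
# then upload the file and attach it to an assistant for Retrieval.
import sqlite3
from openai import OpenAI

client = OpenAI()

conn = sqlite3.connect("app.db")
with open("cities.txt", "w") as f:
    for name, population in conn.execute("SELECT name, population FROM cities"):
        f.write(f"{name}: population {population}\n")
conn.close()

uploaded = client.files.create(file=open("cities.txt", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="Answer questions using the attached city data.",
    tools=[{"type": "retrieval"}],
    file_ids=[uploaded.id],
)
```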

I also explained it here: