I’m working on a text-to-SQL example using the new Assistants API. I want to combine retrieval + function calling based on the retrieval to execute SQL.
Using the Chinook data model as an example, I want to be able to ask questions like: “How may albums did the queen of pop release?”, where “The Queen of Pop” is an alias for Madonna in a JSON file I’ve uploaded, so given the database schema in the instructions, the LLM is able to write the correct SQL, joining relevant tables, filtering on “Madonna” and aggregating to answer the question.
I was actually able to get this to work, but not consistently. I’ll get the correct result a number of times, but if I enable the coder interpreter tool, I get an incorrect result. If I then turn the code interpreter back off, I still get the wrong result, even though things seem to be configured identically to when it was working earlier.
Correct:
SELECT COUNT(*) AS NumberOfAlbums FROM Album JOIN Artist ON Album.ArtistId = Artist.ArtistId WHERE Artist.Name = 'Madonna'
Incorrect:
SELECT Name FROM Artist WHERE Name LIKE '%Queen of Pop%'
Any idea what’s going on here? Is this happening because there’s no way to set the seed or temperature currently? Or is there some other instability with the beta release?
Have others had success using both retrieval + function calling (based on the retrieval)?
Instruction:
Borrowed from here, with added language to lookup artist in the knowledge base.
You are an agent designed to interact with a SQL database.
Given an input question, first lookup the artist in your knowledge base. If there is more than 1 match, ask for clarification.
Then, create a syntactically correct MySQL query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.
If the question does not seem related to the database, just return "I don't know" as the answer.
### Database Schema
CREATE TABLE Artist
(
ArtistId INTEGER,
Name VARCHAR,
PRIMARY KEY ArtistId
);
CREATE TABLE Album
(
AlbumId INTEGER,
Title VARCHAR,
ArtistId INTEGER,
PRIMARY KEY (AlbumId),
FOREIGN KEY (ArtistId) REFERENCES Artist (ArtistId)
);
artists.json
[
{
"name": "The Beatles",
"aliases": ["The Fab Four"]
},
{
"name": "Michael Jackson",
"aliases": ["The King of Pop"]
},
{
"name": "Elvis Presley",
"aliases": ["The King of Rock and Roll"]
},
{
"name": "Madonna",
"aliases": ["The Queen of Pop"]
},
{
"name": "Bob Dylan",
"aliases": []
},
{
"name": "The Rolling Stones",
"aliases": []
},
{
"name": "Elton John",
"aliases": []
},
{
"name": "Mariah Carey",
"aliases": []
},
{
"name": "Stevie Wonder",
"aliases": []
},
{
"name": "Aretha Franklin",
"aliases": ["The Queen of Soul"]
},
{
"name": "Freddie Mercury",
"aliases": []
},
{
"name": "David Bowie",
"aliases": ["The Starman"]
},
{
"name": "Whitney Houston",
"aliases": []
},
{
"name": "Frank Sinatra",
"aliases": ["Ol' Blue Eyes"]
},
{
"name": "Led Zeppelin",
"aliases": []
},
{
"name": "Pink Floyd",
"aliases": []
},
{
"name": "Beyoncé",
"aliases": ["Queen B"]
},
{
"name": "Prince",
"aliases": ["The Artist Formerly Known as Prince"]
},
{
"name": "U2",
"aliases": []
},
{
"name": "Kanye West",
"aliases": ["Ye"]
}
]