What’s the ideal way to create a model based on a mysql schema? For example, I’ve a really simple shopping cart application with the following schema:
customers: id, first_name,last_name,email,password,billing_streetaddress,billing_city,billing_state,billing_zip,billing_country,shipping_city,shipping_state,shipping_zip,shipping_streetaddress, shipping_country
order_items:id,order_id,product_id, quantity, price
orders:id,customer_id,date_placed,status{enum('pending','shipped','delivered')},total_amount
products:id, name, department,price, quantity,image_url, sfdcID
I’ve created fine-tune models like this:
{"prompt":"Get the list of all products with a price between $50 and $100, sorted by price in descending order ->","completion":" * from products where price between 50 and 100 order by price desc;"}
{"prompt":"Get the details of the most expensive product ->","completion":" * from products order by price desc limit 1;"}
{"prompt":"Get the list of products that are out of stock ->","completion":" * from products where quantity = 0;"}
{"prompt":"Get the list of all products in the Electronics department ->","completion":" * from products where department = 'electronics';"}
{"prompt":"What is the department of the 'Samsung Galaxy S20' product? ->","completion":" department from products where name = 'samsung galaxy s20';"}
.. {there's about another 30 more}
and like this:
{"prompt":"Generate a list of orders placed by customer ID [customer_id] ->","completion":" Order ID: [order_id], Customer ID: [customer_id], Date Placed: [date_placed], Status: [status], Total Amount: [total_amount]\n"}
{"prompt":"Generate a list of orders placed by customer ID [customer_id] that include product ID [product_id] ->","completion":" Order ID: [order_id], Customer ID: [customer_id], Date Placed: [date_placed]\n"}
{"prompt":"Generate a list of orders placed within the last month for product ID [product_id] ->","completion":" Order ID: [order_id], Customer ID: [customer_id], Date Placed: [date_placed], Status: [status], Total Amount: [total_amount]\n"}
{there's about another 30 more}
Seems like the fine-tune models w/SQL statements are more accurate - but is that the right approach? Or, should I be more thinking about better prompts?