Hello developers!
I’m building a Sales Assistant that:
-
Answers questions about the company’s database.
-
Talks to customers to take orders and handle complaints.
-
Engages customers who haven’t made purchases to understand their needs.
To achieve this, I’ve been doing prompt engineering where I include:
-
Database schema.
-
Some rules (critical constraints).
-
Example inputs/outputs.
-
Context and user queries.
Right now, my prompt per request is huge (around 1,500 tokens), which is costly and inefficient.
My questions are:
-
Is there a cheaper or more efficient way to structure this without repeating the entire schema, examples, and rules each time?
-
Would this fall under few-shot prompts, system prompts, or something else?
-
Should I consider alternatives like fine-tuning or embeddings + templates instead of sending everything in every request?
I’d love to hear recommendations or best practices from those who’ve implemented something similar.
Thanks in advance!