Optimizing Context Injection for Scalable Script

igorleonir · February 20, 2025, 9:28pm

Currently, I use ChatGPT by sending three documents that detail a pre-sales approach script, which serve to educate the AI about my business model. Following that, I submit a transcript of a user call and ask the AI to assign a rating based on the provided parameters. However, sending these three documents with every prompt becomes prohibitively expensive when scaling up. How can I “teach” the API to internalize this context efficiently?

pretendlake · February 20, 2025, 11:34pm

You can’t “teach” the API your pre-sales approach, however you can design a system which would work as follows:

Generate embeddings of the pre-sales approach script and store the vectors along with the text chunks in a vector database (this is done only once, not every time the user interacts with your assistant)
Make a first API call instructing the GPT they are an assistant whose purpose is to summarize in 1. detailed paragraphs and 2. short explanations the different parts of a transcript and use a function call which accepts two arrays of strings as a parameter
For every short summary in the array, create an embedding and query for similar semantic results in your vector database (where your pdf embeddings are stored)
Make another API call for each summary and pass the matching text chunks as context along with the detailed summary as the user message, and make a good system prompt which instructs the GPT to give a rating of the user message according to the pre-sales approach script guidelines (that is the context you passed). Use structured outputs to get the rating
Combine all the ratings to get the final rating and make an additional API call which instructs the GPT to respond to the user

Thinking about it, you could combine 5 on a single API call instead of making a call per each summary and the result would most likely be the same.

TLDR: Store the pdf as embeddings, ask the GPT to split the transcript into pieces, retrieve the relevant parts of the script according to each part of the transcript, pass it as context and generate the rating

Topic		Replies	Views
Best method of injecting relatively large amount of context to be leveraged in a response API	10	10660	December 17, 2023
Optimization of large requests to GPT API chatgpt , chat-completion , assistants-api	1	1525	November 24, 2023
Using OpenAI API for Document Analysis with Static SOPs API	1	201	August 4, 2024
Over-prompting with irrelevant context Prompting embeddings , gpt-4	8	1610	December 17, 2023
Analyze call transcript (above GPT context windows) API	5	100	November 28, 2024

Optimizing Context Injection for Scalable Script

Related topics