I'm trying to figure out how to deal with an issue related to the token limits in GPT-3.5, and hopefully you can help with this.
So here’s the scenario:
- I need to regularly structure portions of data according to certain rules.
- These rules/instructions are sent via the ChatGPT API together with each raw piece of data that needs to be structured/formatted.
- After this portion of text is structured/formatted according to the rules, I need to assign a specific label to it.
Note - All my labels are stored in a vector DB, since the label set is about 12k tokens and obviously can't be sent along with every prompt.
- And here is the tough part:
- How can I take the completion from ChatGPT (my structured/reformatted piece of data) and have it query the vector DB of labels to find the relevant label and append it to that completion?
- I need it to determine which label from the label set matches this piece of structured data based on common sense (OpenAI = AI label; Bard = AI label, etc.), then check whether a similar label exists in my label set in the DB, and then add that label to the final completion.
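If I understand it right, the "common sense" matching step boils down to embedding similarity: each label gets embedded once up front, and the structured completion is then matched against those vectors. A minimal sketch of the nearest-label lookup, assuming the label vectors have already been produced by some embeddings model (`nearest_label` and the toy vectors here are hypothetical, just to show the shape of the search):

```python
import numpy as np

def nearest_label(query_vec, label_vecs, labels):
    # normalize the query and each label embedding, then pick the label
    # with the highest cosine similarity to the query
    q = np.asarray(query_vec, dtype=float)
    m = np.asarray(label_vecs, dtype=float)
    q = q / np.linalg.norm(q)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    sims = m @ q
    return labels[int(np.argmax(sims))]

# toy 2-d vectors standing in for real embeddings
labels = ["AI", "Finance"]
label_vecs = [[1.0, 0.1], [0.1, 1.0]]
print(nearest_label([0.9, 0.2], label_vecs, labels))  # prints "AI"
```

A real vector DB (Pinecone, Weaviate, FAISS, etc.) would do exactly this search for you at scale; the point is only that the label set is embedded once, not sent with every prompt.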
What I'm considering is embedding the response from ChatGPT (the first completion) and then running a similarity search against the vector DB of labels.
Once a match is found in the DB, the matched label would be taken from there and added to the final completion.
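The flow I have in mind would glue together like this. A sketch under stated assumptions: `structure_fn`, `embed_fn`, and `search_fn` are placeholders for the chat-completion call with the rules, the embeddings call, and the vector-DB query respectively, not real library APIs:

```python
def format_and_label(raw_text, structure_fn, embed_fn, search_fn):
    """Proposed pipeline: structure the raw text, embed the result,
    look up the nearest label in the vector DB, attach it.

    structure_fn: raw text -> structured text (chat API + rules)
    embed_fn:     text -> embedding vector (embeddings model)
    search_fn:    vector -> best-matching label (vector-DB query)
    """
    structured = structure_fn(raw_text)
    label = search_fn(embed_fn(structured))
    return {"text": structured, "label": label}
```

With this shape, the chat call, the embedding call, and the DB lookup stay independent, so any of the three could be swapped out without touching the rest.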
Do you think this makes sense at all, or are there blockers, or maybe easier ways to achieve it?
Thanks in advance!