I’m developing a word processor application, and I’m interested in GPT’s ability to predict the end of a sentence.
Ideally, it would work like GitHub Copilot: Copilot is aware of the rest of the code and can generate appropriate completions.
I imagine two possibilities:
- fine-tuning a model on the user’s previous writing
- passing part of the current text as context to the chat API and asking it to complete the unfinished sentence
I think solution #1 will work better for very large texts, and solution #2 for small texts.
The thing is, the application will have thousands or millions of users. I guess it’s not possible to maintain millions of different fine-tuned models?
What context strategy does GitHub Copilot use?
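For what it’s worth, solution #2 can be sketched in a few lines. This is a hedged sketch, not a definitive implementation: the window size, model name, and prompt wording are all assumptions to experiment with, and the actual API call is left as a comment so the snippet stays self-contained.

```python
# Sketch of solution #2: send the tail of the document as context and ask
# a chat model to finish the current sentence.

MAX_CONTEXT_CHARS = 2000  # rough budget; a real app would count tokens instead

def build_messages(document: str, unfinished: str) -> list[dict]:
    """Trim the document to a recent window and build the chat payload."""
    context = document[-MAX_CONTEXT_CHARS:]  # keep only the most recent text
    return [
        {"role": "system",
         "content": ("You complete the user's unfinished sentence, matching "
                     "the style of the surrounding text. Reply with only the "
                     "continuation.")},
        {"role": "user",
         "content": f"Text so far:\n{context}\n\nUnfinished sentence:\n{unfinished}"},
    ]

def complete_sentence(document: str, unfinished: str) -> str:
    messages = build_messages(document, unfinished)
    # Hypothetical call -- requires the openai package and an API key:
    # from openai import OpenAI
    # client = OpenAI()
    # resp = client.chat.completions.create(
    #     model="gpt-4o-mini",  # assumption: any chat-capable model works
    #     messages=messages, max_tokens=30, temperature=0.3)
    # return resp.choices[0].message.content
    raise NotImplementedError("plug in your chat API client here")
```

The key design choice is `MAX_CONTEXT_CHARS`: for small documents the whole text fits, and for large ones you only pay for the recent window, which is usually what matters for finishing the current sentence.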
I think you are right. Apart from not being able to fine-tune a model for every user (not enough data, and even if there were, it would be very expensive), I don’t think it would be worthwhile, since you would also need a lot of expertise in fine-tuning LLMs. In general, I think the best (and easiest) way to do something like this today is to generate embeddings for the texts you want the LLM to consider, put them in a vector database, and then use the prompt to query for the most similar sections of your stored embeddings. You can experiment with that last step, i.e. what exactly you pass in as context.
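The retrieval step described above can be sketched without any particular vector database. In this sketch a plain Python list stands in for the store, and `embed()` is assumed to be whatever embedding function you use (e.g. an embeddings API); only the similarity search is shown.

```python
# Minimal sketch of embedding retrieval: rank stored (text, vector) pairs
# by cosine similarity to a query vector and return the top-k texts.
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_vec: np.ndarray, store: list[tuple[str, np.ndarray]],
          k: int = 3) -> list[str]:
    """store: list of (text, embedding) pairs; returns the k most similar texts."""
    scored = sorted(store,
                    key=lambda item: cosine_sim(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in scored[:k]]

# Usage: embed the user's unfinished sentence, retrieve similar passages
# from their earlier writing, and prepend them to the chat prompt as context.
# query_vec = embed(unfinished_sentence)   # embed() is assumed, not defined here
# context_sections = top_k(query_vec, store, k=3)
```

A real deployment would replace the list with a vector database (pgvector, Pinecone, etc.), which does the same nearest-neighbour ranking at scale.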
If you ever need user-specific analytics, or protection measures like limiting the API cost per user session, I am developing a platform that lets you do just that. I am also in the process of creating models to help detect prompt injection attempts, and dimensionality reduction models to help visualize trends in user prompts/responses. If you are interested, feel free to see more info at llmetrics.app