Hi, I want to use LLaVA with a very long prompt, over 5,000 tokens.
I’m currently using GPTs with a long prompt, and it works very well.
So I want to replace GPT-4 with LLaVA, an open-source VLM. However, I cannot find a way to give it a long prompt.
Do you have any good ideas?
Do you have an example of the prompt?
If I’m hearing you right, you have something working great on GPT-4 and want to try to recreate it on a smaller, less capable free model?
Yes, you are right.
ChatGPT has a feature called GPTs.
Can we do something similar with LLaVA?
Can we treat system prompts separately?
I’m not familiar with it, but likely not. ChatML is being copied by some, but OpenAI has it locked down at the moment.
If you experiment, though, be sure to come back and let us know how it goes.
Ah, I have an additional question.
Can we recreate the retrieval feature of the Assistants API with an open-source model?
You can recreate anything an Assistant can do.
I would consider researching LangChain. Their framework currently leads on this front.
The “retrieval” mechanism is likely RAG (retrieval-augmented generation), something that existed well before Assistants. That means there’s plenty of good documentation out there to get you started.
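To make the RAG idea a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under some loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) are made up for this example, and the prompt wording is arbitrary. The point is the shape of the flow: score stored text against the question, keep the top few pieces, and put only those into the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a short prompt from the retrieved chunks (wording is arbitrary)."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    docs = [
        "The return policy allows refunds within 30 days.",
        "Shipping takes five business days.",
        "Our office is closed on public holidays.",
    ]
    print(build_prompt("How many days do I have for refunds?", docs, k=1))
```

In a real setup you would swap `embed` for an actual embedding model and the list scan for a vector database, but the prompt sent to the model stays short no matter how much text you have stored.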
Thank you for your advice.
Could you point me to some good material to start with?
I would read this just to get familiar:
Then, it would be a matter of picking your own DB to store the vector embeddings and going from there.
The OpenAI Cookbook has some good examples with different databases. Supabase might be a good one to start with.
This forum is also filled with people asking RAG questions, so you may be able to search around topics here in the forum for more particular questions as well.
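One more thing that may help before you pick a DB: long text is usually split into smaller, overlapping pieces before it is embedded and stored. Below is a rough sketch of that step. The helper name `chunk_words` and the default sizes are my own assumptions, and it counts whitespace-separated words rather than model tokens, so treat the numbers as placeholders.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Words are a rough proxy for tokens here."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this chunk has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for piece in chunk_words(sample, size=4, overlap=1):
        print(piece)
```

Each chunk would then be embedded and stored, so that at question time only the few relevant chunks go into the prompt instead of the whole 5,000-token text.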
Thank you so much, I will start with this material!