So, I’ve built an application that uses many different fine-tuned models interchangeably.
The issue is they’re always asleep! I get the error message “That model is loading, please try again.” inconsistently. Even if the model returned a successful completion a few seconds ago, it will sometimes return the loading error on the next query. Does anyone have any tips for keeping the models awake?
From my experience, you need to allow a bit of time after fine-tuning completes, and also when you haven’t used the model for a while. As I understand it, the models wait in a cold state until they get a request. My model needs approximately one minute to reach a hot state.
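If the warm-up really is on the order of a minute, one way to live with it is to retry on the loading error instead of failing. Here is a minimal sketch, not a definitive implementation: `send` is a hypothetical callable that wraps whatever API request you actually make and raises an exception containing the loading message, and the 90-second budget and 2-second starting delay are my own guesses.

```python
import time

LOADING_MESSAGE = "That model is loading"  # text of the error quoted in this thread


def call_with_retry(send, prompt, max_wait=90.0, initial_delay=2.0, sleep=time.sleep):
    """Call send(prompt), retrying while the model reports it is still loading.

    `send` is a hypothetical callable wrapping your real API request; it is
    assumed to raise an exception whose message contains the loading text.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        try:
            return send(prompt)
        except Exception as exc:
            # Re-raise anything that is not the loading error, or give up
            # once the total time spent waiting reaches the budget.
            if LOADING_MESSAGE not in str(exc) or waited >= max_wait:
                raise
        sleep(delay)
        waited += delay
        delay = min(delay * 2, 30.0)  # exponential backoff, capped at 30 s
```

You would call it as `call_with_retry(my_request_function, "some prompt")`. Any other error is raised immediately, so real failures are not hidden behind the retries.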
One thing you can do is combine models to do multiple tasks so you’re only using one fine-tune the whole time. This means it’s more likely to stay in memory. As far as I know, fine-tuning is still in beta, so perhaps this will be fixed before it goes GA.
Thanks for the replies, guys. @daveshapautomator I’ve thought about doing that. I was just concerned about wires getting crossed on specialized tasks. I wonder if I can set up automation to keep them hot.
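Something like the following is roughly what I have in mind for the automation. It is only a sketch under assumptions: `ping` is a hypothetical function that sends the cheapest request the API allows to a given model, and the 45-second interval is a guess, since I don’t know how long a model actually stays hot. Each ping is a real request, so it would presumably count toward usage.

```python
import time


def keep_warm(models, ping, interval=45.0, rounds=None, sleep=time.sleep):
    """Send a tiny request to each model every `interval` seconds.

    `ping` is a hypothetical callable taking a model name. `rounds=None`
    loops forever; otherwise the loop stops after that many passes.
    Returns a dict mapping each model name to its count of failed pings.
    """
    failures = {name: 0 for name in models}
    done = 0
    while rounds is None or done < rounds:
        for name in models:
            try:
                ping(name)
            except Exception:
                # A failed ping (e.g. the loading error) is counted, not fatal;
                # the request itself may still nudge the model toward loading.
                failures[name] += 1
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)  # no sleep after the final pass
    return failures
```

Running this in a background thread or a scheduled job would be the obvious next step, and the failure counts would show which models keep dropping back to cold.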