Fine-tuning 3.5 turbo to act as conversational AI like Non-Playable Character in games

You can try these solutions:

Otherwise embeddings are the way to go as @sps pointed out.
You may even be able to cut down costs and get faster responses by using a standard model with embeddings.