How to get just one response from a completion LLM (like OpenAI's chat vs. completion API)

This question is more general than OpenAI’s APIs. It’s about LLMs generally.

Does anyone know how to get an LLM to complete just one turn of a chat conversation? For example, if you prompt it with:

The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.

Human: Hello, who are you?
AI: I am an AI created by OpenAI. How can I help you today?
Human: I’d like to cancel my subscription.

It should respond with just ONE chat message, like: AI: Okay, I can certainly help with that! What is your subscription id?

I have found it will instead respond with multiple chat messages, continuing the conversation too far, and it also repeats the entire conversation, which is unnecessary. Does anyone know how to get it to produce just one chat message, like OpenAI does with their chat API?

Generally, chat models accept multiple messages (a system message plus past user/assistant messages) in a single request, and then respond with exactly one message.
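As a sketch of that structure, here is what sending the conversation from the question as a chat request could look like. This assumes the `openai` Python package and an API key in the environment; the model name is illustrative, and the request is guarded so the payload can be inspected without credentials.

```python
import os

# The whole history goes into one request; the API returns a single
# assistant message rather than continuing the dialogue on its own.
messages = [
    {"role": "system",
     "content": ("The following is a conversation with an AI assistant. "
                 "The assistant is helpful, creative, clever, and very friendly.")},
    {"role": "user", "content": "Hello, who are you?"},
    {"role": "assistant",
     "content": "I am an AI created by OpenAI. How can I help you today?"},
    {"role": "user", "content": "I'd like to cancel my subscription."},
]

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=messages,
    )
    # Exactly one assistant reply comes back.
    print(response.choices[0].message.content)
```

The key difference from the completion API is that the conversation is structured data, not one long prompt string, so the model has no raw transcript to keep extending.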

Chat models and completion models are different. Is there a reason you’re looking to have a completion model behave like a chat model, instead of just using a chat model?

Are you using a stop sequence like “Human:”? That way, the model will stop generating when (and if) it hits those tokens.

Simply include two example exchanges in your prompt.
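A quick sketch of that few-shot approach: build the prompt with two complete Human/AI exchanges before the real question, so the model sees the pattern of one reply per turn. The example exchanges below are invented for illustration.

```python
# Two illustrative example exchanges to prime the one-reply-per-turn pattern.
examples = [
    ("Hello, who are you?",
     "I am an AI created by OpenAI. How can I help you today?"),
    ("What can you do?",
     "I can answer questions and help with tasks. What do you need?"),
]

header = ("The following is a conversation with an AI assistant. "
          "The assistant is helpful, creative, clever, and very friendly.\n\n")

prompt = header
for human, ai in examples:
    prompt += f"Human: {human}\nAI: {ai}\n"

# End with the real question and a trailing "AI:" so the model
# completes exactly one assistant turn.
prompt += "Human: I'd like to cancel my subscription.\nAI:"

print(prompt)
```

Combined with a stop sequence on "Human:", this tends to yield a single assistant message per request.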
