Correct usage of messages with role="function" for few-shot inference and preserving context

alexander.burtsev · April 2, 2024, 9:18am

Hi,
I’m using the Chat Completions API in several scenarios, all of which rely on the function-calling feature. One of them is a chatbot for creating and editing images. For clarity, let’s reduce the function declarations to two:

functions = [
	{
		"name": "prompt_to_image",
		"description": """The tool uses generative AI models to create images from text prompts""",
		"parameters": {
			"type": "object",
			"properties": {
				"n": {
					"type": "number",
					"description": "The number of images to generate",
				},
				"size": {
					"type": "string",
					"description": "The size of the generated images",
					"enum": ["1024x1024", "1792x1024", "1024x1792"],
				},
				"prompt": {
					"type": "string",
					"description": "Required text prompt for image generation",
				},
				"quality": {
					"type": "string",
					"enum": ["standard", "hd"],
					"description": "The quality of the generated image, where 'hd' indicates higher detail and consistency.",
				},
				"style": {
					"type": "string",
					"enum": ["vivid", "natural"],
					"description": "The style of the generated images, influencing the realism and dramatic effect.",
				},
			},
			"required": ["prompt"],
			"additionalProperties": False,
		}
	},
	{
		"name": "resize_image",
		"description": """Resize image tool. It is useful when it's necessary to resize the image to the given size.""",
		"parameters": {
			"type": "object",
			"properties": {
				"url": {
					"type": "string",
					"description": "Required non-empty source image URL",
				},
				"width": {
					"type": "number",
					"description": "The width of the target image",
				},
				"height": {
					"type": "number",
					"description": "The height of the target image",
				},
			},
			"required": ["url", "width", "height"],
		}
	}
]

Now I want to show the LLM how to handle user prompts using few-shot examples.
I also want to pass the conversation history to preserve context.
The main question is how to do that. I’ve seen different and conflicting approaches, even on this forum.
One important detail: I don’t have a summarization phase, and I don’t need the LLM to generate text after the actual function call. The results are just an image URL and some metadata about the image (id, width, height, size, etc.).

APPROACH 1

messages = []  # conversation sent to the Chat Completions API
messages.append({
	"role": "system",
	"content": """You are a helpful assistant who helps to create and edit images.
Select the most suitable function based on the user's request.
Don't make assumptions about what values to plug into functions"""
})
messages.append({
	"role": "user",
	"content": "Generate a picture of a cat"
})
messages.append({
	"role": "function",
	"name": "prompt_to_image",
	"content": "{\"prompt\": \"A picture of a cat\"}"
})
messages.append({
	"role": "assistant",
	"content": """Here is the image I created based on your request
ID: IMG1
URL: https://example.com/1.png
Width: 1024
Height: 1024
"""
})

APPROACH 2

messages = []  # conversation sent to the Chat Completions API
messages.append({
	"role": "system",
	"content": """You are a helpful assistant who helps to create and edit images.
Select the most suitable function based on the user's request.
Don't make assumptions about what values to plug into functions"""
})
messages.append({
	"role": "user",
	"content": "Generate a picture of a cat"
})
messages.append({
	"role": "assistant",
	"content": None,
	"function_call": {
		"name": "prompt_to_image",
		"arguments": "{\"prompt\": \"A picture of a cat\"}",
	},
})
messages.append({
	"role": "function",
	"name": "prompt_to_image",
	"content": """Here is the image I created based on your request
ID: IMG1
URL: https://example.com/1.png
Width: 1024
Height: 1024
"""
})

The main question: which of the two approaches is correct? Or is neither correct, and a third approach is needed? Thank you!
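
For concreteness, here is a minimal sketch of how few-shot examples and history could be assembled in the Approach 2 shape, which, as far as I can tell, is the one the function-calling documentation describes: the assistant message carries `function_call`, and the `role="function"` message carries the function's result. The helper name `function_call_turn`, the shortened system prompt, the follow-up user request, and the JSON-serialized result are my own assumptions for illustration, not part of the API.

```python
import json


def function_call_turn(user_text, name, arguments, result):
    """Return the three messages for one completed function call.

    Hypothetical helper: the assistant message carries the call and the
    role="function" message carries the result (the Approach 2 shape).
    """
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "content": None,
            "function_call": {
                "name": name,
                # Arguments are passed as a JSON-encoded string.
                "arguments": json.dumps(arguments),
            },
        },
        {
            "role": "function",
            "name": name,
            # Assumption: the result is serialized as JSON. The content is
            # just a string, so a plain-text summary would also fit here.
            "content": json.dumps(result),
        },
    ]


# Shortened system prompt, for illustration only.
messages = [{
    "role": "system",
    "content": "You are a helpful assistant who helps to create and edit images.",
}]

# One few-shot example (or one stored history turn) in the same shape.
messages += function_call_turn(
    "Generate a picture of a cat",
    "prompt_to_image",
    {"prompt": "A picture of a cat"},
    {"id": "IMG1", "url": "https://example.com/1.png", "width": 1024, "height": 1024},
)

# The new user request goes last (hypothetical follow-up).
messages.append({"role": "user", "content": "Resize IMG1 to 512x512"})
```

The same helper could be reused for real history: each completed call is stored as (user text, function name, arguments, result) and replayed in order before the latest user message.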
