Integrating Custom Image Generation with ChatGPT

kovalenko.nik.gr · February 28, 2024, 4:23pm

Hello everyone! I’m developing an AI chat within our company’s platform, and one of the required functionalities is image generation. Currently, I implement it as follows:

In the ChatGPT request, I specify a callback function for image generation.
This function returns a link to the generated image (which I store on my S3).
Then, I send another hidden message to ChatGPT indicating that the image has already been shown to the user and should not be mentioned in any way.

The process looks like this:

1. {"role": "user", "content": "draw me a tree"}
2. {"role": "assistant", "tool_calls": [{"id": "call_identifier", "type": "function", "function": {"name": "generate_image", "arguments": "{\"prompt\":\"tree\"}"}}]}
3. {"name": "generate_image", "role": "tool", "content": "https://yourimagestorage.cloudfront.net/path/to/generated/image", "tool_call_id": "call_identifier"}
4. {"role": "assistant", "content": "DALL·E displayed 1 images. The images are already plainly visible, so don't repeat the descriptions in detail. Do not list download links as they are available in the ChatGPT UI already. The user may download the images by clicking on them, but do not mention anything about downloading to the user."}
5. {"role": "assistant", "content": "Here is your tree. Please have a look at the image provided."}
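
For anyone who wants to reproduce this, here is a minimal sketch of how messages 1 to 4 could be assembled in Python before asking for the final reply (message 5). The helper name `build_history` and its parameters are hypothetical; only the message shapes and the notice text come from the list above. Building `arguments` with `json.dumps` keeps the inner quotes escaped correctly.

```python
import json

# Notice text copied from step 4 above; the names below are illustrative only.
DISPLAY_NOTICE = (
    "DALL·E displayed 1 images. The images are already plainly visible, "
    "so don't repeat the descriptions in detail. Do not list download links "
    "as they are available in the ChatGPT UI already. The user may download "
    "the images by clicking on them, but do not mention anything about "
    "downloading to the user."
)


def build_history(user_text, call_id, prompt, image_url):
    """Return messages 1-4 of the flow as a list of dicts."""
    # The tool-call arguments are a JSON document encoded as a string.
    arguments = json.dumps({"prompt": prompt}, separators=(",", ":"))
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": call_id,
                    "type": "function",
                    "function": {"name": "generate_image", "arguments": arguments},
                }
            ],
        },
        {
            "role": "tool",
            "name": "generate_image",
            "tool_call_id": call_id,
            "content": image_url,
        },
        # The hidden message telling the model the image was already shown.
        {"role": "assistant", "content": DISPLAY_NOTICE},
    ]


history = build_history(
    "draw me a tree",
    "call_identifier",
    "tree",
    "https://yourimagestorage.cloudfront.net/path/to/generated/image",
)
print(json.dumps(history, indent=2, ensure_ascii=False))
```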

If I don’t tell ChatGPT that the image has already been shown to the user, it embeds the image in its message as markdown. That doesn’t work for me for two reasons: first, it isn’t officially documented anywhere that ChatGPT responds in markdown; second, I need to store the image links in a separate table.

The problem is that ChatGPT sometimes still embeds the image link in the text, ignoring my instruction. Can anyone share their thoughts and experiences on integrating image generation into a dialogue?
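
Since the instruction is not always followed, one possible safety net is to post-process the reply: strip any markdown image embeds from the text and collect their URLs for separate storage. The sketch below is only an illustration of that idea, not something confirmed in this thread. The function name `strip_markdown_images` is hypothetical, and the regular expression assumes URLs without spaces or parentheses.

```python
import re

# Matches markdown image embeds such as ![alt](https://host/img.png "title").
# Assumption: the URL itself contains no whitespace or parentheses.
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)')


def strip_markdown_images(text):
    """Return (text without image embeds, list of the embedded URLs)."""
    urls = MD_IMAGE.findall(text)
    cleaned = MD_IMAGE.sub("", text)
    # Collapse the runs of spaces left behind by removed embeds.
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned).strip()
    return cleaned, urls


reply = "Here is your tree. ![tree](https://example.com/a.png)"
cleaned, urls = strip_markdown_images(reply)
print(cleaned)
print(urls)
```

The collected URLs could then go into the separate table, while only the cleaned text is shown in the chat. Plain markdown links of the form `[text](url)` are not touched by this pattern.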

trenton.dambrowitz · February 28, 2024, 5:03pm

I’m not an expert at this stuff, but it seems a bit odd to pass it a hidden message by appending a fake assistant message to the chat history?

The markdown format thing is just a behaviour that it seems to have, quite useful in interfaces like Streamlit but can get annoying when it’s undesired.

First off, have you tried sending that message as user instead of assistant, or even modifying the system prompt for that one interaction? Again, I’m not an expert so that might be a dumb suggestion but could be worth a try!
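
A rough sketch of what that could look like, assuming the chat history is a plain list of message dicts. The helper `with_display_notice` is a made-up name, and whether the model obeys a system or user message more reliably than a fake assistant one is untested here.

```python
def with_display_notice(history, notice, role="system"):
    """Return a copy of the history with the notice appended under `role`.

    The original list is left untouched, so the notice can be added for
    a single request without being saved into the stored conversation.
    """
    if role not in ("system", "user"):
        raise ValueError("role must be 'system' or 'user'")
    return [*history, {"role": role, "content": notice}]


history = [{"role": "user", "content": "draw me a tree"}]
notice = "The image has already been shown to the user. Do not embed or link it."

request_messages = with_display_notice(history, notice)
print(request_messages[-1]["role"])
```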
