Sending image generation requests to DALL-E and sending the results to GPT-4

xTarazTwilight · October 17, 2023, 12:34am

Hi, I want to create a CLI script for blind users. The script should accept an image request, send it to DALL-E to generate images, then pass the generated images to GPT-4. GPT-4's text description of each image should be printed for the user. The user should then be able to start a dialog with GPT-4 based on that text, regenerate the images, or save one.

I tried to write a script, but for some reason GPT-4 returns: "I’m sorry, but as a text-based AI, I’m unable to view or describe images."

Here is my script:

import openai
import os

openai.api_key = "KEY"

def get_images_from_dal_e(description):
    response = openai.Image.create(prompt=description, n=5, size="1024x1024")
    image_urls = [image['url'] for image in response['data']]
    return image_urls

def send_images_to_gpt4(images):
    descriptions = []
    for image_url in images:
        conversation = [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": f"Describe this image: {image_url}"}
        ]
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=conversation,
            max_tokens=60, 
            temperature=0.3 
        )
        descriptions.append(response.choices[0].message.content.strip())
    return descriptions

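def save_image(image_url, filename):
    # The API returns a URL string, not image bytes, so download it first.
    import urllib.request
    with urllib.request.urlopen(image_url) as resp, open(filename, 'wb') as f:
        f.write(resp.read())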

def main():
    description = input("Enter an image description: ")
    while True:
        images = get_images_from_dal_e(description)
        descriptions = send_images_to_gpt4(images)

        print("Here are the descriptions:")
        for i, desc in enumerate(descriptions, start=1):
            print(f"{i}. {desc}")

        choice = input("Enter the number of the description you would like to learn more about, 'r' to regenerate, 'n' to create a new prompt, or 'x' to exit: ")
        if choice == 'r':
            continue
        elif choice == 'x':
            break
        elif choice == 'n':
            description = input("Enter an image description: ")
        else:
            chosen_desc = descriptions[int(choice) - 1]
            chosen_image_url = images[int(choice) - 1]

            while True:
                action = input("Enter 'q' to ask more questions, 'b' to go back, or 's' to save the image: ")
                if action == 'q':
                    pass
                elif action == 'b':
                    break
                elif action == 's':
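                    # A free-text description can be long or contain characters
                    # that are invalid in filenames, so name the file by its number.
                    save_image(chosen_image_url, f"image_{choice}.png")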
                    print("Image saved!")
                    break

if __name__ == "__main__":
    main()
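
The 'q' branch in the script is still a stub. A minimal sketch of how the follow-up dialog could keep its history is shown here; `start_dialog`, `ask_follow_up`, and the `complete` callable are hypothetical names, and `complete` is assumed to wrap whatever chat completion call the script already uses.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def start_dialog(description: str) -> List[Message]:
    """Seed a chat history with the text generated for the chosen image."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "assistant", "content": description},
    ]


def ask_follow_up(
    history: List[Message],
    question: str,
    complete: Callable[[List[Message]], str],
) -> str:
    """Append the question, obtain a reply via `complete`, and record both turns."""
    history.append({"role": "user", "content": question})
    answer = complete(history)  # hypothetical wrapper around the chat API call
    history.append({"role": "assistant", "content": answer})
    return answer
```

In the script, `complete` could be a small function that passes the history as `messages` to the same `openai.ChatCompletion.create` call and returns the reply text. Passing it in as a parameter keeps the history logic testable without network access.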

_j · October 17, 2023, 1:51am

Sorry, nope. You neither have GPT-4 with vision nor do you have GPT-4 with some built-in “get this URL” feature.
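
To illustrate the difference: the script puts the URL inside the text of the message, so the model only ever sees a string of characters. Where a vision-capable chat model is available, the image is passed as its own content part instead. The following is only a sketch of that message shape, assuming access to such a model (the `gpt-4` model in the script is not one); `build_vision_messages` is a hypothetical helper and the URL is a placeholder.

```python
from typing import Any, Dict, List


def build_vision_messages(
    image_url: str,
    question: str = "Describe this image.",
) -> List[Dict[str, Any]]:
    """Build chat messages in which the image is a separate content part."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            # The content is a list of parts rather than a single string.
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]


# Placeholder URL, for illustration only.
messages = build_vision_messages("https://example.com/generated.png")
```

The resulting list would be passed as `messages` to the chat completion call, with the model name set to whichever vision-capable model the account can use.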
