I also added a description where there is an “enum”.
The JSON I made:
{"tools": [{
"type": "function",
"function": {
"name": "createImage",
"description": "Create an image using DALL.E",
"parameters": {
"type": "object",
"properties": {
"prompt": {"type": "string", "description": "A text description of the desired image. The maximum length is 4000 characters"},
"size": {"type": "string", "enum": ["1024x1024", "1792x1024", "1024x1792"], "description": "The size of the generated images"},
"style": {"type": "string", "enum": ["vivid", "natural"], "description": "The style of the generated images. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images."}
},
"required": ["prompt"]
}
}
}
So is my JSON right, or does it need fixes or improvements?
DALL-E 3 on the API has its own AI that rewrites prompts. The more this AI writes, the longer it takes to get an image back.
The real maximum length that can be passed into DALL-E 3 after its own AI pre-filter is 256 tokens.
So don’t waste time and expense having your chatbot AI write a novel as a prompt one token at a time. Instruct your AI, and the DALL-E 3 AI (perhaps by your own backend injection of additional “jailbreak” prompt text within the function handler), to prefer passing the user’s input through without alteration if it already conforms.
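A minimal sketch of what that function handler could look like, assuming the official `openai` and `tiktoken` Python packages; the `handle_create_image` name, the pass-through wrapper wording, and the use of the cl100k_base encoding as a stand-in for DALL-E 3’s tokenizer are all my own illustrative choices, not anything documented:

```python
import json
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")  # rough stand-in for DALL-E 3's tokenizer

# Illustrative wrapper text; the exact wording that best suppresses rewriting
# is something you have to experiment with.
PASS_THROUGH = (
    "My prompt is already fully detailed. "
    "Use it exactly as written, without rewriting: "
)

def handle_create_image(arguments_json: str) -> str:
    """Hypothetical backend handler for the 'createImage' tool call."""
    args = json.loads(arguments_json)   # tool_call.function.arguments
    prompt = args["prompt"]

    # Anything past ~256 tokens is discarded by DALL-E 3 anyway, so trim early
    # rather than paying the chat model to write a novel.
    tokens = enc.encode(prompt)
    if len(tokens) > 256:
        prompt = enc.decode(tokens[:256])

    response = client.images.generate(
        model="dall-e-3",
        prompt=PASS_THROUGH + prompt,
        size=args.get("size", "1024x1024"),
        style=args.get("style", "vivid"),
        n=1,
    )
    return response.data[0].url
```

The wrapper text is the part worth experimenting with; the token trim just keeps you from paying for prompt text that would be discarded anyway.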
You can write a far longer, multi-line function description than just “create an image”.
You should inform the AI, in the function description, of the internal content prohibitions on real people, recent real artists and styles, etc., and tell it to shape the language it sends correctly and intelligently. The API rewriter will significantly distort the meaning when it rewrites over these prohibitions, and yet DALL-E’s final “content policy” check will still give you a costly error and block the image generation if something like “Mickey Mouse” goes through.
You should inform the AI that negation (trying to discourage imagery elements by writing about them) simply does not work.
You should inform the AI which user language, style, and content, or which explicit instructions, should trigger selection of square, wide, or tall images.
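Folding those last few points into the function description itself, a sketch (with illustrative wording you would tune against your own results) might look like this, written as a Python dict so it drops straight into a tools list:

```python
create_image_tool = {
    "type": "function",
    "function": {
        "name": "createImage",
        "description": """Create an image from a text prompt with DALL-E 3.
Guidelines for the prompt you send:
- If the user's text already describes a scene, pass it through unaltered.
- Never name real people, living artists, or trademarked characters; describe a generic look-alike instead, or generation will be blocked at cost.
- Do not use negation ("no text", "without hats"); describe only what should appear.
- Pick 1792x1024 for wide scenery, 1024x1792 for standing full-body subjects, 1024x1024 otherwise.""",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "A text description of the desired image. The maximum length is 4000 characters",
                },
                "size": {"type": "string", "enum": ["1024x1024", "1792x1024", "1024x1792"]},
                "style": {"type": "string", "enum": ["vivid", "natural"]},
            },
            "required": ["prompt"],
        },
    },
}
```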
Tall images especially need more prompt language along the lines of “tall portrait-aspect-ratio, full-body-length” passed in, or DALL-E will produce a rotated image. You can also do this by injection after receiving the tool call.
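Continuing the hypothetical handler sketch above, that injection can be a couple of lines; the hint wording is an assumption to experiment with, not a magic string:

```python
# Illustrative: runs after json.loads() of the tool-call arguments,
# before the images.generate() call in the earlier sketch.
TALL_HINT = "A tall portrait-aspect-ratio, full-body-length composition: "

if args.get("size") == "1024x1792":
    # Without phrasing like this, DALL-E 3 often returns a rotated landscape image.
    prompt = TALL_HINT + prompt
```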
“Natural” now produces poor results from a significantly different model method: people who look like they were Photoshopped in.
The tool spec itself looks fine. The list of tools needs to be closed with ], and the outer object with a final }.
An idea for when you’re at the advanced level: another tool property, “send_unaltered”: boolean, that lets the user or the AI decide that DALL-E 3 is not allowed to do the rewriting it normally does.
Also, add a DALL-E 2 function that takes just a prompt. DALL-E 2 is something ChatGPT Plus doesn’t have. Describe to the AI that the function is useful for abstract artistic images like paintings (and the user gets a 1024px image), and that it costs less.
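Putting those two ideas together, the full tools list might look like the sketch below; the second function’s name `createImageDalle2` and all the description wording are my own illustrative choices:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "createImage",
            "description": "Create an image with DALL-E 3; best for detailed, photographic, or dramatic scenes.",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {"type": "string", "description": "A text description of the desired image. The maximum length is 4000 characters"},
                    "size": {"type": "string", "enum": ["1024x1024", "1792x1024", "1024x1792"]},
                    "style": {"type": "string", "enum": ["vivid", "natural"]},
                    "send_unaltered": {
                        "type": "boolean",
                        "description": "Set true to forbid any rewriting of the prompt before image generation",
                    },
                },
                "required": ["prompt"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "createImageDalle2",
            "description": "Create a 1024px image with DALL-E 2; cheaper, useful for abstract artistic images such as paintings.",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {"type": "string", "description": "A text description of the desired image"},
                },
                "required": ["prompt"],
            },
        },
    },
]  # note the closing ] that the original JSON was missing
```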
Much of that is my own knowledge from solving others’ problems.
This forum has its own search, where you get useful responses members have written instead of Google results full of bait videos. You can add “@_j” to the search terms to get what I’ve said before. It even has an AI.
Really pore over what they write. For example, they offer a prompt to get less AI rewriting by DALL-E 3, but it is not as effective as wrapping your prompt in a subterfuge of lies to get absolutely no alterations to what you sent…
Other specifics, like the actual token length after which input is discarded, come from a Discord chat with DALL-E developers; others from probing ChatGPT into revealing how it uses its DALL-E tool; and others simply from employing the black box of image generation and seeing what it will produce for you (up to and including having its rewritten prompt be the AI dumping out its own programming).
OpenAI doesn’t even document “hey, we programmed our AI to output markdown formatting at you”. Other non-novel API mechanisms are often regarded as secrets, documented by misdirection, and curtailed by blocking. They don’t tell you how to make a chatbot that exceeds ChatGPT, either…
advanced tricks, for me to know…
Hey, how can you still produce images? Is your end alright and functional? My experience with both the chat and the API was abysmal; it almost always refused my prompts.
Prompts that “trigger” content policy keywords the authoring AI is not aware of, and that make sense only when viewed through OpenAI’s motivations, will be denied: trademark violations, political imagery, real persons, copying copyrighted works, etc. Understand how it works, don’t try to skew the narrative by dropping “Israel” or “Hamas” into the prompt, contemplate how you’d write a word filter yourself under similar publicity pressures, and you’ll have better success.
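If you do want a crude local pre-screen before spending an API call that moderation will reject anyway, it can be as simple as a term list built from your own past denials; OpenAI’s actual filter is not published, so the terms below are only the examples mentioned in this thread:

```python
# Illustrative client-side pre-screen; only catches cases you already know get denied.
BLOCKED_TERMS = {"mickey mouse", "israel", "hamas"}  # extend from your own rejections

def likely_blocked(prompt: str) -> bool:
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)
```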