I have been trying to upload a file using the Assistants API in Java. I have tried openai-java and simple-openai and I keep getting different errors; I don't know if it's the API, but I can't get it to work. I do the upload with purpose "assistants", but when I add the file to the message thread I get errors.
For reference, I'm uploading a selfie and a picture ID to validate the identity of the user, and then I'm going to pass them to a function that saves them into a database.
If anybody has had any luck with uploads, any guidance is very much appreciated.
Do you have error logs to share? Errors (and logs) are usually pretty useful for getting an idea of what is going on, and your message doesn't include any details about them.
The first thing: decide whether you are using the file for computer vision, or sending it to Code Interpreter to be used in its Python environment.
If it is for image recognition with vision, then the uploaded file's purpose needs to be "vision", not "assistants".
Then you can construct a vision user message with a text part and an image part that references the file ID.
Thread thread =
        // TODO: Update this example once we support `.create()` without arguments.
        client.beta().threads().create(BetaThreadCreateParams.builder().build());

client.beta()
        .threads()
        .messages()
        .create(BetaThreadMessageCreateParams.builder()
                .threadId(thread.id())
                .role(BetaThreadMessageCreateParams.Role.USER)
                .content("I need to solve the equation `3x + 11 = 14`. Can you help me?")
                .build());
You'll need to replace the plain content string with a multi-part, typed content object (a text part plus an image part). The Java SDK docs don't have direct examples of this.
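Since you mentioned trying simple-openai as well, here is a rough sketch of what the vision-purpose upload looks like with that client (the openai-java builder names differ, so adapt accordingly); the file path is a placeholder:

import java.nio.file.Paths;

import io.github.sashirestela.openai.SimpleOpenAI;
import io.github.sashirestela.openai.domain.file.FileRequest;
import io.github.sashirestela.openai.domain.file.FileRequest.PurposeType;

// Upload the image with purpose "vision" so it can later be referenced
// by file ID inside a multi-part thread message.
var openAI = SimpleOpenAI.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .build();
var fileRequest = FileRequest.builder()
        .file(Paths.get("selfie.jpg"))   // placeholder path
        .purpose(PurposeType.VISION)     // not the "assistants" purpose
        .build();
var fileId = openAI.files().create(fileRequest).join().getId();
System.out.println("Uploaded file ID: " + fileId);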
I'm uploading two images, one selfie and one picture ID. They should be passed as byte arrays to a function; the function saves the collected information to a database and runs a validation process.
When I pass the image as a string representation inside the message content, the run fails with "Request too large for gpt-4o", so inlining the image in multi-part content won't work.
Have you checked whether the file upload response returns a valid file ID before you add it to the thread?
Maybe it's an issue with how the file is being referenced in the message payload.
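Something like this (a rough sketch, assuming a simple-openai file response) would make that check explicit before the ID goes into the thread:

// Assumes 'fileResponse' is the object returned by the files().create(...) call.
var fileId = fileResponse.getId();
if (fileId == null || !fileId.startsWith("file-")) {
    throw new IllegalStateException("Upload did not return a usable file ID: " + fileId);
}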
I found a workaround: if I don't upload the file but instead use a local file, as in the Chat Completions example, it works:
var chatRequest = ChatRequest.builder()
        .model("gpt-4o-mini")
        .messages(List.of(
                UserMessage.of(List.of(
                        ContentPartText.of(
                                "What do you see in the image? Give in details in no more than 100 words."),
                        ContentPartImageUrl.of(ImageUrl.of(
                                Base64Util.encode("src/demo/resources/machupicchu.jpg", MediaType.IMAGE)))))))
        .temperature(0.0)
        .maxCompletionTokens(500)
        .build();
var chatResponse = openAI.chatCompletions().createStream(chatRequest).join();
chatResponse.filter(chatResp -> chatResp.getChoices().size() > 0 && chatResp.firstContent() != null)
        .map(Chat::firstContent)
        .forEach(System.out::print);
But if I try to use an uploaded file for vision, I get an error with the following code:
var chatRequest = ChatRequest.builder()
        .model("gpt-4o-mini")
        .messages(List.of(
                ChatMessage.UserMessage.of(List.of(
                        ContentPart.ContentPartText.of(
                                "You are a chat assistant to an ai assistant, the response you give will go to the assistant who will then send it back to the user, start your response with {{ The uploaded image contains }}. What do you see in the image? Give in details no more than 30 words."),
                        ContentPart.ContentPartImageFile.of(ContentPart.ContentPartImageFile.ImageFile.of(fileid))))))
        .temperature(0.0)
        .maxCompletionTokens(500)
        .build();
var chatResponse = openAI.chatCompletions().createStream(chatRequest).join();
chatResponse.filter(chatResp -> chatResp.getChoices().size() > 0 && chatResp.firstContent() != null)
        .map(Chat::firstContent)
        .forEach(sb::append);
I see that you are mixing two completely different things: Chat Completions and the Assistants API are different approaches, and they shouldn't be mixed.
Trying to interpret your messages, I think you want to use the Assistants API with the vision feature and uploaded images to be compared, right?
If that is the case, I have prepared an example using simple-openai:
Example Code
package io.github.sashirestela.openai.demo;
import java.nio.file.Paths;
import java.util.List;
import io.github.sashirestela.openai.SimpleOpenAI;
import io.github.sashirestela.openai.common.content.ContentPart.ContentPartImageFile;
import io.github.sashirestela.openai.common.content.ContentPart.ContentPartImageFile.ImageFile;
import io.github.sashirestela.openai.common.content.ContentPart.ContentPartText;
import io.github.sashirestela.openai.common.content.ContentPart.ContentPartTextAnnotation;
import io.github.sashirestela.openai.common.content.ImageDetail;
import io.github.sashirestela.openai.domain.assistant.AssistantRequest;
import io.github.sashirestela.openai.domain.assistant.ThreadMessageDelta;
import io.github.sashirestela.openai.domain.assistant.ThreadMessageRequest;
import io.github.sashirestela.openai.domain.assistant.ThreadMessageRole;
import io.github.sashirestela.openai.domain.assistant.ThreadRunRequest;
import io.github.sashirestela.openai.domain.assistant.events.EventName;
import io.github.sashirestela.openai.domain.file.FileRequest;
import io.github.sashirestela.openai.domain.file.FileRequest.PurposeType;
import io.github.sashirestela.openai.domain.file.FileResponse;
public class Example {

    private SimpleOpenAI openAI;
    private String fileIdSelfie;
    private String fileIdPictureId;
    private String assistantId;
    private String threadId;

    public Example() {
        openAI = SimpleOpenAI.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();
    }
    public void run() {
        var question = "Tell me if both images correspond to the same person";
        System.out.println("Question: " + question);

        // Upload both images with purpose VISION so they can be referenced by file ID.
        fileIdSelfie = uploadFile("src/demo/resources/sam_selfie.jpg", PurposeType.VISION).getId();
        fileIdPictureId = uploadFile("src/demo/resources/sam_id.jpg", PurposeType.VISION).getId();

        assistantId = openAI.assistants().create(AssistantRequest.builder()
                .model("gpt-4o")
                .instructions("You are an expert comparing images")
                .build())
                .join().getId();

        threadId = openAI.threads().create()
                .join().getId();

        // One user message: the question plus both images as image-file content parts.
        openAI.threadMessages().create(threadId, ThreadMessageRequest.builder()
                .role(ThreadMessageRole.USER)
                .content(List.of(
                        ContentPartText.of(question),
                        ContentPartImageFile.of(ImageFile.of(fileIdSelfie, ImageDetail.LOW)),
                        ContentPartImageFile.of(ImageFile.of(fileIdPictureId, ImageDetail.LOW))))
                .build())
                .join();

        // Run the assistant on the thread and stream the response events.
        var responseStream = openAI.threadRuns().createStream(threadId, ThreadRunRequest.builder()
                .assistantId(assistantId)
                .build())
                .join();
        System.out.print("Answer: ");
        responseStream.forEach(e -> {
            switch (e.getName()) {
                case EventName.THREAD_MESSAGE_DELTA:
                    var messageDeltaFirstContent = ((ThreadMessageDelta) e.getData()).getDelta().getContent().get(0);
                    if (messageDeltaFirstContent instanceof ContentPartTextAnnotation) {
                        System.out.print(((ContentPartTextAnnotation) messageDeltaFirstContent).getText().getValue());
                    }
                    break;
                default:
                    break;
            }
        });
        System.out.println();
    }
    public void clean() {
        openAI.files().delete(fileIdSelfie).join();
        openAI.files().delete(fileIdPictureId).join();
        openAI.assistants().delete(assistantId).join();
        openAI.threads().delete(threadId).join();
        System.out.println("\nAll resources were deleted");
    }

    private FileResponse uploadFile(String filePath, PurposeType purpose) {
        var fileRequest = FileRequest.builder()
                .file(Paths.get(filePath))
                .purpose(purpose)
                .build();
        return openAI.files().create(fileRequest).join();
    }

    public static void main(String[] args) {
        var example = new Example();
        example.run();
        example.clean();
    }
}
Example Output
Question: Tell me if both images correspond to the same person
Answer: Yes, both images correspond to the same person, Sam Altman.
All resources were deleted
Moreover, simple-openai has demo code for vision on thread messages in its repository.
Thanks for the examples, I'll try them to see if they work out. Let me give you a little context on what I'm doing and how I got it to work.
I got it working with Chat Completions and the assistant together. I know the APIs shouldn't be mixed, so let me explain what I'm doing. I have a yard management software and I'm currently adding a new feature: using a WhatsApp channel, I'm pre-registering drivers who are going to pick up or deliver. After a series of questions I get the required information from the drivers; the selfie and the picture of their driver's license are the last step, and after I validate both I add the driver to a database using a function.
How I got it to work: first I collect all the info using the assistant. When the images come in from WhatsApp I download each one and upload it to the file store as a vision file, so I can download it again at the end from the function before saving to the database (I pass the file ID). I use Chat Completions to validate the image I receive, and I pass the Chat Completions response into the assistant. It works perfectly; actually I think it's better, since I don't have to add more instructions to the assistant and I can define specific tasks for Chat Completions and pass their results into the assistant as additional assistant messages.
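For anyone wiring things up the same way, here is a rough sketch of the glue between the two APIs with simple-openai. The model, prompt, and file path are placeholders, `openAI` and `threadId` are assumed to come from the assistant setup shown earlier in the thread, and the ASSISTANT thread-message role is an assumption to verify against your library version:

// (imports are the same as in the Chat Completions and thread-message snippets above)

// 1) Validate the incoming image with Chat Completions, as in the workaround above.
var validationRequest = ChatRequest.builder()
        .model("gpt-4o-mini")
        .messages(List.of(UserMessage.of(List.of(
                ContentPartText.of(
                        "Describe the uploaded image in no more than 30 words."),
                ContentPartImageUrl.of(ImageUrl.of(
                        Base64Util.encode("driver_selfie.jpg", MediaType.IMAGE)))))))  // placeholder path
        .temperature(0.0)
        .maxCompletionTokens(300)
        .build();
var validationText = openAI.chatCompletions()
        .create(validationRequest)   // non-streaming variant
        .join()
        .firstContent();

// 2) Feed the validation result back into the assistant's thread so the assistant
//    can continue the WhatsApp conversation with that context.
//    (Assumes ThreadMessageRole.ASSISTANT exists in your simple-openai version.)
openAI.threadMessages().create(threadId, ThreadMessageRequest.builder()
        .role(ThreadMessageRole.ASSISTANT)
        .content(List.of(ContentPartText.of("The uploaded image contains: " + validationText)))
        .build())
        .join();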