Missing required parameter: 'file_id'

I am new to this OpenAI API stuff…

and have attempted to follow this documentation to implement the File Search, as required by my project, but in that project’s language: Groovy.

As of right now, I am only trying to test uploading files to a vector store that gets created programmatically. Once I have that working, I will work on the actual prompts that use these files (e.g. extract data from this PDF).

My code, as of right now, looks like this:

public final class OpenAIv4oUtils {
	public static final String apiKey = System.getenv("OPENAI_API_KEY");
	public static Object VectorStore = null,
	Assistant = null;
	
	private static Request.Builder initRequestBuilder() { 
		return new Request.Builder()
			.addHeader("Authorization", "Bearer ${apiKey}")
			.addHeader("OpenAI-Beta", "assistants=v2");
	}
	public static Object UploadFiles(File file) {
		OkHttpClient client = new OkHttpClient()
		
		def payload = [
			file: [
				file_id: file.getName(),
				name: file.getName(),
				content: Base64.encoder.encodeToString(Files.readAllBytes(file.toPath())),
			]
		]
    
	    RequestBody requestBody = RequestBody.create(
	        MediaType.parse("application/json"),
			JsonOutput.toJson(payload),
	    )

		def request = this.initRequestBuilder()
				.url("https://api.openai.com/v1/vector_stores/${this.CreateVectorStore().id}/files")
				.addHeader("Content-Type", "application/json")
				.post(requestBody)
				.build()

		def response = client.newCall(request).execute()
		return new groovy.json.JsonSlurper().parseText(response.body().string())
	}

	public static final Object CreateVectorStore() {
		if (this.VectorStore != null)
			return this.VectorStore;

		def client = new OkHttpClient()
		def request = this.initRequestBuilder()
				.url("https://api.openai.com/v1/vector_stores")
				.post(RequestBody.create(MediaType.parse("application/json"), '{ "name": "Court Cases" }'))
				.build()

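		def response = client.newCall(request).execute()
		// Cache the parsed store so later calls reuse it instead of creating a new one.
		this.VectorStore = new groovy.json.JsonSlurper().parseText(response.body().string())
		return this.VectorStore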
	}
}

When I run OpenAIv4oUtils.UploadFiles([somePdfFile]), I get a response telling me that the file_id parameter is missing.

I don't know what that is, what its role is in this request to upload a file, or how I am supposed to fix this…

If you have not already, also have a look at the API specs for file-related actions, which offer more information about the file_id parameter.

https://platform.openai.com/docs/api-reference/files

Basically, before attaching a file to a vector store, you must upload it via the OpenAI files endpoint with the purpose "assistants". The response to that upload contains a file_id, which you then need to reference in your vector store operations.

The basic failure here is that, as noted above, you first need to upload the file to OpenAI file storage. After a successful request, the endpoint's response supplies the file ID.

Your assumption that you can upload and attach in a single JSON request is incorrect. The upload is sent as multipart/form-data: one part is the MIME-encoded file attachment, and another part is the purpose, which must be “assistants”.
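To make the two parts concrete, here is a minimal sketch that assembles such a body by hand. The helper name buildUploadBody and the boundary value are made up for illustration; in practice an HTTP client's multipart builder generates this for you.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: builds a multipart/form-data body with a "purpose" part and a "file" part.
byte[] buildUploadBody(String boundary, String fileName, String mimeType, byte[] fileBytes) {
    String crlf = "\r\n"
    // Part 1: the purpose field. Part 2: the headers that precede the raw file bytes.
    String head = "--" + boundary + crlf +
        'Content-Disposition: form-data; name="purpose"' + crlf + crlf +
        "assistants" + crlf +
        "--" + boundary + crlf +
        'Content-Disposition: form-data; name="file"; filename="' + fileName + '"' + crlf +
        "Content-Type: " + mimeType + crlf + crlf
    // The closing boundary ends with two extra dashes.
    String tail = crlf + "--" + boundary + "--" + crlf

    def out = new ByteArrayOutputStream()
    out.write(head.getBytes(StandardCharsets.UTF_8))
    out.write(fileBytes)
    out.write(tail.getBytes(StandardCharsets.UTF_8))
    return out.toByteArray()
}
```

The request's Content-Type header would then be multipart/form-data with the same boundary value appended as a parameter.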

The next step is to create the vector store. You can pass the file_id of multiple files when creating the store, or you can add them afterwards. You cannot upload file contents directly to a vector store.
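As a rough sketch of the two options, the JSON bodies could be built like this. The helper names and the file ID are placeholders, not values from the thread; real IDs come back from the files endpoint.

```groovy
import groovy.json.JsonOutput

// Body for POST /v1/vector_stores: create the store with uploaded files already attached.
String createStoreBody(String name, List<String> fileIds) {
    return JsonOutput.toJson([name: name, file_ids: fileIds])
}

// Body for POST /v1/vector_stores/{vector_store_id}/files: attach one uploaded file later.
String attachFileBody(String fileId) {
    return JsonOutput.toJson([file_id: fileId])
}

// Placeholder ID for illustration only.
println createStoreBody("Court Cases", ["file-abc123"])
println attachFileBody("file-abc123")
```

The second body is the one the original request was missing: the vector store files endpoint expects a top-level file_id that refers to an already uploaded file, not the file contents.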

Then include the vector store ID you obtained as a parameter when you create or modify the assistant.
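A minimal sketch of the body for creating such an assistant (POST /v1/assistants) might look like the following. The helper name, the model name "gpt-4o", and the vector store ID are assumptions for illustration.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper: builds the JSON body for an assistant that can search one vector store.
String createAssistantBody(String model, String instructions, String vectorStoreId) {
    return JsonOutput.toJson([
        model: model,
        instructions: instructions,
        // Enable the file search tool...
        tools: [[type: "file_search"]],
        // ...and point it at the vector store created earlier.
        tool_resources: [file_search: [vector_store_ids: [vectorStoreId]]]
    ])
}

// Placeholder values for illustration only.
println createAssistantBody("gpt-4o", "You are a court case examiner.", "vs_abc123")
```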

Thank you for your detailed response! I not only learned from it, but when I applied it, I got the file_id in the success response, as I was expecting!

My code now looks like this:

	public static Object UploadFiles(File file) {
		OkHttpClient client = new OkHttpClient()

		RequestBody requestBody = new MultipartBody.Builder()
				.setType(MultipartBody.FORM)
				.addFormDataPart("file", file.getName(), RequestBody.create(MediaType.parse(FileUtils.GetMediaType(file)), file))
				.addFormDataPart("purpose", "assistants")
				.build()

		// No explicit Content-Type header: OkHttp derives it (including the boundary) from the multipart body.
		def request = this.initRequestBuilder()
				.url("https://api.openai.com/v1/files")
				.post(requestBody)
				.build()

		def response = client.newCall(request).execute()
		return new groovy.json.JsonSlurper().parseText(response.body().string())
	}

Before I close out this question and mark your response as the solution, could you do me a favor and walk me through how to perform a File Search with some initial prompt? For example, an initial prompt of:

You are a court case examiner. 

You have been tasked with pulling property owner(s) names and mailing addresses, as well as the address of the property that they have Foreclosure Case on, from a complaint PDF.

Extract those from this PDF, and output the data in JSON format that looks like the following: 

{
  owner1Name,
  owner2Name,
  mailingAddress,
  propertyAddress,
}

with the follow-up prompt being either the PDF file to parse or its text contents?

The AI is the one that does the searching by writing queries. The AI decides when calls to its myfiles_browser tool will be useful, which gives the search results.
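Since the AI writes the search queries itself, your side of the exchange is only to post the user message and start a run. A minimal sketch of those two request bodies, assuming the Assistants v2 thread and run endpoints, with made-up helper names and placeholder IDs:

```groovy
import groovy.json.JsonOutput

// Body for POST /v1/threads: one user message with the uploaded PDF attached for file search.
String createThreadBody(String userText, String fileId) {
    return JsonOutput.toJson([
        messages: [[
            role: "user",
            content: userText,
            attachments: [[file_id: fileId, tools: [[type: "file_search"]]]]
        ]]
    ])
}

// Body for POST /v1/threads/{thread_id}/runs: start the assistant on that thread.
String createRunBody(String assistantId) {
    return JsonOutput.toJson([assistant_id: assistantId])
}

// Placeholder IDs for illustration only.
println createThreadBody("Extract the owners and addresses from this complaint.", "file-abc123")
println createRunBody("asst_abc123")
```

Whether the assistant actually calls the search tool during the run is up to the model, which is why the instructions matter.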

However, the way OpenAI has implemented file search in this myfiles_browser tool is poor. Knowing when searching is useful is nearly impossible for the AI. Irrelevant results are returned regardless, at your expense.

The instructions that accompany the tool specification tell the AI that any files there would have been uploaded by the user, the one the AI is talking to, even when they are part of domain knowledge for a specialized AI.

In the instructions, I can instead say directly that I uploaded the knowledge and that it is part of, for example, a programmer’s Assistant.

You have to override this poor orientation, and also tell the AI what it will find when doing a search. This recent post gives a prompt to tune up the functionality.
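As an illustration of that idea, the assistant's instructions could be assembled along these lines. The wording is a hypothetical example, not the prompt from the linked post, and buildInstructions is a made-up helper.

```groovy
// Hypothetical helper: appends a description of the searchable files to the task prompt.
String buildInstructions(String taskPrompt, String fileDescription) {
    return [
        taskPrompt.trim(),
        "",
        // Counter the tool's default assumption about who uploaded the files.
        "The file_search tool searches documents supplied by the developer, not by the user you are talking to.",
        // Tell the AI what a search will return so it can judge when searching is useful.
        "Searching will return: " + fileDescription.trim()
    ].join("\n")
}

println buildInstructions(
    "You are a court case examiner.",
    "text passages from foreclosure complaint PDFs"
)
```

The resulting string would go in the instructions field when creating or modifying the assistant.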