I asked GPT-4 to guess what the API call would look like, and it said:
As of my last training data in January 2022, OpenAI hadn’t publicly released a specific “ChatGPT-4” API. My training data includes information up to ChatGPT-3, and there was an API for that.
However, let’s assume the API would be somewhat similar to ChatGPT-3’s. In the case of ChatGPT-3, the API call would be something like:
import requests

def call_chatgpt4_to_transcribe(image_path):
    url = "https://api.openai.com/v1/engines/chatgpt-4.0-turbo/completions"  # Assuming hypothetical endpoint
    headers = {
        'Authorization': 'Bearer YOUR_OPENAI_API_KEY',
        'Content-Type': 'application/json',
    }
    data = {
        "prompt": "Transcribe the following image:",  # This is just an example; the actual API might require direct image input
        "image": image_path  # Hypothetical, since ChatGPT-3 doesn't support image input directly
    }
    response = requests.post(url, headers=headers, json=data)
    result = response.json()
    # Assuming the transcribed text is available under the 'choices' key
    return result['choices'][0]['text']
You can expect that when the API is turned on, the role message “content” schema will also take a list (array) type instead of just a string.
Array elements could then be either a normal prompt string, or a dictionary (JSON object) with the data type “image” as the key and bytestream-encoded image data as the value.
const completion = await openai.chat.completions.create({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Here are two pictures of vegetables I took from the supermarket, which one will be good for curry?" }
  ],
  files: [
    fs.createReadStream("IMG0001.png"),
    fs.createReadStream("IMG0002.png")
  ],
  model: "gpt-4-vision",
});
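Or, if the images instead ride along inside the “content” array as described above, rather than in a separate files parameter, the call might look roughly like this (every field name here is pure speculation, not a documented API):

import fs from "fs";
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical: "content" as an array mixing plain prompt strings and
// { image: <base64 data> } objects; all of this is a guess.
const completion = await openai.chat.completions.create({
  model: "gpt-4-vision",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    {
      role: "user",
      content: [
        "Here are two pictures of vegetables I took from the supermarket, which one will be good for curry?",
        { image: fs.readFileSync("IMG0001.png").toString("base64") },
        { image: fs.readFileSync("IMG0002.png").toString("base64") },
      ],
    },
  ],
});

console.log(completion.choices[0].message.content);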
Of course, these are just wild guesses, for science! Can’t wait until DevDay when they unveil everything.
How did this site become Reddit, where you simply make your nonsense look plausible?
You will try supplying a “prompt” parameter to a nonexistent chat model through an “engines” endpoint and fail…
If you like code:
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

image = b"..."  # binary image stream

completion = openai.ChatCompletion.create(
    model="gpt-4-0xxx",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant and can describe images.",
        },
        {
            "role": "user",
            "content": ["What's in this screenshot?", {"image": image}],
        },
    ],
)
print(completion["choices"][0]["message"]["content"])
{
    "role": "user",
    "content": ["What's in this screenshot?", {"image": image}],
},
This makes sense, as I cannot imagine how to put the image in context with a separate file input parameter. It definitely needs to be attached in the messages!
The only “enterprise” offering by OpenAI that would mention 32k is ChatGPT Enterprise. It is not “the API”; it is a business-oriented version of ChatGPT, with centralized billing and user management for the web-interface chatbot.