Hopefully I haven’t missed something here, but I’m struggling to get my assistant to properly call its function. The function should be used whenever the assistant receives an image as part of the message. If I give the assistant just text, it works fine, but if I give it an image and text, it hallucinates my entire input.
Some examples of the createMessage functions I’ve tried:
V1:
const createMessage = async (threadId, userMessage) => {
  console.log('create message triggered')
  try {
    if (userMessage.image) {
      // Log only the data URL prefix, not the whole image
      console.log(userMessage.image.split(',')[0])
      console.log('create message file:', userMessage.file)
      await fetch(`https://api.openai.com/v1/threads/${threadId}/messages`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          'Authorization': `Bearer APIKEY`,
          'OpenAI-Beta': 'assistants=v1',
        },
        body: JSON.stringify({
          role: "user",
          content: userMessage, // the whole userMessage object
          tool: "analyzeImage" // trying to point the assistant at the function
        })
      });
      console.log('User message sent to API contains image');
    }
  } catch (error) {
    console.error('createMessage failed:', error)
  }
}
V2:
const createMessage = async (threadId, userMessage) => {
  console.log('create message triggered')
  try {
    if (userMessage.image) {
      // Log only the data URL prefix, not the whole image
      console.log(userMessage.image.split(',')[0])
      console.log('create message file:', userMessage.file)
      await fetch(`https://api.openai.com/v1/threads/${threadId}/messages`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          'Authorization': `Bearer APIKEY`,
          'OpenAI-Beta': 'assistants=v1',
        },
        body: JSON.stringify({
          role: "user",
          inputs: {
            text: userMessage.text, // Pass along any user-entered text
            data: {
              file: userMessage.file,
              image: userMessage.image.split(",")[1] // Split the string and take only the base64 part
            }
          },
          tool: "analyzeImage" // trying to point the assistant at the function
        })
      });
      console.log('User message sent to API contains image');
    }
  } catch (error) {
    console.error('createMessage failed:', error)
  }
}
The function tool is called analyzeImage. This is the relevant part of its definition:
"name": "analyzeImage",
"parameters": {
  "type": "object",
  "properties": {
    "image": {
      "type": "string",
      "contentMediaType": "image/jpeg",
      "description": "The base64-encoded string of the image."
    }
  },
  "required": [
    "image"
  ]
},
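For context, here is a minimal sketch of how that fragment would sit inside the assistant's tools array, assuming the usual function-tool wrapper (an object with type "function" and a nested function definition). The description text and the variable names analyzeImageTool and tools are placeholders for illustration, not my actual configuration.

```javascript
// Sketch: the analyzeImage fragment wrapped as a function tool.
// The description string is a hypothetical placeholder.
const analyzeImageTool = {
  type: "function",
  function: {
    name: "analyzeImage",
    description: "Analyze an image supplied by the user.", // placeholder wording
    parameters: {
      type: "object",
      properties: {
        image: {
          type: "string",
          contentMediaType: "image/jpeg",
          description: "The base64-encoded string of the image."
        }
      },
      required: ["image"] // no trailing comma, so it also parses as JSON
    }
  }
};

// The array that would be supplied as the assistant's `tools`.
const tools = [analyzeImageTool];

// Round-trip through JSON to confirm the definition serializes cleanly.
console.log(JSON.parse(JSON.stringify(tools))[0].function.name);
```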
I saw an example from Geoligard where he uses an actual file URL, but since this is being done on mobile, I was trying to use just the base64 version of the image. Any input on why I’m getting the hallucinations, and why it almost never triggers the function, would be appreciated.
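To make the base64 part concrete, here is a minimal sketch of the split I'm doing, assuming userMessage.image is a data URL of the form data:image/jpeg;base64,<payload>. The helper name splitDataUrl and the sample string are hypothetical and only there for illustration.

```javascript
// Split a data URL such as "data:image/jpeg;base64,AAAA" into its parts.
// splitDataUrl is a hypothetical helper name, not part of any API.
function splitDataUrl(dataUrl) {
  const commaIndex = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || commaIndex === -1) {
    throw new Error("Expected a data URL of the form data:<type>;base64,<payload>");
  }
  // Everything between "data:" and the first comma, e.g. "image/jpeg;base64"
  const header = dataUrl.slice("data:".length, commaIndex);
  const [mediaType, ...params] = header.split(";");
  return {
    mediaType,                             // e.g. "image/jpeg"
    isBase64: params.includes("base64"),   // true when the payload is base64
    base64: dataUrl.slice(commaIndex + 1), // same value as split(",")[1] for base64 data
  };
}

// Made-up sample input (the first bytes of a JPEG header, base64-encoded).
const parts = splitDataUrl("data:image/jpeg;base64,/9j/4AAQSkZJRg==");
console.log(parts.mediaType, parts.isBase64, parts.base64.length);
```

Unlike a bare split(","), this version throws if the string is not a data URL, so an empty or malformed image would show up before anything is sent.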