I am paying over $1 for a single request to analyse a 200 KB image

Can someone explain why I need to pay so much money for a single request? The API calculator showed me that it should cost less than 5 cents.

I am submitting an image file of less than 200 KB, with a resolution of less than 2k square.

def analyze_chart_with_gpt4o(image_path):
    base64_image = encode_image(image_path)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    payload = {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Analyze this chart with listed / bulletpointed keypoints and price targets. Keep it within 250 characters and use a casual tone."},
                    # Truncate the base64 string to avoid context length issues
                    {"type": "text", "text": f"data:image/jpeg;base64,{base64_image[:50000]}"}
                ]
            }
        ],
        "max_tokens": 1500
    }

    try:
        response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
        response.raise_for_status()  # Raise an HTTPError for bad responses (4xx and 5xx)

        response_json = response.json()
        logger.debug(f"API Response: {response_json}")

        if 'choices' in response_json and len(response_json['choices']) > 0:
            return response_json['choices'][0]['message']['content']
        else:
            logger.error(f"Unexpected API response structure: {response_json}")
            return "GPT-4o analysis unavailable due to an unexpected response structure."
    except requests.exceptions.HTTPError as http_err:
        logger.error(f"HTTP error occurred: {http_err}")
        logger.error(f"Response content: {response.content}")
        return "GPT-4o analysis unavailable due to a request error."
    except requests.exceptions.RequestException as req_err:
        logger.error(f"Request error occurred: {req_err}")
        return "GPT-4o analysis unavailable due to a request error."

Please read the vision guide.

You are sending your image data as text,

{"type": "text", "text": f"data:image/jpeg;base64,{base64_image[:50000]}"}

50,000 seemingly random base64 characters will be a lot of tokens, but probably not $1.00 worth. Are you exaggerating a bit, maybe? It should be no more than about $0.25 per call.
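For what it's worth, that ceiling is just back-of-the-envelope arithmetic. Here is a rough sketch, assuming a worst case of one token per base64 character and an input price of $5 per million tokens. Both numbers are assumptions for illustration, and `estimate_text_cost` is a made-up helper name, so check the current pricing page before relying on it.

```python
def estimate_text_cost(num_chars, chars_per_token=1.0, usd_per_million_tokens=5.0):
    """Rough upper bound on the input cost of sending num_chars as plain text.

    chars_per_token and usd_per_million_tokens are illustrative assumptions,
    not values taken from the API.
    """
    tokens = num_chars / chars_per_token
    return tokens * usd_per_million_tokens / 1_000_000


# 50,000 base64 characters, pessimistically one token each
print(f"${estimate_text_cost(50_000):.2f}")
```

Real tokenization of base64 usually packs more than one character per token, so the actual bill for one call should land below this figure.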

The solution is almost certainly to just send the image data as the correct type,

        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          }
        }
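Put together, the whole request body might look something like the following. This is only a minimal sketch: `build_vision_payload` is a hypothetical helper name, the `max_tokens` default is an arbitrary choice, and it assumes the image is a JPEG.

```python
import base64


def build_vision_payload(image_bytes, prompt, model="gpt-4o", max_tokens=300):
    """Build a chat completions body that sends the image as an image part.

    Hypothetical helper for illustration; assumes image_bytes is JPEG data.
    """
    # Encode the full image; do not slice the string, or the image is corrupted.
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
        "max_tokens": max_tokens,
    }
```

The dictionary can then be passed as `json=payload` to the same `requests.post` call as before. Note that the base64 string is sent whole: once the data goes in as `image_url`, it is no longer counted as text tokens, so the `[:50000]` truncation is unnecessary and would only break the image.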

Thanks for your insights! Because of your help, I am now paying 1 penny for the expected results.

Exaggerating? Possibly. I paid $7.50 for a couple of tests that went really badly, and I probably got scared. The results were terrible too: I asked the bot to analyse a crypto coin worth $0.002, but it responded with $300 targets, probably because it was barely able to read the chart.


Glad we got you sorted.

Good luck!


That is a very interesting use case. Do you mind sharing whether vision performs as well as, or even better than, just copy-pasting the price data as text? Or is there any other reason to use vision instead of numeric data, such as CSV?

I’ve tried with some articles on vision and text, and I have not benchmarked it yet, but sometimes the photos work quite well, possibly even better, and that is very surprising. I would have expected text to be 10x better.

Now, with data analysis, if it can do equally or even better with charts than numeric data, that would be very surprising.

I don’t feel very comfortable sharing automated marketing strategies (my use case) on a forum filled with developers.

Throw a chart at vision and see what it does.
