For long running conversations, we suggest passing images via URLs instead of base64
at https://platform.openai.com/docs/guides/vision/managing-images
Just why?
For long-running conversations, it may be that OpenAI caches the downloaded image data, so continuing to pass the same URL in past chat turns avoids a fresh download from the remote source. If you instead send the BASE64 every time (as you must with Chat Completions), re-sending the same data over and over adds network delay.
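For reference, here is roughly how the two approaches differ in a Chat Completions message body. This is a sketch: the URL and file bytes are placeholders, and only the payload shape is shown (no API call is made).

```python
import base64

# Hypothetical remote image URL, for illustration only
image_url = "https://example.com/photo.jpg"

def url_message(url):
    # Reference the image by URL: the payload stays tiny and the
    # provider fetches (and can potentially cache) the bytes itself.
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": url}},
        ],
    }

def base64_message(image_bytes, mime="image/jpeg"):
    # Embed the image as a data URL: the full, base64-inflated
    # bytes travel with every request that repeats this message.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }
```

With the URL form, the same small message can be resent on every turn; with the base64 form, the whole encoded image is repeated in each request.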
I’m surprised they even support base64. That’s the worst way mankind has ever invented to handle images, unless it’s a tiny icon or something. The entire internet uses the URL way to access images, and that’s all that was needed.
EDIT: as mentioned below, despite the inefficiency of base64, it may be needed when sending from localhost.
Well, no.
If I have a local image, I would prefer to send it over as b64 (as most email providers do) rather than upload it to a remote source first.
I should have said “most wasteful way” (of bandwidth), rather than “worst way” which is too vague, but you’re right I wasn’t considering cases when sending from localhost. I’ll edit that post. Thanks for pointing that out.
Fair. Yeah. Not as efficient as directly downloading the file.
Fortunately, modern HTTP takes care of any inefficiencies for you.
Let’s execute these steps.
First, let’s get the size of the uploaded image file.
import os
# Path to the uploaded file
file_path = '/mnt/data/redheadartifact-bing.jpg'
# Get file size in bytes
file_size = os.path.getsize(file_path)
file_size
Next, we’ll read the image file and encode it in BASE64.
import base64
# Read the image file and encode in BASE64
with open(file_path, 'rb') as image_file:
    base64_encoded_str = base64.b64encode(image_file.read()).decode('utf-8')
len(base64_encoded_str)
Now, we’ll compress the BASE64 encoded string using Brotli.
import brotli
# Compress the BASE64 encoded string using Brotli
compressed_data = brotli.compress(base64_encoded_str.encode('utf-8'))
compressed_size = len(compressed_data)
compressed_size
Finally, we will provide the size of the compressed data.
I’ll run this code in the Python environment to get the results.
The image file was successfully encoded and compressed. The final size after Brotli compression is 87,279 bytes.
Yeah you can take a compressed image, decompress it, then put it in a less efficient format (base64) and then recompress it again, and send that compressed output. I realize that’s sensible for sending data from localhost, but for most networked components it’s the worst way possible, when you can just use a URL. I stand by my original statement, but it needed more context.
The whole IP stack is just a gigantic layer cake; I don't know if nickel-and-diming a couple of megabytes here and there is going to break the bank on your monthly terabits.
Not just sending from localhost, any serverless app that just works classically on a device (or even web apps that are hosted statically) require this feature.
Sounds like we’re all in agreement that base64 should be avoided whenever possible, as it is a far less efficient approach than a URL.
I would say only if using URLs does not require a major change in your architecture (like introducing a server). Anyway, has someone measured the real performance impact?
We don’t need to conduct any experiments when we know for sure that base64 output is about 33% larger than the raw image (4 characters for every 3 bytes). Some people will care; some will not.
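The overhead is fixed by the encoding itself: every 3 bytes become 4 ASCII characters (plus up to 2 padding characters at the end), so the ratio is exactly 4/3. A quick check, using random bytes as a stand-in for arbitrary image data:

```python
import base64
import os

raw = os.urandom(30_000)      # stand-in for arbitrary binary image data
encoded = base64.b64encode(raw)

# 3 bytes -> 4 characters, so the encoded size is 4/3 of the original
ratio = len(encoded) / len(raw)
print(ratio)
```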
Only if network transmission or base64 encoding/decoding is the bottleneck. I would not be surprised if that was just a small share in comparison to the processing inside the model.
I just showed that the BASE64 image in transit is even smaller than the source jpg file when using Accept-Encoding: gzip,deflate,br, which you misclassified as re-encoding.
Well, it’s definitely news to me if your claim is that taking JPGs, encoding them to BASE64, and then compressing the BASE64 gives a net overall decrease in size. But if so, that’s just a measure of how inefficient JPG is at compression.
And that implies if you did take the actual raw image bytes (from an uncompressed format) and use your same compression algo, then you would get something even MORE highly compressed. Because you wouldn’t be compressing something that’s ALREADY been compressed, if that makes sense.
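A sketch of this point, using zlib (stdlib, standing in for Brotli) and random bytes as a pessimistic stand-in for already-compressed JPEG data (real JPEGs retain a little recoverable redundancy in headers and tables; pure random bytes retain none):

```python
import base64
import os
import zlib

jpeg_like = os.urandom(30_000)        # mimics already-compressed image data
b64 = base64.b64encode(jpeg_like)

# Transport compression on the base64 text (what Accept-Encoding buys you)
recompressed = zlib.compress(b64, level=9)

print(len(jpeg_like), len(b64), len(recompressed))
# The recompressed size lands near the original: compression claws back
# most of base64's ~33% bloat, but it cannot go below the entropy of
# data that was already compressed once.
```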
I was never critiquing the concept of using compression to solve the inefficiency of base64.