What size should each string be when fetching vector embeddings from the OpenAI API?

I have several documents that I need to fetch vector embeddings for using the OpenAI API. I'm using the large model, which has a vector length of 3072, and for this purpose I'm dividing my documents into segments. How long should each segment be when I fetch its vector embedding? I'm asking because I don't want to split into substrings that are too small and end up using a lot of extra memory to store the embedding vectors.

Great question! Rolling your own document retrieval is a nice approach: you can tune it exactly the way you want, rather than accepting the defaults of a managed pay-per-use service.

You’re already thinking in the right direction about the trade-off between vector size and the text it represents. Here’s how it breaks down:

  • At the large end, you've got 3072 dimensions, which works out to about 12kB per embedding (3072 × 4 bytes) if you store full dimensionality as 32-bit floats.
  • When you look at that, it makes sense to wonder: shouldn’t I be pairing that with roughly 12kB of text, otherwise I’m actually storing more data in the vectors than in the original strings? And honestly, that’s a fair way to think about it.

The nice part is that the text-embedding-3-large and text-embedding-3-small models are flexible. You can use the API's dimensions parameter to request shorter embeddings, or you can do your own reduction (taking the first N components and re-normalizing). With that, the large model can still perform really well—even down at 256 dimensions or fewer.
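The do-it-yourself reduction is just a truncate-and-renormalize. Here's a minimal sketch with numpy; the random vector is a stand-in for what you'd actually get back from the embeddings endpoint (where passing dimensions=256 to the API would give you the equivalent result directly):

```python
import numpy as np

def truncate_and_renormalize(embedding, target_dim):
    """Keep the first target_dim components of an embedding,
    then rescale to unit length so cosine/dot-product search still works."""
    truncated = np.asarray(embedding, dtype=np.float32)[:target_dim]
    return truncated / np.linalg.norm(truncated)

# Stand-in for a 3072-dim vector from text-embedding-3-large;
# in practice this would come from the embeddings API call.
full = np.random.default_rng(0).normal(size=3072)
reduced = truncate_and_renormalize(full, 256)
print(reduced.shape)  # (256,)
```

The re-normalization matters: without it, truncated vectors have slightly different lengths and dot-product scores are no longer comparable across chunks.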

To put numbers on it:

  • 256 dimensions × 32 bits ≈ 1kB per embedding, which is a lot lighter.
  • You can also store them in float16 format to cut the size further, with only a tiny quality drop.
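You can sanity-check those storage numbers directly with numpy's nbytes:

```python
import numpy as np

rng = np.random.default_rng(0)

full_f32 = rng.normal(size=3072).astype(np.float32)   # full-size embedding
small_f32 = full_f32[:256]                            # truncated to 256 dims
small_f16 = small_f32.astype(np.float16)              # half-precision storage

print(full_f32.nbytes)   # 12288 bytes ≈ 12kB
print(small_f32.nbytes)  # 1024 bytes  = 1kB
print(small_f16.nbytes)  # 512 bytes
```

So truncation plus float16 takes you from ~12kB to half a kilobyte per embedding, a 24× reduction, before any fancier quantization.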

From there, it really comes down to what you’re trying to do. Do you need whole passages that make sense to both an AI and a human? Or are you more focused on retrieving many smaller chunks and ranking/sorting them back into document order? The way you define your “document” and its semantic units is what will guide the best setup.
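If you go the many-small-chunks route, one common starting point is fixed-size chunks with some overlap so that sentences straddling a boundary aren't lost. The word counts below are illustrative defaults, not recommendations:

```python
def chunk_text(text, chunk_words=200, overlap_words=40):
    """Split text into overlapping word-based chunks.

    chunk_words and overlap_words are illustrative starting points;
    tune them to whatever "semantic unit" fits your documents."""
    words = text.split()
    step = chunk_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break
    return chunks

doc = " ".join(f"word{i}" for i in range(500))
chunks = chunk_text(doc)
print(len(chunks))  # 3 overlapping chunks for a 500-word document
```

Because you keep the chunk order, you can always stitch neighboring chunks back together when presenting results to a human.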

Also important: the queries that will be written, and how well they match large vs. small segments of text. The embedding model has to understand what a document section and a query are each about, and ideally capture the features that distinguish one section from another.

Hope that gives you a solid starting point to balance size and performance in the way that feels right for your application!