Does openAI provide API that takes Embeddings as an input?

shure.alpha · December 16, 2023, 11:22pm

I found there is an API for text embeddings, can I use this embedding as an input for GPT? I’m not sure if this is possible.

_j · December 17, 2023, 12:02am

The output of the embeddings endpoint is a very large vector

{
  "object": "embedding",
  "embedding": [
    0.0023064255,
    -0.009327292,
    -0.0028842222,
(1500 more dimensional values
  ],
  "index": 0
}

This is useful for comparing algorithmically with other embeddings results to find how similar the language sent is. This can be employed not just for scoring, but also for searching and information retrieval.

As you can imagine, the AI can make no sense of a long list of numbers.

You can use this technology to, for example, make an actions API that returns AI-powered knowledge database. However, furnishing that to public GPTs means you have to pay the bill.

shure.alpha · December 17, 2023, 12:26am

Thank you for your reply.
I’m sorry, I made a misunderstanding.
I know normal GPT cannot accept embeddings as an input. I meant to say does OpenAI provide other APIs that can take text embeddings as an input?
What I want to do is I want to encode the image to text context and give API images as text embeddings.

curt.kennedy · December 17, 2023, 12:43am

You would use GPT-4-Vision to generate the text description of the image.

Then use ada-002 to embed the above text to create your vector.

So a 2-step process.

moonlockwood · December 17, 2023, 1:13am

This is an entirely valid question that many AI researchers are asking. It’s the holy grail for real understanding, and like many things it’s more complicated than it seems.

This is misleading. All operations in a neural networks are ‘long list of numbers’ - also known as high-dimensional vectors.

tokens are a hacky and ugly midpoint to vectors. The question shows insight.

This is useful for comparing algorithmically

That’s because vectors are representations of meaning and they are how neural networks ‘think’ - the weights are fixed, the vectors do the work.

_j · December 17, 2023, 1:18am

Not applicable to someone asking in a roundabout method, “how can I make a ChatGPT GPT do xxx”, though.

“Embeddings” is being used ambiguously, like “stick some data in somewhere”, when it should be clear that it has a very distinct meaning in natural language AI processing.

curt.kennedy · December 17, 2023, 1:25am

It’s probably semantics, but embeddings, or a list of numbers, as input to a language model generally results in gibberish.

We get this question a lot, and @_j is right, basically the input needs to be text, but yes, the internals are numbers.

The OP want’s to encode an image to text and then form an embedding. So it’s a 2-step process.

First have an AI model describe what is in the image and output text. Then take the output text and embed it to create the embedding vector.

matcha72 · December 17, 2023, 4:28am

What is the use-case behind converting your image to text and then to embeddings ?

shure.alpha · December 18, 2023, 2:19am

we can directly integrate text and multiple images

if this is possible, we don’t have to give them to GPT4 separately
GPT4 can understand text and images without connecting them manually by texts

For example, now we have to specify text and images with “first image is …” or “second images is …”

moonlockwood · December 18, 2023, 2:14pm

You can’t pass embeddings. They would be destroyed by tokenisation and no longer embeddings if you were silly enough to try.

The OP want’s to encode an image to text and then form an embedding. So it’s a 2-step process.

This is very fascinating, cool stuff. I do the same kind of things with my own and open source models, where you can stay in the latent and embedding space.

moonlockwood · December 18, 2023, 2:16pm

Yep, clarity is key. How you think about things is pretty much the key to success with llms and neural networks.

Topic		Replies	Views
Embeddings as model input API embeddings , api , prompt	3	2383	June 16, 2023
Get embeddings for images API embeddings , gpt-4-vision	8	25526	February 12, 2025
Embedding - Usage for all GPT text to number applications? Documentation embedding	6	384	October 13, 2024
Embedding tokens vs embedding strings? API	12	7949	February 11, 2024
Using a Custom Tokenizer with GPT Embeddings API	5	3590	March 4, 2024

Does openAI provide API that takes Embeddings as an input?

Related topics