Embeddings + answers Endpoint


I am experimenting with a possible use case: querying across large documents. I am currently using an embeddings setup followed by the answers endpoint. For 1.2 MB of data, the “text-search-curie-doc-001” model is taking 6 hours. Is there any way to reduce the time, or is there an alternate approach?

Thank you.

Hi @mounika.alavala! I think I can help. A few questions:

  • How many queries does that 1.2MB come out to?
  • Are you batching the queries or are you sending them one at a time?
  • I’m a little confused about how you’re using embeddings and answers together. Can you walk me through your steps?

I noticed it took about 20 seconds for 70 queries even on Ada, but I was sending them individually. I suspect that batching them will be much faster.
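As a rough illustration of the batching idea, here is a minimal sketch, assuming the legacy `openai` Python library (v0.x), where `openai.Embedding.create` accepts a list of input strings and returns one embedding per input in the same order. The helper names (`chunked`, `embed_lines`) and the batch size are hypothetical, not from the original thread:

```python
def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_lines(lines, engine="text-search-curie-doc-001", batch_size=100):
    """Embed many lines with one API call per batch instead of one call per line."""
    import openai  # assumption: legacy openai-python (v0.x) client is installed

    embeddings = []
    for batch in chunked(lines, batch_size):
        # One request returns one embedding per input string, in order.
        response = openai.Embedding.create(input=batch, engine=engine)
        embeddings.extend(item["embedding"] for item in response["data"])
    return embeddings
```

With batches of 100, 1.2 MB of short lines would go from thousands of requests down to a few dozen, which is where most of the wall-clock time should be saved.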

Hi @hallacy
Here are the steps:

  1. I combined 1.2 MB of text data, and each line is sent to the “text-search-curie-doc-001” model to get its embedding. This step is taking 6 hours.
  2. This CSV of embeddings is then searched using a query embedding from “text-search-curie-query-001”. From the search output, I take the top 100 results and feed them to the answers endpoint.

→ Only 1 query is asked at a time.
→ Getting the embeddings takes most of the time; getting the query results takes only about 25 seconds (using “davinci” for the answers endpoint).
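For reference, the search part of step 2 amounts to ranking the stored document embeddings by cosine similarity to the query embedding and keeping the top 100. A minimal sketch (the function names are hypothetical, and this assumes the embeddings are already loaded into memory as lists of floats):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, doc_embs, k=100):
    """Return indices of the k document embeddings most similar to the query."""
    ranked = sorted(range(len(doc_embs)),
                    key=lambda i: cosine_similarity(query_emb, doc_embs[i]),
                    reverse=True)
    return ranked[:k]
```

This local ranking step is cheap; the 6-hour cost in step 1 comes from making one embedding request per line rather than from the search itself.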

Reference used: openai-python/Semantic_text_search_using_embeddings.ipynb at main · openai/openai-python · GitHub

How many transactions/queries is this in total?