APIConnectionError or APIError

reniaffer · June 19, 2023, 8:37am

Hi, I am running the following code:

import os
import openai
import pandas as pd

# Set your OpenAI API key
openai.api_key = "[my API]"

# Load the original dataframe from a CSV file or any other source
df = pd.DataFrame(df)

# Split the dataframe into smaller chunks
chunk_size = 1000  # Number of rows per chunk
chunks = [df[i:i+chunk_size] for i in range(0, len(df), chunk_size)]

# Create an empty list to store the embeddings
embeddings = []

# Process each chunk separately
for chunk in chunks:
    # Iterate over the dataframe and create embeddings for each text
    for index, row in chunk.iterrows():
        response = openai.Embedding.create(input=row['text'], engine='text-embedding-ada-002')
        embedding = response['data'][0]['embedding']
        embeddings.append(embedding)

@retry(delay=1, backoff=2, max_delay=120)
def failsModeration(prompt: str) -> bool:
    return openai.Moderation.create(
        input=prompt
    )["results"][0]["flagged"]

# Assign the embeddings to the dataframe
df['embeddings'] = embeddings

# Save the dataframe with embeddings to a CSV file
df.to_csv('/Python_script/embeddings.csv', index=False)

The code worked with a small number of tokens (around 400k), but when I try to embed 8 million tokens, the process never completes. Each time I get one of these two errors:

1. APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2. APIError: Bad gateway.

How can I fix this so that the embedding run completes?

sps · June 19, 2023, 10:42am

Hi @reniaffer

Welcome to the OpenAI community.

This could be the DDoS protection kicking in, because at 8M tokens, you’re just sending so many chunks together in a short amount of time.

Also, what are your rate-limits for the particular embeddings model?
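If the failures are transient (dropped connections, bad gateway), one common way to get through a long run is to retry each request with exponential backoff instead of letting a single error stop the loop. Below is a minimal sketch, not a definitive fix: `call_with_backoff` and its parameters are hypothetical names made up for illustration, and `fn` stands in for whatever function actually makes the API call.

```python
import random
import time


def call_with_backoff(fn, *args, max_tries=6, base_delay=1.0, max_delay=120.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff on failure.

    Hypothetical helper for illustration; not part of the openai library.
    """
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        try:
            return fn(*args, **kwargs)
        except retry_on:
            if attempt == max_tries:
                raise  # out of attempts: surface the last error
            # Wait for the current delay plus up to 10% jitter, then double it.
            sleep(delay + random.uniform(0, delay * 0.1))
            delay = min(delay * 2, max_delay)
```

With the pre-1.0 `openai` library used in the post, the call would probably look something like `call_with_backoff(openai.Embedding.create, input=text, engine='text-embedding-ada-002', retry_on=(openai.error.APIConnectionError, openai.error.APIError, openai.error.Timeout))`. Those exception names are from the 0.x library as far as I recall, so check them against the version you have installed.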

reniaffer · June 19, 2023, 11:16am

Hi,

I’ve decreased the number of chunks to 200 and now I receive the following error: “Timeout: Request timed out: HTTPSConnectionPool(host=‘api.openai.com’, port=443): Read timed out. (read timeout=600)”.

I use text-embedding-ada-002. The limit is 3000 RPM / 250,000 TPM.
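For reference, those limits already put a floor on how long the job can take: 8,000,000 tokens at 250,000 TPM is at least 32 minutes, however the requests are grouped. A rough sketch of that arithmetic follows. The function names are hypothetical, and it assumes the limits are enforced as simple per-minute averages, which may not match how the service actually counts.

```python
TPM_LIMIT = 250_000  # tokens per minute, from the limits quoted above
RPM_LIMIT = 3_000    # requests per minute, from the limits quoted above


def min_minutes(total_tokens, total_requests, tpm=TPM_LIMIT, rpm=RPM_LIMIT):
    """Lower bound on run time in minutes implied by both limits."""
    return max(total_tokens / tpm, total_requests / rpm)


def min_seconds_between_requests(tokens_per_request, tpm=TPM_LIMIT, rpm=RPM_LIMIT):
    """Smallest average pause between requests that stays within both limits."""
    return max(60.0 / rpm, 60.0 * tokens_per_request / tpm)
```

For example, `min_minutes(8_000_000, 1000)` gives 32.0, and a request carrying about 500 tokens would need an average spacing of roughly 0.12 seconds to stay under the token limit.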

sps · June 23, 2023, 9:25am

It looks like the API is taking too long to respond.

This could be caused by server-side issues.

Also, what is the token size per chunk?
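To get a feel for this, you could estimate the tokens per row and group rows into batches that stay under a token budget, since the embeddings endpoint accepts a list of strings as `input`. This is only a sketch under assumptions: `rough_token_count` is a crude characters-divided-by-four heuristic (a real tokenizer such as `tiktoken` would be more accurate), and `pack_batches` is a hypothetical helper, not something from the openai library.

```python
def rough_token_count(text):
    """Crude estimate: about 4 characters per token for English text (an assumption)."""
    return max(1, len(text) // 4)


def pack_batches(texts, max_tokens_per_batch, count_tokens=rough_token_count):
    """Group texts into lists whose estimated token totals stay within the budget.

    A single text larger than the budget still gets a batch of its own.
    """
    batches, current, used = [], [], 0
    for text in texts:
        n = count_tokens(text)
        # Start a new batch if adding this text would exceed the budget.
        if current and used + n > max_tokens_per_batch:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        batches.append(current)
    return batches
```

Each batch could then be sent as one request, which cuts the request count sharply compared with one call per row. Note that text-embedding-ada-002 also limits each individual input to 8191 tokens, so very long rows would need to be split or truncated first.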
