Async calls to the OpenAI endpoint from a .py file take longer than the same async calls from a notebook

I’m trying to speed up multiple OpenAI chat completion API calls by making them asynchronously. The speed-up is visible in a Jupyter notebook, but when the same code is run from a .py file it takes much longer.
This is the code I’m running in both 1) a Jupyter notebook and 2) a Python file. Note: import nest_asyncio and nest_asyncio.apply() are only needed in the Jupyter notebook.

import asyncio
import openai
import time
import nest_asyncio
nest_asyncio.apply()

# API configuration (endpoint and key left blank here)
openai.api_base = ""
openai.api_version = "2023-09-15-preview"
openai.api_key = ""

prompts = ["What are prime numbers", #some random questions
           "Translate this to Spanish : How are you", 
           "Explain the evolution of milkyway galaxy"]

async def process_prompt(prompt):
    loop = asyncio.get_event_loop()
    # Run the blocking ChatCompletion.create call in the default thread pool executor
    response = await loop.run_in_executor(None, lambda: openai.ChatCompletion.create(
        engine="development",
        messages=[{'role':'user','content':prompt}]
    ))
    return response.choices[0].message['content']

async def main():
    tasks = [process_prompt(prompt) for prompt in prompts]
    results = await asyncio.gather(*tasks)
    
    for result in results:
        print(result)

start=time.time()
asyncio.run(main())
end=time.time()
print('Time taken', end - start)

The average time in the Jupyter notebook is around 3.5 s; the average time from the Python file is around 10 s.

I also tried another approach using concurrent.futures, but the results were similar. Why is this happening? Is it because of some restriction on OpenAI’s side?
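
For reference, a minimal sketch of that style of concurrent.futures approach, assuming a ThreadPoolExecutor around the same blocking openai.ChatCompletion.create call and reusing prompts and the API settings from the snippet above (not necessarily the exact code that was used):

from concurrent.futures import ThreadPoolExecutor

def call_openai(prompt):
    # Same blocking SDK call as in the async version, run on a worker thread
    response = openai.ChatCompletion.create(
        engine="development",
        messages=[{'role': 'user', 'content': prompt}]
    )
    return response.choices[0].message['content']

start = time.time()
# One worker per prompt so all requests are in flight at the same time
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    results = list(pool.map(call_openai, prompts))
end = time.time()
print('Time taken', end - start)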

I’ll blatantly use GPT-4-Turbo to try to answer your question while you wait for an actual answer. I work with Python and it also puzzled me, but GPT4T has some great thoughts on why, especially points 2, 4, 5 and 7. Also see cProfile.


The difference in performance you’re observing between Jupyter Notebook and a standalone Python script is unusual because the underlying Python interpreter is the same. However, there are a few potential reasons why you might be experiencing this discrepancy:

  1. Environment Differences: Ensure that the Python environment (including the version of Python, openai library, and any other dependencies) is the same for both Jupyter Notebook and the standalone script. Different environments might have different performance characteristics.

  2. Event Loop Implementation: Jupyter Notebook uses its own event loop, which might be more optimized for I/O operations compared to the default event loop used when running a standalone Python script. This could potentially explain the difference in performance; a quick way to check which loop each environment is using is shown after this list.

  3. Network Latency: If you’re running Jupyter Notebook and the Python script on different machines or under different network conditions, network latency could play a role in the observed performance difference.

  4. Concurrency Limits: The openai API might have concurrency limits that could throttle the number of simultaneous requests. However, this should affect both environments equally unless the API has some sort of session-based throttling that recognizes Jupyter Notebook sessions differently.

  5. Caching: Jupyter Notebook might be caching results or connections in a way that the standalone script is not, leading to faster subsequent calls.

  6. System Load: The system load at the time of running the script can affect performance. Make sure you’re comparing the runs under similar system load conditions.

  7. Output Rendering: Jupyter Notebook might be more efficient at rendering output compared to the console output of a standalone script, although this should not have a significant impact on the timing of the API calls themselves.

  8. Profiling: Use profiling tools to measure where the time is being spent in both environments. This can help you pinpoint if the delay is in the network I/O, the event loop, or somewhere else.
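
A quick way to sanity-check point 2 is to print which event loop implementation each environment is actually using. This is only a diagnostic sketch; run it once in the notebook and once in the script and compare the output:

import asyncio

async def show_loop():
    # Prints the concrete loop class, e.g. a selector/proactor loop in a script;
    # inside Jupyter the running loop is the one managed by ipykernel
    loop = asyncio.get_running_loop()
    print(type(loop))

# In a .py file:
asyncio.run(show_loop())
# In a notebook cell, run this instead:  await show_loop()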

To troubleshoot this issue, you can try the following:

  • Ensure that you’re using the same Python environment for both Jupyter and the standalone script.
  • Profile the code using Python’s built-in cProfile module or other profiling tools to see where the bottleneck is (see the sketch after this list).
  • Run the code multiple times and at different times of the day to account for potential network variability.
  • Check the OpenAI API documentation for any rate limits or concurrency restrictions that might apply.
  • Use logging or debugging to check if there are any errors or retries happening in the standalone script that aren’t occurring in Jupyter Notebook.
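
As an illustration of the profiling suggestion, here is a sketch using the standard-library cProfile and pstats modules around the main() coroutine from the question (it assumes main and asyncio are defined at module level of the script, as in the code above):

import cProfile
import pstats

# Profile the whole async run and dump the stats to a file
cProfile.run('asyncio.run(main())', 'openai_async.prof')

# Show the 20 costliest entries by cumulative time: long waits in socket/SSL
# functions point at network latency, repeated request calls point at retries
stats = pstats.Stats('openai_async.prof')
stats.sort_stats('cumulative').print_stats(20)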

If you’ve accounted for all these factors and the discrepancy remains, it might be worth reaching out to OpenAI support to see if they have any insights into whether different environments might trigger different performance profiles on their end.