I don’t think we’ll see a Code Interpreter API anytime soon because of the risk of runaway generation.
Code Interpreter responds after every code execution, so it’s theoretically possible for a bad actor to engineer a situation which could lead to an essentially infinite loop of token generation from a single prompt.
So I think OpenAI will need to put safeguards in place that guarantee this cannot happen before we ever see an API endpoint.
If you do that with the current CI model, it will exit after (I think) either 10 attempts or some predefined token limit (not sure of those numbers exactly, but in the ballpark of 4–5 minutes of generation). You could presumably have similar limits on the API, but it would be super compute-heavy.
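If an endpoint like that did exist, the guard could be as simple as capping execution rounds and cumulative tokens on the caller's side. Here's a hypothetical Python sketch of the idea; the numbers and the helper name are made up, not anything OpenAI has published:

```python
# Hypothetical guard for a code-execution loop: stop once either a round cap or
# a cumulative token budget is exhausted. MAX_ROUNDS and TOKEN_BUDGET are
# illustrative guesses, not OpenAI's actual limits.
MAX_ROUNDS = 10
TOKEN_BUDGET = 20_000

rounds = 0
tokens_used = 0

def should_continue(response) -> bool:
    """Return False once either limit has been hit.

    `response` is a chat-completion response; `usage.total_tokens` is the
    standard field reporting tokens consumed by that call.
    """
    global rounds, tokens_used
    rounds += 1
    tokens_used += response["usage"]["total_tokens"]
    return rounds < MAX_ROUNDS and tokens_used < TOKEN_BUDGET
```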
You can already get an AI model with the skill of Code Interpreter:

1. Use a dummy function that doesn’t actually get called.
2. Add the ChatGPT Code Interpreter prompt to the system prompt.
3. Get Python function calls when the AI wants to answer by running code (along with anything the AI wants to say at the same time).
4. Append to the chat history the AI’s text response and the (unseen) AI-generated function call, both correctly under the “assistant” role, plus the last line of output from Python under the “function” role, then call the API again.
5. Repeat until you don’t get any more function calls.
The complete chat history makes the number and type of function-call operations apparent, which beats ChatGPT’s history view if it took 10 automatic rewrites and tracebacks for the AI to fix the code it wanted to run.
Just plug in your actual Python notebook sandbox (or don’t sandbox it at all). Add new features like mount-point browsing and URL file-location responses. A minimal sketch of the loop is below.
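To make the recipe concrete, here is a minimal Python sketch of that loop. It assumes the 2023-era openai library (`openai.ChatCompletion.create` with the `functions` / `function_call` parameters) and an already-configured API key; `CODE_INTERPRETER_PROMPT` is a placeholder for whatever system prompt you paste in, and the exec-based runner is just a stand-in for whichever sandbox (or lack of one) you choose.

```python
# A minimal sketch of the loop described above, not a drop-in implementation.
# Assumes the 2023-era openai Python library and that openai.api_key is set.
import contextlib
import io
import json

import openai

# Placeholder: substitute the actual system prompt you want to use.
CODE_INTERPRETER_PROMPT = "You can run Python by calling the `python` function."

# The "dummy" function: declared so the model emits calls to it, but never
# executed on OpenAI's side -- we run the code ourselves.
functions = [{
    "name": "python",
    "description": "Run Python code in a stateful session and return its output.",
    "parameters": {
        "type": "object",
        "properties": {"code": {"type": "string"}},
        "required": ["code"],
    },
}]

messages = [
    {"role": "system", "content": CODE_INTERPRETER_PROMPT},
    {"role": "user", "content": "What is the 1000th prime number?"},
]

session = {}  # persistent namespace, like a notebook kernel

while True:
    response = openai.ChatCompletion.create(
        model="gpt-4-0613",
        messages=messages,
        functions=functions,
        function_call="auto",
    )
    msg = response["choices"][0]["message"]
    messages.append(msg)  # assistant text and its (unseen) function call go back in

    if not msg.get("function_call"):
        break  # no more code to run: the model has answered

    # The model returns JSON arguments matching the declared schema.
    code = json.loads(msg["function_call"]["arguments"])["code"]
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, session)  # your sandbox (or not) goes here
        output = buffer.getvalue() or "(no output)"
    except Exception as exc:
        output = f"{type(exc).__name__}: {exc}"  # let the model see the error and retry

    messages.append({"role": "function", "name": "python", "content": output})

print(msg["content"])  # final answer from the model
```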
There may very well be some predefined threshold; I haven’t probed enough to begin to identify it.
I’ve definitely had “single” generations between 10k and 20k tokens. I can’t say how long they ran for because I tabbed away and did other things while they were running, but I’m sure it was on the order of at least 4–5 minutes.
Much of this would depend on the actual implementation of a CI API endpoint, specifically when control is returned to the calling function. For instance, after a code execution it might be logical to kick control back to the caller, who would then decide whether or not to direct things to the user or the assistant. OpenAI seems to always make a call to the assistant before halting, so I could imagine that getting baked into the API. ¯\_(ツ)_/¯
Do you have any source for the existence of a fail-safe?
We’ve interacted enough that I would hope you’d be aware that I’m not a moron—I’m not sure why you thought I needed this.
First, no.
There is a huge difference between calling an OpenAI API which can run code on OpenAI’s servers and running code on my own server based on a response from an OpenAI API.
I did, in point of fact, write my own “Code Interpreter” months before OpenAI announced theirs.
I code mostly in R[1] for my research, so I wrote a plugin for R Studio that calls the OpenAI API to write, edit, and document code. Code is run either in a sandbox or my current interactive session and the results are sent back to the assistant to iterate on or verify correctness.
And yes, it does bail out when it has been running too long.[2]
Though, for full transparency, I’ll admit this was a bit of an afterthought as I was using the codex model when it was in free-beta and wasn’t worried about usage. ↩︎
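For illustration, here is a Python stand-in for that bail-out (the real plugin is written in R; the 60-second cap and the helper name are just assumptions, not the plugin's actual code). Running the generated code in a child process lets a wall-clock timeout kill it cleanly.

```python
# Illustrative sketch of a "bail out if it runs too long" safeguard: execute
# model-generated code in a subprocess capped by a wall-clock timeout.
import subprocess
import sys

def run_generated_code(code: str, timeout_s: int = 60) -> str:
    """Execute model-generated Python in a child process, capped at timeout_s."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return f"Execution aborted: exceeded {timeout_s}s limit."
```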
And what is that huge difference, then? Perhaps what you haven’t tried to see is that the AI model used when a function is included is the exact same one used by Code Interpreter, with all the same fine-tuning. It will make Python function calls to compute answers for the user even when no Python function has been provided at all, even preferring that over calling a Wolfram Alpha calculator.
Freedom, permanence, your own modules, and the ability to dedicate whatever compute resources or virtualenv you want when it isn’t running on someone else’s spun-up Jupyter instance.
Sorry to insult you with useful information; I’ll make sure that never happens again. The helping, that is.