Codex + Hugging Face: Downloading and running models?

I am trying to use Codex (via the website, not the CLI). My code requires downloading a model from Hugging Face and running inference. Because of the large memory and time requirements involved, Codex returns this message:

The download would require fetching large model checkpoints from Hugging Face and loading them into memory. Because of the size and time involved, the script was interrupted before completion.

Therefore, the script cannot be fully executed in this environment without completing the large model download and associated resource requirements.

What’s the best way to use Codex when your code requires running inference?


Use a remote API? That will take up a lot less memory!
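
Something like this, for example (a minimal sketch; the model name is just an illustration, and it assumes huggingface_hub is installed and HF_TOKEN is set in your environment):

```python
import os

from huggingface_hub import InferenceClient

# Inference runs on Hugging Face's servers, so the sandbox only holds the
# request and response in memory, not the checkpoint itself.
client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example model
    token=os.environ.get("HF_TOKEN"),
)

output = client.text_generation("Hello, world!", max_new_tokens=32)
print(output)
```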


Unfortunately, the remote inference endpoints I'm aware of don't return the model's hidden states, which my code requires.
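
For context, here's the kind of thing my code does, which is why a plain generation endpoint isn't enough (minimal sketch; sshleifer/tiny-gpt2 is just a small example model):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")  # example model
model = AutoModel.from_pretrained("sshleifer/tiny-gpt2")

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# One tensor per layer (plus the embedding layer), each (batch, seq, hidden).
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)
```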

If you are running tests, you should be stubbing this, no?
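
For example, something like this; run_analysis and StubModel are hypothetical stand-ins for whatever your own code actually does:

```python
from types import SimpleNamespace

import torch

def run_analysis(model, inputs):
    """Stand-in for the code under test: pools the last hidden state."""
    outputs = model(**inputs, output_hidden_states=True)
    return outputs.hidden_states[-1].mean(dim=1)

class StubModel:
    """Fake model that returns fixed-shape hidden states; nothing downloaded."""

    def __call__(self, output_hidden_states=False, **inputs):
        # Three "layers" of zeros: batch 1, sequence length 4, hidden size 8.
        return SimpleNamespace(
            hidden_states=tuple(torch.zeros(1, 4, 8) for _ in range(3))
        )

pooled = run_analysis(StubModel(), {"input_ids": torch.ones(1, 4, dtype=torch.long)})
assert pooled.shape == (1, 8)
```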

I don't think this is a good use case for Codex. If your coding agent needs to run inference for whatever reason, the only thing I would suggest is to set up and maintain a custom API endpoint that exposes the model interaction, and to give the Codex agent access to it in the environment setup.
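
Rough sketch of what I mean (the endpoint path and model are illustrative, not any standard API):

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModel, AutoTokenizer

app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")  # example model
model = AutoModel.from_pretrained("sshleifer/tiny-gpt2")

class InferenceRequest(BaseModel):
    text: str

@app.post("/hidden_states")
def hidden_states(req: InferenceRequest):
    inputs = tokenizer(req.text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # Nested lists keep the response JSON-serializable; a binary format
    # would be more efficient for real workloads.
    return {"last_hidden_state": outputs.hidden_states[-1].tolist()}
```

You'd run this with uvicorn on a machine that can actually hold the weights, and point the agent at the URL during environment setup.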