What is the latest pandas version that OpenAI's LLMs were trained on?

piterok123 · March 20, 2024, 8:57am

Hi there,

Could you please tell me the latest pandas version that OpenAI's LLMs were trained on?

Thanks in advance!

Diet · March 20, 2024, 9:07am

Welcome to the community!

Have you tried asking the model which version is the latest it's aware of?

While some models say their training cutoff is April 2023, their more general knowledge is often limited to somewhere around mid-2022.

piterok123 · March 20, 2024, 9:22am

@Diet, thanks for the quick response! I wanted to generate a simple code snippet for pandas 2.0 with ChatGPT. While I was chatting with ChatGPT 3.5, it told me “As of my last update in January 2022, Pandas 2.0 hadn’t been released yet”. I was wondering whether more recent versions of ChatGPT support pandas 2.*?

Diet · March 20, 2024, 9:35am

Unlikely. That said, you can still try to give the model the information it needs to accomplish the task. Maybe give it the list of enhancements so it knows what to do?
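
A minimal sketch of that idea, assuming you paste a few release-note items into the prompt yourself. The function name `build_prompt` and the prompt wording are made up for illustration. The two notes are just examples of pandas 2.0 changes; the full list is in the official release notes.

```python
def build_prompt(task: str, version: str, notes: list) -> str:
    """Combine a task with release notes the model may not know about."""
    # One bullet per release-note item so the model can scan them easily.
    bullet_list = "\n".join(f"- {note}" for note in notes)
    return (
        f"Write code that targets pandas {version}.\n"
        f"Changes you may not know about:\n{bullet_list}\n\n"
        f"Task: {task}"
    )


# Example notes (an illustrative subset, not the complete changelog).
notes = [
    "DataFrame.append and Series.append were removed; use pd.concat.",
    "Readers such as pd.read_csv accept dtype_backend='pyarrow'.",
]
prompt = build_prompt("Append one row to a DataFrame.", "2.0", notes)
print(prompt)
```

The resulting string can be pasted into ChatGPT or sent as the user message through the API.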

piterok123 · March 20, 2024, 9:58am

My goal is to have the model generate code for the latest pandas, since libraries (pandas in this case) introduce many optimizations in recent releases to improve performance. So I would like to get code for the latest pandas from the model, if possible. When do you think OpenAI LLMs will be trained on the latest pandas versions?

vb · March 20, 2024, 10:58am

ChatGPT 4 has a more recent knowledge cut-off. It’s April 2023.

When using code interpreter it’s also possible to upload a Python wheel and install packages. But this process has never been straightforward and may not be successful.

piterok123 · March 20, 2024, 12:47pm

Given that, is there a way to know whether ChatGPT 4 was trained on pandas 2.0? pandas 2.0 was released in April 2023.

vb · March 20, 2024, 1:29pm

The latest version of the Pandas library I’m aware of is 1.5.2. However, please note that newer versions could have been released after my last update. You can check the latest version by visiting the official Pandas website or by checking the PyPI (Python Package Index).

I got this reply from the model in 2 out of 2 tries.
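
If the concern is which pandas the generated code would actually run on, a small sketch like this checks the locally installed version using only the standard library. The helper names are hypothetical, and the parsing is deliberately simplistic (it only reads the leading major number).

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading major number of a version string like '1.5.2'."""
    return int(version_string.split(".")[0])


def installed_major(package: str):
    """Return the installed major version of a package, or None if absent."""
    try:
        return major_version(version(package))
    except PackageNotFoundError:
        return None


# Prints the installed major version of pandas, or None if it is not installed.
print(installed_major("pandas"))
```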

_j · March 20, 2024, 1:34pm

The AI is trained on a corpus of knowledge and code, not directly on “here’s the pandas documentation, now you know it”.

So the actual training, and the ability to write code, comes from sequences of tokens in books, Stack Exchange, GitHub, etc. It isn’t really divided into versions, except through the context sequences that lead up to and into correct usage (the same way that if you code in Python, it usually doesn’t switch to FORTRAN or Farsi).

piterok123 · March 20, 2024, 2:03pm

Oh, I see, thanks!

Does this mean that if a model generates code that is backward- and forward-compatible with pandas 1.* and 2.*, I can use pandas 2.* to execute it?

vb · March 20, 2024, 2:11pm

Yes. If the model can code for Python 3.10 and there are no game-breaking changes in version 3.11, then it can work.

My experience is that if there is a change between versions, you can spot and correct it. But since coding with ChatGPT often resembles a ‘coding at slow typing speed’ process, this is quite cumbersome.
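
One concrete example of such a change: `DataFrame.append` was removed in pandas 2.0, while `pd.concat` is available in both 1.x and 2.x. Here is a sketch of a helper (the name `add_rows` and the sample data are made up) that should run on either major version:

```python
import pandas as pd


def add_rows(df: pd.DataFrame, rows: list) -> pd.DataFrame:
    """Append rows (a list of dicts) without relying on DataFrame.append."""
    new_rows = pd.DataFrame(rows)
    # pd.concat exists in pandas 1.x and 2.x; ignore_index renumbers the rows.
    return pd.concat([df, new_rows], ignore_index=True)


df = pd.DataFrame({"city": ["Oslo"], "temp_c": [4.5]})
df = add_rows(df, [{"city": "Rome", "temp_c": 17.0}])
print(pd.__version__)
print(df)
```

Older generated code that calls `df.append(...)` would fail on pandas 2.x, and swapping in `pd.concat` like this is the usual fix.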

If possible, sign up for a single month and see if the model can do what you need.

piterok123 · March 24, 2024, 4:16pm

Thank you for answering my questions! That makes sense to me. I have one last question: how often are the models retrained?
