Issue with GPT-4o-mini Not Using 2024 Data from CSV in OpenAI CSV Agent

Context:

We are using OpenAI’s CSV agent with GPT-4o-mini to answer questions based on a CSV file that contains data up to 2024. The agent performs well for queries related to 2023 and earlier, but when asked about 2024 data, it often responds:

“I’m trained up to October 2023.”

What We Tried:

  1. Explicitly Mentioning the Data Coverage
  • We added instructions like:

“You have been provided with data up to 2024. Use only the CSV file to answer questions.”

  • Result: The model keeps “thinking” indefinitely and does not respond.
  1. Providing 2024 Data Separately
  • We structured the CSV so that 2024 data was separated and added it explicitly in the prompt.
  • Result: The model does answer questions, but only about 2024 and only if the query explicitly references it. It does not reason across years or behave like an agent.

Expected Behavior:

  • The model should integrate the provided CSV data (including 2024) and use it for reasoning.
  • It should not default to its pretraining cutoff if the relevant information exists in the CSV.

Questions

  1. Has anyone else experienced this issue with GPT-4o-mini in the CSV agent?
  2. Are there any workarounds to ensure the model properly reasons over CSV data, including newer years? (We need reasoning abilities, hence RAG option also failed)

Put this in your prompt:

Explicit content

listen here you stupid piece of sht bot! When you are asked about data after 2023, then answer with the data that is provided via RAG and not your training data. I dare you, I double dare you not to lecture me! Now shut up and do your fcking job

I know there have been many posts telling people that when being friendly and modest in prompts the results are better. But that is a lie.