Fine-tuning with timestamp or metadata

I want the AI to recognize the training data by date.

Issue: The current setup for fine-tuning does not provide a direct way to incorporate timestamps into the data.

Attempted Solutions:

  • Considered adding timestamps to the data examples, but encountered an error during the fine-tuning process (Error: “Invalid file format”).

What I’m Looking For: A method to incorporate timestamps or metadata(for “timestamp”) into the training data and use it in the fine-tuning process.

Hi @AokiintG - welcome.

You could technically construct your message as a JSON with timestamp being one of the keys.

But can I ask: what are you looking to achieve by including the timestamp in your data set?

@jr.2509 Thank you for the reply.
I want to ensure that the learning model can recognize the date associated with each piece of content in the training data, as the data consists of media content like blog posts that are posted daily.
Is it possible?

Thanks for clarifying.

Some bad news for you. The fine-tuning isn’t actually intended for knowledge injection/acquisition. So you would not have a benefit in including the timestamp in the training data set. The fine-tuning is predominantly designed for adjusting the way a model behaves or the style of output provides. So for example, if you wanted you blog posts to be written in a certain style, a fine-tuned model would be one way to achieve that.

See also here:
https://platform.openai.com/docs/guides/fine-tuning/when-to-use-fine-tuning

If you are looking to build a system that you can use for knowledge retrieval, then you should look into embeddings instead.

Here’s a link that discusses embeddings in more detail:
https://platform.openai.com/docs/guides/embeddings

3 Likes

Actually, I already built a system using embeddings but ended up facing issues with token limits. So, I tried to switch to using fine-tuning. However, it seems that embedding was best for my project. Maybe I should consider adding more metadata beyond the timestamp for search filtering.

1 Like