I want to fine-tune a model with a specific prompt: smooth the data using particle filtering methods and predict stock returns using an LSTM model, adjusting parameters to optimize the prediction. The training set includes stock data from the past ten days as input, with the next day’s return as the output.
While I understand that the code might not yield accurate predictions immediately, I’m concerned about whether the model is truly being trained according to my instructions. How can I verify that the model is actually applying the particle filtering and LSTM methods as specified, rather than just performing a basic operation like averaging numbers, especially when the loss remains consistently high? How to trace the training steps or thoughts of the model?
A more general question is whether it’s reasonable to fine-tune such a project by asking an LLM to perform numerical forecasting.
I think what you’re asking could in theory work, but it’s not gonna be using an LSTM model, and it’s not gonna do particle filtering as such.
Best case, the model will pretend to be or do these things, but not actually do it.
I think if you’re set on these methods, you’re probably better off getting chatgpt to walk you through implementing those things in python or something.
You could also use the assistants code interpreter feature to ask the model to build these things and use them, but I don’t think that’s gonna be efficient in your case.
Internal Thoughts
I think it’s important to keep in mind that the model doesn’t “think” in the background, as such. Any “thought” (for most intents and purposes, it gets kinda tricky) would have to be expressed as text first, that’s why a lot of people are using “Chain of Thought” prompts - to get the model to reason out loud about the subject before coming to a conclusion.
Claude, for example “can” “think” in the “background” - but if you take a deeper dig into what’s actually going on, claude is really just generating xml tags into which these thought processes will be generated, and then hidden from the user.
So to answer your question here: it’s up to you to tell the model how to think, and if it’s not talking about what it’s thinking, it’s probably not thinking much.
I think I covered most of your questions here. Hope this helps you get started, but do feel free to ask more!
Thank you for your response. It’s quite clear, but I was wondering if you could provide any references or sources to support your point, so I can have a more solid understanding.
What specifically would you like a reference to? Most of this stuff is operational experience from working full time with this tech for a bit more than two years now.
I’m not an expert on fine-tuning specifically, but I generally steer people away from it because from experience it’s a quick way to give you the illusion like you’ve done a lot of work and created some IP, with relatively little actual benefit. Perhaps @jr.2509 can give you a better assessment of your use-case here.
So you work for OpenAI? That’s really cool! Specifically, do you have any documentation showing that fine-tuning won’t actually perform machine learning algorithms as I requested, but instead just mimics that behavior? Also, any documents showing that the model won’t have Chain-of-Thought like human would be really helpful. I really appreciate your assistance!
Also, @jr.2509, I would greatly appreciate any thoughts you could share on this. Thank you!
No, I don’t work for OpenAI, I’m just a user like you
Fine tuning, or even retraining won’t change the architecture of the LLM. The LLMs are transformer based models, but you’re asking for an LSTM. You can change some of the numbers, but you can’t change the architecture. (especially not by using OpenAI’s fine tuning endpoint)
If you’re familiar stuff like keras and pytorch, you probably know what I mean. If you really want to get into it, you can try to build a GPT from scratch just to see how it works. Here’s a reputable resource on that: