Hi and welcome to the developer forum!
Can you give some examples of the data you intend to train on, and of the tasks you need the model to perform?
Also, as of right now the largest context size that can be trained is 4,000 tokens. There are plans to have a trainable 8k model by the end of the year, and possibly 16k and 32k as well, though I'm not sure about those last two.
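
In the meantime, it's worth checking whether your training examples actually fit under that 4k limit. Here's a rough sketch in Python using the tiktoken library (you'd need to install it, and the encoding name and example texts below are just placeholders, not specific to your setup):

```python
# Minimal sketch: count tokens per training example and flag any that
# exceed the 4k trainable context size mentioned above.
import tiktoken

MAX_CONTEXT_TOKENS = 4000  # current trainable context size

# Assumed encoding; swap in the one that matches the model you train.
enc = tiktoken.get_encoding("cl100k_base")

# Placeholder examples; replace with your own training texts.
examples = [
    "First training example text ...",
    "Second training example text ...",
]

for i, text in enumerate(examples):
    n_tokens = len(enc.encode(text))
    if n_tokens > MAX_CONTEXT_TOKENS:
        print(f"Example {i} is too long: {n_tokens} tokens (limit {MAX_CONTEXT_TOKENS})")
    else:
        print(f"Example {i} fits: {n_tokens} tokens")
```

That way you can tell up front whether you can train now or need to wait for the longer-context models.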