Hello there,
I am currently working on a project that involves training a large language model similar to GPT-3. I have some experience with smaller models, but this is my first time working at this scale.
How did you approach preprocessing and cleaning large datasets for training?
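For context, here is the exact-dedup and filtering pass I currently run on smaller corpora. The file names, the JSONL "text" field, and the thresholds are placeholders from my own setup, not recommendations, and I suspect the in-memory hash set is exactly the part that breaks at scale:

```python
import hashlib
import json

def keep(doc: str) -> bool:
    """Cheap heuristics: drop very short or low-alphabetic documents."""
    if len(doc) < 200:
        return False
    return sum(c.isalpha() for c in doc) / len(doc) > 0.6

seen = set()  # exact-duplicate hashes; only fits in RAM for small corpora
with open("raw.jsonl") as src, open("clean.jsonl", "w") as dst:
    for line in src:
        doc = json.loads(line)["text"]
        h = hashlib.sha256(doc.encode()).hexdigest()
        if h in seen or not keep(doc):
            continue
        seen.add(h)
        dst.write(json.dumps({"text": doc}) + "\n")
```

Does something like this survive at billions of documents, or do you switch to approximate methods like MinHash for near-duplicates?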
Are there specific modifications or adjustments you made to the architecture to handle the scale?
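For reference, these are the knobs I understand matter most when sizing up; the defaults below are just the 175B figures from the GPT-3 paper, not values I have validated myself:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_layer: int = 96         # depth
    n_head: int = 96          # attention heads
    d_model: int = 12288      # hidden size
    n_ctx: int = 2048         # context length
    vocab_size: int = 50257   # GPT-2/3 BPE vocabulary
```

Is resizing these enough, or did you need other changes to keep training stable?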
What training strategies did you find most effective for large models?
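On smaller models I lean on mixed precision plus gradient accumulation. Here is a minimal sketch of that loop, with a toy model and synthetic data so it actually runs on a single CUDA GPU; my real model, optimizer, and loader are stand-ins here:

```python
import torch
import torch.nn as nn

# Toy stand-ins so the loop runs end to end (requires a CUDA GPU).
model = nn.Linear(512, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loader = [(torch.randn(8, 512).cuda(), torch.randn(8, 512).cuda())
          for _ in range(32)]

scaler = torch.cuda.amp.GradScaler()
accum_steps = 8  # tuned to reach my target effective batch size

for step, (x, y) in enumerate(loader):
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(x), y) / accum_steps
    scaler.scale(loss).backward()      # fp16-safe scaled backward
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)         # unscales grads, then steps
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```

Is this roughly the right foundation, or does the strategy change qualitatively once data and model parallelism enter the picture?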
How did you manage resources such as GPUs, memory, and storage during training?
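My back-of-envelope accounting, following the per-parameter breakdown from the ZeRO paper, is what worries me here:

```python
# Training state per parameter: fp16 weights + fp16 grads + fp32 master
# weights + fp32 Adam momentum + fp32 Adam variance = 16 bytes.
params = 175e9                          # GPT-3 scale
bytes_per_param = 2 + 2 + 4 + 4 + 4     # = 16
print(f"{params * bytes_per_param / 1e12:.1f} TB of state")  # 2.8 TB
```

That 2.8 TB is before activations, so I assume some combination of sharding, offloading, and activation checkpointing is unavoidable; did that match your experience?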
Any tips on evaluating and fine-tuning a large language model once it’s trained?
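The only quantitative evaluation I run today is held-out perplexity, along the lines of this sketch, where `model` and `val_loader` stand in for my actual eval harness:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, val_loader) -> float:
    """Held-out perplexity = exp(mean negative log-likelihood per token)."""
    total_nll, total_tokens = 0.0, 0
    for x, y in val_loader:  # input token ids and shifted targets
        logits = model(x)    # (batch, seq, vocab)
        total_nll += F.cross_entropy(
            logits.view(-1, logits.size(-1)), y.view(-1),
            reduction="sum").item()
        total_tokens += y.numel()
    return math.exp(total_nll / total_tokens)
```

I assume perplexity alone is not enough at this scale, so I would welcome pointers to what else you measured.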
I am also curious to hear about any challenges you faced and how you overcame them.
Any advice, resources, or insights you can share would be greatly appreciated.
Thank you in advance for your help.