I've been hearing about people using LLMs to create synthetic data for training ML models, and I'm a little confused. Aren't LLMs just a probability distribution over words? How would they know what data distribution the training data needs to follow? My understanding is that an LLM alone can't do that; it would need to be married with a traditional generative model (like a VAE). Am I thinking about this the wrong way? Are LLMs really that capable? My understanding of the technology says otherwise, but I'd like to correct myself if I'm wrong.
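
For concreteness, here is roughly what I understand people to be doing (just a sketch, assuming the `openai` Python package, a placeholder model name, and a toy sentiment-labeling task I made up): few-shot prompt the model with a handful of real examples and ask it to produce more in the same format, then treat the outputs as extra training rows.

```python
# Rough sketch of "LLM-generated synthetic data" as I understand it:
# few-shot prompt the model with real examples and ask for more of the same.
# Assumes the `openai` Python package (>=1.0) and an illustrative model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A few real, labeled examples to anchor the style and the label space.
seed_examples = [
    ("The battery died after two hours.", "negative"),
    ("Setup took five minutes and it just works.", "positive"),
]

prompt = (
    "Generate 5 new product-review sentences with a sentiment label "
    "(positive or negative), one per line as: <text> | <label>.\n"
    "Match the style of these examples:\n"
    + "\n".join(f"{text} | {label}" for text, label in seed_examples)
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=1.0,  # higher temperature for more varied samples
)

# Each line is one synthetic (text, label) pair to add to the training set.
for line in resp.choices[0].message.content.strip().splitlines():
    print(line)
```

If that's really all it is, my question stands: the samples come from the LLM's learned text distribution conditioned on the prompt, not from the actual distribution of my data, so I don't see how this replaces something like a VAE fit to the real dataset.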