Synthetic data generation with LLMs

I've heard of people using LLMs to generate synthetic data for training ML models, and I'm a little confused. Aren't LLMs just probability distributions over words? How would they know the data distribution required for the training data? My understanding is that an LLM alone cannot do that; it would need to be paired with a traditional generative model (like a VAE). Am I thinking about this the wrong way? Are LLMs really that capable? My understanding of the technology says otherwise, and I'd like to correct myself if I'm wrong.

You can play with the sampling settings (such as temperature and top-p) to change the output probabilities.
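A minimal sketch of what the temperature setting actually does, using toy logits (the values are hypothetical, just for illustration): sampling temperature rescales the model's next-token logits before the softmax, so low temperature concentrates probability on the top token and high temperature spreads it out, which is one knob for making generated data more or less diverse.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before normalizing:
    # T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token logits (hypothetical values)
logits = [2.0, 1.0, 0.5, 0.1]

sharp = softmax_with_temperature(logits, 0.5)  # more deterministic
flat = softmax_with_temperature(logits, 2.0)   # more diverse
```

Here `sharp[0]` is larger than `flat[0]`: at low temperature the most likely token dominates, while at high temperature the tail tokens get more mass, which is why raising the temperature yields more varied synthetic samples.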

You can also use prompt engineering to enforce uniqueness, for example by varying the attributes in each prompt so the outputs cover the distribution you need.
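One way to do that is to enumerate the target distribution yourself and bake it into the prompts, so the model is forced to cover every combination instead of drifting toward its most likely output. A sketch under hypothetical attributes (the age/region/sentiment grid below is invented for illustration):

```python
import itertools

# Hypothetical attribute grids; in practice these would describe
# the distribution your synthetic dataset needs to cover.
ages = ["18-25", "26-40", "41-65"]
regions = ["urban", "rural"]
sentiments = ["positive", "negative", "neutral"]

def build_prompts():
    # Enumerating all attribute combinations guarantees coverage
    # of the grid, rather than hoping the model samples it evenly.
    prompts = []
    for age, region, sentiment in itertools.product(ages, regions, sentiments):
        prompts.append(
            f"Write a {sentiment} product review from a {age}-year-old "
            f"{region} customer. Do not repeat earlier reviews."
        )
    return prompts

prompts = build_prompts()  # 3 * 2 * 3 = 18 distinct prompts
```

Each prompt would then be sent to the LLM separately; the distribution over attributes comes from your enumeration, not from the model, which is also why an LLM alone doesn't need to "know" the distribution in advance.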