How is data modelling done for AI models like ChatGPT, Gemini, and DeepSeek?

As far as I know, ChatGPT and DeepSeek use vector databases like Pinecone. Do they also use cloud data warehouses like Snowflake, or BI tools like Power BI, for their data modelling?