Scaling a multi-modal pipeline with Azure OpenAI

I’m working on a project that uses multi-modal data to answer user queries about videos. What are the common approaches for this with Azure OpenAI? I’m also interested in scalability and security. If anyone has tips or can point me to good blog or forum posts, I’d appreciate it.