Video/audio transcription

Any thoughts out there on how to get started with Codex to transcribe audio and video?

As said above, I would use one of the existing cloud services.

I did this a while back for a project using Google Cloud, and it worked really well (as long as you have good-quality input and are realistic about your expectations!).
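
To give an idea of what the Google Cloud route can look like, here is a minimal sketch in Python. It assumes the google-cloud-speech client library is installed, that credentials are already set up, and that the audio is a short 16 kHz mono LINEAR16 WAV file (for video you would first extract the audio track, e.g. with ffmpeg). The function names transcribe_file and join_transcripts are made up for illustration, and this is not necessarily the setup used in the project mentioned above.

```python
def join_transcripts(results):
    """Concatenate the top-ranked alternative of each recognition result."""
    pieces = []
    for result in results:
        # Each result carries a ranked list of alternatives; skip empty ones.
        if result.alternatives:
            pieces.append(result.alternatives[0].transcript.strip())
    return " ".join(pieces)


def transcribe_file(path, language_code="en-US", sample_rate_hertz=16000):
    """Send one short audio file to Google Cloud Speech-to-Text.

    Assumption: the file is LINEAR16 (uncompressed WAV) at the given
    sample rate. Adjust the config if your audio differs.
    """
    # Imported lazily so join_transcripts works without the client library.
    from google.cloud import speech

    client = speech.SpeechClient()
    with open(path, "rb") as f:
        content = f.read()

    audio = speech.RecognitionAudio(content=content)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hertz,
        language_code=language_code,
    )

    # Synchronous recognition, intended for short clips only.
    response = client.recognize(config=config, audio=audio)
    return join_transcripts(response.results)
```

Calling transcribe_file("interview.wav") (a hypothetical file name) would return the transcript as one string. The synchronous recognize call only suits clips of roughly a minute or less; for longer recordings the same client offers long_running_recognize, typically with the audio uploaded to Cloud Storage first.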