I’ve seen that the Whisper API response offers many options, such as timestamps, and also returns segments. How can I generate the same output with the open-source Whisper model? I am looking to use the Whisper Large model. Are there specific parameters I can pass, or does this involve using additional custom models for silence detection and extra prompting?
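To make the question concrete, here is a minimal sketch of the kind of output I mean, assuming the open-source `openai-whisper` Python package and its `transcribe()` result dictionary (with `segments`, and per-segment `words` when `word_timestamps=True`). The helper names `transcribe_with_segments` and `summarize_segments` and the file name `audio.mp3` are made up for illustration.

```python
from typing import Any, Dict, List


def transcribe_with_segments(audio_path: str, model_name: str = "large") -> Dict[str, Any]:
    """Run the open-source Whisper model and return its raw result dict."""
    # Imported lazily so summarize_segments() works without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    # word_timestamps=True asks for a "words" list inside each segment.
    return model.transcribe(audio_path, word_timestamps=True)


def summarize_segments(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Reduce a Whisper result to start/end/text (plus words, if present) per segment."""
    rows = []
    for seg in result.get("segments", []):
        rows.append({
            "start": round(float(seg["start"]), 2),
            "end": round(float(seg["end"]), 2),
            "text": seg["text"].strip(),
            # Each word entry carries its own start and end time in seconds.
            "words": [
                (w["word"].strip(), w["start"], w["end"])
                for w in seg.get("words", [])
            ],
        })
    return rows


if __name__ == "__main__":
    # Hypothetical input file; replace with a real audio path.
    result = transcribe_with_segments("audio.mp3")
    for row in summarize_segments(result):
        print(f'[{row["start"]:.2f} -> {row["end"]:.2f}] {row["text"]}')
```

As far as I can tell, `transcribe()` also accepts options such as `initial_prompt` and `no_speech_threshold`, so the segments and timestamps may not need a separate silence-detection model, but I have not verified how closely this matches the hosted API's response.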