Seeking Advice on Integrating TTS-1 API for Reading PDF Book Aloud


I’m currently trying to have my book read aloud to me using the TTS-1 API. My plan is to use a regular PDF parser to extract the text from the PDF, store it in a variable, and then pass that as the input to the model.

Is this a bad approach? If so, how would you do it?

I’ve done this a lot. In my experience, PDF-to-text extraction is nowhere near perfect, though the quality depends greatly on the type of PDF content. These days I convert the PDF to a text file, clean up the text, and then run it through TTS. I do it page by page and create one audio file per page, but you may not need that. Either way, you will need to segment the text and merge the audio files together.
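
For what it's worth, here is a rough sketch of that extract, clean, segment, synthesize, merge pipeline in Python. It is only one way to do it and rests on several assumptions that are not from this thread: pypdf for extraction, the official openai package for the TTS call, and pydub (which needs ffmpeg installed) for merging. The function names, the "alloy" voice, the output file names, and the 4000-character chunk size are all illustrative; the chunk size is chosen to stay under the 4096-character input limit the speech endpoint had when I last checked, so verify that against the current docs.

```python
import re
from pathlib import Path

# Assumption: stay safely below the TTS input limit (4096 characters when last checked).
MAX_CHARS = 4000


def extract_pages(pdf_path: str) -> list[str]:
    """Return the raw text of each page (assumes the pypdf package)."""
    from pypdf import PdfReader  # third-party: pip install pypdf
    return [page.extract_text() or "" for page in PdfReader(pdf_path).pages]


def clean_text(raw: str) -> str:
    """Undo common PDF extraction damage: split words and hard line wraps."""
    # Re-join words hyphenated across a line break. Caveat: this also
    # removes a genuine hyphen that happens to sit at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # A single newline is usually just a wrap; blank lines (paragraphs) stay.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def split_into_chunks(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split on sentence boundaries so that no chunk exceeds max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        # Last resort: hard-cut a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def synthesize(chunks: list[str], out_dir: Path, voice: str = "alloy") -> list[Path]:
    """Write one MP3 per chunk using the tts-1 model (assumes the openai package)."""
    from openai import OpenAI  # third-party: pip install openai
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(chunks):
        path = out_dir / f"part_{i:04d}.mp3"
        with client.audio.speech.with_streaming_response.create(
            model="tts-1", voice=voice, input=chunk
        ) as response:
            response.stream_to_file(path)
        paths.append(path)
    return paths


def merge(paths: list[Path], out_file: Path) -> None:
    """Concatenate the MP3 parts in order (assumes pydub plus ffmpeg)."""
    from pydub import AudioSegment  # third-party: pip install pydub
    combined = AudioSegment.empty()
    for path in paths:
        combined += AudioSegment.from_mp3(str(path))
    combined.export(str(out_file), format="mp3")
```

To get one audio file per page, you would loop over `extract_pages("book.pdf")`, run each page through `clean_text` and `split_into_chunks`, call `synthesize` on the chunks, and `merge` the resulting parts into something like `page_0001.mp3`. Splitting on sentence boundaries rather than at a fixed character count keeps the voice from cutting off mid-sentence between parts.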