Seeking Advice on Integrating the TTS-1 API for Reading a PDF Book Aloud

Hey,

I’m currently trying to have my book read aloud to me using the TTS-1 API. My plan is to use a regular PDF parser to extract the text from the PDF, store it in a variable, and then pass that as the input to the model.

Is this a bad approach? How would you do it?

Best regards,
Gustaf

I’ve done this a lot. In my experience, PDF-to-text extraction is nowhere near perfect, though the quality depends greatly on the type of PDF content. These days I convert the PDF to a text file, clean up the text, and then run it through TTS. I do it page by page and create one audio file per page, but you may not need that. Either way, you will need to segment the text and merge the audio files together.
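
For anyone wanting something concrete, here is a minimal sketch of the segment-then-request steps. It is not necessarily how anyone in this thread does it. It assumes the text has already been extracted and cleaned, that the speech endpoint accepts up to 4096 characters per request and returns MP3 by default (check the current API docs), and that you have an API key. The function names, the "alloy" voice choice, and the file naming are all made up for illustration.

```python
import json
import re
import urllib.request

# Assumed per-request input limit; verify against the current API docs.
MAX_CHARS = 4096


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    # Collapse the stray line breaks and spacing that PDF extraction leaves behind.
    text = re.sub(r"\s+", " ", text).strip()
    # Split after '.', '!' or '?' when followed by a space.
    sentences = re.split(r"(?<=[.!?]) ", text)
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def build_speech_request(chunk, api_key, voice="alloy"):
    """Build (but do not send) a POST request for one chunk of text."""
    body = json.dumps({"model": "tts-1", "input": chunk, "voice": voice})
    return urllib.request.Request(
        "https://api.openai.com/v1/audio/speech",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, api_key, prefix="page"):
    """Send each chunk to the API and write one audio file per chunk."""
    paths = []
    for i, chunk in enumerate(split_text(text), start=1):
        with urllib.request.urlopen(build_speech_request(chunk, api_key)) as resp:
            audio = resp.read()
        path = f"{prefix}_{i:03d}.mp3"  # hypothetical naming scheme
        with open(path, "wb") as f:
            f.write(audio)
        paths.append(path)
    return paths
```

Calling `synthesize(page_text, api_key)` once per page would give you numbered audio files for that page. Merging them into one file is a separate step; an audio tool such as ffmpeg is probably safer than gluing the bytes together yourself.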

Hi, maybe I’m asking a stupid question, but it seems like you know the answer. What do you use to split the text correctly and how do you write the request to the API? Could you give an example of how it works? Thank you.