Title: Implementing Text Extraction, Modification, and File Regeneration with OpenAI Assistant

Hello everyone,

I’m working on a project where I need to integrate the OpenAI Assistant to perform a specific workflow involving file handling and text manipulation. The process is as follows:

  1. File Upload: A user uploads a document (the formats could vary: PDF, DOCX, etc.).
  2. Text Extraction: The OpenAI Assistant needs to extract the text from this uploaded document.
  3. Text Modification: Once the text is extracted, it needs to be updated or modified based on values inputted by the user. This could involve inserting specific phrases, changing certain words, or updating numbers/dates.
  4. File Regeneration: After the text has been updated, the OpenAI Assistant should generate a new file that preserves the original format, style, and font of the uploaded document but contains the updated text.

I’m seeking advice on the best tools, libraries, or APIs that could facilitate each step of this process, especially focusing on maintaining the original formatting and style in the regenerated document. It’s crucial that the output document mirrors the input in terms of visual presentation.

Additionally, any insights on how to seamlessly integrate these functionalities with the OpenAI Assistant or tips on handling different file formats effectively would be greatly appreciated.

I’ve yet to receive any feedback on this post. Could someone please offer their assistance?

Well, it’s technically possible, with docx at least.

Use the model to understand the text, and then code interpreter to alter it.

What have you tried so far?

Well, I still don’t have a workable solution for this requirement, and I’m looking for a better approach. Please guide me on how this is actually possible. I mean, I need to keep the style and format of the file after altering the text.
Please share some resources if possible. Thanks.

Could you please elaborate on this feature? How can we do this?

Well, code interpreter can open and create docx files with python…
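
To make that concrete, here is a minimal sketch of the kind of script Code Interpreter could run. It assumes the python-docx library is available and that the values to change appear as literal placeholders such as `{{DATE}}` (a made-up convention for this example). The function names `replace_in_paragraph` and `update_docx` are hypothetical too. The idea is to edit the text of each run in place, so the font and style attached to that run are left alone.

```python
def replace_in_paragraph(paragraph, replacements):
    """Replace placeholder text run by run so each run keeps its own formatting.

    Works on any object exposing `.runs`, where each run has a `.text` string
    (python-docx paragraphs have this shape). Returns the number of runs changed.
    """
    changed = 0
    for run in paragraph.runs:
        new_text = run.text
        for old, new in replacements.items():
            new_text = new_text.replace(old, new)
        if new_text != run.text:
            run.text = new_text  # only the text changes; font and style stay on the run
            changed += 1
    return changed


def update_docx(src_path, dst_path, replacements):
    """Write a copy of src_path to dst_path with replacements applied.

    Covers body paragraphs and top-level table cells only (an assumption of
    this sketch); headers, footers, and nested tables are not visited.
    """
    # Imported here so the helper above stays usable without python-docx installed.
    from docx import Document

    document = Document(src_path)
    changed = 0
    for paragraph in document.paragraphs:
        changed += replace_in_paragraph(paragraph, replacements)
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    changed += replace_in_paragraph(paragraph, replacements)
    document.save(dst_path)
    return changed
```

A call might look like `update_docx("input.docx", "output.docx", {"{{DATE}}": "2024-05-01"})`, with both file names being placeholders. One known limitation: Word often splits a phrase across several runs, and this sketch only matches text that sits entirely inside one run. It also does nothing for PDF, where keeping the original layout is a much harder problem.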

Hello.
I need more help with this requirement. Can you please share some resources?