I completely agree with everything Curt just said.
I would just add that this applies to the internet in general: even if you’re encrypting and anonymizing everything, the data is still being sent over a network. Bad actors can target your connection without having access to either endpoint. If you want to make absolutely sure that everything stays private, you’ll have to keep it local, disconnected from any network, and behind a locked door.
This may be the most relevant in your case:
I’d suggest you play around with the en_core_web_sm model from the spaCy library. I’ve been using it to deal with large numbers of PDF files. It’s very simple, but it should be able to handle most, if not all, of your extraction tasks locally.
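To give a rough idea of what that looks like, here is a minimal sketch of local entity extraction with spaCy. It assumes the text has already been pulled out of the PDF by some other tool (spaCy does not read PDFs itself) and that the model was installed beforehand with `python -m spacy download en_core_web_sm`. The function names `load_local_model` and `extract_entities` are just illustrative, not part of spaCy.

```python
from collections import defaultdict


def load_local_model(name="en_core_web_sm"):
    """Load a spaCy pipeline that is already installed on this machine."""
    # Imported lazily so the helper below can be used with any pipeline object.
    import spacy

    # spacy.load reads the installed model package from disk.
    return spacy.load(name)


def extract_entities(nlp, text):
    """Run the pipeline on `text` and group entity strings by their label."""
    doc = nlp(text)
    grouped = defaultdict(list)
    for ent in doc.ents:
        # ent.label_ is the string label (for example PERSON or ORG),
        # ent.text is the matched span of the input.
        grouped[ent.label_].append(ent.text)
    return dict(grouped)
```

You would call it as `extract_entities(load_local_model(), pdf_text)`, where `pdf_text` is a hypothetical string holding the text of one document. Which labels come back, and how accurate they are, depends on the model, so it is worth checking the output on a few of your own files before relying on it.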