Train my GPTs properly using external sources

I have created a couple of “My GPTs” that have helped a lot in creating content in the voice we use across different companies, with our opinions and points of view. What I am missing is a way to train them on our blogs without copying and pasting, or generating a separate document for each one of the posts.

Has anyone managed to train their GPTs properly in a way that does not require too much human intervention? I have several years of podcasts, blogs, etc. that I would love to use.

First, you’re using the wrong term. There is no way to “train” a GPT. You can give it knowledge files for retrieval, but there is no training happening.
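
To make the distinction concrete: the uploaded files are chunked, embedded, and searched at answer time, and the most relevant passages are fed back into the prompt; the model's weights never change. Below is a minimal sketch of that retrieval idea, assuming the OpenAI Python SDK and an illustrative embedding model. The GPT builder does all of this for you when you attach knowledge files, so this is only to show what "retrieval, not training" means.

```python
# Minimal sketch of retrieval (not training): chunk a document, embed the chunks,
# and look up the chunks closest to a question. Model name, chunks, and chunk size
# are illustrative assumptions; custom GPTs handle this internally for uploaded files.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Pretend these are paragraphs pulled out of your blog posts.
chunks = [
    "Our company voice is direct, practical, and a little irreverent.",
    "We publish a podcast episode on marketing operations every Tuesday.",
    "Pricing posts should always include a worked example with real numbers.",
]
chunk_vecs = embed(chunks)

question = "What tone should blog posts use?"
q_vec = embed([question])[0]

# Cosine similarity, highest first: the top chunks get pasted into the prompt;
# the model itself is never retrained.
scores = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {chunks[i]}")
```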

Other than that, I’m not entirely sure what it is you’re wanting to do.

How do you give it knowledge files? I’m totally new to this. I own digital copies of all my midwifery textbooks and am trying to create a GPT that can reference the books and answer questions I have or find something within the reading material. Is this even possible?

Possible but very much illegal.

You cannot upload documents for which you do not own the rights.

You can use NotebookLM; it does exactly that, but it will only give answers using the knowledge already in the files you give it. It’s still great, though.

Create your own vector database with something like Apache Solr, with data stream management handled by Apache Kafka.

Solr’s data ingestion features and search options are what you need. Read up on Kafka for streamed/event-based data ingestion.
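
For what it’s worth, here is a minimal sketch of that pipeline, assuming the kafka-python and pysolr client libraries, a topic named "content" carrying JSON documents, and a Solr collection named "company_content". All of those names are illustrative, and the Solr schema/field setup is a prerequisite.

```python
# Sketch of a Kafka -> Solr ingestion loop: a consumer reads new blog posts or
# podcast transcripts from a topic and indexes them into a Solr collection.
import json
from kafka import KafkaConsumer   # pip install kafka-python
import pysolr                     # pip install pysolr

solr = pysolr.Solr("http://localhost:8983/solr/company_content", timeout=10)

consumer = KafkaConsumer(
    "content",                               # topic carrying new documents as JSON
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
    group_id="solr-ingest",
)

for message in consumer:
    doc = message.value
    # Index one document per event; a real pipeline would batch and commit less often.
    solr.add([{
        "id": doc["id"],
        "title_s": doc.get("title", ""),
        "body_txt": doc.get("body", ""),
        "source_s": doc.get("source", "blog"),
    }], commit=True)
    print(f"indexed {doc['id']}")
```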

Hope it helps.