Considerations for Creating Datasets for GPT Model Input: Directly Creating with GPT

I’m building a service that extracts keywords from articles, and I’ve crawled approximately 700 articles so far. I’m unsure how to create the dataset, which is why I’m writing this post. Which of the following three options should I pursue?

  1. List the articles to be input into GPT, and have humans manually extract keywords from them to create a dataset in CSV format.

  2. List the articles to be input into GPT, and use GPT itself to extract the keywords to create the dataset.

  3. Apply the list of articles directly to the GPT model as a knowledge base, without any data preprocessing.
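To make options 1 and 2 concrete, here is a minimal sketch of the CSV dataset they would both produce. Everything in it is an assumption for illustration: the column names (`article_id`, `text`, `keywords`), the function `build_keyword_dataset`, and the stand-in `naive_extractor`, which simply counts frequent words. In option 1 a human annotator would supply the keywords; in option 2 a GPT call would take the extractor's place. No real GPT API is used here.

```python
import csv
import io
import re
from collections import Counter
from typing import Callable, Iterable, List, TextIO

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def naive_extractor(text: str, top_k: int = 3) -> List[str]:
    """Stand-in keyword extractor: returns the most frequent non-stop words.

    This is a placeholder for a human annotator (option 1) or a GPT call
    (option 2); it is not meant to be a good extractor.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_k)]


def build_keyword_dataset(
    articles: Iterable[str],
    extract: Callable[[str], List[str]],
    out: TextIO,
) -> int:
    """Write one CSV row per article and return the number of rows written."""
    writer = csv.writer(out)
    writer.writerow(["article_id", "text", "keywords"])
    rows_written = 0
    for article_id, text in enumerate(articles):
        keywords = extract(text)
        # Keywords are joined with ';' so they fit in a single CSV column.
        writer.writerow([article_id, text, ";".join(keywords)])
        rows_written += 1
    return rows_written


if __name__ == "__main__":
    sample_articles = [
        "GPT models extract keywords. GPT models help.",
        "Cats chase mice in the garden.",
    ]
    buffer = io.StringIO()
    build_keyword_dataset(sample_articles, naive_extractor, buffer)
    print(buffer.getvalue())
```

Because the extractor is passed in as a function, the same dataset format works whether the keywords come from people or from a model, which also makes it possible to compare the two on a small sample before committing to either.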

I’m still learning about AI and don’t have much experience, so I’d appreciate any discussion on this. :neutral_face: :smiling_face: :kissing_closed_eyes: :melting_face: :melting_face:
