Use GPT-3 to find the frequency of keywords within my own dataset

How could I use the GPT-3 API to find the frequency of keywords (including their synonyms and variations) within my own dataset?

Some more details: I have a dataset of 24k documents, each about 200 words long. I’d like to extract the most frequent keywords from this dataset. How could I do that with the GPT-3 API?

Thanks

I’m not sure what you mean by keywords. Could you elaborate on this, or give an example?

Hi Kevin,
Thank you for your kind reply.

Here is what I mean by keywords. This is an example I extracted from ChatGPT.

[image: example list of keywords produced by ChatGPT]

I’d like to extract these keywords and their relative importance (frequency, TF-IDF, or a similar importance metric), but considering only the documents in my dataset. In the example above, the frequency is estimated over the whole dataset ChatGPT was trained on.

Thanks,
Alex

Try a few-shot prompt along with instructions; for instance:

Given the following text, identify and list the most relevant keywords, terms associated with the subject matter

{example text}

Keywords:

---
{example text2}

Keywords:

---

{example text3}

Keywords:

I’m on my phone and haven’t tested this prompt yet, but if it does not work, try fine-tuning.
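Roughly like this in code (untested; the example texts are made up, and the API call is commented out since it needs a key and a model of your choice):

```python
# Sketch: assemble the few-shot prompt above. The example texts below are
# hypothetical placeholders; substitute documents from your own dataset.
EXAMPLES = [
    ("Solar panels convert sunlight into electricity.",
     "solar panels, sunlight, electricity"),
    ("The stock market fell sharply amid inflation fears.",
     "stock market, inflation"),
]

INSTRUCTION = ("Given the following text, identify and list the most relevant "
               "keywords, terms associated with the subject matter")

def build_prompt(document: str) -> str:
    parts = [INSTRUCTION, ""]
    for text, keywords in EXAMPLES:
        parts += [text, "", "Keywords: " + keywords, "", "---", ""]
    parts += [document, "", "Keywords:"]
    return "\n".join(parts)

prompt = build_prompt("GPT-3 can extract keywords from short documents.")
print(prompt)

# With the openai package the completion call would look roughly like:
# from openai import OpenAI
# client = OpenAI()  # expects OPENAI_API_KEY in the environment
# resp = client.completions.create(model="gpt-3.5-turbo-instruct",
#                                  prompt=prompt, max_tokens=64)
# keywords = resp.choices[0].text.strip()
```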

Hi Kevin,

It works really well. GPT is able to extract the keywords when prompted with a document.

Now that I have the keywords, how would I estimate their relevance within my dataset? Their relevance within a bigger corpus (such as GPT’s training data) does not solve my problem.

Thanks,
Alex

One option would be to calculate the cosine similarity between the keywords and the documents using embeddings.
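Roughly like this, where the toy 3-d vectors are made-up stand-ins for real embeddings (you’d fetch those from the embeddings endpoint for each keyword and document):

```python
# Sketch: rank keywords by cosine similarity to a document embedding.
# The vectors below are fabricated toy values, not real API output.
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

doc_embedding = [0.8, 0.1, 0.1]
keyword_embeddings = {
    "solar energy": [0.7, 0.2, 0.1],   # points in a similar direction
    "stock market": [0.1, 0.9, 0.2],   # mostly orthogonal to the document
}

ranked = sorted(keyword_embeddings.items(),
                key=lambda kv: cosine(doc_embedding, kv[1]),
                reverse=True)
for keyword, vec in ranked:
    print(f"{keyword}: {cosine(doc_embedding, vec):.3f}")
```

Averaging each keyword’s similarity over all 24k documents would then give a dataset-wide relevance score for that keyword.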