Hello,
I am attempting to create a fine-tuned model for detecting phishing. I downloaded the DOM of each website and extracted various features from it. These features are joined into the prompt, with the verdict as the completion.
Example prompt:
{"prompt": "https://danieltolson.com/', 'https://danieltolson.com/confirm.php?regh=:[http://www.ficohsa.com/]-$%&\n", "completion":" ph-$%&"}
The first problem I encountered was that the model returned completions other than “clean” or “ph”. I added a stop sequence and used logit_bias to force it to return only the “clean” and “ph” tokens.
I use the Ada model and these are my parameters:
import openai

response = openai.Completion.create(
    model=models[method],               # the fine-tuned Ada model
    prompt=prompt_line,                 # features + separator, as in the example above
    max_tokens=1,                       # the verdict is a single token
    temperature=0,                      # deterministic output
    logprobs=2,                         # return the top-2 token log probabilities
    logit_bias={27773: 100, 746: 100},  # token IDs for the two verdict classes
    stop=["-$%&"],
)
I have over 1,000 phishing examples and more than 600 clean examples. The result.csv file from fine-tuning reports a score of 1.0 for every metric. In practice, however, I get a probability of >95% that the site is phishing for nearly every website, even for clean examples taken from the training set.
Am I misinterpreting the results, or are my parameters causing this problem?
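For reference, this is how I turn the returned logprobs into the probability I quoted (if this interpretation step is wrong, that might explain the discrepancy):

import math

# top-2 candidates for the single output token, as a {token: logprob} dict
top = response["choices"][0]["logprobs"]["top_logprobs"][0]
probs = {token: math.exp(logprob) for token, logprob in top.items()}
print(probs)  # maps each candidate token ("ph" / "clean") to its probability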
Side question: with embeddings, would it be possible to train on the entire DOM instead of extracted features? And could I pass the entire DOM at inference time to get a verdict?
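To make the side question concrete, this is roughly what I have in mind (the model name and classifier are just assumptions on my part; text-embedding-ada-002 has a token limit, so the DOM would probably need truncation):

import openai
from sklearn.linear_model import LogisticRegression

def embed(text):
    # Embed the (crudely truncated) raw DOM instead of hand-extracted features.
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text[:20000]])
    return resp["data"][0]["embedding"]

X = [embed(dom) for dom in doms]        # doms / labels are placeholders for my dataset
clf = LogisticRegression().fit(X, labels)

verdict = clf.predict([embed(new_dom)])[0]  # "ph" or "clean" for an unseen DOM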