I have created my own ChatGPT trained on a pdf document,
However I was expecting to see the trained model in the selection dropdown list on the OpenAI Playground which I do not see.
I have created my own ChatGPT trained on a pdf document,
However I was expecting to see the trained model in the selection dropdown list on the OpenAI Playground which I do not see.
It seems we have an influx of people “training” without understanding what’s happening, how to fine-tune a base model (which is not ChatGPT the web site nor the gpt-3.5-turbo model that ChatGPT uses), or when it is appropriate to fine-tune a model or instead when it is appropriate to use a vector database for retrieval of relevant information stored in a local database.
The only way you would create your own model is by the lengthy procedure of creating dozens if not thousands of question-and-answer exchanges in the correct fine tune format, upload them to the API’s file storage, and then perform a processor-intensive training operation.
So can you explain what is happening then if I created some python code to feed into the chatgpt3.5-turbo model to produce a trained model id which was output on the screen?
Surely there would be no Model Id output if it was not a model that I had trained based on the ChatGPT3.5?
Python code as follows:-
import openai
import PyPDF2
with open(‘doc.pdf’, ‘rb’) as file:
pdf_reader = PyPDF2.PdfReader(file)
document_text = “”
for page in pdf_reader.pages:
document_text += page.extract_text()
openai.api_key = my own key
max_token_length = 4096
chunks = [document_text[i:i+max_token_length] for i in range(0, len(document_text), max_token_length)]
model_id = None
training_data = “”
for i, chunk in enumerate(chunks):
# Prepare messages for the chunk
messages = [
{“role”: “system”, “content”: “You are a helpful assistant.”},
{“role”: “user”, “content”: chunk}
]
# Fine-tune a new model with the chunk
response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages
)
# Get the trained model ID
model_id = response['id']
print(“Trained model ID:”, model_id)
You will find your answer where I responded to your near-duplicate topic here: