I am able to instantiate a OpenAI instance and get a ChatCompletions object but my prompt is not good enough to show all the comps of the webpage (saved as a html file in my drive).
My prompt is as :
prompt = “”"
You have a content page from a website.
Get the individual components on this page like header, footer, image, title, body, metadata, links or any other content.
For each component, return:
- a title or heading
- a text content from that component if any
- any other metadata if present
- any images if available
Output the result in form of a JSON array with each element representing a component with its extracted fields.
Here is the HTML content:
\“\”\“{html}\”\“\”
Return only the JSON array.
“”"
How do I improve it?
My get_components function is as follows:
def get_components(html):
messages = \[
{“role”: “system”, “content”: prompt},
\]
client = OpenAI(api_key=OPENAI_API_KEY)
completion = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=messages,
response_format= { "type":"json_object" },
temperature=0.7
)
print('\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*Response is \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*')
print(completion)
response = completion.choices\[0\].message.content.strip()
try:
data = json.loads(response)
return data
except json.JSONDecodeError as e:
print(e)
return \[\]
Thanks