Will GPT-3 understand data better with HTML tags?

yonish3 · November 12, 2022, 3:10pm

I’m wondering if GPT-3 can use basic HTML tags to better understand the content.
The idea is to feed GPT-3 with chunks of text from a webpage.

The use case is to send webpage content in the prompt and ask GPT-3 some questions about it.

Option 1: get pure text out of the webpage
Option 2: keep basic HTML tags like <h1>, <h2>, <p>, etc., and remove anything that only consumes tokens, such as CSS and non-text tags like empty divs or sections.
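
To make Option 2 concrete, here is a rough sketch using Python's standard html.parser module. The tag whitelist and the names KEEP_TAGS, SKIP_CONTENT_TAGS, BasicTagFilter and simplify_html are my own assumptions for illustration, not an established recipe; adjust the whitelist to whatever structure matters for your pages.

```python
import re
from html.parser import HTMLParser

# Assumption: these are the "basic" structural tags worth keeping.
KEEP_TAGS = {"h1", "h2", "h3", "p", "ul", "ol", "li"}
# Everything inside these tags is dropped (CSS and scripts).
SKIP_CONTENT_TAGS = {"style", "script"}


class BasicTagFilter(HTMLParser):
    """Keeps text plus a small whitelist of tags, without attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT_TAGS:
            self.skip_depth += 1
        elif tag in KEEP_TAGS and self.skip_depth == 0:
            self.parts.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP_TAGS and self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def simplify_html(html: str) -> str:
    """Return the page text with only whitelisted tags left in place."""
    parser = BasicTagFilter()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so layout-only newlines cost fewer tokens.
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


page = (
    '<div class="box"><style>p{color:red}</style>'
    '<h1 id="t">Title</h1>\n<p>Hello <b>big</b> world</p><div></div></div>'
)
print(simplify_html(page))
```

Tags outside the whitelist (div, b, and so on) are dropped but their text is kept, so empty divs vanish on their own. Option 1 is the same idea with an empty whitelist.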

I was thinking GPT-3 might use these tags to better understand the structure of the page, such as titles and subtitles.

What are your thoughts?

jeffinbournemouth · November 18, 2022, 10:02am

I am currently using Davinci 2 for extracting information (used for content summarization and categorisation) from full raw website pages.

The results are excellent with or without HTML.

Example (the prompt is the homepage content plus the instruction below it; the labelled list is the completion):

<“homepage content”>

Now we read the information on the homepage and list all of the categories and tags for business category, business facilities, business features, equipment, security, parking, opening times, and classes etc at this location, as a csv:

Business category: Gym
Business facilities: Cardio equipment, weights, group classes, personal training
Business features: 24/7 access, security, hygiene standards
Equipment: Cardio equipment, weights
Security: 24-hour security, secure key access
Parking: Yes
Opening times: 24/7
Classes: Group classes
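
If you want to use a completion shaped like the one above in code, a minimal sketch for turning the "Label: value, value" lines into a dictionary could look like this. The function name parse_labelled_lines is hypothetical, and it assumes every line follows that shape, which the model will not always guarantee.

```python
def parse_labelled_lines(completion: str) -> dict[str, list[str]]:
    """Split 'Label: a, b, c' lines into {label: [a, b, c]}."""
    result = {}
    for line in completion.splitlines():
        # Split on the first colon only, so values like "9:00" stay intact.
        label, sep, values = line.partition(":")
        if not sep or not label.strip():
            continue  # skip blank lines and lines without a label
        result[label.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return result


sample = "Business category: Gym\nEquipment: Cardio equipment, weights\n\nParking: Yes"
print(parse_labelled_lines(sample))
```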
