Chatbot for construction related topics

Cleveland · December 20, 2022, 9:17am

Hi all,

I want to fine-tune GPT-3 to answer complex questions, but I cannot figure out if it is possible.
The questions are about building construction. Here are two examples:

{
"prompt": "What are the requirements for grouts between plaster walls and ceilings?",
"completion": "The answer and related references to building regulations and so on and so forth"
}

or

{
"prompt": "What is the maximum allowable humidity for load-bearing wooden structures such as rafter structures?",
"completion": "The answer and related references to building regulations and so on and so forth"
}

I have been reading a few questions related to this topic like:

And I have seen a few videos by @daveshapautomator

vid1
vid2
and many more…

But can I fine-tune GPT-3 to answer questions like those I suggested? After reading the posts and answers and watching the videos, I am a bit confused. Unlike the court-case examples, these questions and answers would fit within about 2000 tokens and would not necessarily be many pages long.

Looking forward to your thoughts about this topic.

/cleveland

sergeliatko · December 20, 2022, 10:10am

Hi Cleveland,

Yes, it is totally doable, but it may be a bit complicated initially.

The approach to take (all of this is my personal opinion, so please take it like that):

Get your “facts” in a row: extract regulations and other facts into shorter pieces of text, label them with what they apply to and where they belong, then embed them (maybe use a database like Weaviate or similar for search). They will be where the GPT-3 model gets its knowledge from.

And display that prompt to you

Note: items may need to contain not only the text but also the references, etc., so please make sure everything needed is included in the prompt, because the model needs it to build the answer. Some coding may be needed to find context based on what you have found in the database (like getting items from the database, sorting the best candidates, and searching the source docs to get the complete context items).
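The "get items, sort the best candidates" step can be sketched roughly as below. This is only an illustration: toy_embed is a hypothetical bag-of-words stand-in for a real embedding model, and the facts list stands in for a vector database, so the names and data layout are assumptions rather than anything prescribed in this thread.

```python
import math
import re
from collections import Counter

def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(question: str, facts: list[dict], top_k: int = 3) -> list[dict]:
    """Rank labelled fact chunks by similarity to the question, best first."""
    q = toy_embed(question)
    ranked = sorted(facts, key=lambda f: cosine(q, toy_embed(f["text"])), reverse=True)
    return ranked[:top_k]

# Labelled chunks (assumed layout); in practice these would live in a vector database.
facts = [
    {"label": "plaster / grouting",
     "text": "Grout hollow metal frames, bases and similar work occurring in plastered areas."},
    {"label": "plaster / mixing",
     "text": "Mechanically mix plaster materials; do not hand mix except where small amounts are needed."},
]
best = search("How should hollow metal frames be grouted in plastered areas?", facts, top_k=1)
```

With a real embedding model the vectors would be dense floats, but the ranking logic stays the same.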

Then you need to run the model (text-davinci-003 at the beginning to get the best results) to get the answer and output that answer into an editable text area for you to review.

The goal here is to teach the model to use the prompt to form the correct answer when possible and teach it how to answer when the prompt does not give enough information (not make up the answer). Do not try to teach it what to answer but rather how to answer based on the prompt (the goal of fine-tuning in your use case).

When you’re happy with the edits (or the reply is great out of the box), you need a button to save both the prompt (without the “Provide an informative answer to the user inquiry based on the context items below:” part) and the answer into a training file (see formatting in API docs).

Run this at least 500 times (1500 is better) to build your fine-tuning file. Then train a fine-tune (I would try davinci first).

Connect the fine-tuned model to your app and continue running to see if the results suit you (and continue growing the training data file).

Once you’re happy with the results your model gives you, try training curie on the training data and connect it to your app (this will save you response time and money).

Check the curie responses, and if they are good, you have your core model.

Then build a production app based on the core model.
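A minimal sketch of what the "save" button described above might do, assuming a prompt/completion JSONL training file. The function names are hypothetical, and ending the completion with the <|endofreply|> marker is an assumed convention based on the sample replies in this thread.

```python
import io
import json

# Instruction line quoted in the thread; it is stripped before saving.
INSTRUCTION = "Provide an informative answer to the user inquiry based on the context items below:"

def to_training_record(prompt: str, reviewed_answer: str) -> dict:
    """Build one prompt/completion pair from a reviewed (possibly edited) answer."""
    stripped = prompt.removeprefix(INSTRUCTION).lstrip()
    # Assumed convention: close the completion with the thread's end-of-reply marker.
    completion = reviewed_answer.strip() + "<|endofreply|>"
    return {"prompt": stripped, "completion": completion}

def append_jsonl(handle, record: dict) -> None:
    """Write one record as a single JSON line."""
    handle.write(json.dumps(record, ensure_ascii=False) + "\n")

# Demo with an in-memory file; a real app would open the training file in append mode.
buffer = io.StringIO()
record = to_training_record(
    INSTRUCTION + "\nGrout 150 mm lengths at each anchorage.<|endofitem|><|endofcontext|>\n"
    "<|userinquiry|>How long should grouted lengths be?<|endofuserinquiry|>\n<|reply|>",
    "Grout 150 mm lengths at each anchorage.",
)
append_jsonl(buffer, record)
```

Keeping ensure_ascii=False leaves Danish characters readable in the file, which helps when reviewing the training data by hand.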

sergeliatko · December 20, 2022, 10:15am

But you will have to count the prompt into those 2k tokens…
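To make that budget concrete, here is a rough sketch of trimming context items so that prompt plus answer stay under the limit. The four-characters-per-token estimate is only a crude assumption (it is likely less accurate for Danish), and the function names, the 2000 limit and the reserved answer size are illustrative; a real tokenizer should replace approx_tokens.

```python
def approx_tokens(text: str) -> int:
    """Crude estimate (assumption): roughly 4 characters per token."""
    return max(1, len(text) // 4)

def fit_items(items: list[str], question: str,
              limit: int = 2000, reserved_for_answer: int = 500) -> list[str]:
    """Keep the best-ranked context items that still fit in the token budget."""
    budget = limit - reserved_for_answer - approx_tokens(question)
    kept = []
    for item in items:  # items are assumed to be sorted best first
        cost = approx_tokens(item)
        if cost > budget:
            break  # stop at the first item that no longer fits
        kept.append(item)
        budget -= cost
    return kept

# Three 100-token items against a budget that only has room for two.
items = ["a" * 400, "b" * 400, "c" * 400]
kept = fit_items(items, "q" * 40, limit=250, reserved_for_answer=40)
```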

daveshapautomator · December 20, 2022, 10:41am

The questions you mentioned are not complex because they have no specific context. They are general knowledge questions, even if they are within a specific domain. You can easily finetune a chatbot to do what you posted as an example, so long as you know how to synthesize the data.

But…

If you were to ask “What is the current tort reform in Germany?” this is a complex question because it requires very specialized knowledge.

This cannot be achieved with finetuning, despite what everyone seems to think. This can only be done with semantic search and Q&A. And anyways, semantic search and Q&A is 10,000,000 times cheaper than finetuning (not to mention 10,000,000x faster) so why bother with a bad method?

Cleveland · December 20, 2022, 11:21am

Hi @sergeliatko and @daveshapautomator,

Thank you both for taking the time to answer, it means a lot to me!

It is an interesting approach you suggested @sergeliatko. I don’t fully grasp all the concepts yet, so I might ask a few more questions in the coming days. However, in the documentation about Embeddings under this section, I got a bit confused. I am not sure if this will work in my native language (Danish). GPT-3 and ChatGPT usually perform well in Danish, so maybe the limitations are mostly about mainstream language?
Do you @daveshapautomator agree with the suggested approach? Regarding synthesizing the data, I guess I could go watch your video about Synthesizing Training Data? Do you have more videos/blogs you would suggest?

I also have a lot of “real” world high-quality questions and answers, so perhaps synthesizing won’t be necessary?

/cleveland

daveshapautomator · December 20, 2022, 11:23am

Take some time to watch my videos and look at my code - I have already answered most of your questions and even many questions you do not yet know to ask. I promise you will find my content invaluable, like many thousands of others before you!

sergeliatko · December 20, 2022, 12:01pm

The piece of embedded text to answer the question (source: https://procurement-notices.undp.org/view_file.cfm?doc_id=91630 ) should look like this:

Document ID: 91630
Source: https://procurement-notices.undp.org/view_file.cfm?doc_id=91630
Path: SECTION 09200 | LATH AND PLASTER | PART 3 - EXECUTION | 3.02 INSTALLATION | Installation of Plaster: | 3
Text: Grout hollow metal frames, bases and similar work occurring in plastered areas, with base-coat plaster material, and prior to lathing where necessary. Except where full grouting is indicated or required for fireresistance rating, grout 150 mm lengths at each anchorage.
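One possible way to hold such a piece in code, shown only as a sketch: the FactChunk class and its field names are hypothetical, and the rendered layout simply mirrors the four lines above.

```python
from dataclasses import dataclass

@dataclass
class FactChunk:
    """One labelled piece of a source document (hypothetical structure)."""
    document_id: str
    source: str
    path: str
    text: str

    def to_embedding_text(self) -> str:
        """Render the chunk in the same layout as the example in this post."""
        return (
            f"Document ID: {self.document_id}\n"
            f"Source: {self.source}\n"
            f"Path: {self.path}\n"
            f"Text: {self.text}"
        )

chunk = FactChunk(
    document_id="91630",
    source="https://procurement-notices.undp.org/view_file.cfm?doc_id=91630",
    path="SECTION 09200 | LATH AND PLASTER | PART 3 - EXECUTION | 3.02 INSTALLATION | Installation of Plaster: | 3",
    text="Grout hollow metal frames, bases and similar work occurring in plastered areas.",
)
rendered = chunk.to_embedding_text()
```

Embedding the labels together with the text means a search for a section name can also surface the chunk.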

Once you’ve found this from your embeddings, refer to the document and find the “Grout hollow metal frames, bases and similar work occurring in plastered areas, with base-coat plaster material, and prior to lathing where necessary. Except where full grouting is indicated or required for fireresistance rating, grout 150 mm lengths at each anchorage.”

Grab the context around it from the document:

…
B. Installation of Plaster:

General: Unless otherwise specified, comply with the applicable requirements of BS 5492.

Mechanically mix plaster materials; do not hand mix except where small amounts are needed which require less than one bag of cementitious material.

Grout hollow metal frames, bases and similar work occurring in plastered areas, with base-coat plaster material, and prior to lathing where necessary. Except where full grouting is indicated or required for fireresistance rating, grout 150 mm lengths at each anchorage.

Sequence plaster installation properly with the installation and protection of other work, so that neither will be damaged by the installation of the work.
…

Do the same for the other items.
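Grabbing the neighbouring paragraphs can be sketched like this, assuming the source document has already been split into a list of paragraphs. The function name and the window parameter are illustrative choices, not part of any library.

```python
def surrounding_context(paragraphs: list[str], matched_text: str, window: int = 2) -> list[str]:
    """Return the matched paragraph plus up to `window` neighbours on each side."""
    for index, paragraph in enumerate(paragraphs):
        if matched_text in paragraph:
            start = max(0, index - window)
            return paragraphs[start : index + window + 1]
    return []  # the matched text was not found in this document

# Paragraphs taken from the excerpt quoted above (some shortened).
document = [
    "General: Unless otherwise specified, comply with the applicable requirements of BS 5492.",
    "Mechanically mix plaster materials; do not hand mix except where small amounts are needed.",
    "Grout hollow metal frames, bases and similar work occurring in plastered areas.",
    "Sequence plaster installation properly with the installation and protection of other work.",
]
context = surrounding_context(document, "Grout hollow metal frames", window=1)
```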

Then form the prompt (approximate):

General: Unless otherwise specified, comply with the applicable requirements of BS 5492.
Mechanically mix plaster materials; do not hand mix except where small amounts are needed which require less than one bag of cementitious material.
Grout hollow metal frames, bases and similar work occurring in plastered areas, with base-coat plaster material, and prior to lathing where necessary. Except where full grouting is indicated or required for fireresistance rating, grout 150 mm lengths at each anchorage.
Sequence plaster installation properly with the installation and protection of other work, so that neither will be damaged by the installation of the work.
…<|endofitem|><|endofcontext|>
<|userinquiry|>What are the requirements for grouts between plaster walls and ceilings?<|endofuserinquiry|>
<|reply|>

And run it (even in the playground) to get:

The requirements for grouts between plaster walls and ceilings are specified in item 3 of Section 09200 - Lath and Plaster - PART 3 - EXECUTION - 3.02 INSTALLATION - Installation of Plaster. According to the source document (https://procurement-notices.undp.org/view_file.cfm?doc_id=91630), item 3 states that grout hollow metal frames, bases and similar work occurring in plastered areas, with base-coat plaster material, and prior to lathing where necessary. Except where full grouting is indicated or required for fireresistance rating, grout 150 mm lengths at each anchorage.
<|endofreply|>

I know the answer is not the perfect one (even wrong I would say), because I didn’t have enough context, but you have that context.
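The prompt layout above could be assembled with something like the following sketch. It uses only the markers that appear in this post plus the instruction line quoted earlier in the thread; how the prompt opens before the first item is not shown here, so starting with the instruction line is an assumption, as is the build_prompt name.

```python
# Instruction line quoted earlier in the thread (placing it first is an assumption).
INSTRUCTION = "Provide an informative answer to the user inquiry based on the context items below:"

def build_prompt(items: list[str], inquiry: str, with_instruction: bool = True) -> str:
    """Join context items and the user inquiry using the markers from the example."""
    parts = [INSTRUCTION] if with_instruction else []
    parts.extend(f"{item}<|endofitem|>" for item in items)
    context = "\n".join(parts) + "<|endofcontext|>\n"
    return context + f"<|userinquiry|>{inquiry}<|endofuserinquiry|>\n<|reply|>"

prompt = build_prompt(
    ["General: comply with the applicable requirements of BS 5492.",
     "Grout 150 mm lengths at each anchorage."],
    "What are the requirements for grouts between plaster walls and ceilings?",
)
```

When calling the model, <|endofreply|> would be a natural stop sequence, since the sample replies end with it.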

sergeliatko · December 20, 2022, 12:03pm

another one

According to BS 5492, full grouting is indicated or required for fire-resistance rating. Grout 150 mm lengths at each anchorage when full grouting is not required. <|endofreply|>

sergeliatko · December 20, 2022, 12:06pm

If your real-life questions contain the data needed to form the answer, or rely only on “common sense”, then yes, they will be useful. Otherwise, check whether the model starts to “make up replies”; relying on its “common sense” is very dangerous in your use case.

That section applies to English only (basically its dialects) which is not your case as you’ll be using modern Danish (no dialects).

Cleveland · December 20, 2022, 2:06pm

Thanks again @daveshapautomator and @sergeliatko,

I definitely have a good starting point - much appreciated!

jfanou · December 26, 2022, 6:02am

Hi Cleveland
I am trying to build something similar for the construction industry but I have limited programming knowledge. I would love to know what you end up coming up with.

For others on the forum. I’d be willing to hire some programmers to help me fine tune / train a model for a similar use case. Please DM me if interested.
