Hello! I am developing software for web test automation. It works like this: you compose a test in a graphical interface by chaining several atomic operations (such as “Click”, “Scroll” or “Text typing”), and saving the test produces a JSON file containing the test definition, always structured according to a specific (arbitrary) schema I decided on.
Each atomic operation has inputs and outputs: for example, the “Click” operation has a single input, a string containing an XPath expression that points to the element to click on.
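For illustration (heavily simplified, not my actual schema), a saved test definition might look something like this:

    # Illustrative only: a simplified stand-in for my real test-definition schema.
    # Field and operation names ("operations", "id", "openUrlOperation", ...) are
    # made up for this example.
    example_test_definition = {
        "name": "Open Google and click a button",
        "operations": [
            {"id": "op1", "operation": "openUrlOperation",
             "inputs": {"url": "https://www.google.com"}, "outputs": {"next": "op2"}},
            {"id": "op2", "operation": "clickOperation",
             "inputs": {"xpath": "/some/xpath"}, "outputs": {"next": None}},
        ],
    }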
There are many atomic operations, and every one of them is documented according to a specific standard: a description of the operation (what it actually does), plus a list of the inputs and outputs that operation can be associated with, each with its own description. Something like:
"clickOperation": {
"inputs": [
"xpath": {
"type": string,
"description": "XPath expression pointing to the element to be clicked on"
}
]
"outputs": [
"next": {
"type": "pointer",
"description": "Next operation to perform"
}
]
}
I would like to develop a chatbot module for my software using the OpenAI APIs, so that I can give it natural-language input and receive a test definition as output. The whole operation documentation is about 10k tokens, and I would like to make use of it.
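(To give an idea of the scale, the documentation can be token-counted with something like the snippet below; the file name is just a placeholder.)

    import tiktoken

    # Rough token count of the operation documentation; cl100k_base is the
    # encoding used by recent OpenAI chat and embedding models.
    enc = tiktoken.get_encoding("cl100k_base")
    with open("operation_docs.json", encoding="utf-8") as f:  # placeholder file name
        docs = f.read()
    print(len(enc.encode(docs)))  # comes out around 10k for my documentation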
However, I think the documentation alone is not sufficient for the model to learn how to create test definitions, so I thought about providing some test examples via fine-tuning, something like {"prompt": "Define a test that opens Google and clicks on the button with XPath '/some/xpath'", "completion": "..."}. This would take care of the “training examples” part.
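Concretely, I imagine assembling the training file roughly like this (a sketch only; it uses the legacy prompt/completion JSONL format, and the operation names and completion schema are simplified placeholders):

    import json

    # Each training example pairs a natural-language request with the serialized
    # test definition the model should produce. Schema and operation names below
    # are simplified placeholders, not my real ones.
    examples = [
        {
            "prompt": "Define a test that opens Google and clicks on the button "
                      "with XPath '/some/xpath'\n\n###\n\n",  # fixed separator, as the legacy format suggests
            "completion": " " + json.dumps({
                "operations": [
                    {"operation": "openUrlOperation", "inputs": {"url": "https://www.google.com"}},
                    {"operation": "clickOperation", "inputs": {"xpath": "/some/xpath"}},
                ]
            }),
        },
        # ... more examples covering the other atomic operations
    ]

    with open("training_data.jsonl", "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")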
But how should I provide the documentation, for example the snippet above, for fine-tuning? Since it is not an example of how the model should behave, but rather a corpus of additional information that helps the model interpret the training examples, I don't think it fits a prompt-completion pair.
Should I go for a “hybrid” approach instead: use embeddings to retrieve the relevant documentation, and fine-tuning to teach the model to produce well-formed test definitions? A rough sketch of what I have in mind follows.
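(Minimal sketch, assuming the current openai Python client and numpy; the fine-tuned model id, documentation chunk texts and prompt wording are all placeholders.)

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    # Offline step: embed one documentation chunk per atomic operation and keep the vectors.
    doc_chunks = [
        "clickOperation: clicks the element identified by an XPath expression. Inputs: xpath (string) ...",
        "scrollOperation: scrolls the page ...",
        # ... one chunk per documented operation (placeholder texts)
    ]
    doc_vectors = [
        item.embedding
        for item in client.embeddings.create(model="text-embedding-3-small", input=doc_chunks).data
    ]

    def most_relevant_docs(request: str, k: int = 3) -> list[str]:
        """Return the k documentation chunks most similar to the user's request."""
        q = client.embeddings.create(model="text-embedding-3-small", input=[request]).data[0].embedding
        scores = [np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)) for d in doc_vectors]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        return [doc_chunks[i] for i in top]

    # Online step: put the retrieved documentation into the prompt of the fine-tuned model.
    request = "Define a test that opens Google and clicks on the button with XPath '/some/xpath'"
    context = "\n\n".join(most_relevant_docs(request))
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo-0125:my-org::placeholder",  # placeholder fine-tuned model id
        messages=[
            {"role": "system",
             "content": "You produce test definitions in my JSON schema.\n\nRelevant documentation:\n" + context},
            {"role": "user", "content": request},
        ],
    )
    print(response.choices[0].message.content)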