Background
Using the simple-openai library, I have implemented an OpenAI util method for parsing a court case complaint PDF.
I wrote a Test Case against it and ran it individually multiple times; it passes every time.
Ok, what’s the problem?!
However, when I call this Test Case in a wait-loop:
for (int i = 0; i < 10; i++) {
    WebUI.callTestCase(findTestCase("Test Cases/Unit Tests/Parse PDF File"), null)
    WebUI.delay(5);
}
…it fails after a few iterations, because we somehow get an empty response text back.
Console logs
Here is the console info that I see after running it:
2024-07-17 23:37:50.842 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:37:50.844 INFO c.k.katalon.core.main.TestCaseExecutor - START Test Cases/Unit Tests/Parse PDF File Consistent data
2024-07-17 23:37:51.164 DEBUG testcase.Parse PDF File Consistent data - 1: for ([i = 0, i < 10, (i++)])
2024-07-17 23:37:51.167 DEBUG testcase.Parse PDF File Consistent data - 1: callTestCase(findTestCase("Test Cases/Unit Tests/Parse PDF File"), null)
2024-07-17 23:37:51.326 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:37:51.326 INFO c.k.katalon.core.main.TestCaseExecutor - CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:37:51.419 DEBUG testcase.Parse PDF File - 1: foreclosureCaseModel = GetInstance().parseCourtCasePdf(new java.io.File($WebDriverUtils.GetDownloadDirectory()/ECF Complaint.pdf.pdf))
2024-07-17 23:37:53.106 INFO com.kms.katalon.core.util.KeywordUtil - Thread created with id: thread_7PDKrxEgus9Rc6uj9CFy0Xch
=====>> Thread Run: id=run_ICKoSlz9bmfDVuY3tHhm5U7R, status=QUEUED
Based on the extracted information from the provided PDF, here is the data in JSON format:
```
{
"caseNumber": "49D01-2404-MF-018805",
"owner1Name": "Terra Property QOZ Fund III LLC",
"owner2Name": "Yohan Naraine",
"mailingAddress": "504 Main Street, Beech Grove, Indiana 46107",
"propertyAddress": "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"
}
```
This information was extracted from the following sections of the document:
- Case Number: "49D01-2404-MF-018805"【4:1†source】
- Owner 1 Name: "Terra Property QOZ Fund III LLC"【4:1†source】
- Owner 2 Name: "Yohan Naraine"【4:1†source】
- Mailing Address: "504 Main Street, Beech Grove, Indiana 46107"【4:1†source】
- Property Address: "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"【4:0†source】【4:1†source】
=====>> Thread Run: id=run_ICKoSlz9bmfDVuY3tHhm5U7R, status=COMPLETED
2024-07-17 23:38:08.184 DEBUG testcase.Parse PDF File - 2: assert foreclosureCaseModel.getOwnerFirstName() == "Terra Property QOZ Fund III LLC"
2024-07-17 23:38:08.187 DEBUG testcase.Parse PDF File - 3: assert foreclosureCaseModel.getOwnerLastName()!= "LLC"
2024-07-17 23:38:08.189 DEBUG testcase.Parse PDF File - 4: assert getPropertyAddress().getAddress() == "3133-3135 Sutherland Avenue"
2024-07-17 23:38:08.193 INFO c.k.katalon.core.main.TestCaseExecutor - END CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:08.193 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:08.200 DEBUG testcase.Parse PDF File Consistent data - 2: delay(5)
2024-07-17 23:38:13.236 DEBUG testcase.Parse PDF File Consistent data - 1: callTestCase(findTestCase("Test Cases/Unit Tests/Parse PDF File"), null)
2024-07-17 23:38:13.310 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:13.310 INFO c.k.katalon.core.main.TestCaseExecutor - CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:13.327 DEBUG testcase.Parse PDF File - 1: foreclosureCaseModel = GetInstance().parseCourtCasePdf(new java.io.File($WebDriverUtils.GetDownloadDirectory()/ECF Complaint.pdf.pdf))
=====>> Thread Run: id=run_sPzEKlWWug3PNvT34otPqOre, status=QUEUED
Based on the extracted information from the provided PDF, here is the data in JSON format:
```
{
"caseNumber": "49D01-2404-MF-018805",
"owner1Name": "Terra Property QOZ Fund III LLC",
"owner2Name": "Yohan Naraine",
"mailingAddress": "504 Main Street, Beech Grove, Indiana 46107",
"propertyAddress": "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"
}
```
This information was extracted from the following sections of the document:
- Case Number: "49D01-2404-MF-018805"【8:0†source】
- Owner 1 Name: "Terra Property QOZ Fund III LLC"【8:1†source】
- Owner 2 Name: "Yohan Naraine"【8:1†source】
- Mailing Address: "504 Main Street, Beech Grove, Indiana 46107"【8:1†source】
- Property Address: "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"【8:0†source】【8:1†source】
=====>> Thread Run: id=run_sPzEKlWWug3PNvT34otPqOre, status=COMPLETED
2024-07-17 23:38:23.636 DEBUG testcase.Parse PDF File - 2: assert foreclosureCaseModel.getOwnerFirstName() == "Terra Property QOZ Fund III LLC"
2024-07-17 23:38:23.637 DEBUG testcase.Parse PDF File - 3: assert foreclosureCaseModel.getOwnerLastName()!= "LLC"
2024-07-17 23:38:23.638 DEBUG testcase.Parse PDF File - 4: assert getPropertyAddress().getAddress() == "3133-3135 Sutherland Avenue"
2024-07-17 23:38:23.638 INFO c.k.katalon.core.main.TestCaseExecutor - END CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:23.638 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:23.639 DEBUG testcase.Parse PDF File Consistent data - 2: delay(5)
2024-07-17 23:38:28.646 DEBUG testcase.Parse PDF File Consistent data - 1: callTestCase(findTestCase("Test Cases/Unit Tests/Parse PDF File"), null)
2024-07-17 23:38:28.710 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:28.710 INFO c.k.katalon.core.main.TestCaseExecutor - CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:28.728 DEBUG testcase.Parse PDF File - 1: foreclosureCaseModel = GetInstance().parseCourtCasePdf(new java.io.File($WebDriverUtils.GetDownloadDirectory()/ECF Complaint.pdf.pdf))
=====>> Thread Run: id=run_CfXrjkCnbZl4KyLR2VEqY9oM, status=QUEUED
Based on the extracted information from the provided PDF, here is the data in JSON format:
```
{
"caseNumber": "49D01-2404-MF-018805",
"owner1Name": "Terra Property QOZ Fund III LLC",
"owner2Name": "Yohan Naraine",
"mailingAddress": "504 Main Street, Beech Grove, Indiana 46107",
"propertyAddress": "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"
}
```
This information was extracted from the following sections of the document:
- Case Number: "49D01-2404-MF-018805"【12:0†source】
- Owner 1 Name: "Terra Property QOZ Fund III LLC"【12:0†source】
- Owner 2 Name: "Yohan Naraine"【12:0†source】
- Mailing Address: "504 Main Street, Beech Grove, Indiana 46107"【12:0†source】
- Property Address: "3133-3135 Sutherland Avenue, Indianapolis, Indiana 46205"【12:0†source】【12:0†source】
=====>> Thread Run: id=run_CfXrjkCnbZl4KyLR2VEqY9oM, status=COMPLETED
2024-07-17 23:38:51.293 DEBUG testcase.Parse PDF File - 2: assert foreclosureCaseModel.getOwnerFirstName() == "Terra Property QOZ Fund III LLC"
2024-07-17 23:38:51.296 DEBUG testcase.Parse PDF File - 3: assert foreclosureCaseModel.getOwnerLastName()!= "LLC"
2024-07-17 23:38:51.297 DEBUG testcase.Parse PDF File - 4: assert getPropertyAddress().getAddress() == "3133-3135 Sutherland Avenue"
2024-07-17 23:38:51.299 INFO c.k.katalon.core.main.TestCaseExecutor - END CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:51.299 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:51.300 DEBUG testcase.Parse PDF File Consistent data - 2: delay(5)
2024-07-17 23:38:56.318 DEBUG testcase.Parse PDF File Consistent data - 1: callTestCase(findTestCase("Test Cases/Unit Tests/Parse PDF File"), null)
2024-07-17 23:38:56.382 INFO c.k.katalon.core.main.TestCaseExecutor - --------------------
2024-07-17 23:38:56.382 INFO c.k.katalon.core.main.TestCaseExecutor - CALL Test Cases/Unit Tests/Parse PDF File
2024-07-17 23:38:56.394 DEBUG testcase.Parse PDF File - 1: foreclosureCaseModel = GetInstance().parseCourtCasePdf(new java.io.File($WebDriverUtils.GetDownloadDirectory()/ECF Complaint.pdf.pdf))
=====>> Thread Run: id=run_gXiRstIxDqYR6rz3FLuFlv6M, status=QUEUED
2024-07-17 23:39:06.855 ERROR c.k.katalon.core.main.TestCaseExecutor - ❌ Test Cases/Unit Tests/Parse PDF File FAILED.
Reason:
java.lang.Exception: We got a response back, that doesn't contain the JSON:
''
at me.mikewarren.myCaseScraper.utils.openAI.OpenAIUtils.parseCourtCasePdf(OpenAIUtils.groovy:86)
at Parse PDF File.run(Parse PDF File:8)
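One thing the log suggests: "Thread created with id" is printed only in the first iteration, and the citation markers move from 4: to 8: to 12:, so every iteration appears to add another message and another uploaded copy of the PDF to the same thread. Whether that is related to the empty response is not established. Below is a minimal sketch of how a lazily cached value could be made resettable, so that each call can be tried against a fresh thread. ResettableLazy is a hypothetical helper, not part of simple-openai or Katalon, and the factory here is a plain counter rather than a real thread-creation call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: caches a value the way a lazy getter does, but allows dropping it.
class ResettableLazy<T> {
    private final Supplier<T> factory;
    private T value;

    ResettableLazy(Supplier<T> factory) {
        this.factory = factory;
    }

    // Creates the value on first use and returns the cached one afterwards.
    T get() {
        if (value == null) {
            value = factory.get();
        }
        return value;
    }

    // Forgets the cached value so the next get() builds a new one.
    void reset() {
        value = null;
    }
}

class ResettableLazyDemo {
    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Stand-in factory: a real one would create a thread through the API.
        ResettableLazy<String> thread =
                new ResettableLazy<>(() -> "thread_" + created.incrementAndGet());

        System.out.println(thread.get()); // thread_1
        System.out.println(thread.get()); // thread_1 (cached)
        thread.reset();
        System.out.println(thread.get()); // thread_2 (fresh after reset)
    }
}
```

Calling reset() between iterations of the loop would make every call start from an empty thread, which is one way to check whether the accumulated state matters.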
What does your code look like:
The relevant OpenAIUtils methods look like this:
public ForeclosureCaseModel parseCourtCasePdf(File pdfFile) {
    final String responseText = this.sendConversationMessage(pdfConversationHelper, pdfFile);

    if (responseText.indexOf("```json") == -1)
        throw new Exception("We got a response back, that doesn't contain the JSON:\n\n'${responseText}'")

    return ForeclosureCaseModel.FromJSON(responseText.substring(responseText.indexOf("{\n"),
        responseText.indexOf("```\n")));
}

public String sendConversationMessage(BaseConversationHelper conversationHelper, File file) {
    final String threadId = conversationHelper.getThread().getId(),
        assistantId = conversationHelper.getAssistant().getId();

    openAI.threadMessages()
        .create(threadId, ThreadMessageRequest.builder()
            .role(ThreadMessageRole.USER)
            .content(conversationHelper.getContent())
            .attachment(Attachment.builder()
                .fileId(this.uploadFile(file))
                .tool(AttachmentTool.FILE_SEARCH)
                .build())
            .build())
        .join();

    return this.handleRunEvents(openAI.threadRuns()
        .createStream(threadId, ThreadRunRequest.builder()
            .assistantId(assistantId)
            .build())
        .join(),
        conversationHelper);
}

public String uploadFile(File file) {
    return openAI.files()
        .create(FileRequest.builder()
            .file(Paths.get(file.getPath()))
            .purpose(PurposeType.ASSISTANTS)
            .build())
        .join()
        .getId();
}

/**
 * SOURCE: https://github.com/sashirestela/simple-openai/blob/main/src/demo/java/io/github/sashirestela/openai/demo/ConversationV2Demo.java#L120
 * @param runStream
 */
private String handleRunEvents(Stream<Event> runStream, BaseConversationHelper conversationHelper) {
    String responseText = "";
    runStream.forEach({ event ->
        switch (event.getName()) {
            case EventName.THREAD_RUN_CREATED:
            case EventName.THREAD_RUN_COMPLETED:
            case EventName.THREAD_RUN_REQUIRES_ACTION:
                ThreadRun run = (ThreadRun) event.getData();
                System.out.println("=====>> Thread Run: id=" + run.getId() + ", status=" + run.getStatus());
                if (run.getStatus().equals(RunStatus.REQUIRES_ACTION)) {
                    FunctionExecutor functionExecutor = conversationHelper.getFunctionExecutor();
                    if (functionExecutor == null)
                        throw new IllegalStateException("Somehow, we have a run event that is in the 'REQUIRES_ACTION' state, but no function executor to use for it!");

                    def toolCalls = run.getRequiredAction().getSubmitToolOutputs().getToolCalls();
                    def toolOutputs = functionExecutor.executeAll(toolCalls, { toolCallId, result ->
                        ToolOutput.builder()
                            .toolCallId(toolCallId)
                            .output(result)
                            .build()
                    });
                    def runSubmitToolStream = openAI.threadRuns()
                        .submitToolOutputStream(conversationHelper.getThread().getId(), run.getId(), ThreadRunSubmitOutputRequest.builder()
                            .toolOutputs(toolOutputs)
                            .stream(true)
                            .build())
                        .join();
                    handleRunEvents(runSubmitToolStream, conversationHelper);
                }
                break;
            case EventName.THREAD_MESSAGE_DELTA:
                ThreadMessageDelta msgDelta = (ThreadMessageDelta) event.getData();
                def content = msgDelta.getDelta().getContent().get(0);
                if (content instanceof ContentPartTextAnnotation) {
                    ContentPartTextAnnotation textContent = (ContentPartTextAnnotation) content;
                    final String textValue = textContent.getText().getValue();
                    responseText += textValue;
                    print textValue;
                }
                break;
            case EventName.THREAD_MESSAGE_COMPLETED:
                System.out.println();
                break;
            default:
                break;
        }
    });
    return responseText;
}
and the BaseConversationHelper looks like:
public abstract class BaseConversationHelper {

    protected SimpleOpenAI openAI;
    protected String assistantName, assistantInstructions;
    protected Assistant assistant;
    protected VectorStore vectorStore;
    protected Thread thread;

    public BaseConversationHelper() {
        super();
    }

    public BaseConversationHelper(SimpleOpenAI openAI, String assistantName, String assistantInstructions) {
        super();
        this.openAI = openAI;
        this.assistantName = assistantName;
        this.assistantInstructions = assistantInstructions;
    }

    public Assistant getAssistant() {
        if (this.assistant == null) {
            Assistants assistants = openAI.assistants();
            Assistant existingAssistant = assistants
                .getList()
                .get()
                .find { Assistant assistant -> return assistant.getName() == this.assistantName };
            this.assistant = existingAssistant;
            if (existingAssistant == null) {
                this.assistant = assistants
                    .create(this.createRequest())
                    .join();
                KeywordUtil.logInfo("Assistant was created with id: ${this.assistant.getId()} and name '${this.assistant.getName()}'");
            }
        }
        return this.assistant;
    }

    public AssistantRequest createRequest() {
        AssistantRequestBuilder builder = AssistantRequest.builder()
            .name(this.assistantName)
            .model("gpt-4o")
            .instructions(this.assistantInstructions)
            .tool(AssistantTool.fileSearch())
            .toolResources(ToolResourceFull.builder()
                .fileSearch(FileSearch.builder().vectorStoreId(this.vectorStore.getId()).build())
                .build())
            .temperature(0);
        if (this.functionExecutor != null)
            builder.tools(functionExecutor.getToolFunctions());
        return builder.build();
    }

    public Thread getThread() {
        if (this.thread == null) {
            this.thread = openAI.threads()
                .create(ThreadRequest.builder().build())
                .join();
            KeywordUtil.logInfo("Thread created with id: ${this.thread.getId()}")
        }
        return this.thread;
    }

    public FunctionExecutor getFunctionExecutor() {
        return null;
    }

    public abstract String getContent();
}
Why am I getting an empty response after the third or fourth iteration?
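A possibly relevant detail: in the failing iteration the log shows status=QUEUED but never status=COMPLETED, and handleRunEvents() only reacts to the created, completed and requires-action run events; everything else falls through to default. If the run ended through some other event, responseText would stay empty without any hint as to why. Below is a minimal sketch of surfacing such an event instead of returning an empty string. RunEvent is a hypothetical stand-in for the library's Event class, and the event-name strings follow the OpenAI Assistants streaming event names; how they map to the EventName constants in simple-openai is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RunEventSketch {

    // Hypothetical stand-in for a streamed event: its name plus any text delta it carries.
    static final class RunEvent {
        final String name;
        final String textDelta;

        RunEvent(String name, String textDelta) {
            this.name = name;
            this.textDelta = textDelta;
        }
    }

    // Events that end a run without a completed answer (assumed names, see lead-in).
    static final Set<String> UNSUCCESSFUL_END = new HashSet<>(Arrays.asList(
            "thread.run.failed", "thread.run.cancelled",
            "thread.run.expired", "thread.run.incomplete", "error"));

    // Concatenates message deltas, but fails loudly when the run ends unsuccessfully.
    static String collectText(List<RunEvent> events) {
        StringBuilder text = new StringBuilder();
        for (RunEvent event : events) {
            if (UNSUCCESSFUL_END.contains(event.name)) {
                throw new IllegalStateException("Run ended with event: " + event.name);
            }
            if ("thread.message.delta".equals(event.name)) {
                text.append(event.textDelta);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<RunEvent> failing = Arrays.asList(
                new RunEvent("thread.run.created", null),
                new RunEvent("thread.run.failed", null));
        try {
            collectText(failing);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Run ended with event: thread.run.failed
        }
    }
}
```

In the real handler, the equivalent would be an extra case that logs the run's status and last error, so the console shows what happened after QUEUED.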
UPDATE
Even if I turn the streaming off in my OpenAIUtils.sendConversationMessage() util method:
public String sendConversationMessage(BaseConversationHelper conversationHelper, File file) {
    final String threadId = conversationHelper.getThread().getId(),
        assistantId = conversationHelper.getAssistant().getId();

    openAI.threadMessages()
        .create(threadId, ThreadMessageRequest.builder()
            .role(ThreadMessageRole.USER)
            .content(conversationHelper.getContent())
            .attachment(Attachment.builder()
                .fileId(this.uploadFile(file))
                .tool(AttachmentTool.FILE_SEARCH)
                .build())
            .build())
        .join();

    ThreadRun threadRun = openAI.threadRuns()
        .createAndPoll(threadId, ThreadRunRequest.builder()
            .assistantId(assistantId)
            .build())

    Page<ThreadMessage> messages = openAI.threadMessages()
        .getList(threadId, PageRequest.builder().build(), threadRun.getId())
        .join();

    if (messages.isEmpty())
        throw new Exception("Somehow, our request got back a response with no messages!")

    ContentPart contentPart = messages.first()
        .getContent().first();

    return ((ContentPart.ContentPartTextAnnotation)contentPart).getText().getValue();

    // return this.handleRunEvents(openAI.threadRuns()
    //     .createStream(threadId, ThreadRunRequest.builder()
    //         .assistantId(assistantId)
    //         .build())
    //     .join(),
    //     conversationHelper);
}
I still face the issue!
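In this polling variant, the status of the ThreadRun returned by createAndPoll() is never inspected before the messages are read, so a run that did not complete would look the same as a run with no answer. Below is a minimal sketch of that check, using a plain string and a list in place of the library types. requireText is a hypothetical helper, and the status value "completed" follows the run statuses documented for the OpenAI Assistants API; the matching enum constant in simple-openai is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PolledRunCheck {

    // Returns the first message text, or explains why there is none.
    static String requireText(String runStatus, List<String> messageTexts) {
        if (!"completed".equals(runStatus)) {
            // The run stopped for another reason (failed, expired, incomplete, ...).
            throw new IllegalStateException("Run finished with status '" + runStatus + "'");
        }
        if (messageTexts.isEmpty()) {
            throw new IllegalStateException("Run completed but produced no messages");
        }
        return messageTexts.get(0);
    }

    public static void main(String[] args) {
        System.out.println(requireText("completed", Arrays.asList("{\"caseNumber\": \"...\"}")));
        try {
            requireText("failed", Collections.<String>emptyList());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Run finished with status 'failed'
        }
    }
}
```

Logging the status this way separates "the run never completed" from "the run completed with an empty answer", which call for different fixes.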