Objective: I want to use AI to analyze data from our company system.
Scenario: The user has access to AI directly within our company’s system environment. Communication with OpenAI is exclusively through the API. The user should be able to ask, “What tasks is Paul working on?” and receive an accurate response from the AI.
Approach:
- I initiate a call to an AI assistant. The assistant has a function defined for retrieving data, which it calls.
- Data is retrieved from my database. When this function is called, I export the underlying data (such as tasks from the system) and upload it to the assistant (I have tested the JSON and XLSX formats). In my testing, the source files have about 300 rows and 15 columns.
- The assistant’s response is then relayed back to the user.
- Alongside the file, a description of the table columns is provided to the assistant so it can navigate the data more easily (a minimal sketch of this whole flow follows the list).
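Roughly, the flow looks like the sketch below, written against the openai v1.x Python SDK and the Assistants API beta. The file name, model string, and column list are illustrative placeholders, not my exact values:

```python
import time

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1) Upload the exported data (JSON or XLSX) so the assistant can read it.
data_file = client.files.create(
    file=open("tasks_export.json", "rb"),  # placeholder export path
    purpose="assistants",
)

# 2) Create the assistant; the column descriptions go into the instructions.
assistant = client.beta.assistants.create(
    model="gpt-4-turbo-preview",
    instructions=(
        "You answer questions about tasks in our company system. "
        "The attached file has about 300 rows and 15 columns. "
        "Columns: id, title, assignee, status, due_date, ..."  # column guide
    ),
    tools=[{"type": "retrieval"}],  # lets the assistant search the file
    file_ids=[data_file.id],
)

# 3) Put the user's question on a fresh thread and start a run.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="What tasks is Paul working on?"
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# 4) Poll until the run finishes, then relay the answer back to the user.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)  # newest first
print(messages.data[0].content[0].text.value)
```

The function-calling part of the flow adds a `{"type": "function", ...}` entry to `tools` and, when the run status becomes `requires_action`, submits the exported data with `client.beta.threads.runs.submit_tool_outputs` before polling resumes.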
AI models used: GPT-4 and GPT-4 Turbo in the assistant; both have shown similar outcomes.
Problems Encountered:
- Long Processing Time: A minimum of 60 seconds on the assistant's side, sometimes extending to 5 minutes (excluding data export time).
- High Token Consumption: Between 4,000 and 30,000 tokens per question (a short sketch for reading these numbers per run follows this list).
- Inconsistent Responses: If I ask the same question five times, the AI answers correctly three times; the other two times it replies that it cannot retrieve data from my system at all.
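For anyone reproducing this, per-question latency and token counts like the numbers above can be read from the completed run object, since the Assistants API exposes a usage field and Unix timestamps on runs. A small sketch, assuming the same `client`, `thread`, and `run` objects as above:

```python
# Inspect a completed run for latency and token consumption.
run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
if run.usage is not None:  # populated once the run has completed
    print("prompt tokens:    ", run.usage.prompt_tokens)
    print("completion tokens:", run.usage.completion_tokens)
    print("total tokens:     ", run.usage.total_tokens)
if run.completed_at is not None:
    print("wall time (s):", run.completed_at - run.created_at)  # Unix timestamps
```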
Request for Assistance: I am seeking guidance from anyone who has tackled a similar scenario on how to handle this use case efficiently and effectively.
Thank you very much for any answers.