I have been using OpenAI for domain model drafts for months, but it has become absolutely useless. Even llama 3.1 8b does better. This test speeks for it self. Try running this simple prompt towards what ever model you can think of: “ List of entity names for enterprise grade task system. Only names.” It illustrates quite well how the gpt-4 models quickly slides of topic.
LLAMA 3.1 8b:
Project Manager
Task
User
Role
Department
Team
Status
Priority
Assignee
Deadline
Category
Subtask
Comment
Attachment
Workflow
Template
Report
Dashboard
Calendar
Notification
Claude sonnet 3.5:
Here’s a list of entity names for an enterprise-grade task system:
Task
Project
User
Team
Milestone
Priority
Status
Category
Tag
Comment
Attachment
Workflow
Notification
TimeEntry
Role
Permission
Dashboard
Report
Sprint
Dependency
Assignee
Stakeholder
Deadline
Subtask
Template
OpenAI gpt4o:
TaskMaster
TaskPro
TaskCentral
TaskFlow
TaskManager
TaskPrime
TaskNavigator
TaskFusion
TaskTrack
TaskForge
TaskVision
TaskPulse
TaskSync
TaskLink
TaskWorks
TaskSphere
TaskWise
TaskGrid
TaskCloud
TaskElite
TaskMatrix
TaskHub
TaskLogic
TaskEngine
TaskNet
OpenAI gpt4o-mini:
TaskMaster
TaskWave
TaskForge
TaskVault
TaskLink
TaskStream
TaskSphere
TaskPulse
TaskFusion
TaskNet
OpenAI gpt4 (legacy):
TaskTraverse
TaskForceOne
EnterpriseTaskNet
TaskTitan
TaskMajestic
TaskPioneer
TaskConductor
TaskMasterPro
TaskVertex
TaskEnterpriseHub
Conclusion: Iam paying 25$ a month for a completely useless service. Iam using different llm models for lots of developer related task, and I am about to give up on OpenAI. The only reason iam still paying my subscription is to be in line when OpenAI’s next major version is published. But I am loosing faith that it is ever going to happen. I have in the past experienced better performance from OpenAI. What is happening? I would very much like a trustworthy roadmap from OpenAI. Right now I feel very stupid holding on to my 25$ subscription.
Nice to see you back even if it’s for a complaint.
I ran your prompt with 4o and got 25 results. Not sure if these are the words you are looking for though. What you can try is to deactivate Browsing, DALL-E and maybe the code interpreter in your customizations.
As you see in the very light test I ran, many of the suggestions for entity names are totally of topic on OpenAI models. Other models listed in my example has no hard time meeting request/ staying on topic. When it comes to other more complex domain model suggestions things gets even worse for OpenAI models.
No. I used it in past with no responses like what I get now. It’s degraded. Iam also using the APIwith same results. Have been running workflows I have been running long time ago, that are completely useless now. If the current responses are supposed to be high school grade and next generation of models to deliver phd grade answers I guess iam going to be disappointed. Now that Claude/ Anthropic sonnet 3.5 and LLAMA 3,1 is out, it does really make sense with such kind of excuses, like bad user prompting in cases that are pretty straight forward.
Otherwise, I will leave it at this. It would be possible for me to create a prompt that will elicit the answer for this single question but I don’t think you are interested in a solution.
I ran nikolaimanigoff’s prompt just as he posted it in 4o and got 40 results. That was with browsing, DALL-E and code interpreter activated. His list with every entity prefixed by “Task…” just seems odd.
Here are some entity names for an enterprise-grade task management system:
Task
Project
Milestone
Subtask
User
Team
Role
Permission
Assignment
Status
Priority
Comment
Attachment
Label
Due Date
Time Log
Notification
Template
Tag
Checklist
Recurring Task
Dependency
Audit Log
Activity Stream
Report
Dashboard
Calendar
Reminder
Goal
Resource
Workload
Sprint
Kanban Board
Gantt Chart
Integration
Webhook
API Key
Custom Field
Workflow
Approval
These entities cover a broad range of functionalities typically found in comprehensive task management systems, enabling efficient project and task tracking, user management, and integration capabilities.