GPT ignoring instructions

I’ve had some success creating GPTs for various functions in the past. I’m currently trying to create a set of GPTs to review tests. I have crafted a detailed prompt (with assistance from ChatGPT) that seems pretty comprehensive. However, the GPT simply does not do what it says it will in its summary, and it only ever reviews the first few questions. I’ve tried multiple ‘tactics’ to get it to do what I’ve asked, but none have been successful. I’ve given it a one-shot example but that doesn’t seem to help. Any tips?

This is the latest prompt I’ve used:

"Review the provided draft of the XXX exam, ensuring that your feedback adheres to the specific requirements for structure and content. Your review must be organized in a table format, with distinct sections for each of the following areas:

Overview of the Assessment: Provide a summary evaluation of the entire exam, focusing on its alignment with the curriculum and subject guide, the appropriateness for the target student group, and its overall coherence and suitability.

Question-by-Question Analysis: Analyze each question individually in the exam. For each question, assess and provide feedback on the clarity of the question, its relevance to the programme’s learning objectives, the fairness to all students, and its alignment with the curriculum standards. Ensure that no question is overlooked, and each is scrutinized for its merit and potential impact on students’ understanding and performance.

Markscheme Analysis: Evaluate the markscheme for its clarity, comprehensiveness, and effectiveness in offering a fair and accurate method for evaluating student responses. Discuss the consistency of the marking criteria across different questions and its alignment with the educational outcomes expected from the students.

Present your feedback in a structured table, with clear headings for each section and sub-sections for each question in the ‘Question-by-Question Analysis’. Ensure your analysis is thorough, covering every aspect of the exam draft, and provide actionable suggestions for improvement where necessary.

Carefully review the provided draft of the XXX exam. Your feedback is crucial and must adhere to specific structural and content requirements. Organize your review in a detailed table format, comprising the following key sections:

Overview of the Assessment: In the first section, summarize the exam’s alignment with the curriculum, appropriateness for the target student group, overall coherence, and suitability. This overview should provide a holistic evaluation of the exam’s design and objectives.

Question-by-Question Analysis: Proceed with a meticulous analysis of each individual question on the exam. For every question, create a new row in the table and include detailed feedback on the following criteria, which should form the headings for each row:

The clarity of the question’s wording and presentation.
Its direct relevance to the learning objectives and the subject guide.
The fairness and accessibility of the question to all students, considering diverse learning backgrounds.
Alignment with curriculum standards and the intended educational outcomes.
Suggested enhancements or improvements.

Ensure every single question is accounted for and evaluated in depth, leaving no question unaddressed.

Markscheme Analysis: In another section, assess the markscheme for each question, also in separate rows. Your evaluation should focus on:

The clarity and detail of the marking criteria.
Comprehensiveness in covering appropriate responses and interpretations.
The scheme’s effectiveness in facilitating fair, accurate, and consistent evaluation of student responses.
The alignment of marking standards with the educational goals of the subject guide.
Suggested enhancements or improvements.

Each section of your table should have clear headings, and each question’s analysis should be distinctly separated to ensure clarity and thoroughness. Your review must not only cover every aspect of the exam draft but also provide specific, actionable recommendations for enhancement where needed. Aim for depth in your analysis, substantiating your assessments with clear rationales and references to the curriculum standards where applicable."

I’m really sorry to say this, but apparently what you are trying to do is no longer possible with GPT-4.

A lot of prompts are dysfunctional by now.

Oh wow, thanks for the link to that thread. I’m glad it’s not just me, but this is extremely frustrating and surely detrimental to OpenAI’s future if users are having to jump ship.

Hi and welcome to the Developer Forum.

The challenge with your prompt is the scope of what you are asking the model to perform. You are consolidating a lot of requests into a single prompt instead of breaking them down into smaller, more manageable steps. The issue therefore is not necessarily related to laziness but to how you are asking the model to do the work.

At a minimum, you’d need to break down the work into the four building blocks - i.e. overview of the assessment, question-by-question analysis, markscheme analysis, consolidation of feedback - with each being performed in a separate step. That said, even for the question-by-question analysis you are unlikely to be successful by presenting it with all questions at once. Depending on the complexity of the questions, you might need to process them one by one or in smaller batches.
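
As a rough illustration of the batching idea, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the `CRITERIA` list is paraphrased from the prompt in the original post, and `ask` is a hypothetical stand-in for whatever sends a prompt to the model and returns its reply (an API call, or simply pasting into the chat by hand). It only shows how the questions could be split into small batches with one focused prompt per batch.

```python
from typing import Callable, List

# Hypothetical criteria, paraphrased from the prompt in the original post.
CRITERIA = [
    "clarity of wording and presentation",
    "relevance to the learning objectives and subject guide",
    "fairness and accessibility to all students",
    "alignment with curriculum standards",
    "suggested improvements",
]


def batch(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_question_prompt(questions: List[str], first_number: int) -> str:
    """Build one prompt that covers only the given batch of questions."""
    criteria = "\n".join(f"- {c}" for c in CRITERIA)
    numbered = "\n".join(
        f"Question {first_number + i}: {q}" for i, q in enumerate(questions)
    )
    return (
        "Review ONLY the exam questions below. Produce one table row per "
        "question, with a column for each criterion:\n"
        f"{criteria}\n\n{numbered}"
    )


def review_questions(
    questions: List[str],
    ask: Callable[[str], str],  # hypothetical: sends a prompt, returns the reply
    batch_size: int = 3,
) -> List[str]:
    """Send each batch as its own request and collect the replies in order."""
    replies = []
    next_number = 1
    for group in batch(questions, batch_size):
        replies.append(ask(build_question_prompt(group, next_number)))
        next_number += len(group)
    return replies
```

Keeping the original question numbers in each batch prompt makes it easier to check afterwards that no question was skipped. The batch size of 3 is arbitrary; for long or complex questions a size of 1 may work better.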

Take a look at prompt engineering strategies here if you have not already. A few of these issues are addressed there.

I’m pretty confident that you will be able to achieve all these things if you just restructure the approach and perform the steps sequentially/iteratively rather than all at once.

Thanks very much @jr.2509. I tried a step-by-step approach, but it then failed to combine all of the parts into one report that I could download. (In fact, it’s failed at providing a downloadable document every time I’ve asked for one.) Still, I had reached the conclusion that step-by-step was going to have to be the way forward.
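
One possible workaround for the combining step, sketched here under assumptions rather than as a tested fix: copy the output of each step out of the chat and stitch the report together locally instead of asking the GPT for a download. The section names come from the prompt above; the function names, the Markdown-style `##` headings, and the output path are hypothetical choices for this example.

```python
from pathlib import Path
from typing import Dict

# Section names taken from the prompt; the order is the order of the report.
SECTION_ORDER = [
    "Overview of the Assessment",
    "Question-by-Question Analysis",
    "Markscheme Analysis",
]


def assemble_report(sections: Dict[str, str]) -> str:
    """Join the per-step outputs into one document, in a fixed order."""
    missing = [name for name in SECTION_ORDER if name not in sections]
    if missing:
        # Fail loudly rather than silently producing an incomplete report.
        raise ValueError(f"missing sections: {', '.join(missing)}")
    parts = [f"## {name}\n\n{sections[name].strip()}\n" for name in SECTION_ORDER]
    return "\n".join(parts)


def save_report(sections: Dict[str, str], path: str) -> Path:
    """Write the assembled report to `path` (hypothetical location) as UTF-8."""
    out = Path(path)
    out.write_text(assemble_report(sections), encoding="utf-8")
    return out
```

For example, `save_report(sections, "exam_review.md")` would write a single file, where `sections` maps each section name to the text copied from that step.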

Yeah, unfortunately ChatGPT and custom GPTs still tend to have higher failure rates when creating more complex documents.