Hi all,
I’m working on a school project to build an advanced coding evaluation system that goes beyond simple output matching. Our goal is to assess logic, efficiency, and problem-solving ability in a more nuanced way. I’ve been reading IEEE papers and attended an HPE workshop on LLMs, but I’m not sure yet whether to focus on prompt engineering or on fine-tuning a model. We’re planning to use OpenAI’s o1 model, but it’s just me and a friend, and we have six months to deliver. I believe we can do a great job, but I’m looking for advice from the community on the best approach.
Here’s what we’re planning to implement:
Objective:
• A coding evaluation system that grades not just the output but also the candidate’s logic, efficiency, and problem-solving approach.
Key Features:
• Nuanced Grading:
  • Code Logic and Structure: Assess the logical flow of the code, even with minor syntax errors (e.g., missing semicolons); there’s a parsing sketch after this list.
  • Error Tolerance: Focus on the candidate’s intent rather than penalizing small mistakes.
  • Efficiency: Measure time and space complexity to see how optimized the solution is (empirical measurement sketch below).
  • Problem-Solving Approach: Understand the thought process and award partial credit for sound logic, even if the code doesn’t fully run (rubric-grading sketch below).
• Scoring System (aggregation sketch below):
  • Understanding and Approach (40% of the score): How well the candidate understood the problem and applied an effective method.
  • Efficiency (30%): How optimized the code is.
  • Correctness (30%): How close the solution is to the expected output.
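
To give a sense of what I mean by the structure check and error tolerance, here’s a rough sketch, assuming Python submissions (the function names and the fallback heuristic are just placeholders we made up, not a settled design):

```python
import ast

def assess_structure(source: str) -> dict:
    """Parse the submission and count structural elements as a crude
    proxy for logical flow; degrade gracefully on syntax errors."""
    try:
        tree = ast.parse(source)
        return {
            "parsed": True,
            "functions": sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree)),
            "loops": sum(isinstance(n, (ast.For, ast.While)) for n in ast.walk(tree)),
            "branches": sum(isinstance(n, ast.If) for n in ast.walk(tree)),
        }
    except SyntaxError as err:
        # Parsing failed: instead of scoring zero, estimate how much of
        # the file still parses line by line, so intent isn't lost.
        lines = source.splitlines()
        ok = sum(1 for line in lines if _parses(line.strip() or "pass"))
        return {"parsed": False, "error_line": err.lineno,
                "salvageable_ratio": ok / max(len(lines), 1)}

def _parses(snippet: str) -> bool:
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False
```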
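
For the efficiency part, deriving big-O automatically seems hard, so we’re leaning toward measuring runtime and peak memory empirically on growing inputs and reading the trend. A minimal sketch; `solve` and `make_input` are placeholder names for the candidate’s function and our input generator:

```python
import time
import tracemalloc

def measure(solve, make_input, sizes=(100, 1_000, 10_000)):
    """Run the candidate's function on growing inputs, recording wall
    time and peak memory; the growth trend hints at its complexity."""
    results = []
    for n in sizes:
        data = make_input(n)
        tracemalloc.start()
        start = time.perf_counter()
        solve(data)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append({"n": n, "seconds": elapsed, "peak_bytes": peak})
    return results

# Example run: sorting reverse-ordered lists of growing size.
print(measure(sorted, lambda n: list(range(n, 0, -1))))
```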
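
For the partial-credit / thought-process piece, the current plan is to hand the model a rubric and ask for a structured judgment. A sketch assuming the official OpenAI Python client; the rubric text and JSON schema are just our first guess:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = """You are grading a coding submission. Score each 0-10:
- understanding: did the candidate grasp the problem?
- approach: is the chosen method sound, even if the code doesn't run?
Reply as JSON: {"understanding": int, "approach": int, "rationale": str}"""

def grade_approach(problem: str, submission: str) -> str:
    response = client.chat.completions.create(
        model="o1",  # the model we plan to use; any capable model should work
        messages=[{
            "role": "user",
            "content": f"{RUBRIC}\n\nProblem:\n{problem}\n\nSubmission:\n{submission}",
        }],
    )
    return response.choices[0].message.content
```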
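
Combining everything into the final grade is then just the weighted sum from the breakdown above (each dimension normalized to 0-100 first):

```python
WEIGHTS = {"understanding_and_approach": 0.40,
           "efficiency": 0.30,
           "correctness": 0.30}

def final_score(scores: dict) -> float:
    """Weighted total of the three dimensions, each scored 0-100."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# e.g. strong approach, middling efficiency, partially correct output:
print(final_score({"understanding_and_approach": 85,
                   "efficiency": 60,
                   "correctness": 50}))  # -> 67.0
```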
I’d appreciate any tips or advice for building something like this within our timeline. Based on your experience, what approach would you take?
Thanks in advance!