The Q* project is almost certainly a continuation of this result from May 2023, where OpenAI used reinforcement learning to reduce hallucinations in mathematical reasoning: "Improving mathematical reasoning with process supervision"
The labeled question-answering dataset is published together with the results. So it's up to you whether you want to try to create AGI yourself.