Improving DALL·E at AGI Level 3 for AutoCAD Tasks and Accurate Technical Visualizations

Context:
Hello OpenAI community, I’m Haris, and I recently ran several tests with DALL·E to generate technical visualizations of the Pagani Huayra hypercar.

While the tool is impressive in many respects, I believe several limitations need to be addressed, particularly if we want DALL·E (and similar models) to meet the demands of global industries such as automotive design, engineering, and construction.
I’m sharing my experience here in the hope of sparking a discussion with other developers and researchers about potential solutions. Any feedback, thoughts, or suggestions from the community would be greatly appreciated!

Key Areas for Improvement:

  1. Engine Bay Visualization Inaccuracies:
    During my tests, DALL·E generated engine bay designs that were not technically accurate (e.g., it drew carburetors, which do not belong on the Huayra’s fuel-injected engine). This indicates that while DALL·E can generate detailed visuals, it lacks the depth of technical knowledge needed to consistently create realistic representations of complex machinery. How can we improve DALL·E’s technical knowledge base to handle such tasks better?

  2. Interior Design Errors:
    Similarly, in generating the interior of the Pagani Huayra, DALL·E missed key details like the correct placement of the steering wheel and dashboard layout. It seems that DALL·E struggles to align real-world references with its internal representations. Could this be resolved with better training data or an iterative refinement process?

  3. Power Output Specification Issues:
    DALL·E required manual intervention (PDF uploads) to correct technical specifications (such as 900 horsepower and 770 Nm of torque for the Huayra). This manual process caused delays and left a negative impression on clients. What improvements could automate and streamline the handling of technical data so that accuracy is high right from the start?

  4. Competitiveness with Leading Design Tools:
    Compared with tools such as Microsoft Designer and Google’s Imagen 3 (available through Gemini), DALL·E’s outputs still fall short in realism, image quality, and precision. How can we bring DALL·E to a competitive level, especially in technical design fields?
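
To make point 3 more concrete, here is a minimal sketch of one possible workaround: keeping the specification figures in a structured record and composing the prompt from it, instead of correcting them afterwards through PDF uploads. The names VehicleSpec and build_prompt are hypothetical (they are not part of any OpenAI API), and stating the figures in the prompt does not guarantee that the model renders them correctly. It only keeps the numbers in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleSpec:
    """Hypothetical container for figures that were previously corrected by hand."""
    model: str
    power_hp: int
    torque_nm: int


def build_prompt(spec: VehicleSpec, view: str) -> str:
    """Compose an image prompt that states the specification explicitly."""
    return (
        f"Technical visualization of the {spec.model}, {view}. "
        f"Specification sheet: {spec.power_hp} hp, {spec.torque_nm} Nm torque. "
        "Do not add components that contradict the specification."
    )


# Figures taken from the tests described above.
huayra = VehicleSpec(model="Pagani Huayra", power_hp=900, torque_nm=770)
prompt = build_prompt(huayra, view="engine bay")
print(prompt)
```

The resulting string could then be passed to whichever image generation endpoint is in use. The point is only that the figures come from a single source rather than from manual corrections.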

Request for AGI Level 3 Capabilities:
I believe that at the AGI Level 3 (AI agent) stage, DALL·E and similar AI tools should be capable of handling complex technical tasks, such as AutoCAD operations, to truly meet global industry needs. The ability to generate precise AutoCAD designs would be a game-changer for professionals in engineering, architecture, and related fields. What steps would the community recommend to bridge the gap between where DALL·E is now and AGI Level 3 capabilities for these tasks?

Specific Discussion Points:

  1. Enhanced Learning from Real-World Data:
    How can DALL·E be trained to more accurately interpret and replicate real-world designs from input images or data?
  2. Increased Accuracy in Technical Outputs:
    What strategies (e.g., data refinement, AI-model enhancements) can improve the accuracy of DALL·E’s technical visualizations, especially in complex fields like automotive or architectural design?
  3. Automated Handling of Technical Specifications:
    How can we develop solutions to eliminate the need for manual data inputs like PDF corrections and ensure that DALL·E can autonomously handle technical specifications?
  4. Improved Image Quality to Compete with Market-Leading Tools:
    What enhancements are necessary to bring DALL·E’s visual outputs to the level of professional tools currently dominating the market?
  5. Capability to Handle AutoCAD Tasks:
    How can we integrate the ability to generate AutoCAD designs into DALL·E, and what challenges might we face in making this happen?
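
On point 5, one challenge as I understand it is that DALL·E outputs pixels, while AutoCAD works with vector geometry. As a rough illustration of what a CAD-capable agent would need to emit, here is a minimal sketch that writes a rectangle as LINE entities in DXF, the plain-text exchange format AutoCAD can import. The helper names (dxf_line, dxf_document, rectangle) are my own, and the file contains only an ENTITIES section, which to my understanding many DXF readers accept but which is far from a complete drawing.

```python
def dxf_line(x1, y1, x2, y2, layer="0"):
    """Return one LINE entity as DXF group-code/value pairs."""
    pairs = [(0, "LINE"), (8, layer), (10, x1), (20, y1), (11, x2), (21, y2)]
    return "".join(f"{code}\n{value}\n" for code, value in pairs)


def dxf_document(lines):
    """Wrap LINE entities in a bare ENTITIES section (assumed minimal layout)."""
    body = "".join(dxf_line(*ln) for ln in lines)
    return "0\nSECTION\n2\nENTITIES\n" + body + "0\nENDSEC\n0\nEOF\n"


def rectangle(width, height):
    """Four edges of an axis-aligned rectangle as (x1, y1, x2, y2) tuples."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [corners[i] + corners[(i + 1) % 4] for i in range(4)]


drawing = dxf_document(rectangle(4.0, 2.0))

# Uncomment to save the sketch to disk and open it in a CAD viewer.
# with open("rectangle.dxf", "w") as fh:
#     fh.write(drawing)
```

Every coordinate here is an exact number, which is the kind of precision an image model cannot currently provide and which an agent-level system would need to produce directly.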

Conclusion:
I hope to hear from the community about potential solutions and ongoing efforts to address these issues. I believe that by working together, we can significantly improve DALL·E and other AI tools, bringing them closer to AGI Level 3 and expanding their applicability across a wide range of technical and creative industries. Thank you in advance for your insights!

Looking forward to a collaborative discussion!
Best regards,
Haris