Re Learn and understanding in the field of DALL-E

chieffy99 · February 25, 2024, 3:34pm

DALL·E is not just a visualization tool.

I’ve noticed that DALL·E’s technical development seems to have fallen significantly behind, especially since SORA was launched. Many people may think DALL·E became unnecessary once they saw what SORA can do. I’ve found evidence to the contrary, and it matters all the more because DALL·E acts like a parameter for SORA. The various anomalies highlighted in SORA’s introduction clip come from developmental imperfections, and we can address them collaboratively through DALL·E for several reasons:

Language Understanding: According to OpenAI’s research report “Video generation models as world simulators”, SORA applies the re-captioning technique introduced in DALL·E 3 to videos. DALL·E currently struggles to distinguish left from right. How effectively can SORA operate under that limitation?
Real-Time Learning: ChatGPT learns through RLHF from conversations with users. Many have likely noticed an improvement in how ChatGPT provides feedback, as OpenAI develops RLHF into a more efficient system. This development could almost entirely replace the need for importing training data. Training data is updated only about once a year, yet ChatGPT can still handle new stories, which suggests that widespread use of DALL·E can improve SORA’s language understanding.
Spatial and Dimensional Understanding in ChatGPT-DALL·E: ChatGPT-DALL·E has shown a high level of learning about direction, space, and dimension. Not long ago I had to correct it with “I think you show the image on the wrong side. Can you flip it?” The new image continued from a previous one, and together they simulated the interior of an L-shaped building. If each image represents one arm of the L and the user stands at the corner of the L, the revised second image showed the area outside the window as an open area on the inner side of the L, an area I had not yet depicted or asked for. It was as if the model understood from the overall picture what that area should look like. I initially wondered why AI techniques that sense and understand changes in dimension and direction at this level were not being used to drive vehicles. Discovering the relationship between SORA and DALL·E pointed to why such techniques need to be applied and to what the future of AI from OpenAI might hold. Even so, DALL·E still confuses left and right, which shows that its interpretation and understanding of direction, space, and dimension remain poorly developed.

What we can do is either wait for AI to mature on its own or accelerate it by pushing for more relevant applications. As for how to use those applications, I have prepared the details.

For me, DALL·E has never been just a tool for creating beautiful images. The first thing I learned was to use it as a tool for learning and for developing interpretation and communication skills based on what human eyes perceive.

We have been able to develop techniques for using DALL·E that go beyond gen IDs, seeds, and noise, knowledge that has come to confine us. Outside those confines, you can create dozens of images from a single DALL·E message and develop them for various uses, building an understanding of the resulting images and of how far they have been randomly changed. What hasn’t changed is the need for proper use of gen IDs and appropriate use of GPTs in learning visualization. We haven’t had much discussion or understanding of these things.
