Limitation from resizing

Limitations of Image Processing and Spatial Dimensions in Vision:

Currently, ChatGPT’s Vision-based image processing still has fundamental limitations. Although the model has been trained on large datasets, it cannot always determine object coordinates in an image accurately, even when the image isn’t particularly complex. There are other related limitations too. Some are not hard to work around by adjusting how the model is used, but building these capabilities into the model itself is a different matter. For example, the distance and size of objects the model observes, or the image it receives, may not match the actual file provided by the user.

These limitations arise from several factors, ranging from system issues to various fluctuations that may affect the image. The result is inaccurate coordinate identification and object size calculation for the files, which causes discrepancies in processing. Using a ratio scale can help the model work with the resized image it processes without these problems, which supports applications such as automatic object detection, object identification in images, design, image editing, and document files.
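One way to read the ratio scale idea is to express coordinates as fractions of the image width and height, so that they stay valid when the image is resized. The following is a minimal sketch under that assumption, not a description of how the model works internally. The function names and the (left, top, right, bottom) box layout are hypothetical choices for illustration.

```python
def to_ratio(box, width, height):
    """Convert a pixel box (left, top, right, bottom) to fractions of the image size."""
    left, top, right, bottom = box
    return (left / width, top / height, right / width, bottom / height)


def to_pixels(ratio_box, width, height):
    """Map a ratio box back to pixel coordinates for an image of the given size."""
    left, top, right, bottom = ratio_box
    return (round(left * width), round(top * height),
            round(right * width), round(bottom * height))


# A box reported on a 512x384 downscaled copy, mapped onto the 2048x1536 original.
ratio_box = to_ratio((128, 96, 256, 192), 512, 384)
original_box = to_pixels(ratio_box, 2048, 1536)
print(ratio_box)
print(original_box)
```

Because the ratio box does not depend on pixel size, the same values can be applied to any resized copy, as long as the aspect ratio is unchanged.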

One further caution: another limitation that can produce incorrect distance or size is the dimensional twist. This is more than a roll or flip, because it alters the spatial reference points, so measurements that were valid before become inaccurate. Increases or decreases along x and y can be described as top, bottom, left, and right from our perspective. However, that doesn’t mean x and y can’t be twisted.
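As a rough illustration only, and assuming the twist can be modelled as the x and y axes trading places (for example a 90 degree rotation), the sketch below contrasts it with a flip. A flip keeps the width and height of a box, while the rotation swaps them, so a measurement taken along x no longer describes the same dimension. The function names are hypothetical, and coordinates are ratios with the origin at the top left.

```python
def flip_horizontal(point):
    """Mirror a ratio-coordinate point left to right."""
    x, y = point
    return (1.0 - x, y)


def rotate_90_clockwise(point):
    """Rotate a ratio-coordinate point 90 degrees clockwise (origin at top left)."""
    x, y = point
    return (1.0 - y, x)


def extent(p, q):
    """Return (width, height) of the box spanned by two corner points."""
    return (abs(p[0] - q[0]), abs(p[1] - q[1]))


corner_a, corner_b = (0.125, 0.25), (0.625, 0.5)

original = extent(corner_a, corner_b)
flipped = extent(flip_horizontal(corner_a), flip_horizontal(corner_b))
rotated = extent(rotate_90_clockwise(corner_a), rotate_90_clockwise(corner_b))

print(original, flipped, rotated)
```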

This method was developed to compensate for the gap between the model’s current capabilities and what is needed. It has been authorized for disclosure by OpenAI after concerns were addressed. The model can still make errors when recognizing the distance or size of objects, and if the results are used carelessly, for example in tool control, they can cause harm to users and those around them. Once the model is capable of solving these issues on its own, this method will no longer be needed.

Prompt: Use the ratio scale instead of the system-provided image scale.


I would say you are thinking at least 2 years ahead. You’ve got to use something like YOLO…

Research papers on this say that the best results can be achieved with a CNN, but I am not really pleased with that and therefore have to use a hybrid solution.

I am combining the results from a CNN with the measurements in floor plan drawings: I take the measurements whose bounding boxes are closest to an identified wall, and then have to recalculate the dimensions and positions of all polygons from that.
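The recalculation step could look roughly like the sketch below. It is not the actual pipeline, only an illustration that assumes one wall has already been matched to a labelled length. All names and the sample coordinates are hypothetical.

```python
import math


def metres_per_pixel(wall_start, wall_end, labelled_length_m):
    """Derive a scale factor from one wall with a known real-world length."""
    pixel_length = math.dist(wall_start, wall_end)
    return labelled_length_m / pixel_length


def scale_polygon(polygon, factor):
    """Convert a polygon from pixel coordinates to metres."""
    return [(x * factor, y * factor) for x, y in polygon]


# Hypothetical wall detected from (100, 50) to (550, 50), labelled 4.5 m.
scale = metres_per_pixel((100, 50), (550, 50), 4.5)

# Hypothetical room outline in pixels, rescaled with the same factor.
room_px = [(100, 50), (550, 50), (550, 350), (100, 350)]
room_m = scale_polygon(room_px, scale)
print(scale, room_m)
```

A single reference wall gives one global scale. With several labelled walls, averaging the factors or checking them against each other would help catch a wrong match.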

I was also thinking about using things like toilets and bathtubs, which have more or less standardized sizes, to make assumptions about the surrounding walls as well. Let’s see how much I can prioritize that project in the future.

But I see we are going in the same direction on that.

I’m not sure what you mean.

Is it in this picture?




I am talking about taking the labels that contain values like 4.5m, which define the length of a feature.
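Reading such a label could be sketched like this, assuming the label text has already been extracted (for example by OCR) and uses a plain number followed by a metric unit. The pattern, unit table, and function name are hypothetical.

```python
import re

# Matches labels such as "4.5m", "450 cm" or "4500mm".
LABEL_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(mm|cm|m)\s*$")
TO_METRES = {"mm": 0.001, "cm": 0.01, "m": 1.0}


def parse_length_label(text):
    """Return the labelled length in metres, or None if the text is not a length."""
    match = LABEL_PATTERN.match(text)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * TO_METRES[unit]


print(parse_length_label("4.5m"))
print(parse_length_label("450 cm"))
print(parse_length_label("Bathroom"))
```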

This is why I was saying it’s not an AI problem :sob: it’s too soon :disappointed:

What do you want from the model: processing through to output, or output only? You use a CNN together with measurements to create a floor plan. My case is different from yours, and the topics I mentioned are not about that. It is not required in my floor plan.