Hi, has anyone tried to use the vision capabilities to perform Human pose estimation? Traditionally, this is usually done with models like OpenPose, Mediapipe, PoseNet, etc and involves taking in an person image/video and identifying body landmarks like shoulders, hips, ankles, knees, face, etc. I have been looking to see if the vision capabilites of open a are able to perform this but if you have tried it please do let me know
Related Topics
Topic | Replies | Views | Activity | |
---|---|---|---|---|
Did anyone try new gpt4 o model for text extraction from an image? | 2 | 1212 | June 10, 2024 | |
GPT using vision capabilities for images returned from actions? | 4 | 779 | March 18, 2024 | |
Can a train a gpt-4 vision model on a dataset of images? | 15 | 4322 | June 3, 2024 | |
GPT4O finetuning with vision capabilities | 2 | 626 | July 24, 2024 | |
Does gpt4-o has a image recognising technology | 0 | 250 | June 19, 2024 |