Interesting Research: PIGEON, an AI-based location identifier


Hey all!

We’ve been sharing recent research from companies like Anthropic, which published examples to help with long-context conversations. Google also recently released its own research on an LLM’s ability to handle higher-level abstract mathematics.

Today, I just noticed this NPR article about a student project called PIGEON. It uses CLIP, OpenAI’s text and image neural network, amongst other things. I find it pretty relevant to a lot of the discussions I’ve seen around the subject.

This project is particularly interesting because it also leverages the GeoGuessr game: the students wanted to see whether they could build an AI that plays the game better than a human. Surprisingly, it beat Trevor Rainbolt, one of the game’s best players, in multiple rounds.

Now, I’d also like to point out that we don’t know precisely how they supplemented CLIP to help it identify locations around the map, so this might be far less impressive if it was able to read metadata. Most people, even the author of this article, seem to forget that geolocation data is still embedded in a photo (and is likely to be analyzed by something like GPT-V). However, what we do know is that CLIP does not read metadata, so if this AI can accurately identify locations around the world from a single photo without metadata, I find that extremely interesting, at least personally.
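For anyone who wants to check the metadata angle themselves, here is a rough sketch of how embedded GPS fields turn into map coordinates. EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere letter (the `GPSLatitude`, `GPSLatitudeRef`, `GPSLongitude`, and `GPSLongitudeRef` tags). The helper names `dms_to_decimal` and `gps_from_exif` and the sample values are made up for illustration, and the sketch assumes you have already pulled the GPS tags into a plain dict with whatever EXIF reader you prefer. It says nothing about how PIGEON itself works.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter
    into signed decimal degrees. South and West are negative."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


def gps_from_exif(gps_tags):
    """Return (latitude, longitude) from a dict of EXIF GPS tags,
    or None when the photo carries no usable location."""
    required = ("GPSLatitude", "GPSLatitudeRef",
                "GPSLongitude", "GPSLongitudeRef")
    if not all(key in gps_tags for key in required):
        return None
    lat = dms_to_decimal(gps_tags["GPSLatitude"], gps_tags["GPSLatitudeRef"])
    lon = dms_to_decimal(gps_tags["GPSLongitude"], gps_tags["GPSLongitudeRef"])
    return lat, lon


# Hypothetical tags, as a photo with location data might carry them.
sample_tags = {
    "GPSLatitude": (35, 40, 12),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (139, 46, 12),
    "GPSLongitudeRef": "E",
}

print(gps_from_exif(sample_tags))  # a point in central Tokyo
print(gps_from_exif({}))           # None: no metadata to read
```

If `gps_from_exif` returns `None` for a photo, any location guess has to come from the pixels alone, which is the case that makes a project like this interesting.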

Enjoy.

6 Likes

You might want to explore the service from picarta.ai as well.

Appreciate it

Nice! I would be pretty entertained if someone entered that AI into the GeoGuessr World Cup :rofl:

(Applications open on January 5th)

4 Likes

Oh lord…

Okay, so I’d recommend reading the actual article and removing your image in this post. It was an experimental AI that used OpenAI tools; it is not ChatGPT-vision.

Please don’t post images of yourself or someone else on a public forum like this unless it’s some topic directly related to sharing such pictures. While not against community guidelines, I personally get worried about posting this kind of stuff willy-nilly.

So far, I’m not aware of any publicly-released AI that achieves what this article is talking about. You could probably look at the metadata though.

3 Likes

Wow, this took me a while, I will admit. But I eventually pinned it down. I’m sorry if I butcher the name, but this picture was taken on Kobikicho Dori street in Ginza, Tokyo. The photo comes from a Japanese Ameba blog post titled “Discover the downtown area in the back alleys of Ginza, Walking around Higashi Ginza area”. I pinned it down by first identifying the country, then trying to narrow down the spot using an AI called picarta. Sadly, it didn’t lead me to this spot and only gave me the coordinates of the direct center of Tokyo.

From there, I couldn’t figure out how to copy the text on the red banner, so I asked Google’s new AI, Gemini, what it said, and it said post office. I went to 3 Chome (either some tower or a neighborhood), which I was looking at because of some information in the blog, and found a post office nearby. Looking on Street View, I captured this image to prove I found it.


Where is it in Iran?