GPT-4o mini looks at your screen and generates logs of your day

the code is here: screen-pipe/examples/typescript/daily-log at main · louis030195/screen-pipe · GitHub

works with OpenAI or Ollama


Very nice demo! How much did it cost per day?


you can use it for free with ollama

What hardware requirements does that have?


i have a MacBook Pro M3 with 32 GB of RAM and my Mac is chill, not suffering

you can probably try smaller models like phi3
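For anyone wondering what switching between the two looks like, here is a minimal sketch, not the code from the example repo. It assumes you talk to both providers through the OpenAI-style chat completions endpoint, which Ollama also serves locally on port 11434. The names `Provider`, `ChatRequest` and `buildChatRequest` are made up for illustration, and the model names are just the ones mentioned in this thread.

```typescript
// Hypothetical helper: builds the HTTP request for either provider.
type Provider = "openai" | "ollama";

interface ChatRequest {
  url: string;
  headers: { [name: string]: string };
  body: string;
}

function buildChatRequest(
  provider: Provider,
  prompt: string,
  apiKey?: string
): ChatRequest {
  const headers: { [name: string]: string } = {
    "Content-Type": "application/json",
  };
  let url: string;
  let model: string;

  if (provider === "openai") {
    // Hosted and paid per token: an API key is required.
    if (!apiKey) {
      throw new Error("OpenAI needs an API key");
    }
    url = "https://api.openai.com/v1/chat/completions";
    model = "gpt-4o-mini";
    headers["Authorization"] = "Bearer " + apiKey;
  } else {
    // Local and free: Ollama's OpenAI-compatible endpoint, no key needed.
    url = "http://localhost:11434/v1/chat/completions";
    model = "phi3"; // assumed to be pulled already with `ollama pull phi3`
  }

  const body = JSON.stringify({
    model: model,
    messages: [{ role: "user", content: prompt }],
  });
  return { url, headers, body };
}
```

You would then send it with something like `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`. Keeping the request builder separate from the network call makes it easy to swap providers or models without touching the rest of the logging loop.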


added an example to build an AI time tracker that looks at everything you’ve done, said, and heard last week and gives you insights and tips:

link: screen-pipe/examples/typescript/daily-tracker at main · louis030195/screen-pipe · GitHub

if you want i can help you set this up

I’m against this concept completely as it will become an inevitable surveillance tool by employers.

What happens if the AI hallucinates what it’s seeing, or minimizes the actual work being put in?

In your screenshot you apparently spent an hour scrolling to download, install & upload a file. An hour. Sure, there are reasons, but this will be the exact reporting that HR would give.

073024 - 9.05 AM - 9.10 AM - Emails & LinkedIn

       - 3 min: Scrolled emails 
       - 2 min: Opened LinkedIn and scrolled posts

Your entries are heavily influenced by the example you provided in the prompt.

So I worry. Your code uses a 5-minute interval, which is far too coarse for actually understanding what a user is doing. If I am coding and it takes a screenshot at 00:05 and another at 00:10, and I’m on the same page, will it say “Scrolling through IDE”? I’m betting yes.

I want to see some differentiation between screenshots, and the captures would need to be seconds apart, not minutes. You have provided a constant LOGGING_INTERVAL variable, sure, but I would want more depth in these intervals. There is a massive difference in context between logs ~5s apart and logs ~5m apart.
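To make the ask concrete, here is a rough sketch of the kind of differentiation I mean. It is not taken from the repo. It assumes each capture is just a timestamp plus the text found on screen, taken every few seconds, and merges consecutive captures that look alike into one span with a real duration. `Capture`, `Span`, `similarity` and `collapseCaptures` are hypothetical names, and the 0.8 threshold is an arbitrary starting point.

```typescript
// Hypothetical shape of one capture: when it was taken and what text was on screen.
interface Capture {
  timestampMs: number;
  text: string;
}

// One logged span: a run of consecutive captures that barely changed.
interface Span {
  startMs: number;
  endMs: number;
  text: string;
}

// Lowercased, de-duplicated words of a string.
function uniqueWords(s: string): string[] {
  const words = s.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Jaccard similarity of the two word sets: 1 means identical, 0 means disjoint.
function similarity(a: string, b: string): number {
  const wa = uniqueWords(a);
  const wb = uniqueWords(b);
  if (wa.length === 0 && wb.length === 0) return 1;
  const shared = wa.filter((w) => wb.indexOf(w) !== -1).length;
  return shared / (wa.length + wb.length - shared);
}

// Merge consecutive look-alike captures so a static IDE window becomes one
// long span instead of a series of separate "scrolling" entries.
function collapseCaptures(captures: Capture[], threshold = 0.8): Span[] {
  const spans: Span[] = [];
  for (const c of captures) {
    const last = spans[spans.length - 1];
    if (last && similarity(last.text, c.text) >= threshold) {
      last.endMs = c.timestampMs; // same activity: extend the current span
    } else {
      spans.push({ startMs: c.timestampMs, endMs: c.timestampMs, text: c.text });
    }
  }
  return spans;
}
```

With captures every 5 seconds, three identical IDE frames followed by an email frame collapse into two spans, and only those two would need to go to the model. The comparison is against the first capture of each span, so slow drift eventually starts a new span instead of being absorbed forever.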

All I see here is haphazardly slapping AI onto screenshots. I want MOAR. How does it manage multiple windows open? Different monitors? I think these are easy questions but I don’t see any answers. I want depth but only see puddles of information.

I do respect the hustle. Drug dealer mentality: if I don’t do it, someone else will. But I really worry about the accuracy of this tool and the greater implications if it were to become standardized in remote working.


Hey thanks a lot for the feedback, this project just started a month ago and we got 800 stars now, trying to make everyone happy, shipping daily.

“I’m against this concept completely as it will become an inevitable surveillance tool by employers.”

Why do you think so?

Again, this example is very basic and does have a lot of flaws, but it’s very low-hanging fruit; even a non-programmer can make it better through ChatGPT

Added a Perplexity-like agent to query your 24/7 screen & audio recording data with more relevancy:

code here

open source Rewind AI that works on Linux and Windows :point_down:

for education/learning: