🎥 Surveillance Video Summarizer: AI-Powered Video Analysis and Summarization jni

kondaraviteja1 · September 13, 2024, 10:44am

Hey everyone!

I’ve been working on a VLM-driven system that processes surveillance videos, extracts frames, and generates detailed annotations to highlight notable events, actions, and objects. The system is powered by a Florence-2 Vision-Language Model (VLM) that I fine-tuned on the SPHAR dataset. It also uses the OpenAI API to summarize and extract the most relevant content, producing a comprehensive and coherent overview of the surveillance footage.

How it Works:

Frame Extraction: Extracts frames from video files at regular intervals using OpenCV.
AI-Powered Annotation: Each frame is analyzed by the fine-tuned Florence-2 model, generating accurate annotations of the scene.
Data Storage: Annotations and frame data are stored in a SQLite database for easy retrieval and future analysis.
Gradio-Powered Interface: Interact with the system through a Gradio-based web interface. By specifying a time range, you can retrieve detailed logs with comprehensive analysis. The interface uses the OpenAI API to summarize video content, and it keeps the summary temporally coherent by analyzing the frames in sequence, which gives a more context-aware account of the events captured in the footage.
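
For anyone who wants to picture the pipeline, here is a rough sketch of the sampling, storage, and time-range lookup steps listed above. It is not the project's actual code: the function names, the `annotations` table layout, and the prompt wording are all assumptions made for illustration, and the Florence-2 and OpenAI calls are left out entirely. Only `cv2` and the standard-library `sqlite3` module are used.

```python
import sqlite3


def extract_frames(video_path, every_n_seconds=1.0):
    """Yield (timestamp_seconds, frame) pairs sampled at a fixed interval."""
    import cv2  # imported lazily so the storage helpers work without OpenCV

    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # assumed fallback if FPS is unreported
    step = max(1, round(fps * every_n_seconds))
    index = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % step == 0:
                yield index / fps, frame
            index += 1
    finally:
        cap.release()


def open_store(db_path=":memory:"):
    """Open a SQLite database and create the (hypothetical) annotations table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotations ("
        "id INTEGER PRIMARY KEY, video TEXT NOT NULL, "
        "ts REAL NOT NULL, annotation TEXT NOT NULL)"
    )
    return conn


def save_annotation(conn, video, ts, annotation):
    """Store one frame annotation with its timestamp in seconds."""
    conn.execute(
        "INSERT INTO annotations (video, ts, annotation) VALUES (?, ?, ?)",
        (video, ts, annotation),
    )
    conn.commit()


def annotations_between(conn, video, start_ts, end_ts):
    """Return (ts, annotation) rows in the inclusive time range, oldest first."""
    rows = conn.execute(
        "SELECT ts, annotation FROM annotations "
        "WHERE video = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (video, start_ts, end_ts),
    )
    return rows.fetchall()


def build_summary_prompt(rows):
    """Join time-ordered annotations into a single text prompt for a summarizer."""
    lines = [f"[{ts:.1f}s] {text}" for ts, text in rows]
    return "Summarize these frame annotations in chronological order:\n" + "\n".join(lines)
```

Keeping the rows ordered by timestamp before building the prompt is one simple way to preserve the sequence of events for the summarization step; the real system may handle this differently.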

aamini · September 14, 2024, 6:32am

Interesting.
I am new to all this but really interested in this topic (VLMs and security/safety).
Is this project a home-grown activity or part of a bigger project?
Thanks,
-Afshin

kondaraviteja1 · September 14, 2024, 2:14pm

Hi,
Thank you for the response.

This project is a homegrown activity.
