This is my first open-source project since I started working with large language models. It’s still pretty simple, but every star means a lot to me. Thanks for your support!
Hi and welcome to the community!
Can you tell us a little bit more about your project?
What are your motivation and goals? What exactly have you built using the OSS models?
Intelligent Database Query System
Supports database metadata management, table structure mapping, vectorized storage, and natural language intelligent querying powered by large language models. The system can automatically generate SQL and visualize query results.
Features
- Database Metadata Retrieval: Supports MySQL, Oracle, PostgreSQL, DM (Dameng), and Kingbase databases.
- Intelligent Table Structure Mapping: Extracts Chinese table descriptions from user queries and automatically maps them to actual database table names and fields.
- Vectorized Knowledge Base Construction: Vectorizes table descriptions and stores them in Milvus to enable semantic retrieval.
- Natural Language Intelligent Querying: Users can ask questions in natural language, and the system automatically generates SQL and returns query results.
- Query Result Visualization: Supports Markdown table rendering and statistical chart generation.
- Multi-Stage Exception Handling: Automatically handles SQL execution errors, system errors, and function call exceptions to ensure query stability.
- Security Mechanism: Automatically rejects operations that may compromise data (e.g., insert, update, delete, or drop table operations).
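To make the security mechanism concrete, here is a minimal sketch of a read-only check that could run on generated SQL before execution. It is not the project's actual code: the function name `is_read_only` and the keyword list are assumptions for illustration, and the check is deliberately conservative (it may reject a harmless query whose string literal contains a blocked word).

```python
import re

# Hypothetical deny-list based on the operations named in the post.
BLOCKED_KEYWORDS = ("insert", "update", "delete", "drop")
_BLOCKED_RE = re.compile(r"\b(?:" + "|".join(BLOCKED_KEYWORDS) + r")\b", re.IGNORECASE)
_READ_START_RE = re.compile(r"\s*(?:select|with)\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Return True only if the text looks like a single read-only statement."""
    # Ignore trailing semicolons, then refuse anything that still contains one,
    # since that would mean more than one statement.
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False
    # Only SELECT queries (optionally starting with a WITH clause) are allowed.
    if not _READ_START_RE.match(statement):
        return False
    # Reject any blocked keyword appearing as a whole word anywhere in the text.
    return _BLOCKED_RE.search(statement) is None


print(is_read_only("SELECT name FROM users WHERE age > 30;"))
print(is_read_only("SELECT 1; DROP TABLE users"))
```

A keyword filter like this is only a first line of defense. Connecting with a database account that has read-only privileges would enforce the same rule at the database level.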
Technical Highlights
- Embedding Model: Uses the Qwen3-Embedding-0.6B model (locally deployed via Sentence-Transformers) for table structure vectorization.
- Language Generation Model: Uses the GPT-OSS-20B model (locally deployed) for natural language understanding and SQL generation.
- Message Stream Parsing: Uses the OpenAI Harmony format and StreamableParser for streaming generation and incremental output.
- Database & Vector Database Support: Compatible with MySQL, Oracle, PostgreSQL, DM (Dameng), and Kingbase; vector data is stored in Milvus for fast semantic retrieval.
- Frontend Interaction: Built with Gradio Blocks + ChatInterface.
- Security Strategy: Filters out high-risk database operations at the system level to ensure query safety.
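The semantic retrieval step can be illustrated with a small sketch. In the project, embeddings come from Qwen3-Embedding-0.6B and are searched in Milvus. The example below swaps in hand-written toy vectors and an in-memory dictionary so that it runs without either. The table names, vectors, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity to anything
    return dot / (norm_a * norm_b)


def top_k_tables(query_vec, table_index, k=2):
    """Return the names of the k tables whose description vectors best match the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in table_index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy stand-ins for embedded table descriptions (real embeddings have many more dimensions).
TABLE_INDEX = {
    "t_order": [0.9, 0.1, 0.0],
    "t_customer": [0.1, 0.9, 0.1],
    "t_product": [0.0, 0.2, 0.9],
}

# Toy stand-in for the embedded user question.
query = [0.8, 0.3, 0.0]
print(top_k_tables(query, TABLE_INDEX, k=2))
```

The retrieved table names and their schemas would then be passed to the language model as context for SQL generation, which keeps the prompt small even when the database has many tables.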
My Motivation
- I wanted to enable database queries through natural language and automatically generate statistical reports, thereby lowering the technical barrier for users.
- I have a strong personal interest in large language models (LLMs). I’ve always wanted a concrete project to learn and build intelligent agents based on LLMs. This project aligns closely with my daily work, making it a natural choice.
- While there are already several open-source projects with similar functionality, I intentionally chose to “reinvent the wheel” to learn by doing. My goal was not only to study LLM-based agents but also to truly build a system that supports queries across different databases, reduces development complexity, and extends support to more database types.
For this reason, I decided to develop and open-source this project.
During my exploration, I found that GPT-OSS was particularly developer-friendly. The Harmony message format made message parsing and formatting straightforward, resulting in a smooth development experience. Additionally, GPT-OSS offers a low-cost deployment option since it requires relatively modest hardware resources. These advantages made GPT-OSS the natural choice to power my db_llm project.
Well done for launching your project as open source and sharing it here.
Love what you’re doing. We need more like this. I was thinking of doing something similar when I retire and have more time.
Thanks, I’ll keep working hard.