Building a role-aware guardrail layer between LLMs and production databases

I ran into a problem while building a chat-first app that lets users query real production data with LLMs:

Once an LLM can talk to a database, access control gets blurry fast. Traditional RBAC lives in the application layer, but LLM prompts don’t naturally respect roles, field-level permissions, or audit requirements. Early on, I saw how easy it was to accidentally expose PII or internal-only fields just by slightly changing a prompt.

To address this, I built a middleware layer that sits between the LLM and data sources and enforces:

  • Role-aware field redaction (based on user + organization context)

  • Query-time filtering instead of post-response cleanup

  • Explicit allow/deny rules for tables, columns, and joins

  • Full audit logging of what the model attempted to access vs what it actually received
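To make the allow/deny idea concrete, here's a minimal sketch of role-aware column filtering. All names and the rule structure here are hypothetical, not the actual implementation:

```python
# Hypothetical sketch of role-aware allow/deny rules for tables and columns.
# Deny-by-default: anything not explicitly allowed for the role is redacted.

POLICY = {
    "support": {
        "customers": {"allow": {"id", "name", "plan"}},  # no email, no ssn
        "invoices": {"allow": {"id", "amount", "status"}},
    },
    "admin": {
        "customers": {"allow": {"id", "name", "plan", "email"}},
    },
}

def filter_columns(role: str, table: str, requested: set) -> tuple:
    """Return (granted, denied) column sets for a role/table pair."""
    allowed = POLICY.get(role, {}).get(table, {}).get("allow", set())
    granted = requested & allowed
    denied = requested - allowed  # denied set goes to the audit log
    return granted, denied

granted, denied = filter_columns("support", "customers", {"name", "email", "ssn"})
# granted == {"name"}; denied == {"email", "ssn"}
```

The deny-by-default posture is what makes the audit story tractable: every denied column is an explicit, loggable event rather than a silent omission.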

One of the bigger challenges was balancing developer ergonomics vs safety. Too many rules make the system unusable; too few rules make it dangerous. Another surprise was how often “harmless” system prompts bypassed assumptions I thought were safe.

I also learned that chat UX matters a lot here — users trust conversational answers more than dashboards, which makes guardrails even more important.

I ended up turning this into a hosted tool so teams could plug it in without rebuilding the same logic repeatedly, but I’m more interested in the design discussion than promotion.

Curious how others are handling:

  • Field-level permissions with LLMs

  • Preventing prompt-based data exfiltration

  • Auditing AI access in regulated environments

Happy to share more details or lessons learned if helpful.

LINKS

Hosted page: https://guardraillayer.com/

GitHub (updates on the way): https://github.com/tyoung1996/guardrail-layer (Guardrail Layer: open-source AI data privacy firewall to redact, audit, and protect your database from LLM leaks)

Live demo (WIP): guardrail-frontend


For those curious, it's called Guardrail Layer.

There's a hosted version and an open-source one!


Thanks. Can you tell us more about your project?

I’ve put you in the Community category with a project tag. Please keep us updated on this thread, so it’s easy to find.

Thanks again, and welcome to the community!


Simplify if the application can be made simpler: abstract the database operations the AI needs behind tailored functions that perform the query. You write the SQL; the AI writes to the interface methods.


Thanks Paul!

Guardrail Layer came out of trying to safely connect LLMs to real production data. I kept running into cases where traditional RBAC broke down once queries were driven by natural language, and it became too easy to expose fields that shouldn’t leave the database.

The tool sits between the model and the data layer and enforces query-time rules around tables, columns, joins, and user/org context, with full audit logging of what the model tried to access vs what it actually received.

There’s both an open-source version and a hosted option, but I’m mainly here to learn how others are approaching permissions, leakage prevention, and audits in LLM-powered systems. Happy to share details if helpful.

That approach absolutely works for a lot of use cases, and I agree it’s the safest baseline. If the AI only ever calls tightly scoped, prewritten functions, you inherit the same guarantees as a normal application API.

Where I started to hit limits was in systems where:

• Queries are genuinely ad-hoc (analytics, support, internal tooling)

• Different users/roles need different visibility into the same underlying data

• The set of “allowed questions” isn’t known upfront

In those cases, the function layer either explodes combinatorially or quietly turns into a thin wrapper around raw SQL anyway. That’s where enforcing rules at query-time (columns, joins, row filters) felt more scalable than trying to enumerate every possible safe function.
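To illustrate what I mean by query-time enforcement, here's a deliberately simplified sketch: instead of enumerating safe functions, the middleware whitelists columns and appends a mandatory row filter the caller can't opt out of. Table and column names are made up:

```python
# Illustrative only: enforce allowed columns at query build time, plus a
# mandatory org-scoped row filter. A real version would handle joins too.

ALLOWED = {"orders": {"id", "total", "created_at"}}

def build_query(table: str, columns: list, org_id: int) -> tuple:
    """Build a parameterized SELECT restricted to allowed columns and rows."""
    allowed = ALLOWED.get(table)
    if allowed is None:
        raise PermissionError(f"table {table!r} is not exposed")
    cols = sorted(c for c in columns if c in allowed)
    if not cols:
        raise PermissionError("no permitted columns in request")
    # The org filter is appended unconditionally; neither the caller nor the
    # LLM can remove it.
    sql = f"SELECT {', '.join(cols)} FROM {table} WHERE org_id = %s"
    return sql, (org_id,)

sql, params = build_query("orders", ["total", "email", "id"], 42)
# "email" is silently dropped; the row filter is always present
```

The point is that the constraint lives in one place, regardless of which of the combinatorially many questions the user asked.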

I think the right answer is probably a spectrum: explicit interfaces where possible, and guardrails for the cases where flexibility is the product.

Here’s a concrete example of how this behaves in practice.

The screenshots show:

• The original user question

• The redactions applied before the request is sent to the LLM

• What the model actually receives after guardrails are applied

• The resulting chat response

• The audit log capturing what the model attempted to access vs what was allowed

In this case, the question itself looks harmless, but without query-time enforcement it would have pulled fields the user shouldn’t see. The redactions happen before processing, so neither the model nor the user ever sees restricted fields.

This is the class of failure mode I kept running into with prompt-only or function-only approaches, especially once queries became more ad-hoc.
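In code terms, the redact-before-the-model step is conceptually something like this (illustrative only, not the real implementation):

```python
# Sketch: the row is scrubbed before it ever reaches the prompt, and the
# attempted-vs-received field sets are written to an audit log.

AUDIT_LOG = []

def redact_row(row: dict, visible: set, user: str) -> dict:
    """Drop restricted fields and record the attempt in the audit trail."""
    granted = {k: v for k, v in row.items() if k in visible}
    AUDIT_LOG.append({
        "user": user,
        "attempted": sorted(row),      # what the query would have pulled
        "received": sorted(granted),   # what actually left the boundary
    })
    return granted

row = {"name": "Ada", "email": "ada@example.com", "ssn": "000-00-0000"}
safe = redact_row(row, visible={"name"}, user="support-7")
# safe == {"name": "Ada"}; email and ssn never enter the LLM context
```

Because redaction happens before prompt assembly, there's no "model saw it but promised not to say it" failure mode.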

We have a SQL Server database application that operates on unstructured data using MS FILESTREAM technology - nothing to do with AI.

However, we have a middle layer of 78 parameterized functions that operate on the unstructured data.

In your case, a suite of pre-defined parameterized functions could be carefully designed to handle your query use cases. While the queries are not officially ad-hoc, parameterization can go a long way.

Of course, logic would have to be built to map function calls to user questions…
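A rough sketch of that kind of registry (names are made up, purely illustrative): the AI can only invoke registered, parameterized functions and never writes SQL itself.

```python
# Sketch: hand-written parameterized queries behind a fixed registry.
# The AI picks a name and supplies parameters; it never emits SQL.

REGISTRY = {}

def expose(name):
    """Register a hand-written, parameterized query under a fixed name."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@expose("appointments_for_user")
def appointments_for_user(user_id: int, limit: int = 10):
    # The SQL is fixed by a human; only the parameters vary.
    return "SELECT id, starts_at FROM appointments WHERE user_id = %s LIMIT %s", (user_id, limit)

def call(name, **params):
    """Dispatch an AI-chosen function name, rejecting anything unregistered."""
    if name not in REGISTRY:
        raise PermissionError(f"unknown function {name!r}")
    return REGISTRY[name](**params)
```

The registry is the whole attack surface, which is exactly why it audits so well.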


Yeah, I agree with most of that. Parameterized functions are a good way to keep things safe once you know what operation you’re trying to perform.

Where I keep running into issues is earlier in the chain: users often don’t know which function they actually need, or the right combination of them, especially as the schema and use cases evolve. The hard part ends up being reliably translating a loose, natural question into the right constrained operation without opening the door to arbitrary access.

So I’m less focused on replacing parameterized execution and more on figuring out a clean way to map intent to a very limited, auditable set of allowed actions. In practice it probably still ends up calling parameterized functions under the hood.
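Roughly, the shape I'm experimenting with looks like this (a simplified sketch with made-up action names): the model proposes an intent as structured data, and the middleware validates it against a small allowlist before anything executes.

```python
# Sketch: validate a model-proposed intent against an explicit allowlist of
# actions and parameter schemas before dispatching to parameterized execution.

ALLOWED_ACTIONS = {
    "list_invoices": {"org_id": int, "limit": int},
    "get_customer": {"customer_id": int},
}

def validate_intent(intent: dict) -> tuple:
    """Accept only known actions with exactly the expected, well-typed params."""
    action = intent.get("action")
    schema = ALLOWED_ACTIONS.get(action)
    if schema is None:
        raise PermissionError(f"action {action!r} is not allowed")
    params = intent.get("params", {})
    if set(params) != set(schema):
        raise PermissionError("unexpected or missing parameters")
    for key, typ in schema.items():
        if not isinstance(params[key], typ):
            raise PermissionError(f"bad type for parameter {key!r}")
    return action, params
```

Every rejection here is itself an audit event, which is the part prompt-only approaches can't give you.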


Since you seem to be open to options, this might apply.

The problem is that the noted tools aren’t available anywhere AFAIK; you would have to create them.

HTH


Appreciate the link. You’re right that there isn’t really an off-the-shelf tool for this, which is what pushed me to build one myself.


My current understanding might be naive, but if all data access is executed under the end user’s identity and authorization is enforced at the data layer, then I’m not seeing how an LLM could bypass security on its own. In that setup, the model doesn’t have independent privileges. It can only request operations that are ultimately evaluated with the same permissions the user already has.

The cases where this breaks down seem to involve backend paths that run with elevated or shared credentials, or where access control lives outside the data layer.

Are there use-cases where a user-initiated LLM chat would need higher level permissions?

You might have an application where write is needed simply to function. Make or delete customer appointments?

“Hi, I’d like to book a checkup for next Tuesday at 10:00.
My name is Robert’); DROP TABLE appointments;--”

Or the AI simply mispredicting, with unlikely tokens sampled - as bad as the anecdotes of rm -r -f when you foolishly allow an AI to run PowerShell or Python. “Delete my appointment”, but with a bad tool-call query?

DELETE FROM appointments; -- appointment_id = 87342

You’re raising a valid point - an LLM messing up the user’s own data. This is important to gate against.

The original post was about breaching other users/system data.
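A minimal sketch of that kind of gate (naive string checks purely for illustration; a real implementation would parse the SQL properly):

```python
import re

def check_write(sql: str) -> str:
    """Reject stacked statements and unscoped DELETE/UPDATE before execution."""
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:
        # "Robert'); DROP TABLE appointments;--" style stacking
        raise PermissionError("stacked statements rejected")
    if re.match(r"(?i)(delete|update)\b", stmt) and not re.search(r"(?i)\bwhere\b", stmt):
        # catches DELETE FROM appointments; -- appointment_id = 87342
        raise PermissionError("unscoped write rejected")
    return stmt
```

Parameterized execution remains the real defense against injection; this kind of gate just catches the model emitting a structurally dangerous statement on its own.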

Great points on the middleware approach. I’ve been diving deep into AI Security lately, and your setup aligns perfectly with a ‘Triple Guard’ (Defense-in-Depth) architecture. A few takeaways that might add to your design:

  1. Input Guarding is Step 0: Beyond just filtering content, use a dedicated LLM classifier to detect ‘Indirect Prompt Injection’ before the query even hits your data-aware worker.

  2. The ‘Sandwich’ Approach: Middleware is great, but defense-in-depth suggests constraining the Worker LLM with scoped tools (Least Privilege) while keeping the Output Guard as a final safety net to redact PII if the model hallucinates.

  3. Automated Red-Teaming: Since prompt assumptions are so fragile, I’ve found that running automated adversarial scans (using tools like Garak or Promptfoo) is the only way to reliably catch those ‘harmless’ bypasses you mentioned.

Really interesting to see others moving toward this structural approach rather than just relying on ‘vibe-based’ system prompts!


My last line: the AI emits a semicolon by mistake, then continues on as it is apt to do as a “correction”, and deletes your whole appointments table.

This is exactly the distinction I’m trying to make with Guardrail Layer.

If authorization is perfectly enforced at the data layer and the LLM never touches elevated or shared credentials, then yes, the LLM can’t magically breach isolation.

In practice though, most systems leak privilege through:

  • shared service accounts

  • overly broad “helper” APIs

  • intent → action translation happening outside the policy boundary

  • or write paths that aren’t capability-scoped

Guardrail Layer isn’t about trusting the LLM — it’s about never letting it operate outside a constrained, policy-aware execution envelope, even for user-initiated actions.

The SQL examples are a good illustration of why intent-level mediation matters: the fix isn’t “don’t use LLMs,” it’s “LLMs should never emit raw execution artifacts with ambient authority.”

Appreciate the thoughtful discussion — this is the kind of nuance that usually gets lost in generic ‘prompt safety’ conversations.


This is helpful feedback.
Out of curiosity, are there any example prompts or scenarios you’d want to try in the demo chat to see how it handles real-world cases?

You have left out an important consideration: Production database administration:

  • Performance Tuning
  • Query Optimization
  • Periodic database backups, including regular Transaction Log backups.
  • Automated Disaster Detection and Recovery.
  • Point-In-Time Recovery (PITR): restoring a database to an exact point in time, rather than just to the time of the last backup.

There is a lot more to this than meets the eye.

Normally, this is the role of a database administrator (DBA). However, AI could handle some of it.


That’s a fair call-out, but it’s also intentionally out of scope for what Guardrail Layer is trying to solve.

Guardrail Layer isn’t positioning itself as a replacement for DBAs or production database administration. Things like performance tuning, query planning, backups, PITR, and disaster recovery should remain handled by infrastructure, managed services, or DBAs with elevated privileges.

The problem I’m targeting is runtime access and execution safety, not operational stewardship.

In other words:

  • DBAs design and maintain the system

  • Guardrail Layer constrains what applications and LLM-driven workflows are allowed to do inside it

Where this matters is that many AI integrations end up operating with ambient authority (shared service accounts, helper APIs, overly broad roles). Even a perfectly administered database can still be misused at the application layer.

Guardrail Layer sits above the database but below the application logic to ensure:

  • intent is validated before execution

  • access is capability-scoped, not role-assumed

  • AI systems can assist without ever holding admin-level power

So I agree there’s “more to it than meets the eye”. Guardrail Layer just focuses on a different (and currently leaky) boundary.
