Hi everyone,
I’m fairly new to this and have been exploring ways to generate n8n workflows using LLMs for the past couple of days. I’ve learned a few things along the way, but I’d really appreciate your guidance and advice on the best approach.
What I’m Trying to Do
- Automatically generate n8n workflows in JSON format from prompts or examples.
- Ideally, handle complex workflows reliably, including structured nodes and properties.
What I’ve Tried / Explored
- Collected ~270 JSON workflow examples for testing.
- Fine-tuned on smaller subsets (~130 rows) using Meta-Llama-3.1-8B-Instruct-bnb-4bit (rough setup sketched after this list).
- Tested various models: distilroberta-base, GPT-2, Unsloth Llama 3.1, Gemma, LitGPT.
- Looked into RAG + embeddings (FAISS, all-mpnet-base-v2) for structured retrieval (also sketched below).
- Explored schema enforcement and JSON formatting tools.
- Checked out cloud GPU options (Modal, Colab) and resources.
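For context, here's roughly the Unsloth setup I've been running. This is a minimal sketch: the dataset file name (`workflows.jsonl`) and its `prompt`/`workflow_json` fields are placeholders for my own data, and the exact `SFTTrainer` arguments vary across `trl` versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length=4096,   # n8n workflows can be long; raise if VRAM allows
    load_in_4bit=True,     # needed to fit on a free Colab T4
)

# Attach LoRA adapters so only a small fraction of weights is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Placeholder dataset: each row has "prompt" and "workflow_json" fields
dataset = load_dataset("json", data_files="workflows.jsonl", split="train")

def to_text(row):
    # Fold the prompt and the target workflow into one training string
    return {"text": f"### Prompt:\n{row['prompt']}\n### Workflow:\n{row['workflow_json']}"}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,  # compensate for the tiny batch size
        num_train_epochs=3,
        learning_rate=2e-4,
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```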
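And this is the retrieval side, a minimal sketch assuming the collected examples live in memory as (description, workflow JSON) pairs. The retrieved workflows go into the prompt as few-shot examples; the LLM still does the actual generation.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder: in practice this is the ~270 collected workflows
examples = [
    ("Send a Slack message when a webhook fires", '{"nodes": [], "connections": {}}'),
    # ...
]

embedder = SentenceTransformer("all-mpnet-base-v2")
vectors = embedder.encode([desc for desc, _ in examples], normalize_embeddings=True)

# Inner product on normalized vectors == cosine similarity
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(np.asarray(vectors, dtype="float32"))

def retrieve(query: str, k: int = 3):
    """Return the k most similar example workflows for few-shot prompting."""
    q = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [examples[i] for i in ids[0]]

print(retrieve("post to Slack on new webhook"))
```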
Challenges
- Small dataset and JSON schema inconsistencies.
- GPU/memory constraints; I can only use the Google Colab free tier right now.
- Paid API usage isn't feasible for now (this is testing for company purposes).
- Even ChatGPT sometimes returns inaccurate or incomplete responses, so I'm not convinced any free open-source LLM could do better.
What I’ve Learned / Observed
- Lightweight models (Gemma, LitGPT) can give step-by-step guidance but struggle with full JSON generation.
- Fine-tuning on small structured datasets is tricky and resource-intensive.
- RAG + embeddings help with retrieval but don't generate JSON on their own.
- Using ChatGPT directly is possible but often requires manual validation and iteration.
My Questions
- What's the best approach for generating reliable n8n workflows with LLMs, even if it's paid, as long as it's high quality?
- What's the best free or open-source approach, if one exists?
- Any tips for ensuring consistent JSON output or avoiding outdated nodes? (The kind of validation loop I've tried is sketched after these questions.)
- Are there ways to reduce manual checking and iteration when generating workflows via LLMs?
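For reference, this is roughly the validate-and-retry loop I've been experimenting with to cut down on manual checking. The schema here is a deliberately minimal guess at the n8n workflow shape (top-level `nodes` and `connections`), not the official schema, and `generate_workflow()` is a placeholder for whatever model call is used.

```python
import json
from jsonschema import Draft7Validator

# Minimal assumed shape of an n8n workflow export, not the official schema
WORKFLOW_SCHEMA = {
    "type": "object",
    "required": ["nodes", "connections"],
    "properties": {
        "nodes": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "type", "typeVersion", "position", "parameters"],
            },
        },
        "connections": {"type": "object"},
    },
}
validator = Draft7Validator(WORKFLOW_SCHEMA)

def generate_workflow(prompt: str) -> str:
    raise NotImplementedError("placeholder: call your LLM here")

def generate_valid_workflow(prompt: str, max_attempts: int = 3) -> dict:
    """Retry generation until the output parses and passes the schema check."""
    feedback = ""
    for _ in range(max_attempts):
        raw = generate_workflow(prompt + feedback)
        try:
            workflow = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"\nPrevious output was not valid JSON: {exc}"
            continue
        errors = [e.message for e in validator.iter_errors(workflow)]
        if not errors:
            return workflow
        # Feed the validation errors back so the model can self-correct
        feedback = "\nFix these schema errors: " + "; ".join(errors)
    raise ValueError(f"no valid workflow after {max_attempts} attempts")
```

Checking each node's `type` string against a list exported from a current n8n instance would presumably catch outdated nodes too, but I haven't wired that up yet.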
Any advice, workflow examples, or tool recommendations would be greatly appreciated!