FlashLearn - Integrate LLMs in any pipeline
FlashLearn provides a simple interface and orchestration (up to 1,000 calls/min) for incorporating LLM agents into your typical workflows and ETL pipelines. Run data transformations, classifications, summarizations, rewriting, and custom multi-step tasks just as you would with any standard ML library, with the power of LLMs under the hood. Each step and task has a compact JSON definition, which keeps pipelines easy to understand and maintain. FlashLearn supports LiteLLM, Ollama, OpenAI, DeepSeek, and any other OpenAI-compatible client.
Examples
GitHub
Installation
pip install flashlearn
Add the API keys for the provider you want to use to your .env file.
OPENAI_API_KEY=
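To make the key available in Python, here is a minimal sketch using the python-dotenv package (an assumption; any mechanism that sets the environment variable works):

import os
from dotenv import load_dotenv  # assumes `pip install python-dotenv`

load_dotenv()  # reads .env and populates os.environ
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"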
High-Level Concept Flow
flowchart TB
classDef smallBox font-size:12px, padding:0px;
H[Your Data] --> I[Load Skill / Learn Skill]
I --> J[Create Tasks]
J --> K[Run Tasks]
K --> L[Structured Results]
L --> M[Downstream Steps]
class H,I,J,K,L,M smallBox;
Learning a New “Skill”
Like a fit/predict pattern, you can quickly “learn” a custom skill. Below, we’ll create a skill that evaluates the likelihood of buying a product from user comments on social media posts, returning a score (1–100) and a short reason. We’ll instruct the LLM to transform each comment according to our custom specifications.
from flashlearn.skills.learn_skill import LearnSkill
from openai import OpenAI
# Instantiate your pipeline “estimator” or “transformer”
learner = LearnSkill(model_name="gpt-4o-mini", client=OpenAI())
# Provide instructions and sample data for the new skill
skill = learner.learn_skill(
    df=[],  # optionally, you can also pass a sample of your data
task=(
"Evaluate how likely the user is to buy my product based on the sentiment in their comment, "
"return an integer 1-100 on key 'likely_to_buy', "
"and a short explanation on key 'reason'."
),
)
# Save skill to be used from any system
skill.save("evaluate_buy_comments_skill.json")
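The saved skill is a plain JSON file, so you can inspect it, diff it, or check it into version control like any other pipeline artifact. A quick sketch (the exact schema may vary between versions):

import json

with open("evaluate_buy_comments_skill.json", "r", encoding="utf-8") as f:
    print(json.dumps(json.load(f), indent=2))  # pretty-print the compact JSON definition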
Input Is a List of Dictionaries
Whether you retrieved data from an API, a spreadsheet, or user-submitted forms, you can simply wrap each record into a dictionary. FlashLearn’s “skills” accept a list of such dictionaries, as shown below:
user_inputs = [
{"comment_text": "I love this product, it's everything I wanted!"},
{"comment_text": "Not impressed... wouldn't consider buying this."},
# ...
]
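For example, if your comments live in a CSV file, converting rows into this shape takes one line. A sketch assuming a hypothetical comments.csv with a comment_text column:

import csv

with open("comments.csv", newline="", encoding="utf-8") as f:
    user_inputs = [{"comment_text": row["comment_text"]} for row in csv.DictReader(f)]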
Run in 3 Lines of Code
Once you’ve defined or learned a skill, you can load it as though it were a specialized transformer in a standard ML pipeline. Then apply it to your data in just a few lines:
import json
from flashlearn.skills.general_skill import GeneralSkill

# Load the skill definition we saved earlier to "evaluate_buy_comments_skill.json"
with open("evaluate_buy_comments_skill.json", "r", encoding="utf-8") as file:
    definition = json.load(file)

skill = GeneralSkill.load_skill(definition)
tasks = skill.create_tasks(user_inputs)
results = skill.run_tasks_in_parallel(tasks)
print(results)
Get Structured Results
FlashLearn returns structured outputs for each of your records. The keys of the results dictionary are the stringified indices of your original list. For example:
{
"0": {
"likely_to_buy": 90,
"reason": "Comment shows strong enthusiasm and positive sentiment."
},
"1": {
"likely_to_buy": 25,
"reason": "Expressed disappointment and reluctance to purchase."
}
}
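Because the keys are stringified indices into your input list, you can join each output back onto the record it came from. A minimal sketch using only the standard library:

# Merge each structured output with its original input record
merged = [{**user_inputs[int(idx)], **output} for idx, output in results.items()]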
Pass on to Next Steps
Each record’s output can then be used in downstream tasks. For instance, you might:
- Store the results in a database
- Filter for high-likelihood leads
- Send them to another tool for further analysis (for example, rewriting the “reason” in a formal tone)
Below is a small example showing how you might parse the dictionary and feed it into a separate function:
# Suppose 'flash_results' is the dictionary with structured LLM outputs
for idx, result in flash_results.items():
    desired_score = result["likely_to_buy"]
    reason_text = result["reason"]
    # Now do something with the score and reason, e.g., store in a DB or pass to the next step
    print(f"Comment #{idx} => Score: {desired_score}, Reason: {reason_text}")
Supported LLM Providers
Anywhere you might rely on an ML pipeline component, you can swap in an LLM:
client = OpenAI()  # Equivalent to instantiating a pipeline component

deep_seek = OpenAI(api_key='YOUR DEEPSEEK API KEY', base_url="https://api.deepseek.com")

lite_llm = FlashLiteLLMClient()  # LiteLLM integration; manages keys as environment variables, akin to a top-level pipeline manager

ollama = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required, but unused
)  # Just use Ollama's OpenAI-compatible client
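All of these clients expose the same OpenAI-compatible interface, so a quick connectivity check looks identical across providers (the model name below is an assumption; substitute one your endpoint actually serves):

resp = ollama.chat.completions.create(
    model="llama3",  # hypothetical; use a model you have pulled locally
    messages=[{"role": "user", "content": "Reply with OK"}],
)
print(resp.choices[0].message.content)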
KEY IDEA: JSON in, JSON out
Examples by use case
- Customer service
- Finance
- Marketing
- Personal assistant
- Product intelligence
- Sales
- Software development
I really hope this library simplifies integrating LLMs into your existing pipelines!