Lessons from building an async-first sales operating system

Hi everyone,

I’d like to share some practical lessons I learned while building an async-first sales operating system. This is not a sales pitch, but a look at the technical and process challenges I faced, and how I approached them.

The problem I wanted to solve

  • Endless alignment calls slowed projects down

  • Scope creep created confusion and wasted cycles

  • Deliverables often slipped because communication was too synchronous

The approach I tested

  • Documented scope agreements before any work started

  • Async briefs instead of recurring calls

  • Structured update formats to track progress clearly

What I observed in practice

  • Time from agreement to first deliverable shortened significantly

  • Fewer revision cycles per milestone

  • Transparency improved because every update was timestamped and stored

Technical/operational notes

  • Even simple templates reduced ambiguity if consistently used

  • Versioning async briefs (like code commits) helped trace decisions

  • Defining refund and dispute terms upfront reduced stress for both sides
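
A minimal sketch of what "versioning briefs like code commits" could look like, assuming briefs are kept as plain records. The names `Brief`, `BriefVersion`, and `revise` are hypothetical, made up for illustration, and not taken from any tool mentioned in this thread:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class BriefVersion:
    """One immutable revision of a brief, comparable to a commit."""
    number: int
    body: str
    reason: str            # why the brief changed (the "commit message")
    author: str
    created_at: datetime   # every revision is timestamped


@dataclass
class Brief:
    title: str
    versions: List[BriefVersion] = field(default_factory=list)

    def revise(self, body: str, reason: str, author: str) -> BriefVersion:
        """Append a new revision; earlier revisions are never edited."""
        version = BriefVersion(
            number=len(self.versions) + 1,
            body=body,
            reason=reason,
            author=author,
            created_at=datetime.now(timezone.utc),
        )
        self.versions.append(version)
        return version

    def latest(self) -> Optional[BriefVersion]:
        return self.versions[-1] if self.versions else None

    def history(self) -> List[str]:
        """One line per revision, so a decision can be traced later."""
        return [f"v{v.number} by {v.author}: {v.reason}" for v in self.versions]


# Illustrative usage with invented content.
brief = Brief(title="Landing page copy")
brief.revise("Three sections, one CTA.", "initial scope", "client")
brief.revise("Three sections, two CTAs.", "client asked for a second CTA", "vendor")
```

Because revisions are append-only, `brief.history()` reads like a commit log and answers "who changed what, and why" without a call.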

Open questions I’d like to discuss

  • Where do async systems break down for others?

  • How do you balance async structure with cases where synchronous input is unavoidable?

  • Are there lightweight tools or frameworks that help enforce async discipline better than custom docs?

I’m sharing these notes to invite feedback. If anyone here is experimenting with async-first collaboration, I’d love to hear your perspective.

Great approach. Not too detailed, but I bet it would be too complicated to explain.

  1. First of all, nothing in this reality is truly “synchronous”, except perhaps the situation where you put a sock on your right foot and the other sock immediately becomes the “left” one… but we are not talking about quantum weirdness.
  2. Based on the above, all systems are asynchronous. It’s just that some developers don’t want to admit it and fight against nature.
  3. Asynchronous interactions have a lot of states and steps: more than you think, and then even more than that. Most of what you have read on the subject or heard from “specialists” is likely to be BS, so be sceptical of everything.

Enough theory. Now what breaks:

  1. Developers often underestimate the workflows and their elements, which creates boxes too narrow to fit real-life scenarios and edge cases. The business is then unhappy about moving to the new system, simply because it does not fit what they truly need. Try to be as abstract as possible at the drawing-board stage, and as detailed as possible in the planning, testing, and execution phases. Slower design is better. A greater variety of minds and experts is better (but keep track of what matters and filter out the noise).
  2. Businesses, most of the time, have no clue and no documentation about how things work in general or in their particular use case. Worse, they often don’t know what the issue is or what they need. What they tell you is most likely a vague expression of what they think the solution should be, based on their limited view of the situation. So try to come up with a system that “digs” into the client’s head and analyses the process and the problem, to see how your system should solve them.
  3. Communication, messaging pipelines, and queues. Not the tech part of it, but what the entities are, their responsibilities, their properties, PROPERTY VARIANTS AND EDGE CASES, how they communicate with each other, what the events are, and what the effects are. That is the most complicated part: THE DESCRIPTION OF THE SYSTEM AS IT IS IN REAL LIFE, not as you or the business see it. Solution: spend more time on system analysis and simulation.

What to do when synchronous communication is needed: slow down and note the following:

  1. How do you know the synchronous communication is really needed?
  2. When did you learn it was needed?
  3. Who needs it? How do they notify the other party?
  4. Whom is it needed from? How do they learn about the need?
  5. Why? What needs to be transmitted?
  6. What happens when it is delayed? What is the maximum possible delay? How urgent is it?
  7. What happens when it is granted?
  8. What happens during this communication?
  9. What happens after this communication?

When you have answers to all of the above, leave them for a day (don’t think about them), or hand them to someone else. When you or the other person see those answers for the second time, look at them as a raw description of an asynchronous flow to design the implementation for…
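
One way to keep those nine answers together is a small record that refuses to count as "done" until every question is filled in. This is only a sketch: `SyncRequest`, its field names, and `unanswered` are hypothetical, and the sample answers are invented.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SyncRequest:
    """Answers to the nine questions, one field per question."""
    how_we_know: str = ""   # 1. how do we know sync is really needed?
    known_since: str = ""   # 2. when did we learn it was needed?
    requester: str = ""     # 3. who needs it, and how do they notify?
    responder: str = ""     # 4. whom is it needed from, how do they learn?
    payload: str = ""       # 5. why, and what must be transmitted?
    max_delay: str = ""     # 6. effect of delay, max delay, urgency
    on_granted: str = ""    # 7. what happens when it is granted?
    during: str = ""        # 8. what happens during the communication?
    after: str = ""         # 9. what happens after it?


def unanswered(request: SyncRequest) -> List[str]:
    """Return the names of the questions that are still blank."""
    return [
        f.name for f in fields(request)
        if not getattr(request, f.name).strip()
    ]


# Illustrative usage: a half-filled request is not ready to be designed.
request = SyncRequest(
    how_we_know="Client rejected two written drafts in a row",
    requester="Account owner, via the shared brief",
    max_delay="48 hours before the milestone slips",
)
missing = unanswered(request)
```

Once `unanswered(request)` is empty, the record already names an initiator, a recipient, a payload, a timeout, and follow-up steps, which is the raw material for an asynchronous flow.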

If you find any, I’m a taker. Personally, I’m playing with Traefik + Directus + Weaviate + Redis + Postgres, all in Docker behind Traefik, with only Directus public.

Appreciate you laying this out, especially the point on how most “synchronous” moments are really just state changes.

What I’ve shared above wasn’t meant as theory, but as field notes from running async-first systems with paying clients.

In practice, I’ve seen exactly what you mention:

  • Developers underestimate real-world workflows → async briefs expose the gaps.

  • Businesses often can’t articulate the real need → upfront scope forces clarity.

  • Communication drifts when systems aren’t explicit → structured loops keep it grounded.

So while your lens is system description and modeling, mine is execution under pressure.
Bringing both together (theory + practice) is probably the only way async-first adoption scales.

Yesterday I shared some lessons from building an async-first sales OS.
A thoughtful comment pointed out something important: in reality, almost everything is asynchronous; the real challenge is describing it as a system that works in practice.

I agree.
In theory, async flows have endless states, edge cases, and exceptions.
But in practice, I’ve learned that execution pressure forces you to simplify:

  • Upfront scope removes ambiguity developers often underestimate

  • Prepaid contracts keep businesses from drifting into vague requests

  • Structured async updates prevent communication breakdowns

The theory lens says: “an async system is infinitely complex.”
The practice lens says: “make it simple enough to run without collapsing.”

Both are true, and maybe async-first adoption only works when these two perspectives meet:

  • Theory → reminds us not to ignore complexity

  • Practice → reminds us not to get stuck and lose momentum

I’m curious how others here balance it:
Do you approach async-first work more from the system design side, or the execution under pressure side?

That theory was distilled from some 20 years of practice.

To find the balance, I often find myself being abstract in state descriptions, but damn detailed in execution.

And in code: most things must emit events and provide filter hooks.

So writing code for real-life-ready systems, in my case, looks like this:

  1. Tons of comments describing what we are going to do and WHY
  2. List of events emitted and data passed between entities
  3. Event hooks code (what happens and what data we have available) - start of event only
  4. Filter hooks (how data should be transformed during this event) - event middle
  5. Event hooks code - end of event only (so that everyone can unsubscribe from filters if needed, and confirm event state)

And only then do I pass to the rest of the code. But my brain does not allow me to write pure events first, so I deal with it by having the drawing board on the left, the screen on the right, and working from high-level code down to the grains of sand.

Events are needed even if there are no subscribers to them right now; they will save you months or years of code maintenance.
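
To make the start / filter / end sequence concrete, here is a rough sketch of one way it could be wired up. Everything in it is an assumption for illustration: the `Hooks` class, the event names `quote.start` and `quote.end`, the filter name `quote.amount`, and the arbitrary 20% markup are all invented, not taken from a specific framework.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Listener = Callable[[Dict[str, Any]], None]
Filter = Callable[[int, Dict[str, Any]], int]


class Hooks:
    """Hypothetical in-process registry for event listeners and filters."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)
        self._filters: DefaultDict[str, List[Filter]] = defaultdict(list)

    def on(self, event: str, listener: Listener) -> None:
        self._listeners[event].append(listener)

    def add_filter(self, name: str, fn: Filter) -> None:
        self._filters[name].append(fn)

    def remove_filter(self, name: str, fn: Filter) -> None:
        if fn in self._filters[name]:
            self._filters[name].remove(fn)

    def emit(self, event: str, data: Dict[str, Any]) -> None:
        # Works with zero subscribers; iterate over a copy so listeners
        # may register or unregister hooks while the event is running.
        for listener in list(self._listeners[event]):
            listener(data)

    def apply_filters(self, name: str, value: int, data: Dict[str, Any]) -> int:
        for fn in list(self._filters[name]):
            value = fn(value, data)
        return value


hooks = Hooks()


def send_quote(client: str, amount_cents: int) -> Dict[str, Any]:
    # WHY: the total must be adjustable per deployment without editing
    # this function, so it passes through a filter between two events.
    data = {"client": client, "amount_cents": amount_cents}
    hooks.emit("quote.start", data)                                  # event start
    total = hooks.apply_filters("quote.amount", amount_cents, data)  # event middle
    result = {"client": client, "amount_cents": total, "status": "sent"}
    hooks.emit("quote.end", result)                                  # event end
    return result


# Illustrative subscriber: adds a markup at start, unsubscribes at end.
log: List[str] = []


def add_markup(amount_cents: int, data: Dict[str, Any]) -> int:
    return amount_cents * 120 // 100  # arbitrary 20% for the example


def on_start(data: Dict[str, Any]) -> None:
    log.append("start")
    hooks.add_filter("quote.amount", add_markup)


def on_end(data: Dict[str, Any]) -> None:
    log.append("end")
    hooks.remove_filter("quote.amount", add_markup)


hooks.on("quote.start", on_start)
hooks.on("quote.end", on_end)
quote = send_quote("acme", 10_000)
```

Note that `send_quote` would run unchanged with no subscribers at all; the events cost almost nothing until someone needs them.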

I think they call it event-driven software engineering, but I’m not sure: when I see their examples, they are so basic that it’s hard to tell whether it’s truly event-driven or just “with some events”. And I came to development from linguistics, so no “official” software engineer box for me :joy:

Well, it’s not that infinite in a real business application. It is more complex than a prototype built by a young startup, but totally manageable if approached with an open mind.

But complexity does not mean all of it must be reflected in the software; here some abstraction helps.

Thanks for breaking it down from the coding side.

It’s interesting to see how the same async-first principles show up both in code (events, hooks, filters) and in business systems (scope, briefs, loops).

Different layers, but the same discipline: structure first, then execution.

Appreciate that perspective.

I agree, when moving from prototype to real business systems, abstraction becomes the bridge.

Too much detail clutters execution, but the right level of structure keeps async systems both manageable and scalable.

Another thing to mention about the software layer:

  1. To help businesses fit their processes into your software while staying flexible, YOU HAVE TO PROVIDE A COMPREHENSIVE OPTIONS SUITE

By comprehensive, I mean that ideally everything should be an option, and most things should have several options controlling their different aspects.

So whenever your code uses a constant or a branching decision that redirects to other events, try to think of it as an option available somewhere in the user interface.

And probably start by building a way to create your options declaratively, so that they are easy to maintain and update.

Each option should have a default value (ideally not empty or falsy, so that it stays easy to use in the code without additional logic, and so the user can simply delete the option to get back to the default).
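
A declarative options registry could be sketched roughly like this. The `Option` and `Settings` names and the two sample options are hypothetical; the point is only that each option is declared once with a non-empty default, and that deleting the stored value falls back to that default.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Option:
    key: str
    label: str     # what the administrator sees in the settings UI
    default: Any   # non-empty default, so callers need no extra logic


# Declared in one place; the settings screen can be generated from this.
OPTIONS: Dict[str, Option] = {
    opt.key: opt
    for opt in (
        Option("reminder.delay_hours", "Hours before a reminder is sent", 48),
        Option("brief.template", "Template used for new briefs", "standard"),
    )
}


class Settings:
    def __init__(self, stored: Optional[Dict[str, Any]] = None) -> None:
        self._stored: Dict[str, Any] = dict(stored or {})

    def get(self, key: str) -> Any:
        option = OPTIONS[key]  # KeyError for undeclared options, on purpose
        value = self._stored.get(key)
        # A missing or blank stored value means "use the default".
        if value is None or value == "":
            return option.default
        return value

    def set(self, key: str, value: Any) -> None:
        if key not in OPTIONS:
            raise KeyError(key)
        self._stored[key] = value

    def reset(self, key: str) -> None:
        """Deleting the stored value brings the default back."""
        self._stored.pop(key, None)


# Illustrative usage.
settings = Settings({"reminder.delay_hours": 24})
```

Code that used to contain the constant `48` now calls `settings.get("reminder.delay_hours")` and never needs to check whether the option was configured.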

However, pay special attention to checkboxes: their default value should be empty. This saves time during client onboarding, sparing clients the task of ticking boxes when they first deploy.

So for example if you have a feature which should be enabled by default in your software, the checkbox to control that feature should be:

  • Disable? (If checked, this feature will be disabled)

And for optional features disabled by default the checkbox will be:

  • Enable? (…)
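
In code, the checkbox rule could look like the following sketch. The `disable_` / `enable_` key prefixes and the feature names are assumptions made up for the example; the only point is that an untouched (empty) settings store already gives the intended behavior.

```python
from typing import Dict


def feature_enabled(settings: Dict[str, bool], feature: str, on_by_default: bool) -> bool:
    """Resolve a feature flag from checkboxes whose default is unchecked."""
    if on_by_default:
        # Checkbox reads "Disable?": unchecked means the feature runs.
        return not settings.get(f"disable_{feature}", False)
    # Checkbox reads "Enable?": unchecked means the feature stays off.
    return bool(settings.get(f"enable_{feature}", False))


# A fresh deployment has no boxes checked at all.
fresh: Dict[str, bool] = {}
digest_on = feature_enabled(fresh, "weekly_digest", on_by_default=True)
beta_on = feature_enabled(fresh, "beta_reports", on_by_default=False)
```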

Also try to plan for multi-language support from the very beginning, because it is hell to retrofit onto an existing solution. I usually approach it by splitting an entity into two related entities:

  1. The primary entity contains everything except language strings, plus a one-to-many field (alias) pointing to its language-strings objects.
  2. The language-strings object contains language strings only: no business logic, nothing but the texts, the language’s primary key (en_us), and a many-to-one pointer to the parent object.

Then your internal or external APIs, when returning the primary object, should always attach all translation strings so that the client can filter whatever they need (unless you want to provide a handy filter in the API itself).
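
As a rough illustration of that split, assuming a hypothetical `Service` entity (the entity, its fields, and the `to_api` helper are invented for the example, and the optional `languages` argument stands in for the "handy filter"):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class ServiceStrings:
    """Language strings only: texts, language key, pointer to the parent."""
    service_id: int   # many-to-one pointer to the parent Service
    language: str     # language primary key, e.g. "en_us"
    name: str
    description: str


@dataclass
class Service:
    """Primary entity: everything except language strings."""
    id: int
    price_cents: int
    strings: List[ServiceStrings] = field(default_factory=list)  # one-to-many


def to_api(service: Service, languages: Optional[Iterable[str]] = None) -> Dict[str, Any]:
    """Serialize the entity with all translations attached by default."""
    wanted = set(languages) if languages is not None else None
    return {
        "id": service.id,
        "price_cents": service.price_cents,
        "translations": {
            s.language: {"name": s.name, "description": s.description}
            for s in service.strings
            if wanted is None or s.language in wanted
        },
    }


# Illustrative data.
audit = Service(id=1, price_cents=50_000)
audit.strings.append(ServiceStrings(1, "en_us", "Sales audit", "Review of the sales process."))
audit.strings.append(ServiceStrings(1, "fr_fr", "Audit commercial", "Revue du processus de vente."))
```

`to_api(audit)` returns both languages so the client can pick, while `to_api(audit, ["fr_fr"])` narrows the response on the server side.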

And then in the software itself, all texts displayed to the user should support a similar structure so that your software is ready to be translated. Fortunately, today you can set up procedures and use APIs to automatically translate your configurations into multiple languages, so that you develop your software in English and your development tools produce all the other languages you need. IMPORTANT: THE USER INTERFACE STRINGS SHOULD BE AN OPTION THAT THE SOFTWARE ADMINISTRATOR CONTROLS. This way, if you want to modify the strings in your user interface, you just update the options on your private settings screen.
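
A small sketch of UI strings treated as administrator-controlled options, under the assumption that overrides are stored separately from the built-in texts. The `BUILT_IN` table, the `ui_string` helper, and the sample strings are all hypothetical:

```python
from typing import Dict

Strings = Dict[str, Dict[str, str]]  # language -> key -> text

# Texts shipped with the software (English is the development language).
BUILT_IN: Strings = {
    "en_us": {"save": "Save", "cancel": "Cancel"},
    "fr_fr": {"save": "Enregistrer"},
}


def ui_string(key: str, language: str, admin_overrides: Strings,
              fallback: str = "en_us") -> str:
    """Look up a UI text: admin override, then built-in, then fallback language."""
    candidates = (
        admin_overrides.get(language, {}),
        BUILT_IN.get(language, {}),
        BUILT_IN.get(fallback, {}),
    )
    for table in candidates:
        text = table.get(key)
        if text:  # blank overrides are ignored, so deleting restores the default
            return text
    return key  # last resort: show the key itself rather than nothing


# The administrator renamed one button from the settings screen.
overrides: Strings = {"en_us": {"save": "Save changes"}}
```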

That makes a hell of a lot of options to develop, but it saves you more time than you can imagine.

So settings coverage is another metric to track in your software development.

I’m saying that because I spent the last 15 years moving fully customized software to off-the-shelf solutions ready to deploy. Truly painful experience. But what a lesson.

Another point: the whole system should be API-first by design, because it will be easier for you to work with. And most importantly, businesses don’t like new software; they prefer additional tabs in their existing software (and what they see in that tab is actually what your API delivers). So having a robust API from the very beginning will open doors that would otherwise stay closed because of that business constraint.

This API-first design suggests adopting a modular structure. Don’t confuse a modular structure with microservices. It’s way easier to maintain a monolith composed of multiple modules than to descend into microservice hell.

You nailed the critical foundation: making every decision an option and thinking API-first from day one.

In my async-first sales OS I took the same path: scope-lock + 100% upfront makes the business side predictable, but under the hood the system is also optionized. Features can be switched on/off without branching chaos, and the loop is API-driven so it can plug into existing stacks instead of forcing a “new tool.”

And yes, multi-language strings and admin-side controls are already on my roadmap, because global B2B flows demand it.

Love how your 15-year lesson matches the survival architecture we’re building.

And probably the last thing about scaling:

Try to see how you can deliver individual instances for your clients, and potentially your own instance as a SaaS, all connected to a payment system deployed separately.

This way you keep small clients on your instance and can spin off more instances if needed.

But huge customers can deploy their own instance, offloading your servers.

Current payment systems basically scale enough to support all the instances you have, so I don’t think you need more than what is already available there.

When you take this approach from the very beginning, it is way easier in the long run. And the extra time needed to adopt it usually pays off well, especially considering that many clients would like on-premise installations, which you can bill considerably higher than usual.

The scaling insight is spot on: separating client instances and keeping the payment layer decoupled is a huge lever.

In my async-first sales OS, the scope-lock + prepaid loop already keeps the flow predictable, but I see the same future:

  • small clients can stay on the shared loop,

  • large enterprise clients can run their own “on-premise” instance at a higher tier.

That aligns perfectly with the idea of spinning off modules instead of bloating the core. And yes, billing those on-premise setups higher makes the survival OS even stronger.