Enterprise AI Decision-Making: When the Model Decides and When the Human Does

Enterprise AI projects fail at the decision boundary, not at the model. The model usually works. What fails is the choice of which decisions to hand to it.

This is the conversation we have with every CTO and COO who calls about an AI deployment. They want to know which workflows to automate. The honest answer is that the question is the wrong shape. The right question is which decisions inside those workflows the model is allowed to make, which the model is allowed to propose, and which a human must still make.

The taxonomy is older than AI. Six Sigma teams have used it for process automation. Trading desks have used it for execution algorithms. Air traffic systems have used it for collision avoidance. AI does not change the taxonomy. It changes the cost of getting it wrong.

The decision spectrum

Decisions in an enterprise workflow fall on a four-point spectrum from Filter to Actor.

Filter. The model triages. It separates the work that needs human attention from the work that does not. The model does not decide the outcome; it decides which decisions land in front of a human.

Example: an insurance claims intake model that splits incoming claims into clean-straight-through, needs-adjuster-attention, and likely-fraud. The straight-through path goes to settlement automation; the other two paths route to people.

Advisor. The model proposes; the human decides. The model surfaces options, ranks them, explains its reasoning, and waits.

Example: a sales-ops model that suggests the next best action for an account (renewal call, churn save, expansion pitch) and a salesperson picks one.

Recommender with default. The model proposes with a strong opinion. If the human does not act, the recommendation executes after a time-out. The human can override but does not have to.

Example: a procurement model that recommends invoice approval after matching to PO and receipt; if the AP clerk does not respond within 48 hours, the recommendation auto-approves.

Actor. The model decides and acts. No human in the loop for that specific decision.

Example: an inventory replenishment system that places restock POs based on demand forecast plus safety stock policy.

Most AI deployments live across multiple points on this spectrum. The deployment design is the choice of which decision goes where.
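As a minimal sketch, the spectrum can be written down as an ordered enumeration, and a deployment as a mapping from each decision to its position. The decision names below are hypothetical and only illustrate the shape of the mapping.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Positions on the decision spectrum, ordered by model autonomy."""
    FILTER = 0       # model triages; humans decide the outcomes it routes to them
    ADVISOR = 1      # model proposes; human decides
    RECOMMENDER = 2  # model proposes; the default executes after a time-out
    ACTOR = 3        # model decides and acts


# Hypothetical decisions in a claims workflow (illustration only).
claims_deployment = {
    "triage_incoming_claim": Mode.FILTER,
    "set_reserve": Mode.ADVISOR,
    "approve_small_payout": Mode.RECOMMENDER,
    "acknowledge_receipt": Mode.ACTOR,
}


def has_human_gate(mode: Mode) -> bool:
    """Filter and Advisor keep a human gate; the other two modes do not require one."""
    return mode in (Mode.FILTER, Mode.ADVISOR)


gated = [name for name, mode in claims_deployment.items() if has_human_gate(mode)]
```

Writing the mapping out per decision, rather than per workflow, is the point: one workflow can hold all four modes at once.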

The four factors that anchor the decision

The choice between Filter, Advisor, Recommender, and Actor is not a matter of preference. It is a matter of four factors, each of which has a defensible answer at deployment time.

1. Reversibility

Can the decision be undone without lasting consequence?

Restocking a SKU that turns out to be wrong: reversible. Inventory rebalances at next cycle. Cost: a few weeks of carrying cost.

Sending a regulated investment communication: irreversible. Once it reaches the customer, it cannot be recalled, and it is retained under FINRA's record-keeping obligations. The communication exists.

Reversible decisions can sit further toward Actor on the spectrum. Irreversible decisions belong at Advisor or Filter, with a human gate.

2. Stakes

What is the cost if the decision is wrong?

An auto-approved invoice for $43: low stakes. The cost of getting it wrong is bounded.

An auto-approved invoice for $2.3 million: high stakes. The cost of getting it wrong is asymmetric: a wrong approval costs far more than a correct one saves.

The same model can sit at Actor for low-stakes decisions and at Advisor for high-stakes ones, with the threshold as a configuration. Most production deployments do this. Few teams articulate it as the design choice that it is.
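A stakes threshold as configuration might look like the sketch below. The `actor_limit_usd` value is an assumption chosen for illustration; the right number depends on the workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StakesPolicy:
    # Hypothetical threshold; not a recommended value.
    actor_limit_usd: float = 1_000.0


def mode_for_invoice(amount_usd: float, policy: StakesPolicy) -> str:
    """Return "actor" at or below the limit and "advisor" above it."""
    if amount_usd < 0:
        raise ValueError("amount_usd must be non-negative")
    return "actor" if amount_usd <= policy.actor_limit_usd else "advisor"


policy = StakesPolicy()
print(mode_for_invoice(43.00, policy))         # actor
print(mode_for_invoice(2_300_000.00, policy))  # advisor
```

Keeping the limit in a policy object, rather than in the model or the routing code, makes the design choice visible and reviewable.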

3. Volume

How many of these decisions does the enterprise make?

10,000 invoice matches per day: the human-review budget cannot absorb full review. The model has to handle most of the volume; humans handle exceptions.

20 vendor contract approvals per quarter: human-review budget can absorb all of them. The model can be a strong Advisor. No need to push toward Actor.

High-volume workflows force Actor or Recommender-with-default mode because the human budget cannot scale. Low-volume workflows can stay at Advisor without operational cost.
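The volume argument is simple arithmetic, sketched here. The review-capacity figures are assumptions for illustration, not benchmarks.

```python
def reviewable_fraction(decisions_per_period: int, review_capacity_per_period: int) -> float:
    """Share of decisions humans can review in a period, capped at 1.0."""
    if decisions_per_period <= 0:
        raise ValueError("decisions_per_period must be positive")
    if review_capacity_per_period < 0:
        raise ValueError("review_capacity_per_period must be non-negative")
    return min(1.0, review_capacity_per_period / decisions_per_period)


# 10,000 invoice matches per day against an assumed capacity of 600 reviews per day:
# only about 6 percent can be reviewed, so the model must carry the rest.
invoices = reviewable_fraction(10_000, 600)

# 20 contract approvals per quarter against an assumed capacity of 60 per quarter:
# every decision can be reviewed, so Advisor mode costs nothing operationally.
contracts = reviewable_fraction(20, 60)
```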

4. Regulatory and accountability scope

Who answers if the decision is wrong? Is the answer the same when the model made it as when a human made it?

In most regulated contexts the answer is no. A FINRA-registered representative cannot delegate suitability decisions to a model. A licensed physician cannot delegate diagnostic decisions to a model. A controller cannot delegate revenue-recognition decisions to a model. The model can advise; the human signs.

This factor is the one that overrides the others. A decision can be reversible, low-stakes, and high-volume, and still belong at Advisor because the rule book says a licensed human signs.
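One way to make the four-factor placement explicit is a small rule function in which regulatory scope is checked first. This is a sketch under stated assumptions: the three-level `stakes` field and the exact rule order are illustrative choices, not a prescribed scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    reversible: bool
    stakes: str                     # assumed levels: "low", "medium", "high"
    high_volume: bool
    licensed_human_must_sign: bool  # regulatory and accountability scope


def place(f: Factors) -> str:
    """Map the four factors to a position on the spectrum."""
    if f.stakes not in ("low", "medium", "high"):
        raise ValueError("stakes must be 'low', 'medium', or 'high'")
    # Regulatory scope overrides the other three factors.
    if f.licensed_human_must_sign:
        return "advisor"
    # Irreversible or high-stakes decisions keep a human gate.
    if not f.reversible or f.stakes == "high":
        return "advisor"
    # Low volume: the human-review budget can absorb every decision.
    if not f.high_volume:
        return "advisor"
    # High volume, reversible: only clearly low stakes go all the way to Actor.
    return "actor" if f.stakes == "low" else "recommender_with_default"


# Reversible, low-stakes, high-volume, but the rule book requires a signature.
regulated = place(Factors(True, "low", True, licensed_human_must_sign=True))
```

Checking the regulatory factor first encodes the override directly: no combination of the other three can move a signed decision off Advisor.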

Real failure modes from too-eager automation

The pattern of failure is consistent across industries. The model works in pilot. The team is encouraged. The team pushes the model further along the spectrum than the four factors justify. The first incident happens. The deployment retreats two notches.

Auto-approved expense reimbursements that funded an employee's side project. A finance team deployed an Actor-mode model for expense approvals up to $5,000. The model approved a series of legitimate-looking receipts from a recurring vendor. The vendor turned out to be a shell company. Volume factor said Actor; accountability factor said Advisor with sampling. The deployment retreated.

Auto-routed customer complaints that violated regulator escalation timelines. A bank deployed an Actor-mode model that classified complaints and routed them to internal queues. A subset of complaints involved Reg CC funds-availability violations that have a 10-day regulator notification clock. The model routed them to the standard queue, which had a 14-day SLA. The bank missed the regulatory notification on three complaints. Reversibility factor said reversible (the queue could be re-routed); accountability factor said no, the clock had started.

Auto-generated patient discharge summaries that omitted a contraindication. A health system deployed an Actor-mode model that wrote discharge summaries from the chart. The model summarized accurately for 98 percent of cases. In 2 percent it omitted a contraindication that the physician had flagged in a free-text note that did not survive the model's structuring. The cost of each miss was high. Accountability factor said Advisor with physician sign-off.

In each case the team had a defensible position on three of the four factors and chose to ignore the fourth.

The middle ground that most enterprises should occupy

The right default for most enterprise AI decisions is Recommender-with-default with a thoughtful review budget. This pattern is harder to engineer than full Actor mode, but it is much safer than Actor and scales far better than pure Advisor.

The pattern: the model makes a recommendation with a default action. The human has a review window in which to override. After the window expires, the default executes. The system logs every decision: the recommendation, the override (if any), the executed action, the outcome (if observable).
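The core of the pattern can be sketched as a single resolution function. The class and field names are hypothetical, and the sketch assumes an override is recorded before the window closes; a production system would also need to reject late overrides and persist state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Recommendation:
    decision_id: str
    default_action: str
    created_at: datetime
    review_window: timedelta
    override_action: Optional[str] = None  # set by a human reviewer, if any


def resolve(rec: Recommendation, now: datetime) -> Optional[str]:
    """Return the action to execute, or None while the review window is open."""
    if rec.override_action is not None:
        return rec.override_action       # the human override wins
    if now >= rec.created_at + rec.review_window:
        return rec.default_action        # window expired: the default executes
    return None                          # still waiting for review


created = datetime(2024, 1, 1, 9, 0)
rec = Recommendation("inv-1001", "approve", created, timedelta(hours=48))
pending = resolve(rec, created + timedelta(hours=1))    # None: window still open
executed = resolve(rec, created + timedelta(hours=48))  # "approve": default fires
```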

Three design choices make this work.

The review window length is the budget knob. Short windows (minutes) mean most decisions execute as recommended. Long windows (days) mean most decisions get reviewed. The window length is set by how much human-review capacity the workflow can sustain, not by what feels comfortable.

The review queue prioritizes by risk. Not all recommendations get equal review attention. A risk score (which can be the model's own confidence, or a separate calibration signal) routes the lowest-confidence and highest-stakes recommendations to the front of the queue.
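A risk-ordered review queue can be sketched with a heap. The scoring formula and the `stakes_cap_usd` value are assumptions for illustration; any calibrated risk signal could take their place.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedReview:
    priority: float                        # negated risk score (see below)
    decision_id: str = field(compare=False)


def risk_score(confidence: float, amount_usd: float, stakes_cap_usd: float = 10_000.0) -> float:
    """Higher means riskier: low confidence and high stakes both raise the score."""
    stakes = min(amount_usd / stakes_cap_usd, 1.0)
    return (1.0 - confidence) + stakes


def build_queue(items):
    """items: iterable of (decision_id, model_confidence, amount_usd) tuples."""
    heap = []
    for decision_id, confidence, amount_usd in items:
        # heapq is a min-heap, so negate the score to pop the riskiest item first.
        heapq.heappush(heap, QueuedReview(-risk_score(confidence, amount_usd), decision_id))
    return heap


queue = build_queue([
    ("inv-1", 0.99, 43.0),
    ("inv-2", 0.60, 9_000.0),
    ("inv-3", 0.95, 500.0),
])
review_order = [heapq.heappop(queue).decision_id for _ in range(3)]
# review_order is ["inv-2", "inv-3", "inv-1"]
```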

The audit trail captures the recommendation, the override, and the override reason. Most deployments capture the recommendation. Fewer capture the override. Almost none capture the override reason. The override reason is the most valuable training signal the deployment will ever produce. Without it, the team can see what humans overrode, but the model never learns why.
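An audit record that refuses an override without a reason might look like the following sketch. The field names are hypothetical; the one deliberate design choice is that the reason is enforced at write time rather than left optional.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AuditRecord:
    decision_id: str
    recommendation: str
    executed_action: str
    override_action: Optional[str] = None
    override_reason: Optional[str] = None
    outcome: Optional[str] = None  # filled in later, if observable

    def __post_init__(self) -> None:
        # An override without a reason loses the training signal, so reject it.
        if self.override_action is not None and not self.override_reason:
            raise ValueError("an override must carry an override_reason")


record = AuditRecord(
    decision_id="inv-1001",
    recommendation="approve",
    executed_action="reject",
    override_action="reject",
    override_reason="vendor bank details changed last week",
)
log_line = json.dumps(asdict(record))  # one JSON object per decision
```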

A starter playbook

If you are designing an AI workflow today, the order of operations matters.

1. List the decisions inside the workflow. Not the workflow steps. The decisions. A claims-handling workflow has dozens of steps but maybe six decisions: is this claim valid, is the policy in force, is the loss covered, what is the reserve, what is the payout, what is the customer communication.
2. Apply the four factors to each decision. Score reversibility, stakes, volume, and regulatory scope. Decide where on the spectrum the decision belongs.
3. Default to Recommender-with-default for the middle. Use Actor only where all four factors clearly allow it. Use Advisor where regulatory scope dictates. Use Filter for triage and noise reduction.
4. Design the audit trail before you ship. Capture recommendation, override, override reason, and outcome. The override reason is the asset.
5. Set the review-window budget intentionally. Pick a number that the workflow can actually sustain. Watch the queue depth. Adjust the window or the risk-routing if depth grows.
6. Re-run the four-factor analysis quarterly. Reversibility can change (regulation tightens). Stakes can change (volume of high-value decisions grows). Volume can change (the deployment expands to a new business unit). Regulatory scope can change (a new rule applies). The decision boundary is not static.
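Watching queue depth can start with a check as simple as the sketch below. The `lookback` default and the strictly-increasing rule are assumptions; a real deployment would likely use a smoothed trend and an alert threshold.

```python
from typing import Sequence


def queue_is_growing(depths: Sequence[int], lookback: int = 3) -> bool:
    """True if queue depth rose at each of the last `lookback` observations."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    if len(depths) < lookback + 1:
        return False  # not enough history to judge a trend
    recent = depths[-(lookback + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical end-of-day review-queue depths.
daily_depths = [40, 38, 45, 52, 61]
needs_adjustment = queue_is_growing(daily_depths)  # True: three straight increases
```

A True result is the signal to adjust the review window or the risk-routing before the backlog decides for you.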

Closing

The model is the easy part. The decision boundary is the hard part. A team that gets the model right and the boundary wrong ships an Actor when it should have shipped an Advisor, and the first incident becomes the deployment's defining moment.

The taxonomy here is not new. It is older than enterprise AI by decades. What is new is the speed at which a poorly placed AI decision can compound. A model in Actor mode can make 10,000 decisions before lunch. If the boundary is wrong, 10,000 wrong decisions are sitting in production by 1pm.

The discipline is to slow down at the boundary choice. The model can be off the shelf. The four-factor analysis is bespoke to the workflow and to the regulatory context. That analysis is where the consulting work lives. AvanSaber works with enterprise teams on these deployments. If you are mid-build, this framework is yours to apply.