What the CFO Needs to Understand About AI Investment (That the Vendor Won't Tell Them)

The deck looks great. There's a 3x ROI at month 12, a cost-per-decision metric your competitors would envy, and case studies from companies that look just like yours. The vendor has done this pitch a hundred times. They know what a CFO wants to see.

The problem is the ROI model they're using was designed for software. And AI isn't software. That distinction sounds pedantic until you're twelve months in and wondering why the numbers don't match the deck.

The gap between AI investment promises and P&L reality is probably the most expensive misalignment in enterprise technology right now. Not because AI doesn't deliver value: it does, for the right use cases, in the right organizations, under specific conditions. But because the financial model used to justify it was built for a different kind of purchase.

Software procurement ROI runs on three assumptions: costs are predictable, value delivery is linear, and the failure mode is a delayed project. None of those hold for AI.

Why the software ROI model breaks on AI

Software has a cost structure that finance teams can work with. Licensing is known, implementation is estimated, ongoing support is a percentage of the license. The model is imperfect but manageable. AI cost structure doesn't map to any of those buckets cleanly.

The largest cost variable in most enterprise AI programs isn't the AI itself. It's data. Before a model can be trained on anything useful, someone has to assess what data you actually have (which is usually different from what the business thinks it has), fix the quality problems, integrate sources that weren't built to talk to each other, and set up the governance to make sure the training data is legally usable. That work is slow, expensive, and almost never appears on a vendor proposal. It also doesn't end: data quality degrades, systems change, and each new use case adds new requirements.

Compute is the second piece that gets undercounted.
Training costs and inference costs are different things, and vendor estimates typically focus on training. Inference is what you pay for in production, every time the model scores a new input. For high-volume use cases like fraud detection, real-time pricing, or recommendation, inference costs at scale regularly exceed what the organization paid to train the model. Cloud pricing makes this easy to miss until the bills start arriving.

Then there's talent. AI teams don't price like enterprise software teams. Data scientists, ML engineers, and MLOps specialists have their own market rates, and those rates aren't decreasing. More importantly, the team that builds a model is different from the team that runs it in production. Both need to be funded and sustained for as long as the model is in use.

The last piece is governance and monitoring. Every production model needs drift detection, performance tracking, audit logging, and a scheduled retraining cadence. This is unglamorous, recurring spend that consistently goes missing from initial program budgets. A model without monitoring isn't a production model. It's a liability on a timeline.

The time-to-value curve vendors don't show you

Vendor decks show value beginning to accumulate somewhere around month six. The actual pattern is different enough to change how you fund the program.

The first three months are almost entirely cost. Data assessment, infrastructure setup, hiring or contracting the team, use case definition. Nothing deployable.

Months four through nine are where the model gets built and tested. Results exist but aren't trusted enough to act on. This is when programs are most at risk of being canceled: the spending is real, the returns aren't visible yet, and the business is getting impatient.

Months ten through eighteen are shadow deployment and validation. The model scores live data. Outcomes get compared against what actually happened. Trust builds incrementally, or it doesn't build at all.

Past eighteen months is typically where the value curve starts moving in a way that looks anything like the deck. And it does compound: more production data, a team that understands the operational patterns, a process that's been rebuilt around the AI output. The economics improve over time. But only if the program survives long enough to get there.

If the board expects visible returns at month twelve and the program is in month nine with real costs and nothing to show yet, someone will pull funding. The time-to-value curve needs to be part of the approval conversation, not something the program team manages quietly while hoping performance picks up.

The opex trap

Most boards think of technology investment as capex: a project spend that produces an asset and then stops. AI programs don't work that way. The ongoing costs are material: compute that scales with usage, continuous data quality maintenance, model monitoring, and retraining when performance drifts. An organization that funds AI as a project will hit a wall when the project budget closes and someone realizes the model needs sustained investment to stay useful.

This also changes the unit economics conversation. The question isn't just "what does it cost to build this?" It's "what does it cost to run this for three years?" Those are different numbers, and the second one is the one that matters for the actual investment decision.

Red flags in vendor ROI models

Two things in AI vendor decks deserve specific scrutiny.

FTE displacement is the most commonly inflated line item. Many ROI models show cost savings by treating automated tasks as direct headcount reductions. In practice, organizations rarely convert FTE displacement into hard savings. People get redeployed to other work, absorbed into open roles, or kept on to manage the exception cases the model can't handle. The productivity gain is real; the cost reduction usually isn't, unless the organization explicitly plans a workforce reduction.
A vendor model that treats FTE displacement as a direct cost saving is overstating the ROI.

Efficiency gains disconnected from business outcomes are the other pattern. "Your team handles the same volume 30% faster" is a productivity improvement. It becomes a financial outcome only if the freed capacity generates revenue or the cost base actually decreases. Efficiency claims need to be connected to a specific result, not left as an assumption that value will follow.

And case studies drawn from other companies at different scales in different industries are useful for direction only. The right ROI model uses your baseline, your data quality, your integration complexity, and your team's capability. A vendor can't assess any of those from a discovery call.

What to actually track

Total ROI (value divided by investment) tells you the aggregate return after the fact. It doesn't tell you whether a program in flight is working. The metrics that do:

Model performance against baseline matters first. Is the model improving, and is that improvement translating to better decisions? The baseline needs to be set before the program starts: what was the business doing before the model existed? Without a documented baseline, there's nothing to measure against.

Production adoption rate tells you whether the business integration is actually working. A model that produces output nobody acts on isn't delivering value regardless of how well it scores in testing. What percentage of model outputs are actually consumed by a decision-making process?

Cost per decision at volume should decrease as throughput scales. If it doesn't, the infrastructure design or use case economics have a problem worth investigating.

Retraining cost trend should improve as models mature. If the cost and time to retrain keep rising, the data architecture has a compounding problem that will only get worse.
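
Two of the metrics above, cost per decision at volume and production adoption rate, are simple enough to compute from a monthly snapshot. The following is a minimal sketch rather than a reporting standard: the field names, the split between fixed and per-inference cost, and every figure in it are hypothetical placeholders to be replaced with your own data.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    month: str
    fixed_cost: float          # hypothetical: monitoring, staffing, platform fees
    cost_per_inference: float  # hypothetical: variable compute cost per scored input
    outputs_produced: int      # inputs the model scored this month
    outputs_acted_on: int      # outputs actually consumed by a decision process


def cost_per_decision(s: MonthlySnapshot) -> float:
    """Fully loaded monthly cost divided by the number of scored inputs."""
    total = s.fixed_cost + s.cost_per_inference * s.outputs_produced
    return total / s.outputs_produced


def adoption_rate(s: MonthlySnapshot) -> float:
    """Share of model outputs that a decision process actually used."""
    return s.outputs_acted_on / s.outputs_produced


def is_strictly_decreasing(values: list[float]) -> bool:
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Illustrative figures only.
snapshots = [
    MonthlySnapshot("M1", 40_000.0, 0.002, 1_000_000, 200_000),
    MonthlySnapshot("M2", 40_000.0, 0.002, 2_000_000, 800_000),
    MonthlySnapshot("M3", 40_000.0, 0.002, 4_000_000, 2_400_000),
]

unit_costs = [cost_per_decision(s) for s in snapshots]
for s, unit_cost in zip(snapshots, unit_costs):
    print(f"{s.month}: cost/decision={unit_cost:.4f}, adoption={adoption_rate(s):.0%}")
print("unit cost falling with volume:", is_strictly_decreasing(unit_costs))
```

If the last line ever prints False while volume is growing, that is the signal described above that the infrastructure design or the use case economics need a closer look.
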
The success definition that usually goes missing

AI programs get approved with vague success criteria because specificity feels like it creates accountability before the team has figured out what's achievable. That logic runs backward. Vague criteria are what allow programs to run for eighteen months without anyone agreeing on whether they're working.

A complete success definition has four components: a specific metric, a documented baseline, a numeric target, and a date. "Improve fraud detection" is not a success definition. "Reduce the false negative rate from 4.2% to below 2.5% by Q3 of next fiscal year" is.

The CFO's job is not to slow the program down by demanding this. It's to make the investment defensible when someone asks whether it's working. And in every organization I've seen do this at scale, someone eventually asks.
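
The four-component success definition can be written down as a small record that the program is checked against. This is a sketch under stated assumptions: the class and method names are invented for illustration, and the deadline is a placeholder date, since "Q3 of next fiscal year" depends on your fiscal calendar.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SuccessDefinition:
    metric: str      # the specific metric
    baseline: float  # documented value before the program started
    target: float    # numeric target
    deadline: date   # date by which the target must be reached
    lower_is_better: bool = True

    def is_met(self, observed: float, as_of: date) -> bool:
        """True only if the target is beaten on or before the deadline."""
        if as_of > self.deadline:
            return False
        if self.lower_is_better:
            return observed < self.target
        return observed > self.target

    def progress(self, observed: float) -> float:
        """Fraction of the baseline-to-target gap closed so far."""
        return (self.baseline - observed) / (self.baseline - self.target)


# The fraud example from the text; the deadline is a hypothetical stand-in.
fraud_goal = SuccessDefinition(
    metric="false negative rate (%)",
    baseline=4.2,
    target=2.5,
    deadline=date(2026, 9, 30),
)

print(f"gap closed at 3.35%: {fraud_goal.progress(3.35):.0%}")
print("met at 2.4% in August:", fraud_goal.is_met(2.4, date(2026, 8, 1)))
```

A definition that cannot be expressed in this shape, because the baseline was never documented or the target has no number, is the vague kind described above.
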

The AI Business Case: Why the Numbers Rarely Survive Reality

Every AI investment proposal I have reviewed in the past three years has had a compelling financial case. The productivity gains are specific, the cost savings are quantified, the revenue uplift is modeled, and the payback period is well inside what the investment committee would find reasonable.

Most of them have also been wrong, not dishonestly, but systematically. The assumptions that make the numbers look good are made in a particular direction, and they tend to break in a particular direction too. The CFO who understands the pattern can ask the right questions before the commitment rather than investigating the variance afterward.

How AI business cases are typically built

The structure of an AI business case is generally one of three things: productivity improvement, cost reduction, or revenue enhancement. Often two of those, sometimes all three.

Productivity cases are the most common. The model identifies a set of tasks that employees currently spend time on, estimates the reduction in time per task from AI assistance, multiplies by headcount and average cost, and arrives at a total productivity benefit. This benefit is then either translated into cost savings (if the productivity gain enables headcount reduction) or revenue capacity (if the freed-up time is assumed to generate additional output).

Cost reduction cases focus on replacing a specific cost line with a lower-cost AI equivalent: automated processing replacing manual review, AI-assisted support reducing support ticket volume, AI-generated content reducing external agency spend.

Revenue enhancement cases are the hardest to validate. They typically model increased conversion from better personalization, faster sales cycles from AI-assisted prospecting, or improved retention from AI-driven customer engagement.

All three structures make assumptions that deserve scrutiny.
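
The productivity calculation described above is short enough to write out, which makes its assumptions easier to see. This is a minimal sketch: the function names, the 1,800 working hours per FTE-year, the realization rate parameter, and all input figures are illustrative assumptions, not numbers from any real business case.

```python
def gross_productivity_benefit(
    hours_saved_per_task: float,
    tasks_per_year: float,
    cost_per_hour: float,
) -> float:
    """Headline benefit: time saved per task x number of tasks x cost per hour."""
    return hours_saved_per_task * tasks_per_year * cost_per_hour


def fte_equivalent(
    hours_saved_per_task: float,
    tasks_per_year: float,
    hours_per_fte_year: float = 1800.0,  # assumption; varies by organization
) -> float:
    """Saved hours expressed as full-time positions."""
    return hours_saved_per_task * tasks_per_year / hours_per_fte_year


def realized_benefit(gross_benefit: float, realization_rate: float) -> float:
    """Benefit after discounting for saved time that is never converted
    into lower cost or higher output. The rate is a judgment call."""
    if not 0.0 <= realization_rate <= 1.0:
        raise ValueError("realization_rate must be between 0 and 1")
    return gross_benefit * realization_rate


# Hypothetical inputs: 15 minutes saved on each of 360,000 tasks at $60/hour.
gross = gross_productivity_benefit(0.25, 360_000, 60.0)
ftes = fte_equivalent(0.25, 360_000)

print(f"gross benefit: ${gross:,.0f}")
print(f"equivalent to {ftes:.0f} FTEs of time")
print(f"at 30% realization: ${realized_benefit(gross, 0.3):,.0f}")
```

The gross figure is what usually appears in the deck. The realized figure depends entirely on a rate that the formula itself cannot supply.
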
The productivity case: where it falls apart

The productivity benefit in an AI business case is almost always calculated as: time saved per task × number of tasks × cost per hour. The output looks rigorous because the components are quantifiable. The problem is in the assumptions embedded in each component.

Time saved per task. Productivity estimates for AI tools tend to be derived from vendor-provided benchmarks, early adopter case studies, or lab conditions that do not reflect the complexity of the target organization's actual tasks. In practice, AI tools perform better on well-structured, high-volume, low-complexity tasks and worse on tasks that require organizational context, judgment, or integration with messy internal data. The business case rarely distinguishes between task types.

Realization of saved time as economic value. The larger problem: even if the time savings are real, they do not automatically translate into economic value. An employee who saves an hour a day through AI assistance does not produce an extra unit of output or enable a headcount reduction unless the organization deliberately redirects that time. Most organizations do not, and the time is absorbed as slack rather than captured as value. I have seen productivity estimates that modeled 30% efficiency improvement across a 500-person workforce translate into an economic case requiring either 150 fewer employees or a 30% increase in output volume. Neither happened, because nobody had a plan to actually capture the freed capacity.

Change in task volume over time. As the AI system is used and trusted, the scope of what it is used for often expands, absorbing the productivity savings in handling more work at the same cost rather than handling the same work at lower cost.

The cost reduction case: where it falls apart

Cost reduction cases tend to be cleaner in structure but optimistic in two specific ways.

Implementation and operating costs.
The business case benefits are usually calculated net of license costs but not fully net of implementation, integration, change management, training, and ongoing operational costs. A cost reduction case that shows net savings of $2M per year before accounting for $1.5M of implementation and $600K of annual operating costs is not a $2M savings case: it is roughly break-even in year one (about $100K short) and nets $1.4M a year after that, with significant execution risk.

Partial automation economics. Many AI automation cases are built on the premise that the AI handles a defined portion of a task, reducing human effort for the remainder. The economics of partial automation are frequently miscalculated because the human labor required for oversight, exception handling, and quality review is underestimated. A process where AI handles 80% of cases automatically and humans handle the remaining 20% does not cost 20% of the original. It often costs 40-50%, because the exception cases require more effort per case than the routine ones, and the oversight of the automated cases is not free.

The revenue enhancement case: where it falls apart

Revenue enhancement cases should be held to the highest scrutiny because they are the hardest to falsify before the investment and the easiest to attribute other causes to if they fail.

The specific assumption to challenge: revenue enhancement from AI is almost always modeled as an incremental benefit on top of the existing business trajectory. If the sales cycle is improving anyway, some portion of the improvement is attributed to AI. If retention is improving, some portion is attributed to AI personalization. The counterfactual (what would have happened without the AI) is almost never established.

Ask how the business case quantifies the incremental contribution of AI specifically, as opposed to other factors moving in the same direction. If the answer is that it is impossible to isolate, the revenue numbers in the business case are assumptions dressed as projections.
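
The partial automation point from the cost reduction section is easier to see as arithmetic. The sketch below is one simple way to model it, not a standard formula: the function name, the exception effort multiplier, and the oversight fraction are assumptions chosen for illustration, and the cost of the AI system itself is deliberately left out.

```python
def partial_automation_cost_ratio(
    automated_share: float,
    exception_effort_multiplier: float,
    oversight_fraction: float,
) -> float:
    """Human labor cost after automation, as a fraction of the original cost.

    automated_share: share of cases the AI handles without a human.
    exception_effort_multiplier: effort per remaining manual case relative
        to an average case before automation (assumed to be above 1).
    oversight_fraction: human review effort spent per automated case,
        relative to handling an average case manually.
    Excludes license, compute, and implementation costs.
    """
    if not 0.0 <= automated_share <= 1.0:
        raise ValueError("automated_share must be between 0 and 1")
    manual_share = 1.0 - automated_share
    exception_cost = manual_share * exception_effort_multiplier
    oversight_cost = automated_share * oversight_fraction
    return exception_cost + oversight_cost


# What the business case often assumes: the remaining 20% costs 20%.
naive = partial_automation_cost_ratio(0.8, 1.0, 0.0)

# Hypothetical adjustment: exceptions take twice the effort per case,
# and reviewing automated cases costs 10% of handling them manually.
adjusted = partial_automation_cost_ratio(0.8, 2.0, 0.1)

print(f"naive: {naive:.0%} of original labor cost")
print(f"adjusted: {adjusted:.0%} of original labor cost")
```

With these illustrative inputs the adjusted ratio lands inside the 40-50% range mentioned above. The useful exercise is asking the project team which values they would defend for the two adjustment parameters.
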
What a CFO should specifically challenge

The realization rate. How will the organization actually capture the productivity benefit? Is there a plan to redeploy freed capacity, or is the assumption that it translates automatically into value? If there is no explicit realization plan, discount the productivity benefit substantially.

The fully loaded cost. Have implementation, integration, change management, and ongoing operational costs been included? If the cost side is license fees only, the payback period is understated.

The task mix. What proportion of the tasks in scope are well-structured and repetitive versus context-dependent and complex? The business case should show different adoption rates for different task types, not a single adoption rate applied across the board.

The timeline assumptions. AI implementations almost always take longer and cost more than the business case assumes. How sensitive is the payback period to a six-month delay in deployment, or to adoption rates that are 30% lower than modeled in year one?

The pilot evidence. Is there a pilot or proof-of-concept that demonstrates the modeled performance in the specific organizational context? Business cases built on vendor benchmarks without organizational validation should be required to run a pilot before commitment.

What to take from this

Productivity benefits in AI business cases often model time savings accurately but fail to account for how that time will actually be captured as economic value. A plan for realization is as important as the estimate.
Cost reduction cases frequently understate implementation, integration, and ongoing operational costs. Get the fully loaded cost before evaluating payback period.
Partial automation economics are usually miscalculated. Exception handling and oversight are not free; account for them explicitly.
Revenue enhancement cases without an established counterfactual are projections dressed as analysis. Require a measurement approach before the investment.
Require a pilot with organizational data before full commitment on large AI investments. Vendor benchmarks do not predict performance in a specific organizational context.

The CFOs who navigate AI investment well are not the ones who apply the highest discount rates to AI business cases. They are the ones who ask the specific questions that distinguish a credible case from a well-presented one, and who require the answers before signing off.
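
The timeline question from the challenge list (how sensitive is payback to a six-month delay, or to adoption 30% below plan in year one?) can be answered with a few lines of arithmetic. This is a rough sketch with hypothetical figures. It assumes operating costs run from month one even while deployment is delayed, and that the adoption shortfall applies only to the first twelve months after go-live; both are modeling choices, not facts from any particular program.

```python
from typing import Optional


def payback_month(
    implementation_cost: float,
    monthly_benefit: float,
    monthly_operating_cost: float,
    delay_months: int = 0,
    year_one_adoption_factor: float = 1.0,
    horizon_months: int = 60,
) -> Optional[int]:
    """First month in which cumulative cash flow is non-negative,
    or None if that never happens within the horizon."""
    cumulative = -implementation_cost
    for month in range(1, horizon_months + 1):
        # Assumption: the team and infrastructure are paid for during delays.
        cumulative -= monthly_operating_cost
        months_live = month - delay_months
        if months_live >= 1:
            # Assumption: the adoption shortfall only affects year one.
            factor = year_one_adoption_factor if months_live <= 12 else 1.0
            cumulative += monthly_benefit * factor
        if cumulative >= 0:
            return month
    return None


# Hypothetical case: $1.1M to implement, $200K/month benefit, $50K/month to run.
case = dict(
    implementation_cost=1_100_000,
    monthly_benefit=200_000,
    monthly_operating_cost=50_000,
)

scenarios = {
    "as modeled": payback_month(**case),
    "six-month delay": payback_month(**case, delay_months=6),
    "30% lower year-one adoption": payback_month(**case, year_one_adoption_factor=0.7),
    "both": payback_month(**case, delay_months=6, year_one_adoption_factor=0.7),
}

for name, month in scenarios.items():
    print(f"{name}: payback in month {month}")
```

A business case whose payback period more than doubles under these two stresses is not necessarily a bad investment, but it should be approved with the stressed number in view rather than the modeled one.
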
