Omar Mustaan

05 Dec, 2025
- AI Strategy

Why Every Management Consultancy Now Has an AI Practice (And What That Means for You as a Client)

In 2022 and 2023, every major strategy and management consulting firm released a version of the same announcement: a new AI practice, a significant investment in AI capability, a commitment to helping clients navigate the AI transition. The numbers varied (some firms claimed hundreds of AI specialists, others thousands), but the message was consistent: AI is the next big thing, and we are ready to lead you through it.

This was not primarily a capability announcement. It was a competitive positioning announcement.

The trigger was existential. If AI was going to reshape how organizations make decisions, operate processes, and build competitive advantage, then firms whose primary product is advice on those things faced a direct threat to their relevance. The AI practice wasn't built because the firms had developed deep AI capability. It was built because not having one was a risk to the business model.

This distinction matters when you're on the receiving end of an AI advisory pitch.

What actually happened

Major consulting firms do not build capability quickly. They build narrative quickly. The firms that announced AI practices in 2022 and 2023 didn't have AI practices in the operational sense. They had strategy practices that could speak to AI, some technology practices with data and analytics capability, and a rapidly growing collection of AI-themed slide decks.

The staffing reality in most AI practices, particularly in the first two to three years, was strategy consultants who had taken machine learning courses, technology consultants who had moved into AI positioning from adjacent areas, and a smaller number of genuine practitioners (people who had actually built and deployed AI systems in production) concentrated at senior levels, where they were primarily used to credentialize proposals rather than do delivery work.

This is not unique to AI. It's the standard consulting model for every new technology category: the firm establishes a practice, builds the marketing narrative, and races to build actual capability behind it before clients realize the gap.

The firms that invested most heavily in genuine AI capability (in hiring practitioners with production experience, in building internal AI tools, in running their own AI programs) do exist. The quality gap between them and the firms running strategy work with AI rebranding is real, and it's visible if you know what to look for.

What genuine AI capability looks like

A consulting firm with genuine AI capability can show you production AI systems they've delivered: not demos, not prototypes, not internal tools, but systems in production at clients, operating at scale, monitored and maintained by the client after the engagement ended.

They can tell you specifically what went wrong in delivery and what they learned from it. AI programs that have never failed in delivery either haven't been through delivery at the scale and complexity they're claiming, or they're not being honest about the history.

The senior practitioners on the engagement (not the people who pitched it, but the people who will be in the room) should have direct experience owning AI programs in production. That means having been accountable for model performance, retraining decisions, production incidents, and stakeholder relationships when results were mixed.

They should have a clear view on where they add value and where you shouldn't hire them. A firm that claims to do everything in AI (strategy, delivery, engineering, operations) is almost certainly overstating capability in at least one of those areas. Genuine practitioners are typically clear-eyed about scope.

The cover slide test

Ask the firm to show you an AI strategy they developed for a comparable client. Redacted is fine. Then ask yourself: how different is this from a digital transformation strategy from five years ago?

The AI strategy documents I've seen from repositioned strategy practices tend to have the same structure as every other technology transformation strategy: current state assessment, capability gap analysis, use case portfolio, operating model recommendations, investment roadmap. The AI content is in the use cases (different applications, different tools), but the strategic logic is identical to the framework the same firm was applying to cloud transformation or data analytics five years earlier.

That's not necessarily wrong. Some strategic frameworks are durable. But it's diagnostic. If the firm's AI strategy is structurally indistinguishable from their prior digital strategy work, the AI expertise is probably in the examples rather than in the methodology. And examples without methodology give you a document, not a capability.

The POC incentive problem

Consulting firms have a structural incentive toward proof-of-concept work and away from production delivery, and it's worth understanding why, because it shapes what you're buying when you hire them for AI.

A POC engagement is bounded, visible, and politically low-risk. The client sees a working model. The engagement team gets credit for a demonstration. The timeline is short enough to maintain senior partner attention. When it succeeds, it becomes a case study. When it fails, it's an experiment rather than a program failure.

Production delivery is the opposite. It's longer, more expensive, politically messier, and the credit is diffuse. The consultants are one of several parties involved. Problems surface slowly. The partners have moved on to the next client by the time the model is in production, so the reputational upside is limited.

This means a consulting firm left to its own advice will systematically recommend more POCs and more strategy work, and less production delivery, than is actually in the client's interest. This comes not from bad faith, but from incentives that run counter to the client's objective of getting AI into production.

If you're hiring for AI, be explicit about what you're buying: strategy, delivery, or both. And if it's delivery, be specific about what "delivery" means: a deployed model in production, monitored and owned by your team after the engagement, not a handoff package that requires the same firm to operate.

What to demand from an AI advisory engagement

Before signing an AI advisory engagement, the questions worth asking:

- Who are the senior practitioners on this engagement (not the partners who pitched it, but the people who will be working on it day to day), and what production AI systems have they specifically delivered?
- What does the engagement end state look like? At completion, what does the client own, what can it operate without external support, and what does it have the internal capability to maintain?
- What is the firm's model for capability transfer? Is knowledge transfer written into the SOW as a deliverable, or is it a by-product of working alongside the team?
- What has gone wrong in comparable engagements, and what did you learn?
- Can you speak to a client where the engagement ended and the client is now operating independently?

The answers to these questions distinguish firms with genuine delivery capability from firms with genuine strategy capability. Both are useful. Neither is a substitute for the other, and conflating them is where most AI advisory relationships go wrong for the client.

What good looks like

A good AI advisory engagement leaves the client more capable than when it started. The client team understands the models being used, can maintain and retrain them without external support, and has developed internal judgment about AI investment decisions.

This is achievable. It requires an engagement structure that prioritizes knowledge transfer: embedding alongside client teams rather than working in parallel, documentation that is maintained by the client rather than produced by the consultant, and handoff criteria that verify internal capability before the engagement closes.

It also requires a firm that has an economic model compatible with building client capability rather than client dependency. Those firms exist. Finding them requires knowing what to look for and being willing to ask the uncomfortable questions before the contract is signed.

21 Nov, 2025
- AI Strategy

The Hidden Cost of AI Advisory: When Strategy Work Creates Dependency, Not Capability

There are two fundamentally different things an AI advisory engagement can deliver. One leaves you with a document and a roadmap. The other leaves you with the capability to execute. Both look similar on a proposal. The difference only becomes visible at the end of the engagement, when you find out whether your team can run without the advisors in the room.

Most clients don't ask which model they're buying until they're already in it. By then, the incentive structure of the engagement has been set, the dependency has been built, and reorienting toward capability transfer requires renegotiation that's awkward to initiate.

This is not, for the most part, deliberate. It's the product of an economic model that is structurally oriented toward ongoing engagement rather than client independence. Understanding that structure is the first step to buying against it.

The two models

A capability-building engagement is designed to make itself unnecessary. The advisory team works alongside the client's team, explicitly transferring knowledge and judgment at each stage. The client's people understand what's being built and why. By the end of the engagement, they can maintain, extend, and adapt it without the advisor present. The measure of success is whether the client can operate independently.

A dependency-creating engagement is designed around the advisor's continued presence. The advisory team does the work, produces deliverables, and runs the process. Client team members observe but don't deeply understand. At the end, the deliverables are handed over, but the knowledge that produced them stays with the advisory team. The client has a roadmap and a strategy document. The next step of execution requires bringing the advisors back in.

Both are defensible business models. One serves the client's long-term interest. The other serves the firm's. The client's job is to identify which they're buying and decide whether it's what they want.

Why dependency is the default

Advisory firms are businesses. Their revenue comes from advisory engagements. Building client capability that makes future engagements unnecessary is economically irrational from the firm's perspective.

This isn't cynicism; it's how markets work. A firm that builds client dependency retains revenue more predictably than one that builds client independence. The partners who sell engagements are measured on revenue. The incentive structure flows from there.

The firms that build genuine client capability do so either because they're competing on reputation in a market where long-term client relationships matter more than repeat engagement revenue, or because their principals have personal commitments to a different model. These firms exist. They're a minority.

The practical consequence for clients: assume the default is dependency and require explicit proof of the contrary. Don't assume that a firm that talks about knowledge transfer in its proposal actually delivers it. The language is easy to include and rarely audited.

How to spot a dependency-creating engagement before you sign

The warning signs appear in the proposal and in the discovery conversation, if you know what to look for.

Staffing structure that concentrates expertise externally. If the senior AI practitioners are all on the advisory side and the client team is positioned as project management and coordination, the knowledge will stay with the advisors. In a capability-building engagement, the client team is working alongside practitioners, not managing them.

Deliverables defined as documents rather than capabilities. A strategy document, a roadmap, an architecture design: these are outputs that the advisory team produces. A team that can evaluate AI use cases, a working deployment pipeline, a model monitoring process the client team operates: these are capabilities the client retains. If the SOW describes the first category and is silent on the second, that's what you're buying.

No knowledge transfer milestones. Capability-building engagements have explicit milestones for what the client team can do at each stage of the engagement. If the project plan has delivery milestones but no client capability milestones, the knowledge transfer is aspirational rather than contractual.

Vague handoff criteria. "We'll transition knowledge at the end of the engagement" without specifying what that means, who will assess it, and what criteria define successful handoff is a dependency structure dressed as capability transfer.

The "we'll be available after the engagement" framing. This is the softest version of the dependency model: the engagement ends, but ongoing support is positioned as the natural next step. For a capability-building engagement, ongoing support should be the exception for genuinely novel problems, not the expected path for running what was built.

What to demand contractually

Protecting against dependency requires it to be explicit in the contract, not assumed from the proposal language.

Capability transfer milestones written into the SOW. At each project phase, specify what the client team can do independently: run the model evaluation process, deploy a model to the serving infrastructure, interpret monitoring outputs and make retraining decisions, conduct a data quality assessment for a new use case. These are testable. If the advisory team can't commit to them, they're not building capability.

Shadowing requirements. Client team members should be doing work alongside advisory practitioners, not observing presentations of completed work. The difference between embedded working and presentation-based delivery is the difference between capability transfer and information transfer.

Documentation standards written for internal operation. The documentation produced during the engagement should be sufficient for the client team to operate what was built after the engagement ends. "Sufficient" means tested: have someone on the client team who wasn't in the room use the documentation to execute a task. If they can't, the documentation isn't complete.

Handoff assessment. A defined point, typically at engagement close, where the client team's independent capability is assessed against the milestones defined in the SOW. If the milestones aren't met, the engagement isn't complete. This changes the incentive structure.

The key-person risk inside consulting engagements

Advisory engagements have their own key-person risk, and it's more acute than clients typically recognize.

The senior practitioner who ran the AI program at the client may have been the one person at the advisory firm who actually understood what they were building. When the engagement ends and that person moves to the next client, the institutional knowledge about the client's system moves with them.

This is different from normal consulting key-person risk, because AI systems require operational understanding to run correctly. If something breaks in the monitoring pipeline six months after the engagement ends and the person who built it is no longer available, you're rebuilding from documentation, or calling the advisory firm back in.

The mitigation is the same as for the dependency issue generally: the client team needs to genuinely understand the system, not just manage it. That requires them to have been involved in building it, not just receiving it.

What good looks like at the end

A capability-building engagement ends with the client team demonstrating specific capabilities, not receiving a final report.

The lead data scientist on the client side can explain why specific model choices were made and what the tradeoffs were. The ML engineer can run the retraining pipeline independently. The product owner can read the monitoring dashboards and knows what threshold triggers a review. The program lead can run a data quality assessment for a new use case without advisory support.

These are concrete things. They can be tested before the engagement closes. They require the advisory team to have spent the engagement building them, not producing deliverables about them.

If you're evaluating an AI advisory engagement, ask the firm to describe specifically what your team will be able to do at the end that it can't do now, and ask them to put it in the contract. The answer to that question tells you more about the engagement model than any amount of proposal language about partnership, knowledge transfer, and client-centricity.

07 Nov, 2025
- Enterprise AI

What the C-Suite Gets Wrong When Briefed by Their Own AI Teams

I've sat on both sides of the AI executive briefing. I've given them, and I've prepared executives to receive them. The gap between what gets presented and what's actually happening in an AI program is not unique to any particular organization. It's structural, and it runs in one direction consistently.

The direction is optimism. Not because AI teams are dishonest. Because they're humans operating in an organizational context where progress is rewarded, setbacks are uncomfortable, and the executives receiving the briefing are rarely equipped to distinguish between a meaningful demonstration and a controlled one. The incentive structure produces a particular kind of briefing, and the C-suite needs to understand that structure to extract accurate information from it.

The incentive problem

An AI team's relationship with executive leadership is shaped by several pressures that all point toward positive framing.

The team secured budget by promising something. Every briefing is an implicit progress report against that promise. Acknowledging that the promise was wrong (that the timeline was too short, the use case was harder than expected, the production environment is more constrained than the POC assumed) is a career-adjacent risk.

The team works in a domain that most of its executive audience doesn't understand deeply. This creates both an opportunity and a temptation. The opportunity: a technically sophisticated team can explain complex tradeoffs honestly and build genuine understanding. The temptation: a technically sophisticated team can use that complexity to obscure problems that would be obvious in plain language.

The team has invested months, sometimes years, in a program. The sunk cost effect is real. Acknowledging that the program is not working as designed, or that the architecture needs to change, or that the use case selection was wrong, requires a level of intellectual honesty that is harder when you've built your professional identity around the thing you're assessing.

None of this is malicious. All of it is human. The C-suite needs to account for it.

The three things that get consistently obscured

The gap between demo performance and production performance. A live demo is not a production system. This distinction sounds obvious. In executive briefings, it consistently isn't. A demo is run on curated data, in a controlled environment, with known good inputs, by the person who built it. Production is run on real data (messier, more varied, more adversarial) in an environment with different latency, different system interactions, and different edge cases than the demo accounted for.

The performance gap between demo and production in AI systems is often 15–30 percentage points on key metrics. A model that achieves 94% accuracy in a demo may achieve 78% accuracy in production against the real distribution of inputs. The team knows this, or suspects it, and the demo is generally not where they surface it.

When a briefing leads with a demo, the question that matters is: what does this look like against the real production input distribution, over the last 30 days? Not "can you show me it working" but "what's the monitored performance over real traffic?"

The timeline the team actually believes vs. the timeline in the deck. Project timelines in AI executive briefings are almost always optimistic. The reasons are predictable: timeline estimates are produced under pressure to show momentum, AI programs have more unknown unknowns than most program types, and the cost of presenting a longer timeline (reduced budget enthusiasm, increased scrutiny) is visible while the cost of presenting an optimistic one (eventual overrun) is future.

The tell is usually in the dependency language. "This timeline assumes the data infrastructure work completes in Q1": where is the data infrastructure work currently? "This assumes we have the ML engineer hired by month three": what's the current hiring status? Dependencies that are "assumed" in a timeline slide are often dependencies that are behind or at risk but not presented as such.

The useful question: what is the most likely single point of failure in this timeline, and what's the contingency if it doesn't resolve?

The production failure rate. AI programs accumulate failures: model predictions that were wrong, system behaviors that didn't match expectations, user adoption that didn't develop as projected. In executive briefings, these are typically either absent or characterized as "learnings" without the quantitative dimension that would allow an executive to assess their significance.

A briefing that describes a "challenging quarter with good learnings" but doesn't specify what the model's production accuracy was over that quarter, what percentage of outputs were overridden by human reviewers, or how far business outcomes deviated from projection is a briefing that has converted failure information into narrative.

The useful request: show me the model performance trend over the last six months, in actual numbers, against the performance targets that were set at program start.

The benchmark trap

AI teams report model performance using benchmark metrics. The most common are accuracy, precision/recall, and AUC. These are meaningful for comparing models and for tracking technical progress. They are not the same as business outcomes, and the relationship between them is often assumed rather than demonstrated.

A model that improved AUC from 0.83 to 0.89 over the quarter has made genuine technical progress. Whether that progress translates into better business outcomes (more accurate fraud detection, better loan decisions, fewer customer service escalations) requires a different measurement entirely, one that connects model output to downstream business process. That connection is frequently not in the briefing.

The question: for each technical metric in this briefing, what is the business metric it's supposed to drive, what is the current value of that business metric, and what is the target?

If the AI team can answer that question clearly, the program has good outcome alignment. If the answer is complicated or deferred ("we're still working on the measurement framework"), the program may be optimizing for technical progress without a clear line to business value.

Reading the language

Certain language patterns in AI executive briefings are diagnostic of underlying program health.

"We're making good progress" without specific metrics usually means the program is moving but metrics aren't where they need to be. If progress were specifically good, the specific numbers would be in the briefing.

"The model is performing well in testing" without production performance data means the team is presenting test performance because production performance is worse or unmeasured.

"We're seeing strong adoption" without adoption rate numbers means adoption is incomplete. Strong adoption would be presented as a number.

"The data quality challenges are being addressed" means the data quality challenges have not been resolved and are affecting model performance. "Addressed" and "resolved" are different things.

"We're on track" against a milestone that was previously described as at risk means the milestone was re-scoped to make it achievable, or the team has decided to declare it complete at a lower quality level than originally intended.

None of these are lies. They're organizational language patterns that absorb uncertainty and make things sound more resolved than they are. Reading them accurately requires pattern recognition that executives build over multiple briefing cycles, or that they can shortcut by asking for the underlying numbers.

When to get a second opinion

There are situations where the C-suite should commission an independent assessment rather than relying solely on internal briefing:

- When a program has been running for more than 12 months and the production deployment is still described as upcoming.
- When the metrics in briefings have changed categories over time. When the team starts reporting different metrics than it started with, it's worth asking whether the metrics changed because the original ones showed the wrong trend.
- When the business unit that was supposed to benefit from the AI program is not prominently represented in the briefings as an active champion.
- When a specific milestone has slipped more than twice.

These aren't definitive indicators of a failing program. They are indicators that the C-suite doesn't have a fully accurate picture and should find out why before making the next resource allocation decision.

An independent technical advisor who can review program documentation, interview the team, and assess what's actually in production, without career stake in the outcome, produces a different quality of information than an internal briefing. The cost of commissioning one is small relative to the cost of continuing to invest in a program based on an inaccurate picture of its health.

24 Oct, 2025
- AI Strategy

AI Strategy as Competitive Positioning

When organizations commission AI strategy work, they tend to receive a version of the same output: a capability gap assessment, a use case prioritization matrix, an operating model design, and an investment roadmap. The analysis is internally focused: what we need to build, where we're behind, how to organize to deliver.

This is useful work. It's also incomplete in a way that the firms producing it have little incentive to address.

The missing dimension is external: what does this organization's competitive position look like if a rival deploys AI capability at scale before they do? That question is uncomfortable to ask and to answer. It can't be answered without making specific competitive assessments that clients sometimes find alarming, and the answer doesn't always recommend more strategy work. So it often doesn't make it into the deliverable.

I want to try to answer it here.

Why inward focus is the default

Strategy consulting is client-service work. The client defines the scope. And clients who commission AI strategy work are typically focused on their own capability gaps and delivery challenges. That's what brought them to the engagement in the first place.

The external competitive dimension requires the strategy team to say things like "your largest competitor is eighteen months ahead of you on this capability and here's what that means for your market position." That's a harder conversation to have and to receive than "here are the use cases you should prioritize and here's the roadmap for delivering them."

It also requires real competitive intelligence: understanding what competitors are actually doing with AI, not just what they're announcing. That's methodologically harder than internal capability assessment, the data is less reliable, and the conclusions are more defensible as inputs to a conversation than as outputs of an analysis. So they tend not to appear in formal deliverables.

The result is AI strategies that are internally coherent and externally blind. They will tell you how to become more AI-capable. They won't tell you whether the pace and scope of that effort are adequate given what competitors are doing.

The asymmetry of AI advantage

AI competitive advantage has a compounding structure that is different from most other technology investments, and understanding this is the starting point for the external competitive analysis.

The core mechanism is data. A model trained on more data, from a broader user base, over a longer time horizon, will generally outperform a model trained on less. This creates a feedback loop: organizations that deploy AI earlier accumulate more production data, which improves model performance, which enables better products or processes, which generate more data. The advantage compounds.

This isn't inevitable. The compounding requires the right architecture, the right data strategy, and operational discipline to capture and use production signal. But for organizations that execute well, early deployment creates an advantage that grows with time.

The implication: if a competitor deploys an AI-driven capability in your market twelve months before you do, the gap at month twelve is not the gap you're competing against. The gap at month thirty-six, after they've had three years of production data improving the system you're still building, is the gap that determines whether you can compete on that dimension.

This is the analysis most AI strategy engagements don't run.

The "good enough" trap

The most common executive response to competitive AI risk is a version of "our current capability is good enough." The existing process works. Customer satisfaction is acceptable. The business is growing. Why take on the complexity and cost of AI transformation when things are working?

This logic is historically sound for incremental technology change. It breaks for technology that compounds. "Good enough today" doesn't remain good enough when competitors are improving at the rate that AI systems improve with data.

The relevant historical analog is not previous technology cycles where the transition was gradual and the gap between leaders and laggards was measurable in feature sets and product capabilities. The closer analogy is situations where a competitor's investment in a compounding advantage (network effects in a marketplace, proprietary data in financial services, algorithmic improvement in search) created a gap that was small and manageable in year one and insurmountable by year four.

The "good enough" assessment needs to include a time horizon. Good enough for how long? What does the competitive position look like in eighteen months if the current AI investment pace continues and the competitor's does not?

How to assess competitive AI capability from the outside

Competitive AI intelligence is harder than most other forms of competitive analysis, and the signals that matter are different from the ones that get the most attention.

Press releases and partnership announcements are weak signals. Organizations that are ahead on AI tend to be quieter about it, not louder. The capability is a competitive asset, and announcing it in detail is a gift to competitors. The organizations making the most noise about AI capability are frequently the ones that have the most to prove.

Hiring patterns are strong signals. Job postings for ML engineers, data scientists, MLOps roles, and AI product managers tell you where organizations are investing. The seniority level of AI hires tells you whether they're building for exploration or for production. An organization hiring senior MLOps engineers and ML infrastructure specialists is building for scale; one hiring junior data scientists for "AI initiatives" is still in exploration.

Product behavior is the most direct signal. What your competitor's products are doing (how they're improving, what personalization or recommendation capability they're deploying, how quickly they adapt to user behavior) is observable evidence of AI in production. This requires systematic product analysis, not occasional use.

Infrastructure choices provide indirect signals. Cloud provider choices, database technology, observability tooling: these leave traces in job postings, technical blog posts, and engineering conference talks that reveal something about the architecture being built.

The composite picture from these signals is approximate, but it's more accurate than no analysis at all, which is the default in most AI strategy engagements.

Where AI creates defensible advantage vs. table stakes

Not all AI capability creates competitive advantage. Some of it is table stakes: capabilities that every player in the market will need to have to remain competitive, without any of them gaining durable advantage from it.

The differentiation question is whether the AI capability encodes something specific to the organization (proprietary data, unique operational knowledge, a specific customer relationship, a process that has been refined over years) or whether it applies generic AI to a generic process.

Generic AI applied to generic processes produces efficiency gains that are real but not defensible. If any competitor can achieve the same efficiency with the same tools and the same approach, the advantage is temporary at best. The cost reduction is real. The competitive differentiation is not.

Proprietary AI built on proprietary data or processes is different. A model trained on years of proprietary transaction data, or on customer behavior specific to a service only that organization offers, encodes competitive advantage that is not replicable by a competitor who doesn't have the same data foundation.

The strategy question is not "where can we use AI" but "where does AI, applied to what we specifically know and have, create advantage that competitors cannot replicate without our assets?" That question leads to a much shorter and more valuable list than the use case prioritization matrix that appears in most AI strategy deliverables.

What the board should be asking

The external competitive dimension of AI strategy is a board question, not just a management question, because the risk it describes is material to the organization's long-term competitive position.

The questions worth putting on the board agenda:

- What is our assessment of the pace of AI deployment by our two or three most significant competitors?
- Where is AI deployment most likely to create competitive differentiation in our market over the next three years, and what is our current position relative to that?
- If a key competitor deployed a specific AI capability in the next twelve months that we don't have, what would be the business impact, and do we have a response ready?

These questions don't all have clean answers. The intelligence is imperfect, the timeline predictions are uncertain, and the competitive impact analysis involves assumptions that can be challenged. But the alternative, approving an AI strategy that is silent on the external competitive dimension, is not neutral. It's a choice to optimize internally without knowing whether the pace and focus of that optimization are adequate for the environment the organization is competing in.

The strategy that tells you what to build, but not whether you're building fast enough or in the right direction relative to where your competitors are going, is a strategy with a material gap. That gap is worth filling before the investment decision is made, not after it has been executed.


10 Oct, 2025
- AI Strategy

The Second Opinion Every Board Needs on Its AI Strategy

When a management team presents an AI strategy to its board, there is a structural problem with the information flow that almost nobody in the room acknowledges. The people presenting the strategy are the same people who will be asked to execute it. They have career interests in its approval. They've spent weeks or months developing a position on where to invest and how to organize delivery. The strategy they're presenting is, inevitably, a strategy they believe in, and belief in one's own strategy is a particular kind of bias that is invisible to the person who holds it.

The board's job is to scrutinize and approve, not to trust and endorse. But scrutinizing a technical and organizational strategy you didn't develop, in a domain you may not have deep experience in, using information provided by the people who want you to approve it, is genuinely hard. And the standard remedies (board education programs, independent non-executives with technology backgrounds, AI advisory reports) address the knowledge gap without addressing the conflict of interest.

What addresses the conflict of interest is an independent review: an assessment of the AI strategy by someone with no financial interest in the approval or execution, commissioned by the board rather than by management.

The structural conflict

Most AI strategies presented to boards are management-produced documents. Sometimes they're supplemented by external advisors: consulting firms that were hired to help develop the strategy. Neither of these sources is independent.

Management is presenting a strategy it developed and will implement. Its incentives are aligned with approval and the resources that follow. The consulting firm that helped develop the strategy has a commercial interest in the engagement and often a follow-on interest in the delivery work that the strategy will generate. Neither party is well-positioned to give the board an honest assessment of the strategy's weaknesses.

This isn't a criticism of management or of consulting firms. It's a description of how commercial relationships work. The party with execution responsibility designs the strategy in a way that reflects their capabilities, their risk tolerance, and their organizational interests. These are legitimate inputs. They're also not the same as an independent assessment of whether the strategy is optimal for the organization.

In financial audit, this problem is addressed by having the auditors appointed by and accountable to the board, not management. In strategy review, there's no equivalent standard. The independent AI review is an attempt to apply the same logic.

What the review actually examines

An independent AI strategy review is not a technical audit. It's an examination of whether the strategy the board is being asked to approve is coherent, realistic, and adequately governed.

Coherence of the strategy. Does the AI investment portfolio connect to a defensible view of where the organization can create competitive advantage? Is there a logic to the use case selection that goes beyond "these are the best ideas we could generate internally"? Do the proposed investments reinforce each other, or are they independent bets without a strategic rationale?

Realism of the execution plan. Are the timelines grounded in what comparable programs have actually taken? Are the resource requirements (budget, talent, data infrastructure) consistent with what the ambition requires? Are the risks to the timeline identified and given honest probability assessments, or are they listed as mitigations that assume away the uncertainty?

Completeness of the risk picture. Does the strategy document the downside scenarios as clearly as the upside scenarios? Is the regulatory exposure understood? Are the data dependencies identified? Is the vendor concentration risk assessed? Does the governance structure address who is accountable when things go wrong?

Adequacy of the governance framework. Is there a clear accountability structure for each AI investment? Is there a monitoring and reporting framework that will give the board real visibility into program health, not just progress updates? Is there a defined threshold at which the board would be asked to make a go/no-go decision on continued investment?

A review that finds the strategy coherent, realistic, and adequately governed, with specific evidence for each conclusion, is a meaningful endorsement. A review that finds gaps (and most strategies have some) gives the board the information it needs to ask for revisions before approval, rather than discovering the gaps through program failure after the fact.

What genuine independence requires

Not all external reviews are independent in a meaningful sense. Independence has specific characteristics that are worth defining before commissioning a review.

No financial interest in the outcome. A firm that is positioned to win delivery work based on the strategy's approval is not independent of the approval decision. This eliminates most large consulting firms, which treat strategy work as a pipeline for delivery revenue. Independence requires a reviewer with no follow-on commercial interest in what happens after the review.

No prior relationship with the management team proposing the strategy. A firm that has a longstanding advisory relationship with management is not independent of management's perspective. The relationship will have shaped the reviewer's view of what management is capable of, what organizational dynamics are at play, and where to push and where to accommodate.

Accountability to the board, not management. The reviewer should be commissioned by the board (typically through the chair or the audit/risk committee), report directly to the board, and have no obligation to accommodate management preferences in the findings. This requires explicit structuring at the outset: if management controls the engagement, the incentive structure will drift toward validation rather than assessment.

Domain expertise sufficient to assess the claims. A reviewer who cannot assess whether an AI timeline is realistic, whether a talent plan is sufficient, or whether a technical architecture is appropriate for the use case can assess governance and financial logic but not technical strategy. For AI reviews, domain expertise is not optional.

How to commission it without creating a political crisis

The way an independent AI review is positioned internally determines whether it becomes a useful governance tool or a political problem.

Framing it as a governance standard rather than a vote of no confidence in management makes a significant difference. The analogy to financial audit is useful here: boards don't commission financial audits because they distrust management; they commission them because independent verification is a governance standard. AI investments, as they become material to organizational strategy, warrant the same standard.

Involving management in defining the scope while maintaining board ownership of the engagement maintains goodwill without compromising independence. Management can identify the key decisions they want the review to inform, the areas of most uncertainty, and the aspects of the strategy they're most confident in. This information is useful to the reviewer. It doesn't give management the ability to shape the conclusions.

Sharing the findings with management before presenting to the full board (not to allow revision, but to allow factual corrections and context) reduces the likelihood that the review produces findings that management contests as factually inaccurate, which derails the board conversation. This is standard practice in financial audit.

The fiduciary dimension

Directors who approve material AI investments are making decisions they can be held accountable for. The AI strategies being approved by boards today represent significant capital commitments, carry regulatory exposure in multiple jurisdictions, and create organizational dependencies that will be difficult to unwind if the strategy proves wrong.

In that context, approving an AI strategy based solely on management's assessment of it, without independent verification of its coherence, realism, and risk profile, is a governance decision. It's not necessarily wrong, but it's a choice, and it's one that directors should make explicitly rather than by default.

The independent review doesn't make the decision for the board. It gives the board the information it needs to make the decision well. That's the job the board is supposed to be doing, and in AI, where the knowledge gap between management and board is widest, it's the job that most benefits from independent support.

The board that says "we approved an AI strategy based on management's recommendation, with external advisory support from the firm that helped develop it" is in a different position than the board that says "we approved an AI strategy after an independent review that assessed its coherence, realism, and risk profile and identified the following gaps we required to be addressed before approval."

The difference is not primarily legal. It's the difference between a governance process that is real and one that is notional. Boards that have been through significant governance failures know that distinction matters, before the failure and not only after it.
