In many executive conversations, the AI adoption story looks healthy on the surface. Usage is up. Demos are promising. Teams can point to faster coding, faster research, and faster drafting.
The harder issue shows up in delivery reviews. The program can look further along than it is because adoption is visible, while lead time, release confidence, and governance effort have barely changed.
Licences are assigned. Developers use assistants. The board hears that AI is now part of engineering. Then the conversation gets more practical: if teams are using AI every day, why are we still missing the same release dates? Why has lead time from idea to production barely moved? Why are senior engineers still spending review time working out what the code was meant to do? Why are QA and security still finding the same issues at the end?
In many organisations, the honest answer is that AI has made individuals faster inside the old delivery model. It has not changed how work is shaped, governed, reviewed, or released.

The adoption metric is too shallow
Usage still matters. It tells you whether the tool has entered daily work. It does not tell you whether the organisation has changed the work.
In this series, I separate three states that often get reported as one.
AI-enabled means individuals use AI to code, search, draft, explain, test, or summarise faster. This is valuable, but it is local acceleration.
AI-augmented means AI is added to parts of the workflow: planning, coding, review, documentation, test generation, ticket analysis, or release notes. The team improves those parts of the lifecycle, but people still have to stitch intent, context, quality, approvals, and evidence together.
AI-native means the delivery operating model changes. Work is packaged into clear artifacts. Context is assembled deliberately. Agents execute inside bounded environments. Human judgment is spent on direction, ambiguity, architecture, risk, and approval. Every AI-assisted change carries the context, controls, evidence, and approvals needed to review it.
The executive diagnostic is simple: if you removed the AI tools tomorrow, would the workflow still basically work the same way, only slower? If yes, you are enabled or augmented. You are not AI-native.
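One way to make the AI-native definition concrete is to treat each AI-assisted change as a record that carries its own review material. The sketch below is illustrative only: the `WorkPackage` class, its field names, the sample ticket key, and the `missing_elements` helper are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class WorkPackage:
    """Hypothetical record for one AI-assisted change (all field names are assumptions)."""
    work_item: str                                             # e.g. a ticket key
    context_sources: list[str] = field(default_factory=list)   # specs, designs, linked pages
    controls: list[str] = field(default_factory=list)          # e.g. sandbox, tool allow-list
    evidence: list[str] = field(default_factory=list)          # e.g. test runs, scan results
    approvals: list[str] = field(default_factory=list)         # named human approvers


def missing_elements(pkg: WorkPackage) -> list[str]:
    """Return which of the four elements (context, controls, evidence, approvals) are empty."""
    required = {
        "context": pkg.context_sources,
        "controls": pkg.controls,
        "evidence": pkg.evidence,
        "approvals": pkg.approvals,
    }
    return [name for name, items in required.items() if not items]


# Made-up example: context and test evidence are attached, controls and approvals are not.
pkg = WorkPackage(
    work_item="CLAIMS-101",
    context_sources=["ticket acceptance criteria", "design file"],
    evidence=["unit test run"],
)
print(missing_elements(pkg))  # ['controls', 'approvals']
```

A record like this is only a starting point. The aim is that gaps become visible on the work item itself instead of being discovered during review.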
Current tools are forcing the operating-model question
The tools themselves are now pushing this question beyond editor assistance and into workflow design. GitHub made Copilot coding agent generally available in September 2025, with background execution that can open draft pull requests for review. It has since expanded enterprise controls, audit logging, and agent management through its Agent Control Plane.
Atlassian is moving in the same direction with Rovo Dev: code planning from Jira context, code generation, pull request review, and background automation. Its April 2026 engineering write-up also described the secure execution infrastructure behind that agent platform.
For leaders, this makes the decision more concrete: which parts of delivery can AI execute or prepare, what must a human still approve, and what evidence would make that approval trustworthy?
DORA’s March 2026 analysis of its 2025 AI-assisted software delivery research sharpened the same point. The research found that 90% of technology professionals now use AI at work and more than 80% believe it has increased their productivity. It also found that higher AI adoption is associated with both increased delivery throughput and increased delivery instability. The practical tension is familiar: AI can increase output faster than the organisation can review, secure, test, and release it. If controls stay manual, the bottleneck moves from creation to verification.
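The shift of the bottleneck from creation to verification can be illustrated with a toy queue model. The numbers below are invented for illustration and are not drawn from the DORA research. `simulate_backlog` is a hypothetical helper that assumes fixed weekly rates for creating and verifying changes.

```python
def simulate_backlog(created_per_week: int, verified_per_week: int, weeks: int) -> tuple[int, int]:
    """Toy model: return (changes released, changes still waiting on verification)."""
    released = 0
    waiting = 0
    for _ in range(weeks):
        waiting += created_per_week                # new changes join the verification queue
        done = min(waiting, verified_per_week)     # manual review capacity caps what ships
        released += done
        waiting -= done
    return released, waiting


# Illustrative numbers only: verification capacity stays at 25 changes per week.
before = simulate_backlog(created_per_week=20, verified_per_week=25, weeks=8)
after = simulate_backlog(created_per_week=40, verified_per_week=25, weeks=8)
print(before)  # (160, 0)
print(after)   # (200, 120)
```

In this toy case, doubling the rate of creation lifts released changes from 160 to 200 over eight weeks, while 120 changes accumulate in the verification queue.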
A field signal from enterprise delivery
At one large Australian insurer, the useful work started after tool rollout. We did not just give developers GitHub Copilot. We connected Copilot to the systems that described the work: Figma, Jira, and Confluence. The Figma Model Context Protocol (MCP) server gave the agent the design context needed to build frontend screens against the actual Figma design. The Atlassian MCP servers pulled in Jira ticket details, acceptance criteria, discussion history, and linked Confluence pages, so the agent could code against the spec and update the Jira ticket when the change was complete.
The measured signal was greenfield sprint velocity, not total enterprise throughput. Against that greenfield delivery baseline, the program achieved a 50% acceleration in sprint velocity and supported the wider move to compress a $400M transformation horizon from four years to three.
The lesson was not that a coding assistant, by itself, produced those outcomes. The workflow around the assistant changed. Design intent, delivery context, and review evidence started moving with the work instead of being reconstructed later.
What leaders should measure instead
Seat adoption tells you whether a tool has entered the organisation. It does not tell you whether delivery economics have changed.
A more useful executive scorecard asks:
- What percentage of AI-assisted changes reaches review with acceptance criteria, risk level, and test evidence already attached?
- Which parts of the lifecycle now run with machine assistance without increasing downstream review burden?
- Where has AI reduced handoff latency, not just typing time?
- Can we trace which context, tool access, and approvals shaped an AI-assisted change?
- Are senior engineers spending more time on judgment and less time reconstructing intent?
- Can we see cost, privacy tier, quality gates, and approval status at the level of the work item, not just in separate dashboards?
These questions expose the maturity gap quickly. A team can have high AI usage and still struggle to answer most of them.
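The first scorecard question can be computed directly if change records expose the relevant fields. A minimal sketch, assuming each change is exported as a dictionary with the hypothetical keys `ai_assisted`, `acceptance_criteria`, `risk_level`, and `test_evidence`. Real field names will depend on the ticketing and source-control tools in use, and the sample data is made up.

```python
REQUIRED_FIELDS = ("acceptance_criteria", "risk_level", "test_evidence")


def review_ready_rate(changes: list[dict]) -> float:
    """Percentage of AI-assisted changes with every required field populated."""
    ai_changes = [c for c in changes if c.get("ai_assisted")]
    if not ai_changes:
        return 0.0
    ready = sum(1 for c in ai_changes if all(c.get(f) for f in REQUIRED_FIELDS))
    return 100.0 * ready / len(ai_changes)


# Made-up sample: four AI-assisted changes, two of them fully documented.
changes = [
    {"id": "C-1", "ai_assisted": True, "acceptance_criteria": "AC-1", "risk_level": "low", "test_evidence": "run-1"},
    {"id": "C-2", "ai_assisted": True, "acceptance_criteria": "AC-2", "risk_level": None, "test_evidence": "run-2"},
    {"id": "C-3", "ai_assisted": False, "acceptance_criteria": None, "risk_level": None, "test_evidence": None},
    {"id": "C-4", "ai_assisted": True, "acceptance_criteria": "AC-4", "risk_level": "high", "test_evidence": None},
    {"id": "C-5", "ai_assisted": True, "acceptance_criteria": "AC-5", "risk_level": "medium", "test_evidence": "run-5"},
]
print(review_ready_rate(changes))  # 50.0
```

Tracking this rate over time, alongside seat adoption, gives a rough view of whether evidence is travelling with the work or still being assembled at review.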
The false summit is dangerous because it feels like progress
AI-enabled engineering is worth doing. It improves individual flow and removes friction from many daily tasks. The risk is treating that as the transformation.
Leaders see activity, pilots, dashboards, and enthusiasm while the same constraints remain: unclear demand, fragmented context, slow approvals, manual evidence collection, overloaded reviewers, brittle release paths, and uneven governance.
For CIOs, policies may exist without enough evidence flowing from the work itself. For CTOs, developers may feel faster while system throughput barely moves. For AI leaders, adoption can quietly become the target, even when the brief was to change delivery economics.
Before approving the next round of AI tooling, ask for evidence that the workflow has changed: clearer work packages, earlier risk classification, reviewable test evidence, shorter handoffs, and fewer hours spent reconstructing intent.
That sets up the next problem: why so many teams get stuck in the middle with faster engineers and the same slow company.