Research Notes

Where is the evidence that humans and AI are actually co-evolving?


When my colleagues and I were developing the strategic framework in The Architecture of AI Transformation (Wolfe, Choe, & Kidd, 2025), we mapped the enterprise AI landscape along two dimensions: degree of transformation and treatment of human contribution. What emerged was a 2×2 matrix that surfaced four dominant patterns: individual augmentation, process automation, workforce substitution, and collaborative intelligence. The first three were everywhere. The fourth, the quadrant where transformation is structural and human capability is amplified rather than replaced, was the least deployed and the least measured. That asymmetry led me to look more closely at the mechanism that should define the collaborative intelligence quadrant: co-evolution.

Wilson and Daugherty (2018) proposed three mechanisms of collaborative intelligence: complementarity, where humans and AI contribute different strengths to a shared task; boundary-setting, where organizations define where human authority ends and AI autonomy begins; and co-evolution, where human capability and AI capability improve together through repeated interaction. Complementarity maps neatly to the individual augmentation quadrant in our framework. It is the most documented and the easiest to measure. Boundary-setting is the governance conversation. Co-evolution is the least measured and, I believe, the most consequential, because it is the only mechanism that produces durable capability growth rather than static efficiency. It is also the mechanism that would need to be operating for the collaborative intelligence quadrant to actually function as described. Without co-evolution, collaborative intelligence is just complementarity with better branding.

The hypothesis: co-evolution between humans and AI systems is occurring in organizations but is not being measured, named, or documented in the enterprise AI literature, because the field's dominant measurement frameworks privilege deployment metrics over interaction quality.

Three takeaways

First, the most rigorous longitudinal study of AI in organizations gets closest to describing co-evolution without ever naming it. The MIT Sloan Management Review and BCG Artificial Intelligence and Business Strategy research program has tracked AI implementation annually since 2017 across thousands of organizations. Their 2020 report found that only 10% achieved significant financial benefits, and that the distinguishing factor was organizational learning: companies that learned with AI, not just from it, were six times more likely to succeed (Ransbotham et al., 2020). The 2022 report established a bidirectional relationship: when individuals derive value from AI, organizations benefit as well (Ransbotham et al., 2022). The 2024 report found that organizations combining organizational learning with AI-specific learning were up to 80% more effective at managing uncertainty (Ransbotham et al., 2024). The 2025 report found that agentic AI “does not fit traditional management frameworks” (Ransbotham et al., 2025). Across eight years, this program has documented the conditions under which humans and AI systems improve organizational performance together. That is co-evolution. They have never called it that.

Second, the absence is a measurement problem, not an empirical one, and it is the same measurement problem we diagnosed in our framework paper. The enterprise AI literature overwhelmingly measures deployment: adoption rates, usage frequency, time saved, cost reduced. These metrics capture complementarity. They populate the individual augmentation quadrant. They do not tell you whether anything is actually transforming. Co-evolution requires longitudinal measurement of capability change in both the human and the system over repeated interaction cycles. Argote (2011) established that organizational learning is observable through changes in knowledge, routines, and performance over time. The MIT SMR-BCG program has shown that learning with AI predicts financial benefit. But no study I have found tracks whether human judgment improved because of working with AI, or whether AI outputs improved because of human feedback, at the level of specific workflows over time. We have instruments for complementarity. We do not yet have instruments for co-evolution. This is why the collaborative intelligence quadrant remains underpopulated: the field lacks the measurement infrastructure to detect what would distinguish it from augmentation.

What that infrastructure would need to look like is becoming clearer. The first layer is positioning: determining whether an organization is actually operating in conditions where co-evolution could occur, or whether it is optimizing within a quadrant where co-evolution is structurally impossible. Questions like whether AI strategy is driven by cost reduction or capability expansion, whether the organization sees AI as replacing tasks or amplifying potential, and whether investment is flowing toward technology or toward human-AI collaboration skills are not abstract. They reveal strategic positioning. The second layer, the one that does not yet exist, is longitudinal: tracking whether human capability and system capability are changing together over repeated interaction cycles at the workflow level. The positioning layer tells you where to look. The longitudinal layer would tell you whether co-evolution is actually happening there.
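For concreteness, the longitudinal layer could be sketched as a per-workflow log of interaction cycles, scored on both sides of the collaboration. Everything in this sketch is hypothetical: the `InteractionCycle` schema, the rubric-style scores, and the crude trend test are illustrations of the shape such an instrument might take, not a validated design.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class InteractionCycle:
    """One human-AI interaction cycle on a specific workflow (hypothetical schema)."""
    workflow_id: str
    cycle: int                 # sequence number of the interaction
    human_capability: float    # e.g., rubric-scored quality of unaided human judgment
    system_quality: float      # e.g., rated usefulness/accuracy of the AI's output

def trend(values):
    """Crude trend estimate: mean of the second half minus mean of the first half."""
    mid = len(values) // 2
    return mean(values[mid:]) - mean(values[:mid])

def coevolution_signal(cycles):
    """Both capabilities rising over repeated cycles suggests co-evolution;
    one side rising alone is ordinary learning or model tuning."""
    cycles = sorted(cycles, key=lambda c: c.cycle)
    human = trend([c.human_capability for c in cycles])
    system = trend([c.system_quality for c in cycles])
    if human > 0 and system > 0:
        return "co-evolution candidate"
    if human > 0:
        return "human learning only"
    if system > 0:
        return "system improvement only"
    return "static use"
```

The point of the sketch is the data requirement, not the arithmetic: detecting co-evolution means scoring the human and the system separately, at the workflow level, on every cycle, which is exactly what deployment metrics never collect.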

Third, I suspect co-evolution is happening, and I suspect it is hiding in the places where people describe AI as “indispensable” without being able to explain why. BCG’s 2025 AI at Work survey found that employees at organizations that have reshaped workflows around AI report sharper decision-making and more strategic work. They describe AI as a collaborative partner rather than a tool. This language is suggestive. It implies a relationship that has evolved, not a tool that was adopted. When workers describe AI as indispensable rather than helpful, they may be reporting the felt experience of co-evolution: a gradual shift in their own capability through sustained interaction with a system that was also changing. Current measurement captures the outcome (perceived indispensability) without the mechanism (bidirectional capability improvement over time).

The longer view

Organizational learning theory provides the foundational lens. Argyris and Schön (1978) distinguished single-loop learning (correcting errors within existing frameworks) from double-loop learning (questioning the frameworks themselves). In the terms of our transformation framework, single-loop learning is what happens in the incremental half of the matrix: augmentation and automation improve efficiency within existing structures. Double-loop learning is what would need to happen for the transformational half to operate: the interaction changes what the human can do, what the AI can do, and how the division of labor itself is structured. The MIT SMR-BCG finding that learning with AI predicts success is, I believe, a proxy measure for double-loop learning. But the field has not yet built the instruments to distinguish it from single-loop adaptation, which is why most organizations that believe they are transforming are actually just augmenting more efficiently.

Vygotsky’s (1978) zone of proximal development offers a complementary lens. ZPD describes the space between what a learner can do independently and what they can do with guidance. In human-AI interaction, the AI may function as a scaffold that expands the human’s zone of capability, and the human’s feedback may function as a scaffold that expands the AI’s operational effectiveness. If this bidirectional scaffolding is occurring, it would look exactly like what the enterprise surveys report: people getting better at their jobs in ways they cannot fully attribute, systems getting more useful in ways that are difficult to disentangle from the humans using them.

My two cents

I have spent enough time in AI transformation to know that absence of evidence is not evidence of absence. The MIT SMR-BCG program is the most rigorous longitudinal data we have on AI in organizations, and it keeps circling the same finding: the organizations that succeed are the ones where humans and AI systems are learning together. That is co-evolution. It is the mechanism that would populate the collaborative intelligence quadrant of the framework we proposed, the quadrant that represents the highest value and the lowest deployment. The reason it remains underpopulated is not that co-evolution is rare. It is that we have not built the instruments to see it.

What we need are longitudinal studies tracking capability change in both the human and the system at the workflow level over repeated interaction cycles. We started building the positioning layer through the AI Strategy Diagnostic that accompanies our framework. It can tell you which quadrant you are in. What it cannot yet tell you is whether co-evolution is occurring there. That longitudinal layer is the gap. Until it is built, co-evolution will remain something I believe is happening based on pattern recognition and indirect evidence rather than something I can point to in a dataset. I am comfortable with that uncertainty, but I want to name it clearly: this is a hypothesis in search of its measurement instrument.

Try This

If you are measuring AI’s impact in your organization, ask whether your metrics capture how human capability has changed through interaction with AI, not just how AI has augmented existing capability. The difference between those two questions is the difference between complementarity and co-evolution, between the augmentation quadrant and the collaborative intelligence frontier. The answer may determine whether your AI investment compounds or plateaus.
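One way to make the distinction concrete is to compute both metric families from the same observation window. The session fields, function names, and scores below are invented for illustration; the point is that deployment metrics can look healthy while the capability-change metric stays flat.

```python
from statistics import mean

def deployment_metrics(sessions):
    """Complementarity-style metrics: how much the AI is used and what it saves.
    Each session is a dict with hypothetical fields 'used_ai' and 'minutes_saved'."""
    return {
        "adoption_rate": sum(s["used_ai"] for s in sessions) / len(sessions),
        "avg_minutes_saved": mean(s["minutes_saved"] for s in sessions),
    }

def capability_change(before_scores, after_scores):
    """Co-evolution-style metric: did unaided human capability shift between
    the start and end of the observation window? Scores are rubric-based."""
    return mean(after_scores) - mean(before_scores)

# Hypothetical window: adoption looks strong...
sessions = [
    {"used_ai": True, "minutes_saved": 30},
    {"used_ai": True, "minutes_saved": 20},
    {"used_ai": False, "minutes_saved": 0},
    {"used_ai": True, "minutes_saved": 10},
]
print(deployment_metrics(sessions))          # high adoption, real time savings
print(capability_change([3.0, 3.2], [3.0, 3.2]))  # ...while capability is flat
```

If the first function reports success and the second reports zero, you are measuring augmentation, not transformation.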

Read to learn more

Academic: Argyris, C., & Schön, D. A. (1978). Organizational learning: A theory of action perspective. Addison-Wesley.

Industry: Ransbotham, S., Kiron, D., Khodabandeh, S., Iyer, S., & Das, A. (2025). The emerging agentic enterprise. MIT Sloan Management Review and Boston Consulting Group.

References

Argote, L. (2011). Organizational learning research: Past, present and future. Management Learning, 42(4), 439–446.

Argyris, C., & Schön, D. A. (1978). Organizational learning: A theory of action perspective. Addison-Wesley.

Ransbotham, S., Khodabandeh, S., Kiron, D., Candelon, F., Chu, M., & LaFountain, B. (2020). Expanding AI’s impact with organizational learning. MIT Sloan Management Review and Boston Consulting Group.

Ransbotham, S., Kiron, D., Candelon, F., Khodabandeh, S., & Chu, M. (2022). Achieving individual and organizational value with AI. MIT Sloan Management Review and Boston Consulting Group.

Ransbotham, S., Kiron, D., Khodabandeh, S., Chu, M., & Zhukov, L. (2024). Learning to manage uncertainty, with AI. MIT Sloan Management Review and Boston Consulting Group.

Ransbotham, S., Kiron, D., Khodabandeh, S., Iyer, S., & Das, A. (2025). The emerging agentic enterprise. MIT Sloan Management Review and Boston Consulting Group.

Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.

Wilson, H. J., & Daugherty, P. R. (2018). Collaborative intelligence: Humans and AI are joining forces. Harvard Business Review. https://hbr.org/2018/07/collaborative-intelligence-humans-and-ai-are-joining-forces

Wolfe, D. A., Choe, A., & Kidd, F. (2025). The architecture of AI transformation: Four strategic patterns and an emerging frontier. arXiv preprint arXiv:2509.02853.
