The Conversational Architecture Behind Good Decisions
By: Andre Kotze

What 700+ observed groups reveal about the hidden structure of collective reasoning — and what leaders can do about it

Most post-mortems start in the wrong place

When a decision goes wrong — a strategy that doesn’t land, an implementation that hits unexpected friction, a risk that was obvious in hindsight but invisible in the room — the investigation usually focuses on the decision itself. Was the analysis flawed? Was the information incomplete? Did someone miss a signal?

These are reasonable questions. They are also, in many cases, the wrong ones.

The decision is the output. The meeting is the machine that produced it. And most organizations examine their outputs far more rigorously than they examine the machine.

There is a reason for this. The machine is difficult to see. A meeting leaves behind a decision, sometimes notes, occasionally minutes. What it doesn’t leave behind — at least not in any form most organizations capture — is the conversational structure that produced the decision. Who spoke, and for how long. Who was drawn in, and who stayed silent. Which ideas were developed, and which were displaced by the next proposal. Whether disagreement surfaced and got tested, or surfaced and got smoothed over.

This is the conversational architecture of a group. And it determines, to a degree most leaders underestimate, whether the group’s collective intelligence actually gets used.

At AirtimeBA, we’ve spent the last five years studying this architecture in detail. Since 2021, trained practitioners have observed and coded verbal behavior in more than 700 groups across four continents, logging over one million individual behaviors. What emerges from that dataset is a specific, repeatable picture of how capable groups fail to think well together — and what distinguishes the ones that do.

This piece is for leaders who suspect their meetings are underperforming and want a framework for seeing why. It won’t tell you whether any particular decision was right. That’s a question for domain judgment. It will tell you whether the conversation that produced the decision had the structural features that tend to produce robust collective reasoning, or the features that tend to produce fragile consensus.

What the data is, and what it isn’t

Before the patterns themselves, a note on method. AirtimeBA’s coding system classifies the type of verbal contribution in a group conversation — not the content of what was said. Trained observers, working in real time, assign each verbal behavior to one of 18 categories: proposing ideas, giving information, supporting, disagreeing, building on others’ contributions, seeking reactions, inviting participation, and so on. What results is a behavioral map of how a conversation worked — who contributed what kinds of moves, in what sequence, with what balance.
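For readers who think in data terms, here is a minimal sketch of what that behavioral map might look like, written in Python. The record fields and category labels are illustrative, not AirtimeBA’s actual schema; the point is the shape of the data, not the implementation.

    from collections import Counter
    from dataclasses import dataclass

    # One coded verbal behavior: who spoke, when, and what kind of move it was.
    @dataclass
    class CodedBehavior:
        speaker: str     # participant who made the contribution
        seconds: float   # offset from the start of the meeting
        category: str    # one of the 18 categories, e.g. "Proposing Content"

    def behavioral_map(log: list[CodedBehavior]) -> dict[str, Counter]:
        """Tally, per speaker, how many moves of each category they contributed."""
        counts: dict[str, Counter] = {}
        for b in log:
            counts.setdefault(b.speaker, Counter())[b.category] += 1
        return counts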

The data doesn’t tell us whether a particular decision was correct. It can’t capture domain expertise, factual accuracy, or the quality of the information circulating in the room. It reveals something different, and in some ways more useful: the structure through which whatever intelligence and knowledge was present either got accessed or didn’t.

Research by Anita Williams Woolley and colleagues establishes why this structure matters. Their work identified a measurable collective intelligence factor in groups — a “c-factor” analogous to individual IQ — and found that this factor was not driven by the intelligence of the smartest member, nor even strongly related to the average intelligence of the group. What drove it were two structural features: equality of conversational participation, and the social sensitivity group members demonstrated toward one another.

Woolley’s research named what good groups do. Our data shows what those behaviors look like in verbal terms, how often they actually occur, and where most groups fall short.

Three patterns account for most of the variation we see. A companion practitioner-level treatment of these patterns is published on our Substack, “Why Hiring the Smartest People Doesn’t Guarantee the Smartest Team.” What follows is the business-facing analysis — what the patterns mean for organizations trying to improve the quality of their decisions.

Pattern 1: The participation gap and what it costs

Start with the most visible pattern, because it’s also the most misunderstood.

We measure how conversation is distributed across a group using an Airtime Equity Score. Rather than expecting identical speaking times — an unrealistic and unhelpful standard — the measure looks at how many participants fall within a reasonable band of equal share: within five percentage points above or below what fully equal participation would look like.

When fewer than 15% of participants fall inside that band, the group is flagged as highly skewed. In concrete terms: two or three voices are carrying the meeting, while the rest sit largely silent.
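The arithmetic behind the flag is simple enough to sketch. The function below is an illustration of the rules just described, not the published measure; it assumes each participant’s share of total talk time is already known.

    def is_highly_skewed(shares: list[float], band_pp: float = 5.0,
                         min_in_band: float = 0.15) -> bool:
        """Flag a group as highly skewed under the rules described above.

        shares: each participant's fraction of total talk time (sums to ~1.0).
        band_pp: half-width of the band around equal share, in percentage points.
        """
        n = len(shares)
        equal_share_pct = 100.0 / n  # what fully equal participation looks like
        in_band = sum(1 for s in shares
                      if abs(s * 100.0 - equal_share_pct) <= band_pp)
        return (in_band / n) < min_in_band  # fewer than 15% in band -> skewed

    # An eight-person meeting carried by two voices: equal share is 12.5%,
    # and only one participant lands within 5 points of it (1/8 = 12.5% < 15%).
    print(is_highly_skewed([0.40, 0.30, 0.10, 0.05, 0.05, 0.04, 0.03, 0.03]))  # True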

The leadership temptation is to treat this as a soft problem — a matter of inclusion and politeness rather than performance. That framing misses the point.

Woolley’s research makes the stakes clear. Collective intelligence depends on the group accessing the distributed knowledge and perspective available in the room. When conversation concentrates, it doesn’t just mean some people talk less. It means the group’s cognitive inputs narrow. The perspectives of quieter participants — the doubt, the alternative interpretation, the operational concern that doesn’t surface until implementation — don’t enter the shared pool. Decisions get made on a fraction of the available intelligence.

Skewed participation is rarely about bad intent. It emerges from specific conversational moves, or the absence of them.

The behavior we code as Bringing In — explicitly inviting a specific person to contribute — accounts for just 1.4% of observed behaviors globally. Seeking Reactions — directly asking how someone is responding to what’s being discussed — sits at 1.3%. The moves that actively draw quieter participants into the conversation are rare. In their absence, the loudest voices fill the space by default, and the conversation reaches the decision point with inputs from a subset of the people responsible for executing it.

The cost of this pattern isn’t felt in the meeting. It’s felt afterward — in the objections that surface during implementation, the risks that weren’t named in the room, the execution friction that reads as resistance but is really unsurfaced disagreement. If you have ever watched a well-reasoned strategy fail to mobilize the organization behind it, this pattern is worth examining.

Pattern 2: Participation without co-creation

Here is the finding that surprises leaders most: a group can share airtime relatively evenly and still fail to think together.

Equal participation does not equal collaboration. What determines whether a conversation is genuinely generative is whether participants are building on each other’s thinking — or simply taking turns presenting their own.

We code a behavior called Building: extending, refining, or combining an idea that someone else has already proposed. Not introducing a new idea. Not agreeing with an old one. Actually adding a substantive new dimension to something another person contributed.

Globally, Building accounts for just 2.9% of observed verbal behaviors. Proposing Content — introducing new ideas — sits at 11.2%.

Read those two numbers together. Ideas are generated at nearly four times the rate they are developed. People propose. Others propose something different. The first proposal is set aside. More proposals come. The conversation covers ground, but it doesn’t go deep.

The Building Ratio — Building behaviors relative to Proposing Content — has a global average of approximately 1:3.9. When that ratio stretches to 1:8.5 or beyond, the pattern is clear: a group generating options without integrating them. Ideas arrive, circulate briefly, and are displaced by the next. The conversation is active, even energetic, but the output is a list rather than a synthesis.
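In code terms the ratio is plain arithmetic over the coded counts. A sketch, reusing the illustrative category labels from the earlier example:

    def proposals_per_build(category_counts: dict[str, int]) -> float:
        """How many new ideas arrive per idea developed.

        A value near 3.9 corresponds to the 1:3.9 global average above;
        8.5 or beyond marks a group generating options without integrating them.
        """
        building = category_counts.get("Building", 0)
        proposing = category_counts.get("Proposing Content", 0)
        if building == 0:
            return float("inf")  # nothing anyone proposed was ever developed
        return proposing / building

    print(proposals_per_build({"Building": 29, "Proposing Content": 112}))  # ~3.9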

This matters for decision quality in a specific way. A decision built on a list of competing proposals tends to be fragile. It reflects the preferences of whoever advocated most persistently, the option most easily understood, or the idea that happened to be in front of the group when time ran out. A decision built on genuinely integrated thinking is more robust, because the reasoning has been tested and refined through the process of building on it.

Organizations often confuse these two outputs. Both look like consensus. Both produce a decision. Only one reflects the collective judgment of the group.

The behavioral tell is simple. After a significant proposal, does the next contribution develop it, or pivot away from it? Count this in your own meetings for a week and a pattern emerges. Most teams, even highly capable ones, pivot far more than they develop.

Pattern 3: The advocacy–inquiry imbalance

A third pattern sits beneath both of the above.

We categorize verbal behaviors into two broad orientations. Push behaviors advocate a position — proposing, informing, disagreeing. Pull behaviors draw others in — questioning, supporting, building, inviting contribution.

The global average across our dataset is a Push/Pull ratio of approximately 2.9 to 1. For every question asked or inviting move made, nearly three statements advocate a position or share information.

A lean toward Push is not inherently a problem. Groups need to propose things, share information, advocate for positions. Decisions ultimately require someone to make a call. But when the ratio climbs above roughly 3.5 to 1, what the data tends to show is a group advocating without genuinely engaging. Ideas are pushed forward without sufficient testing. Proposals move toward decision without the friction of real inquiry. Alignment is reached — but it is the alignment of exhaustion rather than understanding.
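The same arithmetic can be sketched for Push/Pull. The mapping from categories to orientations below is partial and illustrative; it does not reproduce the dataset’s actual assignment of all 18 categories.

    # Illustrative, partial mapping of behavior categories to orientation.
    PUSH = {"Proposing Content", "Giving Information", "Disagreeing"}
    PULL = {"Seeking Reactions", "Seeking Reasons", "Supporting",
            "Building", "Bringing In"}

    def push_pull_ratio(category_counts: dict[str, int]) -> float:
        """Push behaviors per Pull behavior.

        Roughly 2.9 is the global average; above about 3.5, groups tend
        to advocate without genuinely engaging.
        """
        push = sum(n for c, n in category_counts.items() if c in PUSH)
        pull = sum(n for c, n in category_counts.items() if c in PULL)
        return float("inf") if pull == 0 else push / pull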

One behavior within the Pull category deserves particular attention. Seeking Reasons — asking for the rationale, evidence, or assumptions behind a position — accounts for just 0.3% of observed verbal behaviors globally. It is the rarest substantive behavior we code.

The near-absence of Seeking Reasons is worth sitting with. It means that in most group conversations, positions are advanced, counter-positions are advanced, and the underlying logic of either is rarely examined. Disagreement becomes a collision of conclusions rather than a comparison of reasoning. Agreement becomes a convergence of preferences rather than a test of the case.

The combination — high Push, low Building, skewed participation, near-zero Seeking Reasons — creates a specific organizational risk. The meeting produces a decision with the appearance of consensus and the reality of unexamined assumptions. The decision moves forward. The assumptions surface later, in the form of problems no one anticipated.

What this means for how you see your meetings

The patterns above describe what we observe across hundreds of groups. They are not prescriptions. Every group’s context is different, every decision has its own stakes, and not every meeting needs the full apparatus of deep collective reasoning. Status updates, operational coordination, and simple information transfer don’t require Building ratios at 1:2. For those meetings, efficiency is the point.

But for the meetings that matter — strategic choices, significant resource allocation, cross-functional decisions with long implementation tails, decisions where the cost of being wrong is high — the conversational architecture is the difference between a decision that holds up and one that quietly falls apart.

Leaders who want to diagnose their own meetings without behavioral coding can look at a few observable features.

Who is not speaking? Not who is quiet by temperament, but who has relevant knowledge or perspective that isn’t getting into the conversation. Notice whether anyone is actively drawing those people in, or whether the conversation simply flows to whoever initiates.

When someone proposes an idea, what happens to it? Does the next contribution develop it, or pivot to something new? The word “and” is worth listening for — it often signals extension, but just as often signals a change of subject dressed as continuation.

What is the ratio of statements to questions? In a 60-minute meeting, count them, even roughly. A room full of capable people making mostly statements is a room that isn’t using its full intelligence.

What follows disagreement? When someone pushes back on a proposal, does the group probe the reasoning (“What leads you to see it differently?”) or move quickly to counter-argument or resolution? The sequence after a challenge is one of the clearest indicators of whether a group is using disagreement constructively or avoiding it.

None of these will give you the precision of behavioral coding. But they will start to make visible something most organizations leave invisible: the conversational architecture through which their collective intelligence either gets used or gets left on the table.

What is knowable, and what isn’t

A word on what this kind of analysis can and cannot tell you.

It can tell you how a conversation was structured — the balance of behaviors, the distribution of airtime, the ratio of development to generation, the presence or absence of inquiry. It can flag patterns that correlate with conversations leaders tend to describe as generative, or with conversations leaders tend to describe as frustrating.

It cannot tell you whether a particular decision was the right one. It cannot substitute for domain judgment, strategic insight, or accurate information. A meeting with a healthy conversational architecture can still produce a poor decision if the premises are wrong or the expertise is missing. A meeting with a poor conversational architecture can still produce a reasonable decision if one person happens to be right and persuasive enough.

What the data suggests is more modest and more durable. Across hundreds of observed groups, certain structural features recur in conversations that use their collective intelligence well, and certain features recur in conversations that don’t. Making those features visible gives leaders something most lack: a view of the machine, not just the output.

A question worth carrying

The most consequential finding in Woolley’s research is that collective intelligence is real, measurable, and distinct from the intelligence of the individuals composing it. Our data adds a behavioral layer: the conversational moves that enable collective intelligence are specific, observable, and in most groups, remarkably rare.

The behaviors that draw quiet voices in: 1.4%.

The behaviors that develop ideas rather than just propose them: 2.9%.

The behaviors that ask for the reasoning behind a position: 0.3%.

These are not small gaps. They represent a systematic underuse of the moves that most directly determine whether a group’s intelligence gets accessed or left on the table.

A team’s collective intelligence is not a fixed property of its members. It is a state the group either does or doesn’t produce in the verbal fabric of its actual conversations. Hiring capable people is a necessary condition for good decisions. It is not a sufficient one. The conversational architecture is where the gap between capability and output either closes or widens — and most organizations have never looked at it directly.

The question worth carrying into your next leadership conversation isn’t whether your team is capable enough. It almost certainly is. The question is whether the way your team talks together is using that capability — or quietly leaving most of it unused.

If what we’ve described resonates with patterns you’re seeing in your own organization, we’d welcome a conversation about what you’re observing.

AirtimeBA has observed verbal behavior in more than 700 groups, logging over one million coded behaviors, with data collected since 2021 across Europe, Africa, North America, and Asia. For a practitioner-level treatment of the patterns described here, written for coaches, facilitators, and team leads, see the companion Substack essay, “Why Hiring the Smartest People Doesn’t Guarantee the Smartest Team.”