After the Safety Work

A note for leaders who built psychological safety and are now looking at what is behind it.

By: Andre Kotze

If you lead a People function, you’ve probably spent part of the last five years building psychological safety. By most measures, it worked. Surveys improved. People who used to stay quiet started raising things. Skip-levels stopped feeling like compliance theater.

Here’s what you may also have noticed in the same period: the gains stopped translating cleanly into decision quality. Not in a way that points back at the safety work — meetings got warmer and more participative, exactly as you’d want. But decisions coming out of these healthier rooms still fail at about the rate they used to. What’s changed is the texture of the failure. The surprises that used to read as “people weren’t telling us things” now read as “people were telling us things, and the things didn’t add up to the picture we needed.”

This version of the problem is harder to name, because the obvious move — keep investing in candor — doesn’t fit it. The room is already candid. Whatever’s producing the fragile decisions sits somewhere candor doesn’t reach.

We’ve been watching this from the behavioral side for a while. On first hearing, the pattern can sound like a critique of the safety work. It isn’t. It’s a second pattern that lives alongside safety, can hide inside it, and responds to a different kind of attention.

What the behavioral data shows

Across more than 700 groups observed by trained practitioners since 2021, we code the functional type of each verbal contribution in real time — not what people say, but the kind of move each statement makes. A proposal. A question. An agreement. An extension of someone else’s idea.

One pattern holds steadily across the dataset: after someone puts a new idea on the table, the most likely next move from anyone else is agreement. Less likely is building on the idea — taking what was proposed and developing it.

The numbers:

  • Supporting — explicit agreement or validation — runs at about 9.0% of all coded behavior.
  • Building — extending or refining a proposal someone else made — runs at 2.9%.
  • Proposing a new idea runs at 11.2%.

So ideas get generated at roughly four times the rate they get developed. And agreement shows up about three times as often as development.
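For readers who want to check those two ratios, here is a minimal sketch of the arithmetic. The three percentages are the ones quoted above; the variable and function names are illustrative assumptions and not part of the coding scheme itself.

```python
# Shares of all coded behavior, in percent, as quoted in the list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
rates = {"proposing": 11.2, "supporting": 9.0, "building": 2.9}


def ratio(numerator: str, denominator: str) -> float:
    """Return how many times more often one behavior occurs than another."""
    return rates[numerator] / rates[denominator]


propose_to_build = ratio("proposing", "building")
support_to_build = ratio("supporting", "building")

print(f"proposing vs. building:  {propose_to_build:.1f} to 1")
print(f"supporting vs. building: {support_to_build:.1f} to 1")
```

Rounded, the first ratio is the "roughly four times" in the text and the second is the "about three times."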

(A coding note, because it matters: “that’s a great idea, and we could extend it to Europe” counts as two behaviors, not one — the agreement and the build, scored separately. The companion piece on Substack works through the figures in full.)

What this looks like in practice: a meeting runs smoothly, with participation and warm responses and no dissent left in the room — and still produces decisions made from proposals in roughly the shape they arrived. The room didn’t make the idea more robust. It approved it.

This isn’t groupthink in the sense Janis meant. There’s no suppressed dissent, no self-censoring. The disagreement that would have surfaced the bad assumption simply wasn’t reached for — by anyone — because reaching for it takes a specific move, and that move runs about a third as often as agreement.

Why this isn’t an argument against safety

Here’s the part that complicates the simple version.

Supporting — the “great point,” the “I’m with you” — is one of the building blocks of what good safety work produces. Edmondson describes psychological safety as the shared belief that interpersonal risk-taking is safe, and one way that belief shows up behaviorally is people receiving each other’s contributions warmly. A room that doesn’t do this isn’t a safe room.

And in our observations, Building tends to be more common in psychologically safe teams, not less. People extend each other’s ideas more freely when it doesn’t feel like encroachment. In competitive or hierarchical rooms, extending someone’s idea can read as a claim on it, so people propose their own instead. On the whole, safety helps Building. The relationship between safety and depth isn’t a trade-off, and any argument that treats it as one is too clean to be true.

What the data does surface is a specific failure mode that hides inside the appearance of safety. We see groups whose surface signature looks healthy — high Supporting, warm climate, almost no disagreement — but where the disagreement is missing because nothing is getting challenged, not because everything was examined and found sound. What looks like trust is functioning as avoidance. The room is agreeable. It isn’t examining.

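Both things can be true at once: safety helps Building, and a safe-looking room can still stop examining. From inside the meeting, the two kinds of room are nearly impossible to tell apart, because unexamined agreement and examined agreement produce the same warmth.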

That’s the distinction worth holding. A safe room is a necessary condition for depth, not a sufficient one. You can have a room that’s genuinely safe — where the safety work plainly did its job — and where the developmental move still isn’t happening, because the room slid from “safe enough to challenge” into “agreeable enough that nobody does.”

We hold this carefully, because the data is observational and contains no outcome measures. We can see the high-Supporting, low-challenge pattern clearly and describe its structure. We can’t tell you how often it’s made a specific decision worse than it would otherwise have been. The mechanism — agreement closing the loop on an idea before anyone tests it — is visible and consistent. The consequence for any single decision sits a step beyond what we can directly observe.

What we can say with more confidence: this version of the problem isn’t fixed by more of the same safety work. If the room has slid into agreeableness, making it safer can deepen the very comfort that’s suppressing challenge.

What this changes for the People function

If this pattern is real in your organization, a few things shift in what you watch for.

The diagnostic story you tell about a decision that didn’t hold. The usual question is whether someone knew something they didn’t say. In a healthy-safety culture, the answer is increasingly “no” — and that “no” used to be diagnostic, pointing at whatever was suppressing candor. Now it more often means the candor was there and the failure was elsewhere. The question worth adding: did anyone develop the idea after it was proposed, or did the room register approval and move on? Most executive post-mortems don’t ask this, because the vocabulary built up over the safety era doesn’t include it.

Meeting design becomes a more visible lever. The rate at which ideas get developed responds to structure more than culture — agenda density, time per item, who proposes first, whether the first response to a proposal is set up as a question rather than a position. The same team builds less in a denser meeting than in a sparser one. That’s useful, because culture work is slow and structural work is fast. If the depth gap is what’s costing you, you can act on it next week.

Supporting and Building do different jobs. Both rise in a healthy climate, but a room can be rich in one and thin in the other. Supporting validates; Building examines. When a decision rests on a proposal that was warmly endorsed but never extended, it rests on a narrow base — often the first version of the idea, approved before it was opened up or pressure-tested. That kind of decision looks like consensus and behaves like fragility. The remedy isn’t less safety. It’s deliberate attention to the developmental move, which has its own behavior and needs its own encouragement.

This tends to bite hardest at the top. We don’t have outcome data, but the behavioral pattern by level is consistent: development is no more common in senior rooms, and often less. Time pressure is part of it; so is the higher social cost of extending a senior peer’s idea rather than a junior colleague’s. The highest-stakes decisions, made in the rooms where you spend the most time, are the ones most exposed to the gap.

Where to start

None of this calls for a new program. The first move is smaller and diagnostic.

Pick one or two decisions from the last six months that didn’t fully land in execution. Look at the post-mortem, if there was one, and notice whether it asked how the proposal was developed between the time it was made and the time it was decided. If it didn’t, that’s a question worth re-running with the people who were in the room. If the honest answer is “we approved it more or less in the shape it arrived” (and in the rooms we observe, that’s common), you’ve found the gap in your own organization. From there the meeting-design moves get specific: you’re not redesigning meetings in general, you’re redesigning one kind of decision meeting in response to something a post-mortem just surfaced.

One caution, plainly. None of this says psychological safety is overrated, or that you should pull back from the work. Safety remains foundational. And the rate at which ideas get developed isn’t a target to chase — treated as a dashboard metric, it would collapse a structural observation into a performance number, which is the wrong use of it. The point is to add a question to the diagnostic toolkit, not a figure to a scorecard.

If you’ve been watching decisions wobble in a culture that by every other measure looks healthy, and the standard remedies haven’t moved the needle, we’d be interested in what you’re seeing — not as a sales conversation, but as an exchange between people who’ve watched the same problem from different angles.

The companion piece on Substack works through the behavioral pattern in fuller detail, including the interpretations we hold open about why the gap exists. https://substack.com/@autusnam663095/note/p-197528472?r=7433el&utm_source=notes-share-action&utm_medium=web

This is the executive-team version of a smaller conversation: what to do with the recognition that the safety investment, while right, wasn’t the whole answer.