Knowledge Exodus

Knowledge Management in IT: What Happens When Key Staff Leave

April 27, 2026 · 7 min read

The People Who Left Took Your IT Incident Process With Them

I want to tell you about a type of incident that doesn't get talked about enough.

Not the dramatic kind. Not the ransomware attack or the catastrophic infrastructure failure. The quiet kind. The kind where something relatively straightforward breaks, and the resolution takes four times longer than it should - not because the problem is complex, but because the people who understood the system aren't there anymore.

I've seen this happen after redundancies. After resignations. After reorganisations where teams were shuffled and institutional knowledge was scattered. And I've seen it happen in companies that by every external measure looked well-run - good tools, competent teams, reasonable processes.

The knowledge was just never written down. And when the people who held it left, it left with them.

How Technical Debt and Knowledge Debt Compound Each Other

Technical debt is a concept most IT leaders understand. You make a pragmatic decision - ship now, fix later - and the later never quite arrives. The workaround becomes the system. The temporary fix becomes permanent. Over time, the gap between how the system was designed and how it actually works grows, and the people who bridge that gap are the ones who remember both versions.

What gets talked about less is knowledge debt. The accumulated gap between what the system does and what is documented about it. The decisions that were made and never recorded. The dependencies that everyone assumed everyone knew about. The reason the alert threshold is set at that number and not a different one - a reason that made complete sense at the time and now exists only in the memory of someone who may or may not still work there.

When technical debt and knowledge debt compound - when the system has evolved significantly from its documented state AND the people who know the current state have left - the result is an incident response capability that is far more fragile than it appears. You have a disaster in waiting, with a slow recovery already built in.

Everything looks fine until it isn't. And when something breaks, the team is simultaneously trying to fix the problem and understand the system well enough to fix the problem.

What I Watched Happen

In one situation I was close to, a company lost two senior technical people within a few months of each other. One left for a better opportunity. One was made redundant as part of a restructure. Both departures were handled professionally - notice periods were served, handovers were attempted.

The handovers were inadequate. Not through bad faith on anyone's part. Through the simple reality that the knowledge was too deep, too contextual, and too embedded in daily practice to transfer in two weeks of handover meetings and documentation sprints.

Six months later, a moderately complex incident hit. Under normal circumstances - with either of those two people available - it would have been resolved in under two hours. Instead it took the better part of two days. Not because the remaining team was incapable. Because they were working in a system they partially understood, with documentation that described an earlier version of the infrastructure, without the context that would have told them where to look first.

The business cost was significant. The human cost - the exhaustion, the stress, the erosion of confidence in the team - was harder to measure and longer lasting.

The Knowledge Transfer Failure Is a Process Failure, Not a People Failure

I want to be clear about something, because I've seen this situation handled badly in the aftermath.

When incidents like this occur, there is a temptation to attribute the problem to the people who left - they didn't document enough, the handover wasn't thorough enough - or to the remaining team, who should have asked more questions, sought out more knowledge.

Both of those attributions miss the point.

The failure is a process failure. Specifically, the failure to build knowledge management as a continuous organisational practice rather than something that happens in a hurry when someone hands in their notice.

If the only time documentation gets written is during an exit process, it will always be inadequate. The person writing it is leaving. They're distracted, sometimes emotional, focused on their next role. The person receiving it is trying to absorb months or years of context in a matter of days. The conditions are structurally unsuited to effective knowledge transfer.

The alternative - building knowledge management into the daily practice of the team - is not glamorous. It doesn't feel urgent when everything is working. But it is the thing that determines whether your incident response capability is resilient or brittle, regardless of who is in the building on any given day.

What Proper IT Knowledge Management Looks Like in Practice

I'm not going to suggest that every team needs a full-time knowledge manager or a complex documentation system. For most companies of 15 to 200 people, what's needed is more specific and more achievable than that.

Runbooks that are written and maintained as a standard part of the work - not created retrospectively when someone is leaving. Architecture decision records that capture not just what was decided but why. On-call documentation that is tested regularly - meaning someone other than the author has used it successfully to resolve an actual problem.

A simple rule that I've found effective: if a process can only be executed by the person who designed it, it isn't a process yet. It's a personal practice. The goal is to convert personal practices into documented processes that anyone with the appropriate technical level can follow.
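
One rough way to apply that rule is to keep a note of who has actually run each process and flag the ones nobody but the designer has executed. The sketch below is illustrative only: the process names, the people, and the `personal_practices` helper are all hypothetical, and a real team would more likely pull this from ticket or on-call history.

```python
def personal_practices(designers, executed_by):
    """Return processes that only their designer has ever executed.

    designers:   process name -> person who designed it (hypothetical data)
    executed_by: process name -> set of people who have run it successfully
    """
    flagged = []
    for process, designer in designers.items():
        # Anyone other than the designer counts as evidence of a real process.
        others = executed_by.get(process, set()) - {designer}
        if not others:
            flagged.append(process)
    return sorted(flagged)


# Hypothetical example data.
designers = {"restore-database": "alice", "rotate-certificates": "bob"}
executed_by = {
    "restore-database": {"alice"},
    "rotate-certificates": {"bob", "carol"},
}

print(personal_practices(designers, executed_by))  # ['restore-database']
```

Anything on that list is a candidate for the next documentation or cross-training session, while the designer is still around to answer questions.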

Good IT knowledge management doesn't require an elaborate platform. It requires a commitment to writing things down as part of doing them - and a leadership decision that this matters enough to be a genuine expectation, not an aspiration that gets deprioritised whenever something more urgent arrives.

The Conversation Most Leadership Teams Avoid

There is a conversation that most leadership teams find uncomfortable, and it goes something like this.

If your three most knowledgeable technical people left next month, how long would it take your remaining team to resolve a major incident in your core systems?

If the honest answer is "much longer than it should" or "I don't actually know," that's not a reason for alarm. It's a reason for action. It's telling you that your incident response capability is currently person-dependent rather than system-dependent, and that the risk associated with any individual departure is higher than it needs to be.

Building resilience into the team - through documentation, through cross-training, through deliberate knowledge sharing - is not a one-time project. It's an ongoing practice. And it's significantly easier to build when everything is calm than to reconstruct after the people who held it together have already gone.

Frequently Asked Questions

What is knowledge debt in IT?

Knowledge debt is the accumulated gap between how your systems actually work and what is documented about them. It grows quietly over time - especially after staff departures, system changes, or periods of rapid development - and becomes a serious risk during incident response, when teams are relying on documentation that no longer reflects reality.

How do I reduce the risk of key person dependency in IT?

The most effective approach is building IT knowledge management as a continuous practice: maintaining runbooks as part of daily work, using architecture decision records, and regularly testing documentation with people who didn't write it. Cross-training and deliberate knowledge-sharing sessions also reduce single points of failure.

What should IT runbooks include?

Effective runbooks include step-by-step resolution procedures for common incidents, escalation paths, system access information, known dependencies, and context for why key decisions were made. Critically, they should be tested by someone other than the author to verify they can actually be followed under pressure.
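
As a minimal sketch, that list can be turned into a completeness check. The `Runbook` structure, its field names, and the `runbook_gaps` helper below are my own assumptions for illustration, not a standard format; adapt them to however your team stores runbooks.

```python
from dataclasses import dataclass, field


@dataclass
class Runbook:
    # Field names are hypothetical; they mirror the items listed above.
    title: str
    author: str
    resolution_steps: list = field(default_factory=list)
    escalation_path: list = field(default_factory=list)
    access_info: str = ""
    dependencies: list = field(default_factory=list)
    decision_context: str = ""
    tested_by: list = field(default_factory=list)


def runbook_gaps(rb):
    """Return a list of the sections this runbook is still missing."""
    gaps = []
    if not rb.resolution_steps:
        gaps.append("resolution steps")
    if not rb.escalation_path:
        gaps.append("escalation path")
    if not rb.access_info:
        gaps.append("system access information")
    if not rb.dependencies:
        gaps.append("known dependencies")
    if not rb.decision_context:
        gaps.append("decision context")
    # A test only counts if someone other than the author ran it.
    if not any(person != rb.author for person in rb.tested_by):
        gaps.append("tested by someone other than the author")
    return gaps


# Hypothetical example: a partly written runbook tested only by its author.
rb = Runbook(
    title="Database failover",
    author="alice",
    resolution_steps=["Promote the replica", "Repoint the application"],
    escalation_path=["on-call lead"],
    tested_by=["alice"],
)
for gap in runbook_gaps(rb):
    print("missing:", gap)
```

A check like this only confirms that each section exists. Whether the steps can be followed under pressure still has to be established by a person actually following them.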

How often should IT documentation be reviewed?

At a minimum, documentation should be reviewed when significant system changes are made, when team members join or leave, and after any incident where outdated documentation contributed to a slower resolution. Many teams build a quarterly documentation review into their operational calendar.
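
The calendar part of that answer is easy to automate. The sketch below assumes you record a last-reviewed date per document and treats "quarterly" as 90 days; both the interval and the document names are assumptions for illustration. It does not cover the event-driven triggers (system changes, joiners and leavers, incidents), which still need a human decision.

```python
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as 90 days.
REVIEW_INTERVAL = timedelta(days=90)


def docs_due_for_review(last_reviewed, today, interval=REVIEW_INTERVAL):
    """Return names of documents whose last review is older than the interval.

    last_reviewed: document name -> date of last review (hypothetical data)
    """
    return sorted(
        name
        for name, reviewed in last_reviewed.items()
        if today - reviewed > interval
    )


# Hypothetical example data.
last_reviewed = {
    "vpn-runbook": date(2026, 1, 5),
    "backup-runbook": date(2026, 4, 1),
}

print(docs_due_for_review(last_reviewed, today=date(2026, 4, 27)))
# ['vpn-runbook']
```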
