What Incident Communication Can Fix - And What It Genuinely Can't

May 21, 2026 - 7 min read

One of the things I've learned from thirteen years in IT operations management is that the most useful thing you can do for someone considering any kind of process improvement is to be precise about what it will and won't change.

Vague promises - "better incident response," "improved communication," "reduced downtime" - are easy to make and hard to evaluate. They set up the wrong expectations, which leads to disappointment even when the work itself is good.

The same is true of incident management tools. AlertOps will tell you it reduces MTTA. PagerDuty will show you dashboards of improved response times. Opsgenie will demonstrate escalation routing that sounds like it will transform your incident process. Some of those promises are accurate - within the specific domain the tool operates in. None of them is the full picture.

I want to be specific. Here is what structured incident communication can genuinely fix. Here is what it cannot. And here is where tools fit - and where they don't.

What It Can Fix

The communication vacuum during live incidents. The silence that forms when the engineering team is heads down, and nobody is managing the outward-facing conversation. Structured communication - pre-defined roles, pre-written templates, clear intervals - eliminates that vacuum. Clients receive timely updates. Stakeholders are briefed. The CEO is not finding out from a client complaint. This is the most direct and immediate fix, and it produces the most visible results.

No incident management tool fills this vacuum. Tools alert the engineering team. They do not alert the client. They do not write the update. They do not decide what composure sounds like in the first communication. That is a human process built around the tool, not something the tool produces on its own.
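
To make "pre-defined roles, pre-written templates, clear intervals" concrete, here is a minimal sketch of what that preparation can look like when written down as plain data. Everything in it is an assumption for illustration: the severity labels, the interval values, the template wording, and the names `Incident`, `next_update_due` and `draft_first_update` are hypothetical, not taken from any tool. It only drafts the message; a named person still reviews and sends it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from string import Template

# Hypothetical cadence: minutes between outward-facing updates per severity.
UPDATE_INTERVAL_MINUTES = {"P1": 30, "P2": 60, "P3": 240}

# Hypothetical first-update template; the wording is an example only.
FIRST_UPDATE = Template(
    "We are investigating an issue affecting $service. "
    "Our team is working on it now. "
    "Next update by $next_update UTC."
)


@dataclass
class Incident:
    service: str
    severity: str
    comms_owner: str  # the named person who sends updates, not the engineer fixing it


def next_update_due(incident: Incident, last_update: datetime) -> datetime:
    """Return when the next outward-facing update is due for this severity."""
    minutes = UPDATE_INTERVAL_MINUTES[incident.severity]
    return last_update + timedelta(minutes=minutes)


def draft_first_update(incident: Incident, sent_at: datetime) -> str:
    """Fill the template so nobody starts from a blank page under pressure."""
    due = next_update_due(incident, sent_at)
    return FIRST_UPDATE.substitute(
        service=incident.service,
        next_update=due.strftime("%H:%M"),
    )
```

The point of the sketch is that the interval and the opening wording are decided before the incident, so the only decisions left during it are the ones that need human judgement.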

The freeze under pressure. The incident commander who knows what to do technically but doesn't know what to say to the board. The on-call engineer who has never written a client update during a live P1. Structured communication provides frameworks for these specific moments - not scripts, but structures that remove the cognitive load of starting from nothing when the pressure is highest.

The post-mortem that changes nothing. A well-designed post-mortem process - with clear accountability, specific actions, and a follow-up mechanism - closes the loop between what was identified and what was changed. It creates conditions where completion is the expectation rather than the exception.

The recurring incident caused by a process failure. If incidents are recurring because the communication and coordination around them is chaotic - because nobody knows who owns what, because escalation paths are improvised - a structured process addresses that directly.

The client relationship that deteriorates after a badly handled incident. Honest, timely, and structured communication during an incident fundamentally changes the client's experience of it. The damage to relationships that comes from silence and disorganisation is largely avoidable. This is where the investment pays back most clearly.

What It Cannot Fix

Fundamentally unstable infrastructure. If your systems are generating incidents at a rate that reflects architectural problems or technical debt, structured communication can make those incidents less damaging, but it won't reduce how often they happen. The technical problems need to be addressed technically. This is also true of tooling: no incident management tool reduces incident frequency. It improves your response to incidents as they occur.

A blame culture that prevents honest escalation. I can design an escalation process. But if people are afraid to escalate, the process will not be used when it matters most. Culture change requires commitment from leadership that goes beyond implementing a framework - or purchasing a platform.

The consequences of an incident that was technically mishandled. If an incident caused significant damage due to an engineering decision that was wrong, communication cannot undo it. It can manage the relationship through the aftermath. It cannot change what happened technically.

The decision not to invest in prevention. If an organisation repeatedly chooses not to address root causes, communication support can reduce the cost of each incident, but cannot change that strategic decision.

A team that isn't committed to using what gets built. A playbook that isn't practised. An escalation matrix that isn't used because the team reverts to habit under pressure. The process is the easy part. The commitment to using it is where most implementations succeed or fail. Tools have exactly the same failure mode: a well-configured platform that the team doesn't use correctly produces the same outcome as no platform.

Where Tools Fit - And Where They Don't

I want to be direct about this because it's the question behind many of my conversations.

Incident management tools - the good ones - genuinely improve the speed and reliability of alerting, escalation routing, and incident tracking. They reduce the time between something breaking and the right people being notified. They make on-call scheduling more manageable. They create audit trails that post-mortems can use. These are real and valuable capabilities.

What they don't do: they don't tell you who owns the client communication. They don't decide what the first update says or when it goes. They don't determine whether your post-mortem actions get completed. They don't address the blame culture that causes delays in escalation. They don't train your incident commanders. They don't build the human process that everything else depends on.

The right sequence is: define the process, then choose the tool that serves it. Not the other way around. I work with organisations before the tool decision - helping them understand what process they actually need - and after it, ensuring the tool works within a human system that holds up under real pressure. The tool is one component. The human process is the foundation.

What This Means for How to Evaluate the Investment

Identify which of the fixable problems you have. If the communication vacuum is real, if the freeze under pressure is real, if the post-mortem process isn't closing the loop - those are the things that will change with structured communication.

Identify which of the unfixable problems also exist. If the infrastructure is the primary issue, address that in parallel. If the culture doesn't support honest escalation, the process work has limited reach until the culture shifts.

Be honest about the commitment question. Do you have the capacity and will to implement what gets built? If yes, the results will be clear. If not yet, understand what needs to change first.

And if you're mid-way through a tool evaluation - ask yourself whether you've first defined the process that the tool will need to serve. If not, that's where to start.

Frequently Asked Questions

What is the most common reason incident communication processes fail?

The most common failure is the embedding problem - the process is documented but never practised, so it isn't available as a habit under real pressure. This failure mode applies equally to tool implementations: a well-configured platform the team doesn't use correctly produces the same outcome as no platform.

Can an incident management tool replace a structured communication process?

No. Tools like PagerDuty, Opsgenie, and AlertOps improve the speed and reliability of alerting and escalation. They don't define who owns client communication, what the first update says, how the CEO is briefed, or whether the post-mortem loop actually closes. Those require a human process built around the tool.

Should I define my incident process before choosing a tool?

Yes. The process defines your requirements: escalation paths, severity thresholds, communication ownership, on-call roles. Without those, tool selection is based on feature comparison rather than operational fit. Define the process first, then choose the tool that serves it most effectively.
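
One way to check whether the process is defined well enough to start a tool evaluation is to write the four requirement areas down and see which are still empty. The sketch below assumes a simple dictionary layout; the key names, the sample entries and the helper `undefined_areas` are hypothetical and only illustrate the idea.

```python
# The four requirement areas named above; the key names are illustrative.
REQUIRED_AREAS = (
    "escalation_paths",
    "severity_thresholds",
    "communication_ownership",
    "on_call_roles",
)


def undefined_areas(process: dict) -> list:
    """Return the requirement areas that are missing or still empty."""
    return [area for area in REQUIRED_AREAS if not process.get(area)]


# A hypothetical draft: two areas written down, one empty, one absent.
draft_process = {
    "escalation_paths": {"P1": ["on-call engineer", "incident commander", "CTO"]},
    "severity_thresholds": {"P1": "client-facing outage"},
    "communication_ownership": {},  # nobody has been named yet
}

gaps = undefined_areas(draft_process)
print(gaps)  # ['communication_ownership', 'on_call_roles']
```

If the list of gaps is not empty, a feature comparison between tools has nothing to be measured against yet.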

Can structured communication reduce the duration of IT incidents?

Yes, directly. When the communications role is separated from the technical role, engineers can focus on the fix without having to manage upward communication. Structured roles and clear coordination reduce mean time to resolution, regardless of which tool is used.

What should a post-mortem process include to be effective?

An effective post-mortem covers both technical and process failures, assigns specific actions to named individuals with deadlines, schedules a follow-up checkpoint before the meeting ends, and is facilitated in a genuinely blameless way, focusing on system failure rather than individual performance.
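
The "specific actions to named individuals with deadlines" part can be sketched as a small record plus a check run at the follow-up checkpoint. This is a minimal illustration under stated assumptions: the `Action` fields, the `overdue` helper, and the sample names and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Action:
    description: str
    owner: str       # a named individual, not a team
    deadline: date
    done: bool = False


def overdue(actions: list, today: date) -> list:
    """Return actions that are still open after their deadline."""
    return [a for a in actions if not a.done and a.deadline < today]


# Hypothetical actions from one post-mortem.
actions = [
    Action("Write failover runbook", "Priya", date(2026, 6, 1)),
    Action("Name a comms owner for P1s", "Tom", date(2026, 6, 10), done=True),
    Action("Rehearse the escalation path", "Ana", date(2026, 6, 20)),
]

# At the follow-up checkpoint, list who still owes what.
for action in overdue(actions, today=date(2026, 6, 15)):
    print(f"{action.owner}: {action.description} (due {action.deadline})")
```

A spreadsheet does the same job; what matters is that every action has one owner, one date, and a scheduled moment when someone asks whether it was done.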
