The 2am war room hasn’t gone anywhere. Ten years after Gartner coined the term AIOps, the platforms are bought, the licenses are renewed, and the dashboards are live, yet serious incidents still get resolved by engineers paging through multiple consoles, trying to work out where the fire actually is. MTTR has barely moved. Alert fatigue hasn’t eased. The outcomes the category promised, in most enterprises, have not arrived.
Matt Lowe’s recent article on AIOps names the shortfall well. The argument is that the gap isn’t technological; it’s organizational. Undocumented service ownership. Siloed telemetry. Tools deployed by different teams, with different data models, never designed to speak to each other. The line that captures it best: “we built the instruments. We didn’t build the orchestra.”
He’s right. There’s one part of that diagnosis worth pulling on harder — the part about tools deployed by different teams. Because behind it sits a pattern most enterprise IT leaders are too close to see clearly. And it’s the pattern that actually generates the war room.
The pattern is the procurement, not the platform
Look at how the average large enterprise built its monitoring stack. Not in architecture documents. In invoices.
The application performance monitoring tool was bought in 2018 by an application team trying to debug a customer-facing service. The infrastructure monitoring tool came in 2019, when a different team needed visibility into a virtualization platform. The log management platform was a 2021 procurement, driven by SecOps after a compliance audit. The digital experience monitoring tool came in 2023, when end-user computing wanted to know why VDI sessions were slow. The network observability tool predates most of them, renewed every year because the network team uses it daily and would resist replacing it.
Five tools. Five buying decisions. Five different teams. Five different evaluators. Five different success criteria. Each one defensible on its own merits. None of them informed by the others. None evaluated against a shared definition of what operational visibility should look like across the layers.
That is silo buying. And it is, more than anything else, why enterprise monitoring estates fragment.
The uncomfortable thing about it is that you cannot point to a moment where it went wrong. Each procurement happened on its own clock, with its own brief, and produced a tool that did its slice well. The application team bought APM — they were not buying a monitoring strategy. The infrastructure team was not asked to think about how its tool would correlate with the application layer above. The brief did not contain that question. And so on, across every layer.
The fragmentation isn’t the result of poor decisions. It is the compound effect of good decisions made in isolation. Which is why no one inside the organization can be blamed for it — and why no one inside the organization is naturally positioned to fix it.
What the cost actually is
The cost of silo buying doesn’t appear on any vendor invoice. It shows up in the gap between the tools.
When an incident touches multiple layers — and most material incidents do — each tool sees its own slice of the truth. The APM tool flags a slow transaction. The infrastructure tool flags a VM under memory pressure. The log platform flags an authentication error. The DEX tool flags a poor session experience. The network tool flags a switch saturation event.
Each of these is true. None of them is the full picture. And no individual tool can correlate the others into a single, coherent root-cause narrative, because the tools were never designed to share a topology, a data model, or an alert taxonomy.
That is the war room at 2am. It isn’t the failure of any one platform. It is the architectural debt of how the platforms were bought.
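To make the missing correlation concrete, here is a minimal sketch of what becomes possible once every tool tags its alerts with a shared service identifier. Everything in it is an assumption for illustration: the `Alert` record, the `service` field, the source labels, and the five-minute window are hypothetical and not taken from any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str       # which tool raised the alert (hypothetical labels)
    service: str      # shared service identifier, the field silo-bought tools lack
    timestamp: float  # seconds since epoch
    summary: str


def correlate(alerts, window_seconds=300.0):
    """Group alerts on the same service that fall within one time window."""
    by_service = defaultdict(list)
    for alert in alerts:
        by_service[alert.service].append(alert)

    incidents = []
    for items in by_service.values():
        items.sort(key=lambda a: a.timestamp)
        current = [items[0]]
        for alert in items[1:]:
            # Keep the alert in the current group while it is within the
            # window measured from the group's first alert.
            if alert.timestamp - current[0].timestamp <= window_seconds:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents


# Illustrative data mirroring the five signals described above.
alerts = [
    Alert("apm", "checkout", 1000.0, "slow transaction"),
    Alert("infra", "checkout", 1030.0, "VM under memory pressure"),
    Alert("logs", "checkout", 1060.0, "authentication error"),
    Alert("dex", "checkout", 1090.0, "poor session experience"),
    Alert("network", "checkout", 1120.0, "switch saturation"),
    Alert("apm", "search", 5000.0, "slow transaction"),
]
incidents = correlate(alerts)
```

The grouping logic is trivial. The hard part is the `service` field, which only exists if the tools were selected and configured against a shared topology in the first place.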
The follow-on cost is harder to see, but bigger. Service-aware automation can’t be built on a topology nobody shares. Cross-domain runbooks can’t be maintained against alert formats that don’t align. AI-driven anomaly detection produces five overlapping noise streams instead of one signal. Every promising operational capability hits the same wall, because the wall is the integration gap between the siloed purchases.
What is actually needed
Solving this isn’t a tool problem. It is a posture problem.
It needs three things, none of which are exciting, all of which are necessary.
- Cross-layer architectural authority. Someone in the organization (a head of observability, a director of operational visibility, whatever the title) who is empowered to think across the layers and influence buying decisions in each one. Not to override the layer teams’ authority, but to ensure every tool selection answers a shared set of cross-layer questions before the contract is signed. Most large enterprises don’t have this role at all. The ones that have created it are the ones whose monitoring estates look less fragmented over time.
- An operating model before the next RFP. Define what the target state of monitoring looks like across the estate before evaluating the next tool. What does cross-layer root cause analysis need to look like? What does a service topology that survives across teams require? What does a unified alert taxonomy mean in practice? These questions can be answered, but they have to be answered before the next procurement cycle, not after. Teams that bring these questions into their RFP process end up with materially different vendor shortlists than teams that don’t.
- Telemetry as a shared discipline. Treat telemetry the way a finance organization treats financial data — as a managed, owned, governed capability, not a side-effect of each tool. Standards for what gets emitted, how it gets enriched, where it lands, who maintains it. This is mundane operational discipline. It is also the thing that determines whether the next AIOps initiative produces signal or noise.
None of this requires ripping out the existing stack. It requires changing the posture in which the next addition to the stack gets evaluated.
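As one possible illustration of telemetry as a governed capability, the sketch below checks a record against a shared emission standard before it is accepted. The required fields and severity values are assumptions chosen for the example, not a published schema; a real standard would be defined by whoever holds the cross-layer role.

```python
# Hypothetical shared standard: every record must carry these fields.
REQUIRED_FIELDS = {"service", "owner_team", "layer", "severity"}
# Hypothetical unified severity taxonomy.
ALLOWED_SEVERITIES = {"info", "warning", "critical"}


def validate_record(record):
    """Return a list of problems; an empty list means the record conforms."""
    missing = sorted(REQUIRED_FIELDS - set(record))
    problems = [f"missing field: {name}" for name in missing]
    severity = record.get("severity")
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


conforming = {
    "service": "checkout",
    "owner_team": "payments",
    "layer": "application",
    "severity": "critical",
}
# A record using one tool's private severity scale and no ownership data.
nonconforming = {"service": "checkout", "severity": "P1"}

conforming_problems = validate_record(conforming)
nonconforming_problems = validate_record(nonconforming)
```

A check like this can run against the tools already in place, which is consistent with the point that the existing stack does not need to be replaced.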
The bottom line
Every individual decision that built the current monitoring stack was reasonable. The compound effect is not.
The next platform any large enterprise evaluates is not only a tool selection. It is a decision about whether the operating model gets better, or fragments further.
The 2am war room isn’t built in incidents. It is built in RFPs. Making it optional starts there.