Top 4 pitfalls causing poor monitoring in an IT estate

Thousands of organizations are likely running some type of monitoring solution right now, each within its own operational structure, but the difference between those that are successful and those that are not can be attributed to 4 root causes.

Not surprisingly, these 4 root causes apply to any toolset or technology used, because even the best tools in the world will deliver poor results when used poorly. Here, we take a look at these particular root causes because they are often overlooked, either because they are unknown or because they are known but not valued.

1. Poor business analysis approaches often lead to project failures

Too many businesses believe that they know their apps best and are therefore in the best position to design and implement an effective monitoring solution. Unfortunately, they are frequently wrong, and this approach usually delivers less-than-optimal results.

Deliverables end up based on preconceived requirements and preconceived ideas of feasibility. This is understandable, as users are usually in the “weeds” of their world, technologically speaking. Perhaps it is also why only 29% of IT project implementations are considered successful. Business analysis skills are required to identify needs, root causes and subsequent requirements.

Choosing the wrong personnel for this work may result in poor choices being made. The people involved are often uncertain what they want to see in terms of results, which can lead developers to deliver (on time or late) products that don’t actually solve the customer’s problem. That uncertainty can also make it difficult to step back and challenge the process, which is why a subject matter expert (SME) is vital to delivering successful monitoring.

That said, business analysis is as much a skill crafted over time as it is a science. An SME with the right focus and skill set makes it more likely that the requirements are sound and that the project implementation delivers value.

2. Monitoring domain knowledge

We often see poor monitoring practice within a monitoring solution. Even when the stated customer requirements are met, there are often better approaches that deliver better outcomes and reduce the time it takes to identify an issue, the time it takes to communicate it, and cause-and-effect gaps.

Simple practices can really help reduce these negative effects. Two easy ones right off the bat are:

  • Always monitor as close to the primary source as possible — often not log files.
  • Monitor both cause and effect; each can happen independently, even if it shouldn’t.
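As a rough illustration of the second practice, the sketch below pairs one cause check with one effect check and classifies the four possible combinations. The function names (`classify`, `run_pair`) and the example checks (disk space as the cause, failed writes as the effect) are hypothetical and only for illustration; they are not taken from any particular monitoring tool.

```python
from typing import Callable


def classify(cause_ok: bool, effect_ok: bool) -> str:
    """Classify the combined state of a cause check and an effect check."""
    if cause_ok and effect_ok:
        return "ok"
    if not cause_ok and not effect_ok:
        return "cause and effect failing"
    if not cause_ok:
        # The cause has appeared but users are not affected yet.
        return "cause failing, effect healthy: early warning"
    # Users are affected, but the cause we monitor looks fine,
    # so something we are not monitoring is responsible.
    return "effect failing, cause healthy: unmonitored cause"


def run_pair(cause_check: Callable[[], bool],
             effect_check: Callable[[], bool]) -> str:
    """Run both checks every time, independently of each other."""
    return classify(cause_check(), effect_check())


if __name__ == "__main__":
    # Hypothetical checks: disk space (cause) and failed writes (effect).
    disk_has_space = lambda: False
    writes_succeed = lambda: True
    print(run_pair(disk_has_space, writes_succeed))
```

Monitoring only the effect would miss the early warning case, and monitoring only the cause would miss the unmonitored cause case, which is why both checks run on every pass.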

3. Tempus Fugit

It takes time to deliver monitoring and keep it up to date. AI helps, DevOps helps, but it still takes time. We often hear, “we’re working on that.” Fast-forward 6 months to when the problem is revisited, and it’s still not addressed or not fully taken care of. The reason is usually the same: other priorities got in the way!

Improving your site’s uptime takes planning. For example, use the best hosting service you can afford, employ CDNs to host your static content, have a plan in place to address outages, and pay attention.

No matter how great a plan you have, if it takes you an hour to notice you have an outage, you’ve already dropped below 99.99% uptime for the year, since one hour is roughly 0.011% of the 8,760 hours in a year. If it takes you a further three hours to fix the problem, you’re down close to 99.95% uptime, and if these outages happen frequently enough, you may be looking at 90% or worse.
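The arithmetic behind these figures is easy to check. The sketch below is a minimal calculation that assumes a 365-day year (8,760 hours) and treats the one hour to notice plus three hours to fix as a single four-hour outage; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def uptime_percent(downtime_hours: float,
                   period_hours: float = HOURS_PER_YEAR) -> float:
    """Return availability over the period as a percentage."""
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_hours / period_hours)


def downtime_budget_minutes(target_percent: float,
                            period_hours: float = HOURS_PER_YEAR) -> float:
    """Return how many minutes of downtime a target uptime allows."""
    return (1 - target_percent / 100.0) * period_hours * 60


if __name__ == "__main__":
    print(f"1 hour down:  {uptime_percent(1):.3f}%")
    print(f"4 hours down: {uptime_percent(4):.3f}%")
    print(f"99.99% budget: {downtime_budget_minutes(99.99):.1f} minutes/year")
```

Under these assumptions, one hour of downtime gives about 99.989% uptime and four hours gives about 99.954%. A 99.99% target leaves a budget of roughly 52.6 minutes per year, so a single outage that goes unnoticed for an hour has already used it up.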

4. Not knowing differences in monitoring tools

If what you’re monitoring is complicated, then your monitoring will be complicated. It’s just that simple! There are often multiple ways a monitoring tool can meet a requirement: Synthetic Monitoring, Real User Monitoring (RUM), and Infrastructure Monitoring all play different roles in discovering and mitigating different underlying issues before your customers do.

We often see the wrong choice implemented because somewhere along the vendor-customer relationship chain, someone “didn’t know that made a difference.” Problems usually get discovered only after there’s an issue that can disrupt service or revenue flows.

Our research shows poor monitoring solutions and operations increase yearly monitoring costs significantly, and that does not include the cost of business outages or reputational damage. Surely that’s a business case for addressing the pitfalls. You can avoid these pitfalls (and their negative consequences) by scheduling a comprehensive assessment of your unique monitoring needs.