Sprint Performance

Can your IT staff see the forest for the trees?

Observability. It is not something that most IT departments have a line item for in their annual budget. They’ll talk about things like “monitoring” or “event management”, but these are often point solutions: ways of viewing a slice or segment of the estate that rarely provide the full picture.

One of our favourite philosophers put it quite succinctly:

This situation is reflected every day in IT departments around the world. Ask your people how systems are performing and you get opinions; look at the systems in isolation and you get perspectives. Neither gives you the truth.

So what? Why is observability so important?

Well, observability matters because uptime matters. Uptime Institute’s 2022 Outage Analysis points out three very painful facts that relate to observability:

  1. High outage rates haven’t changed significantly.
  2. The proportion of outages costing over $100,000 has soared in recent years.
  3. The overwhelming majority of human error-related outages involve ignored or inadequate procedures.

Wait, so you’re going to tell me that uptime isn’t the result of good design and high-availability?

No, uptime is about good design (we are an architect group, of course we’d say that!). However, it’s about more than good design.

That’s where observability comes in.

You see, the challenge is summed up nicely by Uptime’s executive director, Andy Lawrence:

“The lack of improvement in overall outage rates is partly the result of the immensity of recent investment in digital infrastructure, and all the associated complexity that operators face as they transition to hybrid, distributed architectures,”

Funny, I recall a recent blog post by this crazy group of architects that talked about the issues around complexity…

What this is telling you is simple: as your IT systems become more complex, more interdependent, more hybrid and more distributed, legacy application monitoring and event management will not tell you what’s really happening.

Franklin Okeke couldn’t have said it any better in his recent article:

“With more blind spots come high chances of operational inefficiencies like system downtime, poor user experience, lack of predictive insights, poor resource utilisation, etc.”

Ah, so it becomes clear. Poor observability increases both the likelihood AND the duration (aka cost) of outages because of the blind spots that it leaves in your systems.
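To put the likelihood-and-duration point in back-of-the-envelope terms, here is a rough sketch in Python. The function name, the incident counts, the durations and the hourly cost are all hypothetical placeholders chosen for illustration, not figures from Uptime’s report, and the simple multiplication is an assumed model rather than anyone’s official method.

```python
# Rough sketch: expected yearly outage cost as likelihood x duration x cost rate.
# All names and numbers are hypothetical, for illustration only.

def expected_outage_cost(incidents_per_year, mean_hours_per_incident, cost_per_hour):
    """Return an estimated yearly outage cost under a simple multiplicative model."""
    return incidents_per_year * mean_hours_per_incident * cost_per_hour


# Hypothetical "before": more blind spots, so more incidents that take longer to resolve.
baseline = expected_outage_cost(incidents_per_year=4,
                                mean_hours_per_incident=6,
                                cost_per_hour=10_000)

# Hypothetical "after": fewer incidents, found and fixed faster.
improved = expected_outage_cost(incidents_per_year=3,
                                mean_hours_per_incident=2,
                                cost_per_hour=10_000)

print(f"Baseline: {baseline}, improved: {improved}, difference: {baseline - improved}")
```

Because the two factors multiply, trimming both the number of outages and the time to resolve them moves the total far more than trimming either one alone. Swap in your own estimates to see what the difference might look like for your estate.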

Increased observability is critical to increasing uptime.

What is most interesting about the emerging capabilities and focus around observability, when compared to traditional monitoring, is that observability is not technology-focused or siloed; rather, it looks at the IT estate and the business as a series of joined-up capabilities.

Our 5Cubed model was created for exactly the same reason. You cannot change the network without impacting the compute. You cannot automate your control plane without having the proper tech stack below it in networks (Digital Transport) and compute (Digital Machinery). You cannot secure any of it without joining everything together with some form of Digital Intelligence.

There are definite barriers. Observability is a fairly new concept and is constantly evolving. This means it can be difficult to price, and equally difficult to forecast the real costs. Like many things in life, there’s an element of “you get what you pay for”: every step up in observability will see a corresponding, and likely non-linear, increase in costs.

This is why we focus as much on the reduction of downtime as we do on observability’s ability to break down the silos within an IT organisation and change the conversation, making the entire IT team function as one, focused on business outcomes rather than the technology. Observability also changes an organisation’s relationship with its service providers. Rather than a provider simply being within or outside its SLA parameters, its actual performance towards the customer’s estate and the end users can be monitored in real time. This means that measuring and reporting on the value of both internal and external resources becomes much easier. This added insight into the IT organisation goes a long way to establishing the value of observability to the business. Yes, observability may have a high cost. However, if the combined savings it generates in reduced downtime, improved efficiency and improved supplier management are all totted up, observability is often actually less costly than the legacy systems it replaces.

It’s as simple as this:

Observability allows IT organisations to take an agile approach to every investment of time and effort across the IT estate. It generates the data to show how each change impacts (or fails to impact) the experience of users and the cost, performance and availability of the IT services being delivered to the business.

Now let’s briefly hammer home the benefits of observability with one final argument.

Security

Have you ever seen a prison without watch towers? Why do you think that is? Why does the military spend so much of its budget on ISR (Intelligence, Surveillance & Reconnaissance)? Do you or your business have CCTV?

That’s right. Observability is critical to security, especially as organisations adopt DevOps, CI/CD and other forms of accelerated release cycle management. Security cannot be an afterthought. Observability needs to be built into DevOps (see this brilliant article from InfoWorld). Not only will developers be able to spot and resolve issues faster, but breaches will be detected far sooner and understood far better with full-stack observability. This goes beyond our expertise (in Infratech) into a whole area of Data Observability, which is also well worth investing in for organisations at that level of maturity.

For now (yup, sales pitch incoming!), consider that your teams likely cannot see the forest for the trees. Their inward, operational focus means that they’re ill-positioned to understand where observability can add value, what value it will add, and how best to realise that value. That outside-looking-in perspective is why you hire an architect!