Is cloud resilience good enough for you?
Does a cloud outage impact your business continuity plans?
Normally, articles like this are written to keep you, the reader, engaged right to the end. As in a good whodunnit, the killer is only revealed on the last page.
Here we are going to break with tradition and do the big reveal now.
That’s right: the cloud is the killer. (Play “dun-dun-dun” music.)
Seriously though, over-dramatisation aside, the cloud providers might not be doing enough to help your business thrive. The cloud is truly wonderful and is one of the greatest technological developments of the 21st century, but it is most likely harming your business.
If you are still reading, then let’s look at how the cloud became the evil productivity killer, and whether there is a hero (a solution) that can save the day.
Thankfully, detective novel analogies will stop now.
As said before, the cloud and the services it provides are truly mind-blowing. Small, medium and large companies now have instant access to the functionality they want. Because a cloud service is hosted outside the corporate data centre, any user or customer with internet connectivity can get access.
Salesforce, arguably one of the first and now one of the leading cloud SaaS (Software as a Service) vendors, lists 12 benefits of cloud computing, including:
- Cost Savings
- Increased Collaboration
- Quality Control
- Disaster Recovery
- Loss Prevention
- Automatic Software Updates
- Competitive Edge
This article is not going to question the validity of the above. However, it does draw attention to one very interesting point:
“Disaster Recovery: One of the factors that contributes to the success of a business is control. Unfortunately, no matter how in control your organization may be when it comes to its own processes, there will always be things that are completely out of your control, and in today’s market, even a small amount of unproductive downtime can have a resoundingly negative effect. Downtime in your services leads to lost productivity, revenue, and brand reputation.”
In fact, the last sentence, “Downtime in your services leads to lost productivity, revenue, and brand reputation”, is exactly the reason why the cloud may not be the answer for you.
What is the argument against the cloud? Most cloud vendors actually give you the argument themselves, and they put it in plain sight in their contracts: their SLAs (Service Level Agreements).
Take Microsoft which, by number of regions and availability zones, is the largest cloud provider in the world. Yet its published SLA is 99.9%.
It is that 0.1% that is the killer. That tiny percentage could cost your organisation, to quote Salesforce, “lost productivity, revenue, and brand reputation”.
Why is the 0.1% so important? Take a close look at this table.
99.9% means that the service could be down for 8.76 hours per year. That doesn’t sound like a lot, but what happens if it all occurs in a single cloud outage? What damage could a whole business day without that service do?
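The arithmetic behind those figures is easy to check for yourself. The snippet below is a minimal sketch, assuming a 365-day year and an SLA measured over the whole year (real contracts often measure per month and exclude planned maintenance). The function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(sla_percent: float) -> float:
    """Hours per year a service can be down while still meeting its SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)


def format_downtime(hours: float) -> str:
    """Show long outages in hours and short ones in minutes."""
    if hours >= 1:
        return f"{hours:.2f} hours"
    return f"{hours * 60:.2f} minutes"


# Print the permitted yearly downtime for some commonly quoted SLA levels.
for sla in (99.0, 99.9, 99.99, 99.999):
    print(f"{sla}% -> {format_downtime(allowed_downtime_hours(sla))} per year")
```

Each extra “nine” divides the permitted downtime by ten, which is why the gap between 99.9% and 99.99% matters far more than it looks on paper.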
There are some arguments against this being a problem.
Firstly, all cloud vendors will point to their historical uptimes, citing a record of being available for more than the SLA requires. This is fine but, to quote financial advisors of old, “Past performance is not indicative of future results.” This has certainly been true of the stock markets over the last 100 years, and perhaps it’s too soon to tell whether cloud services are any different.
In the first half of 2022 there were a number of high-profile outages from major cloud providers, each lasting a number of hours.
Perhaps the cloud is in a bear market at the moment as far as outages go.
The second argument requires more thought: how important is the service that you might lose? Not all services are equal, and the cost to your organisation varies.
The most common use of the Microsoft cloud is Office 365. If O365 is down for a day, it will cause some loss of productivity, but it probably will not cause a loss of revenue or brand reputation.
The question needs to be “How much downtime can I afford for this service?”
That question seems impossible to answer. It isn’t really; the answer is “It depends”. It depends on the service, on the importance of that service to your business and on how many people use it.
There is at least one service that is essential to your business, one for which you cannot afford more than a few minutes, or possibly seconds, of downtime, and certainly not eight hours. That service is authentication, MFA or access management.
These are fundamental to your users, and maybe your customers, accessing your systems. It is likely that you have this plugged in to nearly all of your applications and users. When an authentication service fails, there are two options.
One is that users cannot access anything, which certainly causes a loss of productivity. If this happens to a consumer portal, it will certainly cause loss of revenue and brand reputation.
The second option is to fail open: simply allow people in with a good old-fashioned username and password. Oh boy: eight hours of poor security, exposing yourself to the wilds of the internet and its less-than-trustworthy users or, just as bad, eight hours of staff bombarding your help desk as they try to remember their passwords for all those applications. That’s assuming your password reset isn’t also provided by the authentication service.
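The two options can be sketched in a few lines of code. This is a hypothetical illustration, not any vendor’s API: `FailureMode`, `AuthServiceUnavailable`, `decide_access` and the `check_mfa` callback are all names invented here to stand in for whatever your authentication product provides.

```python
from enum import Enum
from typing import Callable


class FailureMode(Enum):
    FAIL_CLOSED = "fail_closed"  # deny everyone while the service is down
    FAIL_OPEN = "fail_open"      # fall back to username and password only


class AuthServiceUnavailable(Exception):
    """Raised by the (hypothetical) MFA client when the service is unreachable."""


def decide_access(password_ok: bool,
                  check_mfa: Callable[[], bool],
                  mode: FailureMode) -> bool:
    """Return True if the user should be let in."""
    if not password_ok:
        return False  # a wrong password is rejected in every mode
    try:
        return check_mfa()  # normal path: the second factor decides
    except AuthServiceUnavailable:
        # Outage path: fail open trades security for access,
        # fail closed trades access for security.
        return mode is FailureMode.FAIL_OPEN
```

Neither branch of the `except` clause is a good outcome, which is the point: the only real fix is for the outage path to be taken as rarely as possible.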
This is why the cloud is the killer. If that lovely, sweet authentication service goes down, your systems die, and potentially your revenue and brand with them.
So what’s the answer? How do you avoid this situation with your authentication service?
There are several options. The first is to make SLAs a deciding factor when choosing your cloud vendor. Some will, reluctantly, agree to 99.99% (about 53 minutes per year). Can you afford almost an hour of downtime? That’s one for you to answer, but the answer is probably no.
Are cloud vendors likely to offer a more resilient solution in the future? Possibly, but given the size and scale of their operations it’s a massively difficult thing to achieve for all their customers.
The second option is one that is often overlooked in the bright lights of the cloud: keep it on premise. IT departments have been implementing resilient architectures for decades, with a few routers or load balancers. It’s a skill that has not been lost.
Cost has always been a concern with keeping things on premise, and cost savings are the number one reason for cloud migration. Most on-premise authentication products require a few servers, ideally distributed across a couple of different data centres. The cost of the additional infrastructure is in the thousands. This is orders of magnitude less than the millions or billions that can be lost in revenue or brand reputation when users cannot access your systems for a day.
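To see why a few redundant servers make such a difference, here is a rough sketch of the availability arithmetic. It rests on two loudly stated assumptions: each server is available 99.9% of the time (an illustrative figure, not a measurement of any product), and servers fail independently of each other. Independence is optimistic in practice, since shared power, networks or software bugs can take several servers down at once, which is one reason to spread them across data centres.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignoring leap years (assumption)


def combined_availability(server_availability: float, servers: int) -> float:
    """Availability of a group that is up while at least one server is up.

    ASSUMPTION: servers fail independently of each other.
    """
    return 1 - (1 - server_availability) ** servers


def downtime_seconds_per_year(availability: float) -> float:
    """Expected seconds of downtime per year at the given availability."""
    return SECONDS_PER_YEAR * (1 - availability)


# Compare one, two and three servers at an assumed 99.9% each.
for servers in (1, 2, 3):
    availability = combined_availability(0.999, servers)
    seconds = downtime_seconds_per_year(availability)
    print(f"{servers} server(s): about {seconds:,.1f} seconds of downtime per year")
```

Under these assumptions, a second server takes the expected downtime from nearly nine hours a year to roughly half a minute. Real-world results will be less flattering, but the direction of travel is the same.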
The moral of this story: if a business-resilient authentication solution is essential for your organisation, you need to decide whether the SLAs offered by cloud vendors are enough for you. If a cloud outage is going to cause loss of reputation and revenue, then you may need to consider taking control and bringing the service on-premise.