The plain truth: there will be no cloud computing industry initiative in which competitors agree to 'blind pool' and backstop each other's failures and outages. There may eventually be great open standards for moving VMs and apps off of your cloud, but no set of commercial continuity services will ever hearken back to the days of centralized SNA shops with real plans, proven market capitalization, and legitimate durability. Rather, I would expect the next few years to look like this:
- When a really big utility cloud (AWS / Google) goes down, you will get an apology and a paltry credit. The only consolation will be that it won't happen often, and you will not be able to exceed their uptime by running your own servers. The house usually wins.
- When a moderately well-financed cloud provider starts to fumble, there will be ample warning, because folks will be watching and pinging. Have a plan, or get continuity coverage, either now or once it becomes available specifically for cloud users. Don't say I didn't warn you.
- When a small, under-financed but buzzed-up PAAS, SAAS, Cloud, whatever... fails overnight, taking your operations with it, comfort yourself by thinking how much you saved while it was working those 5 months in 2009. Really, now, consider having at least a local or S3 proxy duplicate your data. Get insurance. Think before you trust business operations to a startup.
- If I were Andy Rooney from 60 Minutes: "Have you ever noticed that all of these hosting providers have a page on how great their hosting facilities are? Even the cut-rate ones say, 'We are a level 1000 bunker with a year of diesel backup power, armed guards, and multiple super network links.' I mean, did they all copy the same page to make us feel safe?"
There is no perfect state of reliability; even in the days of highly centralized data shops, continuity had to be planned. We are transitioning from this mercantile, web-hosting mentality to one of running business-essential applications remotely. These services are splitting into two ownership categories: incumbent giants, and startups that have a semi-permanent '?' on their forehead until they achieve operational liquidity. The former apologizes and credits your account; the latter disappears into the night.
The vast majority of SAAS productivity app providers use existing utility compute services, in whole or in part. It's a cost thing: perfect continuity in the cloud, on a per-client, per-use basis, would be infinitely costly. Nor will there be a comprehensive industry clearing house offsetting failures, at least not in the sense of a services-for-profit model.
There are some shared risk examples where pooled trusts for infrastructure failures exist (to the best of my knowledge, these were at least proposed in underwriting requirements):
1) Telecom, national data haulers, carriers' carriers, and submarine cable system operators sometimes negotiate emergency settlement and peering agreements as a prerequisite to satisfying underwriting requirements. Sometimes these agreements predate the insurer's audit, and are just good business. Don't confuse these contingency plans with standard settlements: they are negotiated for extraordinary outages and lock in fees and technical requirements. Only the very large carriers can enter into these agreements with true peers.
2) Municipal and State Gov. Emergency Radio communications networks, SMR, and certain common carriers (terrestrial radio specialty comms) sometimes have emergency coverage agreements that are mandated by statute.
3) Interstate natural gas and petroleum pipelines, etc.
The real message here is that underneath a pool of policies is a risk pricing model that is often further underwritten by a reinsurer; risk pools come together faster if there are means to offset the preponderance of risk. Flood zones are hard to mitigate, and pools are still formed, sometimes under the stentorian bark of a state regulator. But in the case of our beloved IT clouds, we have yet to get to a place where the risk to an individual business that depends on a cloud service can be priced and mitigated, and the worst potential technical failures limited.
You may now go read up on all the happy hoo-ha about the open cloud manifesto, cloud interoperability, etc. Good luck with all that. I'm an optimist too.
We are talking here about commercially brokered services that are paid for by a pool of insurance companies, and that are funded by premiums. We don't get there until the primary service providers are certified, rated, and as operationally good as they can be. At that crucial juncture, where a critical mass of SAAS and Cloud hosts agree to these ratings and certs, we can price the baseline risk of outages via standard actuarial methods. Subsequently, risk offsets that are purely technical in nature can be tested and put into production. Finally, when technical services are proven to be feasible, we can look to the reinsurance market, and voilà, we have a business.
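To make that sequence a little more concrete, here is a minimal frequency-times-severity sketch of how a baseline outage premium might be priced, and how a technical offset that caps the worst loss would change it. Everything in it is an assumption for illustration: the function names `pure_premium` and `gross_premium`, the outage probabilities, the loss amounts, the cap, and the 30% loading are all invented, and a real actuarial model would be far richer.

```python
# Illustrative only: every probability, loss, cap, and loading is an assumption.

def pure_premium(scenarios, loss_cap=None):
    """Expected annual loss: sum of probability * loss over outage scenarios.

    If loss_cap is given, each loss is limited to that amount, modeling a
    technical offset (for example, mirrored logs) that bounds the worst case.
    """
    total = 0.0
    for probability, loss in scenarios:
        if loss_cap is not None:
            loss = min(loss, loss_cap)
        total += probability * loss
    return total


def gross_premium(expected_loss, loading=0.30):
    """Add a proportional loading for expenses, profit, and reinsurance."""
    return expected_loss * (1.0 + loading)


# (annual probability, loss to the insured business) for each scenario
scenarios = [
    (0.20, 1_000),    # short outage: apology and a credit
    (0.04, 20_000),   # multi-day outage
    (0.01, 250_000),  # provider disappears overnight, data lost
]

uncapped = pure_premium(scenarios)
capped = pure_premium(scenarios, loss_cap=50_000)

print(f"expected loss, no offset:   {uncapped:,.0f}")
print(f"expected loss, with offset: {capped:,.0f}")
print(f"gross premium, with offset: {gross_premium(capped):,.0f}")
```

The point of the sketch is only the shape of the argument: the rare, catastrophic scenario dominates the expected loss, so a technical service that caps it moves the premium more than anything else.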
Question #1): How many service providers and Insurers have to get on board, at least provisionally, to make a real retail or B2B market that multi-line agents and specialty carriers can sell into?
Answer #1): My research was cut short before I got that far. I felt that my client knew the answer and was testing to see if I came up with a verifying figure. My best guess is that at least 35% of the top 1000 SAAS and Cloud vendors and at least three major underwriters would be required to make a realistic market for policies and payouts that make any sense whatsoever.
Question #2): Other than the actual insurance underwriting and policy sales, is there a real business model here in operating the technical services pool (a blind trust API broker, data mirroring, and continuity services) for the insurance industry? How big?
Answer #2): Oh yes, oh my G-d yes. I am writing this series because I got far enough in my work for the last client that I did see the foggy future in a way that mature analysts sometimes do.
How big? I believe that operating the Trusted Services Pool will be worth about 60 to 120 million annually when it hits its stride. There may be ancillary channels and opportunities along the way that could lift revenues to 250M. So, it's not going to be a Cisco or an HP, but a specialty business funded by the small insurance premiums paid by small and medium businesses that make cloud computing or SAAS a critical part of their operations. (Much of my work product was projecting these numbers.)
As a matter of fact, the industry as a whole may become hamstrung if these risk offsetting services are not brought online.
Maybe someone will read this very long and not too interesting series of articles (wouldn't you rather be reading some romance novel?), and put me back to work researching and creating the product road map, lassoing potential insurance industry partners, and making this a reality (all that work!).
The services offered to offset cloud computing risk are a modest challenge to provision, and are really just another cloud service with special sauces for monitoring, security, and trust. That's it.
You were expecting nuclear fusion? The goal of these pooled services is to cap the worst losses facing the majority of small and medium businesses that may encounter inoperative remote services, thus mitigating the top tier of policy payouts. The insurers pay for and pool these services with the premiums collected from the insured businesses.
Blind Trusted Services:
The Trusted Services Pool has to have all the attributes needed for trust to be established. Fiduciaries and controllers, technical management, and operations staff have to be checked out. The capitalization has to be audited, and its own operational contingencies have to be assured. Do you see what is happening here? The insurer's technical services pool has to be as good as or better than the hosting providers that it is backstopping.
The goal is to offset the worst risk cases for data loss and continuity losses to operations. This does not mean an uptime guarantee. A certain major percentage of the insured population's data and transactions has to be preserved for a reasonable premium. To support a menu of insurance coverage levels, the following technical services will probably have to be supported over time (I am avoiding an exhaustive technical discussion; who has time?):
- Transaction Log Mirroring and Replication
The most basic, least data-heavy service for small business is to maintain transaction logs. These logs can be shipped and kept ready for replay, to reestablish and reconstruct business transactions if a cloud provider goes down or out. Especially for POS and countertop retail businesses that are making the move from a distributed server-based system, half the battle is capturing the transactions.
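A minimal sketch of that ship-and-replay idea, with loud assumptions: the names `TransactionLog`, `ship`, and `replay` are hypothetical, the log format (JSON lines carrying a sequence number, an account, and an amount in cents) is invented for the example, and the "mirror" is just an in-memory list standing in for storage held by the trusted pool.

```python
import json


class TransactionLog:
    """Append-only log of business transactions, stored as JSON lines."""

    def __init__(self):
        self.lines = []

    def append(self, seq, account, amount_cents):
        entry = {"seq": seq, "account": account, "amount": amount_cents}
        self.lines.append(json.dumps(entry))


def ship(log, mirror, last_shipped_seq):
    """Copy entries newer than last_shipped_seq to the mirror.

    Returns the new high-water mark so the next call ships only new entries.
    """
    for line in log.lines:
        seq = json.loads(line)["seq"]
        if seq > last_shipped_seq:
            mirror.append(line)
            last_shipped_seq = seq
    return last_shipped_seq


def replay(mirror):
    """Rebuild account balances from the mirror, applying each seq once, in order."""
    balances = {}
    seen = set()
    entries = sorted((json.loads(line) for line in mirror), key=lambda e: e["seq"])
    for entry in entries:
        if entry["seq"] in seen:
            continue  # skip duplicates from a repeated shipment
        seen.add(entry["seq"])
        account = entry["account"]
        balances[account] = balances.get(account, 0) + entry["amount"]
    return balances


# Usage: two tills write transactions; the log is shipped twice, then replayed.
log = TransactionLog()
log.append(1, "till-1", 1250)
log.append(2, "till-2", 800)

mirror = []
mark = ship(log, mirror, 0)

log.append(3, "till-1", -250)  # a refund recorded after the first shipment
mark = ship(log, mirror, mark)

recovered = replay(mirror)
print(recovered)
```

Sequence numbers do the real work here: they let the shipper send only what is new and let the replay ignore anything it has already applied.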
- Data Storage Proxy
In addition to table-based transactions, businesses that store document images or objects may require a backup proxy to alternative cloud storage. No big hurdle here, other than the assurance and credibility.
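As a rough sketch of what such a proxy might do, assuming a simple dual-write design: every object is written to a backup store as well as the primary, and reads fall back to the backup when the primary is unreachable. `InMemoryStore` and `StorageProxy` are hypothetical names; a real proxy would wrap actual cloud storage clients rather than dictionaries, and the checksum step is my own addition to illustrate the assurance part.

```python
import hashlib


class InMemoryStore:
    """Stand-in for a cloud object store; a real one would wrap a vendor client."""

    def __init__(self):
        self.objects = {}
        self.available = True

    def put(self, key, data):
        if not self.available:
            raise ConnectionError("store unavailable")
        self.objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError("store unavailable")
        return self.objects[key]


class StorageProxy:
    """Writes every object to both stores; reads fall back to the backup."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.checksums = {}

    def put(self, key, data):
        self.checksums[key] = hashlib.sha256(data).hexdigest()
        # Write the backup first, so a primary failure still leaves one copy.
        self.backup.put(key, data)
        self.primary.put(key, data)

    def get(self, key):
        try:
            data = self.primary.get(key)
        except (ConnectionError, KeyError):
            data = self.backup.get(key)
        if hashlib.sha256(data).hexdigest() != self.checksums[key]:
            raise ValueError(f"checksum mismatch for {key}")
        return data


# Usage: store a document image, lose the primary, and still read it back.
primary, backup = InMemoryStore(), InMemoryStore()
proxy = StorageProxy(primary, backup)
proxy.put("invoice-001.pdf", b"scanned invoice bytes")

primary.available = False  # the cloud provider goes down or out
restored = proxy.get("invoice-001.pdf")
print(restored)
```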
- VM image ready standby
Freezing a set of elastic services and placing them on near-line standby, if and when that becomes possible (some say it is now), is a service that was discussed in my research. In meetings with several VM vendors, including some heavy hitters from IBM's superserver division, it became apparent that many instances of ready standby could be held in stasis, and re-synchronized to transaction logs in fairly short order, especially if we are catering to small and medium businesses, and not, say, Citibank.
I guess this is where the open cloud initiatives are going. It seems that many of the VM vendors are leading the way. For the purpose of the trusted pool, I felt after a period of study that this is possible and actually in practical use in limited cases.
- API call brokerage for live services uptake
There are already existing services that broker web APIs. These services provide scaling, monitoring, billing, etc. Trusted services for the insured pool would maintain a similar brokered pool of APIs that would either pass the first level of calls directly through to the provider, or would be cut in as an alternate route if a timeout exceeds a predetermined limit.
There are a few issues here that need massaging, as it is not the business of the trusted services pool to provide transaction-level assurance when your cloud or SAAS provider times out for a few minutes. Rather, a trusted API brokerage really makes the preceding items more elegant to provision. Even competitors can backstop each other's outages if the Trusted Services are blind to the parties and payments are settled by the trusted pool.
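Here is a minimal sketch of the pass-through-or-cut-in behavior described above, under stated assumptions. `ApiBroker`, `healthy_provider`, and `stalled_provider` are hypothetical names; providers are modeled as plain Python callables that raise the built-in `TimeoutError` when they exceed the limit, and the settlement ledger is an invented stand-in for however the pool would actually settle payments.

```python
class ApiBroker:
    """Passes calls to the primary provider; cuts in an alternate on timeout."""

    def __init__(self, primary, alternate, timeout_seconds=2.0):
        self.primary = primary
        self.alternate = alternate
        self.timeout_seconds = timeout_seconds
        # Kept inside the pool for settlement; never exposed to the caller.
        self.settlement_ledger = []

    def call(self, request):
        try:
            response = self.primary(request, timeout=self.timeout_seconds)
            served_by = "primary"
        except TimeoutError:
            response = self.alternate(request, timeout=self.timeout_seconds)
            served_by = "alternate"
        self.settlement_ledger.append(served_by)
        # The caller gets only the response, not the identity of the provider.
        return response


def healthy_provider(request, timeout):
    """Hypothetical provider that answers within the limit."""
    return {"status": "ok", "echo": request}


def stalled_provider(request, timeout):
    """Hypothetical provider that never answers within the limit."""
    raise TimeoutError(f"no response within {timeout}s")


# Usage: the primary stalls, so the broker routes the call to the alternate.
broker = ApiBroker(primary=stalled_provider, alternate=healthy_provider)
result = broker.call({"op": "get_balance"})
print(result)
```

Keeping the ledger on the broker side is what makes the arrangement "blind": the caller sees an answer, the pool sees who earned the fee, and neither provider needs to know whose outage it covered.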
The upsells beyond trusted services might cover all of the value-added items provided in the course of selling business continuity services, such as records management, facilities, and telecommunications. These would add revenue lines, and complement the agencies' commission incentives.
The technical services discussion could be covered in much more depth, and I may take that on after I clear my desk. However, I wanted to close this series and show that some folks, including my former insurance industry client, are seriously looking at the business of providing indemnification services and underwriting to cloud computing clients.