Key considerations when choosing an IaaS provider
Many IaaS providers are available, each offering a wide range of options and configurations. To make the best decision for your organization, consider IaaS best practices and the requirements specific to your industry. Budget and ROI will also be key considerations. Here are some additional elements to weigh.
Security and reliability
When selecting an IaaS provider, organizations can begin narrowing down vendor candidates by applying their own compliance and governance requirements. For example, healthcare organizations make a point of selecting HIPAA-compliant IaaS providers to host their infrastructure. Retail banking organizations often rank data encryption mechanisms, identity and access management (IAM), and multi-factor authentication highly, in addition to IaaS best practices relating to security and reliability. Less regulated organizations, by contrast, have more leeway when weighing a cloud service provider's methods and certifications for securing the underlying infrastructure.
Disaster recovery
Quick backup and recovery services with strong data retention capabilities are essential in highly regulated and audited industries, and they matter to any organization that needs to stay operational. Choosing the minimum level of disaster recovery your organization requires means balancing compliance, risk mitigation, and budget.
Some common methods include:
- Backup and restore: Data and applications are regularly backed up to a remote or cloud-based location, and, in case of an adverse event, the stored backups can be deployed to continue operations. However, this solution can take hours or days and may not be acceptable for all organizations.
- Pilot light disaster recovery: In this method, a minimal version of an organization's operationally necessary environment (the "pilot light") is maintained in the cloud, where the data is live but the system sits idle until it is needed. Critical systems can usually be back online in less than an hour, while non-critical systems are brought back via backup and restore. This approach suits organizations with budget constraints, or with lines of business of varying criticality, that need to match a disaster recovery plan to each need.
- Warm stand-by: This method maintains a scaled-down version of an organization's full production environment, which runs constantly in the cloud. In case of disaster, the stand-by system can be deployed immediately while additional resources are scaled up to fully operational levels. Warm stand-by recovery can usually be completed in roughly 10 to 20 minutes but is more costly than the backup and restore or pilot light methods.
- Hot stand-by: Sometimes referred to as "real-time backup," the hot stand-by method involves a duplicate, fully operational production environment running in parallel in the cloud. Since the hot stand-by is an up-to-date, exact copy of the original live environment, the switchover can occur within seconds, so that users would not even notice an outage. While this is an effective option, it can be prohibitively costly for many organizations.
- Multi-cloud disaster recovery: For highly regulated, global organizations that depend on real-time data, the multi-cloud method may make sense. This solution requires a duplicate, fully operational production environment to run continually on a competing cloud provider's infrastructure so that if, for example, Microsoft had a serious cloud service event, the organization could switch to its AWS or Google Cloud environment immediately. While very effective, this method can be extremely complex and costly to implement.
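
The trade-off between recovery speed and cost described in the list above can be sketched as a small lookup. This is a minimal illustration, not a sizing tool: the recovery times are rough upper-end figures taken loosely from the descriptions above (48 hours for backup and restore and 1 minute for the hot stand-by and multi-cloud options are assumptions), the cost ranks are ordinal placeholders, and the names `DrStrategy`, `STRATEGIES`, and `cheapest_strategy` are hypothetical. The "maximum tolerable recovery time" is what practitioners usually call a recovery time objective (RTO).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrStrategy:
    name: str
    typical_recovery_minutes: float  # assumed upper-end recovery time
    relative_cost: int               # assumed ordinal rank, 1 = cheapest


# Illustrative figures only; real values depend on the provider and workload.
STRATEGIES = [
    DrStrategy("backup and restore", 48 * 60, 1),
    DrStrategy("pilot light", 60, 2),
    DrStrategy("warm stand-by", 20, 3),
    DrStrategy("hot stand-by", 1, 4),
    DrStrategy("multi-cloud", 1, 5),
]


def cheapest_strategy(max_recovery_minutes: float) -> DrStrategy:
    """Return the lowest-cost strategy whose assumed recovery time fits the limit."""
    candidates = [
        s for s in STRATEGIES
        if s.typical_recovery_minutes <= max_recovery_minutes
    ]
    if not candidates:
        raise ValueError("no listed strategy meets this recovery time limit")
    return min(candidates, key=lambda s: s.relative_cost)


if __name__ == "__main__":
    for limit in (24 * 60 * 3, 60, 30, 5):
        print(limit, "->", cheapest_strategy(limit).name)
```

In practice the choice also depends on compliance obligations and on how much data loss is tolerable, which this sketch does not model.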
The noisy neighbor problem
In a multi-tenant IaaS environment, other tenants sharing the same underlying hardware may suddenly increase their resource usage (e.g., bandwidth) and degrade performance for everyone else. Be sure to speak with prospective IaaS providers in detail about how quickly they detect these spikes, how they resolve them, and what service levels or remediation guarantees are in place. By choosing a provider that not only has high reliability ratings but also allocates infrastructure fairly among tenants, organizations can feel more secure in their IaaS system availability.
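A tenant can also watch for the symptoms on its own side. The sketch below is one simple, assumed approach: it flags any latency sample that exceeds a multiple of the median of the samples just before it. The function name `noisy_intervals`, the window size, and the 2x threshold are all hypothetical choices for illustration, and a flagged spike is only a hint, since it may equally come from the organization's own workload.

```python
from statistics import median


def noisy_intervals(latencies_ms, baseline_window=5, factor=2.0):
    """Return indices of samples above `factor` times the median of the
    preceding `baseline_window` samples."""
    flagged = []
    for i in range(baseline_window, len(latencies_ms)):
        baseline = median(latencies_ms[i - baseline_window:i])
        if latencies_ms[i] > factor * baseline:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds, one sample per interval.
    samples = [10, 11, 10, 12, 11, 10, 45, 11, 10]
    print(noisy_intervals(samples))
```

Using a median rather than a mean keeps a single spike from inflating the baseline used for the samples that follow it.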
Flexibility and options in providing virtual resources
Consider your network, storage, and compute requirements and the initial options a potential cloud services provider offers for server size, virtual machine size, number of processing units, memory units, storage units, etc. Can the provider address organizational requirements on Day One, or will doing so require modifications that may affect the provider's service level or maintenance schedule?
Scalability
For organizations anticipating rapid growth, or those whose system usage can vary greatly in the general course of business, it is key to discuss with potential cloud providers not only their scaling services (e.g., notice, timing, available resources) but also the flexibility of their service contracts. Look for a provider with flexible plans where fees scale closely with actual infrastructure usage. For organizations anticipating a steadier state of infrastructure usage, a consistent subscription or license fee may be acceptable.
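The difference between usage-based and flat pricing can be made concrete with a little arithmetic. The sketch below compares the two for a bursty day and a steady day. All figures are invented for illustration: the rates (10 cents per instance-hour on demand, 6 cents under a flat commitment), the workload shapes, and the function names are assumptions, and the flat plan is assumed to be sized for the peak hour.

```python
def usage_based_cost_cents(hourly_instances, rate_cents):
    """Pay only for the instance-hours actually consumed."""
    return sum(hourly_instances) * rate_cents


def flat_cost_cents(hourly_instances, discounted_rate_cents):
    """Pay for peak capacity in every hour, at an assumed discounted rate."""
    return max(hourly_instances) * len(hourly_instances) * discounted_rate_cents


if __name__ == "__main__":
    bursty = [2] * 20 + [10] * 4   # mostly idle, with a 4-hour peak
    steady = [10] * 24             # constant load all day

    for label, workload in (("bursty", bursty), ("steady", steady)):
        on_demand = usage_based_cost_cents(workload, rate_cents=10)
        flat = flat_cost_cents(workload, discounted_rate_cents=6)
        print(f"{label}: usage-based {on_demand} cents, flat {flat} cents")
```

Under these assumed numbers the bursty workload is cheaper on usage-based pricing, while the steady workload is cheaper on the flat plan, which mirrors the guidance above.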
User level controls
Organizations with highly specialized needs or unique industry requirements must carefully consider the level of control they require over the IaaS features selected with a potential provider. Three questions are worth asking. First, are the selected features consistently within client control? Second, are they easily accessible to client personnel? Third, what impact will this level of control have on the cloud provider's overall offering, services, and cost structure?
Network speed and availability
Bandwidth, latency, and network throughput are key points to evaluate with potential IaaS providers, as are network availability and resiliency. In addition, provider locations may be a consideration for global organizations, since geographic distance and regional infrastructure (e.g., the quality of the fiber network in the region) can affect latency.
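To see why provider location matters, it helps to estimate the best-case round-trip time that distance alone imposes. The sketch below computes the great-circle distance between two points with the haversine formula and divides by the approximate speed of light in optical fiber (about 200 km per millisecond, roughly two-thirds of its speed in a vacuum). The function names and example coordinates are illustrative assumptions, and the result is only a lower bound: real routes are longer than the great circle and add switching and queuing delay.

```python
import math

FIBER_KM_PER_MS = 200.0   # approximate speed of light in optical fiber
EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(lat1, lon1, lat2, lon2):
    """Best-case round-trip time over fiber, ignoring routing and queuing."""
    return 2 * great_circle_km(lat1, lon1, lat2, lon2) / FIBER_KM_PER_MS


if __name__ == "__main__":
    # Example: users near London reaching a hypothetical region near New York.
    london = (51.5074, -0.1278)
    new_york = (40.7128, -74.0060)
    print(f"{min_round_trip_ms(*london, *new_york):.1f} ms minimum round trip")
```

A measured latency far above this floor suggests indirect routing or congestion, which is worth raising with the provider.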