How Cloud Computing Changes Storage Tiering
Storage has always been an interesting topic in cloud computing. Organizations today are making big changes in how they manage and control their cloud storage environments as the amount of data they have to handle explodes. Cisco’s latest cloud index predicts that annual global data center IP traffic will reach 10.4 zettabytes by the end of 2019, up from 3.4 ZB per year in 2014, roughly tripling over five years.
New challenges in controlling data traversing the data center and the cloud have emerged. How do we handle replication? How do we ensure data integrity? How do we optimally utilize storage space within our cloud model? The challenge is translating the storage-efficiency technology that’s already been created for the data center — things like deduplication, thin provisioning, and data tiering — for the cloud.
I recently had a chat with Jeff Arkin, senior storage architect at MTM Technologies, who argued that cloud computing adds an extra tier. “Cloud introduces another storage tier which, for example, allows for moving data to an off-premise location for archival, backups, or the elimination of off-site infrastructure for disaster recovery,” he said. “This, when combined with a virtual DR data center, can create a very robust cloud-ready data tier.”
Before we get into moving and manipulating cloud-based data, it’s important to understand how data tiers work. Tiering means assigning data to different types of storage based on the factors below (a simple rule-based assignment is sketched after the list):
Protection level required – RAID 5 v. RAID 0 v. mirrored (sync or async)
Performance required – Low-latency application requirements
Cost of storing data – SSD v. SAS v. SATA
Frequency of access – Less frequently accessed data stored on cheaper near-line storage, such as SATA
Security – Requirements for encryption of data at rest or compliance issues with multi-tenancy or public clouds
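To make this concrete, here is a minimal Python sketch of rule-based tier assignment driven by the criteria above. The workload fields, thresholds, and tier names are illustrative assumptions, not any particular vendor’s API.

```python
# A minimal sketch of rule-based tier assignment using the criteria above.
# The Workload fields, thresholds, and tier names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Workload:
    latency_sensitive: bool    # performance required
    accesses_per_day: int      # frequency of access
    requires_encryption: bool  # security / compliance
    replication: str           # protection level, e.g. "raid5" or "mirror-sync"

def assign_tier(w: Workload) -> str:
    """Map a workload onto a storage tier (SSD, SAS, SATA, or cloud archive)."""
    if w.requires_encryption:
        # Compliance-bound data stays on an encrypted on-premise tier.
        return "tier1-ssd-encrypted" if w.latency_sensitive else "tier2-sas-encrypted"
    if w.latency_sensitive and w.accesses_per_day > 1000:
        return "tier1-ssd"            # low-latency, frequently accessed (hot) data
    if w.accesses_per_day > 10:
        return "tier2-sas"            # warm data
    if w.accesses_per_day > 0:
        return "tier3-sata-nearline"  # rarely accessed data on cheaper disks
    return "cloud-archive"            # stale data moves to the off-premise cloud tier

print(assign_tier(Workload(True, 5000, False, "mirror-sync")))  # -> tier1-ssd
print(assign_tier(Workload(False, 0, False, "raid5")))          # -> cloud-archive
```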
This methodology scales across your on-premise data centers and your entire cloud ecosystem. When creating storage and data tiers, it’s critical to understand your workloads. Are you working with high-end applications? Are you controlling distributed data points? Maybe you have compliance-bound information that requires special levels of security. All of these are considerations when assigning data to a specific tier.
Data can be moved between tiers in a few different ways (the post-process approach is sketched after the list):
Post process analysis – Running scheduled analytics to determine historical hot v. cold data
Real-time analysis – Moving hot data (blocks) into SSD or other cache in real time, based on current activity
Manual placement – Positioning of data and information based on location, user access, latency, and other variables
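As a concrete illustration of the post-process approach, the sketch below could run from a scheduler, classify files as cold when they have not been read for 30 days, and demote them to a near-line tier. The mount paths and the 30-day threshold are assumptions for illustration.

```python
# A sketch of post-process tiering: a scheduled job that demotes files not read
# in 30 days from the hot tier to a cheaper near-line tier. Paths and the
# threshold are illustrative assumptions.
import os
import shutil
import time

HOT_TIER = "/mnt/tier1_ssd"
COLD_TIER = "/mnt/tier3_nearline"
COLD_AFTER_SECONDS = 30 * 24 * 3600  # untouched for 30 days => cold

def demote_cold_files(hot_root: str = HOT_TIER, cold_root: str = COLD_TIER) -> None:
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(hot_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # atime records the last read; an old atime marks the file as cold.
            if now - os.stat(src).st_atime > COLD_AFTER_SECONDS:
                dst = os.path.join(cold_root, os.path.relpath(src, hot_root))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)  # demote to the near-line tier

if __name__ == "__main__":
    demote_cold_files()  # run from cron or another scheduler
```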
Here’s another example Arkin gave: modern cloud and on-premise storage providers offer solutions that scan Tier 1 data for inactivity and move stale data to private or public cloud storage.
This is done automatically to ensure the best possible utilization of your entire storage ecosystem. Remember, we’re not just trying to control data in the cloud. For many organizations, storage spans on-premise and cloud resources. Intelligent data tiers and good automation practices let the right repository sit on the proper type of array with the appropriate services assigned.
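Public cloud object stores can take over this demotion entirely through lifecycle policies. As one hedged example, the sketch below uses boto3 to attach an S3 lifecycle rule that transitions aging objects to cheaper storage classes; the bucket name, prefix, and day counts are assumptions, not values from the article.

```python
# A sketch of letting the cloud provider handle tiering: an S3 lifecycle rule
# (via boto3) that moves aging objects to cheaper storage classes. The bucket
# name, prefix, and day counts are assumptions.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "demote-stale-data",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                    {"Days": 90, "StorageClass": "GLACIER"},      # archival tier
                ],
            }
        ]
    },
)
```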
This type of dynamic storage management creates efficiencies at a whole new level. Not only can we position storage where we need it; we can also deliver data to the end user faster. Remember, storage tiers can be contained within an array or span arrays and physical locations. One advantage of cloud-based storage is the ability to reduce or increase capacity (and even performance) on demand. This means data tiers can be applied to cloud bursting requirements, where storage is delivered on a pay-as-you-go basis.
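As a rough sketch of what a cloud-bursting placement decision might look like, the example below spills new writes over to an elastic, pay-as-you-go cloud tier once the on-premise pool crosses a utilization threshold. The threshold and the two backend classes are illustrative, not a specific vendor’s API.

```python
# A sketch of cloud bursting for storage: writes land on premises until the
# pool crosses a utilization threshold, then spill to an elastic cloud tier.
# The threshold and the classes are illustrative assumptions.
BURST_THRESHOLD = 0.85  # burst to cloud above 85% on-premise utilization

class OnPremPool:
    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.used_gb = 0.0

class CloudTier:
    """Elastic capacity billed per GB-month; grows on demand."""
    def __init__(self):
        self.used_gb = 0.0

def place_write(size_gb: float, on_prem: OnPremPool, cloud: CloudTier) -> str:
    projected = (on_prem.used_gb + size_gb) / on_prem.capacity_gb
    if projected <= BURST_THRESHOLD:
        on_prem.used_gb += size_gb
        return "on-prem"
    cloud.used_gb += size_gb  # pay-as-you-go capacity is simply purchased as needed
    return "cloud-burst"

pool, cloud = OnPremPool(capacity_gb=1000), CloudTier()
print(place_write(800, pool, cloud))  # -> on-prem
print(place_write(100, pool, cloud))  # -> cloud-burst (would exceed the 85% threshold)
```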
Cloud has broken traditional storage models. Elastic storage means vendors and cloud providers will have to continue to adapt to the growing needs of the market and the modern consumer.