The Storage Hypervisor Part 3 — Storage Efficiency
In the first post in this series we discussed how ONTAP, the #1 storage platform in terms of revenue, is a storage hypervisor of sorts, providing benefits that parallel those provided by virtualization. In the second post we covered WAFL and the new Flash Pool feature. In this post we will cover storage efficiency.
In the interest of time, this post gives only a brief introduction to the features that make up storage efficiency, with links to whitepapers for those who wish to dig deeper. Future posts will look at additional features that build on and extend these capabilities, and will explore their combined value.
Deduplication
There are many deduplication systems on the market, but few storage offerings provide deduplication within the primary storage tier. Often you have to purchase an additional device running a different platform to get this capability. With NetApp’s ONTAP platform, the entire FAS product line, from high-end to low-end, has had this capability for years. After the data is written to disk, a post-process scan (which can be scheduled for off-peak hours) looks for duplicate blocks at a granularity as fine as 4K and deduplicates them, reclaiming the redundant blocks as free space within the volume.
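To make the idea of block-level deduplication concrete, here is a minimal Python sketch. It is a conceptual illustration only, not how ONTAP implements deduplication: the function names, the use of SHA-256 as a block fingerprint, and the in-memory dictionaries are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K granularity, matching the block size mentioned above


def dedupe_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy of each distinct block.

    Returns (unique_blocks, block_map):
      unique_blocks maps a fingerprint to the block contents,
      block_map lists, in order, the fingerprint each logical block points to.
    """
    unique_blocks = {}
    block_map = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Hypothetical fingerprint choice; a real system may verify matches differently.
        fingerprint = hashlib.sha256(block).hexdigest()
        unique_blocks.setdefault(fingerprint, block)
        block_map.append(fingerprint)
    return unique_blocks, block_map


def space_saved(unique_blocks, block_map) -> float:
    """Fraction of logical blocks that no longer need their own physical block."""
    if not block_map:
        return 0.0
    return 1 - len(unique_blocks) / len(block_map)


def rebuild(unique_blocks, block_map) -> bytes:
    """Reassemble the original data from the deduplicated store."""
    return b"".join(unique_blocks[fp] for fp in block_map)
```

Feeding this sketch three identical 4K blocks followed by one different block leaves two physical blocks for four logical ones, and `rebuild` still returns the original bytes.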
How much space can be reclaimed? It depends on the environment and how the storage is implemented, but when best practices are followed in virtualized environments, the reduction in storage will often range between 30% and 75%. Think of all your operating system images, which share common files and therefore common blocks: in VMware environments it is common to see a 75% reduction in storage for operating system drives. A typical file share will often see a reduction of around 30% from duplicate documents, media files, etc.
Deduplication not only reduces the storage capacity consumed, it can also improve performance. Imagine the scenario of a VDI boot storm or a failed ESX host with many VMs powering back on at once. Because the common blocks are deduplicated, the I/O activity is concentrated on a smaller set of blocks, providing more opportunities for cache hits. When ONTAP deduplication is combined with either Flash Cache or Flash Pools, a significant performance improvement can be realized in these and similar scenarios.
Compression
ONTAP also provides compression, which works not on a per-file basis but against collections of adjacent blocks of up to 32K. Intelligent algorithms determine the “compressibility” of the blocks and only attempt to compress them if significant benefits can be realized. Compression can be set to inline (on write), post-process, or a combination of both. The post-process method is a bit more comprehensive and will compress blocks that the inline method may have passed over.
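The “only compress when it pays off” idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ONTAP’s algorithm: the 25% savings threshold, the use of zlib, and the function names are all hypothetical choices for the example.

```python
import os
import zlib

GROUP_SIZE = 32 * 1024  # compress in groups of up to 32K, as described above
MIN_SAVINGS = 0.25      # hypothetical threshold: keep the result only if it saves >= 25%


def compress_groups(data: bytes, group_size: int = GROUP_SIZE,
                    min_savings: float = MIN_SAVINGS):
    """Compress data group by group, storing a group uncompressed when the gain is small.

    Returns a list of (is_compressed, payload) tuples, one per group.
    """
    stored = []
    for offset in range(0, len(data), group_size):
        group = data[offset:offset + group_size]
        packed = zlib.compress(group)
        if len(packed) <= len(group) * (1 - min_savings):
            stored.append((True, packed))
        else:
            stored.append((False, group))  # not worth the CPU on read; keep as-is
    return stored


def decompress_groups(stored) -> bytes:
    """Reverse compress_groups, decompressing only the groups that were compressed."""
    return b"".join(zlib.decompress(p) if is_c else p for is_c, p in stored)


# One highly compressible group (zeros) followed by one incompressible group (random bytes).
sample = b"\x00" * GROUP_SIZE + os.urandom(GROUP_SIZE)
stored = compress_groups(sample)
```

In this sample the group of zeros is stored compressed while the random group is passed over, which is the same trade-off between space savings and CPU cost discussed in this section.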
Compression of course can save I/O operations on both reads and writes but at the expense of CPU computations. Generally speaking you will want to enable both inline and post-process compression on your archive and backup tiers, and the optimal settings for other tiers will vary based on both the application and how it is configured. The following table gives an overview of the space that can be saved using combinations of dedupe and compression on different data sets:
Percent of Storage Saved with:
|Application Type||Compression Only||Dedupe Only||Dedupe + Compression|
|File Services: Home Directories||50%||30%||65%|
|File Services: Engineering Data||55%||30%||75%|
|File Services: Geoseismic||75%||3%||75%|
|Virtual Servers & Desktops (OS Volumes)||55%||70%||70%|
|Database: Oracle ERP||65%||0%||65%|
|Email: Exchange 2010||35%||15%||40%|
FlexVol — Thin Provisioning
In the ONTAP platform all FlexVols are thin provisioned – meaning that no SAN space is physically consumed until those blocks are actually written to and utilized. This not only saves space, but can improve performance by helping to maintain a higher spindle-to-data ratio.
Thin provisioning is found on many storage platforms, but in the ONTAP platform, not only is thin provisioning the default for all FlexVols (and across all storage protocols), but you can actually both grow and shrink – yes shrink! – a FlexVol after it has been provisioned. This provides the maximum opportunity for both storage efficiency and flexibility.
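As a rough mental model of thin provisioning, consider the Python sketch below. It is a toy under stated assumptions, not FlexVol internals: the `ThinVolume` class, its method names, and the rule that a volume can only shrink down to its highest written block are all invented for illustration.

```python
BLOCK_SIZE = 4096


class ThinVolume:
    """Toy thin-provisioned volume: physical space is consumed only on first write."""

    def __init__(self, logical_blocks: int):
        self.logical_blocks = logical_blocks  # size presented to the host
        self._blocks = {}                     # block number -> data, allocated lazily

    def write(self, block_no: int, data: bytes) -> None:
        if not 0 <= block_no < self.logical_blocks:
            raise IndexError("block outside the volume's logical size")
        self._blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        if not 0 <= block_no < self.logical_blocks:
            raise IndexError("block outside the volume's logical size")
        # Never-written blocks consume no space and read back as zeros.
        return self._blocks.get(block_no, b"\x00" * BLOCK_SIZE)

    @property
    def physical_blocks(self) -> int:
        """Number of blocks actually backed by storage."""
        return len(self._blocks)

    def resize(self, new_logical_blocks: int) -> None:
        """Grow or shrink the logical size (simplified rule: no data may be cut off)."""
        if any(n >= new_logical_blocks for n in self._blocks):
            raise ValueError("written blocks lie beyond the requested size")
        self.logical_blocks = new_logical_blocks
```

A 1,000-block volume created this way consumes nothing until the first write, and its logical size can later be raised or lowered without touching the blocks already written.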
FlexClone — Efficient Snapshots
Many SANs have snapshot/clone capabilities, but often they come with severe limitations. For example, several use the “copy on write” method, which can be expensive in terms of both disk space and performance. The FlexClone feature (datasheet) within the ONTAP storage platform enables the rapid creation of clone copies of production volumes. When a FlexClone is created, only a small metadata update is made; from then on, only new or changed blocks are written to disk. No “copy-on-write” is performed, and blocks common to the parent and child are shared. This space-efficient approach minimizes overhead and enables up to 255 snapshots per volume.
“But what about my databases and VMs?” you ask. That’s an excellent question, as those snapshots won’t be very useful for QA, development or recovery if they are not application consistent. This is where NetApp SnapManager comes in, which has the ability to properly quiesce applications including Exchange, SAP, Oracle, UNIX, Windows and VMware virtual machines.
The bottom line is that FlexClones allow you to quickly and effectively take point-in-time, application-consistent snapshots of your production data, while avoiding the storage capacity and performance penalties typically associated with snapshots. This has profound benefits for QA and development (build up/tear down) as well as backup and DR, as we will get to in future posts.
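The block-sharing behavior described above can be modeled with a small Python sketch. It is a toy model and not WAFL or FlexClone code: the `BlockStore` and `Volume` classes are hypothetical, and copying a whole block map on clone is a simplification of what would be a much smaller metadata update in a real system.

```python
class BlockStore:
    """Shared pool of physical blocks; blocks are appended and never overwritten."""

    def __init__(self):
        self._blocks = []

    def put(self, data: bytes) -> int:
        self._blocks.append(data)
        return len(self._blocks) - 1  # physical address of the new block

    def get(self, addr: int) -> bytes:
        return self._blocks[addr]

    def __len__(self) -> int:
        return len(self._blocks)


class Volume:
    """A volume is just a map from logical block numbers to physical addresses."""

    def __init__(self, store: BlockStore, block_map=None):
        self.store = store
        self.block_map = dict(block_map or {})

    def write(self, block_no: int, data: bytes) -> None:
        # New data always goes to a new physical block; the old block is left
        # untouched, so nothing has to be copied aside before the write.
        self.block_map[block_no] = self.store.put(data)

    def read(self, block_no: int) -> bytes:
        return self.store.get(self.block_map[block_no])

    def clone(self) -> "Volume":
        # Only metadata is duplicated; every data block is shared with the parent.
        return Volume(self.store, self.block_map)
```

Cloning a two-block parent adds no blocks to the store, and changing one block in the clone adds exactly one, while the parent keeps reading its original data.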
SnapDrive and Space Reclaim
When Windows deletes data from the NTFS file system, it simply updates the directory table but leaves those blocks physically in use on disk. In other words, the data is still there; it’s just no longer “listed” in the directory. This creates a disparity with the VMFS and SAN levels, which are only concerned with whether a block contains data or not. SnapDrive for Windows is NTFS aware and can pass information about deleted blocks to ONTAP, allowing the space to be reclaimed.
SnapDrive for Windows has other capabilities as well, but reclaiming NTFS space can have a compounding effect, especially where FlexClones are used.
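The disparity between what the file system considers in use and what the array still holds can be pictured as a simple set difference. The Python sketch below is purely illustrative: the function name and the block numbers are made up, and it says nothing about how SnapDrive actually communicates with ONTAP.

```python
def reclaimable_blocks(array_allocated, fs_in_use):
    """Blocks the array still holds that the file system no longer references."""
    return set(array_allocated) - set(fs_in_use)


# The array has seen writes to blocks 1-6; the file system has since deleted
# the files that lived in blocks 2, 3 and 5.
array_allocated = {1, 2, 3, 4, 5, 6}
fs_in_use = {1, 4, 6}
freed = reclaimable_blocks(array_allocated, fs_in_use)
```

Without a file-system-aware agent the array has no way of knowing that blocks 2, 3 and 5 are free, which is why the hint has to come from the host side.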
RAID-DP
RAID-DP is the default RAID method used on ONTAP storage. By integrating with ONTAP’s WAFL technology (reviewed in Part 2), RAID-DP provides double-parity protection without the usual performance penalty. According to NetApp, the performance penalty of ONTAP’s RAID-DP is between 2 and 3 percent relative to RAID-4, whereas the traditional write penalty of RAID-6 is often around 30%. Additionally, RAID-DP is more space efficient than most RAID-5 implementations because it supports a larger number of spindles (up to 26 data spindles and 2 parity spindles per array).
But RAID-DP is mostly about protection, which is key when using large SATA drives that have longer rebuild times. With RAID-DP you can afford to lose 2 spindles within a RAID set while your hot spare(s) are joining the array. A double-parity scheme (such as RAID-6) would be standard in more arrays if it weren’t for the performance penalty it brings, but RAID-DP solves this problem, improving protection while maintaining performance and optimizing capacity.
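The space-efficiency point is easy to check with a little arithmetic. The Python sketch below compares the 26+2 layout mentioned above with a 7+1 RAID-5 group; the 7+1 group size is an assumption chosen for comparison, not a figure from NetApp.

```python
def usable_fraction(data_disks: int, parity_disks: int) -> float:
    """Share of the raw spindles in a RAID group that holds user data."""
    return data_disks / (data_disks + parity_disks)


raid_dp = usable_fraction(26, 2)  # largest RAID-DP layout cited above
raid_5 = usable_fraction(7, 1)    # hypothetical 7+1 RAID-5 group for comparison

print(f"RAID-DP 26+2: {raid_dp:.1%} usable")
print(f"RAID-5   7+1: {raid_5:.1%} usable")
```

Under these assumptions the wide double-parity group gives up a smaller share of its spindles to parity than the narrower single-parity group, despite carrying twice as many parity disks.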
So far we’ve covered deduplication, compression, thin provisioning (FlexVol), efficient snapshots (FlexClone), SnapDrive space reclamation, and RAID-DP. When you combine all these efficiencies, you can understand why NetApp offers a guarantee that you will use at least 50% less storage compared to other offerings. And all these ONTAP features are supported across any protocol (iSCSI, FC, FCoE and NFS) and across the entire FAS product line.
Organizations already using a different storage array can still put a NetApp V-Series in front of most storage arrays and immediately gain the benefits of the ONTAP platform. In fact, NetApp will guarantee a 35% storage reduction in this scenario, as well as guarantee that the V-Series will pay for itself within 9 months.
I’ll be discussing value in more detail in future posts, but for now consider this quote from Mercy Healthcare (Innovator of the Year Winner 2012) about what they did across more than 30 hospitals and 400 clinics:
Mercy Healthcare built a state-of-the-art data center and implemented a flexible cloud infrastructure to effectively deploy an Electronic Medical Health Record for storing and protecting patient information and, in the future, to support smaller clinics and healthcare systems. With the help of the NetApp FlexPod(R) architecture, we have saved over 40% of storage space, reduced power consumption by 50%, and now provide rapid access to and data protection for 1,742K patients.
In this post we introduced the technologies behind storage efficiency and in future posts we will take a more specific look at various scenarios – including backups and DR – to see how ONTAP as a storage hypervisor can provide benefits and agility which parallel and complement those provided by VMware. And we haven’t even gotten to cluster mode yet! Stay tuned….