Software Defined Speed — A Look at PernixData FVP
PernixData FVP is a solution I’ve worked with in one environment for roughly the past six months. I’ve been meaning to write about it (more than just tweets, anyway) for some time, but I’m only now getting around to it.
The first question of course is “what does PernixData FVP do and why might I want it in my vSphere infrastructure?”. The short answer I usually give is that it’s Nitrous Oxide for your storage tier – just add FVP to your existing storage infrastructure and enjoy the speed (plus it’s legal)!
The longer answer is a bit more detailed than that, and first it would be helpful to have a quick overview of various storage architectures.
Traditional Storage Array
Here we are talking about hardware that is designed to offer up storage, usually via Fibre Channel, iSCSI or NFS protocols. For the purposes of this article, almost any hardware-based storage array from NetApp, EMC, Nimble Storage, HP, Dell and many others fits this definition. This is a tried and true design, but as our capacity and performance needs grow, scale-out ability can become an issue in some environments (especially Google, Facebook, etc.). In fairness, some storage array vendors have implemented scale-out capabilities in their solutions, but for our purposes here I am simply trying to draw a distinction between architectures at a VERY high level.
Remember scale-out NFS and Hadoop? These designs typically did not rely on a monolithic storage array but multiple nodes using direct-attached storage and logically joined by…software. First we had “software defined” compute with VMware abstracting the CPU and memory resources of server hardware. Now we are abstracting at the storage controller level as well to unlock more potential.
Recently several vendors have had success with incorporating Hyper-Scale concepts into virtual storage arrays for vSphere, including Nutanix, VMware (VSAN), SimpliVity, and more. Hyper-Scale infrastructure is truly “software defined” as software and logical controllers are the key to making this distributed and scalable architecture work.
Occasionally this design is referred to as “Web Scale” as it does invoke a highly parallel environment designed for scale, but I prefer the term Hyper-Scale for several reasons, including that the use cases go far beyond just “web”. We’re talking about applying web scale principles to present “software defined storage”.
Considerations with Hyper-Scale
If write activity is in progress on a server node and it crashes hard before the data is replicated, what happens? (The answer is “nothing good”.) The solution here is to write in parallel to two or more nodes (depending on your failure tolerance settings). This is why a 10Gb or better network backbone is critical for hyper-scale designs – every write needs to be copied to at least one more host before it is considered to be committed.
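To make the "copy before commit" rule concrete, here is a minimal sketch in Python. It is not how any particular vendor implements replication; the `Node` class, the `replicated_write` function and the host names are all hypothetical, and the "nodes" are just in-memory dictionaries. The only point is that a write is reported as committed after the local node and the required number of peers have acknowledged it, and not before.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical storage node that keeps acknowledged blocks in memory."""
    name: str
    alive: bool = True
    blocks: dict = field(default_factory=dict)

    def write(self, block_id: int, data: bytes) -> bool:
        # A crashed node never acknowledges the write.
        if not self.alive:
            return False
        self.blocks[block_id] = data
        return True


def replicated_write(local: Node, peers: list, block_id: int, data: bytes,
                     replicas: int = 1) -> bool:
    """Report a commit only if the local node and `replicas` peers acknowledge."""
    if not local.write(block_id, data):
        return False
    acks = 0
    for peer in peers:
        if acks == replicas:
            break
        if peer.write(block_id, data):
            acks += 1
    return acks == replicas


if __name__ == "__main__":
    a, b, c = Node("esx-a"), Node("esx-b"), Node("esx-c")
    b.alive = False  # simulate a hard crash on one peer
    print(replicated_write(a, [b, c], block_id=7, data=b"payload"))
```

In this toy model the write still commits with one peer down, because the second peer acknowledges it. With both peers down the function returns False, which is the situation a real system has to handle by stalling or falling back.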
Another consideration is locality to processor. For some applications anything under 20ms of latency is “adequate”, but some mission critical OLTP systems measure latency in the fractions of milliseconds. For these applications, latency can be significantly reduced by having the data closer to the CPU rather than having to fetch it from other nodes (more on this later).
Enter PernixData FVP
So let’s say you have an existing vSphere infrastructure and a storage array that could benefit from better performance but that you are otherwise comfortable with. With PernixData FVP you can keep your existing storage array, eliminating the CAPEX burden of a new one, and accelerate it by decoupling performance from the storage array onto a new logical “flash cluster” that spans your server nodes.
There are other solutions for adding flash-based read cache to your environment, including vSphere’s vFlash capability, but most are local only (no flash cluster concept) and don’t offer the ability to cache writes. PernixData FVP is unique in my experience in that it is a true flash cluster that spans your server nodes and accelerates BOTH reads and writes.
I’ve installed FVP more than a few times now and I must say it’s rather straightforward.
First you will need to install some flash in your servers. In the environment I worked on we used Fusion-io PCIe cards, but SSDs will work as well. How much flash should you use? It depends on your performance profile and objectives, but about 10% of the total size of the dataset you wish to accelerate is usually a good place to start.
Then you install PernixData FVP, which is done in two steps. First there’s a component you install on your vCenter server, which adds an additional database to track some new flash performance metrics. Once installed you can manage and view the flash cluster from the vSphere Client (including the vSphere Web Client as of FVP 1.5).
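The 10% rule of thumb is simple arithmetic, sketched below in Python. The function name `flash_per_host_gb` and the example figures (an 8,000 GB dataset across 4 hosts) are made up for illustration, and the even split across hosts is my own simplifying assumption rather than sizing guidance from PernixData.

```python
def flash_per_host_gb(dataset_gb: float, hosts: int, ratio: float = 0.10) -> float:
    """Flash to install in each host so the cluster holds `ratio` of the dataset.

    Assumes the flash is spread evenly across hosts (an illustrative
    simplification, not vendor sizing guidance).
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return dataset_gb * ratio / hosts


if __name__ == "__main__":
    # Hypothetical environment: 8,000 GB to accelerate, 4 ESXi hosts.
    print(f"{flash_per_host_gb(8000, 4):.0f} GB of flash per host")
```

For that hypothetical environment the starting point works out to about 200 GB of flash per host. Treat it as a first guess to be refined once you can see actual hit rates.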
The second step is to install the FVP VIB (vSphere Installation Bundle) on each ESXi host. I must have installed and uninstalled the FVP VIB several dozen times by now and it’s quite easy – just a standard ESXCLI VIB install.
First put the ESXi host into maintenance mode (stopping any active I/O), perform the install (a single ESXCLI command) and exit maintenance mode. Then repeat for each additional ESXi host in the cluster.
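As a rough sketch, the per-host sequence can be written down as a small Python helper that only builds the command strings (it does not connect to any host or run anything). The `esxcli system maintenanceMode set` and `esxcli software vib install` commands are standard ESXCLI. The bundle path and host names are hypothetical placeholders, and I am assuming the host extension ships as an offline bundle zip installed with `-d`; a single .vib file would use `-v` instead. Running VMs need to be migrated or powered off before a host will enter maintenance mode.

```python
def fvp_install_commands(bundle_path: str) -> list:
    """Return the ESXCLI commands for one host, in the order they are run."""
    return [
        "esxcli system maintenanceMode set --enable true",
        # -d expects the absolute path to an offline bundle (.zip) on the host.
        f"esxcli software vib install -d {bundle_path}",
        "esxcli system maintenanceMode set --enable false",
    ]


def rolling_install_plan(hosts: list, bundle_path: str) -> list:
    """Pair every host with its commands, one host at a time."""
    return [(host, cmd) for host in hosts
            for cmd in fvp_install_commands(bundle_path)]


if __name__ == "__main__":
    # Hypothetical host names and bundle location.
    plan = rolling_install_plan(["esx01", "esx02"], "/tmp/fvp-host-extension.zip")
    for host, cmd in plan:
        print(f"{host}: {cmd}")
```

Doing the hosts one at a time, rather than all at once, keeps the rest of the cluster available to run the VMs evacuated from the host being worked on.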
Once you define and create the flash cluster, you can designate policy by datastore or VM. The two policies are write-through and write-back. With a write-through policy you are only using the flash cluster to accelerate reads – the most commonly used blocks, as determined by efficient algorithms, are maintained on the flash cluster for quick access. Not only does this reduce storage latency, but it also reduces the IOPS load that your storage controller must process, which should result in a performance improvement on the storage controller as well.
With the write-back policy, writes are also accelerated by the flash cluster. Writes are committed to the flash cluster (on two nodes for failure tolerance) and are then de-staged back to the storage array as performance allows. The net result is that the commit time or latency from the application’s perspective is vastly reduced, which is incredibly important for write-intensive (e.g. OLTP) applications.
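The difference between the two policies is easiest to see in a back-of-the-envelope latency model. The Python sketch below is my own simplification, not FVP's actual algorithm: the function names and the latency figures (8 ms array, 0.2 ms flash, 0.3 ms network hop) are invented for illustration. It assumes a write-through write is acknowledged only once the array has it, while a write-back write is acknowledged once local flash and one peer's flash both have it.

```python
def read_latency_ms(hit_rate: float, flash_ms: float, array_ms: float) -> float:
    """Average read latency when `hit_rate` of reads are served from flash."""
    return hit_rate * flash_ms + (1.0 - hit_rate) * array_ms


def write_latency_ms(policy: str, array_ms: float, flash_ms: float,
                     network_ms: float = 0.0) -> float:
    """Latency the VM sees for one write under this simplified model."""
    if policy == "write-through":
        # Flash and array are written together; the slower one (the array)
        # decides when the write is acknowledged.
        return max(array_ms, flash_ms)
    if policy == "write-back":
        # Local flash and the peer replica are written in parallel; the
        # replica path includes the network hop, so it finishes last.
        return max(flash_ms, network_ms + flash_ms)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("write-through", "write-back"):
        ms = write_latency_ms(policy, array_ms=8.0, flash_ms=0.2, network_ms=0.3)
        print(f"{policy}: {ms:.2f} ms per write")
    print(f"reads at 90% hit rate: {read_latency_ms(0.9, 0.2, 8.0):.2f} ms")
```

With these made-up numbers, write-through leaves write latency at the array's 8 ms while write-back brings it down to about half a millisecond. That gap is the whole argument for write-back on write-heavy workloads.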
The chart above (from the vSphere Web Client) shows a database server accelerated by PernixData FVP over the past day. The purple line shows the latency that is incurred at the storage controller level, but the blue line is what the VM or application “feels”. The orange line represents the latency to local flash, which is measured in fractions of a millisecond. The distance between the purple and blue lines is latency that has been effectively removed from the application by PernixData FVP.
Another nice feature of FVP is that it reminds you right in the vSphere client what it is doing for you. In the environment I work on, it has saved almost 2 billion I/Os (pronounced “Beeeeeelion”) and 87TB of storage traffic just in the past 25 days.
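Totals over 25 days are hard to picture, so here is the same figure expressed as an average rate. This is plain arithmetic on the numbers above, with two assumptions of mine: "almost 2 billion" is rounded up to exactly 2 billion I/O operations, and TB is taken as decimal terabytes.

```python
SECONDS_PER_DAY = 86_400


def average_rate(total: float, days: float) -> float:
    """Average per-second rate for a total accumulated over `days`."""
    return total / (days * SECONDS_PER_DAY)


# Figures from the post, rounded: ~2 billion I/Os and 87 TB over 25 days.
ios_per_second = average_rate(2e9, 25)
mb_per_second = average_rate(87e12, 25) / 1e6

print(f"{ios_per_second:.0f} I/Os per second, {mb_per_second:.1f} MB/s")
```

That is roughly 926 I/Os and 40 MB of traffic every second, around the clock, that the storage array never had to service. The real load is bursty, so the relief during peak hours would be larger than the average suggests.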
Nitrous Oxide For Your Storage Array
In review, now you can see why I say PernixData FVP is much like adding Nitrous Oxide to a car (while of course being legal). You don’t have to buy a new car – you can just make the one you already have faster. And if you buy a new car (or storage array) you can still use your server-side flash cluster to accelerate it.
Much of what makes PernixData FVP special is the clustered file system that enables it to quickly and efficiently process writes to multiple hosts at once. This capability makes PernixData FVP a great fit for write-intensive transactional applications for which latency is key. Or maybe you have an array with slower SATA disks and you might find it more cost effective to simply accelerate it rather than getting a new storage array. Either way, adding a server-side flash cluster to your vSphere cluster will significantly boost your performance. The DBA team in this environment has seen some batch jobs speed up by over 900%.
PernixData isn’t done yet. Their next release will include the following features:
- RAM (memory) as a storage tier
- NFS Support
- Network Compression (reducing replication traffic)
- Topology Aware Replica Groups (control over the hosts used for DR and/or performance considerations)
The biggest feature there is RAM support. That’s right, you’ll be able to skip the flash if you prefer and use the RAM in your host servers as your clustered read and write cache. Just buy your host servers with the extra RAM capacity you want to use as cache and add FVP. And because memory is close to the CPU it should be quite fast. I’m looking forward to testing this capability when it comes out of beta and I’ll try to follow up with a post on that experience when the time comes.
The addition of network compression should also reduce the amount of data to be transmitted. ESXi already compresses memory pages because even with the CPU overhead it will increase performance by reducing swapping. FVP is using the same concept here to reduce the amount of data that has to be transmitted across the cluster.
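The trade-off is easy to demonstrate with a few lines of Python. I don’t know which compression algorithm FVP uses, so the sketch below uses zlib from the Python standard library purely as a stand-in, and the `wire_bytes` function and the sample payload are invented for illustration. The idea is simply that you spend some CPU to send fewer bytes to the replica host.

```python
import zlib


def wire_bytes(payload: bytes, compress: bool) -> int:
    """Number of bytes that would cross the network for one replicated write."""
    return len(zlib.compress(payload)) if compress else len(payload)


# Hypothetical, highly repetitive payload; real savings depend on the data.
block = b"INSERT INTO orders VALUES (42, 'widget');\n" * 100

if __name__ == "__main__":
    raw = wire_bytes(block, compress=False)
    packed = wire_bytes(block, compress=True)
    print(f"raw: {raw} bytes, compressed: {packed} bytes")
    # The receiving host can always recover the original block.
    assert zlib.decompress(zlib.compress(block)) == block
```

Repetitive data like this compresses very well; already-compressed or encrypted data will see little or no benefit, so the real-world gain depends heavily on the workload.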
In summary I found PernixData FVP a pleasure to use. It’s not difficult to install and it decouples most of the performance pain away from the storage controller and onto the server-side flash cluster (or RAM cluster in the next release). But the best result was seeing the impact on database performance and transaction times. If you have a write-intensive application that can benefit from server-side caching (not just reads but writes too!) then you owe it to yourself to take a look at PernixData FVP. I’ll be taking another look when 2.0 becomes available.