Building Blocks of SAN 2.0 — Flash, Thunder & Lightning

EMC Adds To Their Flash Storage Solutions with “Thunder” and “Lightning”

This week EMC announced two new flash-based storage products, appropriately codenamed Lightning and Thunder.  There's a recorded webcast here, along with excellent posts by Chuck Hollis (1, 2, and 3), Greg Schulz (1 and 2), and Chad Sakac on the new solutions.  I thought I'd take a moment as well to look at these announcements from perhaps a slightly different perspective, using other products as a starting point.

[Standard disclaimer:  I'm just an IT guy interested in technology, sharing my observations and opinions.  Please feel free to offer comments, corrections and concerns by commenting at the end of this post]

The Need for Speed

This part is pretty simple.  Following Moore's Law, CPU capabilities have roughly doubled every 18 months, but what about storage?  We have faster and faster processors driving more and more storage transactions (IOPS), but there hasn't been a matching improvement in the storage arena to keep up with the increasing demand for IOPS.  Flash-based storage – which we've seen in everything from phones and tablets to SSDs – just may be the paradigm shift that's needed to help storage catch up to the demands being placed upon it.
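
To put some rough numbers on that gap, here's a quick back-of-the-envelope comparison of the IOPS a single 15K RPM drive can serve versus a single flash device.  The figures below are ballpark assumptions on my part, not vendor specs:

# Rough, illustrative service-time math for one 15K RPM drive vs. one flash device.
# All figures are ballpark assumptions, not measurements of any specific product.

avg_seek_ms = 3.5                        # typical enterprise 15K RPM seek time (assumed)
rotational_ms = 0.5 * 60_000 / 15_000    # half a rotation on average = 2 ms
hdd_service_ms = avg_seek_ms + rotational_ms

flash_service_ms = 0.05                  # ~50 microseconds, in line with the latencies quoted later in this post

print(f"Single 15K RPM drive: ~{1000 / hdd_service_ms:.0f} IOPS")    # roughly 180 IOPS
print(f"Single flash device:  ~{1000 / flash_service_ms:.0f} IOPS")  # roughly 20,000 IOPS

Even spread across dozens of spindles, the disk-based number struggles to keep pace with what today's processors can drive.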

Not only is flash storage faster, but terabyte for terabyte it's much smaller and uses less power than an array of spinning hard drive platters.  When you think about how SANs have evolved over the past few decades, it's not inconceivable that flash could essentially become the basis for SAN 2.0.  It may be a while before the economics, technology and maturity of flash solutions allow them to completely replace hard drives, but for now we are seeing flash storage introduced at strategic locations in the storage ecosystem, and that is what Lightning and Thunder are all about.

The PCI Flash Card

FusionIO is the most popular PCI Flash solution today, with cards that can provide over 500,000 IOPS and latency so low that it is measured in microseconds rather than milliseconds.  Current FusionIO cards have read and write latencies of less than 50 microseconds.

Part of the low latency comes from the fact that the storage is on the PCI bus, close to the CPU and memory.  While transports like Fibre Channel generally have low latency, latency increases once I/O hits the wire, and the most demanding high-transaction environments will "feel" this difference.
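
To illustrate the point, here's a rough latency budget comparing the two paths.  Only the PCIe flash figure matches a published spec; the SAN numbers are generic assumptions for illustration, not measurements of any particular array:

# Illustrative latency budgets for a single read, in microseconds.
# Only the PCIe flash figure comes from a published spec; the rest are assumptions.

pcie_flash_us = 50            # read served directly from the PCIe flash card

fc_wire_us = 20               # HBA, switch hops and cabling overhead (assumed)
array_cache_hit_us = 500      # array front end plus a cache hit (assumed)
array_disk_miss_us = 6_000    # cache miss that has to go to spinning disk (assumed)

print(f"PCIe flash read:      ~{pcie_flash_us} us")
print(f"SAN read, cache hit:  ~{fc_wire_us + array_cache_hit_us} us")
print(f"SAN read, disk miss:  ~{fc_wire_us + array_disk_miss_us} us")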

I had the opportunity to witness a SQL database being relocated from a SAN environment to a PCI-based FusionIO card in an ESXi 5.0 environment, and the impact was profound.  Both data throughput and IOPS increased by 400% over the previous SAN configuration (SATA disk), and even more at peak levels.  As great as this solution was from a performance perspective, it also introduced new challenges.

The Shared Storage Problem

Using a PCIe-based card for storage now meant that the storage was essentially captive to the host server.  The storage could not be shared with other hosts, making things we sometimes take for granted, like vMotion and high availability, a challenge, in addition to giving up capabilities that may have been offered by the storage array.  Fortunately there is a way to work around this somewhat.

What if we moved the databases back to the SAN, and instead used the PCIe Flash cards as a read-only cache?  This way the databases are on shared storage, once again enabling vMotion, high availability and more.  This is exactly what FusionIO has done with their IOTurbine product.

The IOTurbine solution effectively converts the FusionIO PCI card in the host into a read-only cache to accelerate database performance.  A driver is installed inside the guest OS of the VM, which enables it to leverage the FusionIO card as cache while remaining compatible and transparent with vMotion.  As virtual machines are vMotioned to different hosts, the cache on the PCI Flash card will adjust over time to optimize itself for the workloads currently running on that host.
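
To make the behavior concrete, here is a minimal sketch of how a host-side read cache in front of shared storage generally works.  To be clear, this is my own toy illustration of the concept – it is not IOTurbine's (or VFCache's) actual implementation, and the names are made up:

# Toy sketch of a read-only cache in front of shared SAN storage.
# The SAN remains authoritative for every block; the flash card only holds copies.

class HostReadCache:
    def __init__(self, san, capacity_blocks):
        self.san = san                    # the shared array: always the source of truth
        self.capacity = capacity_blocks   # how many blocks fit on the PCIe flash card
        self.cache = {}                   # block number -> cached data

    def read(self, block):
        if block in self.cache:           # hit: served at flash latency
            return self.cache[block]
        data = self.san.read(block)       # miss: fetch from the SAN as usual
        if len(self.cache) >= self.capacity:
            self.cache.pop(next(iter(self.cache)))   # naive eviction, for illustration only
        self.cache[block] = data          # warm the cache for the next read
        return data

    def write(self, block, data):
        self.san.write(block, data)       # writes always go straight through to the SAN
        self.cache[block] = data          # keep any cached copy consistent

The important property is that the shared array always holds the authoritative copy of every block, which is exactly what keeps vMotion and HA workable.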

You might be thinking, "this is all great, but what about my write performance?"  It is true that in this configuration write performance is not directly accelerated, but it may very well be indirectly accelerated.  All the read IOPS that are served up by the PCI Flash card have been offloaded from both your SAN and your SAN transport (i.e. Fibre Channel), leaving more IOPS and bandwidth for your write transactions.  I've read reports of some environments seeing write performance improve by as much as 300% due to this phenomenon.  The other consideration is your read-to-write ratio.  If 85% of your IOPS are reads and 15% are writes, for example, you should see an excellent performance increase from this type of architecture.
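
A quick worked example shows why the read-to-write ratio matters so much.  The latency figures here are assumptions I picked purely for illustration:

# Blended average service time for an 85/15 read/write mix, before and after
# adding a host-side read cache. Latency figures are illustrative assumptions.

read_ratio, write_ratio = 0.85, 0.15
san_latency_ms = 5.0      # assumed average SAN service time for any I/O
flash_read_ms = 0.05      # assumed PCIe flash read (~50 microseconds)

before = san_latency_ms                                            # every I/O hits the SAN
after = read_ratio * flash_read_ms + write_ratio * san_latency_ms  # reads now served from flash

print(f"Average service time before: {before:.2f} ms")
print(f"Average service time after:  {after:.2f} ms")   # ~0.79 ms, roughly a 6x improvement

And that's before counting the write-side benefit of the 85% of IOPS that no longer touch the array or the fabric at all.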

So perhaps we can have the best of both worlds – PCI-based Flash storage (used as a read-only cache) while retaining all the benefits of shared storage, enabling vMotion, high availability and more.  With this foundation, let's take a look at EMC's "Thunder" and "Lightning".

Thunder and Lightning

EMC has been a storage leader for some time now, so if Flash is going to play a key role in storage in the future, EMC will need to increase their investments in this area.  Flash storage is not new to EMC.  In 2010, EMC introduced flash storage as the “gold” tier on storage arrays using FAST Cache, which basically moves the “hottest” blocks on the array to flash storage in order to boost performance.  In fact EMC claims that 1.3 Exabytes of flash storage are running under EMC’s FAST solution today.  EMC is now introducing new flash solutions strategically within the storage chain to further improve options for performance.
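
As a mental model, the "hottest blocks to flash" idea looks something like the toy sketch below.  This is only meant to illustrate the concept – it is not EMC's actual FAST Cache algorithm, and the names and numbers are made up:

# Toy illustration of promoting the most frequently accessed blocks to a flash tier.
# Not EMC's algorithm -- just the general "hot blocks move up" concept.

from collections import Counter

FLASH_SLOTS = 1_000          # assumed number of blocks the flash tier can hold
access_counts = Counter()    # recent access frequency per block

def record_access(block):
    access_counts[block] += 1

def blocks_to_promote():
    # The hottest blocks get copied to flash; everything else stays on spinning disk.
    return {block for block, _ in access_counts.most_common(FLASH_SLOTS)}

for block in (7, 7, 7, 42, 42, 99):
    record_access(block)
print(blocks_to_promote())   # with so few accesses recorded, all three blocks qualify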

The first product is VFCache (codename Lightning), which uses a Micron PCI Flash card in much the same manner as the read-only cache scenario discussed earlier.  The VFCache card will vastly accelerate reads, while allowing writes to safely pass through to the storage array.  Based on my own experience with PCI Flash cards, I have no reason to doubt EMC's claims that augmenting a VNX SAN with VFCache delivers performance increases of 201% and 260% for Oracle and Microsoft SQL respectively.  What did raise my eyebrow a bit was the slide below:

Note that while mostly similar, the FusionIO card selected is only PCIe 4x while the EMC/Micron card is PCIe 8x.  FusionIO does have PCIe 8x cards (ioDrive2 Duo) which FusionIO's own benchmarks score as significantly faster than their 4x counterparts.  Taking all this into consideration, I'm tempted to postulate that if the hardware were equal, the difference between the two solutions might not be quite as large as suggested above.  Nonetheless, both solutions are certain to provide a substantial performance boost in the host systems in which they are deployed.
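
For context on why the lane count matters, here's the theoretical PCIe bandwidth math.  I'm assuming PCIe 2.0 signaling here (5 GT/s per lane with 8b/10b encoding); real cards deliver less than this due to protocol and controller overhead:

# Theoretical per-direction PCIe 2.0 bandwidth by lane count.
# 5 GT/s per lane with 8b/10b encoding leaves 4 Gb/s of usable bandwidth per lane.

def pcie2_usable_gbps(lanes):
    raw_gt_per_s = 5.0
    encoding_efficiency = 8 / 10          # 8b/10b line coding overhead
    return lanes * raw_gt_per_s * encoding_efficiency

for lanes in (4, 8):
    gbps = pcie2_usable_gbps(lanes)
    print(f"x{lanes}: ~{gbps:.0f} Gb/s (~{gbps / 8:.1f} GB/s) theoretical maximum")

In other words, the x8 slot alone gives the EMC/Micron card roughly twice the bandwidth ceiling of the x4 FusionIO card in that comparison.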

Like the FusionIO solution, EMC's VFCache uses a filter driver within the guest operating system, which allows the flash cache to be targeted and isolated to specific workloads.  It also features a vCenter plugin which allows cache settings to be modified and displays some related metrics.

What caught me a bit off guard about EMC's VFCache solution is its less than complete support for vMotion today (although I'm sure this will change).  With the 1.0 release of VFCache, it seems that vMotion to another host with a VFCache card is only possible with a series of scripts (provided by EMC) which basically disable the cache in the VM, relocate the cache, and then re-enable the cache in the VM (this is detailed better here).  In contrast, the IOTurbine solution (as I understand it) will operate a bit more transparently, with the caveat that the cache on the "new" host will have to be repopulated over time with the "correct" blocks before the previous performance level is fully realized.  (As a side note, I'm speculating that it wouldn't be terribly difficult to configure the environment to support vMotion to a host server without a VFCache card and forgo the performance benefits temporarily.)
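
As I understand the scripted workflow, the sequence looks roughly like the sketch below.  Every name here is a placeholder of my own – it's only meant to capture the order of operations (disable, migrate, re-enable), not EMC's actual scripts or APIs:

# Rough sketch of the scripted vMotion sequence as I understand it.
# The function, VM and host names are placeholders, not EMC's real tooling.

def step(action):
    print(action)    # stand-in for the real scripted operation

def vmotion_with_host_cache(vm, source_host, dest_host):
    step(f"Disable caching for {vm} (safe, since writes already pass through to the SAN)")
    step(f"Release the cache device on {source_host}")
    step(f"vMotion {vm} from {source_host} to {dest_host} -- the data itself lives on shared storage")
    step(f"Attach the VFCache device on {dest_host}")
    step(f"Re-enable caching for {vm}; the cache starts cold and warms back up over time")

vmotion_with_host_cache("sql01", "esx-a", "esx-b")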

Having said all this, VFCache is definitely a 1.0 solution today and EMC has many more capabilities they intend to introduce in the future including:

  • Mezzanine VFCache cards for blade servers (i.e. Cisco UCS)
  • De-Duplication in the card to increase logical capacity
  • Distributed-Cache (this could facilitate a seamless vMotion scenario with no temporary performance loss).
  • More focused caching algorithms and larger capacity.

In a nutshell that’s Lightning.  Here’s a short video review by Demartek that breaks down the Lightning / VFCache solution:

Thunder

As mentioned before, you get the best performance when the storage is close to the CPU and memory of the host system, but for many workloads the latency of a more traditional over-the-wire transport will be adequate.  For this scenario, EMC will be introducing "Thunder" later this year – basically a 2U or 4U flash-based SAN.  Details on the specs aren't available at this time, but it sounds as if the appliances can be "stacked" in order to provide terabytes of networked flash storage.

If I took notes properly, I believe that both Ethernet and InfiniBand interfaces will be available for connecting to "Thunder"-based flash storage.  There's also an RDMA over Converged Ethernet (RoCE) possibility – it will be interesting to see performance measurements as they become available, since this scenario fits the converged networking trend which began in earnest with FCoE (Fibre Channel over Ethernet).  And of course, in those cases where latency needs to be under 50 microseconds, Thunder can be augmented with "Lightning" to provide that high-performance/low-latency profile.

What It All Means

It took a long time for SANs to develop and evolve to where they are today.  Similarly, a transition away from the traditional hard drive in the enterprise storage array is not going to happen overnight.  But as a leader in storage, EMC is strategically introducing flash solutions at various points in the storage chain (FAST, full array, and PCI bus) in order to alleviate the storage performance gap that many environments have experienced.  Of course, EMC is not the first vendor to offer a comprehensive Flash PCI card based solution, but it will be interesting to see how EMC continues to develop the solution and integrates it within the broader EMC, VMware and even Vblock ecosystem over time.

VFCache is a 1.0 solution today, and EMC has shared their development roadmap in which they plan to continue to invest in the platform, making improvements regarding capacity, VMware integration, performance and more.  And I would certainly expect at some point to see Lightning and/or Thunder make appearances within the Vblock product line as well.  Flash – when properly positioned within the storage chain – has the potential to solve many of today’s storage performance challenges and it will be exciting to see these new solutions continue to be developed and introduced in the coming year(s).

What’s your take on the future of flash?  Did I get it right?  Join the conversation with any questions, comments or concerns below!

Responses to Building Blocks of SAN 2.0 — Flash, Thunder & Lightning

  1. BlueShiftBlog says:

    Was thinking about vMotion scenarios with PCI Flash and decided to respond to my own post…

    It seems to me that there are three scenarios for vMotion with a PCI Flash card.

    1) vMotion to a host *without* a PCI Flash card

    2) vMotion to a host WITH a PCI Flash card, but without cache pre-population

    3) vMotion to a host WITH a PCI Flash card and WITH cache pre-population

    Scenario 1 is mostly an HA/DR type play, as the cache is no longer available for acceleration. In scenario 2 there is a temporary performance drop, as it could take an hour or more for the cache to become fully populated with the correct blocks. Scenario 3 is complete support, of course, with no performance drop at any point.
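
    For what it's worth, here's a crude estimate of that warm-up time, using numbers I've simply assumed for illustration:

    # Crude warm-up estimate for a cold read cache. Both inputs are assumptions;
    # real warm-up depends entirely on the workload's access pattern.

    cache_size_gb = 300           # assumed usable capacity of the PCIe flash card
    new_block_mb_per_s = 80       # assumed rate at which previously unseen blocks are read

    warmup_minutes = cache_size_gb * 1024 / new_block_mb_per_s / 60
    print(f"~{warmup_minutes:.0f} minutes to fully populate the cache")   # roughly an hour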

    If I understand it correctly, IOTurbine supports scenario 2, while in VFCache 1.0 scripts from EMC are required to support scenario 3. At the same time, I would think it would not be hard to modify these scripts to get VFCache to support either scenario 1 or 2.

    I think EMC's intent is to support scenario 3 transparently and seamlessly in the future. Today they support this with scripts, but once distributed cache is introduced for VFCache, there will be a second host (think active-active) that has the same cache ready to be utilized.

    So while VFCache 1.0 may not support seamless and transparent vMotion for all scenarios today, it seems that it can support all three scenarios by customizing scripts. Scenario 3 will apparently be supported without scripts once distributed cache is made available for VFCache.

    At least that’s my thinking and interpretation from what I have read. What do you think?
