vSphere 6.0 Public Beta — Sign Up to Learn What’s New
Yesterday, VMware announced the public availability of vSphere 6.0 Beta 2. I can’t tell you everything that’s in it due to the NDA, but you can still register for the beta yourself, read about what’s new and download the code for your home lab. There’s some pretty exciting stuff being added to vSphere 6.0 in…
Will VMware Start Selling Hardware? Meet MARVIN
The Register is running a story that VMware is preparing to launch a line of hardware servers.
VMware Pursues SDN With Upcoming NSX Offering
Earlier this week VMware announced VMware NSX – an upcoming offering that takes network virtualization to new levels. NSX appears to be somewhat of a fusion between Nicira’s SDN technology (acquired last year by VMware) and vCloud Networking and Security (vCNS – formerly known as vShield App and Edge). Since I already had intentions to…
What Really Is Cloud Computing? (Triple-A Cloud)
What is cloud computing? Ask a consumer, a CIO, and a salesman and you’ll likely get widely varying responses. The consumer will typically think of the cloud as a hosted service, such as Apple’s iCloud, or uploading pictures to Photobucket, and scores of similar services (just keep in mind that several such services existed before it…
Agility Part 2 — The Evolution of Value in the Private Cloud
When an IT project is commissioned it can be backed by a number of different statements, such as: “It will reduce our TCO.” “This is a strategic initiative.” “The ROI is compelling.” “There’s funds in the budget.” “Our competitors are doing it.” Some of these are better reasons than others, but here’s a question. Imagine a…
Stacks, the Vblock and Value — A Chat with EMC’s Chad Sakac
…I reached out to EMC’s Chad Sakac to gain more insights from his perspective on how the various stacks…well…stacked up….
Should You Virtualize vCenter Server (and everything else?)
When concerns are raised around virtualizing vCenter Server, in my experience they usually revolve around performance and/or out-of-band management. The VROOM! blog at VMware just published a whitepaper that looks closely at vCenter Server performance as a VM versus native (physical), which speaks to these concerns as well as to other workloads. vCenter Performance…
Can your VM be restored? VSS and VMware — Part 2 (updated)
The backup job for your VM completed successfully so the backup is good, right? Unfortunately it’s not that simple and a failure to effectively deal with VM backups can result in data loss and perhaps even legal consequences.
Something struck me this month. In the contrast between darkness and light, revelations were made. But to understand this we must first begin with the darkness.
In 2014 I was in a very dark place. We were fighting a losing battle with the bank and the court system, which forced us into “survival mode,” meaning, among other things, working as many hours as possible. Normally when the doors are closed in one area of your life you can focus on another.
A career can be such an outlet, but I found myself working on 10-year-old technologies with no other opportunities or even certifications to pursue. While some certifications were attainable, time for studying was scarce when working 80-90 hours most weeks.
Then there’s always family and your home, which are incredibly important. Well, about that…
Our home technically isn’t even a house and technically has no bedrooms. It’s a 600-square-foot cottage with no working furnace, no kitchen (just a tiny sink), and a bathroom smaller than most handicapped stalls. There is no table for eating, games or homework. Our 12×16 shed collapsed and we are now storing its contents in our house (which has no basement). Now add three children ranging in age from 13 to 3.
I’m sure this all sounds superficial but I can assure you that the impact on us has been profound. You deal with it at first by telling yourself that it’s just temporary. But it turned out to be a prison. You can’t eat the way you want to. You can’t do activities with your kids the way you want to. You can’t sleep when (or how) you want to. Our health has been adversely impacted by toxic mold exposure. We’ve given up on many healthy habits we used to have because it’s just not possible. Not only has the house itself become a dysfunctional time sink, but it has affected all of us psychologically.
We made enormous sacrifices for years, paying over $3,400 EACH month for the mortgage to live in such an environment. Rather than spending time without kids this summer, we spent it working on our legal case. But it was all for nothing, as the court would not hear our story. It was done. After all the hard work and sacrifices, including paying two-thirds of what the home is worth, we would lose ALL of it.
Career, home life, family life. All were dominated by stress, and blocked paths. Time was passing, kids were growing older and goals were fading.
Without getting into the details we could never walk away from the house, because with our predatory loan we could still be liable for huge sums of money – several times more than the house is even worth.
We were trapped. But then the clouds began to part.
Now we are looking at the possibility of losing everything and being forced to move (likely out of state) in a few months. Wait, that’s a good thing? YES IT IS. Yes, we will still lose everything. We’ve lost all that time and everything that we paid to the bank with nothing to show for it. The difference is that we have hope. We now have a reason to believe that hard work and effort can begin to positively impact us.
This is the power of hope. Without hope there is nothing but despondence and just trying to make it through the day while another week, month and year passes by. But with the hope that you can work for positive change, your outlook completely changes. Even though we have been wronged and lost badly, I feel as if an anchor has been removed from around my neck. We might have nothing, but now I can DO SOMETHING. In this case I can try to start a new life somewhere else. I can now have hope that my family can experience what living in a middle-class American house feels like. I can now have hope that I can reclaim time and healthy habits and pursue personal and career goals.
I have a special needs daughter who endured 10 surgeries in her first year and more extreme surgeries in the years to come (her story here). I found myself asking, “why were we able to endure all of that with a positive attitude but be so negative and despondent this time?” We made incredible sacrifices in all areas of our life (career, finances and time) to do everything we could for her. It was never even a question; we did it enthusiastically and with passion. Why would one set of challenges which absolutely immersed us be so different from another?
We ultimately didn’t have control in either situation. In one case we were doing everything for someone else, whereas in the other we couldn’t help ourselves. As time went on, what we had dismissed as temporary sacrifices began to control and impact us and change our lives for the worse.
I’m not happy that my family and I have lost several years of what “could have” or “should have” been. But now I have hope that with some hard work we can salvage what time we do have left. It really does feel like stepping out of a prison after 5 years. There might be a hard road in front of us, but right now I’m enjoying the feeling of the sun on my face.
Perspective has a profound impact on how we approach life. Are we letting our perceptions or fears impact and control us? What’s holding us back? Can we change our attitudes, our motivations, and even our environments to improve ourselves and attack both our problems and challenges? Are we really doing our best or are we letting negative thoughts in the workplace or in the home influence us? Are we making moves to be leaders or are we the cynic shaking their head in the corner? Are we finding fulfillment by reaching out to others to help and nurture them?
I do not yet know why all this happened or what purpose or lesson it may serve. I’m a perfectionist and I’ve always felt I would be a failure if I didn’t start a successful business or do something significant with my time in this world. But just having hope and purpose is the best feeling and motivation I could have right now.
Here’s to a great 2015 for everyone, and while perhaps I should be scared and nervous, I’m too excited about the freedom and the hope to effect positive change. I found my motivation – find yours. Because time doesn’t stop.
UPDATE: vCloud Air OnDemand is out of beta and has now entered an Early Access Program for which you can sign up here.
Recently I’ve had the opportunity to explore a beta of VMware’s upcoming cloud offering, vCloud Air OnDemand, through their Ambassador program. I wanted to share my observations and experiences, but there’s so much to talk about that I found it better to start with an introductory post and drill into the details in a future post.
The quick version is that vCloud Air’s Virtual Private Cloud OnDemand is pretty much what it sounds like: hosted IaaS (Infrastructure as a Service) running on VMware, enhanced with SDN, with on-demand availability and pricing — meaning that you are billed only for the resources (CPU, memory, disk, etc.) you actually consume. It’s like the electricity meter on your home, except it measures the resource utilization of your virtual datacenter in the “cloud”.
Amazon (AWS), Azure and Google are on almost everyone’s short list of IaaS service providers, but there may be some good reasons to put VMware on your short list as well.
The vCloud Air service is compelling for several reasons. To start, it runs VMware vSphere, which provides easy and familiar methods for integrating with existing on-prem infrastructure. Perhaps you have a new project and don’t have time to wait for more hardware and capacity, but still need to maintain your operational methods and security. For many use cases vCloud Air Virtual Private Cloud will be seen as compelling — especially where vSphere is already used. And with over 99% of the Fortune 1000 using VMware, that’s…well…most of us.
Before we explore Virtual Private Cloud OnDemand in more detail, I’d first like to step back and review the different cloud types and their use cases.
Private, Public, Hybrid
The original key distinctions between private and public cloud were mostly control and multi-tenancy. With a private cloud the hosted infrastructure was exclusively yours and therefore afforded more control, whereas in a public cloud your workloads might share hardware with those of other tenants (multi-tenancy), which could lead to the “noisy neighbor” problem.
Advances in hypervisors, I/O virtualization, SDN and orchestration have made this less of a distinction nowadays, as more control is available to the consumer and the “noisy neighbor” is not the threat it once was.
A hybrid cloud, then, is essentially a combination of an “in-house” private cloud and infrastructure from an external service provider. A perfect example is a business that runs VMware vSphere internally in its datacenter. Let’s say a new project comes along; rather than buy new infrastructure (and incur the associated delays), the business could simply extend and scale its existing vSphere infrastructure to a hosted offering and be billed only for what is consumed.
Is vCloud Air Hybrid or Private?
In 2013, VMware launched the vCloud Hybrid Service (vCHS), which was positioned as the hosted cloud infrastructure needed to evolve an on-premises environment into a hybrid cloud. The vCloud Connector facilitated building a unified view of the hybrid cloud, allowing you to view, manage and migrate workloads from either the on-premises side or the hosted side.
Just this past September the service was re-branded as vCloud Air, with the service offering now called Virtual Private Cloud (a dedicated option is available). What changed such that it’s now called a private and not a hybrid cloud? Yes, there’s a bit of marketing here, but also a pretty important point. Private cloud is all about control. Do you control the security, the operations, the processes?
When you start with the vCloud Air service you create a virtual datacenter. There is no external access until you establish firewall rules, public IPs, SNAT/DNAT rules, routing and more. There’s also VPN and load balancing services built in.
If that sounds like a lot, it’s not; it’s quite straightforward, as you’ll see in the next post. The point is that you have such a strong level of control that it really can be considered a private cloud. It’s like the difference between ordering a sandwich someone else designed versus building your own. As an engineer who has encountered the friction and delays that silos bring, I found it liberating to be able to quickly design the virtual datacenter — network, storage, compute — to my requirements. And of course if you integrate the Virtual Private Cloud with an on-premises environment, you still have a hybrid cloud spanning those two environments.
Introducing vCloud Air Virtual Private Cloud OnDemand
The “original” vCloud Air service that went live last year is Virtual Private Cloud. It is powered by vCloud Director, providing VMware users with a construct and interface that are familiar from their on-premises environment. With this service, capacity is purchased in “blocks”. For example, a starting block might consist of 20GB of memory, 10GHz of CPU and 2TB of storage (pricing as of November 16, 2014 shown below).
The new OnDemand service has many similarities with the original service. They both run vSphere and vCloud Director. They both employ SDN using VMware’s own offerings. They both integrate into vCenter Server using the vCloud Air plug-in. They both allow stretched Layer 2 and Layer 3 so that you can “bring your own IPs” and also feature Direct Connect options (private circuit).
My understanding is that the OnDemand service is a new “pod” within the vCloud Air service, meaning that it is a new and separate rack design and configuration. The new OnDemand service — as its name would suggest — uses an on-demand pricing model. Rather than purchasing “blocks” of capacity, you are billed for what you consume as you consume it. I haven’t done much in the past 24 hours, but below you can see a screenshot of my billing for that period, broken down by CPU, memory, storage (SSD and standard tiers), and public IPs.
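To make the consumption-based model concrete, here is a purely hypothetical illustration (these are made-up rates, not vCloud Air’s actual pricing): if CPU were billed at $0.05 per GHz-hour and memory at $0.01 per GB-hour, a VM averaging 2GHz and 4GB over 24 hours would accrue (2 × 0.05 + 4 × 0.01) × 24 = $3.36, before storage and public IP charges. The point is simply that the meter only runs against what is actually consumed.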
Each account has a single billing point, but as we’ll see in a future post, it is possible to create multiple virtual datacenters (VDCs) within your account, both to track internal costs and to control access.
Use Cases for Virtual Private Cloud OnDemand
There are many use cases that are a very good fit for the OnDemand service. If you’re a new company without much capital, you might want to just use the virtual datacenter as your primary datacenter.
If you’re a medium or large business with an established on-prem vSphere infrastructure, you might elect to keep your most critical applications and data on premises, but still leverage the OnDemand service for seasonal capacity, test/dev and new projects.
I was working at a Fortune 500 once when a new project came up which required a large number of web servers, databases and middleware. How nice it would have been — and how much more quickly we would have been able to execute — if we could have simply defined our vApps in Virtual Private Cloud OnDemand and then cloned and distributed them as needed in the vCloud Air service. You might even choose to keep the databases on-premises but put the web tier out in the cloud. You have the flexibility to align your workloads between on-premises and vCloud Air with whatever balance and topology works best for your organization and its security and operational requirements.
As you could imagine, disaster recovery for on-premises vSphere deployments is a very popular use case and quite straightforward to set up. Today, the Disaster Recovery option is offered as a discounted tier on the original Virtual Private Cloud service, but it is my understanding that this will move to the OnDemand service in the future. This would be a very effective pricing model: when using the capacity for hot-site replication, most of your consumption in the passive state will be storage. CPU and memory would stay at relatively low levels until a failover occurred, at which point they would increase as all the instances come online. OnDemand capacity when you need it.
Sign Up and Getting Started
I’ll go through a detailed walkthrough later, but the effort required to start creating VMs and consuming resources is relatively low. I simply registered for the service, supplied a credit card, and once I was confirmed I was off creating my virtual datacenter and spinning up virtual machines and vApps. This was my first time using vCloud Air, but it was not my first time using VMware, so it didn’t take me long to find my way around and be productive in the vCloud Director interface within the vCloud Air service. Within a few hours of signup, most should be able to define their networks and start provisioning VMs.
VMs, vApps and Catalogs
Within vCloud Air there is a public catalog from which you can instantly provision new VMs. At this time, the public catalog includes multiple editions of CentOS, Ubuntu and Windows Server. The Windows Server VMs incur a licensing surcharge which is prorated to an hourly rate. In other words, you are effectively renting the Windows Server license by the hour.
There are two other important ways to populate your own private catalog within vCloud Air. First, you can import any OVA into your private catalog, either from a URL or a local file — which includes the over 1,700 virtual appliances available on the VMware Marketplace. The second way is to simply upload your own ISO to your catalog. Just to prove the point that it could easily be done, I uploaded a Windows ISO to my private catalog in vCloud Air and was able to build a VM from scratch right from the ISO. Also, using the vCloud Connector you can even keep your catalogs in sync between your on-premises vSphere environment and vCloud Air.
vApps are a vCloud Director construct which solves several problems. You can add multiple VMs and define rules for how they should work together. A vApp can be an n-tier application or just a set of servers that need to be managed by a common team. You can define leases on vApps as a cost-control measure (e.g. power off after x hours, delete storage when off for x days) and even fencing, which ensures that VM clones existing in multiple vApps have unique MAC and IP addresses. More on this later, but there’s a lot of rich capability here for designing and managing your virtual datacenter.
The vCloud Air plugin that is built into current versions of the vSphere client provides support for administering vCloud Air right from within the vSphere Web client. The video below provides a walkthrough of the functionality available in the vCloud Air plugin.
Having administered many vSphere environments, I’ve been somewhat spoiled by the ability to quickly extract rich metrics on VMs and hosts using vCenter, and even more with vCOPS. In the vCloud Air environment you can see CPU, memory and storage utilization for your virtual datacenters and vApps, but that’s about it. The hosts really don’t need to be in the picture (that’s sort of the point of a cloud service), but it would be nice to know some key VM metrics (what’s my storage latency or memory allocation over time?).
Two things here. One is that there’s nothing stopping you from using the monitoring solution of your choice. Want to use Microsoft System Center, CA UIM, Nagios, etc.? Use whatever processes you use in house today. The second thing is that VMware has a robust monitoring solution in vCOPS. I would not be at all surprised if VMware were to release a version of it that works within vCloud Air in the future.
There’s much more here in terms of features and connection options that I haven’t drilled into and which I’ll try to explore in future posts. But to step back a bit, many IT consultancies have suggested that hybrid cloud is the new normal — the business having the ability to consume on-prem and hosted capacity as needs arise, with use-case flexibility and functional integration (e.g. the vCloud Air plug-in in vSphere). Some cloud providers will require you to make adjustments to operational procedures and security, but vCloud Air does a good job of making this feel seamless for VMware shops. Also keep in mind the appeal of multi-cloud (using more than one cloud service provider), which can be used to mitigate risk, provide flexibility and expand DR options. And if you don’t already have a DR solution, you may want to take a look at vCloud Air’s Disaster Recovery service.
Most companies will want to explore options for both hybrid cloud and multi-cloud scenarios for many compelling reasons. As a long-time VMware vSphere engineer, I found the vCloud Air service very accessible and easy to get started with. If you have a significant VMware vSphere deployment in your organization, or even if you are just starting out, you owe it to yourself to include vCloud Air in your short list of options. With the new OnDemand service and its utility pricing model being prepared for launch, and more datacenters being added globally, vCloud Air is worth a close look.
vSphere 6 has been in public beta for several months now, and this week at VMworld some of the new capabilities were made public. vSphere 6 remains in beta ahead of a future release (sign up here!), but let’s take a quick look at some of the new features that have been announced (so far).
SMP for Fault Tolerance
Just a quick overview here. Fault Tolerance is a pretty neat feature that keeps a second copy of a VM in complete lockstep for HA purposes. The second VM has its own VMDKs, which can sit on a different datastore or SAN, while each CPU transaction is maintained on both hosts. This is a great way to provide redundancy for applications which can’t afford to lose cycles during a failover event, but the Achilles’ heel was always that it was limited to a single vCPU.
VMware announced earlier this year that it would be discontinuing vSphere Heartbeat and now we know why. With Fault Tolerance being able to support VMs with up to 4 vCPUs in vSphere 6, it would no longer be necessary for high availability to be provided by in-OS clustering. VMs of up to 4 vCPUs and 64GB of RAM can now enjoy the benefits of VMware Fault Tolerance.
Some of the vMotion improvements announced include being able to vMotion across different vCenter instances, across routed networks (this “may” work now but was never formally supported), and perhaps most importantly, long-distance vMotion.
The latency tolerance for vMotion is being increased from 10ms to 100ms in vSphere 6! With such a generous latency tolerance, many more vMotion scenarios become possible without the usual geographic penalties. Personally, I think VMware should demonstrate this capability by vMotioning a VM to an EVO:RAIL cluster in a hot air balloon over a 4G LTE wireless network.
This is a huge feature in my opinion – a whole evolution beyond what VAAI introduced – and rather than try to drill deep here I’ll stick to a simple overview. A vVol is a new logical construct that appears as a datastore in your admin tools and allows the virtual disk to be a “first-class citizen” in storage (versus the LUN or volume). A vVol does not use VMFS but is a new abstraction layer that enables object-based storage access (with your VMDKs being the objects).
There are several things going on here which I’ll just quickly touch on. First, there is one protocol endpoint now versus many, as illustrated below. This enables more API capabilities to be exposed, and if I understand correctly, VMware plans to allow third parties to develop filter APIs here.
Protocols are consolidated into a single endpoint
vVols are hardware-integrated much like VAAI, which means the storage vendors will develop their implementations of the API to activate the capabilities of their storage arrays. For example, one capability is the ability to offload the snapshot function from a copy-on-write delta file to the storage array. While snapshots are an awesome feature of vSphere (and not backups, by the way), I’m not a big fan of the copy-on-write delta file method. I’ve seen snapshot chains 40+ levels deep (without anyone knowing) and snaps that were left open for months until the datastore filled up. By offloading snapshots and other operations to the storage array, these things can be handled a lot more effectively.
I didn’t even get to storage profiles yet, which allow you to define what characteristics a certain VMDK should have. There are many scenarios here, but at a high level just removing the complexity of LUNs and RAID characteristics from admins is a big deal. When a VM is provisioned, the admin needs only to select the storage policy (or one is enforced for them) and the desired settings are applied without the underlying complexity being visible.
With that very basic intro, I highly encourage you to read one or more of the following blog posts which go FAR deeper into vVols, how they work, and their benefits.
This is a new logical construct within vSphere which allows you to join multiple vSphere clusters into one construct to enforce consistent policy settings, provide a top-level management point and facilitate cross-cluster vMotion.
Improved Web Client
The web client has improved significantly with each release, but many (like me) still find it a bit slow at times. It’s clear that VMware has spent some time on this: from using the beta, I can assure you that there is a significant improvement in response time between the 5.5 and 6.0 web clients.
That’s just a quick summary of some of the features that were mentioned in the general session. Even more details should become available as vSphere 6 gets closer to a GA (General Availability) release. If you’re anything like me, you probably can’t wait for vSphere 6 – perhaps the biggest feature I’m looking forward to is vVols. Until then, happy virtualizing!
The much rumored “MARVIN” has manifested today as EVO:RAIL which represents VMware’s entry into the “Infrastructure In-A-Box” or hyper-converged market.
Each “RAIL” consists of a block of four (4) x86 rack-mount servers available from a list of partners, with VMware vSphere and VSAN. Because EVO can scale, this solution will likely find acceptance in branch offices as well as some larger scale-out designs – all with an HTML5 front end. Customers can now simply procure virtualization infrastructure — including storage — by purchasing additional “RAILs” as needed for scale.
“Shall we dance?”
This is truly a software-defined infrastructure solution which enables IT shops to procure infrastructure through a single vendor and scale out as needed. Nutanix was the first to find success with this business model, and others are sure to follow (also see Cisco and SimpliVity). I expect that this will be an increasingly popular (and disruptive) trend in the marketplace.
EVO RACK will also be announced as a tech preview; it is intended to scale SDDC infrastructure to multiple server racks.
Captain Picard can’t contain his enthusiasm for VMworld 2014
It’s the season for VMworld and all of us are getting a bit excited. I’ve never been to VMworld (and won’t this year either) but I’m still quite excited about what this VMworld will bring. Why? I’m glad you asked.
The two big reasons are what’s going to be announced/revealed, as well as all the great ways to follow VMworld remotely (I’m an expert at this now!). My mind is already racing with designs, use cases and plans for deploying several of the new capabilities we expect to be announced.
This is all under NDA so we can’t talk about all the exciting new capabilities just yet, but if you’ve participated in the vSphere 6 beta you know that there are some pretty major features we can expect to be announced here, and possibly a surprise or two as well. One of the features we do know a bit about is…
vVols aren’t really a new concept; they were introduced at VMworld 2012 as a preview of where VMware would be going with storage. Over two years in the making and now with the overwhelming support of VMware’s storage partners (EMC, NetApp, Nimble Storage and more), vVols are poised to make a big splash. More on this after the embargo is lifted, but here’s some available content from VMware on vVols until then.
Call it hyperscale, scale-out, or software-defined (all of these work), but we are basically talking about modular hardware sold as single units which can be joined to form large pools of vSphere infrastructure. We’re not just talking about vSphere here, but also software-defined storage (i.e. VSAN) and possibly SDN as well (i.e. NSX). Nutanix is one vendor who already sells hardware based on this model, and quite successfully.
There have been rumors all summer about VMware offering such a single-box model, which has so far been named MARVIN, Magic and Mystic, if I’m not mistaken. Now we’ll get a chance to see the details behind what may be VMware’s entry into the hardware market. I also wouldn’t be surprised at all to see some other big names making similar moves in this new and growing space.
PernixData FVP 2.0
I posted on PernixData FVP 1.5 here (which won “Best New Product” at VMworld 2013), and 2.0 will be a big jump with some exciting features – including the ability to use memory on your ESXi hosts as an acceleration tier (read cache and clustered write offloading).
A growing differentiation point among cloud providers is the services offered on top of the stack, and VMware introduced disaster recovery a few months ago. I hope to see some enhancements and possibly even new services and/or pricing options announced at VMworld.
Sessions are a huge part of the value of VMworld. These sessions are recorded and (in time) are made available for on-demand playback (access required). Duncan Epping has taken the time to highlight some of this year’s “must attend” VMworld sessions here.
Keeping Up Remotely
Like I said, I’m an expert on this. Several of the general sessions will be available via live stream, and there’s Twitter and the bloggers as well. It’s not the same as being there, but it’s not hard to keep up with the details and big news either.
Looking forward to a great VMworld and some exciting new solutions and offerings that will help us solve problems, fill gaps and create value. Have an enjoyable and safe VMworld whether you’re attending in person or remotely!
PernixData FVP is a solution I’ve worked with in one environment for perhaps the past 6 months or so. I’ve been meaning to write about it (more than just tweets, anyway) for some time, but I’m only now getting around to it.
The first question, of course, is “what does PernixData FVP do and why might I want it in my vSphere infrastructure?” The short answer I usually give is that it’s nitrous oxide for your storage tier – just add FVP to your existing storage infrastructure and enjoy the speed (plus it’s legal)!
The longer answer is a bit more detailed than that, and first it would be helpful to have a quick overview of various storage architectures.
Traditional Storage Array
Here we are talking about hardware that is designed to serve up storage, usually via Fibre Channel, iSCSI or NFS protocols. For the purposes of this article, almost any hardware-based storage array from NetApp, EMC, Nimble Storage, HP, Dell and many others fits this definition. This is a tried and true design, but as capacity and performance needs grow, scale-out ability can become an issue in some environments (especially at the likes of Google, Facebook, etc.). In fairness, some storage array vendors have implemented scale-out capabilities in their solutions, but for our purposes here I am simply trying to draw a distinction between architectures at a VERY high level.
Remember scale-out NFS and Hadoop? These designs typically did not rely on a monolithic storage array but on multiple nodes using direct-attached storage, logically joined by…software. First we had “software-defined” compute, with VMware abstracting the CPU and memory resources of server hardware. Now we are abstracting at the storage controller level as well to unlock more potential.
Recently several vendors have had success incorporating hyper-scale concepts into virtual storage arrays for vSphere, including Nutanix, VMware (VSAN), SimpliVity, and more. Hyper-scale infrastructure is truly “software defined,” as software and logical controllers are the key to making this distributed and scalable architecture work.
Occasionally this design is referred to as “Web Scale” as it does evoke a highly parallel environment designed for scale, but I prefer the term hyper-scale for several reasons, including that the use cases go far beyond just “web”. We’re talking about applying web-scale principles to deliver “software-defined storage”.
Considerations with Hyper-Scale
If write activity is in progress on a server node and it crashes hard before the data is replicated, what happens? (The answer is “nothing good”.) The solution here is to write in parallel to two or more nodes (depending on your failure-tolerance settings). This is why a 10GbE or better backbone is critical for hyper-scale designs – every write needs to be copied to at least one other host before it is considered committed.
Another consideration is locality to the processor. For some applications anything under 20ms of latency is “adequate”, but some mission-critical OLTP systems measure latency in fractions of a millisecond. For these applications, latency can be significantly reduced by keeping the data close to the CPU rather than having to fetch it from other nodes (more on this later).
Enter PernixData FVP
So let’s say you have an existing vSphere infrastructure with a storage array that could benefit from better performance but that you are otherwise comfortable with. With PernixData FVP you can keep your existing storage array — eliminating the CAPEX burden of a new one — and accelerate it by decoupling performance from the array onto a new logical “flash cluster” that spans your server nodes.
There are other solutions for adding a flash-based read cache to your environment, including vSphere’s vFlash capability, but most are local only (no flash cluster concept) and don’t offer the ability to cache writes. PernixData FVP is unique in my experience in that it is a true flash cluster spanning your server nodes that accelerates BOTH reads and writes.
I’ve done this more than a few times now, and I must say it’s rather straightforward.
First you will need to install some flash in your servers. In the environment I worked on we used Fusion-io PCIe cards, but SSDs will work as well. How much flash should you use? It depends on your performance profile and objectives, but as a general rule of thumb, about 10% of the total size of the dataset you wish to accelerate is usually a good place to start.
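As a purely illustrative example of that rule of thumb: if the active dataset you want to accelerate is around 5TB, you would target roughly 500GB of flash across the cluster, which works out to about 125GB per host in a four-host cluster. From there you can adjust up or down based on the hit rates and latency you observe.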
Then you install PernixData FVP, which is done in two steps. First, there’s a component you install on your vCenter Server which adds a database to track some new flash performance metrics. Once installed, you can manage and view the flash cluster from the vSphere Client (including the vSphere Web Client as of FVP 1.5).
Managing the Flash Cluster from the vSphere Web Client (click to enlarge)
The second step is to install the FVP VIB (vSphere Installation Bundle) on each ESXi host. I must have installed and uninstalled the FVP VIB several dozen times by now and it’s quite easy – just a standard ESXCLI VIB install.
First put the ESXi host into maintenance mode (stopping any active I/O), perform the install (a single ESXCLI command), exit maintenance mode, and repeat for all remaining ESXi hosts in the cluster.
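As a rough sketch of that per-host sequence (the datastore path and VIB filename below are placeholders; use the bundle supplied by PernixData):

# Enter maintenance mode to stop active I/O on the host
esxcli system maintenanceMode set --enable true

# Install the FVP host extension (path and filename are placeholders)
esxcli software vib install -v /vmfs/volumes/datastore1/pernixdata-fvp.vib

# Exit maintenance mode, then move on to the next host in the cluster
esxcli system maintenanceMode set --enable false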
Once you define and create the flash cluster, you can designate a policy by datastore or VM. The two policies are write-through and write-back. With a write-through policy you are only using the flash cluster for reads – the most commonly used blocks, as determined by efficient algorithms, are maintained on the flash cluster for quick access. Not only does this reduce storage latency, it also reduces the IOPS load that your storage controller must process, which should result in a performance improvement on the storage controller as well.
With the write-back policy writes are also processed by the flash cluster. Writes are written to the flash cluster (two nodes for failure tolerance) and are then de-staged back to the storage array as performance allows. The net result is that the commit time or latency from the application’s perspective is vastly reduced — incredibly important for write-intensive (i.e. OLTP) applications.
1 Day IOPS Chart for a database VM (click to expand)
The graph above (from the vSphere Web Client) shows a database server accelerated by PernixData FVP over the past day. The purple line shows the latency incurred at the storage controller level, while the blue line is what the VM or application “feels”. The orange line represents the latency to local flash, which is measured in fractions of a millisecond. The distance between the purple and blue lines is latency that has been effectively removed from the application by PernixData FVP.
Also, one nice feature of FVP is that it shows you right in the vSphere Client what it is doing for you. In the environment I work on, it has saved almost 2 billion I/Os (pronounced “Beeeeeelion”) and 87TB of storage traffic in just the past 25 days.
Nitrous Oxide for Your Storage Array
In review, now you can see why I say PernixData FVP is much like adding nitrous oxide to a car (and legal, of course). You don’t have to buy a new car – you can just make the one you already have faster. And if you do buy a new car (or storage array), you can still use your server-side flash cluster to accelerate it.
Much of what makes PernixData FVP special is the clustering technology that enables it to quickly and efficiently process writes to multiple hosts at once. This capability makes PernixData FVP a great fit for write-intensive transactional applications for which latency is key. Or maybe you have an array with slower SATA disks and find it more cost-effective to simply accelerate it rather than buy a new storage array. Either way, adding a server-side flash cluster to your vSphere cluster will significantly boost your performance. The DBA team in this environment has seen run times on some batch jobs improve by over 900%.
PernixData isn’t done yet. Their next release will include the following features:
Topology Aware Replica Groups (control over the hosts used for DR and/or performance considerations).
The biggest feature there is RAM support. That’s right, you’ll be able to skip the flash if you prefer and use the RAM in your host servers as your clustered read and write cache. Just buy your host servers with the extra RAM capacity you want to use as cache and add FVP. And because memory is close to the CPU it should be quite fast. I’m looking forward to testing this capability when it comes out of beta and I’ll try to follow up with a post on that experience when the time comes.
The addition of network compression should also reduce the amount of data to be transmitted. ESXi already compresses memory pages because even with the CPU overhead it will increase performance by reducing swapping. FVP is using the same concept here to reduce the amount of data that has to be transmitted across the cluster.
In summary, I found PernixData FVP a pleasure to use. It’s not difficult to install, and it decouples most of the performance pain from the storage controller, moving it onto the server-side flash cluster (or RAM cluster in the next release). But the best part was seeing the impact on database performance and transaction times. If you have a write-intensive application that can benefit from server-side caching (not just reads but writes too!), then you owe it to yourself to take a look at PernixData FVP. I’ll be taking another look when 2.0 becomes available.
Cisco UCS servers have made quite an impact in the market and are currently #1 in blades. Most UCS Servers don’t use any local storage beyond maybe booting ESXi from an SD card. But what if you had a use case where you needed to use direct attached storage? Not a common use case today, but VMware VSAN is likely to change that.
The problem I encountered is that UCS servers would not report health for storage elements to ESXi. Cisco UCS servers use LSI controllers, and we were completely blind to events like a hard drive failure, RAID rebuild, predictive failure and so forth. The use case here was a single UCS-C server with direct-attached storage, which hasn’t been a common use case until just now with VMware VSAN.
Using different combinations of drivers blessed by VMware and Cisco, I was unable to get physical drive and controller health to report in ESXi. I did my due diligence with a few Google searches but was unable to find any solution.
Then I went to the LSI website to look at the available downloads, and something caught my eye — an SMI-S provider for VMware. I remembered that SMI-S is basically CIM, which is what ESXi uses to collect health information. This is a separate VIB that is independent of the megaraid_sas driver in ESXi. With the SMI-S provider installed in ESXi, I could suddenly see all the things that had been missing in the health section, such as:
Physical drive health
Logical drive health
Basically the moral of the story is this — if you have an LSI array controller (common in UCS-C) then you’ll need to follow these steps to get health monitoring on your storage elements:
1) Go to LSI’s website and download the current SMI-S provider for VMware for your card.
2) Upload the VIB file to a VMFS datastore
3) From an SSH shell type “esxcli software vib install -v [full path to vib file]”
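For reference, here is a minimal sketch of step 3 plus a quick verification from the ESXi shell (the datastore path and VIB filename are placeholders; use the actual file from LSI’s download). Depending on the host, restarting the CIM service (or rebooting) may be needed before the new sensors appear:

# Install the LSI SMI-S provider VIB (path and filename are placeholders)
esxcli software vib install -v /vmfs/volumes/datastore1/lsi-smis-provider.vib

# Confirm the provider VIB is installed
esxcli software vib list | grep -i lsi

# Restart the CIM service so the new provider is picked up (a host reboot also works)
/etc/init.d/sfcbd-watchdog restart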
I’m not clear on why this capability is not exposed by the driver, but it seems for the time being that installing this additional VIB is required to get ESXi to monitor the health of storage elements on LSI controllers.
Yesterday, VMware announced the public availability of vSphere 6.0 Beta 2. I can’t tell you everything that’s in it due to the NDA, but you can still register for the beta yourself, read about what’s new and download the code for your home lab.
There’s some pretty exciting stuff being added to vSphere 6.0 in quite a few areas. One of these new areas is vVols — a new abstraction for volumes that enables tighter integration with storage arrays through the VASA API. You can read more about vVols in vSphere 6.0 on Rawlinson’s post.
One more thing — after you sign up for the beta you will be able to attend the following two webinars on the vSphere 6.0 beta:
Introduction / Overview – Tuesday, July 8, 2014
Installation & Upgrade – Thursday, July 10, 2014
Needless to say there’s some pretty awesome stuff in the 6.0 Beta. Start your download engines!
Back in 2010, I noted the entry of Nimble Storage into the storage market with this blog post. With the release of their new CS700 line and what they call Adaptive Flash, I figured it was a good time for a second look.
Before we look at the new offerings, a quick refresher on Nimble Storage’s CASL architecture is in order. CASL stands for Cache Accelerated Sequential Layout, and Nimble describes its key functions here:
CASL collects or coalesces random writes, compresses them, and writes them sequentially to disks.
Nimble states that this approach to writes can be “as much as 100x faster” than traditional disks. The image below is a bit fuzzy, but if you click to expand it should be readable.
CASL Features (click to enlarge)
It is important to note that both the compression and the automated storage tiering to flash are inline (no post-process or bolt-ons), which adds additional efficiency. Features such as snapshots, data protection, replication and zero-copy clones are also included.
The CS700 is the new model, featuring Ivy Bridge processors, 12 HDDs and 4 SSDs for a hybrid storage pool that Nimble claims is up to 2.5x faster than previous models, with up to 125K IOPS from just one shelf.
Now you can buy expansion shelves for the CS700, including an all-flash shelf, and this is where something called “Adaptive Flash” kicks in. The all-flash shelves host up to 12.8TB of flash each in a 3U shelf and are used exclusively for reads. I found the product materials on Adaptive Flash to be a bit light on technical details, but from what I can discern some of the secret sauce is provided by a back-end cloud engine.
Nimble Storage has a robust “phone home” feature called InfoSight which sends health, configuration and utilization information to cloud services for analysis. Several vendors do this, but the twist here seems to be that they use the resources of the cloud-based engine to “crunch” your utilization data and send guidance back to your controllers on how they should be leveraging the flash tier. In summary, the big idea seems to be that leveraging greater computing resources in the cloud, “big data” style, can produce better decisions on cache allocation and tuning than the controllers could make on their own.
The Big Picture
Nimble uses a scale-out architecture that joins storage nodes into clusters. Nimble Storage claims that a four (4) node cluster with Adaptive Flash can support half a million IOPS.
Below is a table (created by Nimble Storage) which positions the CS700 in a 4-node cluster against EMC’s VNX7600 with XtremIO. I’d like to see an independent comparison, but it appears Nimble Storage may be on to something with this architecture.
All-flash arrays are nice, but they aren’t the only game in town. Nimble Storage seems to have a compelling story around a hybrid solution driven by both controller software and back-end software hosted on cloud services.
Evidence for MARVIN’s existence comes from two sources.
One is this trademark filing describing MARVIN as “Computer hardware for virtualization; computer hardware enabling users to manage virtual computing resources that include networking and data storage”.
The second source is the tweet below, which depicts a poster for MARVIN on a VMware campus.
If this pans out to be true it would be a very interesting development indeed. It is important to note that the trademark specifically says that MARVIN is “hardware”. But will it be VMware’s hardware? As Christian pointed out in his post, it would go against VMware’s DNA to sell its own hardware. But EMC — VMware’s majority owner — already has VSPEX — a confederated hardware offering from multiple OEMs but purchased through EMC. It seems more plausible that VMware would leverage a VSPEX-like model and utilize Dell, Cisco, SuperMicro, etc. hardware for MARVIN. What VMware really needs is a way to sell converged infrastructure nodes as one SKU (mitigating design risk) with one point of support — a VSPEX-like model for MARVIN would accomplish exactly this without VMware actually selling its own hardware.
MARVIN at first glance would also seem to be a validation of the Nutanix model — build a scale-out storage solution and sell it as boxes that include the full stack. That’s not an apples-to-apples comparison and it’s not my intent to split hairs here, but one of the attractive things about the Nutanix model is that “you just buy a box”. By combining VMware VSAN with vSphere and hardware, VMware can offer a scale-out modular solution where customers just need to “buy a box” as well.
Of course it’s possible to build your own VSAN-enabled vSphere cluster using hardware of your choice from the HCL, but as noted with some recent issues, there’s some risk in not selecting the optimal components. By offering a complete IaaS stack as a modular hardware unit, VMware eliminates the “design risk” for the customer and enables more support options.
One more thing to keep in mind. EMC recently acquired DSSD with the goal of developing persistent storage that sits in the memory bus, therefore closer to the CPU. It wouldn’t surprise me to see this introduced in future editions as well.
This could be an interesting development. What are your thoughts about the potential entry of VMware into the hardware market?
Also what could MARVIN stand for? How about…
Modular ARray of Virtualization Infrastructure Nodes?
Blue Shift is a blog focusing on today's major technology shifts -- virtualization and cloud computing. We will also cover complementary technologies, including storage, networking and Windows technologies.