Why Microsoft?

This is a question that can be explored from many different angles, but I’d like to explore it not JUST from a virtualization perspective, not JUST from a cloud perspective, and not JUST from my own perspective as a vExpert joining Microsoft, but from a more holistic perspective which considers all of this, as well

Top 6 Features of vSphere 6

This changes things. It sounds cliché to say “this is our best release ever” because in a sense the newest release is usually the most evolved.  However, as a four-year VMware vExpert, I do think that there is something special about this one.  This is a much more significant jump than going from 4.x

vSphere 6.0 Public Beta — Sign Up to Learn What’s New

Yesterday, VMware announced the public availability of vSphere 6.0 Beta 2.  I can’t tell you everything that’s in it due to the NDA, but you can still register for the beta yourself, read about what’s new, and download the code for your home lab. There’s some pretty exciting stuff being added to vSphere 6.0 in

Will VMware Start Selling Hardware? Meet MARVIN

The Register is running a story that VMware is preparing to launch a line of hardware servers.

VMware Pursues SDN With Upcoming NSX Offering

Earlier this week VMware announced VMware NSX – an upcoming offering that takes network virtualization to new levels. NSX appears to be somewhat of a fusion between Nicira’s SDN technology (acquired last year by VMware) and vCloud Networking and Security (vCNS – formerly known as vShield App and Edge). Since I already had intentions to

What Really Is Cloud Computing? (Triple-A Cloud)

What is cloud computing?  Ask a consumer, a CIO, and a salesman and you’ll likely get widely varying responses. The consumer will typically think of the cloud as a hosted service, such as Apple’s iCloud or uploading pictures to Photobucket, and scores of similar services (just keep in mind that several such services existed before it

Agility Part 2 — The Evolution of Value in the Private Cloud

When an IT project is commissioned it can be backed by a number of different statements, such as: “It will reduce our TCO,” “This is a strategic initiative,” “The ROI is compelling,” “There are funds in the budget,” “Our competitors are doing it.” Some of these are better reasons than others, but here’s a question.  Imagine a

Stacks, the Vblock and Value — A Chat with EMC’s Chad Sakac

…I reached out to EMC’s Chad Sakac to gain more insights from his perspective on how the various stacks…well…stacked up….

Reflection Is a Strategic Imperative

During my daily reading I came across this post at Harvard Business Review by Umair Haque, which I wanted to pass on.

Curiosity and reflection are two traits that, in my opinion, are essential for effective leadership.  Too many problems — cultural or technical — stem from a lack of reflection and curiosity.  Is this really the best strategy?  Am I just doing what my managers are asking and missing a bigger picture?  Is there another approach to this problem?  Is my organization well aligned to execute our plan/strategy?

Managers who lack such traits tend to be risk-averse and increasingly rely on processes (think status-quo versus adopting disruptive technologies).

I’ll try to work this concept into future posts, but wanted to quickly pass this article along as I think it makes a great point.

Should You Virtualize vCenter Server (and everything else?)

When concerns are raised about virtualizing vCenter Server, in my experience they usually revolve around performance or out-of-band management. The VROOM! blog at VMware just published a whitepaper that looks closely at vCenter Server performance as a VM versus native (physical), which speaks to these concerns as well as to other workloads.

vCenter Performance as a VM

vCenter Server can use SQL Server, Oracle, or DB2 as its database, and the VROOM! team chose Microsoft SQL Server 2008 for their tests, which is probably the most common scenario. In the tests, the vCenter Server supported a virtual inventory (populated database tables) of 8,000 VMs across 500 hosts, and a number of performance metrics were compared between the VM instance and the physical/native instance running on an HP DL380 G6.

Many different tests were run, so I’ll share just a sample of the results that are in the whitepaper. One is the rollup stats function, where vCenter takes captured performance metrics from hosts and VMs and rolls them up into different time-based tiers with varying levels of granularity. In other words, this task is essentially an I/O-intensive SQL stored procedure:

When you first looked at this graph you were probably trying to answer the question, “how much slower is it on a VM?”  As you can see, that would have been the wrong question, as the virtualized instance was actually faster than native!

This pattern wouldn’t hold up for all tests, however. The stored procedure for TopN stat collection showed that while the VM was slower (in 3 of 4 tests), it was within 5% of native performance:

For the purge stored procedures, the VM was 10–13% faster than the native instance:

On top of the vCenter-specific tests, the team ran some IOPS stress tests using different performance tools, and in each case the VM was able to sustain greater IOPS than the native system:

Conclusions

These tests – which mostly consisted of SQL stored procedures and raw IOPS throughput – showed that at worst a VM is no more than 5% slower than a native system, and that depending on the workload, in some cases the VM can actually be much faster.

This is consistent with what is taking place in the marketplace, where IT organizations are virtualizing Exchange and even mission-critical SAP and Oracle instances.

Two more things I want to discuss: should we virtualize everything else too, and what about other vCenter-specific concerns?

vCenter Server as a VM

If you’re like me, you’re probably no longer terribly concerned about performance after reading the summary of the VROOM! whitepaper, which leaves us with operational questions. As a general rule it wouldn’t seem right to manage a system from within that same system; you’d opt instead for an out-of-band approach – which in the case of vCenter would require a physical server that is not dependent on the hypervisor and virtual infrastructure you are managing.

The potential failure levels, however, are the guest OS, the VM (including its disks), and the host hardware. These risks can be mitigated by deploying vCenter Server Heartbeat, which uses NeverFail technology to replicate the entire vCenter server to a standby VM with its own virtual disks, kept in sync by NeverFail’s replication engine.  If there is a critical issue at the guest OS, VMDK, or host hardware level, the standby VM will kick in and host the vCenter services.

For all these reasons it is actually an official VMware best practice to run vCenter Server as a VM.

Should we just virtualize everything?

Performance varies by application and I/O patterns, but for most transactional workloads (SQL, Exchange, etc.) a well-tuned vSphere environment will run at least 90% of native, and usually closer to 95%.  For some workloads the difference is less than 5%, and some functions are actually faster on VMware.  For more examples, see the additional whitepapers from VROOM! on SAP and SharePoint.

The answer (I think) is “It depends”.  It depends on the application and your environment.  In most cases, virtualization will be a viable option but curiosity is recommended.

The first reason virtualization is considered is usually the CAPEX (consolidation) benefit. But what if an application is so “big” that you’re looking at a 1-to-1 consolidation ratio? The CAPEX benefit is now gone, but what about the OPEX and agility benefits of virtualization?  As a VM, your application has new possibilities for backup, replication, HA, and much more. And as we’ve seen above, in some cases workloads simply run better as VMs.

Most workloads can be virtualized but it pays to be curious. Understand your application and your environment and consult with your storage/network teams to verify that your virtualized infrastructure can deliver the performance you need.

Happy Thanksgiving!

“Keep your eyes open to your mercies. The man who forgets to be thankful has fallen asleep in life.” – Robert Louis Stevenson (via @MakeAWish)

Today is Thanksgiving and there is so much to be thankful for!  While working on the turkey, stuffing, and trimmings, I find it inspirational to recall the original context of the holiday.  Below is George Washington’s Thanksgiving Day Proclamation of 1789, which set aside Thursday, November 26 of that year as a day of thanks.  Abraham Lincoln would later revive the practice of setting aside a day of thanks during the dark days of the Civil War, and Congress would eventually make it a national holiday.

On a personal level I am most thankful for the health of my daughter, who went through major surgery this past summer, as well as for the Make-A-Wish Foundation granting my daughter’s wish in advance of the surgery.  Have a wonderful Thanksgiving and give thanks!

By the President of the United States of America, a Proclamation.

Whereas it is the duty of all Nations to acknowledge the providence of Almighty God, to obey his will, to be grateful for his benefits, and humbly to implore his protection and favor– and whereas both Houses of Congress have by their joint Committee requested me to recommend to the People of the United States a day of public thanksgiving and prayer to be observed by acknowledging with grateful hearts the many signal favors of Almighty God especially by affording them an opportunity peaceably to establish a form of government for their safety and happiness.

Now therefore I do recommend and assign Thursday the 26th day of November next to be devoted by the People of these States to the service of that great and glorious Being, who is the beneficent Author of all the good that was, that is, or that will be– That we may then all unite in rendering unto him our sincere and humble thanks–for his kind care and protection of the People of this Country previous to their becoming a Nation–for the signal and manifold mercies, and the favorable interpositions of his Providence which we experienced in the course and conclusion of the late war–for the great degree of tranquility, union, and plenty, which we have since enjoyed–for the peaceable and rational manner, in which we have been enabled to establish constitutions of government for our safety and happiness, and particularly the national One now lately instituted–for the civil and religious liberty with which we are blessed; and the means we have of acquiring and diffusing useful knowledge; and in general for all the great and various favors which he hath been pleased to confer upon us.

and also that we may then unite in most humbly offering our prayers and supplications to the great Lord and Ruler of Nations and beseech him to pardon our national and other transgressions– to enable us all, whether in public or private stations, to perform our several and relative duties properly and punctually–to render our national government a blessing to all the people, by constantly being a Government of wise, just, and constitutional laws, discreetly and faithfully executed and obeyed–to protect and guide all Sovereigns and Nations (especially such as have shewn kindness unto us) and to bless them with good government, peace, and concord–To promote the knowledge and practice of true religion and virtue, and the encrease of science among them and us–and generally to grant unto all Mankind such a degree of temporal prosperity as he alone knows to be best.

Given under my hand at the City of New York the third day of October in the year of our Lord 1789. Go: Washington

PBS: Why You Need Backup 2.0 — Avoiding the Costs of Traditional Backup

Reducing OPEX (operational expense) is low-hanging fruit that I often see organizations fail to reach up and grab. Sometimes the opportunity is overlooked after virtualizing, and other times backup concerns even create a fear of virtualizing in the first place (VM stall).

In this post I’ll share details from multiple sources (including Project Blue Sphere) on how backups in virtualized environments can be significantly improved.

The Backup 1.0 Burden

Veeam recently published a survey of 500 enterprises on backups, and the results were interesting:

  • The average enterprise requires 5 hours to restore a virtual machine — little improvement over the 6-hour average for physical servers.
  • 63% experience problems each month when attempting to recover a server
  • Failed restores cost companies over $400,000 annually
  • 59% of organizations still use Backup 1.0 (physical-based backup tools)
  • 63% are trying to use the same backup tool for both physical and virtual servers

I believe that these sub-par numbers are driven largely by the 59% who are still using Backup 1.0 methods.

Backup 1.0 versus Backup 2.0

What is Backup 1.0? I’m going to “borrow” Quest Software’s definitions of Backup 1.0 and 2.0, which are illustrated below:

Backup 1.0 involves traditional backup agents installed within the OS, which incur additional OPEX burdens. The restore process often consists of deploying a new server, installing the OS and then the backup software, performing a restore from tape, and sometimes having to repeat the entire process.

Backup 2.0, in contrast, is agentless, and full images are restored from disk in a single step. This single-step recovery from disk often drastically reduces restore times – in some cases by a factor of 10 or more.  Furthermore, with Backup 2.0 you don’t have to sacrifice granular recovery, as solutions like Quest vRanger and Veeam Backup and Replication provide granular file/object-level restores as well.

Project Blue Sphere: A Tale of Two Infrastructures

At Project Blue Sphere we had a newer 20-host virtual infrastructure being backed up to disk by Quest vRanger Pro 4.5.  Restores, when necessary, were relatively quick and painless.

However, another ESX 3.0 farm from a different organization still used Backup 1.0 methods.  A few months ago, several VMFS volumes suffered a catastrophic failure, and it took days to restore several production and business-critical VMs using the painful multi-step restore process of Backup 1.0.  Staff worked overtime during those days building systems from scratch, deploying backup agents, and then trying with mixed success to run restore jobs from tape.

This event made management painfully aware of the differences between the backup methods, and it is one reason why Project Blue Sphere was chartered – to get the entire organization to realize the benefits of Backup 2.0.

A Note on SQL Databases

In a separate incident I observed a scenario where it took over 40 hours to restore a business-critical physical SQL server and database into production. There’s no technical reason the server couldn’t have been virtualized, but that’s a different topic. It was the Backup 1.0 restore of the database from tape that consumed most of those hours.

Regardless of whether your SQL server is physical or virtual, I would strongly recommend considering either Red Gate SQL Backup or Quest LiteSpeed for your SQL databases. These products back up your databases to local disk with compression rates usually greater than 95%, making restores very fast and pain-free. From a recovery/RTO perspective, one of the worst things you can do with a large SQL database is to rely on agent-based backups to tape.

When combined with Backup 2.0 you have the best of both worlds: you can quickly restore the server from an image, and then also restore databases to different points in time using the local compressed backups from one of these solutions.  A similar best practice is to regularly dump the System State to local disk using NTBACKUP on Active Directory domain controllers.

Backup 2.0 Benefits Are Real, But Process Is Needed

While there are huge advantages to Backup 2.0, it’s not a magic bullet. Special attention is required – especially around VSS and application consistency – to ensure that your applications are being backed up in a consistent state. For more details on these problems and how to address them, reference our two-part series on VMware and VSS (Part One and Part Two).

Quest vRanger and Veeam Backup and Replication

I’ve referenced two competing products, and the intent here is not to compare them, but rather to show how both Quest vRanger and Veeam Backup and Replication are strong solutions that can help organizations realize the benefits of Backup 2.0.

For Project Blue Sphere we happen to use Quest vRanger, which we have used for years, and I am certain that Veeam vPower would do an excellent job as well.

Other Solutions

At Project Blue Sphere we elected not to use our existing TSM (Tivoli Storage Manager) infrastructure, as the vStorage API is not fully supported at this time. Different backup vendors offer varying levels of support for the vStorage APIs for Data Protection (VADP), but in my opinion Quest vRanger and Veeam vPower are two strong solutions designed for virtualized environments from the ground up, and both go beyond the capabilities of the vStorage API to provide additional benefits in virtualized environments.

Conclusions

Many companies are still using Backup 1.0 methods and those that do suffer from:

  • Long restore times
  • High labor cost for restores
  • High failure rate for restores
  • High financial impact for recovery delays and failed restores

Ironically, 44% in Veeam’s survey indicated that they are not virtualizing some workloads due to concerns about backup and recovery.

Both Quest vRanger and Veeam Backup and Replication can be used to achieve the benefits of Backup 2.0.

Are you using Backup 2.0 methods for your VMs today? If not, you may have much to gain by taking a closer look.

Can your VM be restored?  VSS and VMware — Part 2 (updated)

This post was originally made in July 2010, and has new updates and a new section on Active Directory.

The backup job for your VM completed successfully, so the backup is good, right?  Unfortunately it’s not that simple, and a failure to deal effectively with VM backups can result in data loss and perhaps even legal consequences.

In Part 1 we discussed VSS, why it is important, and how to make sure VMware Tools is configured to leverage it.  Unfortunately, there are different levels of VSS support for different operating systems, which need to be considered.  In this post we will first discuss the gaps and then some potential remedies.

Let’s start with a wake-up call:

If you have a VM running Exchange, SQL or SharePoint on Windows 2008, and are not running vSphere 4.1, your VM is not being backed up in an application-consistent state unless you have taken specific steps.

“Mr. Backup” at Backup Central brought attention to the VMware gap with Windows backups in a detailed post.  Here I’ll try to summarize it and explore a few things from a different angle.

The above chart needs some explanation.  Volume consistency means that the volume is quiesced at the file level, but NOT at the application level.   If you’re not quiescing at the application level, you may be unable to restore that application.

Applications also need to be notified that a backup has taken place so that they can truncate their logs.  This is where you need to understand your applications.  Exchange is especially vulnerable, as it is highly transactional and its logs are only truncated during backups.  Some SQL databases may run in simple recovery mode and/or have stored procedures which will either back up or truncate the logs.  But if a SQL database is running in full recovery mode with no process in place to truncate the logs, it will eventually fill up the disk and bring everything to a screeching halt.
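
As a hedged illustration of the simple-recovery remedy (server and database names are hypothetical, and this assumes the SQL Server PowerShell snap-in’s Invoke-Sqlcmd is available):

# Sketch only: switch a database to the simple recovery model so its log
# is truncated at each checkpoint instead of growing until a log backup runs.
Invoke-Sqlcmd -ServerInstance "SQLVM01" -Query "ALTER DATABASE [AppDB] SET RECOVERY SIMPLE;"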

vSphere 4.1

vSphere 4.1 corrects the gap with Windows 2008 application quiescing, as noted in the “What’s new in vSphere 4.1” notes:

VADP now offers VSS quiescing support for Windows Server 2008 and Windows Server 2008 R2 servers. This enables application-consistent backup and restore operations for Windows Server 2008 and Windows Server 2008 R2 applications.

UPDATE: Specific steps are needed to support application quiescing on Windows 2008 VMs created in vSphere 4.0 and earlier.  Read this post for details.

Now that we understand the gaps, let’s take a look at some remedies.

Upgrade to vSphere 4.1

This is one way to fix the gap with applications not being quiesced on Windows 2008.  At the time of this writing, however, it is not clear whether this includes the ability to notify apps of the backup so that they can truncate their logs (stay tuned).

Install a helper agent inside the VM

Both Veeam Backup and Quest (Vizioncore) vRanger Pro provide an additional VSS agent that can be installed in a VM.  Each of these agents provides full support for both application quiescing AND notifying the application that a backup has taken place.

If you are using the current version of either Veeam Backup or Quest’s vRanger, you just need to install their agent into the VMs that require application-level integration and configure the backup job appropriately.

SQL Backups

One of my favorite ways to solve this problem for databases is to use either Quest LiteSpeed or Red Gate SQL Backup.  These products back up your SQL databases to highly compressed files that you can keep right on your VM.  This means quicker restores, and the backups are automatically captured by volume-level quiescing.  You just need to make sure that your backup schedules are synchronized with your organization’s RTO and RPO targets, and that proper monitoring is in place.

Additionally, you can use the built-in SQL tools to configure backups (now compressed in SQL 2008 R2) or write a simple stored procedure to truncate the logs (if you don’t need point-in-time recovery); a minimal sketch of a native compressed backup follows below.
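
Here is that sketch (instance, database, and path are hypothetical; WITH COMPRESSION assumes SQL Server 2008 or later):

# Sketch only: a compressed full backup to local disk using the native
# BACKUP command, which the VM-level backup will then pick up from disk.
Invoke-Sqlcmd -ServerInstance "SQLVM01" -Query "BACKUP DATABASE [AppDB] TO DISK = N'D:\SQLBackups\AppDB_full.bak' WITH COMPRESSION, INIT;"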

UPDATE:  A few weeks ago I watched in horror as a SQL Server restore took over 40 hours (they were using “legacy” agent-based backup to tape — yuck!).  Using a VSS-aware VM-level backup to disk would have saved most of those hours.  In addition, using a product like Red Gate SQL Backup or Quest LiteSpeed would have vastly improved the database restore time (if necessary).  I continue to be amazed that some organizations leave themselves vulnerable to long restore times when the problem is so easily overcome.

Active Directory Domain Controllers

Some have made the statement that snapshots should not be used on an AD domain controller, especially for the purposes of restore.  There are several reasons for this, including problems with the SYNC driver (discussed in Part One) which can crash your AD, and the fact that restoring a domain controller from a snap is not supported (and for good reason).

Since snaps are used in backups, does that mean you should not back up domain controllers with snaps?  I don’t agree with this interpretation.  First, having a snap open for only the duration of a backup is OK in my opinion.  Second, if VSS integration is enabled with the domain controller, VSS will automatically quiesce the System State, which includes the SYSVOL, NTDS.DIT, and other elements of Active Directory.  And third, I would recommend an additional backup of the System State using NTBACKUP as an additional level of protection (an example command follows below).  This is basically the same concept as using Quest LiteSpeed or Red Gate SQL Backup on a SQL server — you let NTBACKUP back up the critical elements to a flat file on your system, and that file will be included in the backups.  You just need to schedule the timing such that the System State backup file exists on your system at the time of the VM-level backup.
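
For reference, the System State dump can be scheduled with a one-liner like this (the job name and path are just examples):

ntbackup backup systemstate /J "SystemState" /F "C:\Backups\systemstate.bkf"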

I believe that AD domain controllers can be successfully virtualized and that there are significant benefits to doing so — including being able to do offline testing against a “real” AD domain controller.  But only use snaps for backups (never revert a DC to a previous snap!), and keep in mind that there are additional considerations when restoring AD objects from a previous state (authoritative restore, etc.).

Microsoft Exchange

I haven’t worked with Exchange for some time, so I sometimes overlook it, but Veeam has a great post here which details how to address both VSS and granular restore with Microsoft Exchange.

This post originally appeared in July 2010.  Please also see the post Application Consistent Quiescing in vSphere 4.1 for more details on Windows 2008 VSS support.

PBS: Upgrade to vSphere Enterprise Plus before 12/15! (updated)

UPDATE:

Maish (Technodrone) and I went over the numbers, and we both had to make some adjustments.  After issues like VM-versus-CPU licensing were accounted for, it appears that the benefit of the add-ins in the promotion comes out closer to $11K at list price.  Furthermore, if up to 3 orders are used, the benefit can be tripled for a total of over $32K in savings.  This is a great way to start a VDI pilot and/or assess the value of CapacityIQ in your environment.

One of our first tasks in Project Blue Sphere will be to evaluate VMware’s current promotion for upgrades to vSphere Enterprise Plus which expires on 12/15/10.

The promotion includes the following incentives:

  • List price for Enterprise to Enterprise Plus upgrade dropped from $685 to $495 (about 28% off list).
  • 50 VMware View 4.5 Premier licenses (virtual desktops) — includes ThinApp, vShield Endpoint, and more.
  • vCenter CapacityIQ for 15 VMs

At list prices these VMware View and CapacityIQ licenses are worth over $10,500!  This is a great way to start a VDI pilot with VMware View and/or use CapacityIQ to keep storage (and costs) under control.

But we’re not quite done.  There’s also the SUSE Linux Enterprise Server (SLES) promotion which, if I am not mistaken, entitles the customer to one SLES license for each CPU upgraded.  SLES is the second most popular enterprise Linux platform, right behind Red Hat, and it can run Oracle and much more.  A patch subscription is included, but technical support would have to be purchased separately through VMware.

And finally let’s not forget the benefits of Enterprise Plus itself which includes (over Enterprise):

  • 8-way SMP for VMs
  • Virtual Distributed Switch (vDS)
  • Network vMotion (important for web servers)
  • Storage I/O Control and Network I/O Control
  • Load Based Teaming
  • Host Profiles
  • Native SAN Multipathing (NMP) on supported SANs

So to wrap it all up you get the following if you order before 12/15:

  • Enterprise to Enterprise Plus Upgrade at ~28% discount ($495 per CPU from $685).
  • 50 Licenses of VMware View 4.5 Premier with ThinApp and vShield Endpoint (+1 year S&S)
  • 15 Licenses of CapacityIQ (+1 year S&S)
  • SUSE Linux Enterprise Server for each CPU upgraded

We’ll be making our case to management and if you’re still at Enterprise you should consider the same!  Promotions also exist for Standard and Advanced but needless to say the cost goes up accordingly.

Quest vFoglight 6.5 Released

In my current environment we use Quest vFoglight, which I find to be a fairly strong tool for monitoring virtual infrastructures, and version 6.5 was just released yesterday.

Chargeback?  Performance trends and reports?  Which VMs are using the most disk/network/CPU/RAM?  Out-of-the-box alarms based on best practices?

vFoglight can do all of this and more.  In version 6.5 several new capabilities have been introduced including support for Hyper-V, custom views, improved FAQts and Event Remediation.

Event Remediation is something I’m familiar with from working with monitoring systems like HP Operations Manager in the past.  It’s automation in the form of “when event X happens, do Y.”

Quest has a post here which details an example of using the automatic event remediation feature on a VM memory alarm by automatically changing the memory limits to “unlimited”.
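
For comparison, the same remediation expressed directly in PowerCLI might look something like this (a hedged sketch, not vFoglight’s built-in action; the VM name is hypothetical):

# Sketch only: remove the memory limit (i.e., set it to unlimited) on a VM.
Get-VM "vm01" | Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -MemLimitMB $null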

I will have a chance to work with vFoglight 6.5 when we upgrade to it as part of Project Blue Sphere, and I’ll post my observations then.

VMware Stencils and Powerpoint Graphics

I use these fairly often when creating my own PowerPoint slides so I wanted to quickly echo another post.

Simon at Techhead has posted links to a collection of graphics and stencils for VMware, Veeam, vEcoshell, and other vendors that can be a great tool when putting together a presentation.  Check them out.

Is the Vblock a Monolithic Stack?

This question has come up several times over the past week, and most recently on Kendrick Coleman’s “Finding the True Value in Vblock” post with Duncan Epping of Yellow Bricks asking some great questions.  I commented on the post, but I felt this discussion was also worthy of a blog post as the topic is likely to be brought up again.

The Vblock is somewhat positioned as a data center building block, but is it a one-size-fits-all universal block?  

One observation is that storage I/O patterns will be very different for, say, VDI versus SAP.  And where I/O patterns differ, there will be opportunities to optimize for the workload.  The VCE coalition (VMware/Cisco/EMC) appears to have validated this by releasing special “VDI” and “SAP” editions of the Vblock.

Does this mean we’ll see many different Vblocks for different enterprise applications?  Does this mean the Vblock is not quite so universal?

I had wanted to re-word this but in the interest of time, I’m going to just repost the comments I made on Kendrick’s post:

…the type of storage I/O patterns in VDI versus SAP for example will be very different. Right now there are different SKUs for the SAP and VDI configurations.

The concern was raised: at what point do you make a new SKU for other enterprise apps and risk fragmentation that would dilute some of the value proposition of the Vblock?

As was pointed out in the conversation, the Vblock is very well tuned for most workloads, but if you are looking at a very specific role, there may be opportunities to customize.

At the very least two things must be true before a SKU/config can be justified:

1) Significant market demand for the application
2) Workload patterns different enough from “normal” to enable meaningful optimizations.

A part of #1 I believe is marketing. Customers want to know that their Vblock is certified for x VDI seats or x SAP instances, so that’s a factor in creating a new SKU as well.

When you look at these two things, VDI and OLTP stand out for me and those are basically the 2 variations that exist today (OLTP = SAP). I think these are the big two areas you would want to call out with custom tuning, while the “generic” Vblock still does an excellent job at servicing general/mixed workloads.

So the Vblock may not be quite a one-size-fits-all static block, but I don’t think it’s terribly fragmented either (if that makes sense). Excluding the OLTP and VDI workloads, I think the Vblock can be looked at more as a monolithic all-purpose stack.

As always these are just my thoughts as an outside observer and I welcome any other thoughts on this topic.

So while there are different editions of the Vblock, there are reasons behind them, and I don’t expect to see many additional editions.  It seems that while the Vblock delivers excellent performance with mixed workloads, there are opportunities for optimization in specific scenarios, and this, along with marketing, is I think why we are seeing different editions.

For a slightly different angle on the Vblock’s value proposition, be sure to read our prior post, “Let Your Fast Zebras Run Free (with a Vblock)”.

Oracle 11g RAC now supported on vSphere

I first heard the news on Twitter, and then a few hours later Chad Sakac put up a great post detailing the news.  Chris Wolf at Gartner also has a great post on the news which discusses Oracle licensing in some detail.

For the details, read either of the posts above, but the bottom line is that Oracle will support RAC on vSphere, provided that the problem “can be demonstrated to not be as a result of running on VMware,” and as Chad points out, a V2P can be used to prove this (if necessary, in the worst case).  Oracle has not “certified” RAC on vSphere (and who knows if they ever will), but there are many vendors out there (including EMC/VMware) with detailed, experience-based best practices for Oracle on vSphere.

One thing in Chris Wolf’s post that I found interesting was this section:

Some customers are more fortunate. For example, one client I have worked with migrated 100 Oracle database instances from AIX to RHEL/ESX last year. Their motivation was to save on IBM support costs, which they estimated at close to $200,000 annually. This particular client had a site license with Oracle, making the migration to ESX practical because they didn’t have to pay additional licensing fees to run in the ESX environment.

A few months ago during VMworld I asked “Will the Cloud be Stormy for Proprietary Hardware?” and I’m wondering the same thing again.  AIX is a strong platform for Oracle, but it’s also very expensive.  As we see above, there was a $200K annual savings in one case when moving Oracle from AIX to RHEL on vSphere.  More organizations may start to take a closer look at similar moves to reduce cost now that Oracle formally supports vSphere.  What do you think?

Why Disk Alignment is important (and how to fix a misaligned VM)

Your disk system and your Windows VMs may be running slower than necessary.  Some simple steps can improve disk performance by 9–13% in your virtual infrastructure.

By now many are familiar with the disk alignment issue, but here’s a quick recap.  DBAs and Exchange admins are very familiar with this concept, as they need to drive disk performance for their databases.  The same principle applies to virtualization.  Here we go…

UPDATE:  11/3/11 — Take a look at Nick Weaver’s free UBERAlign tool which can diagnose and correct alignment issues.

The graphic above shows the three layers at issue.  There are the SAN blocks at the bottom, then the VMFS blocks in the middle, and then the NTFS blocks used by the Windows VM.  If these three layers are not aligned, your SAN may be working harder than it needs to.  For example, a call to read a single NTFS block may require the SAN to read three blocks as shown below:

That’s not very efficient.  What would be ideal is for these layers to be aligned so that a single NTFS block requires only one SAN block to be read as illustrated below:

These graphics were taken from the ESX 3 whitepaper, Recommendations for Aligning VMFS Partitions.

Let’s talk about how to fix this at both the VMFS and NTFS levels.

VMFS Alignment

When VMFS volumes are created by the vSphere client, they are aligned on a 64K boundary.  Check your SAN vendor’s documentation, but in most cases the default 64K boundary will work.  For more details, consult the Performance Best Practices for vSphere 4.0 whitepaper.  A quick way to verify this is shown below.
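
From the ESX service console you can check the partition’s starting sector (the device name here is just an example):

# Sketch only: list partitions with sectors as the unit.  A VMFS partition
# created by the vSphere client should start at sector 128
# (128 sectors x 512 bytes = 65,536 bytes = 64K).
fdisk -lu /dev/sdb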

NTFS Alignment

By default, Windows 2003 and older align on 32K, which will not line up with the 64K VMFS layer.  Windows 2008 solves this problem by aligning on a 1024K boundary – this works because 1024K is evenly divisible by 64K.  For example:

  • 32K (NTFS) / 64K (VMFS) = 0.5 = not aligned.
  • 1024K (NTFS) / 64K (VMFS) = 16.0 = aligned.

As long as there is no fractional remainder in this exercise, the two layers are aligned.
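
To put that arithmetic into practice, here’s a minimal PowerShell sketch (assuming PowerShell is installed in the guest; otherwise use the WMI query shown later in this post):

# Sketch only: flag partitions whose starting offset is not evenly
# divisible by 64K (65536 bytes).
Get-WmiObject Win32_DiskPartition | ForEach-Object {
    $aligned = ($_.StartingOffset % 65536) -eq 0
    "{0}: offset {1} bytes, 64K-aligned: {2}" -f $_.Name, $_.StartingOffset, $aligned
}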

Now that we understand the problem and how it impacts performance, let’s focus on determining which VMs have the problem and how to fix it.

IDENTIFYING THE PROBLEM

VMs running Windows 2008 or later should already be aligned, as noted above.  If a VM is running Windows 2003, you can check for alignment in one of three ways:

1) Use the list partition command within Microsoft’s DISKPART utility.

2) Run a WMI query for the desired attributes with the following command:

wmic partition get BlockSize, StartingOffset, Name

The output will be in bytes, so just divide by 1024.  A 64K-aligned partition will have a starting offset of 65536 (65536 / 1024 = 64K).

3) Use Vizioncore’s free vOptimizer Wastefinder utility, which can scan all your VMs and identify misalignment.

#1 or #2 will work fine if you just want to check a specific VM, but Vizioncore’s free utility may be the quickest way to scan your entire environment.

CORRECTING ALIGNMENT PROBLEMS

Once you create a Windows partition, you can’t easily change its alignment.  If you are deploying a new Windows 2003 VM, you can use the DISKPART utility to force alignment on a 64K boundary (align=64), as shown below.
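
An interactive DISKPART session would look something like this (the disk number is just an example; verify that your version of DISKPART supports the align parameter):

DISKPART> select disk 1
DISKPART> create partition primary align=64
DISKPART> assign letter=E
DISKPART> exit

Then format the new volume in Windows as usual.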

But what about existing Windows 2003 servers which are not aligned?  One way would be to create new aligned partitions in a new VM and restore/rebuild everything onto it, but that’s time-consuming and risky.

Vizioncore offers two products which can solve this problem for non-aligned Windows 2003 VMs:

1) Vizioncore vOptimizer Pro

This is the full version of the free Wastefinder utility mentioned above.  This product can automatically re-align VMs on a 64K boundary.  In addition, it can resize VMDKs on a recurring schedule to significantly reduce storage consumption.

2) Vizioncore vConverter 5.0

This updated version of vConverter will automatically align on a 64K boundary while performing either a P2V or a V2V migration, so you can use a V2V migration to produce aligned disks.  As far as I am aware, this is the only P2V/V2V tool that will automatically realign the target VM on a 64K boundary.

And there you have it in a nutshell.  Two other issues I’d like to quickly point out:

First, you may have a number of Windows 2008 VMs, but if a misaligned VM running any OS is on the same VMFS volume, it could potentially be impacting disk performance for all the VMs on that volume.

Second, a VM running Windows 2008 may still be misaligned.  If you did an in-place upgrade from Windows 2003 to Windows 2008, the disk partitions were never modified.  You’ll need to either rebuild the VM or use one of the methods above to correct it.

PowerCLI Script for SYNC/VSS Status

In an earlier post I discussed the problem surrounding the SYNC driver in older (pre-3.5 Update 2) versions of VMware Tools.

Chuck at No Planning Required wrote a one-line PowerCLI script to remotely query a VM via WMI and report on the presence of both the SYNC driver and the VSS service.  Thanks Chuck!
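
I won’t repost Chuck’s one-liner here, but the idea looks roughly like this (a hedged sketch, not his original; it assumes WMI access to the guest and that the sync driver registers under the name "vmsync", which you should verify against your VMware Tools version):

# Sketch only: check a guest for the VSS service and the VMware sync driver.
$guest = "vm01"   # hypothetical guest hostname
Get-WmiObject Win32_Service -ComputerName $guest -Filter "Name='VSS'" |
    Select-Object Name, State, StartMode
Get-WmiObject Win32_SystemDriver -ComputerName $guest -Filter "Name='vmsync'" |
    Select-Object Name, State, StartMode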