Thursday, March 15, 2012

vSphere 5 Upgrade: SRM - PART 1

Per "Its Time vSphere 5 Upgrade", it's time to upgrade SRM.  Well, I got to step 6.2 and things went downhill from there.  The following is the description I used to open an SR with VMware support:
Upgrading SRM from 4.1.2 to 5.0.  I get the error: "failed to create database tables".  It doesn't appear to make any changes to the database.
I've double-checked settings per KB1015436 and several communities postings.
I've tried re-installing SRM 4.1.2 (successfully), performing a repair, then another upgrade but it still fails with the same error.
The first support tech went down the 'invalid permissions' path but this was not the problem.  Turns out, the SRM 5.0 upgrade does not support upgrading from SRM 4.1.2!  Interesting because the only documentation I can find on the subject clearly states that you can upgrade from SRM 4.1(!).  Looks like VMware needs to do a better job documenting these requirements.

I was then informed that I could wait until the next minor/point release of SRM 5, which would support upgrading from 4.1.2, but I didn't have that kind of time (and who knows when they'll actually release it).  So no upgrade for me, full install from scratch instead!  Besides having to reconfigure mappings, protection groups (which I was going to have to do anyway), etc., the biggest downside is losing the previous DR test results.  Yes, I saved those off as separate Excel files, but it would have been nice to have had all of the results right there in SRM from the beginning.


But wait, there's more!  Now that I have a brand new, freshly installed SRM up and running, it's time to set up vSphere Replication.  Did that go problem-free, you ask?  Ummm, no.  The following is the description I used to open yet another SR with VMware support:
The VRMS servers at both sites fail to connect.  I have unregistered the server, powered down/deleted the appliance VM, re-initialized the VRMS database, repaired SRM, redeployed the VRMS servers and configured them with the same vCenter FQDN per KB2007463 but still have the same problem.
Between the support tech and me, it took several hours to figure this one out.  The short of it is that it's a vCenter certificate problem.  What clued me in to this was the error I got when registering the VRMS instance:

That "unacceptable signature algorithm" message is not your typical self-signed cert warning!  Turns out, my vCenter self-signed certs had expired.  This hadn't caused a problem until installing vSphere Replication - it wants at least a current/non-expired cert.  I checked the vCenter cert and sure enough, it had expired in 2010.  It was created in 2008 and was valid for only 2 years!
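If you want to catch an expired cert before it bites you, the expiry check is easy to script.  Here's a minimal Python sketch, assuming you already have the cert's "notAfter" date as text (for example, from the output of openssl x509 -noout -enddate).  The function names and the sample date are made up for illustration; they are not from vCenter or SRM.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Return whole days until the cert expires (negative if already expired).

    not_after is the certificate's notAfter value in the usual
    "Mon DD HH:MM:SS YYYY GMT" form.
    """
    # ssl.cert_time_to_seconds converts the notAfter text to epoch seconds (UTC).
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days


def is_expired(not_after, now=None):
    """True if the notAfter date is already in the past."""
    return days_until_expiry(not_after, now) < 0


if __name__ == "__main__":
    # Hypothetical date, standing in for a cert that expired back in 2010.
    sample = "Mar 15 12:00:00 2010 GMT"
    print(sample, "-> expired" if is_expired(sample) else "-> still valid")
```

Run something like this against the vCenter cert before installing vSphere Replication and you'd know up front whether you're in for the same surprise.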

Now I bet you're wondering, how does one fix this cert problem?  Well that's easy, reinstall vCenter!  A repair won't work either, so you have to uninstall the current vCenter instance and install a new one.  Luckily, most settings are maintained in the vCenter database, so this was far less painful than it could have been.

While I was at it I checked the new vCenter cert and VMware apparently decided to make this one valid for 10 years.  Now that's more like it!

But wait, there's more!  Look for PART 2 of this adventure in a near-future post.  A little hint - the fun ain't over yet.