Friday, June 29, 2012

The Continued Evolution of Windows Management

Back in the late nineties I was hired by a company to lead a nation-wide implementation of Microsoft Systems Management Server (SMS) 2.0. I had learned about the management technologies Microsoft included in Windows NT 4.0 and thought this would be an exciting opportunity to use these technologies to the fullest. By 2001, we had successfully implemented SMS and were using many of these components in the process.

While I currently consider my expertise to be virtualization and storage, Windows management technology remains near and dear to my heart.

Windows management technology is made up of an alphabet soup of components: HAL, DCOM, WMI, CIM, CIMOM, WBEM, etc. These components have evolved over time, and it looks like they're getting a (much needed) update in Windows 8 and Server 2012.

What hasn't changed in 12 years is that these components are critical to successfully managing Windows systems. And now, non-Windows systems and devices. I highly recommend that any IT professional keep up with these components and with where not only Microsoft, but the rest of the industry, is taking them.

A good place to start - this TechNet blog article:
Open Management Infrastructure

Monday, June 25, 2012

ERROR: VMware ESXi with 3PAR SAN and Dead LUN 254

Problem

Last week I discovered a couple of error messages in the vmkernel.log file that caused me some concern:

2012-06-22T18:18:45.806Z cpu17:939673)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate:972:Could not select path for device "Unregistered Device".
2012-06-22T18:18:45.806Z cpu17:939673)WARNING: NMP: nmpPathClaimEnd:1195:Device, seen through path vmhba2:C0:T2:L254 is not registered (no active paths)


I ran "esxcfg-mpath -l" and found 4 dead paths - 2 to each fibre HBA.  I never provisioned a LUN with an ID of 254.  Maybe this was a "special use" LUN?  I doubted it because my HP EVA has one of these and the device is listed in vCenter.  However, there were no devices with a LUN ID of 254 listed anywhere in vCenter, only 4 dead paths.
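If you want to check a vmkernel.log for this condition without eyeballing it, here is a minimal Python sketch.  The regular expression is derived only from the two sample lines above, so treat the pattern as an assumption; other ESXi builds may word the warning differently.  The function name find_unregistered_paths is my own, not part of any VMware tool.

```python
import re

# Pattern based solely on the NMP warning quoted in this post (an assumption:
# other ESXi versions may format this message differently).
DEAD_PATH_RE = re.compile(
    r"seen through path (vmhba\d+):C(\d+):T(\d+):L(\d+) "
    r"is not registered \(no active paths\)"
)


def find_unregistered_paths(log_lines):
    """Return unique (adapter, channel, target, lun) tuples in first-seen order."""
    seen = []
    for line in log_lines:
        match = DEAD_PATH_RE.search(line)
        if match:
            entry = (
                match.group(1),
                int(match.group(2)),
                int(match.group(3)),
                int(match.group(4)),
            )
            if entry not in seen:
                seen.append(entry)
    return seen


# The two log lines from this post, used as sample input.
sample = [
    '2012-06-22T18:18:45.806Z cpu17:939673)WARNING: vmw_psp_rr: '
    'psp_rrSelectPathToActivate:972:Could not select path for device '
    '"Unregistered Device".',
    '2012-06-22T18:18:45.806Z cpu17:939673)WARNING: NMP: '
    'nmpPathClaimEnd:1195:Device, seen through path vmhba2:C0:T2:L254 '
    'is not registered (no active paths)',
]

for adapter, channel, target, lun in find_unregistered_paths(sample):
    print(f"{adapter} channel {channel} target {target} LUN {lun}")
```

To run it against a real log, pass an open file handle in place of the sample list, since the function only needs an iterable of lines.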

Solution

After a focused Google search, I found the answer.  The HP 3PAR guy that came out and did the installation had us use a host persona of "1 - Generic" when we should have used "6 - Generic-legacy".  Luckily these can be changed on the fly via the InForm Management Console.

After making the change in the IMC, I rescanned each of the hosts.  The messages stopped appearing in the vmkernel.log and the dead paths were no longer listed in the vSphere Client.

Here are the relevant sites I found per the Google search:

And, of course, I always recommend following the manufacturer's best practices:

Looks like the guide was recently updated and it does recommend using the persona of "6 - Generic-legacy".

Another mystery solved.

Saturday, June 23, 2012

Butterfly Takes Flight

Aunt Eileen gave the kids one Black Swallowtail chrysalis each last fall.  This is the story of one of those chrysalises...
A butterfly emerges from its chrysalis!

 April showers bring May flowers... and butterflies.  Hey, let me out!

Max contemplates... how did he do that?

Based on the markings, we can determine it's a male.  Now fly away little butterfly...

No, really, fly away!


Well, he finally did.  And let me tell you the thing took off like a bird - very fast by butterfly standards.  Then about twenty feet up, it almost got eaten by a bird!  It was a cool thing to see it dodge the bird mid-flight then disappear into the distance.

Where did he go?  Why did he go there? How did he know to go wherever it was that he was going? Too bad we'll probably never know. But we all got to see metamorphosis in action, which was really what it was all about.

Fly on little butterfly, fly on.

Thursday, June 14, 2012

ERROR: The query service is not available or was restarted

I wasn't getting any results when going to the Hardware Status tab for all hosts in my recovery site.  When clicking "update", I'd get the error:
The query service is not available or was restarted. Please retry.

Of course retrying doesn't work.  Hardware status worked fine for all hosts in my protected site.  I thought maybe it was a problem with linked mode in vCenter so I logged on to the vCenter server in the recovery site, fired up the vSphere client, opened the Hardware Status tab and got the same results.

I then started another instance of the vSphere client and logged directly on to the ESXi host.  The hardware sensor data worked fine here (it's not in a separate hardware tab, but looks nearly the same).  Hmmm.... must be something with vCenter?

I Googled the error and found this VMware KB:

Well it's the exact same error message so this must be the fix, right?  Wrong!
First of all, step 14 is incomplete.  Please follow these steps to reset the vCenter Inventory database:

Secondly, this was not the only problem and probably didn't ultimately fix the issue.  I found this link in the same Google search:

The above forum posting had a link to the following web site with instructions on updating the ADAM instance vSphere uses for linked mode:

While this was for 4.1, the same settings apply to 5.0.  The only thing I would recommend is checking all of the common name (CN) properties to make sure the FQDNs are correct.  You do not need to change these to IP addresses!
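As a rough aid for that check, here is a minimal Python sketch that classifies a value as an IP address, a fully qualified name, or a short hostname.  The function name classify_cn and the sample values are hypothetical and are not taken from any real ADAM instance.  It only inspects the text of the value; it does not query ADAM or DNS, so it can't tell you whether an FQDN is the right one.

```python
import ipaddress


def classify_cn(value):
    """Classify a CN value as 'ip', 'fqdn', or 'short' (no domain part)."""
    value = value.strip().rstrip(".")
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    # Anything with a dot that is not an IP address is treated as an FQDN.
    return "fqdn" if "." in value else "short"


# Hypothetical values for illustration only.
for cn in ["vcenter-dr.example.com", "vcenter-dr", "192.0.2.10"]:
    print(cn, "->", classify_cn(cn))
```

Anything that comes back as "ip" or "short" is worth a second look, given that the entries are supposed to be FQDNs.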

I did have to reboot the vCenter server in the recovery site after making the changes.  Even then, it didn't seem to start working until the following morning so it may take some time for the changes to propagate.  Not certain about that but now the hardware status tab works for all servers in the recovery site.

ERROR: vSphere Replication shows Not Active

Another strange one.  Existing VM replications appear to be working based on "last sync completed" time stamps.  However, setting up new replications results in a status of "Not Active".  Right-clicking on a VM and choosing "synchronize now" results in this error:
Call "HmsGroup.OnlineSync" for object "[some long GID]" on Server "[server name/IP]" failed.  An unknown error has occurred.
I Googled the error and found this VMware communities forum thread:
SRM5 using vSphere replication, status shows 'not active'

Read through it but note that you shouldn't have to reboot everything like one poster did.  I rebooted the VRMS server in the recovery site and replications started working for all VMs again.  YMMV.  This happened to me after having rebooted the vCenter server also at the recovery site.

ERROR: A general system error occurred

Recently, when trying to log on to SRM 5, I got the following error:
A general system error occurred: Internal error

Wow, that's real telling!
I tried restarting the SRM servers but no dice.  I then opened a ticket with VMware support.  I started a WebEx with the tech and after reviewing several SRM and vCenter logs, he really couldn't find the root cause of the problem.  However, he did say that they've only seen this generic error with vCenter, not SRM.

We rebooted the recovery side vCenter and voilà, I was able to log in again successfully.  It's the old saying - if all else fails, reboot!

Thursday, June 7, 2012

ERROR: Call "VirtualMachine.Relocate" for object...

The full error is:
Call "VirtualMachine.Relocate" for object "VirtualMachineName" on vCenter Server "VirtualCenterServer" failed.

Looks like vCenter thinks there was a snapshot left over from a Backup Exec backup.  Yet no snapshot exists and there is no snapshot file on the datastore.  Turns out, the stale reference is in the vCenter database...

First hit on Google: VMware KB http://kb.vmware.com/kb/2008957

I unregistered and re-registered the VM in inventory and it fixed the problem.  One thing to note:  make sure you answer the power-on question as "Copy" and not "Move".  Selecting "copy" will cause vCenter to assign the VM a new ID, cleaning up the database.

Tuesday, June 5, 2012

My Take On Fruits and Robots

I’m not against the iPhone/Pad/Pod or anti-Apple.  I actually admire what Steve Jobs was able to accomplish.  I give iPxxx users a hard time because it’s fun!  Seriously, at the end of the day I recommend doing your own research and deciding which platform and device is right for you.  The answer will not be the same for everybody.

On personal usage:
However, the iOS platform and iDevices are not for me – I don’t want to be locked into a specific hardware platform.  I don’t want to use iTunes and convert all of my WMA-lossless music to Apple’s lossless format, ALAC (and WMA compatibility is actually more prevalent in other devices than AAC). I am used to having certain apps available to me as part of my daily routine (a podcast app that automatically manages new podcasts including downloading new episodes and deleting the old ones I’ve listened to, for example).  Can I get similar apps on the iPhone?  Probably, but not always.  The most popular apps are typically available on both platforms.  But there's no guarantee that a less-popular app I might like and use will be available on the iPhone.

In other words, it would be a project to convert to iPhone. And at the end of the day I will have a phone with a smaller screen? Not something I want to do if I can avoid it.

On business usage:
There are two trends that can’t be ignored:
  1. Market growth/popularity
  2. Bring Your Own Device (BYOD)

My employer standardized on iPhones for both of these reasons.  Apple's iPhone and iPad are popular in their markets (iPad is the tablet market leader).  The executive that mandated this new standard didn't weigh all of the pros and cons of all devices in the market AFAIK.  A number of other employees including managers either bought their own iPhones or were asking for them.  So a number of these things came together and boom, we have a new standard.

It's interesting to note here that we still maintain some RIM devices in our supported corporate standards.  These devices have always been more business-oriented.  They're cheaper in most cases.  But they're losing market share on the business side and I don't think they ever had it on the consumer side.

I would argue that it's for these same reasons that Android devices cannot be overlooked or ignored.  Android devices are the mobile phone market leader and Android-based tablets are gaining in popularity.  As of this writing, the Google Android platform holds a 51% market share compared to Apple's iOS at 31% (source).  Then there's the fact that Android continues to find its way into other devices such as TVs and even car stereos! I don't do predictions but I've got to believe the platform will continue to improve, market share will continue to grow and businesses will continue to adopt Android phones as part of their standards at increasing rates.

As in most companies, all it will take is for a C-level executive to want one of these phones for whatever reason and boom - another new standard.  It's not a matter of "if", but "when".  Maybe it's one month from now, a year or five years from now.

On a final note, take this for what it is - my opinion.   There are many heated "Apple vs. Android" threads that can be read on the Internet.  Each platform has technical pros and cons in both the hardware and software.  That would be the topic of an entirely different article.  I'll be the first to admit that I'm technically biased towards Android and believe it to be a better hardware/software solution overall.  But don't take my word for it.  Put the time in, do the research and decide for yourself.