Blog

SCOM: Configure a monitor recovery task for a healthy state

During a recent project a client had a small request to create a monitor and run a command when a device was not accessible anymore. Easy right! But (yep there’s always a but) they wanted to run a command when the monitor was returning back to a healthy state to restart a service when the device came back online… Hmmm and all in 1 monitor.

So the conditions were as follows:

Monitor:

  • Action: Run a PowerShell based monitor to test the connection with the device
  • BAD: Device is down => Run recovery task to remediate
  • GOOD: Device is up again => Run recovery task to restart service

(note: Always do this small matrix of a monitor design to exactly know what the customer wants)

I don’t have the device to simulate but came up with a small example in my lab to show you how to get this working with just 1 monitor. The situation in my lab is very simple. I want to turn on my desk lighting when my pc is on (and I’m working) and turn it off when my pc is not online.

My conditions:

Monitor:

  • Action: Run Powershell based monitor to test the connection and pass the result to SCOM
  • BAD: PC is offline: => turn off my desk lighting
  • GOOD: PC is online:=> turn on my desk lighting

So first things first we need to test the connection to see whether my pc is running. To check this I’m using this small script:

[xml]

param ([string]$target)
$API = New-Object -ComObject "MOM.ScriptAPI"
$PropertyBag = $API.CreatePropertyBag()

$value = Test-connection $target -quiet

$PropertyBag.AddValue("status", $value)

$PropertyBag
$API.Return($propertybag)

[/xml]

So I’m testing the connection and sending the response to SCOM. The  PowerShell “Test-Connection $target –quiet” command will just return true or false as a result whether the target is accessible or not

Creating the Monitor with Silect MP Author

The creation of this monitor consists of 2 parts:

  • Defining the class where the monitor will be targeted to and therefore the machine which will test the connection to the desktop
  • Passing the status from the machine to SCOM and take action by using a monitor

Defining a class:

To properly target this monitor we need to create a class in SCOM which identifies the servers that need to test the connection. In this case I’ve added a reg key to all servers who need to ping the desktop so I’m starting a Registry Target to create my class:

printscreen-0254printscreen-0255

I fill in a server that has the key already in there to make it much easier to browse the registry instead of typing it in with an increased margin for errors.

printscreen-0256

Select the Registry key you want to look for

printscreen-0257

In my case I’ve added a key under HKEY_LOCAL_MACHINE\Software\pingtestwatchernode

printscreen-0258

Select the key and press add and ok

printscreen-0259

Identify your registry target:

printscreen-0260

Identify your discovery for the target

printscreen-0261

In my case I just check whether the key is there. No check on the content.

printscreen-0263

The discovery will run once a day.

printscreen-0264

Review everything and press finish

printscreen-0265

At this point our class is ready to be targeted with our script monitor.

Next up is to create the monitor:

Create a new script monitor:

printscreen-0266

Browse to the PowerShell script and fill in the parameters. In this case I have 1 parameter which is “target” and will hold the IP of the desktop.

printscreen-0267

Define the conditions:

Healthy condition is when the status is true and type boolean

printscreen-0268

Critical condition is when the status is False

printscreen-0269

Note: I’m using a “boolean” Type

Configure the script and select the target you have created earlier on and the availability parent monitor

printscreen-0270

Identify your script based monitor

printscreen-0271

Specify a periodic: run every 2 minutes

printscreen-0272

No alert generation necessary.

printscreen-0273

Review all the parameters and create the script based monitor.

printscreen-0274

Load the management pack in your environment and locate the monitor:

printscreen-0278

Check the properties => recovery tasks and create 2 recovery tasks for the Health state “critical”.

Note that the screenshot below already shows the correct healthy state after config of the mp.

printscreen-0279

Export the managment pack and open it in an editor and locate the “recoveries” section to find your recovery tasks we just created:

printscreen-0280

scroll to the right and locate the “ExecuteOnState” parameter and change the one you want to run when the monitor goes back to healthy from “Error” to “Success”

Save the management pack and reload it in your environment.

printscreen-0281

So all we need to do is test it…

My pc is on: IT-Rambo has his cool backlight:

20141130_230930098_iOS

My pc is off and the light is automatically turned off…

20141130_230904267_iOS

Final Note: If you use this method you need to make sure to NOT save the recovery tasks in the console anymore otherwise the different settings we just changed in our management pack will be again overwritten as SCOM can’t natively configure a recovery task for a healthy state.

You can use this basically for anything where you want to run 2 conditions on the same monitor or even 3 if you have a 3 state monitor.

Enough talk, let’s build
Something together.