SCOM: Configure a monitor recovery task for a healthy state
During a recent project a client had a small request: create a monitor and run a command when a device is no longer accessible. Easy, right? But (yep, there’s always a but) they also wanted to run a command when the monitor returned to a healthy state, to restart a service once the device came back online. Hmm, and all in 1 monitor.
So the conditions were as follows:
- Action: Run a PowerShell based monitor to test the connection with the device
- BAD: Device is down => Run recovery task to remediate
- GOOD: Device is up again => Run recovery task to restart service
(Note: always draw up this small matrix when designing a monitor, so you know exactly what the customer wants.)
I don’t have that device available, but I came up with a small example in my lab to show you how to get this working with just 1 monitor. The situation in my lab is very simple: I want to turn on my desk lighting when my PC is on (and I’m working) and turn it off when my PC is not online.
- Action: Run a PowerShell based monitor to test the connection and pass the result to SCOM
- BAD: PC is offline => Turn off my desk lighting
- GOOD: PC is online => Turn on my desk lighting
So first things first we need to test the connection to see whether my pc is running. To check this I’m using this small script:
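param($target)  # IP or host name of the machine to ping, passed in by the monitor
$API = New-Object -ComObject "MOM.ScriptAPI"
$PropertyBag = $API.CreatePropertyBag()
$value = Test-Connection $target -Quiet  # $true if reachable, $false otherwise
$PropertyBag.AddValue("status", $value)  # "status" is assumed to match the property name used in the monitor conditions
$PropertyBag  # hand the property bag back to SCOM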
So I’m testing the connection and sending the response to SCOM. The PowerShell command “Test-Connection $target -Quiet” simply returns true or false, depending on whether the target is reachable.
Creating the Monitor with Silect MP Author
The creation of this monitor consists of 2 parts:
- Defining the class the monitor will be targeted at, and therefore the machine that will test the connection to the desktop
- Passing the status from that machine to SCOM and taking action by using a monitor
Defining a class:
To properly target this monitor we need to create a class in SCOM which identifies the servers that need to test the connection. In this case I’ve added a registry key to all servers that need to ping the desktop, so I’m starting with a Registry Target to create my class:
I fill in a server that already has the key, which makes it much easier to browse the registry instead of typing the path in by hand and risking typos.
Select the Registry key you want to look for
In my case I’ve added a key under HKEY_LOCAL_MACHINE\Software\pingtestwatchernode
Select the key, press Add and then OK
Identify your registry target:
Identify your discovery for the target
In my case I just check whether the key is there. No check on the content.
The discovery will run once a day.
Review everything and press finish
At this point our class is ready to be targeted with our script monitor.
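For reference, the marker key that the discovery looks for could be created on each watcher node with something like the sketch below. It assumes the key path mentioned above and an elevated PowerShell session; since the discovery only checks that the key exists, no values are written.

```powershell
# Marker key used by the registry discovery (path taken from the walkthrough above).
$keyPath = 'HKLM:\SOFTWARE\pingtestwatchernode'

# Create the key only if it is not there yet; requires administrative rights.
if (-not (Test-Path -Path $keyPath)) {
    New-Item -Path $keyPath -Force | Out-Null
}

# Confirm the key now exists.
Test-Path -Path $keyPath
```

The new class instance only shows up in SCOM after the next discovery run, which in this setup happens once a day.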
Next up is to create the monitor:
Create a new script monitor:
Browse to the PowerShell script and fill in the parameters. In this case I have 1 parameter which is “target” and will hold the IP of the desktop.
Define the conditions:
Healthy condition is when the status is true and type boolean
Critical condition is when the status is False
Note: I’m using a “boolean” Type
Configure the script and select the target you have created earlier on and the availability parent monitor
Identify your script based monitor
Specify a periodic schedule: run every 2 minutes
No alert generation necessary.
Review all the parameters and create the script based monitor.
Load the management pack in your environment and locate the monitor:
Check the properties => recovery tasks and create 2 recovery tasks, both for the health state “Critical”.
Note that the screenshot below already shows the correct healthy state, as it was taken after the management pack had been edited.
Export the management pack, open it in an editor and locate the “Recoveries” section to find the recovery tasks we just created:
Scroll to the right and locate the “ExecuteOnState” attribute. For the recovery task you want to run when the monitor goes back to healthy, change it from “Error” to “Success”.
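If you prefer to script this change instead of editing by hand, a minimal sketch could look like the following. The XML fragment, the IDs and the target and monitor names are hypothetical stand-ins for an exported management pack; the recovery IDs in your own file will differ.

```powershell
# Hypothetical, trimmed-down stand-in for an exported management pack.
# For a real file you would load it with: $mpXml = [xml](Get-Content -Path <file> -Raw)
$mpXml = [xml]@'
<ManagementPack>
  <Monitoring>
    <Recoveries>
      <Recovery ID="Demo.Recovery.LightsOff" Accessibility="Public" Enabled="true" Target="Demo.WatcherNode" Monitor="Demo.PingMonitor" ResetMonitor="false" ExecuteOnState="Error" Remotable="true" Timeout="300" />
      <Recovery ID="Demo.Recovery.LightsOn" Accessibility="Public" Enabled="true" Target="Demo.WatcherNode" Monitor="Demo.PingMonitor" ResetMonitor="false" ExecuteOnState="Error" Remotable="true" Timeout="300" />
    </Recoveries>
  </Monitoring>
</ManagementPack>
'@

# ID of the recovery that should run when the monitor returns to healthy (assumed name).
$healthyRecoveryId = 'Demo.Recovery.LightsOn'

# local-name() keeps the query working whether or not the file declares a default namespace.
$recovery = $mpXml.SelectSingleNode("//*[local-name()='Recovery'][@ID='$healthyRecoveryId']")
$recovery.SetAttribute('ExecuteOnState', 'Success')

# List every recovery with the state it will run on.
$mpXml.SelectNodes("//*[local-name()='Recovery']") | ForEach-Object {
    '{0} runs on {1}' -f $_.GetAttribute('ID'), $_.GetAttribute('ExecuteOnState')
}

# For a real file, write the change back with: $mpXml.Save(<file>)
```

Only the recovery selected by ID is touched; the other one keeps running on the “Error” state.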
Save the management pack and reload it in your environment.
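The export and re-import can also be done from the Operations Manager Shell rather than the console. This is a rough sketch, assuming an unsealed management pack; the management pack name and the folder are hypothetical and need to be adjusted to your environment.

```powershell
Import-Module OperationsManager

# Hypothetical values: replace with your own management pack name and working folder.
$mpName = 'Demo.PingMonitor'
$folder = 'C:\Temp\MP'

# Export the management pack; the file is written as <folder>\<mpName>.xml
Get-SCOMManagementPack -Name $mpName | Export-SCOMManagementPack -Path $folder

# ... change ExecuteOnState in the exported XML here ...

# Import the edited file back into the management group.
Import-SCOMManagementPack -Fullname (Join-Path $folder "$mpName.xml")
```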
So all we need to do is test it…
My pc is on: IT-Rambo has his cool backlight:
My pc is off and the light is automatically turned off…
Final note: if you use this method, make sure you do NOT save the recovery tasks in the console again. Doing so overwrites the settings we just changed in the management pack, because SCOM can’t natively configure a recovery task for a healthy state.
You can use this for basically anything where you want to run recovery tasks for 2 states of the same monitor, or even 3 if you have a 3 state monitor.