Monitors have been a very useful part of SCOM ever since SCOM 2007 came out. For a lot of new SCOM administrators, however, the alerts generated by monitors can cause headaches.
An alert is raised when the monitor's state changes to unhealthy and is closed automatically when the state changes back to healthy. This is the really short version…
Experienced SCOM admins will all agree that managing monitor-generated alerts can be tricky from time to time when you work with operators.
If an operator closes a monitor-generated alert in the console while the underlying condition has not changed, the monitor remains in an unhealthy state until someone forces a reset on the monitor itself.
We all know how many monitors are floating around in our environments, so this is a disaster waiting to happen. It is therefore wise to reset the unhealthy monitors for your core business services regularly until everybody is aware that they cannot close alerts from a monitor…
I also use this setup for another annoying problem that can have a great impact on your environment. Again, this is a scenario for ruling out human error.
- IF an alert is raised by a monitor going into an unhealthy state, a notification is successfully triggered and a ticket is created… So far so good.
- BUT if someone closes the ticket or the alert without looking at it, the condition remains, the monitor never changes state, and no new alert is raised.
- As a lot of my customers use SCOM as a monitoring tool in the back end and only watch the tickets it generates, they will not be alerted again.
Therefore I created this small PowerShell script in combination with a bat file. It resets the health of every unhealthy instance of the monitor you specify. The only thing left to do is to create a scheduled task for the bat file, and you are good to go.
The script can be downloaded at the Gallery together with the bat file.
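For readers who want to see the idea before downloading anything, here is a rough sketch of how such a reset could look. It is not the script from the Gallery. It assumes the OperationsManager PowerShell module is available and that it runs on a management server (otherwise you would first connect with New-SCOMManagementGroupConnection). The parameter name $MonitorDisplayName is made up for this illustration.

```powershell
# Sketch only, not the Gallery script.
# Assumption: run on a management server with the OperationsManager module installed.
param(
    # Hypothetical parameter name; the default is the monitor used in the example below.
    [string]$MonitorDisplayName = 'Logical Disk Fragmentation Level'
)

Import-Module OperationsManager

# Look up the monitor by the display name shown in its properties.
# Display names are not guaranteed to be unique, hence the loop.
$monitors = Get-SCOMMonitor -DisplayName $MonitorDisplayName

foreach ($monitor in $monitors) {
    # GetMonitoringStates expects a typed collection of monitors.
    $monitorList = New-Object 'System.Collections.Generic.List[Microsoft.EnterpriseManagement.Configuration.ManagementPackMonitor]'
    $monitorList.Add($monitor)

    # All instances of the class this monitor targets.
    $targetClass = Get-SCOMClass -Id $monitor.Target.Id
    $instances   = Get-SCOMClassInstance -Class $targetClass

    foreach ($instance in $instances) {
        foreach ($state in $instance.GetMonitoringStates($monitorList)) {
            # Only touch instances where this monitor is Warning or Error.
            if ($state.HealthState -eq 'Error' -or $state.HealthState -eq 'Warning') {
                $instance.ResetMonitoringState($monitor) | Out-Null
                Write-Output "Reset '$($monitor.DisplayName)' on $($instance.FullName)"
            }
        }
    }
}
```

Checking the state of the specific monitor, rather than the overall health of the object, avoids resetting instances that are unhealthy for an unrelated reason.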
Example: the fragmentation level is high and we want to be alerted again every day for as long as the condition remains:
Check the monitor properties to retrieve the monitor display name:
In this case it is “Logical Disk Fragmentation Level”. Copy the name.
Fill in the name in the batch file and run it.
The unhealthy monitors will be reset and their alerts are automatically closed in the console.
If we check the monitor again, its state has been reset, and it will fire again the next time it evaluates the condition, provided the condition is still unhealthy.
This way you will receive a new alert every time the script runs, for as long as the condition persists. You could also schedule it at the helpdesk's shift change, so that the incoming shift starts with a clean sheet and a clear view of the current situation in your environment.
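The scheduled task itself can be created through the Task Scheduler GUI, but it can also be registered from PowerShell. The sketch below assumes a server with the ScheduledTasks module (Windows Server 2012 or later). The file path, task name and start time are placeholders chosen for this illustration, so adjust them to your own setup.

```powershell
# Sketch only. Path, task name and time are hypothetical placeholders.
$batFile = 'C:\Scripts\ResetMonitor.bat'

# Run the bat file through cmd.exe once a day at 07:00.
$action  = New-ScheduledTaskAction -Execute 'cmd.exe' -Argument "/c `"$batFile`""
$trigger = New-ScheduledTaskTrigger -Daily -At '07:00'

# Without -User the task is registered for the current user; in practice you
# would probably run it under an account with SCOM operator rights or higher.
Register-ScheduledTask -TaskName 'Reset SCOM monitor' -Action $action -Trigger $trigger
```

To line the reset up with a shift change, set the trigger time to just before the new shift starts.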