
SCOM: Configure a monitor recovery task for a healthy state

During a recent project a client had a small request: create a monitor and run a command when a device was no longer accessible. Easy, right? But (yep, there’s always a but) they also wanted to run a command when the monitor returned to a healthy state, to restart a service when the device came back online… Hmmm, and all in 1 monitor.

So the conditions were as follows:

Monitor:

  • Action: Run a PowerShell based monitor to test the connection with the device
  • BAD: Device is down => Run recovery task to remediate
  • GOOD: Device is up again => Run recovery task to restart service

(Note: always draw up this small matrix of a monitor design so you know exactly what the customer wants.)

I don’t have the device to simulate this, but I came up with a small example in my lab to show you how to get this working with just 1 monitor. The situation in my lab is very simple: I want to turn on my desk lighting when my PC is on (and I’m working) and turn it off when my PC is not online.

My conditions:

Monitor:

  • Action: Run a PowerShell based monitor to test the connection and pass the result to SCOM
  • BAD: PC is offline => turn off my desk lighting
  • GOOD: PC is online => turn on my desk lighting

So first things first we need to test the connection to see whether my pc is running. To check this I’m using this small script:

[xml]

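param ([string]$target)

# Create the SCOM scripting API object and an empty property bag
$API = New-Object -ComObject "MOM.ScriptAPI"
$PropertyBag = $API.CreatePropertyBag()

# Test-Connection -Quiet returns $true when the target answers, $false otherwise
$value = Test-Connection -ComputerName $target -Quiet

$PropertyBag.AddValue("status", $value)

# Hand the property bag back to the SCOM workflow
$PropertyBag
# When testing at a command prompt, use this line instead: $API.Return($PropertyBag)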

[/xml]

So I’m testing the connection and sending the response to SCOM. The PowerShell command “Test-Connection $target -Quiet” simply returns true or false depending on whether the target is accessible.

Creating the Monitor with Silect MP Author

The creation of this monitor consists of 2 parts:

  • Defining the class the monitor will be targeted at, and therefore the machine which will test the connection to the desktop
  • Passing the status from the machine to SCOM and taking action by using a monitor

Defining a class:

To properly target this monitor we need to create a class in SCOM which identifies the servers that need to test the connection. In this case I’ve added a registry key to all servers that need to ping the desktop, so I’m starting with a Registry Target to create my class:

printscreen-0254printscreen-0255

I fill in a server that already has the key. This makes it much easier to browse the registry instead of typing the path in, which leaves more room for errors.

printscreen-0256

Select the Registry key you want to look for

printscreen-0257

In my case I’ve added a key under HKEY_LOCAL_MACHINE\Software\pingtestwatchernode

printscreen-0258

Select the key and press add and ok

printscreen-0259

Identify your registry target:

printscreen-0260

Identify your discovery for the target

printscreen-0261

In my case I just check whether the key is there. No check on the content.

printscreen-0263

The discovery will run once a day.

printscreen-0264

Review everything and press finish

printscreen-0265

At this point our class is ready to be targeted with our script monitor.

Next up is to create the monitor:

Create a new script monitor:

printscreen-0266

Browse to the PowerShell script and fill in the parameters. In this case I have 1 parameter which is “target” and will hold the IP of the desktop.

printscreen-0267

Define the conditions:

The healthy condition is when the status is True, with type Boolean

printscreen-0268

Critical condition is when the status is False

printscreen-0269

Note: I’m using a “boolean” Type

Configure the script and select the target you have created earlier on and the availability parent monitor

printscreen-0270

Identify your script based monitor

printscreen-0271

Specify the schedule: run every 2 minutes

printscreen-0272

No alert generation necessary.

printscreen-0273

Review all the parameters and create the script based monitor.

printscreen-0274

Load the management pack in your environment and locate the monitor:

printscreen-0278

Check the properties => recovery tasks and create 2 recovery tasks for the Health state “critical”.

Note that the screenshot below already shows the correct healthy state after the configuration of the MP.

printscreen-0279

Export the management pack, open it in an editor and locate the “Recoveries” section to find the recovery tasks we just created:

printscreen-0280

Scroll to the right and locate the “ExecuteOnState” attribute. For the recovery you want to run when the monitor goes back to healthy, change it from “Error” to “Success”.

Save the management pack and reload it in your environment.
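If you prefer not to hand-edit the XML, the same change can be scripted. The snippet below is only a sketch: the function name and the recovery IDs are made up for the example, and the XML is a trimmed-down stand-in for a real exported management pack, which contains far more attributes and elements.

```powershell
# Sketch: flip ExecuteOnState on one recovery in an exported, unsealed management pack.
# Set-RecoveryExecuteOnState and the Demo.* IDs are hypothetical names for this example.
function Set-RecoveryExecuteOnState {
    param (
        [Parameter(Mandatory)] [System.Xml.XmlDocument]$ManagementPack,
        [Parameter(Mandatory)] [string]$RecoveryId,
        [ValidateSet('Success', 'Warning', 'Error')] [string]$State = 'Success'
    )

    # Locate the recovery by its ID attribute
    $recovery = $ManagementPack.SelectSingleNode("//Recovery[@ID='$RecoveryId']")
    if ($null -eq $recovery) {
        throw "Recovery '$RecoveryId' not found in the management pack."
    }

    # The document is modified in place
    $recovery.SetAttribute('ExecuteOnState', $State)
}

# Self-contained demo on a simplified fragment
[xml]$mp = @'
<ManagementPack>
  <Monitoring>
    <Recoveries>
      <Recovery ID="Demo.Recovery.LightOn" ExecuteOnState="Error" />
      <Recovery ID="Demo.Recovery.LightOff" ExecuteOnState="Error" />
    </Recoveries>
  </Monitoring>
</ManagementPack>
'@

Set-RecoveryExecuteOnState -ManagementPack $mp -RecoveryId 'Demo.Recovery.LightOn'

# Show the resulting state of both recoveries
$mp.SelectNodes('//Recovery') | ForEach-Object {
    '{0} => {1}' -f $_.GetAttribute('ID'), $_.GetAttribute('ExecuteOnState')
}
```

Against a real export you would load the file with Get-Content into an [xml] variable, call the function with the ID of your own recovery, and write the result back with the document’s Save method before re-importing.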

printscreen-0281

So all we need to do is test it…

My pc is on: IT-Rambo has his cool backlight:

20141130_230930098_iOS

My pc is off and the light is automatically turned off…

20141130_230904267_iOS

Final note: if you use this method, make sure NOT to save the recovery tasks in the console anymore. Otherwise the settings we just changed in the management pack will be overwritten, as the SCOM console can’t natively configure a recovery task for a healthy state.

You can use this for basically anything where you want to react to 2 states of the same monitor, or even 3 if you have a 3-state monitor.

Home automation: Putting a child lock on my Nest thermostat using SCOM

 

This post is part of a series in which I demonstrate how to use SCOM to monitor basically everything. The other parts can be found here:

Now that I have successfully been able to get data into SCOM from my Nest thermostat and my Flukso energy meter, it’s time to do some cool stuff with it. More devices are in the pipeline to get data into SCOM to create the ultimate domotics controller, or should I say “SCOMotics”…

The world: Keeping an eye on Teen Trouble

One problem I have in real life is that it’s very hard to explain to my wife and kids how radiant floor heating works. It takes some time to heat up but it stays warm for a long time, so there’s no point in setting the thermostat higher to get instant heat: it takes approximately 1 hour to heat up 2 degrees Celsius (something I also learned from getting my Nest thermostat data into SCOM).

But you can explain all you want: if they find it chilly they’ll turn up the thermostat, assuming it will get warm instantly. In fact they are just using more energy than necessary to heat the house 2 hours later, when they have already left.

So the mission was very simple. To stop them from doing this. Yes… I could put a lock code on the Nest thermostat and make it only available to me but if I’m not home and they really need to put the heating higher they are not able to do so.

So I came up with another solution: Setting a hard limit on the degrees and enforcing it.

So in short what do I need to achieve with SCOM:

  • Detection of the current temperature set: Target temperature
  • Alerting when the Target temperature breaches the set limit
  • Take corrective action to make sure the target temperature is set below the max temperature.

So let’s start with the detection of the current target temperature. I can reuse the work I already did to read in this value and compare it to the limit. To keep track of things and as this is a more general approach I’ve documented the process of creating a PowerShell script monitor using Silect MPAuthor here: http://scug.be/dieter/2014/04/24/scom-creating-a-powershell-script-monitor-with-silect-mpauthor/
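The comparison itself is tiny. The sketch below is not the script from the linked post. It only illustrates the check, with a hypothetical function name, a default limit of 19 degrees taken from the test further down, and the assumption that a target exactly on the limit still counts as healthy.

```powershell
# Sketch of the limit check. Test-TargetTemperature is a hypothetical name.
function Test-TargetTemperature {
    param (
        [Parameter(Mandatory)] [double]$TargetTemperature,
        [double]$Limit = 19.0
    )

    # Healthy ($true) while the target stays at or below the limit (assumption)
    return ($TargetTemperature -le $Limit)
}

$withinLimit = Test-TargetTemperature -TargetTemperature 20
"Target 20 within limit 19: $withinLimit"

# In the real monitor script this result would go into the property bag, e.g.:
#   $PropertyBag.AddValue("status", $withinLimit)
```

The monitor conditions can then be defined on the “status” value in the same way as for any other script monitor returning a Boolean.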

So now that we have the monitor in place let’s check out whether it’s working!

First of all I’m setting my Nest thermostat to 20 Celsius while my limit is set to 19 Celsius:

SNAG-0257

After the first run the monitor is picking up that indeed the temperature is higher than the requested limit. This is detected by running the PowerShell script monitor we’ve configured earlier:

SNAG-0263

Here you can see that the recovery task which I configured kicked in as well. This recovery task calls a PHP file which is located on my web server and is loaded by using the PowerShell Invoke-WebRequest cmdlet.

Note: I’m running this recovery against my Watchernode class which consists of 1 server and thus I’ve copied the “settempnest.ps1” to the local folder of that particular server.

How did I configure the recovery task

First open the monitor and click add on the “configure recovery tasks” section

SNAG-0260

Fill in the name of the recovery and select the health state to react to.

SNAG-0261

Enter the command:

  • Full path: C:\Windows\System32\WindowsPowerShell\V1.0\powershell.exe
  • Parameter: -noexit "& 'C:\scripts\settempnest.ps1'"

SNAG-0262

The PowerShell script runs Invoke-WebRequest against my web server. The PHP script it calls is copied below:

[xml]

<?php

require 'inc/config.php';
require 'nest-api-master/nest.class.php';

define('USERNAME', $config['nest_user']);
define('PASSWORD', $config['nest_pass']);
date_default_timezone_set($config['local_tz']);

$nest = new Nest();
$nest->setTargetTemperatureMode(TARGET_TEMP_MODE_HEAT, 18.0);

[/xml]
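The content of settempnest.ps1 isn’t shown in this post. A minimal sketch of what such a script could look like is below. The function name and the URL are placeholders, not the real ones from my environment.

```powershell
# settempnest.ps1 (sketch). Invoke-NestTemperatureReset is a hypothetical name.
function Invoke-NestTemperatureReset {
    param (
        [Parameter(Mandatory)] [string]$Uri,
        [int]$TimeoutSec = 30
    )

    try {
        # -UseBasicParsing avoids the Internet Explorer dependency in Windows PowerShell
        $response = Invoke-WebRequest -Uri $Uri -UseBasicParsing -TimeoutSec $TimeoutSec -ErrorAction Stop
        return ($response.StatusCode -eq 200)
    }
    catch {
        Write-Warning "Request to $Uri failed: $($_.Exception.Message)"
        return $false
    }
}

# Example call, commented out because the address is a made-up placeholder:
# Invoke-NestTemperatureReset -Uri 'http://webserver.example/settempnest.php'
```

Returning true or false keeps the script easy to test by hand before you wire it into the recovery task.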

So after running the recovery we see the monitor changing back from error to healthy:

SNAG-0259

There we go… All good again saving some energy

SNAG-0265

And final check on the thermostat itself… Back humming at 18 degrees.

SNAG-0264

SCOM 2012: Overview link blog

This post will be my (and hopefully your) one stop for all the relevant info on SCOM 2012. I will try to give an overview of all the different steps you need to start from scratch and continue to build your environment to a level that suits your needs.

If you feel there are things missing or you’ve found dead links please do not hesitate to leave a comment and I will update this post. This post has grown out of my favourite list of SCOM related topics and info I found on forums, technet and blogs.

Note:

    • I’ll sometimes post more than 1 link at a topic so you can combine the different blogposts to get the bigger picture.
    • Most of the info is relevant for SCOM 2012 and SCOM 2012 SP1 (if there’s specific info for a specific version it will be pointed out.)
    • This is a link post to relevant info I found on the web. All credits and copyright belong to the respective authors.

SCOM 2012 R2:

General information:

This section lists all links that will give you a general overview of SCOM.

Design and Topology

This section lists all links that will help and guide you to make a proper design and take the correct decisions concerning topology

How to install

This section lists all links to the install walkthroughs and possible issues.

Configuring SCOM

This section lists all links to help you quickly set up SCOM after you have successfully installed it.

 

Specific configuration

This section lists all the different aspects of SCOM that need additional installation or configuration

Dashboards

ACS

APM (Application Performance Monitoring)

Gateway configuration

Network Monitoring

Azure Monitoring:

Management pack basics

This section lists all the links to give you the basics about management packs

 

Management pack advanced

This section lists all the links to the more advanced management pack tips and tricks.

Community Management packs

A list of must-have community management packs to increase your productivity and fill some gaps in SCOM’s functionality.

Note: These management packs are written by members of the community so no warranty is given. Test before you use in production!

Integration possibilities

This section lists links to different integration possibilities between the different System Center products.

 

Useful blogs and sources for this list (in random order of importance)

Note: System Center Blogs: Now on iPhone, Android and Windows Phone

Tips & Tricks

Cool Showcases with SCOM

Partner solutions for SCOM (in random order)

Videos

LiveMeeting 22/11/2012: System Center Products better together…

 

On the 22nd of November I’m hosting a LiveMeeting on how to integrate the different System Center products.

We’ll go over the different steps to integrate the different System Center products, to get past the standard “just monitor it” scenario with SCOM and truly tie the products together.

All the products will be positioned within the System Center stack and integrations will be showcased.

If you are looking for a session to convince your boss to install more system center products or just want to convince yourself of the force of system center products brought together…

Look no further this is your session.

Register here: https://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032533093&Culture=en-us&community=0

Speaking at MMS2012

MMS2012 just got a little bit more interesting for me (can it?).

I will get the chance to speak at this major System Center event together with System Center Operations Manager guru and colleague Maarten Goet.

We’ll be presenting the following session:

AM-B312 Operations Manager Tuning War: No Management Pack Left Behind

Friday, April 20 10:00 AM – 11:15 AM Venetian Ballroom G

Speaker(s): Dieter Wijckmans, Maarten Goet

Track(s): Application Management

Session Type: Breakout Session

Products: Operations Manager

So you installed Microsoft System Center Operations Manager, deployed your first agent and started importing management packs. Now, a hundred or more alerts start appearing. Was this really the best approach? How do I tune my management packs anyway? How do I get things sorted again? Come to this session led by five-year System Center MVP Maarten Goet, who draws from multiple years of field experience to show you how to successfully tackle this. No Operations Manager administrator should miss out on this session!

Make sure to join us to get an in-depth view of management packs and how to get started with tuning yours.

Additional resources:

Hope to see you there!

SCOM 2012: What’s new: Default behavior of overrides

This post is part of a series What’s new: Check here for the other parts.

SCOM has some huge changes on board… But some are rather small and go unnoticed by the untrained eye, although they could save you a major headache.

I’m pretty sure not a lot of hands will be raised when I pop the question: “Are you 100% sure your default management pack is free of overrides? If it’s not, you buy me a beer.” This may not seem that important because (let’s face it) it works, doesn’t it? But at one point or another you will have a big headache when you want to delete or upgrade a management pack which has an override stored in the default management pack. The default management pack then references that management pack, and therefore you can’t delete it.

Although a lot of new System Center admins make this error, I must admit it’s in fact quite easy to make… Just click next and it’s there…

In SCOM 2007 R2 the default behavior when creating an override is to store it in the default management pack:

printscreen-0036

NOTE: Notice that my default management pack name has been changed to something which draws a little more attention, to minimize the chance that you click OK too fast. Check here how to do this: http://scug.be/blogs/dieter/archive/2011/05/13/scom-2007-renaming-default-management-pack-display-name.aspx

This is one of the first things on my checklist when I open the console at a new client. It doesn’t rule out that once in a while you click OK too fast, but it helps avoid some issues.

In SCOM 2012 this behavior has changed. Now you need to explicitly select a management pack before you can click OK, making my linked blog post above completely useless, but hey, you can’t win them all.

printscreen-0037

This small adaptation will keep a lot of default management packs clean and will score me a lot fewer free rounds of beer, but hey… it’s for a good cause.

While we are on the subject make sure you use the proper approach for storing your overrides.

Marnix Wolf MVP has written a nice blog post on the subject: Storing overrides, the good, the bad and the ugly.

His conclusion:

“When storing Overrides, store them in a single unsealed MP which is dedicated only to the MP where you’re making the override for. So overrides for the SQL MP go into the unsealed MP ‘Overrides SQL’ and overrides for the Server OS MP go in to the unsealed MP ‘Overrides Server OS’. This is the only viable and workable option. All other options cause issues, sooner or later.”

SCOM 2012: What’s new: Maintenance mode changes

This post is part of a series What’s new: Check here for the other parts.

In the second part of this series of what’s new in SCOM2012 I’ll be highlighting a small change with big implications in SCOM 2012 in the maintenance mode department.

It was kind of frustrating to see that a lot of issues at customer sites had to do with the fact that the RMS or MS (or even worse, both) had been put in maintenance mode and never came out of it until manually removed.

Putting your RMS in maintenance mode is a big no-no, as this is the beating heart of your environment and doing so can cause serious issues.

But hey, enough said about the past… let’s talk about the future! Fortunately the future is bright in the SCOM 2012 world concerning maintenance mode.

These are in fact the changes in maintenance mode:

  • When a management server in SCOM 2012 (remember, no more RMS) is placed in maintenance mode, the System Center Management Configuration Service will step in and make sure that the agents are forced to fail over to another management server, so no data loss will occur. This is of course made possible by bundling the management servers in resource pools.
  • By far the most important change is that when you put a management server in maintenance mode, the workflow to get that particular management server out of maintenance mode is moved to another management server which is not in maintenance mode. This way the command to get the server out of maintenance mode is triggered from another server. Finally…

Why is this such a huge improvement?

In SCOM 2007 R2, if for one reason or another you found your RMS in maintenance mode, the workflow to get it out of maintenance mode was also fired from the RMS. Which of course would not fire because… yeah, it’s in maintenance mode. This could keep your RMS in maintenance mode without you even knowing it. The only way to get it out was to manually remove the maintenance mode.

So this is resolved in SCOM 2012 by moving the workflow to get the management server out of maintenance mode to another management server in the resource pool. Another cool feature of the resource pools in which the different management servers reside.

The only catch is that for this new approach to work you’ll need at least 50% of your management servers out of maintenance mode. So take this into account when you decide on update strategies, and divide your management servers into at least 2 different patch groups with different action times.
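When planning patch groups you can sanity check the numbers up front. The helper below is purely illustrative: the function name is made up, and it simply applies the 50% rule exactly as stated above, without querying SCOM.

```powershell
# Sketch: does a maintenance plan leave at least 50% of the management servers available?
# Test-MaintenanceModePlan is a hypothetical helper and does not talk to SCOM.
function Test-MaintenanceModePlan {
    param (
        [Parameter(Mandatory)] [int]$TotalManagementServers,
        [Parameter(Mandatory)] [int]$ServersInMaintenance
    )

    $available = $TotalManagementServers - $ServersInMaintenance

    # "At least 50% out of maintenance mode" as stated in the text
    return (($available * 2) -ge $TotalManagementServers)
}

# 4 management servers split into 2 patch groups of 2: the plan holds
Test-MaintenanceModePlan -TotalManagementServers 4 -ServersInMaintenance 2

# Patching 3 out of 4 at once: the plan does not hold
Test-MaintenanceModePlan -TotalManagementServers 4 -ServersInMaintenance 3
```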

SCOM 2007: Authoring console can’t find referenced mp

Just the other day I was working on an MP after updating the environment to CU4 and all of a sudden I got the error that a reference management pack wasn’t found.

printscreen-0024

Ok no big issue. Locate the management pack and reference it…

printscreen-0025

No go… Still the same error…

printscreen-0026

So why? My version should be correct no? Wrong…

Apparently the Microsoft.SystemCenter.Library management pack is included in the CU4 but is installed while running the SQL update script (that’s why it’s so important to run them!). It bypasses the verification code.

CAUTION: Just download the management pack but do not import it into your environment. It’s already in there and functioning correctly. In rare cases reimporting the management pack into your environment can cause a corrupt database.

In fact Microsoft has released KB2590414 to address this issue:

http://support.microsoft.com/kb/2590414

In the middle of the KB article you can download the management pack, which is packaged in an MSI file

printscreen-0027

Read the license agreement carefully (yes you should!) and accept it:

printscreen-0028

Select the folder (I kept it default) click next:

printscreen-0029

Confirm:

printscreen-0030

Installation complete:

printscreen-0031

When you click close the folder where the mp was copied will open:

printscreen-0033

Open your console again and browse to the newly installed Management pack:

printscreen-0034

This time no error anymore and you can happily start authoring…

SCOM 2012: Meet the SCOM 2012 experts at SCUG NL (wrap up)

Last Friday 06/01 the event “Meet the SCOM 2012 experts” was held by SCUG NL near Amsterdam.

The turnout was really great and a lot of speakers (including yours truly) gave sessions regarding the next big version of System Center Operations Manager.

WP_000823

The day was quickly sold out and those who made it in enjoyed the sessions, which gave a nice first view of the different aspects of SCOM 2012.

All the slide decks can be found here:

http://www.scug.nl/2012/01/13/presentaties-scug-nl-vrijdag-6-januari-2012/

Program (in order of appearance):

Dieter Wijckmans:

Session about the proper preparation to upgrade your environment from SCOM 2007 to SCOM 2012 with all the different tweaks and perks you need to do to make sure everything goes smoothly: scug_nl_How to prepare yourself for SCOM 2012_Dieter_Wijckmans

Walter Eikenboom:

Session about end-to-end application monitoring in SCOM 2012. Nice session packed with demos on how to take full advantage of the different aspects of correctly monitoring your applications with SCOM 2012: SCUG NL – OpsMgr 2012 End-To-End monitoring v1.0_Walter_Eikenboom

Michael Guthrie (Microsoft product team of SCOM 2012)

Session about the different aspects of Application Monitoring features in SCOM 2012. The features are greatly improved to give you even more in depth insight in where to pinpoint the issue of a problem with an application: SCUG NL APM with OM12_Michael_Guthrie

Vishnu Nath (Microsoft product team of SCOM 2012)

Session about the greatly improved Network monitoring features in SCOM 2012. Discussed a broad variety of new features and possibilities in the field of network monitoring. The initial configuration is also explained: SCUG NL OM2012_NetworkMonitoring_Vishnu Nath

Oskar Landman (SCOM MVP)

Session about the differences between SCOM 2007 and SCOM 2012. Oskar highlighted some well hidden new features which make your life as a SCOM admin a lot easier, with an in-depth look at the inner workings of SCOM. A nice interactive session with lots of questions from the crowd (unfortunately they are not documented in the slide deck): SCUG NL – under the hood_Oskar_Landman

SCOM 2012: So what’s new?

So System Center Operations Manager 2012 is almost rolling into the station…

shinkansen_300_series

What are the cool new things which will make your life so much easier as a SCOM admin? Will they help you to convince others of the great things SCOM is capable of doing?

There will be small things which will make a great difference, and of course there will be big things which will make little difference (because the old situation was already SO good).

The different parts will be linked here so you’ll have the overview in 1 post:

  1.  Agent config in SCOM 2012 for multihoming agents
  2.  Maintenance mode changes in SCOM 2012
  3.  Default behavior of overrides in SCOM 2012

Get your ticket now to hop on and explore the new features in this blog series.
