How to query custom logs data in Log analytics

This post is a follow-up on how to get SCCM custom data into your Log Analytics environment.

As soon as you have your SCCM custom logs, or any other logs, in log analytics they get indexed under the type you have specified.

In this particular case I used SCCMLOGS_CL (note that the _CL suffix is mandatory). So let’s jump into the Log Analytics query window to find out what’s in the logs at this time:

Browse to Log analytics => Logs

clip_image002

The log analytics query window will open and will give you the opportunity to start your query journey:

clip_image004

Remember our custom type: SCCMLOGS_CL. Note the autosuggest feature, which will help you create your own queries.

clip_image006

If you run this query you will get all the results within the type. This is a great way to check whether data is flying in.

clip_image008

Now let’s look for more detailed patterns. If you type where on the next line you’ll get a list of all the fields in your data:

clip_image010

Let’s select RawData and keep only the lines that contain the word “error”:

clip_image011

This already returns a lot of results:

clip_image013
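Written out as text, the query from the screenshots is just two lines. The Python helper below is only a sketch that assembles that query string so you can reuse it in a script; SCCMLOGS_CL is the custom type from this post, RawData is the column the raw log line lands in, and the function name build_contains_query is made up for this example.

```python
def build_contains_query(table: str, column: str, term: str) -> str:
    """Return a Log Analytics query that keeps rows whose column contains term."""
    # Escape backslashes and double quotes so the term stays inside the string literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{table}\n| where {column} contains "{escaped}"'


query = build_contains_query("SCCMLOGS_CL", "RawData", "error")
print(query)
```

The contains operator is case-insensitive, so this also catches “Error” and “ERROR”.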

Here’s another trick up your sleeve: you don’t need to type everything. You can point and click to build your query. Just click the + sign next to a field, in this case “Computer”.

clip_image015

This will add the field AND the content of the field to your query:

clip_image017
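What the + click does is append one more where line to the query. A minimal sketch of that step, assuming the portal adds an equality filter; the computer name SCCM01 and the helper add_equals_filter are hypothetical.

```python
def add_equals_filter(query: str, column: str, value: str) -> str:
    """Append a 'where column == value' line to an existing query."""
    # Escape backslashes and double quotes so the value stays one string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{query}\n| where {column} == "{escaped}"'


base = 'SCCMLOGS_CL\n| where RawData contains "error"'
narrowed = add_equals_filter(base, "Computer", "SCCM01")  # SCCM01 is a placeholder name
print(narrowed)
```

Each extra filter is simply another line, which is why clicking your way through the fields works so well.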

So now you can really start building your searches on your custom data.

Next time we’ll go over how you can actually create custom fields you can search on.

How to upload SCCM logs in Log Analytics

One of the great advantages of having all logs in 1 place is that they get indexed and you can query them for different scenarios.

Just recently I was working on a project together with SCCM engineers and they told me a couple of times “it’s in this or that logfile”. They would fire up SCCMtrace, look for the specific entry and start troubleshooting from there.

“OK” I thought, maybe just maybe there’s a better solution. Because of my monitoring background I don’t like to think reactively, as in “it already happened”, but love to think proactively.

clip_image002

That’s why I proposed to dump all the logs in Azure log analytics to get them indexed and have alerting / reports on them.

It took some convincing to get the SCCM engineers to believe this is possible but it is actually quite simple to set it up using log analytics and custom logs.

So first up the requirements:

  • You need to have an active Azure subscription
  • You need to have a Log Analytics workspace
  • You need to have an SCCM server onboarded on that workspace.

If these are met the following steps will ensure that the custom logs are coming in:

Select your workspace in the Log Analytics blade and select “advanced settings”

clip_image004

Navigate to “Data” => Custom Logs => Add +

clip_image006

This opens the 4 step process, which is basically all there is to it.

clip_image008

Step 1: Select a sample log file with the required format. Note that this sample logfile can’t exceed a size of 500k.

For this I’ve selected a file on my SCCM site server called SMS_CLOUDCONNECTION:

clip_image010

Click browse => select the file => upload => click next

clip_image012

Step 2:

Select the record delimiter:

This is a choice between 2 options:

  • Either every line becomes a new record in Log Analytics
  • Or you specify a date format, and each record starts at a line with a matching timestamp

Note: If no date format is selected, Log Analytics fills the field “date generated” with the date the logfile was uploaded instead of the date the alert / log entry occurred.

clip_image014
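To make the difference between the two delimiter options concrete, here is a small sketch in Python. It is not how the agent is implemented, just an illustration; the timestamp layout and the sample log lines are assumptions made up for the example.

```python
import re

# Assumed timestamp layout (YYYY-MM-DD HH:MM:SS) at the start of a record.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_by_line(text):
    """Option 1: every non-empty line is its own record."""
    return [line for line in text.splitlines() if line.strip()]


def split_by_timestamp(text):
    """Option 2: a record starts at each line that begins with a timestamp."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            # Continuation line: attach it to the record that is still open.
            records[-1] += "\n" + line
    return records


# Hypothetical log content with one entry that spans two lines.
sample = (
    "2019-01-15 10:00:01 Connection to cloud service failed\n"
    "    retrying in 30 seconds\n"
    "2019-01-15 10:00:31 Connection established\n"
)

line_records = split_by_line(sample)
timestamp_records = split_by_timestamp(sample)
print(len(line_records), len(timestamp_records))
```

With the newline option the multi-line entry is cut into two records; with the timestamp option it stays together as one.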

Step 3 : Adding log collection paths:

This is where Log analytics is going to look for the log files.

A couple of things to keep in mind:

  • The path you fill in here will be checked on ALL machines which are onboarded to the Azure Log Analytics workspace
  • If you want a specific log you fill in the full name
  • If you want all logs with a certain extension you can actually use wildcards as well
  • You can add multiple logs to the same custom type.

For demo purposes I’ve added the path to all logfiles in SCCM as shown below and I’m uploading all *.LOG files.

The advantage of using wildcards is that no logs get missed. If a new logfile is created due to size issues, it will be picked up as well.

clip_image016
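If you want a feel for what a *.LOG pattern selects before you configure it, the sketch below uses Python’s fnmatch module. This only approximates the idea of wildcard matching; the agent’s own matching rules may differ, and all file names except SMS_CLOUDCONNECTION are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_collection_path(file_name, pattern):
    """Case-insensitive wildcard match, as file names on Windows ignore case."""
    return fnmatchcase(file_name.lower(), pattern.lower())


# Hypothetical directory listing; only the first name comes from this post.
files = ["SMS_CLOUDCONNECTION.log", "hypothetical_new.log", "readme.txt"]

collected = [name for name in files if matches_collection_path(name, "*.LOG")]
print(collected)
```

A logfile that shows up later with the same extension matches the pattern without any change to the configuration.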

Step 4 :

Add a name for all the records. This name is actually called a type within Log Analytics. This type will hold all the log entries and will be your first stop to start querying.

clip_image018

Click done and at this point the new custom log has been created. The log analytics agents will get notified and will search for logs in that specific directory.

clip_image020

After a while the logs will be parsed and be available in log analytics to query.

In the next blog post I’ll show how to efficiently search across these types.

Use OMS to calculate SCCM patch window

This blog post is part of the Coretech Global Xmas blogging marathon. To find all cool content please take a look at http://blog.coretech.dk/

Recently I have been exploring OMS a lot and came across a cool user scenario which really showcases the benefits of having all data in one place: you can use this big data to connect the dots between different systems and get more insight into your environment and the relationships between those systems.

One demo which really had some eyes popping was the calculation of the SCCM patch window with OMS. A lot of people already know that there’s a specific System Update Assessment solution which points out which machines are missing which updates. But there’s more to this solution than meets the eye.

You can use this solution, together with the data OMS gathers for all your updates, to calculate very precisely how long it will take to patch a particular machine and create a patch window accordingly.

Let’s get started shall we!

For this demo I presume you already have an active OMS subscription + workspace. For more info please refer to my OMS quick start guide to get you going fast: http://scug.be/dieter/2015/05/08/microsoft-operations-management-suite-quickstart-guide/

Log on to your workspace and make sure you have machines connected + the solution installed:

First click on Solutions Gallery:

printscreen-23-12-2015 0000

Find System Update Assessment Solution and make sure it is added to your workspace. If it’s not yet added make sure to click the icon and add in the next screen

printscreen-23-12-2015 0001

Make sure to add the Solution to your workspace

printscreen-23-12-2015 0002

If you add the Solution for the first time it will perform an Assessment to gather the data for your environment:

printscreen-23-12-2015 0003

When the Initial Assessment has completed you will get your info on the tile which represents the System Update Assessment:

printscreen-23-12-2015 0004

TIP: No worries, my environment is not that badly patched. If you want to take this solution for a test drive you can always install Azure VMs with an earlier image (from a couple of Patch Tuesdays ago) to have a machine which is in fact missing updates.

Click on the tile to open the detailed pane shown below:

 

printscreen-23-12-2015 0005

Click on the Required Missing updates pane:

printscreen-23-12-2015 0006

The next window gives you by default a graphical overview of the missing patches and how many days ago they were released. This shows how far behind on patching your machines are. You also get a pie chart with the number of missing patches per category.

printscreen-23-12-2015 0007

Note that on the right there’s an indication, in minutes, of how long it will take on average to install these missing updates:

printscreen-23-12-2015 0011

This is not just a “Guesstimate”: OMS is actually using data from the logs collected on all machines to give you an accurate install time for this particular set of patches missing on this machine.

The number (in this case 81) indicates that install time data is available for all the missing patches.

At this time you can clearly state that the machine will probably be patched in approximately 14 minutes. You can build in some margin but definitely don’t need an hour to patch this machine.
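The arithmetic behind a patch window is simple enough to sketch: add up the predicted install times and put a margin on top. The numbers and the 25% margin below are purely illustrative assumptions (they are not the 81 updates from my workspace), and patch_window_minutes is a made-up helper.

```python
import math


def patch_window_minutes(predicted_seconds, margin=0.25):
    """Sum the predicted install times, add a safety margin, round up to minutes."""
    total_seconds = sum(predicted_seconds)
    return math.ceil(total_seconds * (1 + margin) / 60)


# Hypothetical predictions in seconds; together they add up to 840 s (14 minutes).
predictions = [300, 240, 180, 120]

print(patch_window_minutes(predictions, margin=0))  # no margin
print(patch_window_minutes(predictions))            # with the assumed 25% margin
```

Even with a generous margin the window stays far below the hour that is often reserved by default.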

Create your own insights!

printscreen-23-12-2015 0012

This is just the pretty eye candy view of the Solution!

If you want to have the data by update you can dive into the big data gathered and create your own insights in your patch strategy. This can be achieved by using the “raw data” in the Search Query view and creating your own views. Let’s see how we can find out for example which patches will take more than 60 seconds to install so we can put them in a different patch group:

Click on “results” next to updates right underneath the search query window

printscreen-23-12-2015 0007

At this point you get the 81 results with all their data but… no install time?

 

printscreen-23-12-2015 0008

Click “Show More” on the bottom of the screen to unveil the InstallTimeAvailable / InstallTimePredictionSeconds / InstallTimeDeviationRangeSeconds properties

printscreen-23-12-2015 0009

 printscreen-23-12-2015 0010       

This is the data gathered for all the updates which are identified as missing on my systems.

InstallTimeAvailable: Indicates whether enough data has been gathered in the OMS system to give you an actual prediction of the install time. For new updates it can of course take some time before enough data is available for a reliable prediction.

InstallTimePredictionSeconds: This is the prediction based on all the data gathered through the OMS system. Note that this is based not only on your environment but on all environments connected to OMS, which shows the huge advantage of the Big Data approach of Microsoft Operations Management Suite.

InstallTimeDeviationRangeSeconds: Indicates how much the prediction can fluctuate. In this case the value is 0.83, meaning the actual install time can be that much lower or higher than the prediction.
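A quick sketch of how I read these two values together, assuming the deviation applies symmetrically around the prediction. The 12 second prediction is a made-up number; only the 0.83 comes from my data.

```python
def install_time_range(prediction_seconds, deviation_seconds):
    """Return the (lowest, highest) expected install time in seconds."""
    lowest = max(0.0, prediction_seconds - deviation_seconds)
    highest = prediction_seconds + deviation_seconds
    return lowest, highest


# 12.0 is a hypothetical prediction; 0.83 is the deviation mentioned above.
lowest, highest = install_time_range(12.0, 0.83)
print(f"between {lowest:.2f} and {highest:.2f} seconds")
```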

Now to find out how many of the updates (81 of them) have an install time of more than 60 seconds we need to use the Search Query power:

Click in the Search Query window on the top of the screen and start typing Install at the end of the line:

printscreen-23-12-2015 0013

OMS will give you suggestions on which parameter you want to search. In this case we are going to search on “InstallTimePredictionSeconds =”

So just click on it to get it into the Search query as shown below. At this point we can put “Greater than” 60 and run the search query by clicking the search Icon on the right or hitting Enter:

printscreen-23-12-2015 0014

There we go… We have 6 patches that will take longer than 60 seconds to install, so we can take appropriate action regarding these patches in SCCM:

printscreen-23-12-2015 0015
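If you export the results and want to apply the same filter in a script, it could look like the sketch below. The property names are the ones shown in the search results; the records themselves and the helper slow_updates are hypothetical.

```python
def slow_updates(records, threshold_seconds=60):
    """Keep updates with a usable prediction above the threshold."""
    return [
        record
        for record in records
        if record["InstallTimeAvailable"]
        and record["InstallTimePredictionSeconds"] > threshold_seconds
    ]


# Hypothetical exported records, not real data from my workspace.
updates = [
    {"Title": "Update A", "InstallTimeAvailable": True, "InstallTimePredictionSeconds": 95.0},
    {"Title": "Update B", "InstallTimeAvailable": True, "InstallTimePredictionSeconds": 12.4},
    {"Title": "Update C", "InstallTimeAvailable": False, "InstallTimePredictionSeconds": 0.0},
    {"Title": "Update D", "InstallTimeAvailable": True, "InstallTimePredictionSeconds": 61.5},
]

for update in slow_updates(updates):
    print(update["Title"], update["InstallTimePredictionSeconds"])
```

Updates without an available prediction are skipped here on purpose, so they don’t end up in the wrong patch group by accident.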

 

This is just a small example of the huge amount of insights you can create with OMS to help you further tune the management of your environment.

SCOM: Connect management groups between on-prem and Azure

 

During a recent project I explored the benefits of hosting a 2 legged SCOM environment for both on-prem and cloud services. Although this is possible with just one management group and a site to site VPN to the cloud, the customer opted for a 2 management group approach to keep a divider between the on-prem and the cloud.

In this blog post (who knows it could become a series) I’ll show you how to connect the management groups to each other so they can exchange alerts and use 1 console but benefit from presence of a management group on both platforms.

In this scenario I’m going to use connected management groups, as explained here: http://technet.microsoft.com/en-us/library/hh230698.aspx

Connecting management groups in SCOM 2012 gives you a couple of benefits. The biggest one in my opinion is that you can have multiple management groups with different settings but use 1 console to get all the alerts. The customer wanted the ability to monitor their clients’ systems with different thresholds than their own systems. Their own systems were mainly on site, while the other systems were at the clients’ sites or in the cloud.

The management group which will have the consolidated view is called the local management group. In my example it is VLAB which is on prem. The other management groups are called “connected management groups” in this case VCLOUD.

They relate to each other in a hierarchical fashion, with connected groups in the bottom tier and the local group in the top tier. The connected groups are in a peer-to-peer relationship with each other. Each connected group has no visibility or interaction with the other connected groups; the visibility is strictly from the local group into the connected group.

So in this scenario it’s a good idea to connect these management groups to see all data in 1 console for both on-prem and client based. In VCLOUD it’s not possible to see the alerts of VLAB but the other way around it’s possible.

So what do we need to do to achieve this (even with different AD domains and firewalls in between)?

First of all prep the VCLOUD in Azure:

Create endpoints on Azure machine

In order to be able to reach the Azure management group from on prem we need to make sure that a connection to the VCLOUD management server is possible. This is done through ports 5723 and 5724.

Open the Azure management portal:

My server is called vcloud-ms1

printscreen-0231

Open the endpoints and add 5723 and 5724 to the endpoints. This in fact opens the firewall of azure to your machines. All communication will happen over these 2 ports.

printscreen-0232

Click add and fill in the endpoints as shown below.

printscreen-0233

Next, find the following:

  • The Public Virtual IP address (VIP) and take a note. In my case it’s 23.101.73.xxx
  • The DNS name: in my case vcloud-ms1.cloudapp.net

 

printscreen-0234

Prepare the onsite management server

Now that the management server of our VCLOUD management group is configured we need to configure the management server in our VLAB environment to become the local management group which will receive the alerts.

First we need to make sure that the onsite server can resolve AND reach the server in VCLOUD management group.

This can be done by changing the hosts file on the VLAB management server.

Go to c:\windows\system32\drivers\etc\ and open the hosts file:

printscreen-0235 

Note: I’ve deleted the last 3 digits of all the IP addresses above. You need to fill in the full IP address as documented in the Windows Azure console.
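If you want to double check the entry you just added, a few lines of Python can read the hosts file content and tell you which IP a name maps to. This is just a sketch: the address 203.0.113.10 is a documentation placeholder (fill in your own VIP), and mapping the short name vcloud-ms1 is an assumption, so use whatever name you put in your own hosts file.

```python
def parse_hosts(text):
    """Map every host name in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip_address, *names = line.split()
        for name in names:
            mapping[name.lower()] = ip_address
    return mapping


# Placeholder content; on the server you would read
# c:\windows\system32\drivers\etc\hosts instead.
sample_hosts = """
# Added for the connected management group
203.0.113.10    vcloud-ms1
"""

print(parse_hosts(sample_hosts).get("vcloud-ms1"))
```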

Let’s check whether this works now from the VLAB management server. Doing THE route check: ping the hostname:

printscreen-0236

Hmmm, not working. Did we configure something incorrectly? Check, double check. NO.

Well this makes perfect sense because PING IS DISABLED towards Azure machines. Therefore you will get a Request timed out every time you test, no matter what you configure!
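Since ping is no help here, a TCP connection test on the two ports is a better check. Below is a minimal sketch with Python’s standard socket module; the host name vcloud-ms1 is whatever you put in your hosts file, and the helper names are made up for this example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports=(5723, 5724)):
    """Test each port and return a {port: reachable} dictionary."""
    return {port: can_connect(host, port) for port in ports}


# Example call from the VLAB management server:
# print(check_ports("vcloud-ms1"))
```

If both ports come back True, name resolution and the Azure endpoints are fine, even though ping keeps timing out.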

Connecting the management groups

Now that we have both ends configured it’s time to see whether we can connect the management groups. Remember: initiate the connection from the local management group (the one who needs to see all alerts and is on top of the hierarchy)

So let’s connect to the management server in VLAB:

Open the Administration pane and select Connected Management Groups:

printscreen-0237

Right click and choose Add Management Group

printscreen-0238

Fill in all the data requested:

  • Management Group Name: The name of the VCLOUD management group
  • Management Server: The name of the management server in VCLOUD (make sure to use the exact name as filled in in the hosts file)
  • Account: Because the account we use as SDK service resides in the VLAB AD and is not known in the VCLOUD we need to use the VCLOUD credentials

printscreen-0239

Note: You need to initiate this from the management server where you have changed the hosts file, so make sure there’s a console on there.

You will get the message below because it’s not possible to validate the account in the local AD:

printscreen-0240

Just click next and normally you should be connected at this point:

printscreen-0241

Success!

So now all we have to do is configure what we want to show on the local management group.

 

I’ll explain this further in the next blog in this series.
