Splunk is a great tool for consolidating, processing and analysing voluminous data of all sorts, including syslog, Windows events/WMI, etc. It's pretty much a Google-search-engine equivalent for your IT environment, where you may have daily GBs or even TBs of raw logs and events to contend with. Furthermore, you're likely to find free Splunk apps to support and analyse the events of your favourite IT appliances and applications - sort of like iOS and Android apps to entertain and educate your kids (or your bosses, in this case).
It's great in almost every aspect except one - the daily indexing volume is limited by your Splunk licensing. And it doesn't come cheap - typically a high five-digit figure for a meagre single-digit GB of daily indexing volume. If you have a limited budget and an "unlimited" amount of data, you'll have to start "rationing" and decide which types of data are of interest and value.
Of course, you can let Splunk do the auto-filtering, but unwanted data still counts toward the daily volume because that filtering is performed after indexing. To "save" the daily volume limit, you have to filter out unwanted data before it reaches your Splunk indexer. Generally, there are two approaches: i) filter at the source devices (e.g. more specific and stringent ACL logging on Cisco IOS devices); ii) filter using regular expressions (REGEX) at a Splunk heavy forwarder (i.e. before the Splunk indexer) installed on the data source. Unwanted data is sent to the nullQueue for discard and wanted data is sent to the index queue. I will elaborate on this second method here. You may also want to check out this Splunk article.
How Splunk Moves Data through the Data Pipelines
First, we need to understand how data is consumed, processed and moved about in Splunk. You can read the full Splunk article; I will just briefly summarise it here in sequential order:
- Input : Data is fed into Splunk. No data processing here.
- Parsing : Analyse and transform the data according to regex transform rules.
- Indexing : Splunk takes the parsed events and writes them to the search index on disk.
- Search : Search through the indexed events. This is what you would see eventually.
Distributed Deployment
The next thing we need to understand is distributed deployment. We have to segregate the parsing/transformation stage from the indexer to ensure that filtered-out data does not count toward the indexing volume. For scalability reasons, 3 different Splunk roles can be set up across 2 or more machines:
- Forwarder : The input segment occurs here. For a heavy forwarder, parsing and partial indexing can also be performed before the data is sent to the Indexer.
- Indexer : Indexing activities are processed here. Indexed data is written to disk.
- Search Head : The search GUI for users. It searches across the various indexers to present the search results.
Minimally, we have to set up one heavy forwarder for input and REGEX parsing before the data is sent to the indexer.
Step-by-Step Distributed Setup for Pre-Indexing Filter
One prerequisite is a distributed setup whereby the Splunk Forwarder is separated from the Splunk Indexer. You may want to set up two separate virtual machines for testing purposes: one designated as a dedicated input Heavy Forwarder and the other designated as the Receiver cum Indexer cum Search Head.
Step 1: Setup Receiver and Heavy Forwarder
On the receiving indexer, under the Splunk path "etc/system/local", add the following lines to inputs.conf:
# you may substitute "9997" with another TCP port
[splunktcp://forwarder_IP:9997]
# ensure that the listener is enabled
disabled = 0
When you run "netstat -ano" on a Windows system, you should be able to see TCP 9997 as a listening port.
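Alternatively, the same listener can be enabled from the receiver's CLI; a quick sketch, run from the Splunk "bin" folder (substitute your own admin credentials):

splunk enable listen 9997 -auth admin:yourpassword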
On the Heavy Forwarder, add the following lines to inputs.conf:
# Specify a data source. Monitor files in the "C:\TestLog" folder, which is empty at this moment.
[monitor://C:\TestLog]
# specify a source type for later identification
sourcetype = cisco_syslog
disabled = 0
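Note that the heavy forwarder also needs to know where to send its data. If forwarding is not already configured, a minimal outputs.conf sketch in the same "etc/system/local" folder could look like this (the group name "primary_indexer" is just a placeholder; 10.1.1.96 is the receiver's IP used later in this example):

[tcpout]
defaultGroup = primary_indexer

[tcpout:primary_indexer]
server = 10.1.1.96:9997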
Step 2: Ensure communication between Heavy Forwarder and Receiver is working
Do not just rely on a network ping! Check splunkd.log under the Splunk path "var/log/splunk". You should see this line:
01-28-2012 21:48:18.502 +0800 INFO TcpOutputProc - Connected to idx=10.1.1.96:9997
Otherwise, you will see WARN or ERROR entries; rectify them accordingly. If you see messages like these:
01-28-2012 21:42:49.338 +0800 WARN TcpOutputFd - Connect to 10.1.1.96:9997 failed. No connection could be made because the target machine actively refused it.
01-28-2012 21:42:49.338 +0800 ERROR TcpOutputFd - Connection to host=10.1.1.96:9997 failed
Ensure that the correct forwarder IP address is specified on the listening host in [splunktcp://forwarder_IP:9997]. Otherwise, use [splunktcp://:9997] instead to allow inputs from any Splunk forwarder.
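You can also verify the forwarding status from the forwarder's CLI (run from the Splunk "bin" folder); an active connection to the receiver on port 9997 should be listed under "Active forwards":

splunk list forward-server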
Step 3: Configure Data Input on Heavy Forwarder
You'll have to specify the data source on the heavy forwarder. For testing purposes, I created a sample syslog file with just 4 lines of sample data and copied it to the "C:\TestLog" folder created earlier.
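For illustration only, a hypothetical 4-line sample could look like the one below; the exact content does not matter, as long as exactly one line contains the keyword "error".

Jan 28 21:40:01 10.1.1.1 router1: interface GigabitEthernet0/1 changed state to up
Jan 28 21:41:12 10.1.1.1 router1: configured from console by admin
Jan 28 21:42:33 10.1.1.1 router1: error: neighbor 10.1.1.2 adjacency lost
Jan 28 21:43:45 10.1.1.1 router1: list 101 permitted tcp 10.1.1.50 -> 10.1.1.96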
Step 4: Test that Distributed Processing is working
On the forwarder's Splunk Web, go to Manager -> Data Inputs -> Files and Directories. Check that the "Number of Files" processed on "C:\TestLog" is incremented by one.
On the indexer/receiver's Splunk Search app, check that all 4 lines have been indexed via the distributed setup. No filtering is being done yet.
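For a quick sanity check on the receiver's Search app (a sketch, assuming the sourcetype configured earlier), you can count the indexed events:

sourcetype=cisco_syslog | stats count

The count should be 4 at this point, since nothing is filtered yet.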
Step 5: Setup Data Filtering during Parsing Phase
You only need to work on the heavy forwarder in this step. Out of the four syslog lines, only one contains the keyword "error", which is used as the REGEX keyword here. To force the Splunk forwarder to send data to the parsing queue (for REGEX filtering) instead of going directly into the indexing queue, add the "queue = parsingQueue" line to the earlier inputs.conf:
[monitor://C:\TestLog]
sourcetype = cisco_syslog
queue = parsingQueue
disabled = 0
The "queue = parsingQueue" line causes Splunk to look up props.conf under the Splunk folder "etc/system/local". Create this new file and add the following lines:
# you may replace this stanza with another sourcetype. In this example, I'm using the sourcetype
# "cisco_syslog" to match the "C:\TestLog" monitor in the earlier inputs.conf
[cisco_syslog]
TRANSFORMS-set= setnull,setparsing
Create another file transforms.conf under the same folder and add the following lines:
[setnull]
# the single dot "." matches every event
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = error
DEST_KEY = queue
FORMAT = indexQueue
In this example, all lines (except the one containing the keyword "error") are discarded via the null queue. The order is important: in props.conf, setnull must come before setparsing in the TRANSFORMS-set list, otherwise the setnull rule is applied last and all data ends up discarded on the null queue. If you want the logic reversed (discard all lines containing "error" and index everything else), reference only setnull in props.conf (TRANSFORMS-set = setnull) and use the following stanza instead:
[setnull]
REGEX = error
DEST_KEY = queue
FORMAT = nullQueue
Restart Splunk for the new configuration to take effect. One quick way is to run the CLI command "splunk restart" from the Splunk "bin" folder, or click the "Restart" button in Splunk Web manager.
Step 6: Test Pre-Indexing Filter
I copied over another file (with a new file name) containing the same 4 lines. The file is consumed and sent over to the receiver. On the receiver's Search app, examine the contents of the indexed data by searching on the source field, or simply click the link on the search summary. Only the line containing the keyword "error" is indexed; the other 3 lines were discarded on the null queue during the parsing segment at the heavy forwarder.
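To double-check, a quick search restricted to the same sourcetype (a sketch, assuming the sourcetype configured earlier) should show that only one new event made it in:

sourcetype=cisco_syslog | stats count by source

Only the newly copied file should appear with a count of 1, confirming that the other 3 lines never reached the index.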