
6 Splunk Administration: Parsing and Manipulating Data

1 Splunk Enterprise Administration: Parsing and Manipulating Data


  • Event processing in Splunk Enterprise
  • Role of configuration files
    • props.conf
    • transforms.conf
  • Introduction to the course
  • Demo: Setting up the test environment

1-1 Event Processing in Splunk Enterprise

Data Pipeline in Splunk Enterprise



props.conf

  • Configuring pre-processing properties
  • Attribute/value pairs
  • Observe order of precedence
    • System local directory – highest
    • App local directory
    • App default directory
    • System default directory – lowest
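
The precedence order above can be pictured as a lookup that stops at the first directory defining a setting. This is a minimal sketch, not how Splunk itself is implemented; the tier labels, the `resolve` helper, and the sample values are all hypothetical.

```python
# Directory tiers from the list above, highest precedence first.
PRECEDENCE = ["system/local", "app/local", "app/default", "system/default"]

def resolve(setting, layers):
    """Return (value, tier) from the highest-precedence tier that defines `setting`."""
    for tier in PRECEDENCE:
        if setting in layers.get(tier, {}):
            return layers[tier][setting], tier
    return None, None

# Hypothetical values for one props.conf stanza, keyed by tier.
layers = {
    "system/default": {"SHOULD_LINEMERGE": "true", "MAX_DAYS_AGO": "2000"},
    "app/default": {"SHOULD_LINEMERGE": "false"},
    "app/local": {"MAX_DAYS_AGO": "10"},
}

print(resolve("SHOULD_LINEMERGE", layers))
print(resolve("MAX_DAYS_AGO", layers))
```

A setting defined only in a lower tier still applies; a higher tier overrides it attribute by attribute rather than replacing the whole stanza.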

Restart required


transforms.conf

  • Settings to configure data transformations
  • Covers the “how” part of things
  • Requires corresponding setting in props.conf
  • Unique stanza name
    • REGEX
    • DEST_KEY
    • FORMAT

Data Input Methods

  • Event Line-breaking

    • Identifying and configuring line-breaks
  • Parsing Timestamps

    • Recognizing and parsing timestamps
  • Field Extractions

    • Custom and default field extractions and overriding
  • Routing and Filtering

    • Filtering and sending events to selective and multiple indexers

1-2 Demo: Setting up the Test Environment

  • Understand topology of test environment
  • Setup test environment
    • Configure forwarding and receiving
    • Install EventGen and the demo app

Add New Receiving Port

Forward data

root@splunk_hf:/opt/splunk/etc/system/local# vi outputs.conf

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server =

root@splunk_hf:/opt/splunk/etc/system/local# /opt/splunk/bin/./splunk list forward-server

root@splunk_hf:/opt/splunk/etc/apps/psdemo/default# vi inputs.conf
sourcetype = customers
disabled = 0
interval = 60
index = main
source = linebreaking_script
sourcetype = linebreaking

disabled = 0
interval = 60
index = main
source = linemerging_script
sourcetype = linemerging

disabled = 0
interval = 60
index = main
source = configurelinebreaker_script
sourcetype = configlinebreaker

disabled = 0
interval = 60
index = main
source = timestamp_override_script
sourcetype = timestamp_override

index=main | stats count by sourcetype

2 Configuring Event Line-breaking

  • Event line-breaking
  • Demo: Breaking simple events
  • Demo: Breaking multiline events
  • Demo: Breaking events using line breaker

2-1 Configuring Line-breaks in Splunk Enterprise

Line breaking

  • Uses LINE_BREAKER regex that by default is: ([\r\n]+)

Line merging

  • Explicitly provide line break setting using attributes like BREAK_ONLY_BEFORE or MUST_BREAK_AFTER etc

Configuring Multi-line Event Boundaries

  • Break and re-assemble data stream
  • Add a stanza in props.conf
  • Set the line merging attributes

Breaking Data Stream into Real Events

  • More efficient but hard to work with
  • Add a stanza in props.conf
  • Set SHOULD_LINEMERGE = false
  • Set LINE_BREAKER attribute
    • Default is ([\r\n]+)
    • Set to regex matching event boundary
    • Restart Splunk
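
To make the LINE_BREAKER behaviour concrete, here is a rough Python approximation: the stream is cut wherever the regex matches, and the text captured by the first group is discarded. The `break_events` helper, the sample stream, and the second regex are illustrative assumptions, not Splunk's implementation or the demo's actual configuration.

```python
import re

def break_events(stream, line_breaker=r"([\r\n]+)"):
    """Split `stream` at each LINE_BREAKER match, discarding the text of capture group 1."""
    events, start = [], 0
    for m in re.finditer(line_breaker, stream):
        events.append(stream[start:m.start(1)])  # text before the group ends the current event
        start = m.end(1)                         # text after the group starts the next event
    if start < len(stream):
        events.append(stream[start:])
    return events

stream = (
    "<soe> first line\n2020-09-05 13:07 second line\n"
    "<soe> first line\n2020-09-05 13:07 second line"
)

# Default regex: every line becomes its own event.
print(len(break_events(stream)))
# Hypothetical regex: break only where a newline is followed by the <soe> marker.
print(break_events(stream, r"([\r\n]+)<soe>"))
```

With the default regex the four lines become four events; with the second regex each `<soe>` block stays together as one two-line event.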

2-2 Demo: Configuring Line-breaking Using a Regex

  • Configuring simple line breaking
  • Based on regex pattern
  • Configurations in props.conf
    • MUST_BREAK_AFTER = <regex>
  • Restart Splunk
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi default/inputs.conf

disabled = 0
index = main
source = perfmon_script
sourcetype = performance

disabled = 0
interval = 300
index = main
source = custgen_script
sourcetype = customers

disabled = 0
interval = 60
index = main
source = linebreaking_script
sourcetype = linebreaking
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi bin/lineBreaking.py

import random as r

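for i in range(1, 10):
    # Keep the trailing newline inside print(); appending it to print()'s return value fails in Python 3
    print("This is an event with id" + str(r.randrange(111, 999)) + "\n")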

index=main sourcetype=linebreaking


vi bin/lineBreaking.py

vi local/props.conf

root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/splunk restart

2-3 Demo: Breaking Multi-line Events and Line-merging

  • Line breaking in multiline events
  • Based on regex pattern
  • Configurations in props.conf
    • BREAK_ONLY_BEFORE = <regex>
  • Restart Splunk
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi default/inputs.conf

disabled = 0
index = main
source = perfmon_script
sourcetype = performance

disabled = 0
interval = 300
index = main
source = custgen_script
sourcetype = customers

disabled = 0
interval = 60
index = main
source = linebreaking_script
sourcetype = linebreaking

disabled = 0
interval = 60
index = main
source = linemerging_script
sourcetype = linemerging


from datetime import datetime as dt

def getTimestamp():
    return dt.now().strftime("%Y-%m-%d %H:%M")

for i in range(1,10):
    print ("<soe> The temperature received from the sensors was NORMAL. Temperature recorded at: ")
    print (getTimestamp()+" was 38 degrees celcius. This is the end of event")
root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/python bin/lineMerging.py

<soe> The temperature received from the sensors was NORMAL. Temperature recorded at:
2020-09-05 13:07 was 38 degrees celcius. This is the end of event
<soe> The temperature received from the sensors was NORMAL. Temperature recorded at:
2020-09-05 13:07 was 38 degrees celcius. This is the end of event
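
The two-line events above can be re-assembled by starting a new event only at lines that match a marker. The sketch below approximates that idea in Python; the `<soe>` regex is an assumption (the props.conf used in the demo is not shown in these notes), and `merge_lines` is a hypothetical helper rather than Splunk's own logic.

```python
import re

def merge_lines(lines, break_only_before):
    """Start a new event only at lines matching `break_only_before`; append other lines to the current event."""
    pattern = re.compile(break_only_before)
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

raw_lines = [
    "<soe> The temperature received from the sensors was NORMAL. Temperature recorded at:",
    "2020-09-05 13:07 was 38 degrees celcius. This is the end of event",
    "<soe> The temperature received from the sensors was NORMAL. Temperature recorded at:",
    "2020-09-05 13:07 was 38 degrees celcius. This is the end of event",
]

events = merge_lines(raw_lines, r"<soe>")
print(len(events))
print(events[0])
```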

index=main sourcetype=linemerging





/opt/splunk/bin/splunk restart

index=main sourcetype=linemerging


2-4 Demo: Configuring Line-breaking with Line Breaker Attribute

  • Configuring simple line breaking
  • More efficient method
  • Configurations in props.conf
    • SHOULD_LINEMERGE = false
    • LINE_BREAKER = <regex>
  • Restart Splunk
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi default/inputs.conf

sourcetype = performance

disabled = 0
interval = 300
index = main
source = custgen_script
sourcetype = customers

disabled = 0
interval = 60
index = main
source = linebreaking_script
sourcetype = linebreaking

disabled = 0
interval = 60
index = main
source = linemerging_script
sourcetype = linemerging

disabled = 0
interval = 60
index = main
source = configurelinebreaker_script
sourcetype = configlinebreaker
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi bin/configureLineBreaker.py


for i in range(1, 10):
    print("<This is line number: " + str(i) + " of event>")

index=main sourcetype = configlinebreaker


root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/python bin/configureLineBreaker.py

<This is line number: 1 of event>
<This is line number: 2 of event>
<This is line number: 3 of event>
<This is line number: 4 of event>
<This is line number: 5 of event>
<This is line number: 6 of event>
<This is line number: 7 of event>
<This is line number: 8 of event>
<This is line number: 9 of event>
vi local/props.conf

/opt/splunk/bin/splunk restart

index=main sourcetype = configlinebreaker




/opt/splunk/bin/splunk restart

index=main sourcetype="configlinebreaker"

3 Identifying and Parsing Timestamps

  • Timestamp properties in Splunk
  • Timestamp attributes
  • Timestamp formats
  • Demo: Time-zone recognition
  • Demo: Timestamp attributes in Splunk Web
  • Demo: Timestamp overriding

3-1 Configuring Timestamp Recognition in Splunk Enterprise

  • Nothing without a timestamp

    • All events must have a timestamp. If not, Splunk assigns one
  • Edit props.conf

    • To configure or modify timestamps we edit props.conf
  • Timestamp recognition

    • Splunk recognizes most of the timestamp formats out-of-the box
  • Enhanced strptime() support

  • Timestamp validity attributes

    • For accepting or rejecting the timestamp in the events

Editing Timestamp Properties

  • Timestamp configurations are done on heavy forwarders or indexers
  • Configuration applied to <sourcetype>, source::<source> or host::<host>
  • Identifying correct format of the timestamp is the key
  • Select the correct timestamp if multiple timestamps are present
  • Account for correct time-zones

Timestamp Attributes


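  • DATETIME_CONFIG

    • Provide a file that contains timestamp formats. NONE disables timestamp recognition and CURRENT assigns the current time
    • Default: $SPLUNK_HOME/etc/datetime.xml
  • TIME_PREFIX

    • Regex pattern found before each timestamp
    • Default: empty string
  • MAX_TIMESTAMP_LOOKAHEAD

    • Number of characters from start of event to search for the timestamp
    • Default: 128 characters
  • TIME_FORMAT

    • strptime() format string
    • Default: empty string
  • TZ

    • Time zone of the event data, for example UTC or Australia/Brisbane (UTC+10:00)
    • Default: empty string
  • MAX_DAYS_AGO

    • Timestamps older than this many days are rejected. With MAX_DAYS_AGO=10, a timestamp more than 10 days old is not accepted
    • Default: 2000 days
  • MAX_DAYS_HENCE

    • Timestamps further in the future than this many days are rejected. With MAX_DAYS_HENCE=3, a timestamp more than 3 days ahead is not accepted
    • Default: 2 days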

Timestamp Format

  • %d, %m, %Y Number of day, month and year (four digit). %y for two digits
  • %b, %B: Abbreviated month (Jan, Jun), Full name (January, June)
  • %H, %I, %p, %M, %S: Hour (24 hours) %I (12 hours) with %p as locale’s equivalent of AM or PM. %M minutes and %S seconds
  • %Z Time zone abbreviations like GMT
  • %z, %:z, %::z: Time zone offset +1000, +10:00 and +10:00:00
  • %s Epoch time (10 digits)
  • %N For GNU date-time nanoseconds. Sub-seconds %3N, %6N
  • %+ For standard Unix date format timestamps
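
Many of the format directives above behave the same way in Python's `strptime`, which makes it a convenient place to try a TIME_FORMAT string before putting it in props.conf. The sketch below checks three timestamp strings that appear in the demos of these notes. It is only an approximation: Splunk-specific directives such as %3N are not accepted by Python, and month names assume an English locale.

```python
from datetime import datetime

# (timestamp text, format string) pairs taken from the demo events in these notes.
samples = [
    ("30Mar-1998:14:03:13", "%d%b-%Y:%H:%M:%S"),
    ("2020, September 20 at 16:08:45", "%Y, %B %d at %H:%M:%S"),
    ("2020-09-19 01:50", "%Y-%m-%d %H:%M"),
]

parsed = [datetime.strptime(text, fmt) for text, fmt in samples]

for (text, fmt), value in zip(samples, parsed):
    print(f"{fmt!r:28} {text!r:34} -> {value.isoformat()}")
```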

3-2 Demo: Translating Time Zones

Configuring Time-zone

  • Sample events in UTC time
  • Adding configuration to props.conf
  • Converting to local time before indexing
root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/ 
root@splunk_hf:/opt/splunk/etc/apps/psdemo# ll bin

vi bin/utcTimestamp.py

from time import strftime, gmtime
for i in range(0,5) :
    print(strftime("%Y-%m-%d %H:%M" ,gmtime())+" This event is originally in UTC time with id: " + str(i))

root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/splunk cmd python bin/utcTimestamp.py

2020-09-19 01:50 This event is originally in UTC time with id: 0
2020-09-19 01:50 This event is originally in UTC time with id: 1
2020-09-19 01:50 This event is originally in UTC time with id: 2
2020-09-19 01:50 This event is originally in UTC time with id: 3
2020-09-19 01:50 This event is originally in UTC time with id: 4
root@splunk_hf:/opt/splunk/etc/apps/psdemo# date
Sat Sep 19 11:50:16 AEST 2020
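
The sample events read 01:50 while the local clock reads 11:50 AEST, so the events need to be interpreted as UTC before they line up with local time. The arithmetic can be sketched as follows; the fixed +10:00 offset is an assumption based on the `date` output above, and the snippet only illustrates the conversion, not Splunk's internals.

```python
from datetime import datetime, timedelta, timezone

AEST = timezone(timedelta(hours=10), "AEST")  # fixed offset, matching the `date` output above

raw = "2020-09-19 01:50 This event is originally in UTC time with id: 0"

# Interpret the leading 16 characters as a UTC timestamp.
event_time = datetime.strptime(raw[:16], "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
local_time = event_time.astimezone(AEST)

print(local_time.strftime("%Y-%m-%d %H:%M %Z"))
```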
disabled = 0
index = main
source = utcts_script
sourcetype = utc_timestamp

disabled = 0
index = main
source = perfmon_script
sourcetype = performance

disabled = 0
interval = 300
index = main
source = custgen_script
sourcetype = customers

disabled = 0
interval = 60
index = main
source = linebreaking_script
sourcetype = linebreaking

disabled = 0
interval = 60
index = main
sourcetype = linemerging

index=main sourcetype=utc_timestamp


  • Timestamp format: %d%b-%Y:%H:%M:%S

  • Timestamp prefix \



root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi local/props.conf
root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/splunk restart

index=main sourcetype=utc_timestamp


3-3 Demo: Using Splunk Web for Timestamp Recognition and Extraction

Configure timestamp through Splunk Web

  • Events with non-standard timestamp
  • Very old events
  • Timestamp prefix value
  • Timestamp recognition & line breaking
  • Stanza in props.conf
root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi data/web_old.log

-- [30Mar-1998:14:03:13] "GET /cart.do?action=addtocart&itemId=HYD-29&JSESSIONID=CA9MO3AZIUSANA34656 HTTP 1.1" 503 694 "http://www.salesonline.com/category.screen?categoryId=BLUE_GIZMOS" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/20070725 Firefox/" 444


index=main sourcetype=old_weblogs



[old_weblogs]
LINE_BREAKER = ([\r\n]+)
TIME_FORMAT = %d%b-%Y:%H:%M:%S
category = Custom
description = Very old web server logs with weird date format
pulldown_type = true

3-4 Demo: Overriding Timestamps

Overriding event timestamp

  • Events with multiple timestamps
  • Custom timestamp
  • Configuration in props.conf
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi bin/timestampOverride.py

import datetime, time
for i in range(0,5):
    # The second timestamp, embedded in the event text, is 90 minutes behind the leading one
    print (datetime.datetime.now().strftime("%d/%m/%Y %H:%M:%S") + " is the timestamp on this event number: " +
        str(i) + ". But I want timestamp=" + (datetime.datetime.now() - datetime.timedelta(minutes=90)).strftime("%Y, %B %d at %H:%M:%S"))
root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/splunk cmd python bin/timestampOverride.py

20/09/2020 17:38:45 is the timestamp on this event number: 0. But I want timestamp=2020, September 20 at 16:08:45
20/09/2020 17:38:50 is the timestamp on this event number: 1. But I want timestamp=2020, September 20 at 16:08:50

index=main sourcetype=timestamp_override




[old_weblogs]
LINE_BREAKER = ([\r\n]+)
TIME_FORMAT = %d%b-%Y:%H:%M:%S
category = Custom
description = Very old web server logs with weird date format
pulldown_type = true

[timestamp_override]
TIME_PREFIX = timestamp=
TIME_FORMAT = %Y, %B %d at %H:%M:%S
root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/splunk restart
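
The effect of TIME_PREFIX and TIME_FORMAT on an event with two timestamps can be sketched in Python: skip to the end of the prefix match, then parse what follows. The `extract_time` helper is hypothetical and simplified (it expects the timestamp to run to the end of the event); it is meant only to show why the second timestamp is the one picked up.

```python
import re
from datetime import datetime

TIME_PREFIX = r"timestamp="
TIME_FORMAT = "%Y, %B %d at %H:%M:%S"

def extract_time(event):
    """Parse the timestamp that starts right after the first TIME_PREFIX match."""
    m = re.search(TIME_PREFIX, event)
    if m is None:
        return None
    return datetime.strptime(event[m.end():], TIME_FORMAT)

event = ("20/09/2020 17:38:45 is the timestamp on this event number: 0. "
         "But I want timestamp=2020, September 20 at 16:08:45")

print(extract_time(event))
```

Without the prefix, the leading 20/09/2020 17:38:45 would be the first candidate found; with it, parsing starts only after `timestamp=`.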

index=main sourcetype=timestamp_override


  • Timestamp recognition in Splunk
  • Timestamp attributes
  • Translating time-zones
  • Using Splunk web for timestamp handling
  • Overriding timestamp

4 Overriding Default Fields and Custom Fields Extraction

  • Default fields in Splunk Enterprise
  • Process of overriding default fields
  • Demo: Overriding default fields
  • Index-time field extraction
  • Demo: Custom fields extraction

4-1 Overriding Default Fields

Default Fields

  • Tagged and added automatically to event data at index time
  • Each default field holds information about the data it is tagged to
  • Three types of default fields: Internal, basic and datetime
  • Overridden when default behavior needs to be changed
  • Change does not apply to previously indexed events

  • Internal

    • _raw, _time, _indextime, _cd

Default Field Override – Scenarios

Default fields that we often need to override

  • host
  • source
  • sourcetype

Some examples:

  • Host is replaced when events originate from log server and host value exists in the event
  • Source is overridden when file name is replaced with more suitable text
  • Sourcetype is changed based on existence of some text or specific pattern in the event

Overriding Default Fields at Input Creation Time

  • Through Splunk Web
  • In configuration files
  • Provide values for relevant attribute
    • host
    • source
    • sourcetype
    • index


  • Datetime
    • date_hour, date_minute, date_second
    • date_mday, date_wday, date_month, date_year, date_zone

Splunk Enterprise Administration: Monitoring and Creating Inputs

Overriding Default Fields in Existing Inputs

  • Define suitable value to assign
    • Static value
    • Regular expression
  • Identify events to apply change
    • Regex based
  • Write stanzas in configuration files
    • transforms.conf
    • props.conf
  • Restart Splunk

Overriding “host” Field

Configuration to override the host field



[my_field_override_stanza_name]
REGEX = \w+\s\d+\s\d+:\d+:\d+\s(router\d+)\s
FORMAT = host::$1
DEST_KEY = MetaData:Host

Apr 11 10:32:22 router1 mgd[3606]: UI_DBASE_LOGOUT_EVENT: User 'smith' exiting configuration mode

Apr 11 11:36:15 switch2 mgd[3606]: UI_COMMIT: User 'root' performed commit: no comment

Apr 11 11:46:37 router5 mib2d[2905]: SNMP_TRAP_LINK_DOWN: ifIndex 82, ifAdminStatus up(1), ifOperStatus
down(2), ifName at-1/0/0

Overriding “host” Field

Configuration to override the host field



TRANSFORMS-override_host = my_field_override_stanza_name

Apr 11 10:32:22 router1 mgd[3606]: UI_DBASE_LOGOUT_EVENT: User 'smith' exiting configuration mode

Apr 11 11:36:15 switch2 mgd[3606]: UI_COMMIT: User 'root' performed commit: no comment

Apr 11 11:46:37 router5 mib2d[2905]: SNMP_TRAP_LINK_DOWN: ifIndex 82, ifAdminStatus up(1), ifOperStatus
down(2), ifName at-1/0/0
  • host = router1
  • host = old_value
  • host = router5
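
The three results above can be reproduced by running the REGEX from the transform against the sample events in Python. This is a sketch of the matching step only; `override_host` is a hypothetical helper and `old_value` stands in for whatever host the input originally assigned.

```python
import re

HOST_REGEX = re.compile(r"\w+\s\d+\s\d+:\d+:\d+\s(router\d+)\s")

events = [
    "Apr 11 10:32:22 router1 mgd[3606]: UI_DBASE_LOGOUT_EVENT: User 'smith' exiting configuration mode",
    "Apr 11 11:36:15 switch2 mgd[3606]: UI_COMMIT: User 'root' performed commit: no comment",
    "Apr 11 11:46:37 router5 mib2d[2905]: SNMP_TRAP_LINK_DOWN: ifIndex 82, ifAdminStatus up(1), "
    "ifOperStatus down(2), ifName at-1/0/0",
]

def override_host(event, current_host="old_value"):
    """Return the captured router name, or the existing host when the regex does not match."""
    m = HOST_REGEX.search(event)
    return m.group(1) if m else current_host

hosts = [override_host(e) for e in events]
print(hosts)
```

The switch2 event keeps its original host because the capture group only accepts names of the form router followed by digits.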

4-2 Overriding Sourcetype

  • Use-case scenario
  • Creating regular expression
  • Setting up configuration files
  • Testing Results

Overriding Sourcetype: Use-Case Scenario


root@splunk_hf:/opt/splunk/etc/apps/psdemo/default# /opt/splunk/bin/python ../bin/getPerformance.py

2020-07-08 11:57:33 src=srv_pssepm01, cpu=59, ram=81, disk=37
2020-07-08 11:57:33 src=srv_psmail01, cpu=56, ram=85, disk=32
2020-07-08 11:57:33 src=srv_psmail02, cpu=58, ram=82, disk=35
2020-07-08 11:57:33 src=srv_psdc01, cpu=80, ram=82, disk=37
2020-07-08 11:57:33 src=srv_psdhcp01, cpu=79, ram=93, disk=32

index=main source=perfmon_script


  • sourcetype=performance:srv
  • sourcetype=performance:ws
root@splunk_hf:/opt/splunk/etc/apps/psdemo/default# cd ../local


[source::perfmon_script]
TRANSFORMS-changesourcetype = performance_sourcetype_override

[performance_sourcetype_override]
REGEX = src\=(ws|srv)\_
FORMAT = sourcetype::performance:$1
DEST_KEY = MetaData:Sourcetype
root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart
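
Before restarting, the REGEX and FORMAT pair can be checked against sample events in Python. The sketch below mimics the substitution of `$1` into `performance:$1`; the workstation event is hypothetical (only server events are shown in the sample output above), and `override_sourcetype` is an illustrative helper.

```python
import re

REGEX = re.compile(r"src\=(ws|srv)\_")

def override_sourcetype(event, current="performance"):
    """Return performance:<ws|srv> when the regex matches, else keep the current sourcetype."""
    m = REGEX.search(event)
    return f"performance:{m.group(1)}" if m else current

events = [
    "2020-07-08 11:57:33 src=srv_pssepm01, cpu=59, ram=81, disk=37",
    "2020-07-08 11:57:33 src=ws_example01, cpu=12, ram=40, disk=20",  # hypothetical workstation event
    "2020-07-08 11:57:33 cpu=12, ram=40, disk=20",                    # hypothetical event without src
]

results = [override_sourcetype(e) for e in events]
print(results)
```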

index=main source=perfmon_script


4-3 Creating Custom Fields at Index-time

Index-time Field Extractions

  • Fields extracted and written to metadata
  • Similar to default fields overriding process
    • transforms.conf
    • props.conf
    • fields.conf
  • Restart Splunk after changes


  • Changes to transforms.conf
  • REGEX = <suitable regular exp>
    • Regular expression that identifies the events to change and captures the groups whose values are assigned to the extracted fields
  • FORMAT = <key>::<val>
    • For example: key1::$1 key2::"some string" $2::$3
    • Default value is <stanza_name>::$1
    • Not required if you are using named capturing groups, for example: (?<_KEY_1>[^\=]*)\=(?<_VAL_1>[^\,]*)\,
    • _KEY_<string> and _VAL_<string> are special capturing groups
  • WRITE_META = true|false
    • Only valid for index-time extractions. It writes REGEX to metadata automatically. Default value is false
  • DEFAULT_VALUE = <string>
    • The value to write if the regex fails to match
  • SOURCE_KEY = <key>
    • The key to which the regex is applied. It must be present at the time of this extraction. Default is _raw
  • REPEAT_MATCH = true|false
    • When true, the regex is applied repeatedly to SOURCE_KEY after the first match is found
  • LOOKAHEAD = <integer>
    • How far to look ahead in an event for the regex match. Default is 4096


TRANSFORMS-<name> = <stanza_name>

Reference the transform created previously from props.conf using TRANSFORMS- keyword for index time extractions.

You’ll use EXTRACT- in the case of search time extractions



INDEXED = true
  • Set in fields.conf, in a stanza named after the extracted field
  • Tells Splunk to extract and store the field in metadata at index time.
    • If not set to true, the field will not be extracted at index time.
    • Default is false. Check for index & search time extraction conflicts

4-4 Demo: Creating Custom Fields at Index-time

  • Identifying custom fields
  • Changes in configuration files
  • Check results in Splunk Enterprise

index=main sourcetype=netflow


root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/local/

root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# vi transforms.conf 

[performance_sourcetype_override]
REGEX = src\=(ws|srv)\_
FORMAT = sourcetype::performance:$1
DEST_KEY = MetaData:Sourcetype

[netflow_custom_fields]
REGEX = (tcp|udp|icmp)\.(.*?)\s
FORMAT = protocol::$1 protocol_detail::$2
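
How a REGEX and FORMAT pair turns into key/value fields can be sketched in Python by expanding each `$n` with the matching capture group. The event text below is hypothetical, since the real netflow events are not reproduced in these notes, and `extract_fields` is an illustrative helper rather than Splunk's parser.

```python
import re

REGEX = re.compile(r"(tcp|udp|icmp)\.(.*?)\s")
FORMAT = "protocol::$1 protocol_detail::$2"

def extract_fields(event):
    """Expand $n in FORMAT with the regex capture groups and return the key::value pairs as a dict."""
    m = REGEX.search(event)
    if m is None:
        return {}
    expanded = re.sub(r"\$(\d+)", lambda g: m.group(int(g.group(1))), FORMAT)
    return dict(pair.split("::", 1) for pair in expanded.split())

# Hypothetical netflow-style event used only for illustration.
event = "2020-09-20 10:15:02 flow=tcp.https bytes=5120 "

print(extract_fields(event))
```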


[source::perfmon_script]
TRANSFORMS-changesourcetype = performance_sourcetype_override

TRANSFORMS-netflow = netflow_custom_fields



root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart

index=main sourcetype=netflow


  • Default fields in Splunk Enterprise
  • Overriding default fields – why and how
  • Demo: Overriding default fields
  • Index time extractions – why and how
  • Demo: Extracting custom fields

5 Routing and Filtering Events

  • Routing event data
    • Route all or specific events
    • Indexer groups
    • Default routing
  • Filtering events
    • Filtering queues
    • Filtering in/out events
  • Demo: Routing and filtering events

5-1 Routing Event Data

  • Based on host, source, source type or a pattern in events
  • Pattern based routing can only be done on a full Splunk instance
  • Universal Forwarders route data based on host, source or sourcetype
  • Can forward to multiple target indexer groups
  • Can be routed to third party systems

Routing Event Data


index=main source=WinEventLog:Security


Configuring Event Routing

A sample stanza that routes events to target groups.


REGEX = action=\"blocked\"
DEST_KEY = _TCP_ROUTING
FORMAT = indexerGroup1,indexerGroup2
  • REGEX is a regular expression that captures the events to be routed. (.) or just a '.' matches all events. By default, all events are routed to the default indexer group
  • The DEST_KEY attribute tells Splunk what to do with the events captured in the previous step; _TCP_ROUTING sends them to the listed output groups
  • The FORMAT attribute holds the indexer group or groups to which these events are routed


TRANSFORMS-routing = stanza1, stanza2
  • Reference the stanza created in transforms.conf from props.conf


server =,

server = indexer3:9997,indexer4:9997

defaultGroup = myFirstIndexerGroup
  • Groups are defined with suitable names and list of server ip/dns name:port
  • Default group is selected

5-2 Demo: Routing Event Data

  • Part I
    • Identify subset of events to route
    • Understanding current topology
    • Route the subset of events
  • Part II
    • Route same events to multiple indexers
    • Overriding default indexer group

Overriding Sourcetype: Use-Case Scenario


Routing Events


index=main source=WinEventLog:Security EventCode=4625


root@splunk_hf:/# cd /opt/splunk/bin/
root@splunk_hf:/opt/splunk/bin# ./splunk list forward-server

Active forwards:
Configured but inactive forwards:

root@splunk_hf:/opt/splunk/bin# cd ../etc/system/local/
root@splunk_hf:/opt/splunk/etc/system/local# vi outputs.conf




root@splunk_hf:/opt/splunk/etc/system/local# /opt/splunk/bin/./splunk restart

root@splunk_hf:/opt/splunk/etc/system/local# /opt/splunk/bin/./splunk list forward-server

Active forwards:
Configured but inactive forwards:


vi transforms.conf

REGEX = EventCode=4625
FORMAT = filtered_indexer

vi props.conf


root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart

index=main source=WinEventLog:Security EventCode=4625


vi transforms.conf

REGEX = EventCode=4625
FORMAT = filtered_indexers, psdemo_indexers
root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart


root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# vi transforms.conf

REGEX = (.)
FORMAT = psdemo_indexers

REGEX = EventCode=4625
FORMAT = filtered_indexers, psdemo_indexers


root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart
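
The combined effect of the two routing transforms can be sketched as follows: each transform is tried in order, and a later match overwrites the target groups chosen by an earlier one. The stanza names come from the TRANSFORMS-routing setting used in this demo; pairing them with the two REGEX/FORMAT blocks in this order is an assumption, as are the sample events.

```python
import re

# (stanza name, REGEX, FORMAT) in the order the transforms are listed.
ROUTING_TRANSFORMS = [
    ("allwinSecurityEventsRouting", r"(.)", "psdemo_indexers"),
    ("loginFailedRouting", r"EventCode=4625", "filtered_indexers, psdemo_indexers"),
]

def route(event):
    """Return the indexer groups an event would be sent to under the transforms above."""
    groups = []
    for _name, regex, fmt in ROUTING_TRANSFORMS:
        if re.search(regex, event):
            groups = [g.strip() for g in fmt.split(",")]  # a later match overwrites an earlier one
    return groups

# Hypothetical Windows Security events.
print(route("LogName=Security EventCode=4625 An account failed to log on"))
print(route("LogName=Security EventCode=4624 An account was successfully logged on"))
```

Failed logons go to both groups, while every other event goes only to psdemo_indexers.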


6 Manipulating Raw Data

  • Anonymize, mask or modify event data
    • Configuration and working
    • Demo
  • Regex transform
    • Configuration and working
    • Demo

6-1 Manipulating Raw Data in Splunk Enterprise

Manipulating Raw Data in Splunk

  • Anonymize, mask or delete sensitive or unwanted information
  • Configure heavy forwarders or indexers to manipulate arriving data
  • Splunk Cloud customers use a heavy forwarder
  • Two methods: SEDCMD like sed script or regex transform
  • Use stanza based on host, source, sourcetype to select events

6-2 Manipulating Events Using SEDCMD


  • Configured through props.conf
  • Anonymize with sed script

    • SEDCMD-<class>
    • s/<regex>/<replace>/flags
    • flags: g – global or a number
    • Applied to _raw only
  • Replace characters with sed script

    • SEDCMD-<class> = y/<string1>/<string2>/
    • y/abc/ABC/
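
Both SEDCMD forms have close Python equivalents, which is handy for trying an expression before committing it to props.conf. In this sketch `re.sub` stands in for `s/<regex>/<replace>/g` and `str.translate` for `y/<string1>/<string2>/`; the helper names and sample strings are illustrative, and sed and Python regex dialects differ in places.

```python
import re

def sed_substitute(raw, regex, replacement):
    """Rough equivalent of s/<regex>/<replacement>/g applied to _raw."""
    return re.sub(regex, replacement, raw)

def sed_translate(raw, string1, string2):
    """Rough equivalent of y/<string1>/<string2>/: map each character of string1 to string2."""
    return raw.translate(str.maketrans(string1, string2))

masked = sed_substitute('CreditCard="1234-4567-7890-0123"', r"(\d{4}\-){3}", "XXXX-XXXX-XXXX-")
swapped = sed_translate("aabbcc-xyz", "abc", "ABC")

print(masked)
print(swapped)
```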

6-3 Demo: Manipulating Events Using SEDCMD

  • Part I
    • Masking data using SEDCMD script
    • 1234-4567-7890-0123
    • XXXX-XXXX-XXXX-0123
  • Part II
    • Replacing data with SEDCMD script
    • +61 412 234 456
    • (+61) (412 234 456)

index=main source=WinEventLog:Security


root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/local/
root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# ll



[allwinSecurityEventsRouting]
REGEX = (.)
FORMAT = psdemo_indexers

[loginFailedRouting]
REGEX = EventCode=4625
FORMAT = filtered_indexers, psdemo_indexers

[setNull]
REGEX = EventCode=(4624|4634)
DEST_KEY = queue
FORMAT = nullQueue


TRANSFORMS-routing = allwinSecurityEventsRouting, loginFailedRouting
TRANSFORMS-null = setNull
root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# /opt/splunk/bin/./splunk restart
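
The filtering rule above sends events matching the regex to nullQueue, which discards them, while everything else continues on to be indexed. A minimal Python sketch of that decision, using hypothetical events and a hypothetical `target_queue` helper:

```python
import re

SET_NULL = re.compile(r"EventCode=(4624|4634)")

def target_queue(event):
    """Return nullQueue (discard) for events matching the setNull REGEX, else indexQueue (keep)."""
    return "nullQueue" if SET_NULL.search(event) else "indexQueue"

# Hypothetical Windows Security events.
for code in (4624, 4625, 4634):
    print(code, target_queue(f"LogName=Security EventCode={code}"))
```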

index=main source=WinEventLog:Security


Filtering and Routing Events



[allwinSecurityEventsRouting]
REGEX = (.)
FORMAT = psdemo_indexers

[loginFailedRouting]
REGEX = EventCode=4625
FORMAT = filtered_indexers, psdemo_indexers

[setNull]
DEST_KEY = queue
FORMAT = nullQueue

[setParsing]
REGEX = EventCode=(4624|4625)
DEST_KEY = queue
FORMAT = indexQueue


TRANSFORMS-routing = allwinSecurityEventsRouting, loginFailedRouting
TRANSFORMS-null = setNull, setParsing
/opt/splunk/bin/./splunk restart


index=main source=WinEventLog:Security

root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/
root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi default/inputs.conf

import random
from datetime import datetime

def getCreditCardNum():
    # Four 4-digit groups joined by hyphens, e.g. 1234-4567-7890-0123.
    # The first group is initialised here; that line is missing from the notes.
    creditCardNum = str(random.randint(2222, 9999))
    for _ in range(3):
        creditCardNum += "-" + str(random.randint(2222, 9999))
    return creditCardNum

def getMobileNum():
    # Mobile number in the form +61 42x xxx xxx
    mobileNum = "+61 42" + str(random.randint(0, 9))
    for _ in range(2):
        mobileNum += " " + str(random.randint(111, 999))
    return mobileNum

def getTimeStamp():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# The name lists are left empty in the notes
firstNames = [""]
lastNames = [""]

for firstName in firstNames:
    record = (getTimeStamp() + ' CustomerName="' + firstName + ' ' + random.choice(lastNames) + '", '
              + 'CreditCard="' + getCreditCardNum() + '", MobileNumber="' + getMobileNum() + '"')
    print(record)

index=main sourcetype=customers


root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi local/props.conf

SEDCMD-cc = s/(\d{4}\-){3}/XXXX-XXXX-XXXX-/g

root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/./splunk restart

index=main sourcetype=customers


root@splunk_hf:/opt/splunk/etc/apps/psdemo# vi local/props.conf

SEDCMD-cc = s/(\d{4}\-){3}/XXXX-XXXX-XXXX-/g
SEDCMD-mn = s/MobileNumber\=\"(\+61)\s([^\"]*)/MobileNumber="0\2/g
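
Applying the two expressions one after the other can be tried out in Python with `re.sub`. The customer record below is hypothetical, shaped like the output of the generator script, and the snippet only approximates what the sed scripts do to `_raw`.

```python
import re

# Hypothetical customer event in the shape produced by the generator script.
raw = 'CustomerName="Jane Doe", CreditCard="1234-4567-7890-0123", MobileNumber="+61 412 234 456"'

# Equivalent of SEDCMD-cc: mask the first three groups of the card number.
step1 = re.sub(r"(\d{4}\-){3}", "XXXX-XXXX-XXXX-", raw)

# Equivalent of SEDCMD-mn: drop the +61 prefix and keep the rest behind a leading 0.
step2 = re.sub(r'MobileNumber\=\"(\+61)\s([^\"]*)', r'MobileNumber="0\2', step1)

print(step2)
```

The second capture group holds everything after the country code, which is why `\2` is the only group reused in the replacement.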

vi local/props.conf

root@splunk_hf:/opt/splunk/etc/apps/psdemo# /opt/splunk/bin/./splunk restart

index=main sourcetype=customers


SEDCMD-cust = s/(\d{4}\-){3}/XXXX-XXXX-XXXX-/g s/MobileNumber\=\"(\+61)\s([^\"]*)/MobileNumber="0\2/g

6-4 Manipulating Events Using Regex Transform

REGEX Transform

  • Configured through props and transforms
  • transforms.conf

    • REGEX – a PCRE regex
    • FORMAT – arrange event post change
    • DEST_KEY – applied to field, e.g. _raw
  • props.conf

    • TRANSFORMS-<class>
    • Comma separated list of transforms

6-5 Demo: Manipulating Events Using Regex Transform

Anonymizing data with regex transform

  • Masking credit card number
  • 1234-4567-7890-0123
  • Creating a suitable regex
  • Configuring transforms and props.conf
  • Verifying results

props.conf

SEDCMD-cust = s/(\d{4}\-){3}/XXXX-XXXX-XXXX-/g s/MobileNumber\=\"(\+61)\s([^\"]*)/MobileNumber="0\2/g

index=main sourcetype=customers



root@splunk_hf:~# cd /opt/splunk/etc/apps/psdemo/local/
root@splunk_hf:/opt/splunk/etc/apps/psdemo/local# vi transforms.conf

[maskCC]
REGEX = ^(.*)CreditCard\=\"(\d+\-){3}(.*)$
FORMAT = $1CreditCard="XXXX-XXXX-XXXX-$3"
DEST_KEY = _raw


TRANSFORMS-maskcreditcard = maskCC

index=main sourcetype=customers
