Splunk

Splunk Search Language

Search Terms
Commands | Blue (create charts, compute statistics, and format results)
Functions | Purple (specify how to chart, compute, and evaluate the results)
Arguments | Green (variables we want to apply to the function)
Clauses (specify how results are grouped or defined)


Using the TERM() directive for search efficiency
index=splunkTraining sourcetype=access_combined TERM(192.168.1.222)
Using the table command
index=splunkTraining sourcetype=access_combined | table host _raw
index=splunkTraining sourcetype=access_combined | table _time, host, source*, _raw
Using the timechart command

[span=1d -> 1-day buckets; span=5m -> 5-minute buckets]

index=splunkTraining sourcetype=access_combined TXXXXXX | timechart span=1d count by status
index=splunkTraining sourcetype=access_combined status >= 400 | timechart span=1d count by status

The timechart command accepts statistical functions such as count, avg(), max(), min(), stdev(), and others.
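For instance, a sketch (assuming the same index and a numeric bytes field) that charts the daily average and standard deviation of response size:

index=splunkTraining sourcetype=access_combined | timechart span=1d avg(bytes) as avg_bytes stdev(bytes) as stdev_bytes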

[limit=5 restricts the split-by series to the top 5 clientip values]

index=splunkTraining sourcetype=access_combined | timechart limit=5 sum(bytes) as total_bytes max(bytes) as max_bytes by clientip
index=splunkTraining sourcetype=access_combined | timechart sum(bytes) as total_bytes by clientip WHERE max in top10
Using the eval command
index=splunkTraining sourcetype=access_combined | eval status_category=if(status<400, "ok", "bad")

[mbytes = bytes / 1024 / 1024 - divided by 1024 twice: bytes -> KB -> MB]

index=splunkTraining sourcetype=access_combined | eval mbytes = bytes / 1024 / 1024
index=splunkTraining sourcetype=access_combined | eval mbytes = bytes / 1024 / 1024 | top 5 mbytes
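A variation worth noting (a sketch, not from the original notes): top ranks by frequency of identical mbytes values, so to see the five largest responses instead, sort descending and take the head:

index=splunkTraining sourcetype=access_combined | eval mbytes = bytes / 1024 / 1024 | sort - mbytes | head 5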

Maybe we’d like to provide a description of the activity taking place in the web request.

index=splunkTraining sourcetype=access_combined itemId=*
| eval event_description=clientip . " made a " . action . " on item " . itemId
| table event_description

Play around with the HTTP User-Agent using eval agent_category = case(...)

index=splunkTraining sourcetype=access_combined itemId=*
| eval agent_category = case(
    useragent LIKE "%Android%", "Android",
    useragent LIKE "%iPhone%", "iOS",
    useragent LIKE "%iPad%", "iOS",
    true(), "Other")
| table _raw, agent_category
Using the top and rare commands
index=splunkTraining sourcetype=access_combined | top 1 action
index=splunkTraining sourcetype=access_combined | top itemId by action showperc=false
index=splunkTraining sourcetype=access_combined | top itemId by status showperc=false
index=splunkTraining sourcetype=access_combined | rare useragent showperc=false

(Grouped statistics: seeing the unique list of values.) If you were interested in seeing the unique list of products that each customer checked out, you can use the values() function:

index=splunkTraining sourcetype=access_combined | stats values(itemId) as products by clientip

We could use the timechart command, or we could employ the bin and stats commands together. The bin command allows us to specify that we want to group the time values by 1 day or 1 minute or 1 week, etc. It gives us the opportunity to use the span argument even though stats doesn’t support it itself.

index=splunkTraining sourcetype=access_combined | bin _time span=1d | stats distinct_count(clientip) as distinct_clients by _time
Using the eventstats command

We used timechart before, but now we want to perform further calculations on the computed counts without losing the shape of the result set. The eventstats command is a close cousin of stats, with one key difference: eventstats adds its calculations inline to each event, so the previous result set remains fully intact but is enriched with the new data eventstats produces.

index=splunkTraining sourcetype=access_combined | timechart span=1d count | eventstats avg(count) as avg
index=splunkTraining sourcetype=access_combined action=purchase | timechart span=1d count | eventstats avg(count) as avg
index=splunkTraining sourcetype=access_combined action=purchase | timechart span=1d count | eventstats avg(count) as avg, max(count) as max, min(count) as min
Using the sort, head, and tail commands

We can use sort on field names to order results ascending or descending (descending is indicated by a leading "-" dash). The example below returns results ordered by the clientip field values.

index=splunkTraining sourcetype=access_combined | sort clientip
index=splunkTraining sourcetype=access_combined | stats count by clientip, method | sort clientip, -method, -count
index=splunkTraining sourcetype=access_combined | stats count by clientip, method | sort clientip, -method, -count | head 10
index=splunkTraining sourcetype=access_combined | stats count by clientip, method | sort 10 clientip, -method, -count
index=splunkTraining sourcetype=access_combined | stats count by clientip, method | sort clientip, -method, -count | tail 10
Using the iplocation, geostats, and geom commands

Let’s start out by using iplocation on its own.
Enter the search below and look out for the lat, lon, City, Country, and Region fields in your output

index=splunkTraining sourcetype=access_combined | iplocation clientip
index=splunkTraining sourcetype=access_combined | iplocation clientip | geostats count by itemId
index=splunkTraining sourcetype=access_combined | iplocation clientip | stats count by Region | geom geo_us_states featureIdField="Region"

Splunk Searching Examples

“On October 1 2016, how many web hits were there for iPhone devices?”

index=splunkTraining sourcetype=access_combined useragent = "*iPhone*" | table req_time, clientip, useragent
index=splunkTraining sourcetype=access_combined useragent = "*iPhone*" | table req_time, clientip, _raw | sort req_time

Refine the result using stats values() to show which URIs each clientip visited via iPhone:

index=splunkTraining sourcetype=access_combined useragent = "*iPhone*" | stats values(uri) count by clientip

If we just need the total count for each of the unique useragents (the output will then have two columns: useragent and its count):

index=splunkTraining sourcetype=access_combined useragent != "*iPhone*" AND useragent=*mobile* | table req_time, clientip, useragent | stats count by useragent
index=splunkTraining sourcetype=access_combined useragent != "*iPhone*" AND useragent=*mobile* | table req_time, clientip, useragent | stats count by useragent | sort - count

I’d like to see the useragents broken out into categories: Android, iOS, and Other.

index=splunkTraining sourcetype=access_combined useragent=* 
| eval pippo = case(useragent LIKE "%Android%", "Android",
useragent LIKE "%iPhone%", "iOS",
useragent LIKE "%iPad%", "iOS",
true(), "Other")
| top pippo


Can we get those results for just the purchase actions, and instead of just the total, is it possible to see how many per hour?

index=splunkTraining sourcetype=access_combined useragent=* action=purchase 
| eval pippo = case(useragent LIKE "%Android%", "Android",
useragent LIKE "%iPhone%", "iOS",
useragent LIKE "%iPad%", "iOS",
true(), "Other")
| timechart span=1h count by pippo


Can we see this data in a chart?


We’ve been trying to understand product popularity in different parts of the world for the entire month of October 2016. Can we do that too?
(we need iplocation | geostats )

index=splunkTraining sourcetype=access_combined product_id=* | iplocation clientip | geostats count by product_id


Splunk Investigation Cheatsheet

Syntax 1 - Using table

"search query (keyword)" sourcetype="value here" | table field1, field2, field3, field4

Example:
U125121 sourcetype="websense" | table _time, user, url, dst_ip, dst_port, action, category_name
user="U125121" sourcetype="websense" | table _time, user, url, dst_ip, dst_port, action, category_name

Syntax 2 - Using dedup

"search query (keyword)" sourcetype="value here" | table field1, field2, field3, field4 | dedup field2

We use dedup to remove duplicate events, i.e. events that contain an identical combination of values for the selected fields.

Example:
user="U125121" sourcetype="websense" | table _time, user, url, dst_ip, dst_port, action, category_name | dedup url

Syntax 3 - Using sort

"search query (keyword)" sourcetype="value here" | table field1, field2, field3, field4 | dedup field2 | sort field3

Example:
user="U125121" sourcetype="websense" | table _time, user, url, dst_ip, dst_port, action, category_name | dedup url | sort _time

Syntax 4 - Using stats values(field)

"search query (keyword)" sourcetype="value here" | stats values(field1) values(field2) count by user

This syntax provides statistics grouped by a field. For example, we can have Splunk show the total count of a user's activities, with the unique values of url and dst_ip displayed in two separate columns.

Example:
user="U125121" sourcetype="websense" | stats values(url) values(dst_ip) count by user

Syntax 5 - Using rename

"search query (keyword)" sourcetype="value here" | stats values(field1) as field1_rename count by user

"search query (keyword)" sourcetype="value here" | table field1, field2, field3, field4 | rename field2 as field2_rename

This syntax allows us to rename any field we prefer.

Example:
user="U125121" sourcetype="websense" | stats values(url) as website count by user
user="U125121" sourcetype="websense" | table _time, url, action, category_name | rename url as website

Configure Splunk Receiver

Splunk start and stop

$ sudo /opt/splunk/bin/splunk [start|stop]

Once started, we will see output like the following:

$ sudo /opt/splunk/bin/splunk start
Splunk> Like an F-18, bro.

Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: open
Checking appserver port [127.0.0.1:8065]: open
Checking kvstore port [8191]: open
Checking configuration... Done.
Checking critical directories... Done
Checking indexes...
Validated: _audit _internal _introspection _telemetry _thefishbucket history main summary
Done
Checking filesystem compatibility... Done
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunk/splunk-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...
Done

Waiting for web server at http://127.0.0.1:8000 to be available........ Done

If you get stuck, we're here to help.
Look for answers here: http://docs.splunk.com

The Splunk web interface is at http://pippo-vm:8000

Clean splunk event data

$ sudo /opt/splunk/bin/splunk clean eventdata

Configure Splunk Forwarder

Specify which system log directory to monitor (e.g. the Bro log folder)

$ sudo splunkforwarder/bin/splunk add monitor /var/spool/bro/bro/

Point the forwarder at the deployment server to poll (management port 8089)

$ sudo splunkforwarder/bin/splunk set deploy-poll splunk-receiver-ip:8089

Specify the host and port the Splunk receiver is listening on

$ sudo splunkforwarder/bin/splunk add forward-server splunk-receiver-ip:9997
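Note that the receiver must also be configured to listen on that port. On the receiver side (not the forwarder), enable listening on 9997:

$ sudo /opt/splunk/bin/splunk enable listen 9997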

Enable auto-start of the Splunk forwarder at boot

$ sudo splunkforwarder/bin/splunk enable boot-start

Start splunk forwarder

$ sudo splunkforwarder/bin/splunk start