Splunk Boss of the SOC (BOTS)¶
Introduction¶
Splunk Boss of the SOC (BOTS) was a competitive, capture-the-flag–style event focused on security operations center skills. Participants investigate simulated security incidents using Splunk data to identify threats, analyze attacks, and answer challenge questions. It is designed to test threat detection, incident response, and data analysis capabilities in realistic, hands-on scenarios.
Since most CTFs have an offensive focus, Splunk BOTS is a refreshing alternative. They were organized through 2016 until 2019 (four editions). They still host various challenges on their website, including the first edition. Edition two and three can be found on GitHub.
I first learned about Splunk BOTS during a FIRST Technical Colloquium talk given by ex-employees and decided to give it a go myself.
Info
The answers should adhere to Splunk tips and tricks as listed on my Splunk cheatsheet.
BOTS V1 (2015)¶
An overview of data sourcetypes and download link for the dataset can be found on GitHub.
Scenario 1: Web site defacement¶
Context
Today is Alice's first day at the Wayne Enterprises' Security Operations Center. Lucius sits Alice down and gives her first assignment: A memo from Gotham City Police Department (GCPD). Apparently GCPD has found evidence online (http://pastebin.com/Gw6dWjS9) that the website www.imreallynotbatman.com hosted on Wayne Enterprises' IP address space has been compromised. The group has multiple objectives... but a key aspect of their modus operandi is to deface websites in order to embarrass their victim. Lucius has asked Alice to determine if www.imreallynotbatman.com. (the personal blog of Wayne Corporations CEO) was really compromised.
Question 101
Question
What is the likely IPv4 address of someone from the Po1s0n1vy group scanning imreallynotbatman.com for web application vulnerabilities?
Look for the domain as raw term in the HTTP logs. Using the sidebar we can see that only two source IPs are present in the results. We can quickly identify that one IP is sending suspicious requests: 40.80.148.42.
Question 102
Question
What company created the web vulnerability scanner used by Po1s0n1vy? Type the company name.
Look at HTTP requests coming from the malicious IP address. In the headers of the requests (src_headers
) we find the product: Acunetix-Product: WVS/10.0 (Acunetix Web Vulnerability Scanner - Free Edition).
Question 103
Question
What content management system is imreallynotbatman.com likely using?
Using the same query as that for question 102 we can see that many of the URIs attacked contain /joomla/
. A Google search reveals that this is a CMS and therefore the answer: joomla.
Question 104
Question
What is the name of the file that defaced the imreallynotbatman.com website? Please submit only the name of the file with extension?
The FortiGate firewalls have the unified threat management (UTM) feature enabled. This index will contain incoming and outgoing analyzed traffic (WAF/IDS-like). We look for events coming from the webserver itself as the attacker likely compromised it before defacing it and pulled the image from somewhere else. We find the webserver IP using the query in question 102. The answer we find is: poisonivy-is-coming-for-you-batman.jpeg.
Question 105
Question
This attack used dynamic DNS to resolve to the malicious IP. What fully qualified domain name (FQDN) is associated with this attack?
Using the same event that provided we answer for 104 we find: prankglassinebracket.jumpingcrab.com.
Question 106
Question
What IPv4 address has Po1s0n1vy tied to domains that are pre-staged to attack Wayne Enterprises?
Using the same event that provided we answer for 104 and 105 we find: 23.22.63.114.
Question 108
Question
What IPv4 address is likely attempting a brute force password attack against imreallynotbatman.com?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST src_ip="23.22.63.114"
From question 101 we know that there are two IPs connecting to the webserver, of which one is a vulnerability scanner. While the attacker could conduct a brute-force attack from this IP, we see the brute-force attempts when analyzing traffing from the other one ('src_content'). Therefore, the answer is: 23.22.63.114.
Question 109
Question
What is the name of the executable uploaded by Po1s0n1vy?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST "upload" ".exe"
Build a query that looks for HTTP traffic with the webserver as destination IP, HTTP method POST, and containing references to uploading and Windows executables. This returns a single event regarding an upload. This executable appears to be: 3791.exe.
Question 110
Question
What is the MD5 hash of the executable uploaded?
index=botsv1 sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventCode=1 process="3791.exe" "3791.exe"
Sysmon will have recorded this when the executable was launched (event code 1). Look for the executable using both a raw term search as it can appear in multiple fields and in the process field. This query is much quicker than when not supplying the raw term. The hash we find is: AAE3F5A29935E6ABCC2C2754D12A9AF0.
Question 111
Question
GCPD reported that common TTPs (Tactics, Techniques, Procedures) for the Po1s0n1vy APT group, if initial compromise fails, is to send a spear phishing email with custom malware attached to their intended target. This malware is usually connected to Po1s0n1vys initial attack infrastructure. Using research techniques, provide the SHA256 hash of this malware.
We leverage VirusTotal to look for relationships of the attacker infrastructure 23.22.63.114. Under 'Communicating Files' there are four likely malicious files listed. One of them has an ominous looking name that could be used in a spearsphing attack called MirandaTateScreensaver.src.exe
. The SHA256 hash of this file is: 9709473ab351387aab9e816eff3910b9f28a7a70202e250ed46dba8f820f34a8.
Question 112
Question
What special hex code is associated with the customized malware discussed in question 111?
The VirusTotal page for the malware found in question 111 contains comments. The first comment from 'ryan_kovar' contains a hex string: 53 74 65 76 65 20 42 72 61 6e 74 27 73 20 42 65 61 72 64 20 69 73 20 61 20 70 6f 77 65 72 66 75 6c 20 74 68 69 6e 67 2e 20 46 69 6e 64 20 74 68 69 73 20 6d 65 73 73 61 67 65 20 61 6e 64 20 61 73 6b 20 68 69 6d 20 74 6f 20 62 75 79 20 79 6f 75 20 61 20 62 65 65 72 21 21 21
This translates to Steve Brant's Beard is a powerful thing. Find this message and ask him to buy you a beer!!!
.
Question 114
Question
What was the first brute force password used?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST src_ip="23.22.63.114" | tail 1
Look for all brute-force attempts and filter out the first one. The password shown in form_data
is: 12345678.
Question 115
Question
One of the passwords in the brute force attack is James Brodsky's favorite Coldplay song. We are looking for a six character word on this one. Which is it?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST src_ip="23.22.63.114"
| rex field=src_content "(?:^|&)passwd=(?P<password>[^&]+)"
| eval length=len(password)
| table password, length
| where length=6
Again look for all brute-force attempts and extract the password used from the src_content
field using regex. We can filter on the field name passwd
and take all characters until the next &
delimiter. We then compute the length of the password and create a table with only password with a length of six. Using some common Coldplay knowledge, we identify the password with the name of a Coldplay song: yellow.
AI usage
The regex was created with help of ChatGPT and summarized as follows:
-
(?:^|&)
Non-capturing group. Matches either the beginning of the string (^
) or an ampersand (&
). Ensures we only match whenpasswd=
is a standalone parameter and not a substring inside another value. -
passwd=
Literal match for the field name. -
(?P<password>[^&]+)
Named capturing group calledpassword
. Captures one or more characters (+
) that are not an ampersand ([^&]
). This ensures the captured value ends before the next delimiter.
Question 116
Question
What was the correct password for admin access to the content management system running "imreallynotbatman.com"?
index=botsv1 dest_ip="192.168.250.70" sourcetype="stream:http" http_method=POST form_data="*passwd*" form_data="*username=admin*" src_ip="40.80.148.42"
The data from the brute-force IP does not show an obvious candidate. However, when swapping out the IP with the vulnerability scanner we find a single event with the correct password: batman.
Question 117
Question
What was the average password length used in the password brute forcing attempt?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST src_ip="23.22.63.114"
| rex field=src_content "(?:^|&)passwd=(?P<password>[^&]+)"
| eval length=len(password)
| table password, length
| stats avg(length)
We reuse the query used for question 115 and now compute the average length. This returns (rounded down): 6.
Question 118
Question
How many seconds elapsed between the time the brute force password scan identified the correct password and the compromised login?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST form_data="*batman*"
| sort _time
| streamstats current=f last(_time) as prev_time
| eval diff_seconds=_time - prev_time
We look for requests containing the password batman
. The query returns two events; the brute-force attempt and the actual login. We can compute the time in seconds as diff_seconds
between these two events: 92.17.
AI usage
ChatGPT came up with the computation of the difference in time directly in the query using streamstats
, which it explained as follows:
streamstats
processes events as they stream in, maintaining state across rows.
current=f
prevents including the current row in the calculation, so only past rows are referenced.last(_time) as prev_time
stores the previous event’s_time
value in a new fieldprev_time
.- Each event gets paired with the immediately preceding event’s time, enabling
eval
to compute their difference.
Question 119
Question
How many unique passwords were attempted in the brute force attempt?
index=botsv1 sourcetype="stream:http" dest_ip="192.168.250.70" http_method=POST src_ip="23.22.63.114"
| rex field=src_content "(?:^|&)passwd=(?P<password>[^&]+)"
| stats values(password)
We again reuse the query and compute all the unique passwords: 412.
Scenario 2: Ransomware¶
Context
After the excitement of yesterday, Alice has started to settle into her new job. Sadly, she realizes her new colleagues may not be the crack cybersecurity team that she was led to believe before she joined. Looking through her incident ticketing queue she notices a “critical” ticket that was never addressed. Shaking her head, she begins to investigate. Apparently on August 24th Bob Smith (using a Windows 10 workstation named we8105desk) came back to his desk after working-out and found his speakers blaring (click below to listen), his desktop image changed (see below) and his files inaccessible.
Alice has seen this before... ransomware. After a quick conversation with Bob, Alice determines that Bob found a USB drive in the parking lot earlier in the day, plugged it into his desktop, and opened up a word document on the USB drive called "Miranda_Tate_unveiled.dotm". With a resigned sigh she begins to dig into the problem...
Question 200
Question
What was the most likely IPv4 address of we8105desk on 24AUG2016?
Filter for the hostname using the indexed field and look for the most frequent source IP in the sidebar. This will be: 192.168.250.100.
Question 201
Question
Amongst the Suricata signatures that detected the Cerber malware, which one alerted the fewest number of times? Submit ONLY the signature ID value as the answer.
Filter for logs generated by suricata containing the IP mentioned in the last question, and containing the malware name (this query could be optimized by searching in specific fields). Use the sidebar to locate the ID, which will be: 2816763.
Question 202
Question
What fully qualified domain name (FQDN) does the Cerber ransomware attempt to direct the user to at the end of its encryption phase?
We can reuse the query on Suricata data of question 201 to get the time range we need to look for in DNS data. We run the query and click on the time field of the latest result. Splunk will then prompt to set a time range based on that time. We choose to set it to look for events 5 minutes before and after. We can then run this next query to summarize the DNS queries run from the workstation in that timeframe:
There are six unique DNS queries of which one is the obviously malicious one: cerberhhyed5frqa.xmfir0.win.
Question 203
Question
What was the first suspicious domain visited by we8105desk on 24AUG2016?
index="botsv1" sourcetype="stream:dns" src_ip="192.168.250.100" earliest="08/24/2016:0:0:0" latest="08/24/2016:23:59:59" | reverse | table _time, query
Run a query on the DNS data with the correct time range. The IP address is searched with a raw term search as the host_addr
field needed is an array and thus doesn't search easily. The query formats the output into a table that is sorted on ascending time. On page nine the suspicious domain is found: solidaritedeproximite.org.
Question 204
Question
During the initial Cerber infection a VB script is run. The entire script from this execution, pre-pended by the name of the launching .exe, can be found in a field in Splunk. What is the length of the value of this field?
index="botsv1" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" dvc_nt_host="we8105desk" ".vbs" | eval length=len(cmdline) | table _time, CommandLine,length | reverse
Sysmon data will include the VB script. Look for anything related to .vbs
and compute the cmdline length. We then use a table to display the results in ascending order. Don't forget to reset the time range to all time after answering the questions before.
We find ten different command lines. The first suspicious one has a length of: 4490.
Question 205
Question
What is the name of the USB key inserted by Bob Smith?
The WinEvent logs provided (application
, security
, and system
as per the GitHub repository) lack the needed operational
channel to look for USB events. We do have the winregistry events at our disposal. USB devices are added to the USBSTOR
in the registry. We look for this term and look at the unique data values that are being set. The query returns many typical Windows registry terms and IDs except a suspicious name: MIRANDA_PRI.
Question 206
Question
Bob Smith's workstation (we8105desk) was connected to a file server during the ransomware outbreak. What is the IPv4 address of the file server?
Look for SMB traffic coming from the workstation and identify the IP connected most often to: 192.168.250.20.
Question 207
Question
How many distinct PDFs did the ransomware encrypt on the remote file server?
index="botsv1" sourcetype="WinEventLog:Security" host="we9041srv" Accesses=DELETE ".pdf" | stats values(Relative_Target_Name)
We can identify the hostname of the file server (we9041srv
) by reversing the trick used for question 200. Then look for deletion of PDF files and count the unique events: 257.
Question 208
Question
The VBscript found in question 204 launches 121214.tmp. What is the ParentProcessId of this initial launch?
index="botsv1" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" host="we8105desk" EventCode=1 "121214.tmp" ".vbs"
Look for process start events with both terms. We find the parent process ID is: 3968.
Question 209
Question
The Cerber ransomware encrypts files located in Bob Smith's Windows profile. How many .txt files does it encrypt?
index="botsv1" host="we8105desk" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventCode=2 TargetFilename="C:\\Users\\bob.smith.WAYNECORPINC\\*" TargetFilename="*.txt"
Count the Sysmon events that involve file creation with the target including the user profile and the file extension, resulting in the number of encrypted files: 406.
Question 210
Question
The malware downloads a file that contains the Cerber ransomware cryptor code. What is the name of that file?
index="botsv1" sourcetype="suricata" dest_ip="192.168.250.100" | table fileinfo.filename, fileinfo.size | sort -fileinfo.size
Look for all file names and sizes with the infected workstation as destination. A ransomware executable will typically be larger than other files so we sort the results by size. The top result immediately stands out as it's a relatively large JPG file with a suspicious name: mhtr.jpg.
Question 211
Question
Now that you know the name of the ransomware's encryptor file, what obfuscation technique does it likely use?
The technique to hide data in files such as images is typically called: steganography.