A Quick Dip into Event Query Language (EQL)

Have you ever had the urge to put together a few PowerShell scripts that use Windows Management Instrumentation (WMI) to build your own file security monitoring package? My advice: wait for the urge to go away. I tried this myself once long ago, and while the project was educational, I quickly realized how hard it must be to develop real enterprise-level threat monitoring.

For what it’s worth, I could kind of monitor a single directory for unusual activity that might indicate a hacker copying files. Thankfully, there’s now open-source software, known as EQL, that makes it much easier to run your own security and threat monitoring projects! With EQL, you can practically dive right in, avoiding the steep learning curve I went through with PowerShell and WMI.

Overview of EQL

EQL (pronounced “equal”) is a cybersecurity query language designed by the folks at Endgame (love that name!), a cyber research and consulting company. While EQL was initially restricted to use within the company, it was released in 2018 as an open-source project on GitHub to nurture collaboration among security practitioners worldwide. Great idea! EQL also has potential as a pen-testing tool, which we’ll explore in a future post.

The EQL core language is written in Python, it integrates with Windows Sysmon, and it comes with extensive analytics. EQL’s strength is its ability to match events, stack data, and analyze aggregate data sets. In plain-speak, you can easily tap into a lot of process context that would usually require complex queries and coding, for something like “all the processes that performed network activity and are descended from regsvr32.exe”. It’s also schema-independent and OS-agnostic, so it can be used with any data set or operating system (Linux, Windows).

The goal of EQL is to go beyond the legacy reliance on Indicators of Compromise (IoCs) by using familiar shell-type syntax to craft queries that spot interesting behaviors. By the way, the security analytics capabilities line up with the MITRE ATT&CK framework. There’s a lot more information on the EQL way of doing things in this very watchable video; skip to about the 6:50 mark.

EQL Ideas

Let’s get into the basics in this initial dive. To simplify search, EQL thankfully drops the over-abundance of keywords found in PowerShell scripts in favor of a simpler, more practical function syntax. These functions can be used to perform math and build more sophisticated expressions without entering long keyword-heavy strings. Yay!

Booleans

As you’d expect, EQL has boolean operators (and, or, not) and the usual comparison operators (==, !=, <, <=, >, >=), and there’s a case-insensitive wildcard search available via the asterisk character. For example, if you wanted to look up a “svchost” service process that either lacks -k in the command line or doesn’t have services.exe as a parent process, you would write the query like this:

process where process_name == "svchost.exe" and (command_line != "* -k *" or parent_process_name != "services.exe")
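If it helps to see the logic spelled out, here is a minimal Python sketch of what that query asks of each event. It is not how EQL evaluates queries internally; the `event_type` field and the sample events are my own assumptions for illustration, and Python’s `fnmatchcase` stands in for EQL’s case-insensitive `*` wildcard.

```python
from fnmatch import fnmatchcase

def wildcard_eq(value, pattern):
    """Case-insensitive '*' match, standing in for EQL's string comparison.
    Note: fnmatch also treats '?' and '[' specially, which this sketch ignores."""
    return fnmatchcase(value.lower(), pattern.lower())

def is_suspicious_svchost(event):
    # Mirrors: process where process_name == "svchost.exe"
    #   and (command_line != "* -k *" or parent_process_name != "services.exe")
    return (
        event["event_type"] == "process"
        and wildcard_eq(event["process_name"], "svchost.exe")
        and (
            not wildcard_eq(event["command_line"], "* -k *")
            or not wildcard_eq(event["parent_process_name"], "services.exe")
        )
    )

# Hypothetical sample events, not taken from any real data set.
events = [
    {"event_type": "process", "process_name": "svchost.exe",
     "command_line": "svchost.exe -k netsvcs -p",
     "parent_process_name": "services.exe"},
    {"event_type": "process", "process_name": "SVCHOST.EXE",
     "command_line": "svchost.exe",
     "parent_process_name": "cmd.exe"},
]

suspicious = [e for e in events if is_suspicious_svchost(e)]
```

Only the second event survives the filter: it has no -k switch and its parent isn’t services.exe.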

Sequences

EQL sequences match an ordered series of events that share common attributes, such as a common process path and file path. Sequence queries can also be bounded by time or by a terminating event: for example, you can stop tracking a sequence when a log-off event occurs, or after a set period of time has elapsed. This can be helpful to remove non-unique entries and reduce memory usage.

A generic sequence query looks like this:

sequence
  [event_type1 where condition1]
  [event_type2 where condition2]
  ...
  [event_typeN where conditionN]
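To get a feel for what a sequence match involves, here is a simplified Python sketch, written under my own assumptions rather than taken from EQL’s source. The `match_sequence` helper, the `timestamp` and `pid` fields, and the 60-second window are all hypothetical; the sketch only shows the idea of ordered steps tied together by a shared field and expired after a time limit.

```python
def match_sequence(events, steps, by, maxspan):
    """Return lists of events that satisfy `steps` in order, share the same
    `by` value, and complete within `maxspan` seconds of the first event."""
    partial = {}   # shared key -> events matched so far
    matches = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        key = event.get(by)
        state = partial.get(key, [])
        # Drop a half-finished sequence once it runs past the time window.
        if state and event["timestamp"] - state[0]["timestamp"] > maxspan:
            state = []
        # Advance only if this event satisfies the next expected step.
        if steps[len(state)](event):
            state = state + [event]
        if len(state) == len(steps):
            matches.append(state)
            state = []
        partial[key] = state
    return matches

# Hypothetical events: two processes start, each later makes a connection.
events = [
    {"timestamp": 0, "event_type": "process", "pid": 10},
    {"timestamp": 1, "event_type": "process", "pid": 20},
    {"timestamp": 5, "event_type": "network", "pid": 10},
    {"timestamp": 900, "event_type": "network", "pid": 20},
]
steps = [
    lambda e: e["event_type"] == "process",
    lambda e: e["event_type"] == "network",
]
found = match_sequence(events, steps, by="pid", maxspan=60)
```

Here only PID 10 produces a match; PID 20’s network event arrives long after the window has closed, so its partial sequence is thrown away.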

Joins

Bless them for making database-like joins very simple. Joins match unordered events that share one or several user-defined properties. You can think of EQL’s join as a form of the sequence syntax, but without the ordering or time constraints.

join // by shared_field1, shared_field2, ...
  [event_type1 where condition1] // by field1
  [event_type2 where condition2] // by field2
  ...
  [event_typeN where conditionN] // by field3
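The difference from a sequence is easier to see in code. This is a rough Python sketch of an unordered join on a shared field, again my own illustration and not EQL’s implementation; `join_events` and the sample fields (`pid`, `file_name`, `destination_port`) are made up for the example.

```python
from collections import defaultdict

def join_events(events, conditions, by):
    """Group events on the `by` field and keep the groups in which every
    condition was satisfied at least once, in any order."""
    matched = defaultdict(dict)  # shared key -> {condition index: event}
    for event in events:
        key = event.get(by)
        for i, condition in enumerate(conditions):
            if i not in matched[key] and condition(event):
                matched[key][i] = event
    return {
        key: [hits[i] for i in range(len(conditions))]
        for key, hits in matched.items()
        if len(hits) == len(conditions)
    }

# Hypothetical events; note the network event comes before the file event.
events = [
    {"event_type": "network", "pid": 7, "destination_port": 443},
    {"event_type": "file", "pid": 7, "file_name": "payroll.xlsx"},
    {"event_type": "file", "pid": 8, "file_name": "notes.txt"},
]
conditions = [
    lambda e: e["event_type"] == "file",
    lambda e: e["event_type"] == "network",
]
joined = join_events(events, conditions, by="pid")
```

PID 7 is reported even though its events arrived in the “wrong” order, while PID 8 is dropped because it never produced a network event.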

Pipes

Pipes can be used to conveniently filter and reduce the results from a data set and to add more specificity to queries in post-processing. You can remove duplicate entries with the ‘unique’ pipe, keep only the results that meet a further condition with the ‘filter’ pipe, or find the most (or least) common values by counting on a field and sorting the counts.

The ‘count’ pipe returns the total number of entries found matching the search query. The basic pipe structure is as follows:

process where true | count
// results look like
// "count": 100, "key": "totals"
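For anyone who thinks better in code, here is a small Python sketch of what the ‘unique’ and ‘count’ pipes do conceptually. The function names simply borrow the pipe names, the sample data is invented, and sorting the per-key counts in ascending order is my own choice for the sketch, not a statement about EQL’s behavior.

```python
from collections import Counter

def unique(events, field):
    """Keep the first event seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for event in events:
        if event[field] not in seen:
            seen.add(event[field])
            kept.append(event)
    return kept

def count(events, field=None):
    """With no field, return one total; with a field, return one count per
    distinct value, least common first."""
    if field is None:
        return [{"count": len(events), "key": "totals"}]
    tallies = Counter(event[field] for event in events)
    ordered = sorted(tallies.items(), key=lambda item: item[1])
    return [{"count": n, "key": key} for key, n in ordered]

# Hypothetical process events.
names = ["cmd.exe", "explorer.exe", "cmd.exe", "powershell.exe",
         "cmd.exe", "explorer.exe"]
events = [{"event_type": "process", "process_name": n} for n in names]

totals = count(events)
by_name = count(events, "process_name")
distinct = unique(events, "process_name")
```

The least common process name ends up at the front of `by_name` and the most common at the back, which is the sort of stacking that makes rare behavior stand out.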

Process Lineage

Process lineage is tracked automatically, which simplifies the discovery of vital information about a process, such as its origin and how long it has been running. Using this lineage data, EQL can isolate just the results you need.

You can limit results to process identifiers (PIDs) with a specific parentage, or exclude PIDs based on when they became active. This helps you avoid wading through the same results over and over, since you can quickly ignore PIDs that were already active the last time you ran the search. To find network activity from PowerShell processes that are not descended from Windows Explorer, you’d enter this query:

network where process_name == "powershell.exe"
  and not descendant of [process where process_name == "explorer.exe"]
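Doing this by hand shows why built-in lineage tracking is so welcome. The Python sketch below is a simplified stand-in, not EQL’s actual mechanism: the `descendants_of` helper and the `pid`/`ppid` fields are assumptions, process events are assumed to arrive in start order, and PID reuse is ignored entirely.

```python
def descendants_of(process_events, is_ancestor):
    """Return the PIDs of every process descended from a process that
    matches `is_ancestor`. Assumes events are ordered by start time."""
    roots = set()        # PIDs of processes matching the ancestor query
    descendants = set()  # PIDs of their children, grandchildren, ...
    for event in process_events:
        if event["ppid"] in roots or event["ppid"] in descendants:
            descendants.add(event["pid"])
        if is_ancestor(event):
            roots.add(event["pid"])
    return descendants

# Hypothetical process tree: explorer -> cmd -> powershell (pid 300),
# plus a second powershell (pid 400) started from somewhere else.
process_events = [
    {"pid": 100, "ppid": 1, "process_name": "explorer.exe"},
    {"pid": 200, "ppid": 100, "process_name": "cmd.exe"},
    {"pid": 300, "ppid": 200, "process_name": "powershell.exe"},
    {"pid": 400, "ppid": 50, "process_name": "powershell.exe"},
]
network_events = [
    {"pid": 300, "process_name": "powershell.exe"},
    {"pid": 400, "process_name": "powershell.exe"},
]

from_explorer = descendants_of(
    process_events, lambda e: e["process_name"] == "explorer.exe"
)
flagged = [
    e for e in network_events
    if e["process_name"] == "powershell.exe" and e["pid"] not in from_explorer
]
```

Only PID 400 is flagged. PID 300 is a grandchild of explorer.exe, so a simple parent check would have missed the relationship that the lineage walk catches.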

Hello World in EQL

This post is just a very quick introduction, and we’ll explore EQL more thoroughly next time. Can an infosec blogger install EQL and run a trivial query in about 30 minutes? The answer is yes.

I should have pointed out earlier that EQL works with JSON input. So if you want to analyze Sysmon output, you need to convert it to JSON, which is easy to accomplish in PowerShell, and then run it through EQL. Fortunately, the Endgame gang has already published a few datasets on GitHub to work with.

Using one of the sets, I ran the EQL equivalent of “hello world” to count the number of Windows processes that start with “cmd”:

Victory! I installed EQL and got it to parse JSON input in about 30 minutes!
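If you want a rough cross-check of what a count like that computes, the same idea in plain Python over JSON input looks something like the sketch below. The sample records and field names are invented for illustration and are not necessarily what Endgame’s datasets contain.

```python
import json

# A tiny, made-up JSON sample standing in for a real event data set.
sample = """
[
  {"event_type": "process", "process_name": "cmd.exe"},
  {"event_type": "process", "process_name": "CMD.EXE"},
  {"event_type": "process", "process_name": "explorer.exe"},
  {"event_type": "network", "process_name": "cmd.exe"}
]
"""

events = json.loads(sample)

# Count process events whose name starts with "cmd", ignoring case.
cmd_count = sum(
    1
    for event in events
    if event.get("event_type") == "process"
    and event.get("process_name", "").lower().startswith("cmd")
)
```

With this sample, two process events qualify; the network event from cmd.exe is left out because it isn’t a process event.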

My point is to show that this is a very approachable language. Your assignment until next time is to install EQL and start experimenting with Endgame’s GitHub datasets. Good luck!

Andy Green

Andy blogs about data privacy and security regulations. He also loves writing about malware threats and what it means for IT security.

 
