Log Regex Markup

The Log Regex Markup engine is yet another unique and highly useful feature of IO Ninja dedicated to providing visual aids when you analyze captured data logs.

The engine relies on user-defined regular expressions to automatically (and instantaneously!) highlight data patterns or insert packet delimiters. Under the hood, it is powered by Google RE2, one of the fastest and most thoroughly tested regular expression libraries available.

Colorize

The most straightforward way to use the regex markup feature is to write a regular expression for the important tokens in your protocol and then colorize them (i.e., highlight those tokens with color).

For example, in the screenshot, all XTERM CSI escape sequences are automatically colored pink.
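
As a rough illustration of what such a colorizing rule could look like, the sketch below matches CSI escape sequences (ESC, `[`, optional parameter and intermediate bytes, one final byte) and reports the byte ranges that would be highlighted. It uses Python's `re` module only to try the pattern out; that is a different engine from the one inside IO Ninja, and the `find_tokens` helper and the sample bytes are made up for this example.

```python
import re

# CSI sequence: ESC '[' + parameter bytes (0x30-0x3F) + intermediate bytes
# (0x20-0x2F) + exactly one final byte (0x40-0x7E).
CSI = re.compile(rb"\x1b\[[0-?]*[ -/]*[@-~]")

def find_tokens(data: bytes):
    """Return (start, end, token) for every CSI sequence found in data."""
    return [(m.start(), m.end(), m.group()) for m in CSI.finditer(data)]

# Hypothetical captured bytes: colored "ERROR", a reset, then a clear-screen.
sample = b"\x1b[1;31mERROR\x1b[0m done\x1b[2J"

for start, end, token in find_tokens(sample):
    print(f"{start:2}-{end:2}  {token!r}")
```

The ranges printed here correspond to the spans a colorize rule would paint; everything between them is left untouched.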

Packetize

An alternative approach is to use a regular expression to define packet boundaries and insert delimiters at those boundaries.

For instance, if packets in your protocol always start with a specific header/preamble, you can define a regular expression to match those headers and insert delimiters before matches.

Likewise, if packets have well-defined terminators, define a terminator-matching regex and insert delimiters after matches.
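
The two packetizing modes can be sketched as follows. This is only an illustration of the idea in Python's `re` module, not IO Ninja's implementation; the function names, the `AA 55` header, and the CR LF terminator are assumptions chosen for the example.

```python
import re

def _cut(data: bytes, cuts):
    """Split data at the given offsets, dropping empty pieces."""
    bounds = [0] + list(cuts) + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]

def split_before(data: bytes, header: bytes):
    """Insert a delimiter BEFORE every match of the header regex."""
    return _cut(data, (m.start() for m in re.finditer(header, data)))

def split_after(data: bytes, terminator: bytes):
    """Insert a delimiter AFTER every match of the terminator regex."""
    return _cut(data, (m.end() for m in re.finditer(terminator, data)))

# Hypothetical binary protocol whose packets start with the bytes AA 55.
stream = b"\xaa\x55\x01\x02\xaa\x55\x03"
print(split_before(stream, rb"\xaa\x55"))

# Hypothetical text protocol whose packets end with CR LF.
lines = b"AT\r\nOK\r\n"
print(split_after(lines, rb"\r\n"))
```

Which mode fits depends on the protocol: a header rule keeps each preamble at the start of its packet, while a terminator rule keeps each terminator at the end.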

Multiple Rules

Want to colorize different entities with different colors? How about also highlighting packet boundaries at the same time? No problem! You can define as many rules and use as many colors as necessary to mark up everything you need.

And the best part — it's going to be just as fast as marking up with a single regex!
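
One way to picture several rules being applied in a single pass is to combine them into one alternation and check which alternative matched. The sketch below does that with Python's `re` module; it is an assumption-laden illustration, not a description of how IO Ninja combines rules internally. The rule labels and patterns are hypothetical, and the trick relies on the individual patterns containing no capturing groups of their own.

```python
import re

# Hypothetical rule set: (label, pattern). A label stands in for a color
# or for the "insert delimiter" action.
RULES = [
    ("csi", r"\x1b\[[0-?]*[ -/]*[@-~]"),
    ("hex", r"0x[0-9A-Fa-f]+"),
    ("eol", r"\r\n"),
]

# One capturing group per rule; the group that matched identifies the rule.
COMBINED = re.compile("|".join(f"({pattern})" for _, pattern in RULES))

def mark_up(text: str):
    """Return (label, start, end) for every rule match, in one scan."""
    return [
        (RULES[m.lastindex - 1][0], m.start(), m.end())
        for m in COMBINED.finditer(text)
    ]

print(mark_up("\x1b[1mreg=0x1F\r\n"))
```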

Regex Engine Specifics

The IO Ninja regular expression engine is DFA-based, unlike the backtracking engines used in PCRE, Python, Ruby, etc.

Backtracking engines can suffer from performance drops (sometimes exponential!) depending on the particular regular expressions and the number of patterns. IO Ninja doesn't have these performance drawbacks: no matter what the regular expressions are and how many markup rules you define, the log will always get colorized and packetized FAST!
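
To see the kind of slowdown a backtracking engine can hit, the sketch below times Python's `re` module (a backtracking engine) on a classic pathological pattern with nested quantifiers. The subject string can never match, so the matcher tries an exponentially growing number of ways to split the run of `a` characters. The helper name and the chosen lengths are arbitrary, and actual timings depend on the machine and Python version.

```python
import re
import time

# Nested quantifiers: a classic worst case for backtracking matchers.
PATHOLOGICAL = re.compile(r"(a+)+$")

def timed_match(n: int):
    """Match n 'a's followed by 'b' (which can never match).

    Returns (match_object_or_None, elapsed_seconds).
    """
    subject = "a" * n + "b"
    started = time.perf_counter()
    result = PATHOLOGICAL.match(subject)
    return result, time.perf_counter() - started

if __name__ == "__main__":
    # Each extra character roughly doubles the work on a backtracking engine.
    for n in (10, 14, 18, 20):
        _, seconds = timed_match(n)
        print(f"n={n:2}  {seconds:.4f} s")
```

An automaton-based matcher processes each input character once, so its running time on the same inputs grows linearly with the input length rather than exponentially.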

On the other hand, IO Ninja doesn't support some advanced regular expression features available in PCRE and other backtrackers — most notably, backreferences and named groups.
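
In practice, a pattern that relies on a backreference can often be rewritten as an explicit alternation. The sketch below compares the two forms using Python's `re` module, which does support backreferences, to check that they select the same spans. The framing scheme (a payload enclosed between two identical marker characters, either `~` or `#`) is hypothetical.

```python
import re

# Backtracking-only form: \1 must repeat whichever marker group 1 captured.
WITH_BACKREF = re.compile(r"([~#])[^~#]*\1")

# Backreference-free form: one alternative spelled out per marker character.
WITHOUT_BACKREF = re.compile(r"~[^~#]*~|#[^~#]*#")

def spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]

sample = "~abc~ #de# ~x#"
print(spans(WITH_BACKREF, sample))
print(spans(WITHOUT_BACKREF, sample))
```

The rewrite grows with the number of possible markers, so it is practical only when that set is small and known in advance.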

Syntax

Since the backend for our regex markup engine is based on Google RE2, we support everything it does. Please refer to the official RE2 wiki page for a complete syntax specification. Or check the quick cheat-sheet below — that's more than enough to get you started.

Construct  Description                                                    Example
.          Any character                                                  .
|          Alternative                                                    abc|def
[ ]        Character class                                                [ghi]
[^ ]       Negated character class (e.g., [^abc])                         [^jkl]
[[: :]]    ASCII character class                                          [[:alpha:]]
[[:^ :]]   Negated ASCII character class                                  [[:^space:]]
( )        Group a sub-expression                                         (mno)+
?          Preceding element is optional                                  p?
??         Preceding element is optional (non-greedy)                     q??
*          Preceding element is repeated zero or more times               r*
*?         Preceding element is repeated zero or more times (non-greedy)  s*?
+          Preceding element is repeated one or more times                t+
+?         Preceding element is repeated one or more times (non-greedy)   u+?
{ n }      Preceding element is repeated exactly n times                  v{3}
{ n, }     Preceding element is repeated at least n times                 w{4,}
{ n, }?    Preceding element is repeated at least n times (non-greedy)    x{5,}?
{ n, m }   Preceding element is repeated from n to m times                y{6,7}
{ n, m }?  Preceding element is repeated from n to m times (non-greedy)   z{8,9}?

Anchor  Description
^       Match at the beginning of line (or beginning of text)
$       Match at the end of line (or end of text)
\A      Match at the beginning of text
\z      Match at the end of text
\b      Match at a word boundary
\B      Match if not at a word boundary

Special character  Description
\a                 U+0007: alarm character
\e                 U+001B: escape character
\f                 U+000C: formfeed character
\n                 U+000A: newline character
\r                 U+000D: carriage return character
\t                 U+0009: tabulation character
\v                 U+000B: vertical tabulation character
\*                 U+002A: asterisk character (escape any special character with \ to match it literally)

Character code  Description                                                      Example
\OOO            Character specified by three octal digits OOO                    \033
\xHH            Character specified by two hexadecimal digits HH                 \x1B
\x{HHHHHH}      Character specified by hexadecimal digits HHHHHH (up to 10FFFF)  \x{25A1}

Perl character class  Description          Expands to
\d                    Decimal digits       [0-9]
\D                    Not decimal digits   [^0-9]
\w                    Word characters      [0-9A-Za-z_]
\W                    Not word characters  [^0-9A-Za-z_]
\s                    Whitespace           [\t\n\f\r ]
\S                    Not whitespace       [^\t\n\f\r ]

ASCII character class  Description         Expands to
[[:alnum:]]            Alphanumeric        [0-9A-Za-z]
[[:alpha:]]            Alphabetic          [A-Za-z]
[[:ascii:]]            ASCII               [\x00-\x7F]
[[:blank:]]            Blank               [\t ]
[[:cntrl:]]            Control             [\x00-\x1F\x7F]
[[:digit:]]            Decimal digits      [0-9]
[[:graph:]]            Graphical           [!-~]
[[:lower:]]            Lower case letters  [a-z]
[[:print:]]            Printable           [ -~]
[[:punct:]]            Punctuation         [!-/:-@[-`{-~]
[[:space:]]            Whitespace          [\t\n\v\f\r ]
[[:upper:]]            Upper case letters  [A-Z]
[[:word:]]             Word characters     [0-9A-Za-z_]
[[:xdigit:]]           Hexadecimal digits  [0-9A-Fa-f]
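
A few of the constructs above can be tried out quickly with the sketch below. It uses Python's `re` module, which shares the basic syntax but does not accept every construct in the cheat-sheet (the `[[:alpha:]]` style classes, for instance), so only the common ones appear here. The sample strings are made up.

```python
import re

text = "<a><b>"

# Greedy '+' takes the longest possible match; non-greedy '+?' the shortest.
greedy = re.search(r"<.+>", text).group()
lazy = re.search(r"<.+?>", text).group()

# \033 (octal) and \x1B (hexadecimal) both denote the escape character.
esc_octal = re.search(r"\033\[", "\x1b[0m")
esc_hex = re.search(r"\x1B\[", "\x1b[0m")

# {n,m} bounds the repetition: runs of two to four hexadecimal digits.
bounded = re.findall(r"[0-9A-Fa-f]{2,4}", "7 1F ABCDE")

print(greedy, lazy)
print(esc_octal.span(), esc_hex.span())
print(bounded)
```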
