Log Regex Markup
Requires: com.ioninja.log-regex-markup
The Log Regex Markup engine is an IO Ninja feature dedicated to providing visual aids when you analyze captured data logs.
The engine relies on user-defined regular expressions to automatically and instantaneously highlight data patterns or insert packet delimiters. Under the hood, it is powered by Google RE2, one of the fastest and most thoroughly tested regular expression libraries.
Colorize
The most straightforward way to use the regex markup feature is to write a regular expression for the important tokens in your protocol and then colorize them (i.e., highlight those tokens with color).
For example, in the screenshot, all XTERM CSI escape sequences are automatically colored pink.
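As a rough illustration of what such a rule might look like, the sketch below matches CSI sequences with a pattern that stays within RE2-compatible syntax. The pattern itself is an assumption (ESC, `[`, parameter bytes, intermediate bytes, one final byte), not necessarily the one used for the screenshot, and Python's `re` module is used only to try the pattern out; in IO Ninja you would enter the regex in the markup rule instead.

```python
import re

# Assumed pattern for CSI sequences: ESC, '[', parameter bytes (0x30-0x3F),
# intermediate bytes (0x20-0x2F), and exactly one final byte (0x40-0x7E).
CSI = re.compile(rb"\x1B\[[0-?]*[ -/]*[@-~]")

def find_tokens(data: bytes):
    """Return (start, end, token) for every CSI sequence found in data."""
    return [(m.start(), m.end(), m.group()) for m in CSI.finditer(data)]

sample = b"\x1b[1;31mERROR\x1b[0m done"
tokens = find_tokens(sample)
for start, end, token in tokens:
    print(start, end, token)
```

The returned offsets are the spans a colorizing rule would highlight.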
Packetize
An alternative approach is to use a regular expression to define packet boundaries and insert delimiters at those boundaries.
For instance, if packets in your protocol always start with a specific header/preamble, you can define a regular expression to match those headers and insert delimiters before matches.
Likewise, if packets have well-defined terminators, define a terminator-matching regex and insert delimiters after matches.
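The two delimiter placements can be sketched as follows. This is only a model of the idea, not IO Ninja's implementation: the helper names `split_before` and `split_after` are hypothetical, and the STX header and CR LF terminator are example choices.

```python
import re

def split_before(data: bytes, header: bytes) -> list:
    """Place a packet boundary before every match of the header regex."""
    starts = [m.start() for m in re.finditer(header, data)]
    cuts = sorted(set([0] + starts + [len(data)]))
    return [data[a:b] for a, b in zip(cuts, cuts[1:])]

def split_after(data: bytes, terminator: bytes) -> list:
    """Place a packet boundary after every match of the terminator regex."""
    ends = [m.end() for m in re.finditer(terminator, data)]
    cuts = sorted(set([0] + ends + [len(data)]))
    return [data[a:b] for a, b in zip(cuts, cuts[1:])]

# Packets that start with STX (0x02): delimiters go before matches.
by_header = split_before(b"\x02AB\x02CD", rb"\x02")

# Lines that end with CR LF: delimiters go after matches.
by_terminator = split_after(b"OK\r\nERR\r\n>", rb"\r\n")
```

Note that trailing data without a terminator (the `>` prompt in the second example) ends up in its own, still incomplete, packet.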
Multiple Rules
Want to colorize different entities with different colors? How about also highlighting packet boundaries at the same time? No problem! You can define as many rules and use as many colors as necessary to mark up everything you need.
And the best part — it's going to be just as fast as marking up with a single regex!
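One way to picture why several rules need only one pass over the log is to combine them into a single alternation, as in the sketch below. The rule list, labels, and the `mark_up` helper are hypothetical, and this is not a description of how IO Ninja combines rules internally. Python's `re` is a backtracking engine, so the sketch shows the behavior only, not the performance.

```python
import re

# Hypothetical rule set: (color label, pattern). The patterns contain no
# capturing groups, so group i + 1 of the combined regex belongs to rule i.
RULES = [
    ("pink", r"\x1B\[[0-?]*[ -/]*[@-~]"),
    ("green", r"OK"),
    ("red", r"ERROR"),
]

# One alternation that wraps every rule in its own capturing group.
COMBINED = re.compile("|".join(f"({pattern})" for _, pattern in RULES))

def mark_up(text: str):
    """Return (start, end, color) for every rule match, found in one scan."""
    return [
        (m.start(), m.end(), RULES[m.lastindex - 1][0])
        for m in COMBINED.finditer(text)
    ]

marks = mark_up("\x1b[0mOK then ERROR")
```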
Regex Engine Specifics
The IO Ninja regular expression engine is DFA-based, unlike the backtracking engines used in PCRE, Python, Ruby, etc.
Backtracking engines can suffer from performance drops (sometimes exponential!) depending on the particular regular expressions and the number of patterns. IO Ninja doesn't have these performance drawbacks: no matter what the regular expressions are and how many markup rules you define, the log will always get colorized and packetized FAST!
On the other hand, IO Ninja doesn't support some advanced regular expression features available in PCRE and other backtrackers — most notably, backreferences and named groups.
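To illustrate the DFA idea in general terms, here is a hand-built, table-driven matcher for the toy pattern `ab*c`. It is a teaching sketch, not IO Ninja or RE2 code; the state numbering and the `dfa_match` helper are made up for this example. The point is that each input character costs exactly one table lookup, so the work grows linearly with the input and never revisits earlier characters.

```python
# Hand-built DFA for the pattern ab*c, matched against the whole input.
# States: 0 = start, 1 = seen 'a' followed by any number of 'b', 2 = accept.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPTING = {2}

def dfa_match(text: str):
    """Return (matched, steps); steps counts the characters examined."""
    state = 0
    steps = 0
    for ch in text:
        steps += 1
        state = TRANSITIONS.get((state, ch))
        if state is None:  # dead state: no continuation can match
            return False, steps
    return state in ACCEPTING, steps

print(dfa_match("abbbc"))  # one step per character
print(dfa_match("abbbx"))  # fails at the last character, no retries
```

A matcher of this shape keeps no memory of what earlier groups matched, which is one way to see why backreferences do not fit the model.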
Syntax
Since the backend for our regex markup engine is based on Google RE2, we support everything it does. Please refer to the official RE2 wiki page for a complete syntax specification. Or check the quick cheat-sheet below — that's more than enough to get you started.
Construct | Description | Example |
---|---|---|
. | Any character | . |
\| | Alternative | abc\|def |
[ ] | Character class | [ghi] |
[^ ] | Negated character class (e.g., [^abc]) | [^jkl] |
[[: :]] | ASCII character class | [[:alpha:]] |
[[:^ :]] | Negated ASCII character class | [[:^space:]] |
( ) | Group a sub-expression | (mno)+ |
? | Preceding element is optional | p? |
?? | Preceding element is optional (non-greedy) | q?? |
* | Preceding element is repeated zero or more times | r* |
*? | Preceding element is repeated zero or more times (non-greedy) | s*? |
+ | Preceding element is repeated one or more times | t+ |
+? | Preceding element is repeated one or more times (non-greedy) | u+? |
{ n } | Preceding element is repeated exactly n times | v{3} |
{ n, } | Preceding element is repeated at least n times | w{4,} |
{ n, }? | Preceding element is repeated at least n times (non-greedy) | x{5,}? |
{ n, m } | Preceding element is repeated from n to m times | y{6,7} |
{ n, m }? | Preceding element is repeated from n to m times (non-greedy) | z{8,9}? |
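The difference between greedy and non-greedy quantifiers is easiest to see on a small input. The snippet below uses Python's `re` module purely as a convenient way to try the constructs from the table; the sample strings are made up.

```python
import re

text = "<a><b>"

# Greedy: .+ takes as much as possible, so the match runs to the last '>'.
greedy = re.search(r"<.+>", text).group()    # "<a><b>"

# Non-greedy: .+? takes as little as possible, stopping at the first '>'.
lazy = re.search(r"<.+?>", text).group()     # "<a>"

# {n,m} prefers the longest allowed repetition, {n,m}? the shortest.
longest = re.findall(r"\d{2,3}", "1 22 333 4444")    # ["22", "333", "444"]
shortest = re.findall(r"\d{2,3}?", "1 22 333 4444")  # ["22", "33", "44", "44"]

# ? makes the preceding element optional.
optional = re.findall(r"colou?r", "color colour")    # ["color", "colour"]
```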
Anchor | Description |
---|---|
^ | Match at the beginning of line (or beginning of text) |
$ | Match at the end of line (or end of text) |
\A | Match at the beginning of text |
\z | Match at the end of text |
\b | Match at a word boundary |
\B | Match if not at a word boundary |
Special character | Description |
---|---|
\a | U+0007 — alarm character |
\e | U+001B — escape character |
\f | U+000C — formfeed character |
\n | U+000A — newline character |
\r | U+000D — carriage return character |
\t | U+0009 — tabulation character |
\v | U+000B — vertical tabulation character |
\* | U+002A — asterisk character (escape any special character with \ to match it literally) |
Character code | Description | Example |
---|---|---|
\OOO | Character specified by three octal digits OOO | \033 |
\xHH | Character specified by two hexadecimal digits HH | \x1B |
\x{HHHHHH} | Character specified by hexadecimal digits HHHHHH (up to 10FFFF) | \x{25A1} |
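As a quick check that the octal and hexadecimal forms name the same character, the sketch below tries them in Python's `re` module. One assumption to keep in mind: Python's `re` does not accept the `\x{...}` form, so the example spells the same code point as `\u25A1`; in IO Ninja you would use the forms from the table.

```python
import re

ESC = chr(0x1B)

# The octal form \033 and the hex form \x1B both denote U+001B.
matches_octal = re.fullmatch(r"\033", ESC) is not None
matches_hex = re.fullmatch(r"\x1B", ESC) is not None

# Python spelling of \x{25A1} (an assumption of this sketch, see above).
matches_wide = re.fullmatch(r"\u25A1", "\u25a1") is not None

# A backslash in front of a special character matches it literally.
literal_stars = re.findall(r"\*", "2*3*4")
```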
Perl character class | Description | Expands to |
---|---|---|
\d | Decimal digits | [0-9] |
\D | Not decimal digits | [^0-9] |
\w | Word characters | [0-9A-Za-z_] |
\W | Not word characters | [^0-9A-Za-z_] |
\s | Whitespace | [\t\n\f\r ] |
\S | Not whitespace | [^\t\n\f\r ] |
ASCII character class | Description | Expands to |
---|---|---|
[[:alnum:]] | Alphanumeric | [0-9A-Za-z] |
[[:alpha:]] | Alphabetic | [A-Za-z] |
[[:ascii:]] | ASCII | [\x00-\x7F] |
[[:blank:]] | Blank | [\t ] |
[[:cntrl:]] | Control | [\x00-\x1F\x7F] |
[[:digit:]] | Decimal digits | [0-9] |
[[:graph:]] | Graphical | [!-~] |
[[:lower:]] | Lower case letters | [a-z] |
[[:print:]] | Printable | [ -~] |
[[:punct:]] | Punctuation | [!-/:-@[-`{-~] |
[[:space:]] | Whitespace | [\t\n\v\f\r ] |
[[:upper:]] | Upper case letters | [A-Z] |
[[:word:]] | Word characters | [0-9A-Za-z_] |
[[:xdigit:]] | Hexadecimal digits | [0-9A-Fa-f] |
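If you want to try a pattern that uses these classes in a tool that lacks them (Python's `re` module, for instance), the expansions in the table can be applied mechanically. The helper below is a hypothetical convenience, not part of IO Ninja; it covers only a subset of the classes and only the standalone forms `[[:name:]]` and `[[:^name:]]`.

```python
import re

# Expansions copied from the table (subset), without the outer brackets.
ASCII_CLASSES = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "blank": r"\t ",
    "digit": "0-9",
    "lower": "a-z",
    "space": r"\t\n\v\f\r ",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}

_CLASS = re.compile(r"\[\[:(\^?)(\w+):\]\]")

def expand_ascii_classes(pattern: str) -> str:
    """Rewrite standalone [[:name:]] / [[:^name:]] as plain bracket classes."""
    def repl(m):
        negate, name = m.group(1), m.group(2)
        return "[" + negate + ASCII_CLASSES[name] + "]"
    return _CLASS.sub(repl, pattern)

hex_runs = re.findall(expand_ascii_classes("[[:xdigit:]]+"), "0x1F, zz, beef")
words = re.findall(expand_ascii_classes("[[:^space:]]+"), "de ad\tbe")
```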