The redact operator in APL replaces sensitive or unwanted data in string fields using regular expressions. You can use it to sanitize log data, obfuscate personal information, or anonymize text for auditing or analysis. The operator allows you to define one or multiple regular expressions to identify and replace matching patterns. You can customize the replacement token, generate hashes of redacted values, or retain structural elements while obfuscating specific segments of data.

This operator is useful when you need to ensure data privacy or compliance with regulations such as GDPR or HIPAA. For example, you can redact credit card numbers, email addresses, or personally identifiable information from logs and datasets.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, data sanitization is often achieved using custom regex-based transformations or eval functions. The redact operator in APL simplifies this process by directly applying regular expressions and offering options for replacement or hashing.

```sql Splunk example | eval sanitized_field=replace(field, "regex_pattern", "*") ```
| redact 'regex_pattern' on field

ANSI SQL typically requires a combination of functions like REPLACE or REGEXP_REPLACE for data obfuscation. APL’s redact operator consolidates these capabilities into a single, flexible command.

```sql SQL example SELECT REGEXP_REPLACE(field, 'regex_pattern', '*') AS sanitized_field FROM table; ```
| redact 'regex_pattern' on field

Usage

Syntax

| redact [replaceToken="*"] [replaceHash=false] [redactGroups=false] <regex>, (<regex>) [on Field]

Parameters

Parameter Type Description
replaceToken string The string with which to replace matches. If you specify a single character, Axiom replaces each character in the matching text with replaceToken. If you specify more than one character, Axiom replaces the whole of the matching text with replaceToken. The default replaceToken is the * character.
replaceHash bool Specifies whether to replace matches with a hash of the data. You can’t use both replaceToken and replaceHash in the same query.
redactGroups bool Specifies whether to look for capturing groups in the regex and only redact characters in the capturing groups. Use this option for partial replacements or replacements that maintain the structure of the data. The default is false.
regex regex A single regex or an array/map of regexes to match against field values.
on Field Limits redaction to specific fields. If you omit this parameter, Axiom redacts all string fields in the dataset.

Returns

Returns the input dataset with sensitive data replaced or hashed.

Sample regular expressions

Operation Sample regex Original string Redacted string
Redact email addresses [a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+ Incoming Mail - abc@test.com Incoming Mail - ************
Redact social security numbers \d{3}-\d{2}-\d{4} SSN 123-12-1234.pdf SSN ***********.pdf
Redact IBAN [A-Z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?!(?:[ ]?[0-9]){3})(?:[ ]?[0-9]{1,2})? AB12 1234 1234 1234 1234 ************************

Use case examples

Use the redact operator to sanitize HTTP logs by obfuscating geographical data.

Query

['sample-http-logs']
| redact replaceToken="x" @'.*' on ['geo.city'], ['geo.country']

Run in Playground

Output

_time geo.city geo.country
2025-01-01 12:00:00 xxx xxxxxxxx
2025-01-01 12:05:00 xxxxxx xxxxxxxxxx

The query replaces all characters matching the pattern .* with the character x in the geo.city and geo.country fields.

In OpenTelemetry traces, use redact to anonymize Kubernetes node names with their hashes while preserving the service structure.

Query

['otel-demo-traces']
| redact replaceHash=true @'.*' on ['resource.k8s.node.name']

Run in Playground

Output

_time resource.k8s.node.name service.name
2025-01-01 12:00:00 QQXRv6VU frontend
2025-01-01 12:05:00 Q1urOteW checkoutservice

The query replaces Kubernetes node names with hashed values while keeping the rest of the trace intact.

Use the redact operator to remove parts of a URL from security logs.

Query

['sample-http-logs']
| redact replaceToken="<REDACTED>" redactGroups=true @'.*/(.*)' on uri

Run in Playground

Output

_time uri
2025-01-01 12:00:00 /api/v1/pub/sub/<REDACTED>
2025-01-01 12:05:00 /api/v1/textdata/<REDACTED>
2025-01-01 12:10:00 /api/v1/payment/<REDACTED>

The query performs a partial redaction in the capturing groups of the regex. It replaces the slug of the URL (the part after the last /) with the text <REDACTED>.

  • project: Select specific fields from the dataset. Useful for focused analysis.
  • summarize: Aggregate data. Helpful when combining redacted data with statistical analysis.
  • parse: Extract and parse structured data using regex patterns.

When you need custom replacement patterns, use the replace_regex function for precise control over string replacements. redact provides a simpler, security-focused interface. Use redact if you’re primarily focused on data privacy and compliance, and replace_regex if you need more control over the replacement text format.

Good morning

I'm here to help you with the docs.

I
AIBased on your context