The extract_all function retrieves all substrings that match a regular expression from a source string. Use this function when you need to capture multiple matches of a pattern, such as extracting all email addresses, URLs, or repeated patterns from log entries.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, you use rex with max_match=0 to extract all matches. APL's extract_all provides a more direct approach.

```sql Splunk example
| rex field=message max_match=0 "error_(?<code>\d+)"
| mvexpand code
```
['sample-http-logs']
| extend codes = extract_all('error_(\\d+)', dynamic([1]), uri)

In ANSI SQL, extracting all regex matches typically requires recursive queries or database-specific functions. APL's extract_all simplifies this operation.

```sql SQL example
SELECT REGEXP_EXTRACT_ALL(field, 'pattern') AS matches FROM logs;
```
['sample-http-logs']
| extend matches = extract_all('pattern', dynamic([1]), field)

Usage

Syntax

extract_all(regex, captureGroups, text)

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| regex | string | Yes | A regular expression with one or more capture groups. |
| captureGroups | dynamic array | Yes | An array of capture group numbers to extract (for example, dynamic([1]) or dynamic([1,2])). |
| text | string | Yes | The source string to search. |

Returns

Returns a dynamic array containing all matches. With a single capture group, the result is a one-dimensional array of matched strings. With multiple capture groups, the result is a two-dimensional array with one inner array per match.
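The difference between the two return shapes can be easier to see in a small sketch. The following Python function is only a rough analogue of the behavior described above, not the APL implementation: the name `extract_all_sketch` and the sample strings are hypothetical, and Python's `re` syntax may differ in details from the regex dialect APL uses.

```python
import re

def extract_all_sketch(regex, capture_groups, text):
    """Rough Python analogue of extract_all (assumption, not the APL code)."""
    rows = []
    for match in re.finditer(regex, text):
        # Pick the requested capture groups (1-based, as in dynamic([1, 2])).
        values = [match.group(g) for g in capture_groups]
        # One group: flat list of strings. Several groups: one inner list per match.
        rows.append(values[0] if len(capture_groups) == 1 else values)
    return rows

# Single capture group: one-dimensional result.
single = extract_all_sketch(r"([0-9]+)", [1], "/api/users/123/posts/456")
# Two capture groups: two-dimensional result (hypothetical query string).
pairs = extract_all_sketch(r"(\w+)=(\d+)", [1, 2], "page=2&limit=50")

print(single)  # ['123', '456']
print(pairs)   # [['page', '2'], ['limit', '50']]
```

In APL, the corresponding calls would pass `dynamic([1])` and `dynamic([1,2])` as the captureGroups argument.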

Use case examples

Extract all numeric values from URIs to analyze parameter patterns in API requests.

Query

['sample-http-logs']
| extend numbers = extract_all('([0-9]+)', dynamic([1]), uri)
| where array_length(numbers) > 0
| project _time, uri, numbers, method
| limit 10


Output

| _time | uri | numbers | method |
| --- | --- | --- | --- |
| 2024-11-06T10:00:00Z | /api/users/123/posts/456 | ["123", "456"] | GET |
| 2024-11-06T10:01:00Z | /products/789 | ["789"] | GET |
| 2024-11-06T10:02:00Z | /orders/111/items/222/details/333 | ["111", "222", "333"] | POST |

This query extracts all numeric values from URIs, helping analyze how many IDs are typically passed in API requests and their patterns.
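To make the "how many IDs per request" idea concrete, here is a minimal Python sketch that counts the numeric values in the three sample URIs from the output above. It only mirrors the logic for illustration; in APL you would likely apply `array_length(numbers)` to the extracted array instead. The variable names are hypothetical.

```python
import re

# Sample URIs copied from the example output.
uris = [
    "/api/users/123/posts/456",
    "/products/789",
    "/orders/111/items/222/details/333",
]

# Count how many numeric IDs each URI carries.
id_counts = {uri: len(re.findall(r"[0-9]+", uri)) for uri in uris}

for uri, count in id_counts.items():
    print(count, uri)
```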

Extract all service names mentioned in span attributes to understand service dependencies.

Query

['otel-demo-traces']
| extend service_mentions = extract_all('(frontend|checkout|cart|product-catalog)', dynamic([1]), ['service.name'])
| where array_length(service_mentions) > 0
| summarize mention_count = count() by service_mention = tostring(service_mentions)
| sort by mention_count desc
| limit 10


Output

| service_mention | mention_count |
| --- | --- |
| ["frontend"] | 4532 |
| ["checkout"] | 3421 |
| ["cart"] | 2987 |
| ["product-catalog"] | 2341 |

This query extracts all service name patterns from span data, helping understand which services are most frequently referenced in traces.

Extract all suspicious keywords from URIs to detect potential SQL injection or XSS attempts.

Query

['sample-http-logs']
| extend threats = extract_all('(union|select|script|alert|drop|insert|delete)', dynamic([1]), uri)
| where array_length(threats) > 0
| project _time, uri, threats, id, status, ['geo.country']
| sort by array_length(threats) desc
| limit 10


Output

| _time | uri | threats | id | status | geo.country |
| --- | --- | --- | --- | --- | --- |
| 2024-11-06T10:00:00Z | /search?q= | ["script", "alert"] | user123 | 403 | Unknown |
| 2024-11-06T10:01:00Z | /api?id=1'union select * | ["union", "select"] | user456 | 403 | Russia |

This query extracts all SQL and XSS-related keywords from URIs, helping identify potential injection attacks by counting how many threat indicators appear in each request.

List of related functions

  • extract: Extracts only the first match of a regex pattern. Use this when you only need the first occurrence rather than all matches.
  • split: Splits strings by a delimiter into an array. Use this for simpler tokenization without regex complexity.
  • parse_json: Parses JSON strings into dynamic objects. Use this when working with structured JSON data rather than regex patterns.
  • countof_regex: Counts regex pattern occurrences. Use this when you only need the count of matches, not the actual matched text.
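As a rough guide to choosing between these functions, the sketch below shows Python operations that behave analogously to each one. This is an analogy only, not how APL implements them, and the sample string is hypothetical.

```python
import re

# Hypothetical log line with a repeated pattern.
text = "error_404 then error_500 then error_404"
pattern = r"error_(\d+)"

first = re.search(pattern, text).group(1)  # like extract: first match only
every = re.findall(pattern, text)          # like extract_all: every match
count = len(every)                         # like countof_regex: only the number of matches
tokens = text.split(" then ")              # like split: plain delimiter, no regex

print(first)   # 404
print(every)   # ['404', '500', '404']
print(count)   # 3
print(tokens)  # ['error_404', 'error_500', 'error_404']
```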
