The where operator in APL is used to filter rows based on specified conditions. You can use the where operator to return only the records that meet the criteria you define. It’s a foundational operator in querying datasets, helping you focus on specific data by applying conditions to filter out unwanted rows. This is useful when working with large datasets, logs, traces, or security events, allowing you to extract meaningful information quickly.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, the where operator filters events based on boolean expressions. APL’s where operator functions similarly, allowing you to filter rows that satisfy a condition.

```sql Splunk example index=main | where status="200" ```
['sample-http-logs']
| where status == '200'

In ANSI SQL, the WHERE clause filters rows in a SELECT query based on a condition. APL’s where operator behaves similarly, but the syntax reflects APL’s specific dataset structures.

```sql SQL example SELECT * FROM sample_http_logs WHERE status = '200' ```
['sample-http-logs']
| where status == '200'

Usage

Syntax

| where condition

Parameters

  • condition: A Boolean expression that specifies the filtering condition. The where operator returns only the rows that satisfy this condition.

Returns

The where operator returns a filtered dataset containing only the rows where the condition evaluates to true.

Use case examples

In this use case, you filter HTTP logs to focus on records where the HTTP status is 404 (Not Found).

Query

['sample-http-logs']
| where status == '404'

Run in Playground

Output

_time id status method uri req_duration_ms geo.city geo.country
2024-10-17 10:20:00 12345 404 GET /notfound.html 120 Seattle US

This query filters out all HTTP requests except those that resulted in a 404 error, making it easy to investigate pages that were not found.

Here, you filter OpenTelemetry traces to retrieve spans where the duration exceeded 500 milliseconds.

Query

['otel-demo-traces']
| where duration > 500ms

Run in Playground

Output

_time span_id trace_id duration service.name kind status_code
2024-10-17 11:15:00 abc123 xyz789 520ms frontend server OK

This query helps identify spans with durations longer than 500 milliseconds, which might indicate performance issues.

In this security use case, you filter logs to find requests from users in a specific country, such as Germany.

Query

['sample-http-logs']
| where ['geo.country'] == 'Germany'

Run in Playground

Output

_time id status method uri req_duration_ms geo.city geo.country
2024-10-17 09:45:00 54321 200 POST /login 100 Berlin Germany

This query helps filter logs to investigate activity originating from a specific country, useful for security and compliance.

where * has

The * has pattern in APL is a dynamic and powerful tool within the where operator. It offers you the flexibility to search for specific substrings across all fields in a dataset without the need to specify each field name individually. This becomes especially advantageous when dealing with datasets that have numerous or dynamically named fields.

where * has is an expensive operation because it searches all fields. For a more efficient query, explicitly list the fields in which you want to search. For example: where firstName has "miguel" or lastName has "miguel".

Basic where * has usage

Find events where any field contains a specific substring.

['sample-http-logs'] 
| where * has "GET"

Run in Playground

Combine multiple substrings

Find events where any field contains one of multiple substrings.

['sample-http-logs'] 
| where * has "GET" or * has "text"

Run in Playground

Use * has with other operators

Find events where any field contains a substring, and another specific field equals a certain value.

['sample-http-logs'] 
| where * has "css" and req_duration_ms == 1

Run in Playground

Advanced chaining

Filter data based on several conditions, including fields containing certain substrings, then summarize by another specific criterion.

['sample-http-logs']
| where * has "GET" and * has "css"
| summarize Count=count() by method, content_type, server_datacenter

Run in Playground

Use with aggregations

Find the average of a specific field for events where any field contains a certain substring.

['sample-http-logs']
| where * has "Japan"
| summarize avg(req_duration_ms)

Run in Playground

String case transformation

The has operator is case insensitive. Use has if you’re unsure about the case of the substring in the dataset. For the case-sensitive operator, use has_cs.

['sample-http-logs']
| where * has "mexico"
| summarize avg(req_duration_ms)

Run in Playground

  • count: Use count to return the number of records that match specific criteria.
  • distinct: Use distinct to return unique values in a dataset, complementing filtering.
  • take: Use take to return a specific number of records, typically in combination with where for pagination.

Good evening

I'm here to help you with the docs.

I
AIBased on your context