The extend-valid operator in Axiom Processing Language (APL) allows you to extend a set of fields with new calculated values, where these calculations are based on conditions of validity for each row. It’s particularly useful when working with datasets that contain missing or invalid data, as it enables you to calculate and assign values only when certain conditions are met. This operator helps you keep your data clean by applying calculations to valid data points, and leaving invalid or missing values untouched.

This is a shorthand operator that creates a field while also performing a basic check on the validity of its value. When you need additional checks, use a combination of the extend and where operators instead. The basic checks that Axiom performs depend on the type of the expression:

  • Dictionary: Check if the dictionary isn’t null and has at least one entry.
  • Array: Check if the array isn’t null and has at least one value.
  • String: Check if the string isn’t empty and has at least one character.
  • Number: Check if the value isn’t one of the following: zero, infinity, or NaN.
  • Other types: Apply the same logic as tobool and check that the result is true.
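The checks in the list above can be sketched as a small Python model. This is an illustration of the rules as stated here, not Axiom's implementation: the names `is_valid` and `extend_valid` are hypothetical, Python's `bool()` stands in for tobool, and treating a null value as invalid and writing null for invalid results are assumptions based on the example outputs on this page.

```python
import math


def is_valid(value):
    """Hypothetical model of the basic validity checks, by value type."""
    if value is None:
        # Assumption: null is never valid, whatever the declared type.
        return False
    if isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int in Python.
        return value
    if isinstance(value, (dict, list, str)):
        # Dictionary, array, string: must contain at least one entry/character.
        return len(value) > 0
    if isinstance(value, int):
        # Integers cannot be infinity or NaN, so only zero is rejected.
        return value != 0
    if isinstance(value, float):
        # Rejects zero, infinity, and NaN.
        return value != 0 and math.isfinite(value)
    # Other types: bool() is a stand-in for tobool.
    return bool(value)


def extend_valid(rows, field, expr):
    """Add `field` to each row; keep the result only when it passes is_valid."""
    extended = []
    for row in rows:
        result = expr(row)
        new_row = dict(row)  # copy so the input row stays unchanged
        new_row[field] = result if is_valid(result) else None
        extended.append(new_row)
    return extended


def add_100(row):
    """Example expression: req_duration_ms + 100, propagating null."""
    duration = row["req_duration_ms"]
    return None if duration is None else duration + 100


rows = [{"req_duration_ms": 20}, {"req_duration_ms": None}, {"req_duration_ms": -100}]
extended_rows = extend_valid(rows, "new_field", add_100)
```

In this model, the third row produces a result of zero, which the number rule rejects, so `new_field` is left null. Cases like this are where an explicit extend followed by where gives you more control over what counts as valid.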

You can use extend-valid to perform conditional transformations on large datasets, especially in scenarios where data quality varies or when dealing with complex log or telemetry data.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, you achieve similar functionality with the eval command and the if function, which together handle conditional logic for valid or invalid data. In APL, extend-valid is more specialized for handling valid data points directly, allowing you to extend fields based on conditions.

Splunk example

```sql
| eval new_field = if(isnotnull(field), field + 1, null())
```

APL equivalent

['sample-http-logs']
| extend-valid new_field = req_duration_ms + 100

In ANSI SQL, similar functionality is often achieved using the CASE WHEN expression within a SELECT statement to handle conditional logic for fields. In APL, extend-valid directly extends a field conditionally, based on the validity of the data.

SQL example

```sql
SELECT
  CASE
    WHEN req_duration_ms IS NOT NULL THEN req_duration_ms + 100
    ELSE NULL
  END AS new_field
FROM sample_http_logs;
```

APL equivalent

['sample-http-logs']
| extend-valid new_field = req_duration_ms + 100

Usage

Syntax

| extend-valid FieldName1 = Expression1, FieldName2 = Expression2, FieldName3 = ...

Parameters

  • FieldName: The name of the field that receives the result of the expression, such as new_field or upper_method in the examples on this page.
  • Expression: The expression to evaluate and apply for valid rows.

Returns

The operator returns a table where each specified field holds the result of its expression for rows where that result is valid. The fields that the expression reads keep their original values.

Use case examples

In this use case, you normalize the HTTP request methods by converting them to uppercase for valid entries.

Query

['sample-http-logs']
| extend-valid upper_method = toupper(method)

Output

| _time | method | upper_method |
| --- | --- | --- |
| 2023-10-01 12:00:00 | get | GET |
| 2023-10-01 12:01:00 | POST | POST |
| 2023-10-01 12:02:00 | NULL | NULL |

In this query, the toupper function converts the method field to uppercase, but only for valid entries. If the method field is null, the result remains null.

In this use case, you extract the first part of the service namespace (before the hyphen) from valid namespaces in the OpenTelemetry traces.

Query

['otel-demo-traces']
| extend-valid namespace_prefix = extract('^(.*?)-', 1, ['service.namespace'])

Output

| _time | service.namespace | namespace_prefix |
| --- | --- | --- |
| 2023-10-01 12:00:00 | opentelemetry-demo | opentelemetry |
| 2023-10-01 12:01:00 | opentelemetry-prod | opentelemetry |
| 2023-10-01 12:02:00 | NULL | NULL |

In this query, the extract function pulls the first part of the service namespace. It only applies to valid service.namespace values, leaving nulls unchanged.

In this use case, you extract the first letter of the city names from the geo.city field for valid log entries.

Query

['sample-http-logs']
| extend-valid city_first_letter = extract('^([A-Za-z])', 1, ['geo.city'])

Output

| _time | geo.city | city_first_letter |
| --- | --- | --- |
| 2023-10-01 12:00:00 | New York | N |
| 2023-10-01 12:01:00 | NULL | NULL |
| 2023-10-01 12:02:00 | London | L |
| 2023-10-01 12:03:00 | 1Paris | NULL |

In this query, the extract function retrieves the first letter of the city name from the geo.city field for valid entries. If the geo.city field is null or starts with a non-alphabetical character, no letter is extracted, and the result remains null.
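The matching behavior in this example can be reproduced with a short Python sketch that uses the same regular expression. The function name `city_first_letter` is hypothetical, and mapping both a null input and a failed match to `None` is an assumption that mirrors the NULL values in the output above. It models the pattern match only, not how Axiom evaluates the query.

```python
import re

# Same pattern as in the query: a single ASCII letter at the start of the string.
FIRST_LETTER = re.compile(r"^([A-Za-z])")


def city_first_letter(city):
    """Return the leading letter of `city`, or None when there is nothing valid."""
    if not city:
        # Null or empty input: nothing to extract.
        return None
    match = FIRST_LETTER.match(city)
    # A value such as "1Paris" does not match, so the result stays None.
    return match.group(1) if match else None


cities = ["New York", None, "London", "1Paris"]
first_letters = [city_first_letter(city) for city in cities]
```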

List of related operators

  • extend: Use extend to add calculated fields unconditionally, without validating data.
  • project: Use project to select and rename fields, without performing conditional extensions.
  • summarize: Use summarize for aggregation, often used before extending fields with further calculations.
  • isfinite: Determines whether the input is a finite value (neither infinite nor NaN).
