The lookup operator extends a primary dataset with a lookup table based on a specified key column. It retrieves matching rows from the lookup table and appends relevant fields to the primary dataset. You can use lookup for enriching event data, adding contextual information, or correlating logs with reference tables.

The lookup operator is useful when:

  • You need to enrich log events with additional metadata, such as mapping user IDs to user profiles.
  • You want to correlate security logs with threat intelligence feeds.
  • You need to extend OpenTelemetry traces with supplementary details, such as service dependencies.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, the lookup command performs a similar function by enriching event data with fields from an external lookup table. However, unlike Splunk, APL’s lookup operator only performs an inner join.

```sql Splunk example index=web_logs | lookup port_lookup port AS client_port OUTPUT service_name ```
['sample-http-logs']
| lookup kind=inner ['port_lookup'] on port

In ANSI SQL, lookup is similar to an INNER JOIN, where records from both tables are matched based on a common key. Unlike SQL, APL doesn’t support other types of joins in lookup.

```sql SQL example SELECT logs.*, ports.service_name FROM logs INNER JOIN port_lookup ports ON logs.port = ports.port; ```
['sample-http-logs']
| lookup kind=inner ['port_lookup'] on port

Usage

Syntax

PrimaryDataset
| lookup kind=KindOfLookup LookupTable on Conditions

Parameters

  • PrimaryDataset: The primary dataset that you want to extend. If you expect one of the tables to contain consistently more data than the other, specify the larger table as the primary dataset.
  • LookupTable: The data table containing additional data, also known as the dimension table or lookup table.
  • KindOfLookup: Optionally, specifies the lookup type as leftouter or inner. The default is leftouter.
    • leftouter lookup includes all rows from the primary dataset even if they don’t match the conditions. In unmatched rows, the new fields contain nulls.
    • inner lookup only includes rows from the primary dataset if they match the conditions. Unmatched rows are excluded from the output.
  • Conditions: The conditions for matching rows from PrimaryDataset to rows from LookupTable. The conditions are equality expressions that determine how Axiom matches rows from the PrimaryDataset (left side of the equality expression) with rows from the LookupTable (right side of the equality expression). The two sides of the equality expression must have the same data type.
    • To use lookup on a key column that has the same name in the primary dataset and the lookup table, simply use the field name. For example, on id.
    • To use lookup on a key column that has different names in the primary dataset and the lookup table, define the two field names in an equality expression such as on id == trace_id.
    • You can define multiple conditions. To separate conditions, use commas (,). Don’t use and. For example, on id == trace_id, span == span_id.

Returns

A dataset where rows from PrimaryDataset are enriched with matching columns from LookupTable based on the key column.

Use case example

Add a field with human-readable names for each service.

Query

let LookupTable=datatable(serviceName:string, humanreadableServiceName:string)[
	'frontend', 'Frontend',
	'frontendproxy', 'Frontend proxy',
	'flagd', 'Flagd',
	'productcatalogservice', 'Product catalog',
	'loadgenerator', 'Load generator',
	'checkoutservice', 'Checkout',
	'cartservice', 'Cart',
	'recommendationservice', 'Recommendations',
	'emailservice', 'Email',
	'adservice', 'Ads',
	'shippingservice', 'Shipping',
	'quoteservice', 'Quote',
	'currencyservice', 'Currency',
	'paymentservice', 'Payment',
	'frauddetectionservice', 'Fraud detection',
];
['otel-demo-traces']
| lookup kind=leftouter LookupTable on $left.['service.name'] == $right.serviceName
| project _time, span_id, ['service.name'], humanreadableServiceName

Run in Playground

Output

_time span_id service.name humanreadableServiceName
Feb 27, 12:01:55 15bf0a95dfbfcd77 loadgenerator Load generator
Feb 27, 12:01:55 86c27626407be459 frontendproxy Frontend proxy
Feb 27, 12:01:55 89d9b5687056b1cf frontendproxy Frontend proxy
Feb 27, 12:01:55 bbc1bac7ebf6ce8a frontend Frontend
Feb 27, 12:01:55 cd12307e154a4817 frontend Frontend
Feb 27, 12:01:55 21fd89efd3d36b15 frontend Frontend
Feb 27, 12:01:55 c6e8db2d149ab273 frontend Frontend
Feb 27, 12:01:55 fd569a8fce7a8446 cartservice Cart
Feb 27, 12:01:55 ed61fac37e9bf220 loadgenerator Load generator
Feb 27, 12:01:55 83fdf8a30477e726 frontend Frontend
Feb 27, 12:01:55 40d94294da7b04ce frontendproxy Frontend proxy
  • join: Performs more flexible join operations, including left, right, and outer joins.
  • project: Selects specific columns from a dataset, which can be used to refine the output of a lookup operation.
  • union: Combines multiple datasets without requiring a key column.

Good morning

I'm here to help you with the docs.

I
AIBased on your context