project operator

The project operator in Axiom Processing Language (APL) is used to select specific fields from a dataset, potentially renaming them or applying calculations on the fly. With project, you can control which fields are returned by the query, allowing you to focus on only the data you need.

This operator is useful when you want to refine your query results by reducing the number of fields, renaming them, or deriving new fields from existing data. Because it drops every field you do not list, it works well for trimming wide datasets and applying light transformations.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, the closest equivalents of the project operator are the table and fields commands. SPL’s table selects the fields to display, while fields can either keep or remove fields. project in APL corresponds to the keep case: it returns only the fields you list.

```sql Splunk example
| table _time, status, uri
```
['sample-http-logs']
| project _time, status, uri

In ANSI SQL, the SELECT statement serves a similar role to the project operator in APL. SQL users will recognize that project behaves like selecting fields from a table, with the ability to rename or transform fields inline.

```sql SQL example
SELECT _time, status, uri FROM sample_http_logs;
```
['sample-http-logs']
| project _time, status, uri
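
The inline renaming mentioned above can be sketched as follows. This is a minimal illustration, not a definitive mapping: the output name status_code is hypothetical, and the SQL in the comments is included only for comparison.

```kusto
// Roughly equivalent SQL, for comparison:
//   SELECT _time, status AS status_code, uri FROM sample_http_logs;
['sample-http-logs']
| project _time, status_code = status, uri
```

Where SQL writes the alias after the expression with AS, project puts the new field name first, followed by = and the expression.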

Usage

Syntax

| project FieldName [= Expression] [, ...]

Or

| project FieldName, FieldName, FieldName, ...

Or

| project [FieldName, FieldName[,]] = Expression [, ...]

Parameters

  • FieldName: The name of a field to include in the result set. Fields appear in the order you list them. If there is no Expression, then FieldName is required and a field of that name must exist in the input.
  • Expression: Optional scalar expression referencing the input fields. When present, its value is returned under the name FieldName.

Returns

The project operator returns a dataset containing only the specified fields.
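
As a rough sketch of how the syntax forms combine, the query below mixes a plain field, a renamed field, and a computed field in a single project. The output names status_code and is_api_call are hypothetical and chosen only for illustration. The query assumes uri is a string field, as the sample output later on this page suggests.

```kusto
['sample-http-logs']
| project
    _time,                                   // kept as-is
    status_code = status,                    // renamed (hypothetical output name)
    is_api_call = uri startswith '/api/'     // computed from an existing field (hypothetical output name)
```

Under these assumptions, the result contains exactly three fields in the listed order: _time, status_code, and is_api_call. The original status and uri fields are not returned because they are not listed under their own names.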

Use case examples

In this example, you’ll extract the timestamp, HTTP status code, and request URI from the sample HTTP logs.

Query

['sample-http-logs']
| project _time, status, uri


Output

_time status uri
2024-10-17 12:00:00 200 /api/v1/getData
2024-10-17 12:01:00 404 /api/v1/getUser

The query returns only the timestamp, HTTP status code, and request URI, dropping all other fields from the result.

In this example, you’ll extract trace information such as the service name, span ID, and duration from OpenTelemetry traces.

Query

['otel-demo-traces']
| project ['service.name'], span_id, duration


Output

service.name span_id duration
frontend span-1234abcd 00:00:02
cartservice span-5678efgh 00:00:05

The query isolates relevant tracing data, such as the service name, span ID, and duration of spans.

In this example, you’ll focus on security log entries by projecting only the timestamp, user ID, and HTTP status from the sample HTTP logs.

Query

['sample-http-logs']
| project _time, id, status


Output

_time id status
2024-10-17 12:00:00 user1 200
2024-10-17 12:01:00 user2 403

The query extracts only the timestamp, user ID, and HTTP status for analysis of access control in security logs.

Related operators

  • extend: Use extend to add new fields or calculate values without removing any existing fields.
  • summarize: Use summarize to aggregate data across groups of rows, which is useful when you’re calculating totals or averages.
  • where: Use where to filter rows based on conditions, often paired with project to refine your dataset further.
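
To illustrate how where pairs with project, here is a minimal sketch that first filters rows and then trims the fields. The '/api/v1' prefix is an assumption taken from the sample output on this page, not a property of every dataset.

```kusto
['sample-http-logs']
| where uri startswith '/api/v1'    // keep only rows whose URI starts with the assumed prefix
| project _time, status, uri        // then return only these three fields
```

Swapping project for extend in the last step would instead keep every existing field, which is the main difference between the two operators.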
