Project Keep Operator — axiomhq/docs

The project-keep operator in APL selects fields. It keeps only the fields you list in its parameters and discards all others. This is useful when you only need a subset of fields in your query results and want to reduce clutter or improve performance by dropping the rest.

You can use project-keep when you need to focus on particular data points, such as in log analysis, security event monitoring, or extracting key fields from traces.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, the table command performs a similar task to APL’s project-keep. It selects only the fields you specify and excludes any others.

```splunk Splunk example
index=main | table _time, status, uri
```

APL equivalent:

['sample-http-logs']
| project-keep _time, status, uri

In ANSI SQL, the SELECT statement combined with field names performs a task similar to project-keep in APL. Both allow you to specify which fields to retrieve from the dataset.

```sql SQL example
SELECT _time, status, uri FROM sample_http_logs
```

APL equivalent:

['sample-http-logs']
| project-keep _time, status, uri

Usage

Syntax

| project-keep FieldName1, FieldName2, ...

Parameters

FieldName: The field you want to keep in the result set.

Returns

project-keep returns a dataset with only the specified fields. All other fields are removed from the output. The result contains the same number of rows as the input table.
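Because the fields that project-keep drops are no longer available to later operators, it usually makes sense to filter first and trim fields last. The query below is a minimal sketch of that ordering. It assumes the sample-http-logs dataset used elsewhere on this page and that req_duration_ms is numeric; the 100 ms threshold is an arbitrary value chosen for illustration.

```kusto
// Filter rows first, while every field is still available,
// then trim each remaining row down to the fields of interest.
['sample-http-logs']
| where req_duration_ms > 100
| project-keep _time, status, uri, req_duration_ms
```

Here the where operator reduces the number of rows. project-keep then leaves that row count unchanged and only narrows the set of fields.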

Use case examples

For log analysis, you might want to keep only the fields that are relevant to investigating HTTP requests.

Query

['sample-http-logs'] 
| project-keep _time, status, uri, method, req_duration_ms

Output

_time	status	uri	method	req_duration_ms
2024-10-17 10:00:00	200	/index.html	GET	120
2024-10-17 10:01:00	404	/non-existent.html	GET	50
2024-10-17 10:02:00	500	/server-error	POST	300

This query keeps only the request timestamp, status, URI, method, and duration for every row, which can help you analyze server performance or errors.

For OpenTelemetry trace analysis, you may want to focus on key tracing details such as service names and trace IDs.

Query

['otel-demo-traces'] 
| project-keep _time, trace_id, span_id, ['service.name'], duration

Output

_time	trace_id	span_id	service.name	duration
2024-10-17 10:03:00	abc123	xyz789	frontend	500ms
2024-10-17 10:04:00	def456	mno345	checkoutservice	250ms

This query extracts specific tracing information, such as trace and span IDs, the name of the service, and the span’s duration.

In security log analysis, focusing on essential fields like user ID and HTTP status can help track suspicious activity.

Query

['sample-http-logs'] 
| project-keep _time, id, status, uri, ['geo.city'], ['geo.country']

Output

_time	id	status	uri	geo.city	geo.country
2024-10-17 10:05:00	user123	403	/admin	New York	USA
2024-10-17 10:06:00	user456	200	/login	San Francisco	USA

This query narrows down the data to track HTTP status codes by users, helping identify potential unauthorized access attempts.

List of related operators

project: Use project to explicitly specify the fields you want in your result, while also allowing transformations or calculations on those fields.
extend: Use extend to add new fields or modify existing ones without dropping any fields.
summarize: Use summarize when you need to perform aggregation operations on your dataset, grouping data as necessary.
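The queries below sketch how these operators differ from project-keep on the same input. They are separate queries, not one pipeline. The dataset and field names follow the examples earlier on this page, and duration_s is a hypothetical field name introduced only for illustration.

```kusto
// project: return only the listed fields, and compute a new one on the way.
['sample-http-logs']
| project _time, status, duration_s = req_duration_ms / 1000.0

// extend: add the computed field but keep every existing field.
['sample-http-logs']
| extend duration_s = req_duration_ms / 1000.0

// summarize: aggregate rows, here counting requests per status.
['sample-http-logs']
| summarize count() by status

// project-keep: return the listed fields unchanged, with no computation.
['sample-http-logs']
| project-keep _time, status, req_duration_ms
```

As a rule of thumb under these assumptions, reach for project-keep when you want existing fields exactly as they are, and for project or extend when you also need to compute something.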

Wildcard

A wildcard is a special character that stands in for other characters in a search pattern. Use wildcards to make queries more flexible, for example to keep a group of related fields without listing each one.

Write a wildcard either as data* or, for quoted field names, as ['data.fo']*.

Here’s how you can use wildcards in project-keep:

['sample-http-logs']
| project-keep resp*, content*, ['geo.']*

['github-push-event']
| project-keep size*, repo*, ['commits']*, id*
