The project-keep operator in APL selects fields. It keeps only the fields you list in its parameters and discards all others. This is useful when you need only a subset of fields in your query results and want to reduce clutter or improve performance by dropping fields you don’t use.

You can use project-keep when you need to focus on particular data points, such as in log analysis, security event monitoring, or extracting key fields from traces.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, the table command performs a similar task to APL’s project-keep. It selects only the fields you specify and excludes any others.

Splunk example:

```splunk
index=main | table _time, status, uri
```

APL equivalent:

['sample-http-logs'] 
| project-keep _time, status, uri

In ANSI SQL, the SELECT statement combined with field names performs a task similar to project-keep in APL. Both allow you to specify which fields to retrieve from the dataset.

SQL example:

```sql
SELECT _time, status, uri FROM sample_http_logs
```

APL equivalent:

['sample-http-logs'] 
| project-keep _time, status, uri

Usage

Syntax

| project-keep FieldName1, FieldName2, ...

Parameters

  • FieldName: The field you want to keep in the result set.

Returns

project-keep returns a dataset with only the specified fields. All other fields are removed from the output. The result contains the same number of rows as the input table.
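
The return behavior can be modeled outside APL. The following is a minimal Python sketch, not Axiom's implementation: `project_keep` is a hypothetical helper and the sample rows are made up for illustration. It shows that only the listed fields survive and that the row count is unchanged.

```python
def project_keep(rows, fields):
    """Model of project-keep: keep only the listed fields in each row.

    Hypothetical helper for illustration; not part of APL or any Axiom SDK.
    """
    wanted = set(fields)
    # This model keeps each row's existing key order; the column order
    # that APL produces is not modeled here.
    return [{k: v for k, v in row.items() if k in wanted} for row in rows]


# Made-up sample rows, loosely shaped like HTTP log events.
rows = [
    {"_time": "2024-10-17 10:00:00", "status": 200, "uri": "/index.html", "method": "GET"},
    {"_time": "2024-10-17 10:01:00", "status": 404, "uri": "/non-existent.html", "method": "GET"},
]

kept = project_keep(rows, ["_time", "status", "uri"])
print(len(kept) == len(rows))  # compares output row count with input row count
print(sorted(kept[0]))         # field names that remain in the first row
```

In this model the `method` field is dropped from every row, while no row is added or removed.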

Use case examples

For log analysis, you might want to keep only the fields that are relevant to investigating HTTP requests.

Query

['sample-http-logs'] 
| project-keep _time, status, uri, method, req_duration_ms

Output

| _time | status | uri | method | req_duration_ms |
| --- | --- | --- | --- | --- |
| 2024-10-17 10:00:00 | 200 | /index.html | GET | 120 |
| 2024-10-17 10:01:00 | 404 | /non-existent.html | GET | 50 |
| 2024-10-17 10:02:00 | 500 | /server-error | POST | 300 |

This query keeps every row but shows only the request timestamp, status, URI, method, and duration, which can help you analyze server performance or errors.

For OpenTelemetry trace analysis, you may want to focus on key tracing details such as service names and trace IDs.

Query

['otel-demo-traces'] 
| project-keep _time, trace_id, span_id, ['service.name'], duration

Output

| _time | trace_id | span_id | service.name | duration |
| --- | --- | --- | --- | --- |
| 2024-10-17 10:03:00 | abc123 | xyz789 | frontend | 500ms |
| 2024-10-17 10:04:00 | def456 | mno345 | checkoutservice | 250ms |

This query extracts specific tracing information, such as trace and span IDs, the name of the service, and the span’s duration.

In security log analysis, focusing on essential fields like user ID and HTTP status can help track suspicious activity.

Query

['sample-http-logs'] 
| project-keep _time, id, status, uri, ['geo.city'], ['geo.country']

Output

| _time | id | status | uri | geo.city | geo.country |
| --- | --- | --- | --- | --- | --- |
| 2024-10-17 10:05:00 | user123 | 403 | /admin | New York | USA |
| 2024-10-17 10:06:00 | user456 | 200 | /login | San Francisco | USA |

This query keeps only the timestamp, user ID, status, URI, and location fields, so you can tie HTTP status codes to individual users and spot potential unauthorized access attempts.

List of related operators

  • project: Use project to explicitly specify the fields you want in your result, while also allowing transformations or calculations on those fields.
  • extend: Use extend to add new fields or modify existing ones without dropping any fields.
  • summarize: Use summarize when you need to perform aggregation operations on your dataset, grouping data as necessary.
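
To make the differences between these operators concrete, here is a rough Python analogy. It is not APL and not how Axiom implements them: the helpers `project`, `extend`, and `summarize_count` are hypothetical, as are the sample rows and the `count` output field name.

```python
from collections import Counter


def project(rows, columns):
    """Model of project: output only the named columns, each computed from the row."""
    return [{name: fn(row) for name, fn in columns.items()} for row in rows]


def extend(rows, name, fn):
    """Model of extend: add (or overwrite) one field and keep all the others."""
    return [{**row, name: fn(row)} for row in rows]


def summarize_count(rows, key):
    """Model of a count grouped by key: one output row per distinct key value."""
    counts = Counter(row[key] for row in rows)
    return [{key: value, "count": n} for value, n in counts.items()]


# Made-up sample rows for illustration.
rows = [
    {"status": 200, "req_duration_ms": 120},
    {"status": 200, "req_duration_ms": 80},
    {"status": 500, "req_duration_ms": 300},
]

projected = project(rows, {
    "status": lambda r: r["status"],
    "duration_s": lambda r: r["req_duration_ms"] / 1000,  # computed field
})
extended = extend(rows, "is_error", lambda r: r["status"] >= 500)
summary = summarize_count(rows, "status")
print(projected[0], extended[2], summary, sep="\n")
```

In this analogy, `project` and `extend` return one row per input row, whereas `summarize_count` collapses the input to one row per group.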

Wildcard

A wildcard is a special character, or set of characters, that stands in for any other characters in a search pattern. Use wildcards to match several fields with a single pattern instead of listing each field by name.

Write a wildcard either as a bare prefix followed by an asterisk, such as data*, or, for field names that need quoting, as a quoted prefix followed by an asterisk, such as ['data.fo']*.
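
As a rough model of trailing-wildcard matching, the Python sketch below (not APL) treats `prefix*` as "any field name that starts with prefix". The helper `project_keep_wildcard` and the field names are illustrative assumptions, and the sketch deliberately handles only a trailing `*`, since that is the only form this page shows.

```python
def project_keep_wildcard(rows, patterns):
    """Model of project-keep with trailing wildcards such as 'resp*'.

    Hypothetical helper; handles only exact names and a single trailing '*'.
    """
    def matches(field):
        for pattern in patterns:
            if pattern.endswith("*"):
                # Trailing wildcard: compare against the prefix before '*'.
                if field.startswith(pattern[:-1]):
                    return True
            elif field == pattern:
                return True
        return False

    return [{k: v for k, v in row.items() if matches(k)} for row in rows]


# Made-up row; the field names are assumptions chosen for illustration.
rows = [{
    "_time": "2024-10-17 10:00:00",
    "status": 200,
    "resp_body_size_bytes": 512,
    "content_type": "text/html",
    "geo.city": "New York",
    "geo.country": "USA",
}]

# Models the patterns resp*, content*, and ['geo.']* (written here as 'geo.*').
kept = project_keep_wildcard(rows, ["resp*", "content*", "geo.*"])
print(list(kept[0]))
```

Fields that match none of the patterns, such as `_time` and `status` in this made-up row, are dropped by the model.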

Here’s how you can use wildcards in project-keep:

['sample-http-logs']
| project-keep resp*, content*,  ['geo.']*

['github-push-event']
| project-keep size*, repo*, ['commits']*, id*
