The project-keep operator in APL is a powerful tool for field selection. It allows you to explicitly keep specific fields from a dataset, discarding any others not listed in the operator’s parameters. This is useful when you only need to work with a subset of fields in your query results and want to reduce clutter or improve performance by eliminating unnecessary fields.
You can use project-keep when you need to focus on particular data points, such as in log analysis, security event monitoring, or extracting key fields from traces.
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
In Splunk SPL, the table command performs a similar task to APL’s project-keep. It selects only the fields you specify and excludes all others. The equivalent APL query is:
['sample-http-logs']
| project-keep _time, status, uri
In ANSI SQL, a SELECT statement that lists field names performs a task similar to project-keep in APL. Both let you specify which fields to retrieve from the dataset. The equivalent APL query is:
['sample-http-logs']
| project-keep _time, status, uri
Usage
Syntax
| project-keep FieldName1, FieldName2, ...
Parameters
FieldName: The field you want to keep in the result set.
Returns
project-keep returns a dataset with only the specified fields. All other fields are removed from the output. The result contains the same number of rows as the input table.
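Because project-keep only selects fields, reducing the number of rows is a separate step. The following is a minimal sketch, assuming the sample-http-logs dataset used elsewhere on this page and assuming its method field holds string values such as GET. The where operator reduces the rows, and project-keep then leaves that row count unchanged while dropping the unlisted fields.

```kusto
// Assumption: method is a string field in sample-http-logs.
['sample-http-logs']
// Row filtering happens here, not in project-keep.
| where method == 'GET'
// Only the listed fields remain; the number of rows is unchanged.
| project-keep _time, status, uri, method
```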
Use case examples
For log analysis, you might want to keep only the fields that are relevant to investigating HTTP requests.
Query
['sample-http-logs']
| project-keep _time, status, uri, method, req_duration_ms
Output
| _time | status | uri | method | req_duration_ms |
|---|---|---|---|---|
| 2024-10-17 10:00:00 | 200 | /index.html | GET | 120 |
| 2024-10-17 10:01:00 | 404 | /non-existent.html | GET | 50 |
| 2024-10-17 10:02:00 | 500 | /server-error | POST | 300 |
This query keeps only the request timestamp, status, URI, method, and duration fields, which can help you analyze server performance or errors.
For OpenTelemetry trace analysis, you may want to focus on key tracing details such as service names and trace IDs.
Query
['otel-demo-traces']
| project-keep _time, trace_id, span_id, ['service.name'], duration
Output
| _time | trace_id | span_id | service.name | duration |
|---|---|---|---|---|
| 2024-10-17 10:03:00 | abc123 | xyz789 | frontend | 500ms |
| 2024-10-17 10:04:00 | def456 | mno345 | checkoutservice | 250ms |
This query extracts specific tracing information, such as trace and span IDs, the name of the service, and the span’s duration.
In security log analysis, focusing on essential fields like user ID and HTTP status can help track suspicious activity.
Query
['sample-http-logs']
| project-keep _time, id, status, uri, ['geo.city'], ['geo.country']
Output
| _time | id | status | uri | geo.city | geo.country |
|---|---|---|---|---|---|
| 2024-10-17 10:05:00 | user123 | 403 | /admin | New York | USA |
| 2024-10-17 10:06:00 | user456 | 200 | /login | San Francisco | USA |
This query narrows down the data to track HTTP status codes by users, helping identify potential unauthorized access attempts.
List of related operators
- project: Use project to explicitly specify the fields you want in your result, while also allowing transformations or calculations on those fields.
- extend: Use extend to add new fields or modify existing ones without dropping any fields.
- summarize: Use summarize when you need to perform aggregation operations on your dataset, grouping data as necessary.
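To illustrate how these operators can combine with project-keep, here is a hedged sketch of a single pipeline. The field name req_duration_s is hypothetical and introduced only for illustration, and the query assumes req_duration_ms is numeric, as the sample output earlier on this page suggests.

```kusto
['sample-http-logs']
// Keep only the fields needed downstream.
| project-keep _time, status, req_duration_ms
// extend adds a field (hypothetical name) without dropping the existing ones.
| extend req_duration_s = req_duration_ms / 1000
// summarize aggregates the rows, grouped by status.
| summarize avg(req_duration_s) by status
```

With project, the field selection and the calculation could likely be written as one step instead, for example project _time, status, req_duration_s = req_duration_ms / 1000.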
Wildcard
A wildcard is a special character or set of characters that substitutes for any other characters in a search pattern. Use wildcards to create more flexible queries and perform more powerful searches.
The syntax for a wildcard is either data* or ['data.fo']*.
Here’s how you can use wildcards in project-keep:
['sample-http-logs']
| project-keep resp*, content*, ['geo.']*
['github-push-event']
| project-keep size*, repo*, ['commits']*, id*