The externaldata operator in APL allows you to retrieve data from external storage sources, such as Azure Blob Storage, AWS S3, or HTTP endpoints, and use it within queries. You can specify the schema of the external data and query it as if it were a native dataset. This operator is useful when you need to analyze data that’s stored externally without importing it into Axiom.

The `externaldata` operator currently supports external data sources with a file size of maximum 5 MB.

The externaldata operator is currently in public preview. For more information, see Feature states.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Splunk doesn’t have a direct equivalent to externaldata, but you can use inputlookup or | rest commands to retrieve data from external sources.

```sql Splunk example | inputlookup external_data.csv ```
externaldata (id:string, timestamp:datetime) ["https://storage.example.com/data.csv"] with (format="csv")

In SQL, the equivalent approach is to use OPENROWSET to access external data stored in cloud storage.

```sql SQL example SELECT * FROM OPENROWSET(BULK 'https://storage.example.com/data.csv', FORMAT = 'CSV') AS data; ```
externaldata (id:string, timestamp:datetime) ["https://storage.example.com/data.csv"] with (format="csv")

Usage

Syntax

externaldata (FieldName1:FieldType1, FieldName2:FieldType2, ...) ["URL1", "URL2", ...] [with (format = "FormatType", ignoreFirstRecord=false)]

Parameters

Parameter Description
FieldName1:FieldType1, FieldName2:FieldType2, ... Defines the schema of the external data.
URL1, URL2, ... The external storage URIs where the source data resides.
format Optional: Specifies the file format. The supported types are csv, scsv, tsv, psv, json, multijson, raw, txt.
ignoreFirstRecord Optional: A Boolean value that specifies whether to ignore the first record in the external data sources. The default is false. Use this property for CSV files with headers.

Returns

The operator returns a table with the specified schema, containing data retrieved from the external source.

Use case examples

You have an Axiom dataset that contains access logs with a field employeeID. You want to add extra information to your APL query by cross-referencing each employee ID in the Axiom dataset with an employee ID defined in an external lookup table. The lookup table is hosted somewhere else in CSV format.

External lookup table

employeeID, email, name, location
00001, tifa@acme.com, Tifa Lockhart, US
00002, barret@acme.com, Barret Wallace, Europe
00003, cid@acme.com, Cid Highwind, Europe

Query

let employees = externaldata (employeeID: string, email: string, name: string, location: string) ["http://example.com/lookup-table.csv"] with (format="csv", skipFirstRow=true);
accessLogs
| where severity == "high"
| lookup employees on employeeID
| project _time, severity, employeeID, email, name

Output

_time severity employeeID email name
Mar 13, 10:08:23 high 00001 tifa@acme.com Tifa Lockhart
Mar 13, 10:05:03 high 00001 tifa@acme.com Tifa Lockhart
Mar 13, 10:04:51 high 00003 cid@acme.com Cid Highwind
Mar 13, 10:02:29 high 00002 barret@acme.com Barret Wallace
Mar 13, 10:01:13 high 00001 tifa@acme.com Tifa Lockhart

This example extends the original dataset with the fields email and name. These new fields come from the external lookup table.

Use a lookup table from an external source to extend an OTel logs dataset with a field that contains human-readable names for each service.

External lookup table

serviceName,humanreadableServiceName
frontend,Frontend
frontendproxy,Frontendproxy
flagd,Flagd
productcatalogservice,Productcatalog
loadgenerator,Loadgenerator
checkoutservice,Checkout
cartservice,Cart
recommendationservice,Recommendations
emailservice,Email
adservice,Ads
shippingservice,Shipping
quoteservice,Quote
currencyservice,Currency
paymentservice,Payment
frauddetectionservice,Frauddetection

Query

let LookupTable = externaldata (serviceName: string, humanreadableServiceName: string) ["http://example.com/lookup-table.csv"] with (format="csv", ignoreFirstRecord=true);
['otel-demo-traces']
| lookup kind=leftouter LookupTable on $left.['service.name'] == $right.serviceName
| project _time, span_id, ['service.name'], humanreadableServiceName

Output

_time span_id service.name humanreadableServiceName
Mar 13, 10:02:28 398050797bb646ef flagd Flagd
Mar 13, 10:02:28 0ccd6baca8bea890 flagd Flagd
Mar 13, 10:02:28 2e579cbb3632381a flagd Flagd
Mar 13, 10:02:29 468be2336e35ca32 loadgenerator Loadgenerator
Mar 13, 10:02:29 e06348cc4b50ab0d frontend Frontend
Mar 13, 10:02:29 74571a6fa797f769 frontendproxy Frontendproxy
Mar 13, 10:02:29 7ab5eb0a5cd2e0cd frontendproxy Frontendproxy
Mar 13, 10:02:29 050cf3e9ab7efdda frontend Frontend
Mar 13, 10:02:29 b2882e3343414175 frontend Frontend
Mar 13, 10:02:29 fd7c06a6a746f3e2 frontend Frontend
Mar 13, 10:02:29 606d8a818bec7637 productcatalogservice Productcatalog
  • lookup: Performs joins between a dataset and an external table.
  • union: Merges multiple datasets, including external ones.

Good morning

I'm here to help you with the docs.

I
AIBased on your context