Hash Sha1 — axiomhq/docs

Introduction

The hash_sha1 function returns the SHA-1 hash of a scalar value as a 40-character hexadecimal string. Use it to generate consistent identifiers, detect duplicates across datasets, or fingerprint values for comparison.

SHA-1 produces a 160-bit digest. It's typically faster to compute than SHA-256 and produces a larger output than MD5 (128 bits). SHA-1 is deprecated for cryptographic security use cases, but remains appropriate for non-security tasks such as data deduplication, fingerprinting, and consistent grouping. For security-sensitive hashing, use hash_sha256 or hash_sha512.
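To make the size comparison concrete, the following Python sketch (run outside APL, using only the standard hashlib module) reports the hex-digest length of MD5, SHA-1, and SHA-256 for one input. The helper name `hex_digest_lengths` and the input string are illustrative assumptions, not part of APL or of any Axiom dataset.

```python
import hashlib


def hex_digest_lengths(value: str) -> dict[str, int]:
    """Return the hex-digest length of MD5, SHA-1, and SHA-256 for value."""
    data = value.encode("utf-8")
    return {
        "md5": len(hashlib.md5(data).hexdigest()),
        "sha1": len(hashlib.sha1(data).hexdigest()),
        "sha256": len(hashlib.sha256(data).hexdigest()),
    }


# Arbitrary input chosen for illustration only.
lengths = hex_digest_lengths("example-value")
print(lengths)  # {'md5': 32, 'sha1': 40, 'sha256': 64}
```

Each hexadecimal character encodes 4 bits, so the 160-bit SHA-1 digest always renders as 40 characters regardless of the input length.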

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Splunk provides the sha1(X) function that returns a 40-character hex string. APL's hash_sha1 works the same way.

```sql Splunk example
... | eval hashed = sha1(id)
```

['sample-http-logs']
| extend hashed = hash_sha1(id)

ANSI SQL has no standard SHA-1 function. PostgreSQL with the pgcrypto extension provides encode(digest(value, 'sha1'), 'hex'). APL's hash_sha1 returns the same 40-character lowercase hex digest.

```sql SQL example
SELECT encode(digest(id, 'sha1'), 'hex') AS hashed
FROM sample_http_logs
```

['sample-http-logs']
| extend hashed = hash_sha1(id)

Usage

Syntax

hash_sha1(source)

Parameters

Name	Type	Required	Description
source	scalar	Yes	The value to hash. APL converts it to a string before hashing.

Returns

The SHA-1 hash of source as a 40-character lowercase hexadecimal string.
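If you need to reproduce a digest outside Axiom, for example to match hashed values in another system, a sketch along the following lines may help. It assumes that the string form of the value is hashed as UTF-8 bytes, which this page doesn't state explicitly. `sha1_hex` is a hypothetical helper name, not an APL function.

```python
import hashlib


def sha1_hex(value) -> str:
    """Return the SHA-1 digest of str(value) as 40 lowercase hex characters.

    Assumption: the value's string form is encoded as UTF-8 before hashing.
    Python's str() may format some types (floats, booleans) differently
    from APL, so compare a few known values before relying on this.
    """
    return hashlib.sha1(str(value).encode("utf-8")).hexdigest()


digest = sha1_hex("test")
print(digest)       # a94a8fe5ccb19ba61c4c0873d391e987982fbbd3
print(len(digest))  # 40
```

Checking a handful of values against the output of hash_sha1 in a query is a cheap way to confirm whether the encoding and string-conversion assumptions hold for your data types.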

Use case examples

Anonymize user IDs and count requests per hashed user to protect PII while tracking usage patterns.

Query

['sample-http-logs']
| extend hashed_id = hash_sha1(id)
| summarize request_count = count() by hashed_id
| top 5 by request_count

Output

hashed_id	request_count
9f9af029585ba014e07cd3910ca976cf56160616	128
a94a8fe5ccb19ba61c4c0873d391e987982fbbd3	97
356a192b7913b04c54574d18c28d46e6395428ab	85
da4b9237bacccdf19c0760cab7aec4a8359010b0	74
77de68daecd823babbb58edb1c8e14d7106e83bb	69

The query replaces raw user IDs with SHA-1 hashes before aggregating, so the busiest users are visible without exposing their original identifiers.
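The same hash-then-aggregate pattern can be sketched outside APL. The Python example below mirrors the query's three steps (hash each ID, count per hash, keep the top results) on an in-memory list. The user IDs and the helper name `top_hashed_ids` are hypothetical, and UTF-8 encoding of the ID is an assumption.

```python
import hashlib
from collections import Counter


def top_hashed_ids(ids: list[str], n: int = 5) -> list[tuple[str, int]]:
    """Count requests per SHA-1-hashed ID and return the n most frequent."""
    counts = Counter(
        hashlib.sha1(raw_id.encode("utf-8")).hexdigest() for raw_id in ids
    )
    return counts.most_common(n)


# Hypothetical request log: one raw user ID per request.
requests = ["alice", "bob", "alice", "carol", "alice", "bob"]
top = top_hashed_ids(requests)
for hashed_id, request_count in top:
    print(hashed_id, request_count)
```

Because the same input always yields the same digest, grouping by the hash gives the same counts as grouping by the raw ID would.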

Hash span IDs to create stable surrogate keys for cross-dataset joins or external reporting.

Query

['otel-demo-traces']
| extend hashed_span = hash_sha1(span_id)
| project _time, ['service.name'], hashed_span, duration
| take 10

Output

_time	service.name	hashed_span	duration
2024-01-15 10:23:01	frontend	9f9af029585ba014e07cd3910ca976cf56160616	320ms
2024-01-15 10:23:02	checkout	a94a8fe5ccb19ba61c4c0873d391e987982fbbd3	875ms
2024-01-15 10:23:03	cart	356a192b7913b04c54574d18c28d46e6395428ab	140ms

The query projects the hashed span ID alongside service name and duration so you can work with span fingerprints in downstream pipelines.

List of related functions

hash_md5: Returns a 32-character MD5 hex digest. Use hash_md5 when a shorter digest is sufficient and speed is the priority.
hash_sha256: Returns a 64-character SHA-256 hex digest. Use hash_sha256 for security-sensitive hashing.
hash_sha512: Returns a 128-character SHA-512 hex digest for maximum hash strength.
hash: Returns a signed 64-bit integer hash. Use hash when you need a compact numeric key rather than a hex string.
