The parse_url function parses an absolute URL string into a dynamic object containing all URL components (scheme, host, port, path, query parameters, etc.). Use this function to extract and analyze specific parts of URLs from logs, web traffic data, or API requests.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

In Splunk SPL, you use rex or URL-specific extractions. APL's parse_url provides structured URL parsing in one function.

```sql Splunk example
| rex field=url "(?<scheme>https?)://(?<host>[^/]+)(?<path>.*)"
```

```kusto APL equivalent
['sample-http-logs']
| extend parsed = parse_url(url)
| extend scheme = parsed.scheme, host = parsed.host, path = parsed.path
```

In ANSI SQL, URL parsing requires complex string manipulation. APL's parse_url provides structured parsing natively.

```sql SQL example
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(url, '://', -1), '/', 1) AS host
FROM logs;
```

```kusto APL equivalent
['sample-http-logs']
| extend parsed = parse_url(url)
| extend host = parsed.host
```

Usage

Syntax

```kusto
parse_url(url)
```

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | An absolute URL string to parse. |

Returns

Returns a dynamic object containing URL components: scheme, host, port, path, username, password, query, fragment.
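As an illustration of how the returned object can be read, the following sketch parses a literal URL and projects each documented component into its own column. The URL is a made-up example, and the conversions with tostring and toint mirror the use case examples below rather than a documented requirement. The exact shape of the query component isn't specified here, so the sketch leaves it as a dynamic value.

```kusto
['sample-http-logs']
| limit 1
// Hypothetical URL chosen to exercise every documented component
| extend parsed = parse_url('https://user:secret@example.com:8443/docs/search?q=apl&page=2#results')
| project
    scheme = tostring(parsed.scheme),
    host = tostring(parsed.host),
    port = toint(parsed.port),
    path = tostring(parsed.path),
    username = tostring(parsed.username),
    password = tostring(parsed.password),
    query = parsed.query,
    fragment = tostring(parsed.fragment)
```

Projecting the components into separate columns makes it easier to filter or group on them in later steps of a query.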

Use case examples

Parse URLs from HTTP logs to analyze traffic patterns by host and path.

Query

```kusto
['sample-http-logs']
| extend full_url = strcat('https://api.example.com', uri)
| extend parsed = parse_url(full_url)
| extend host = tostring(parsed.host)
| extend path = tostring(parsed.path)
| summarize request_count = count() by host, path
| sort by request_count desc
| limit 10
```


Output

| host | path | request_count |
| --- | --- | --- |
| api.example.com | /api/users | 2341 |
| api.example.com | /api/orders | 1987 |
| api.example.com | /api/products | 1654 |

This query builds an absolute URL from each request's relative uri, parses it, and counts requests by host and path to show which API endpoints receive the most traffic.

Extract URL components from span attributes to analyze service communication patterns.

Query

```kusto
['otel-demo-traces']
| extend url = strcat('http://', ['service.name'], ':8080/api/endpoint')
| extend parsed = parse_url(url)
| extend host = tostring(parsed.host)
| extend port = toint(parsed.port)
| summarize span_count = count() by host, port
| sort by span_count desc
| limit 10
```


Output

| host | port | span_count |
| --- | --- | --- |
| frontend | 8080 | 4532 |
| checkout | 8080 | 3421 |
| cart | 8080 | 2987 |

This query builds a URL from each span's service name with a fixed port and path, parses it, and counts spans by host and port to show which service endpoints are most active in a distributed system.

Parse URLs from security logs to identify suspicious patterns in schemes, hosts, or paths.

Query

```kusto
['sample-http-logs']
| extend full_url = strcat('http://example.com', uri)
| extend parsed = parse_url(full_url)
| extend scheme = tostring(parsed.scheme)
| extend path = tostring(parsed.path)
| extend has_traversal = indexof(path, '..') >= 0
| project _time, full_url, scheme, path, has_traversal, id, status
| where has_traversal
| limit 10
```


Output

| _time | full_url | scheme | path | has_traversal | id | status |
| --- | --- | --- | --- | --- | --- | --- |
| 2024-11-06T10:00:00Z | http://example.com/../../etc/passwd | http | /../../etc/passwd | true | user123 | 403 |

This query parses URLs from HTTP logs and keeps only requests whose path contains a `..` sequence, a common sign of path traversal attempts. It doesn't filter on status, so the results include both rejected and successful requests.
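To see how the server responded to requests with a traversal pattern, the same check can feed an aggregation. This is a sketch under the same assumptions as the query above: a placeholder origin of http://example.com is prepended to uri, and traversal_requests and total_requests are illustrative column names.

```kusto
['sample-http-logs']
// Placeholder origin so that the relative uri becomes an absolute URL
| extend parsed = parse_url(strcat('http://example.com', uri))
| extend path = tostring(parsed.path)
// Count requests whose parsed path contains '..', grouped by response status
| summarize traversal_requests = countif(indexof(path, '..') >= 0), total_requests = count() by status
| sort by traversal_requests desc
```

Grouping by status separates traversal attempts that were rejected from any that returned a success code, which usually deserve a closer look.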

  • parse_urlquery: Parses only URL query parameters. Use this when you only need query string parsing.
  • format_url: Constructs URLs from components. Use this to reverse the parsing operation and build URLs.
  • url_decode: Decodes URL-encoded strings. Use this to decode individual URL components.
  • split: Splits strings by delimiters. Use this for simpler URL tokenization without full parsing.
