The parse_url function parses an absolute URL string into a dynamic object containing all URL components (scheme, host, port, path, query parameters, etc.). Use this function to extract and analyze specific parts of URLs from logs, web traffic data, or API requests.
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
In Splunk SPL, you use rex or URL-specific extractions. APL's parse_url provides structured URL parsing in one function.
['sample-http-logs']
| extend parsed = parse_url(url)
| extend scheme = parsed.scheme, host = parsed.host, path = parsed.path
In ANSI SQL, URL parsing requires complex string manipulation. APL's parse_url provides structured parsing natively.
['sample-http-logs']
| extend parsed = parse_url(url)
| extend host = parsed.host
Usage
Syntax
parse_url(url)
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | An absolute URL string to parse. |
Returns
Returns a dynamic object containing URL components: scheme, host, port, path, username, password, query, fragment.
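To make the shape of the result more concrete, the following is a minimal Python sketch that splits a URL into the same named components. It uses Python's standard urllib.parse module as a stand-in and is not APL's implementation. The helper name split_url_components and the sample URL are hypothetical, and the exact representation APL uses for each field (for example, how query parameters are keyed) may differ.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url_components(url: str) -> dict:
    """Hypothetical helper: break an absolute URL into named components.

    Mirrors the component names listed above; this is an analogy,
    not how APL's parse_url is implemented.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname or "",
        "port": parts.port,  # None when the URL has no explicit port
        "path": parts.path,
        "username": parts.username or "",
        "password": parts.password or "",
        # Assumption: query parameters are shown as a key/value mapping.
        "query": dict(parse_qsl(parts.query)),
        "fragment": parts.fragment,
    }


components = split_url_components(
    "https://alice:secret@api.example.com:8443/api/users?page=2&sort=name#top"
)
for name, value in components.items():
    print(f"{name}: {value}")
```

In APL, you would read the equivalent fields with dot notation, as in parsed.host or parsed.path, and cast them with tostring or toint before grouping or comparing.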
Use case examples
Parse URLs from HTTP logs to analyze traffic patterns by host and path.
Query
['sample-http-logs']
| extend full_url = strcat('https://api.example.com', uri)
| extend parsed = parse_url(full_url)
| extend host = tostring(parsed.host)
| extend path = tostring(parsed.path)
| summarize request_count = count() by host, path
| sort by request_count desc
| limit 10
Output
| host | path | request_count |
|---|---|---|
| api.example.com | /api/users | 2341 |
| api.example.com | /api/orders | 1987 |
| api.example.com | /api/products | 1654 |
This query builds a complete URL from each request's uri, parses it, and counts requests by host and path to show which API endpoints receive the most traffic.
Extract URL components from span attributes to analyze service communication patterns.
Query
['otel-demo-traces']
| extend url = strcat('http://', ['service.name'], ':8080/api/endpoint')
| extend parsed = parse_url(url)
| extend host = tostring(parsed.host)
| extend port = toint(parsed.port)
| summarize span_count = count() by host, port
| sort by span_count desc
| limit 10
Output
| host | port | span_count |
|---|---|---|
| frontend | 8080 | 4532 |
| checkout | 8080 | 3421 |
| cart | 8080 | 2987 |
This query builds a URL from each span's service name, parses it, and counts spans by host and port to show service endpoints and port usage in a distributed system.
Parse URLs from security logs to identify suspicious patterns in schemes, hosts, or paths.
Query
['sample-http-logs']
| extend full_url = strcat('http://example.com', uri)
| extend parsed = parse_url(full_url)
| extend scheme = tostring(parsed.scheme)
| extend path = tostring(parsed.path)
| extend has_traversal = indexof(path, '..') >= 0
| project _time, full_url, scheme, path, has_traversal, id, status
| where has_traversal
| limit 10
Output
| _time | full_url | scheme | path | has_traversal | id | status |
|---|---|---|---|---|---|---|
| 2024-11-06T10:00:00Z | http://example.com/../../etc/passwd | http | /../../etc/passwd | true | user123 | 403 |
This query parses request URLs and flags any path that contains '..', a common path traversal pattern. It returns the matching requests with their status codes, which helps identify potential security threats.
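The traversal check in the query above can be sketched outside APL as well. The following Python example is a rough analogy using the standard urllib.parse module: it extracts the path and looks for '..', as the query does with indexof. The helper name has_traversal and the sample URLs are hypothetical. Decoding percent-encoded characters before the check is an assumption added here and is not part of the query above.

```python
from urllib.parse import urlsplit, unquote


def has_traversal(url: str) -> bool:
    """Hypothetical helper: flag URLs whose path contains '..'."""
    path = urlsplit(url).path
    # Assumption beyond the query above: decode percent-encoding first,
    # so that '%2e%2e' is treated the same as '..'.
    return ".." in unquote(path)


samples = [
    "http://example.com/../../etc/passwd",
    "http://example.com/api/users",
    "http://example.com/%2e%2e/secret",
]
flagged = [url for url in samples if has_traversal(url)]
print(flagged)
```

A plain substring check like this also matches harmless paths such as /file..txt, so treat the flag as a signal for review and not as proof of an attack.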
List of related functions
- parse_urlquery: Parses only URL query parameters. Use this when you only need query string parsing.
- format_url: Constructs URLs from components. Use this to reverse the parsing operation and build URLs.
- url_decode: Decodes URL-encoded strings. Use this to decode individual URL components.
- split: Splits strings by delimiters. Use this for simpler URL tokenization without full parsing.