Use the isutf8 function to check whether a string is a valid UTF-8 encoded sequence. The function returns a boolean indicating whether the input conforms to UTF-8 encoding rules.

isutf8 is useful when working with data from external sources such as logs, telemetry events, or data pipelines, where encoding issues can cause downstream processing to fail or produce incorrect results. By filtering out or isolating invalid UTF-8 strings, you can ensure better data quality and avoid unexpected behavior during parsing or transformation.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Splunk doesn’t provide a built-in function to directly check if a string is valid UTF-8. Users typically rely on workarounds using field transformations or regex, which can be error-prone or incomplete. APL provides isutf8 as a simple and reliable alternative.

```sql Splunk example | eval valid_utf8=if(match(field, "some-regex-pattern"), true, false) ````
['sample-http-logs']
| where isutf8(uri)

ANSI SQL does not define a standard function to validate UTF-8 encoding in strings. Some platforms offer vendor-specific functions, but behavior varies. APL offers isutf8 as a consistent, built-in way to validate string encoding.

```sql SQL example SELECT * FROM logs WHERE IS_UTF8(uri) = true; ```
['sample-http-logs']
| where isutf8(uri)

Usage

Syntax

isutf8(value)

Parameters

Name Type Description
value string The input string to validate.

Returns

A bool value:

  • true if the input string is valid UTF-8.
  • false otherwise.

Use case examples

You can use isutf8 to detect and exclude malformed UTF-8 entries in HTTP request logs that could indicate issues with upstream data encoding.

Query

['sample-http-logs']
| where not(isutf8(uri)) or not(isutf8(method))
| project _time, id, method, uri, status

Run in Playground

Output

_time id method uri status
2025-07-09T13:32:05Z user42 GET �/broken-path 500
2025-07-09T14:10:17Z user99 POST /submit-form%80 200

This query identifies records where the uri or method fields contain invalid UTF-8 characters, which may point to upstream client encoding issues or malformed requests.

In distributed traces, malformed span names or service identifiers can cause trace visualizations to fail. Use isutf8 to validate such fields.

Query

['otel-demo-traces']
| where not(isutf8(['service.name']))
| project _time, trace_id, span_id, ['service.name'], kind, status_code

Run in Playground

Output

_time trace_id span_id ['service.name'] kind status_code
2025-07-09T13:50:10Z abc123 xyz789 fron��dproxy server 200

This query isolates spans where the service name contains invalid UTF-8 characters, helping you detect encoding issues in trace metadata.

Security analysts often inspect logs for anomalies. Use isutf8 to detect tampered or improperly encoded request paths that could indicate obfuscation or injection attempts.

Query

['sample-http-logs']
| where status == '403' and not(isutf8(uri))
| project _time, id, method, uri, ['geo.country']

Run in Playground

Output

_time id method uri geo.country
2025-07-09T12:45:02Z user23 GET /bad%FFpath Germany

This query flags 403 responses with URIs containing invalid UTF-8, which could signal attempts to bypass filters or exploit encoding vulnerabilities.

  • isimei: Checks whether a value is a valid International Mobile Equipment Identity (IMEI) number.
  • isreal: Checks whether a value is a real number.
  • iscc: Checks whether a value is a valid credit card (CC) number.
  • isstring: Checks whether a value is a string. Use this for scalar string validation.

Good morning

I'm here to help you with the docs.

I
AIBased on your context