strclean(X)

Returns a boolean value (1 or 0) indicating whether the given string is a legal UTF-8 (Unicode) string.

Syntax

strclean(X)

Input

Argument   Type   Description
X          text   The string on which to apply the function. A scalar value or the name of a column.

Return Value

Returns an integer value of 1 if a string is legal UTF-8; otherwise, it returns 0.

If X is N/A, the result is 1.

Sample Usage

value                                                           strclean(value)
'banana'                                                        1
'琵琶'                                                          1
''                                                              1
<arbitrary binary string with non-UTF-8 character sequences>    0

Additional Information

  • Not all sequences of bytes are legal UTF-8 Unicode.
  • All plain (7-bit) ASCII strings are legal, as are strings in which non-ASCII characters are represented by legal UTF-8 sequences.
  • Arbitrary sequences of 8-bit bytes are not necessarily legal UTF-8.
  • The related function bstrclean(X) checks whether a string is plain ASCII.
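
The distinction drawn in the notes above can be illustrated outside the product. The following Python sketch is not the implementation of strclean or bstrclean; the helper names is_legal_utf8 and is_plain_ascii are hypothetical, and the sketch assumes that Python's strict UTF-8 decoder is a reasonable stand-in for the check described here. An N/A input is modeled as None and returns 1, matching the Return Value section.

```python
from typing import Optional


def is_legal_utf8(data: Optional[bytes]) -> int:
    """Return 1 if data decodes as strict UTF-8, else 0 (hypothetical helper)."""
    if data is None:
        # Assumption: N/A is modeled as None; the result is 1, as documented.
        return 1
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return 0
    return 1


def is_plain_ascii(data: bytes) -> int:
    """Return 1 if every byte is 7-bit ASCII, else 0 (hypothetical helper)."""
    return int(all(b < 0x80 for b in data))


if __name__ == "__main__":
    samples = [
        b"banana",                # plain ASCII, therefore legal UTF-8
        "琵琶".encode("utf-8"),    # non-ASCII characters in legal UTF-8 sequences
        b"",                      # the empty string is legal
        b"\xc3\x28",              # lead byte followed by an invalid continuation byte
    ]
    for s in samples:
        print(s, is_legal_utf8(s), is_plain_ascii(s))
```

Every plain ASCII input passes both checks, while a legal non-ASCII string such as '琵琶' passes only the UTF-8 check.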