strclean(X)
Returns a boolean value (1 or 0) indicating whether the given string is a legal UTF-8 (Unicode) string.
Syntax
strclean(X)
Input
Argument | Type | Description |
---|---|---|
X | text | The string on which to apply the function: a scalar value or the name of a column |
Return Value
Returns an integer value of 1 if a string is legal UTF-8; otherwise, it returns 0.
If X is N/A, the result is 1.
Sample Usage
value | strclean(value) |
---|---|
'banana' | 1 |
'琵琶' | 1 |
'' | 1 |
<arbitrary binary string with non-UTF-8 character sequences> | 0 |
Additional Information
- Not all sequences of bytes are legal UTF-8 Unicode.
- All plain (7-bit) ASCII strings are legal, as are strings in which non-ASCII characters are represented by legal UTF-8 sequences.
- Arbitrary sequences of 8-bit characters may not be legal.
- The related function bstrclean(X) checks whether a string is plain ASCII.
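
The distinction between legal UTF-8 and plain ASCII can be illustrated outside the product with a short Python sketch. This is not how strclean(X) or bstrclean(X) is implemented; the helper names `is_legal_utf8` and `is_plain_ascii` are hypothetical, the check relies on Python's strict UTF-8 decoder (whose edge-case rules may differ from those of strclean), and N/A handling is not modeled.

```python
def is_legal_utf8(data: bytes) -> int:
    """Return 1 if the bytes form a legal UTF-8 sequence, otherwise 0.

    Hypothetical helper that mimics the 1/0 result convention described above.
    """
    try:
        data.decode("utf-8")  # strict decoding raises on any illegal sequence
        return 1
    except UnicodeDecodeError:
        return 0


def is_plain_ascii(data: bytes) -> int:
    """Return 1 if every byte is 7-bit ASCII (below 0x80), otherwise 0.

    Hypothetical helper; any plain ASCII string is also legal UTF-8.
    """
    return 1 if all(b < 0x80 for b in data) else 0


if __name__ == "__main__":
    samples = [
        b"banana",                # plain ASCII
        "琵琶".encode("utf-8"),    # non-ASCII characters as legal UTF-8 sequences
        b"",                      # empty string
        b"\xff\xfe\x80",          # arbitrary 8-bit bytes, not legal UTF-8
    ]
    for s in samples:
        print(s, is_legal_utf8(s), is_plain_ascii(s))
```

Under these assumptions, the first three samples pass the UTF-8 check and the last one fails, matching the pattern in the Sample Usage table; only 'banana' and the empty string pass the ASCII check.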