transcode(X;F;T;S;O)
Converts a string from one encoding into a different encoding. (Available as of version 12.24)
Syntax
transcode(X;F;T;S;O)
Input
| Argument | Type | Description |
|---|---|---|
| X | text | The string on which to apply the function. This argument is required. A scalar value or the name of a column. |
| F | text | The current encoding of X. If this argument is omitted, the default encoding is UTF-8. A scalar value or the name of a column. |
| T | text | The target encoding of X. If this argument is omitted, the default encoding is UTF-8. A scalar value or the name of a column. |
| S | text | The string that is returned in place of a result if transcode cannot translate the string X. Special characters can be used in S; in the first example under Sample Usage, $ stands in for the original string and % for the error message. If S is omitted and transcode cannot translate the string X, the function displays a system error. A scalar value or the name of a column. |
| O | text | A list of one or more special options, such as the 'discard' and 'translit' options shown under Sample Usage. |
Return Value
Returns a text value corresponding to the string X translated from encoding
F into encoding T.
Unless a substitute string S is supplied, transcode throws an error if the
string is not actually in the encoding F, contains characters illegal in
encoding F, or contains characters that cannot be expressed in encoding T.
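The failure conditions can be illustrated outside 1010data. The following Python sketch is only an analogy and not how transcode is implemented: the helper name transcode_strict is hypothetical, and Python's codecs are assumed to reject the same kinds of input.

```python
def transcode_strict(data: bytes, source: str = "utf-8", target: str = "utf-8") -> bytes:
    """Decode `data` from `source`, then encode it into `target`.

    Raises UnicodeDecodeError if `data` is not valid in `source`, and
    UnicodeEncodeError if a character cannot be expressed in `target`.
    """
    text = data.decode(source)
    return text.encode(target)


# Valid UTF-8 whose characters all exist in Latin-1 converts cleanly.
ok = transcode_strict("café".encode("utf-8"), "utf-8", "iso-8859-1")

# A lone 0xE9 byte is not valid UTF-8, so decoding fails.
try:
    transcode_strict(b"caf\xe9", "utf-8", "iso-8859-1")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# The euro sign has no Latin-1 code point, so encoding fails.
try:
    transcode_strict("€".encode("utf-8"), "utf-8", "iso-8859-1")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```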
Sample Usage
In the following function call:
transcode(X;'UTF-8';'ISO-8859-1';'$ (%)';)
transcode converts UTF-8 text X into
Latin-1, but leaves any strings that aren't valid UTF-8 or cannot be expressed in
Latin-1 unchanged, with an error message added in parentheses.
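The substitute-string behavior can be sketched in Python as follows. This is an illustration under assumptions, not transcode itself: the helper names are hypothetical, the reading of $ as the original string and % as the error message is inferred from the example above, and the error text produced by Python will differ from whatever transcode reports.

```python
def fill_template(template: str, original: str, error: str) -> str:
    """Replace each $ with the original string and each % with the error text."""
    parts = []
    for ch in template:
        if ch == "$":
            parts.append(original)
        elif ch == "%":
            parts.append(error)
        else:
            parts.append(ch)
    return "".join(parts)


def transcode_or_substitute(data: bytes, source: str, target: str, template: str) -> str:
    """Return the converted text, or the filled-in template if conversion fails."""
    try:
        return data.decode(source).encode(target).decode(target)
    except UnicodeError as err:
        # Show undecodable bytes as escapes so the original stays readable.
        original = data.decode(source, errors="backslashreplace")
        return fill_template(template, original, str(err))


converted = transcode_or_substitute("café".encode("utf-8"), "utf-8", "iso-8859-1", "$ (%)")
fallback = transcode_or_substitute("€".encode("utf-8"), "utf-8", "iso-8859-1", "$ (%)")
```

Here converted holds the successfully converted text, while fallback holds the original string followed by Python's error message in parentheses.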
In the following function call:
transcode(X;'UTF-8';'ISO-8859-1';;'discard' 'translit')
transcode tries to convert UTF-8 to Latin-1 as tolerantly as
possible, discarding illegal sequences in the UTF-8 input and transliterating
characters that can't be represented in Latin-1.
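A rough Python analogy of the two options is sketched below. It assumes that 'discard' corresponds to dropping undecodable input bytes and that 'translit' corresponds to substituting a close equivalent for characters missing from the target encoding. The TRANSLIT table and the function name are hypothetical, and the substitutions transcode actually makes may differ.

```python
# Hypothetical transliteration table; a real converter ships a much larger one.
TRANSLIT = {"€": "EUR", "“": '"', "”": '"', "…": "..."}


def transcode_tolerant(data: bytes, source: str, target: str) -> bytes:
    """Convert as tolerantly as possible instead of raising on bad input."""
    # 'discard' analogue: silently drop byte sequences invalid in `source`.
    text = data.decode(source, errors="ignore")
    out = []
    for ch in text:
        try:
            ch.encode(target)
            out.append(ch)
        except UnicodeEncodeError:
            # 'translit' analogue: use a close equivalent, or "?" if none is known.
            out.append(TRANSLIT.get(ch, "?"))
    return "".join(out).encode(target)


# 0xFF is never valid in UTF-8, and the euro sign does not exist in Latin-1.
result = transcode_tolerant(b"\xffcaf\xc3\xa9 5\xe2\x82\xac", "utf-8", "iso-8859-1")

# With identical source and target, only the discard step has any effect.
cleaned = transcode_tolerant(b"ab\xffc", "utf-8", "utf-8")
```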
F (current encoding) and T (target encoding) can be the same value if you
use the 'discard' option. The following function call discards illegal
non-UTF-8 sequences from a string:
transcode(string;;;;'discard')
Additional Information
The following are valid encoding types:
- UTF8
- UTF-8
- ASCII
- US
- US-ASCII
- LATIN1
- EBCDIC
- ISO_8859_1
- ISO-8859-2
- ISO8859-2
- ISO_8859-2
- ISO_8859_2
- ISO-8859-3
- ISO8859-3
- ISO_8859-3
- ISO_8859_3
- ISO-8859-4
- ISO8859-4
- ISO_8859-4
- ISO_8859_4
- ISO-8859-5
- ISO8859-5
- ISO_8859-5
- ISO_8859_5
- ISO-8859-6
- ISO8859-6
- ISO_8859-6
- ISO_8859_6
- ISO-8859-7
- ISO8859-7
- ISO_8859-7
- ISO_8859_7
- ISO-8859-8
- ISO8859-8
- ISO_8859-8
- ISO_8859_8
- ISO-8859-9
- ISO8859-9
- ISO_8859-9
- ISO_8859_9
- ISO-8859-13
- ISO8859-13
- ISO_8859-13
- ISO_8859_13
- ISO-8859-15
- ISO8859-15
- ISO_8859-15
- ISO_8859_15
- CP500
- CP65001
- CP1200
- UTF16LE
- UTF-16LE
- UCS2LE
- UCS-2LE
- UCS-2-INTERNAL
- CP1201
- UTF16BE
- UTF-16BE
- UCS2BE
- UCS-2BE
- UNICODEFFFE
- CP12000
- UTF32LE
- UTF-32LE
- UCS4LE
- UCS-4LE
- CP12001
- UTF32BE
- UTF-32BE
- UCS4BE
- UCS-4BE
- UTF16
- UTF-16
- UCS2
- UCS-2
- UTF32
- UTF-32
- UCS-4
- UCS4
- ANSI_X3.4-1968
- ANSI_X3.4-1986
- CP367
- IBM367
- ISO-IR-6
- ISO646-US
- ISO_646.IRV:1991
- CSASCII
- CP819
- IBM819
- ISO-8859-1
- ISO-IR-100
- ISO8859-1
- ISO_8859-1
- ISO_8859-1:1987
- L1
- CSISOLATIN1
- CP1250
- MS-EE
- WINDOWS-1250
- CP1251
- MS-CYRL
- WINDOWS-1251
- CP1252
- MS-ANSI
- WINDOWS-1252
- CP1253
- MS-GREEK
- WINDOWS-1253
- CP1254
- MS-TURK
- WINDOWS-1254
- CP1255
- MS-HEBR
- WINDOWS-1255
- CP1256
- MS-ARAB
- WINDOWS-1256
- CP1257
- WINBALTRIM
- WINDOWS-1257
- CP1258
- WINDOWS-1258
- 850
- CP850
- IBM850
- CSPC850MULTILINGUAL
- 862
- CP862
- IBM862
- CSPC862LATINHEBREW
- 866
- CP866
- IBM866
- CSIBM866
- CP154
- CYRILLIC-ASIAN
- PT154
- PTCP154
- CSPTCP154
- CP1133
- IBM-CP1133
- CP874
- WINDOWS-874
- CP51932
- MS51932
- WINDOWS-51932
- EUC-JP
- CP932
- MS932
- SHIFFT_JIS
- SHIFFT_JIS-MS
- SJIS
- SJIS-MS
- SJIS-OPEN
- SJIS-WIN
- WINDOWS-31J
- WINDOWS-932
- CSWINDOWS31J
- CP50221
- ISO-2022-JP
- ISO-2022-JP-MS
- ISO2022-JP
- ISO2022-JP-MS
- MS50221
- WINDOWS-50221
- CP936
- GBK
- MS936
- WINDOWS-936
- CP950
- BIG5
- BIG5HKSCS
- BIG5-HKSCS
- CP949
- UHC
- EUC-KR
- CP1361
- JOHAB
- 437
- CP437
- IBM437
- CSPC8CODEPAGE437
- CP737
- CP775
- IBM775
- CSPC775BALTIC
- 852
- CP852
- IBM852
- CSPCP852
- CP853
- 855
- CP855
- IBM855
- CSIBM855
- 857
- CP857
- IBM857
- CSIBM857
- CP858
- 860
- CP860
- IBM860
- CSIBM860
- 861
- CP-IS
- CP861
- IBM861
- CSIBM861
- 863
- CP863
- IBM863
- CSIBM863
- CP864
- IBM864
- CSIBM864
- 865
- CP865
- IBM865
- CSIBM865
- 869
- CP-GR
- CP869
- IBM869
- CSIBM869
- CP1125
- IBM037
- IBM500
- ASMO-708
- DOS-720
- IBM737
- IBM00858
- DOS-862
- IBM870
- CP875
- SHIFT_JIS
- SHIFT-JIS
- GB2312
- KS_C_5601-1987
- IBM1026
- IBM01047
- IBM01140
- IBM01141
- IBM01142
- IBM01143
- IBM01144
- IBM01145
- IBM01146
- IBM01147
- IBM01148
- IBM01149
- MACINTOSH
- X-MAC-JAPANESE
- X-MAC-CHINESETRAD
- X-MAC-KOREAN
- X-MAC-ARABIC
- X-MAC-HEBREW
- X-MAC-GREEK
- X-MAC-CYRILLIC
- X-MAC-CHINESESIMP
- X-MAC-ROMANIAN
- X-MAC-UKRAINIAN
- X-MAC-THAI
- X-MAC-CE
- X-MAC-ICELANDIC
- X-MAC-TURKISH
- X-MAC-CROATIAN
- X-CHINESE_CNS
- X-CP20001
- X_CHINESE-ETEN
- X-CP20003
- X-CP20004
- X-CP20005
- X-IA5
- X-IA5-GERMAN
- X-IA5-SWEDISH
- X-IA5-NORWEGIAN
- X-CP20261
- X-CP20269
- IBM273
- IBM277
- IBM278
- IBM280
- IBM284
- IBM285
- IBM290
- IBM297
- IBM420
- IBM423
- IBM424
- X-EBCDIC-KOREANEXTENDED
- IBM-THAI
- KOI8-R
- IBM871
- IBM880
- IBM905
- IBM00924
- X-CP20936
- X-CP20949
- CP1025
- KOI8-U
- X-EUROPA
- ISO-8859-8-I
- ISO8859-8-I
- ISO_8859-8-I
- ISO_8859_8-I
- CSISO2022JP
- ISO-2022-KR
- ISO2022-KR
- X-CP50227
- EUC-CN
- HZ-GB-2312
- GB18030
- X-ISCII-DE
- X-ISCII-BE
- X-ISCII-TA
- X-ISCII-TE
- X-ISCII-AS
- X-ISCII-OR
- X-ISCII-KA
- X-ISCII-MA
- X-ISCII-GU
- X-ISCII-PA
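
Many entries in this list are alternative names for the same encoding, for example ISO-8859-1, LATIN1, L1, and CP819. Python's codec registry resolves aliases in a similar way, which the sketch below uses to check whether two names refer to one encoding. This is only an analogy: Python's alias table is not the list above, the helper same_encoding is hypothetical, and some names accepted by transcode may be unknown to Python (codecs.lookup raises LookupError for those).

```python
import codecs


def same_encoding(name_a: str, name_b: str) -> bool:
    """Return True if both names resolve to the same Python codec."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# Several spellings from the list above that all denote Latin-1.
latin1_aliases = ["ISO-8859-1", "LATIN1", "L1", "CP819", "IBM819", "ISO_8859-1"]
all_latin1 = all(same_encoding("ISO-8859-1", alias) for alias in latin1_aliases)

utf8_match = same_encoding("UTF8", "UTF-8")
windows_match = same_encoding("CP1252", "WINDOWS-1252")
different = same_encoding("UTF-8", "ISO-8859-1")
```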
