Enum Class TruffleString.Encoding
- All Implemented Interfaces:
Serializable, Comparable<TruffleString.Encoding>, Constable
- Enclosing class:
TruffleString
The list of encodings supported by TruffleString. TruffleString is especially optimized for the following encodings:
- UTF-32: this means UTF-32 in your system's endianness. TruffleString transparently compacts UTF-32 strings to 8-bit or 16-bit representations, where possible.
- UTF-16: this means UTF-16 in your system's endianness. TruffleString transparently compacts UTF-16 strings to 8-bit representations, where possible.
- UTF-8
- ISO-8859-1
- US-ASCII
- BYTES, which is essentially identical to US-ASCII, with the only difference being that BYTES treats all byte values as valid codepoints.
All other encodings are supported using the JRuby JCodings library, which incurs more CompilerDirectives.TruffleBoundary calls. NOTE: to enable support for these encodings, TruffleLanguage.Registration#needsAllEncodings() must be set to true in the Truffle language's registration.
- Since:
- 22.1
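The compaction mentioned in the class description can be pictured with plain JDK code. The following is a minimal sketch, assuming a string can be stored in a narrower representation whenever every one of its code points fits that width. The class and method names are hypothetical, and this is not TruffleString's actual implementation.

```java
// Illustrative sketch only; not TruffleString's actual implementation.
final class CompactionSketch {

    /** Smallest fixed width (8, 16 or 32 bits) that can hold every code point of s. */
    static int minBitsPerCodePoint(String s) {
        // An empty string has no code points; treat it as fitting the narrowest width.
        int max = s.codePoints().max().orElse(0);
        if (max <= 0xFF) {
            return 8;   // every code point fits in one byte
        } else if (max <= 0xFFFF) {
            return 16;  // every code point fits in two bytes
        }
        return 32;      // at least one code point needs the full width
    }

    public static void main(String[] args) {
        System.out.println(minBitsPerCodePoint("hello"));          // 8
        System.out.println(minBitsPerCodePoint("\u4F60\u597D"));   // 16
        System.out.println(minBitsPerCodePoint("\uD83D\uDE00"));   // 32 (U+1F600)
    }
}
```

Under this assumption, a UTF-32 string could be compacted to 8 or 16 bits per code point, while a UTF-16 string only has the 8-bit option, which matches the wording of the description above.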
-
Nested Class Summary
Nested classes/interfaces inherited from class Enum
Enum.EnumDesc<E>
-
Enum Constant Summary
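Enum Constants
| Enum Constant | Description |
| Big5 | Big5. |
| Big5_HKSCS | Big5-HKSCS. |
| Big5_UAO | Big5-UAO. |
| BYTES | Special "encoding" BYTES: This encoding is identical to US-ASCII, but treats all values outside the US-ASCII range as valid codepoints as well. |
| CESU_8 | CESU-8. |
| CP50220 | CP50220. |
| CP50221 | CP50221. |
| CP51932 | CP51932. |
| CP850 | CP850. |
| CP852 | CP852. |
| CP855 | CP855. |
| CP949 | CP949. |
| CP950 | CP950. |
| CP951 | CP951. |
| Emacs_Mule | Emacs-Mule. |
| EUC_JIS_2004 | EUC-JIS-2004. |
| EUC_JP | EUC-JP. |
| EUC_KR | EUC-KR. |
| EUC_TW | EUC-TW. |
| EucJP_ms | EucJP-ms. |
| GB12345 | GB12345. |
| GB18030 | GB18030. |
| GB1988 | GB1988. |
| GB2312 | GB2312. |
| GBK | GBK. |
| IBM037 | IBM037. |
| IBM437 | IBM437. |
| IBM720 | IBM720. |
| IBM737 | IBM737. |
| IBM775 | IBM775. |
| IBM852 | IBM852. |
| IBM855 | IBM855. |
| IBM857 | IBM857. |
| IBM860 | IBM860. |
| IBM861 | IBM861. |
| IBM862 | IBM862. |
| IBM863 | IBM863. |
| IBM864 | IBM864. |
| IBM865 | IBM865. |
| IBM866 | IBM866. |
| IBM869 | IBM869. |
| ISO_2022_JP | ISO-2022-JP. |
| ISO_2022_JP_2 | ISO-2022-JP-2. |
| ISO_2022_JP_KDDI | ISO-2022-JP-KDDI. |
| ISO_8859_1 | ISO-8859-1, also known as LATIN-1, which is equivalent to US-ASCII + the LATIN-1 Supplement Unicode block. |
| ISO_8859_10 | ISO-8859-10. |
| ISO_8859_11 | ISO-8859-11. |
| ISO_8859_13 | ISO-8859-13. |
| ISO_8859_14 | ISO-8859-14. |
| ISO_8859_15 | ISO-8859-15. |
| ISO_8859_16 | ISO-8859-16. |
| ISO_8859_2 | ISO-8859-2. |
| ISO_8859_3 | ISO-8859-3. |
| ISO_8859_4 | ISO-8859-4. |
| ISO_8859_5 | ISO-8859-5. |
| ISO_8859_6 | ISO-8859-6. |
| ISO_8859_7 | ISO-8859-7. |
| ISO_8859_8 | ISO-8859-8. |
| ISO_8859_9 | ISO-8859-9. |
| KOI8_R | KOI8-R. |
| KOI8_U | KOI8-U. |
| MacCentEuro | MacCentEuro. |
| MacCroatian | MacCroatian. |
| MacCyrillic | MacCyrillic. |
| MacGreek | MacGreek. |
| MacIceland | MacIceland. |
| MacJapanese | MacJapanese. |
| MacRoman | MacRoman. |
| MacRomania | MacRomania. |
| MacThai | MacThai. |
| MacTurkish | MacTurkish. |
| MacUkraine | MacUkraine. |
| Shift_JIS | Shift-JIS. |
| SJIS_DoCoMo | SJIS-DoCoMo. |
| SJIS_KDDI | SJIS-KDDI. |
| SJIS_SoftBank | SJIS-SoftBank. |
| Stateless_ISO_2022_JP | Stateless-ISO-2022-JP. |
| Stateless_ISO_2022_JP_KDDI | Stateless-ISO-2022-JP-KDDI. |
| TIS_620 | TIS-620. |
| US_ASCII | US-ASCII, which maps only 7-bit characters. |
| UTF_16BE | UTF-16BE. |
| UTF_16LE | UTF-16LE. |
| UTF_32BE | UTF-32BE. |
| UTF_32LE | UTF-32LE. |
| UTF_7 | UTF-7. |
| UTF_8 | UTF-8. |
| UTF8_DoCoMo | UTF8-DoCoMo. |
| UTF8_KDDI | UTF8-KDDI. |
| UTF8_MAC | UTF8-MAC. |
| UTF8_SoftBank | UTF8-SoftBank. |
| Windows_1250 | Windows-1250. |
| Windows_1251 | Windows-1251. |
| Windows_1252 | Windows-1252. |
| Windows_1253 | Windows-1253. |
| Windows_1254 | Windows-1254. |
| Windows_1255 | Windows-1255. |
| Windows_1256 | Windows-1256. |
| Windows_1257 | Windows-1257. |
| Windows_1258 | Windows-1258. |
| Windows_31J | Windows-31J. |
| Windows_874 | Windows-874. |
-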
Field Summary
Fields
| Modifier and Type | Field | Description |
| static final TruffleString.Encoding | UTF_16 | UTF-16 in the current system's endianness, without byte-order mark, with transparent string compaction. |
| static final TruffleString.Encoding | UTF_32 | UTF-32 in the current system's endianness, without byte-order mark, with transparent string compaction. |
-
Method Summary
| Modifier and Type | Method | Description |
| static TruffleString.Encoding | fromJCodingName(String name) | Get the TruffleString.Encoding corresponding to the given encoding name from the JCodings library. |
| TruffleString | getEmpty() | Get an empty TruffleString with this encoding. |
| static TruffleString.Encoding | valueOf(String name) | Returns the enum constant of this class with the specified name. |
| static TruffleString.Encoding[] | values() | Returns an array containing the constants of this enum class, in the order they are declared. |
-
Enum Constant Details
-
UTF_32LE
UTF-32LE. Directly supported if the current system is little-endian.
- Since:
- 22.1
-
UTF_32BE
UTF-32BE. Directly supported if the current system is big-endian.
- Since:
- 22.1
-
UTF_16LE
UTF-16LE. Directly supported if the current system is little-endian.
- Since:
- 22.1
-
UTF_16BE
UTF-16BE. Directly supported if the current system is big-endian.
- Since:
- 22.1
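Which of the fixed-endian constants above counts as "directly supported" depends on the byte order of the machine. The following JDK-only sketch shows how that byte order can be queried and how the two UTF-16 variants differ at the byte level. The class and method names are hypothetical, and the sketch does not use or reproduce TruffleString's own detection logic.

```java
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names are hypothetical.
final class NativeEndianSketch {

    /** Name of the fixed-endian UTF-16 variant that matches this machine's byte order. */
    static String nativeUtf16Name() {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? "UTF-16LE" : "UTF-16BE";
    }

    /** Encodes s as UTF-16 with the requested byte order and no byte-order mark. */
    static byte[] utf16Bytes(String s, boolean littleEndian) {
        return s.getBytes(littleEndian ? StandardCharsets.UTF_16LE : StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println("native variant: " + nativeUtf16Name());
        byte[] le = utf16Bytes("A", true);   // low byte first:  0x41 0x00
        byte[] be = utf16Bytes("A", false);  // high byte first: 0x00 0x41
        System.out.printf("LE: %02x %02x%n", le[0], le[1]);
        System.out.printf("BE: %02x %02x%n", be[0], be[1]);
    }
}
```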
-
ISO_8859_1
ISO-8859-1, also known as LATIN-1, which is equivalent to US-ASCII + the LATIN-1 Supplement Unicode block.
- Since:
- 22.1
-
UTF_8
-
US_ASCII
US-ASCII, which maps only 7-bit characters.
- Since:
- 22.1
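The difference between the 7-bit US-ASCII range and the 8-bit ISO-8859-1 range can be checked with the JDK alone. This is a minimal sketch with hypothetical class and method names. It illustrates the ranges described above, not how TruffleString classifies strings internally.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names are hypothetical.
final class AsciiRangeSketch {

    /** True if every char of s lies in the 7-bit US-ASCII range (0x00 to 0x7F). */
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c <= 0x7F);
    }

    /** Unsigned value of the byte at index i after encoding s as ISO-8859-1. */
    static int latin1ByteAt(String s, int i) {
        return s.getBytes(StandardCharsets.ISO_8859_1)[i] & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(isAscii("plain text"));         // true
        System.out.println(isAscii("caf\u00E9"));          // false: U+00E9 is outside 7 bits
        System.out.println(latin1ByteAt("caf\u00E9", 3));  // 233, i.e. 0xE9 as a single byte
    }
}
```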
-
BYTES
Special "encoding" BYTES: This encoding is identical to US-ASCII, but treats all values outside the US-ASCII range as valid codepoints as well. Caution: no codepoint mappings are defined for non-US-ASCII values in this encoding, so TruffleString.SwitchEncodingNode will replace all of them with '?' when converting from or to BYTES! To preserve all bytes and "reinterpret" a BYTES string in another encoding, use TruffleString.ForceEncodingNode.
- Since:
- 22.1
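The caution above contrasts a lossy conversion with a byte-preserving reinterpretation. The following JDK-only sketch mimics the two outcomes on a raw byte array. It does not call TruffleString.SwitchEncodingNode or TruffleString.ForceEncodingNode; the class and both method names are hypothetical, and the choice of ISO-8859-1 as the target encoding is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; it imitates the documented outcomes with plain JDK code.
final class BytesReinterpretSketch {

    /** Lossy conversion: every byte outside the US-ASCII range becomes '?'. */
    static String switchLikeConversion(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            int v = b & 0xFF; // treat the byte as unsigned
            sb.append(v <= 0x7F ? (char) v : '?');
        }
        return sb.toString();
    }

    /** Byte-preserving reinterpretation: the same bytes are read as ISO-8859-1. */
    static String forceLikeReinterpretation(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] raw = {0x41, (byte) 0xE9};
        System.out.println(switchLikeConversion(raw));       // A?
        System.out.println(forceLikeReinterpretation(raw));  // A followed by U+00E9
    }
}
```

In the first case the value 0xE9 is lost for good; in the second case the bytes are untouched and only their interpretation changes.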
-
Big5
-
Big5_HKSCS
-
Big5_UAO
-
CESU_8
-
CP51932
-
CP850
-
CP852
-
CP855
-
CP949
-
CP950
-
CP951
-
EUC_JIS_2004
-
EUC_JP
-
EUC_KR
-
EUC_TW
-
Emacs_Mule
-
EucJP_ms
-
GB12345
-
GB18030
-
GB1988
-
GB2312
-
GBK
-
IBM437
-
IBM720
-
IBM737
-
IBM775
-
IBM852
-
IBM855
-
IBM857
-
IBM860
-
IBM861
-
IBM862
-
IBM863
-
IBM864
-
IBM865
-
IBM866
-
IBM869
-
ISO_8859_10
-
ISO_8859_11
-
ISO_8859_13
-
ISO_8859_14
-
ISO_8859_15
-
ISO_8859_16
-
ISO_8859_2
-
ISO_8859_3
-
ISO_8859_4
-
ISO_8859_5
-
ISO_8859_6
-
ISO_8859_7
-
ISO_8859_8
-
ISO_8859_9
-
KOI8_R
-
KOI8_U
-
MacCentEuro
-
MacCroatian
-
MacCyrillic
-
MacGreek
-
MacIceland
-
MacJapanese
-
MacRoman
-
MacRomania
-
MacThai
-
MacTurkish
-
MacUkraine
-
SJIS_DoCoMo
-
SJIS_KDDI
-
SJIS_SoftBank
-
Shift_JIS
-
Stateless_ISO_2022_JP
-
Stateless_ISO_2022_JP_KDDI
Stateless-ISO-2022-JP-KDDI.
- Since:
- 22.1
-
TIS_620
-
UTF8_DoCoMo
-
UTF8_KDDI
-
UTF8_MAC
-
UTF8_SoftBank
-
Windows_1250
-
Windows_1251
-
Windows_1252
-
Windows_1253
-
Windows_1254
-
Windows_1255
-
Windows_1256
-
Windows_1257
-
Windows_1258
-
Windows_31J
-
Windows_874
-
CP50220
-
CP50221
-
IBM037
-
ISO_2022_JP
-
ISO_2022_JP_2
-
ISO_2022_JP_KDDI
-
UTF_7
-
-
Field Details
-
UTF_32
UTF-32 in the current system's endianness, without byte-order mark, with transparent string compaction.
- Since:
- 22.1
-
UTF_16
UTF-16 in the current system's endianness, without byte-order mark, with transparent string compaction.
- Since:
- 22.1
-
-
Method Details
-
values
Returns an array containing the constants of this enum class, in the order they are declared.
- Returns:
- an array containing the constants of this enum class, in the order they are declared
-
valueOf
Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
- Parameters:
name
- the name of the enum constant to be returned.
- Returns:
- the enum constant with the specified name
- Throws:
IllegalArgumentException
- if this enum class has no constant with the specified name
NullPointerException
- if the argument is null
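The exact-match rule of valueOf is standard Java enum behavior and can be tried without the Truffle API. The sketch below uses a hypothetical stand-in enum, DemoEncoding, with three constants named like their TruffleString.Encoding counterparts. It shows that only the declared identifier is accepted, so a hyphenated encoding name or a name with surrounding whitespace is rejected.

```java
// Illustrative sketch only; DemoEncoding is a hypothetical stand-in enum.
final class ValueOfSketch {

    enum DemoEncoding { UTF_8, UTF_16, US_ASCII }

    /** True if name is exactly the identifier of a DemoEncoding constant. */
    static boolean isConstantName(String name) {
        try {
            DemoEncoding.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // no constant with that exact identifier
        }
    }

    public static void main(String[] args) {
        System.out.println(isConstantName("UTF_8"));   // true
        System.out.println(isConstantName("UTF-8"));   // false: hyphen is not part of the identifier
        System.out.println(isConstantName(" UTF_8"));  // false: extraneous whitespace
    }
}
```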
-
getEmpty
-
fromJCodingName
Get the TruffleString.Encoding corresponding to the given encoding name from the JCodings library.
- Since:
- 22.1
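A lookup by external encoding name, as opposed to the Java identifier accepted by valueOf, can be sketched with a name-to-constant map. Everything in the example is hypothetical: the DemoEncoding enum, the externalName field, the fromExternalName method, and the choice to throw IllegalArgumentException for unknown names. It is a sketch of the general technique and makes no claim about how fromJCodingName is implemented or which names JCodings uses.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names and the error behavior are assumptions.
final class NameLookupSketch {

    /** Hypothetical stand-in enum whose constants carry a hyphenated external name. */
    enum DemoEncoding {
        UTF_8("UTF-8"), ISO_8859_1("ISO-8859-1"), US_ASCII("US-ASCII");

        final String externalName;

        DemoEncoding(String externalName) {
            this.externalName = externalName;
        }
    }

    private static final Map<String, DemoEncoding> BY_NAME = new HashMap<>();

    static {
        // Build the reverse index once, from external name to constant.
        for (DemoEncoding e : DemoEncoding.values()) {
            BY_NAME.put(e.externalName, e);
        }
    }

    /** Looks up a constant by its external name and rejects unknown names. */
    static DemoEncoding fromExternalName(String name) {
        DemoEncoding e = BY_NAME.get(name);
        if (e == null) {
            throw new IllegalArgumentException("unknown encoding: " + name);
        }
        return e;
    }

    public static void main(String[] args) {
        System.out.println(fromExternalName("UTF-8"));       // UTF_8
        System.out.println(fromExternalName("ISO-8859-1"));  // ISO_8859_1
    }
}
```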
-