Package com.oracle.truffle.api.strings
Enum Class TruffleString.Encoding
- All Implemented Interfaces: Serializable, Comparable<TruffleString.Encoding>, Constable
- Enclosing class: TruffleString
The list of encodings supported by TruffleString. TruffleString is especially optimized for the following encodings:
- UTF-32: this means UTF-32 in your system's endianness. TruffleString transparently compacts UTF-32 strings to 8-bit or 16-bit representations, where possible.
- UTF-16: this means UTF-16 in your system's endianness. TruffleString transparently compacts UTF-16 strings to 8-bit representations, where possible.
- UTF-8
- ISO-8859-1
- US-ASCII
- BYTES, which is essentially identical to US-ASCII, with the only difference being that BYTES treats all byte values as valid codepoints.
All other encodings are supported using the JRuby JCodings library, which incurs more CompilerDirectives.TruffleBoundary calls. NOTE: to enable support for these encodings, TruffleLanguage.Registration#needsAllEncodings() must be set to true in the Truffle language's registration.
- Since: 22.1
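To make "transparently compacts" more concrete, the following is a minimal sketch in plain Java that picks the narrowest fixed-width representation able to hold every code point of a string. It does not use or reproduce TruffleString's implementation: the class `CompactionSketch`, the method `minimalBitsPerCodePoint`, and the thresholds 0xFF and 0xFFFF are assumptions chosen for illustration only.

```java
class CompactionSketch {
    /**
     * Returns 8, 16, or 32: the narrowest width (in bits) that can hold
     * every code point in the array. Hypothetical helper, not Truffle API.
     */
    static int minimalBitsPerCodePoint(int[] codePoints) {
        int max = 0;
        for (int cp : codePoints) {
            max = Math.max(max, cp);
        }
        if (max <= 0xFF) {
            return 8;   // every code point fits in one byte
        }
        if (max <= 0xFFFF) {
            return 16;  // every code point fits in two bytes
        }
        return 32;      // at least one code point needs the full width
    }

    public static void main(String[] args) {
        System.out.println(minimalBitsPerCodePoint("hello".codePoints().toArray()));              // 8
        System.out.println(minimalBitsPerCodePoint("h\u00e9llo".codePoints().toArray()));         // 8
        System.out.println(minimalBitsPerCodePoint("\u65e5\u672c\u8a9e".codePoints().toArray())); // 16
        System.out.println(minimalBitsPerCodePoint("\uD83D\uDE00".codePoints().toArray()));       // 32
    }
}
```

Under this assumption, a UTF-32 string containing only Latin-1 code points would need a quarter of the storage of its uncompacted form.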
-
Nested Class Summary
Nested classes/interfaces inherited from class java.lang.Enum
Enum.EnumDesc<E extends Enum<E>>
-
Enum Constant Summary
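Enum Constant | Description
Big5 | Big5.
Big5_HKSCS | Big5-HKSCS.
Big5_UAO | Big5-UAO.
BYTES | Special "encoding" BYTES: This encoding is identical to US-ASCII, but treats all values outside the US-ASCII range as valid codepoints as well.
CESU_8 | CESU-8.
CP50220 | CP50220.
CP50221 | CP50221.
CP51932 | CP51932.
CP850 | CP850.
CP852 | CP852.
CP855 | CP855.
CP949 | CP949.
CP950 | CP950.
CP951 | CP951.
Emacs_Mule | Emacs-Mule.
EUC_JIS_2004 | EUC-JIS-2004.
EUC_JP | EUC-JP.
EUC_KR | EUC-KR.
EUC_TW | EUC-TW.
EucJP_ms | EucJP-ms.
GB12345 | GB12345.
GB18030 | GB18030.
GB1988 | GB1988.
GB2312 | GB2312.
GBK | GBK.
IBM037 | IBM037.
IBM437 | IBM437.
IBM720 | IBM720.
IBM737 | IBM737.
IBM775 | IBM775.
IBM852 | IBM852.
IBM855 | IBM855.
IBM857 | IBM857.
IBM860 | IBM860.
IBM861 | IBM861.
IBM862 | IBM862.
IBM863 | IBM863.
IBM864 | IBM864.
IBM865 | IBM865.
IBM866 | IBM866.
IBM869 | IBM869.
ISO_2022_JP | ISO-2022-JP.
ISO_2022_JP_2 | ISO-2022-JP-2.
ISO_2022_JP_KDDI | ISO-2022-JP-KDDI.
ISO_8859_1 | ISO-8859-1, also known as LATIN-1, which is equivalent to US-ASCII + the LATIN-1 Supplement Unicode block.
ISO_8859_10 | ISO-8859-10.
ISO_8859_11 | ISO-8859-11.
ISO_8859_13 | ISO-8859-13.
ISO_8859_14 | ISO-8859-14.
ISO_8859_15 | ISO-8859-15.
ISO_8859_16 | ISO-8859-16.
ISO_8859_2 | ISO-8859-2.
ISO_8859_3 | ISO-8859-3.
ISO_8859_4 | ISO-8859-4.
ISO_8859_5 | ISO-8859-5.
ISO_8859_6 | ISO-8859-6.
ISO_8859_7 | ISO-8859-7.
ISO_8859_8 | ISO-8859-8.
ISO_8859_9 | ISO-8859-9.
KOI8_R | KOI8-R.
KOI8_U | KOI8-U.
MacCentEuro | MacCentEuro.
MacCroatian | MacCroatian.
MacCyrillic | MacCyrillic.
MacGreek | MacGreek.
MacIceland | MacIceland.
MacJapanese | MacJapanese.
MacRoman | MacRoman.
MacRomania | MacRomania.
MacThai | MacThai.
MacTurkish | MacTurkish.
MacUkraine | MacUkraine.
Shift_JIS | Shift-JIS.
SJIS_DoCoMo | SJIS-DoCoMo.
SJIS_KDDI | SJIS-KDDI.
SJIS_SoftBank | SJIS-SoftBank.
Stateless_ISO_2022_JP | Stateless-ISO-2022-JP.
Stateless_ISO_2022_JP_KDDI | Stateless-ISO-2022-JP-KDDI.
TIS_620 | TIS-620.
US_ASCII | US-ASCII, which maps only 7-bit characters.
UTF_16BE | UTF-16BE.
UTF_16LE | UTF-16LE.
UTF_32BE | UTF-32BE.
UTF_32LE | UTF-32LE.
UTF_7 | UTF-7.
UTF_8 | UTF-8.
UTF8_DoCoMo | UTF8-DoCoMo.
UTF8_KDDI | UTF8-KDDI.
UTF8_MAC | UTF8-MAC.
UTF8_SoftBank | UTF8-SoftBank.
Windows_1250 | Windows-1250.
Windows_1251 | Windows-1251.
Windows_1252 | Windows-1252.
Windows_1253 | Windows-1253.
Windows_1254 | Windows-1254.
Windows_1255 | Windows-1255.
Windows_1256 | Windows-1256.
Windows_1257 | Windows-1257.
Windows_1258 | Windows-1258.
Windows_31J | Windows-31J.
Windows_874 | Windows-874.
-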
Field Summary
Modifier and Type | Field | Description
static final TruffleString.Encoding | UTF_16 | UTF-16 in the current system's endianness, without byte-order mark, with transparent string compaction.
static final TruffleString.Encoding | UTF_32 | UTF-32 in the current system's endianness, without byte-order mark, with transparent string compaction.
-
Method Summary
Modifier and Type | Method | Description
static TruffleString.Encoding | fromJCodingName(String name) | Get the TruffleString.Encoding corresponding to the given encoding name from the JCodings library.
TruffleString | getEmpty() | Get an empty TruffleString with this encoding.
static TruffleString.Encoding | valueOf(String name) | Returns the enum constant of this class with the specified name.
static TruffleString.Encoding[] | values() | Returns an array containing the constants of this enum class, in the order they are declared.
-
Enum Constant Details
-
UTF_32LE
UTF-32LE. Directly supported if the current system is little-endian.
- Since: 22.1
-
UTF_32BE
UTF-32BE. Directly supported if the current system is big-endian.
- Since: 22.1
-
UTF_16LE
UTF-16LE. Directly supported if the current system is little-endian.
- Since: 22.1
-
UTF_16BE
UTF-16BE. Directly supported if the current system is big-endian.
- Since: 22.1
-
ISO_8859_1
ISO-8859-1, also known as LATIN-1, which is equivalent to US-ASCII + the LATIN-1 Supplement Unicode block.
- Since: 22.1
-
UTF_8
UTF-8.
- Since: 22.1
-
US_ASCII
US-ASCII, which maps only 7-bit characters.
- Since: 22.1
BYTES
Special "encoding" BYTES: This encoding is identical to US-ASCII, but treats all values outside the US-ASCII range as valid codepoints as well. Caution: no codepoint mappings are defined for non-US-ASCII values in this encoding, so TruffleString.SwitchEncodingNode will replace all of them with '?' when converting from or to BYTES! To preserve all bytes and "reinterpret" a BYTES string in another encoding, use TruffleString.ForceEncodingNode.
- Since: 22.1
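The difference between the two operations named above can be illustrated with a small sketch on raw byte arrays. It does not call TruffleString: the class `BytesEncodingSketch` and the methods `transcodeFromBytes` and `reinterpret` are hypothetical stand-ins that only mimic the documented effect of switching versus forcing the encoding of a BYTES string.

```java
import java.util.Arrays;

class BytesEncodingSketch {
    /**
     * Mimics the documented transcoding effect: every byte outside the
     * US-ASCII range (0x00-0x7F) has no codepoint mapping and becomes '?'.
     */
    static byte[] transcodeFromBytes(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (input[i] & 0xFF) <= 0x7F ? input[i] : (byte) '?';
        }
        return out;
    }

    /**
     * Mimics reinterpretation: the byte content is kept unchanged and only
     * the encoding label attached to it would differ.
     */
    static byte[] reinterpret(byte[] input) {
        return input.clone();
    }

    public static void main(String[] args) {
        byte[] raw = {'a', (byte) 0xE9, 'b'};
        System.out.println(Arrays.toString(transcodeFromBytes(raw))); // [97, 63, 98]
        System.out.println(Arrays.toString(reinterpret(raw)));        // [97, -23, 98]
    }
}
```

In the first result the byte 0xE9 is lost for good, which is why reinterpretation is the safer choice when the original bytes must survive.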
-
Big5
Big5.
- Since: 22.1
-
Big5_HKSCS
Big5-HKSCS.
- Since: 22.1
-
Big5_UAO
Big5-UAO.
- Since: 22.1
-
CESU_8
CESU-8.
- Since: 23.0
-
CP51932
CP51932.
- Since: 22.1
-
CP850
CP850.
- Since: 22.1
-
CP852
CP852.
- Since: 22.1
-
CP855
CP855.
- Since: 22.1
-
CP949
CP949.
- Since: 22.1
-
CP950
CP950.
- Since: 22.1
-
CP951
CP951.
- Since: 22.1
-
EUC_JIS_2004
EUC-JIS-2004.
- Since: 22.1
-
EUC_JP
EUC-JP.
- Since: 22.1
-
EUC_KR
EUC-KR.
- Since: 22.1
-
EUC_TW
EUC-TW.
- Since: 22.1
-
Emacs_Mule
Emacs-Mule.
- Since: 22.1
-
EucJP_ms
EucJP-ms.
- Since: 22.1
-
GB12345
GB12345.
- Since: 22.1
-
GB18030
GB18030.
- Since: 22.1
-
GB1988
GB1988.
- Since: 22.1
-
GB2312
GB2312.
- Since: 22.1
-
GBK
GBK.
- Since: 22.1
-
IBM437
IBM437.
- Since: 22.1
-
IBM720
IBM720.
- Since: 23.0
-
IBM737
IBM737.
- Since: 22.1
-
IBM775
IBM775.
- Since: 22.1
-
IBM852
IBM852.
- Since: 22.1
-
IBM855
IBM855.
- Since: 22.1
-
IBM857
IBM857.
- Since: 22.1
-
IBM860
IBM860.
- Since: 22.1
-
IBM861
IBM861.
- Since: 22.1
-
IBM862
IBM862.
- Since: 22.1
-
IBM863
IBM863.
- Since: 22.1
-
IBM864
IBM864.
- Since: 22.1
-
IBM865
IBM865.
- Since: 22.1
-
IBM866
IBM866.
- Since: 22.1
-
IBM869
IBM869.
- Since: 22.1
-
ISO_8859_10
ISO-8859-10.
- Since: 22.1
-
ISO_8859_11
ISO-8859-11.
- Since: 22.1
-
ISO_8859_13
ISO-8859-13.
- Since: 22.1
-
ISO_8859_14
ISO-8859-14.
- Since: 22.1
-
ISO_8859_15
ISO-8859-15.
- Since: 22.1
-
ISO_8859_16
ISO-8859-16.
- Since: 22.1
-
ISO_8859_2
ISO-8859-2.
- Since: 22.1
-
ISO_8859_3
ISO-8859-3.
- Since: 22.1
-
ISO_8859_4
ISO-8859-4.
- Since: 22.1
-
ISO_8859_5
ISO-8859-5.
- Since: 22.1
-
ISO_8859_6
ISO-8859-6.
- Since: 22.1
-
ISO_8859_7
ISO-8859-7.
- Since: 22.1
-
ISO_8859_8
ISO-8859-8.
- Since: 22.1
-
ISO_8859_9
ISO-8859-9.
- Since: 22.1
-
KOI8_R
KOI8-R.
- Since: 22.1
-
KOI8_U
KOI8-U.
- Since: 22.1
-
MacCentEuro
MacCentEuro.
- Since: 22.1
-
MacCroatian
MacCroatian.
- Since: 22.1
-
MacCyrillic
MacCyrillic.
- Since: 22.1
-
MacGreek
MacGreek.
- Since: 22.1
-
MacIceland
MacIceland.
- Since: 22.1
-
MacJapanese
MacJapanese.
- Since: 22.1
-
MacRoman
MacRoman.
- Since: 22.1
-
MacRomania
MacRomania.
- Since: 22.1
-
MacThai
MacThai.
- Since: 22.1
-
MacTurkish
MacTurkish.
- Since: 22.1
-
MacUkraine
MacUkraine.
- Since: 22.1
-
SJIS_DoCoMo
SJIS-DoCoMo.
- Since: 22.1
-
SJIS_KDDI
SJIS-KDDI.
- Since: 22.1
-
SJIS_SoftBank
SJIS-SoftBank.
- Since: 22.1
-
Shift_JIS
Shift-JIS.
- Since: 22.1
-
Stateless_ISO_2022_JP
Stateless-ISO-2022-JP.
- Since: 22.1
-
Stateless_ISO_2022_JP_KDDI
Stateless-ISO-2022-JP-KDDI.
- Since: 22.1
-
TIS_620
TIS-620.
- Since: 22.1
-
UTF8_DoCoMo
UTF8-DoCoMo.
- Since: 22.1
-
UTF8_KDDI
UTF8-KDDI.
- Since: 22.1
-
UTF8_MAC
UTF8-MAC.
- Since: 22.1
-
UTF8_SoftBank
UTF8-SoftBank.
- Since: 22.1
-
Windows_1250
Windows-1250.
- Since: 22.1
-
Windows_1251
Windows-1251.
- Since: 22.1
-
Windows_1252
Windows-1252.
- Since: 22.1
-
Windows_1253
Windows-1253.
- Since: 22.1
-
Windows_1254
Windows-1254.
- Since: 22.1
-
Windows_1255
Windows-1255.
- Since: 22.1
-
Windows_1256
Windows-1256.
- Since: 22.1
-
Windows_1257
Windows-1257.
- Since: 22.1
-
Windows_1258
Windows-1258.
- Since: 22.1
-
Windows_31J
Windows-31J.
- Since: 22.1
-
Windows_874
Windows-874.
- Since: 22.1
-
CP50220
CP50220.
- Since: 22.1
-
CP50221
CP50221.
- Since: 22.1
-
IBM037
IBM037.
- Since: 22.1
-
ISO_2022_JP
ISO-2022-JP.
- Since: 22.1
-
ISO_2022_JP_2
ISO-2022-JP-2.
- Since: 22.1
-
ISO_2022_JP_KDDI
ISO-2022-JP-KDDI.
- Since: 22.1
-
UTF_7
UTF-7.
- Since: 22.1
-
-
Field Details
-
UTF_32
UTF-32 in the current system's endianness, without byte-order mark, with transparent string compaction.
- Since: 22.1
-
UTF_16
UTF-16 in the current system's endianness, without byte-order mark, with transparent string compaction.
- Since: 22.1
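As an illustration of "the current system's endianness, without byte-order mark", the sketch below uses only the JDK to select the UTF-16 variant matching a given byte order. It is an analogy rather than TruffleString code: the class `NativeUtf16Sketch` and the method `utf16For` are hypothetical names introduced here.

```java
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class NativeUtf16Sketch {
    /** Picks the JDK charset whose byte order matches the given byte order. */
    static Charset utf16For(ByteOrder order) {
        return order == ByteOrder.LITTLE_ENDIAN
                ? StandardCharsets.UTF_16LE
                : StandardCharsets.UTF_16BE;
    }

    public static void main(String[] args) {
        // nativeOrder() reports the byte order of the platform running the JVM.
        Charset nativeUtf16 = utf16For(ByteOrder.nativeOrder());
        // The LE/BE charsets write no byte-order mark, so "A" encodes to exactly 2 bytes.
        byte[] encoded = "A".getBytes(nativeUtf16);
        System.out.println(nativeUtf16 + " -> " + encoded.length + " bytes");
    }
}
```

The same reasoning applies to UTF_32, with four bytes per code unit instead of two.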
-
-
Method Details
-
values
Returns an array containing the constants of this enum class, in the order they are declared.
- Returns: an array containing the constants of this enum class, in the order they are declared
-
valueOf
Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
- Parameters: name - the name of the enum constant to be returned.
- Returns: the enum constant with the specified name
- Throws:
IllegalArgumentException - if this enum class has no constant with the specified name
NullPointerException - if the argument is null
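The exact-match rule means that valueOf expects the Java identifier (for example UTF_8), not the hyphenated encoding name (UTF-8). The sketch below demonstrates this with a hypothetical two-constant enum named like the real constants; it stands in for TruffleString.Encoding so that the example runs without the Truffle library.

```java
class ValueOfSketch {
    /** Hypothetical stand-in with two constants named like those of TruffleString.Encoding. */
    enum Encoding { UTF_8, US_ASCII }

    /** Returns true if valueOf accepts the name, false if it throws IllegalArgumentException. */
    static boolean isConstantName(String name) {
        try {
            Encoding.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isConstantName("UTF_8"));  // true: matches the identifier exactly
        System.out.println(isConstantName("UTF-8"));  // false: hyphenated name is not an identifier
        System.out.println(isConstantName(" UTF_8")); // false: extraneous whitespace is rejected
    }
}
```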
-
getEmpty
Get an empty TruffleString with this encoding.
- Since: 22.1
-
fromJCodingName
Get the TruffleString.Encoding corresponding to the given encoding name from the JCodings library.
- Since: 22.1
-