Package jflex.core.unicode
Class UnicodeProperties
java.lang.Object
jflex.core.unicode.UnicodeProperties
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class UnicodeProperties.UnsupportedUnicodeVersionException
-
Field Summary
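Fields
Modifier and Type, Field, Description
private IntCharSet[] caselessMatches
private String caselessMatchPartitions
private int caselessMatchPartitionSize
private static final String DEFAULT_UNICODE_VERSION
private int maximumCodePoint
private final Map<String, IntCharSet> propertyValueIntervals
static final String UNICODE_VERSIONS
private static final Pattern WORD_SEP_PATTERN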
-
Constructor Summary
Constructors
Constructor, Description
UnicodeProperties() - Unpacks the Unicode data corresponding to the default Unicode version: "12.1".
UnicodeProperties(String version) - Unpacks the Unicode data corresponding to the given version.
-
Method Summary
Modifier and Type, Method, Description
private void bind(String[] propertyValues, String[] intervals, String[] propertyValueAliases, int maximumCodePoint, String caselessMatchPartitions, int caselessMatchPartitionSize) - Unpacks data for the selected Unicode version, populating propertyValueIntervals.
private void bindInvariantIntervals() - Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
getCaselessMatches(int c) - Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
getIntCharSet(String propertyValue) - Returns the character interval set associated with the given property value for the selected Unicode version.
int getMaximumCodePoint() - Returns the maximum code point for the selected Unicode version.
getPropertyValues - Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
private void init - Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
private void initCaselessMatches() - Unpacks the caseless match data.
private static String normalize - Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
-
Field Details
-
UNICODE_VERSIONS
-
DEFAULT_UNICODE_VERSION
-
WORD_SEP_PATTERN
-
maximumCodePoint
private int maximumCodePoint
-
propertyValueIntervals
-
caselessMatchPartitions
-
caselessMatchPartitionSize
private int caselessMatchPartitionSize
-
caselessMatches
-
-
Constructor Details
-
UnicodeProperties
Unpacks the Unicode data corresponding to the default Unicode version: "12.1".
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the default version is not supported.
-
UnicodeProperties
public UnicodeProperties(String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
Unpacks the Unicode data corresponding to the given version.
- Parameters:
version - The Unicode version for which to unpack data.
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
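The relationship between the two constructors can be illustrated with a small stand-in class. This is only a sketch: `VersionSelectionSketch`, its nested `UnsupportedVersionException`, and the `SUPPORTED` list are hypothetical names and values made up for illustration, and the delegation from the no-argument constructor to the versioned one is an assumption, not something this page confirms. Only the default version string "12.1" comes from the documentation above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in; it does not use or reproduce the JFlex class. */
final class VersionSelectionSketch {

  /** Hypothetical checked exception, standing in for UnsupportedUnicodeVersionException. */
  static final class UnsupportedVersionException extends Exception {
    UnsupportedVersionException(String version) {
      super("Unsupported Unicode version: " + version);
    }
  }

  // The default version named in the constructor documentation.
  private static final String DEFAULT_VERSION = "12.1";

  // Made-up list for illustration only; the real supported versions may differ.
  private static final List<String> SUPPORTED = Arrays.asList("11.0", "12.0", "12.1");

  private final String version;

  /** Assumed behavior: delegate to the versioned constructor with the default. */
  VersionSelectionSketch() throws UnsupportedVersionException {
    this(DEFAULT_VERSION);
  }

  /** Rejects versions that are not in the (hypothetical) supported list. */
  VersionSelectionSketch(String version) throws UnsupportedVersionException {
    if (!SUPPORTED.contains(version)) {
      throw new UnsupportedVersionException(version);
    }
    this.version = version;
  }

  String version() {
    return version;
  }
}
```

Because the exception is checked, callers of either constructor have to catch it or declare it, even when they rely on the default version.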
-
-
Method Details
-
getMaximumCodePoint
public int getMaximumCodePoint()
Returns the maximum code point for the selected Unicode version.
- Returns:
- the maximum code point for the selected Unicode version.
-
getIntCharSet
Returns the character interval set associated with the given property value for the selected Unicode version.
- Parameters:
propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
- Returns:
- The character interval set corresponding to the given property value, if a match exists, and null otherwise.
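The null-on-miss contract described above means callers should check the result before using it. The following is a minimal sketch of that lookup pattern, not the JFlex implementation: `PropertyLookupSketch` is a hypothetical class, plain `int[][]` ranges stand in for `IntCharSet`, the keys are simply lowercased rather than fully normalized, and the `lowerhex` entry is a made-up property used only for the demonstration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in that models property lookup with a plain map. */
final class PropertyLookupSketch {
  // Each value is a list of inclusive [start, end] code point ranges.
  private final Map<String, int[][]> intervalsByName = new HashMap<>();

  PropertyLookupSketch() {
    intervalsByName.put("ascii", new int[][] {{0x00, 0x7F}});
    // Made-up property for illustration: lowercase hexadecimal digits.
    intervalsByName.put("lowerhex", new int[][] {{'0', '9'}, {'a', 'f'}});
  }

  /** Returns the ranges for the name, or null when there is no match. */
  int[][] getIntervals(String propertyValue) {
    // Simplification: only lowercases the key before the lookup.
    return intervalsByName.get(propertyValue.toLowerCase(Locale.ROOT));
  }

  /** Null-safe membership test over inclusive ranges. */
  static boolean contains(int[][] intervals, int codePoint) {
    if (intervals == null) {
      return false;
    }
    for (int[] range : intervals) {
      if (codePoint >= range[0] && codePoint <= range[1]) {
        return true;
      }
    }
    return false;
  }
}
```

A caller would typically branch on the null result, for example to report an unknown property name, rather than pass it on unchecked.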
-
getPropertyValues
Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
- Returns:
- The set of all properties supported by the specified Unicode version.
-
getCaselessMatches
Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
The first call to this method lazily initializes the backing data.
- Parameters:
c - The character for which to return case-insensitive equivalents.
- Returns:
- All case-insensitively equivalent characters, or null if the given character is case-insensitively equivalent only to itself.
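The lazy initialization and the null result for characters without other case variants can be sketched as follows. The packed format here is an assumption made for illustration: fixed-length records of `partitionSize` code points, padded with U+0000. The actual layout of `caselessMatchPartitions` is not specified on this page. `CaselessMatchSketch` is a hypothetical class, and sorted `int[]` arrays stand in for `IntCharSet`.

```java
import java.util.Arrays;

/** Hypothetical stand-in for lazily unpacked caseless match partitions. */
final class CaselessMatchSketch {
  private final String packedPartitions;
  private final int partitionSize;
  private final int maximumCodePoint;
  private int[][] matches; // stays null until the first lookup

  CaselessMatchSketch(String packedPartitions, int partitionSize, int maximumCodePoint) {
    if (partitionSize < 1) {
      throw new IllegalArgumentException("partitionSize must be positive");
    }
    this.packedPartitions = packedPartitions;
    this.partitionSize = partitionSize;
    this.maximumCodePoint = maximumCodePoint;
  }

  /** Returns the sorted partition containing c, or null if c matches only itself. */
  int[] getCaselessMatches(int c) {
    if (matches == null) {
      initCaselessMatches(); // unpack once, on first use
    }
    return (c >= 0 && c <= maximumCodePoint) ? matches[c] : null;
  }

  /** Unpacks fixed-length records; U+0000 is treated as padding (assumed format). */
  private void initCaselessMatches() {
    matches = new int[maximumCodePoint + 1][];
    int index = 0;
    while (index < packedPartitions.length()) {
      int[] members = new int[partitionSize];
      int count = 0;
      for (int n = 0; n < partitionSize && index < packedPartitions.length(); n++) {
        int cp = packedPartitions.codePointAt(index);
        index += Character.charCount(cp);
        if (cp != 0) {
          members[count++] = cp;
        }
      }
      int[] partition = Arrays.copyOf(members, count);
      Arrays.sort(partition);
      // Every member of a partition shares the same array instance.
      for (int member : partition) {
        if (member <= maximumCodePoint) {
          matches[member] = partition;
        }
      }
    }
  }

  public static void main(String[] args) {
    // Three records of size 3: {k, K, KELVIN SIGN}, {s, S, LONG S}, {a, A, padding}.
    CaselessMatchSketch sketch =
        new CaselessMatchSketch("kK\u212AsS\u017FaA\u0000", 3, 0x10FFFF);
    System.out.println(Arrays.toString(sketch.getCaselessMatches('k')));
    System.out.println(Arrays.toString(sketch.getCaselessMatches('1')));
  }
}
```

Sharing one array per partition keeps the unpacked table small, at the cost that callers must not modify the returned array.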
-
initCaselessMatches
private void initCaselessMatches()
Unpacks the caseless match data. Called from getCaselessMatches(int) to lazily initialize.
-
init
Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
- Parameters:
version - The Unicode version for which to bind data.
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
-
bind
private void bind(String[] propertyValues, String[] intervals, String[] propertyValueAliases, int maximumCodePoint, String caselessMatchPartitions, int caselessMatchPartitionSize)
Unpacks data for the selected Unicode version, populating propertyValueIntervals.
- Parameters:
propertyValues - The list of property values for the selected Unicode version, in the same order as the corresponding packed data in the given intervals.
intervals - The packed character intervals corresponding to and in the same order as the given propertyValues, for the selected Unicode version.
propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
maximumCodePoint - The maximum code point for the selected Unicode version.
caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
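How alias key/value pairs might be bound to already unpacked interval sets can be sketched as below. The layout is an assumption: a flat array that alternates alias and target property value. The page only says the parameter holds key/value pairs. `AliasBindingSketch` and `bindAliases` are hypothetical names, and `int[][]` ranges stand in for `IntCharSet`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in showing one way to bind aliases to interval sets. */
final class AliasBindingSketch {
  private AliasBindingSketch() {}

  /**
   * Returns a copy of intervalsByValue in which each alias also maps to its
   * target's intervals. aliasPairs is assumed to alternate alias, target.
   */
  static Map<String, int[][]> bindAliases(
      Map<String, int[][]> intervalsByValue, String[] aliasPairs) {
    Map<String, int[][]> bound = new HashMap<>(intervalsByValue);
    for (int i = 0; i + 1 < aliasPairs.length; i += 2) {
      String alias = aliasPairs[i];
      int[][] target = intervalsByValue.get(aliasPairs[i + 1]);
      if (target != null) {
        bound.put(alias, target); // alias shares the target's intervals
      }
      // Aliases whose target is unknown are skipped in this sketch.
    }
    return bound;
  }

  public static void main(String[] args) {
    Map<String, int[][]> base = new HashMap<>();
    // Simplified intervals for illustration, not real Unicode data.
    base.put("lowercaseletter", new int[][] {{'a', 'z'}});
    Map<String, int[][]> bound =
        bindAliases(base, new String[] {"ll", "lowercaseletter"});
    System.out.println(bound.containsKey("ll"));
  }
}
```

Mapping an alias to the same interval object as its target avoids duplicating the unpacked data for every alternative name.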
-
bindInvariantIntervals
private void bindInvariantIntervals()
Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
-
normalize
Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
- Parameters:
identifier - The identifier to normalize.
- Returns:
- The normalized identifier
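The three normalization steps listed above can be sketched in a few lines. This is an illustration of the described behavior, not the JFlex source: `IdentifierNormalizerSketch` is a hypothetical class, and the regular expression in `SEPARATORS` is an assumption about what a pattern such as WORD_SEP_PATTERN might contain.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the private normalize helper. */
final class IdentifierNormalizerSketch {
  // Assumed separator set: hyphen, underscore, whitespace, and parentheses.
  private static final Pattern SEPARATORS = Pattern.compile("[-_\\s()]");

  private IdentifierNormalizerSketch() {}

  static String normalize(String identifier) {
    // Step 1: downcase, independent of the default locale.
    String lower = identifier.toLowerCase(Locale.ROOT);
    // Step 2: remove whitespace, underscores, hyphens, and parentheses.
    String stripped = SEPARATORS.matcher(lower).replaceAll("");
    // Step 3: substitute '=' for every ':'.
    return stripped.replace(':', '=');
  }

  public static void main(String[] args) {
    System.out.println(normalize("General_Category: Lu")); // prints generalcategory=lu
    System.out.println(normalize("Script=Old-Italic"));    // prints script=olditalic
  }
}
```

With this kind of normalization, spellings such as "General_Category:Lu" and "general category = lu" reduce to the same lookup key.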
-