Class UnicodeProperties

java.lang.Object
jflex.core.unicode.UnicodeProperties

public class UnicodeProperties extends Object
  • Field Details

    • UNICODE_VERSIONS

      public static final String UNICODE_VERSIONS
    • DEFAULT_UNICODE_VERSION

      private static final String DEFAULT_UNICODE_VERSION
    • WORD_SEP_PATTERN

      private static final Pattern WORD_SEP_PATTERN
    • maximumCodePoint

      private int maximumCodePoint
    • propertyValueIntervals

      private final Map<String,IntCharSet> propertyValueIntervals
    • caselessMatchPartitions

      private String caselessMatchPartitions
    • caselessMatchPartitionSize

      private int caselessMatchPartitionSize
    • caselessMatches

      private IntCharSet[] caselessMatches
  • Constructor Details

  • Method Details

    • getMaximumCodePoint

      public int getMaximumCodePoint()
      Returns the maximum code point for the selected Unicode version.
      Returns:
      the maximum code point for the selected Unicode version.
    • getIntCharSet

      public IntCharSet getIntCharSet(String propertyValue)
      Returns the character interval set associated with the given property value for the selected Unicode version.
      Parameters:
      propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
      Returns:
      The character interval set corresponding to the given property value, if a match exists, and null otherwise.
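
      Because a failed lookup returns null rather than throwing, callers need an explicit null check. The following is a minimal sketch of that contract, not the JFlex implementation: PropertyLookupSketch, getIntervals, and contains are hypothetical names, plain int[][] ranges stand in for IntCharSet, and lowercasing the key is an assumed simplification of the real identifier matching.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class PropertyLookupSketch {
  // Hypothetical stand-in for propertyValueIntervals: name -> inclusive {start, end} ranges.
  private final Map<String, int[][]> intervalsByName = new HashMap<>();

  PropertyLookupSketch() {
    intervalsByName.put("ascii", new int[][] {{0x00, 0x7F}});
  }

  /** Returns the ranges for the given name, or null when nothing matches. */
  int[][] getIntervals(String propertyValue) {
    // Assumption: keys are stored lowercased, so the query is lowercased too.
    return intervalsByName.get(propertyValue.toLowerCase(Locale.ENGLISH));
  }

  /** Caller-side pattern: turn a null result into an explicit error. */
  boolean contains(String propertyValue, int codePoint) {
    int[][] intervals = getIntervals(propertyValue);
    if (intervals == null) {
      throw new IllegalArgumentException("Unknown Unicode property: " + propertyValue);
    }
    for (int[] range : intervals) {
      if (codePoint >= range[0] && codePoint <= range[1]) {
        return true;
      }
    }
    return false;
  }
}
```

      With this sketch, contains("ASCII", 'A') is true, contains("ASCII", 0x80) is false, and an unknown name fails fast instead of causing a NullPointerException later.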
    • getPropertyValues

      public Set<String> getPropertyValues()
      Returns the set of all properties, property values, and their aliases supported by the selected Unicode version.
      Returns:
      The set of all properties, property values, and their aliases supported by the selected Unicode version.
    • getCaselessMatches

      public IntCharSet getCaselessMatches(int c)
      Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.

      The first call to this method lazily initializes the backing data.

      Parameters:
      c - The character for which to return case-insensitive equivalents.
      Returns:
      All case-insensitively equivalent characters, or null if the given character is case-insensitively equivalent only to itself.
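
      The lazy initialization and the null-for-singletons behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the JFlex code: CaselessMatchSketch and isInitialized are hypothetical names, Set<Integer> stands in for IntCharSet, and the packed format (fixed-length records of code points, padded with 0) is assumed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CaselessMatchSketch {
  private final String packedPartitions;
  private final int partitionSize;
  private Map<Integer, Set<Integer>> matches; // stays null until the first lookup

  CaselessMatchSketch(String packedPartitions, int partitionSize) {
    this.packedPartitions = packedPartitions;
    this.partitionSize = partitionSize;
  }

  /** Returns the partition containing c, or null if c matches only itself. */
  Set<Integer> getCaselessMatches(int c) {
    if (matches == null) {
      initCaselessMatches(); // unpack on first use only
    }
    return matches.get(c);
  }

  boolean isInitialized() {
    return matches != null;
  }

  private void initCaselessMatches() {
    matches = new HashMap<>();
    int index = 0;
    while (index < packedPartitions.length()) {
      Set<Integer> partition = new TreeSet<>();
      // Read one record of up to partitionSize code points.
      for (int n = 0; n < partitionSize && index < packedPartitions.length(); n++) {
        int member = packedPartitions.codePointAt(index);
        index += Character.charCount(member);
        if (member != 0) { // assumed padding value for short partitions
          partition.add(member);
        }
      }
      // Every member of a partition maps to the same shared set.
      for (int member : partition) {
        matches.put(member, partition);
      }
    }
  }
}
```

      The sketch is not thread-safe; it only shows that unpacking is deferred until the first query and that code points outside every partition yield null.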
    • initCaselessMatches

      private void initCaselessMatches()
      Unpacks the caseless match data. Called from getCaselessMatches(int) to lazily initialize.
    • init

      Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
      Parameters:
      version - The Unicode version for which to bind data.
      Throws:
      UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
    • bind

      private void bind(String[] propertyValues, String[] intervals, String[] propertyValueAliases, int maximumCodePoint, String caselessMatchPartitions, int caselessMatchPartitionSize)
      Unpacks data for the selected Unicode version, populating propertyValueIntervals.
      Parameters:
      propertyValues - The list of property values for the selected Unicode version, in the same order as the corresponding packed data in the given intervals.
      intervals - The packed character intervals corresponding to and in the same order as the given propertyValues, for the selected Unicode version.
      propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
      maximumCodePoint - The maximum code point for the selected Unicode version.
      caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
      caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
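
      One way to read the propertyValueAliases parameter is as a flat array of alternating alias and property value entries, where each alias ends up sharing the interval data of its target. The sketch below illustrates only that step and works on already unpacked data. The layout, the class name AliasBindingSketch, the method bindAliases, and the use of int[][] in place of IntCharSet are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

final class AliasBindingSketch {
  /**
   * Registers each alias under the same interval data as its target property value.
   * Assumes aliasPairs holds alternating {alias, propertyValue} entries.
   */
  static Map<String, int[][]> bindAliases(
      Map<String, int[][]> intervalsByValue, String[] aliasPairs) {
    Map<String, int[][]> bound = new HashMap<>(intervalsByValue);
    for (int i = 0; i + 1 < aliasPairs.length; i += 2) {
      String alias = aliasPairs[i];
      String target = aliasPairs[i + 1];
      int[][] intervals = intervalsByValue.get(target);
      if (intervals != null) {
        bound.put(alias, intervals); // the alias shares the target's intervals
      }
    }
    return bound;
  }
}
```

      Sharing one interval object per alias keeps lookups by alias and by canonical name consistent without copying data.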
    • bindInvariantIntervals

      private void bindInvariantIntervals()
      Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
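
      These two properties do not depend on per-version property data: \p{ASCII} always covers U+0000 to U+007F, and \p{Any} covers every code point up to the maximum for the selected version. A minimal sketch, assuming lowercased keys and using int[] {start, end} pairs in place of IntCharSet (InvariantIntervalsSketch and invariantIntervals are hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;

final class InvariantIntervalsSketch {
  /** Builds the two version-independent intervals as inclusive {start, end} pairs. */
  static Map<String, int[]> invariantIntervals(int maximumCodePoint) {
    Map<String, int[]> intervals = new HashMap<>();
    intervals.put("ascii", new int[] {0x00, 0x7F});           // fixed range
    intervals.put("any", new int[] {0x00, maximumCodePoint}); // depends only on the maximum
    return intervals;
  }
}
```

      Only the upper bound of "any" varies: it is 0x10FFFF when the maximum code point is 0x10FFFF and 0xFFFF when the maximum is 0xFFFF.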
    • normalize

      private static String normalize(String identifier)
      Normalizes the given identifier by lowercasing it; removing whitespace, underscores, hyphens, and parentheses; and replacing every ':' with '='.
      Parameters:
      identifier - The identifier to normalize.
      Returns:
      The normalized identifier.
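
      The normalization steps can be sketched as below. This is an illustration rather than the JFlex source: NormalizeSketch is a hypothetical name, and the separator pattern is an assumption inferred from the description (it plays the role of WORD_SEP_PATTERN).

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class NormalizeSketch {
  // Assumed separator pattern: hyphens, underscores, whitespace, and parentheses.
  private static final Pattern WORD_SEP = Pattern.compile("[-_\\s()]");

  static String normalize(String identifier) {
    // Step 1: lowercase with a fixed locale so results do not depend on the default locale.
    String lowered = identifier.toLowerCase(Locale.ENGLISH);
    // Step 2: drop all separator characters.
    String stripped = WORD_SEP.matcher(lowered).replaceAll("");
    // Step 3: use '=' as the single property/value delimiter.
    return stripped.replace(':', '=');
  }
}
```

      Under these assumptions, normalize("General_Category:Lu") yields "generalcategory=lu" and normalize("White Space") yields "whitespace", so differently spelled names resolve to the same key.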