Class CharClasses
- Version:
- JFlex 1.8.2
-
Field Summary
Fields
Modifier and Type Field - Description
private List<IntCharSet> classes - the char classes
private static final boolean DEBUG - debug flag (for char classes only)
private static final Comparator<IntCharSet> INT_CHAR_SET_COMPARATOR - for sorting disjoint IntCharSets
static final int maxChar - the largest character that can be used in char classes
private int maxCharUsed - the largest character actually used in a specification
private UnicodeProperties unicodeProps - the UnicodeProperties the spec scanner used
-
Constructor Summary
Constructors
CharClasses() - Constructs a new CharClasses object.
-
Method Summary
Modifier and Type Method - Description
computeTables - Computes a two-level table structure representing this CharClasses object, where second-level blocks are shared if equal.
static CharClasses copyOf(CharClasses c) - Constructs a (deep) copy of the provided CharClasses object.
void dump() - Dumps charclasses to the dump output stream.
private static int[] flattenBlocks(List<CMapBlock> blocks) - Turns a list of second-level blocks into a flat array.
getCharClass(int code) - Returns a copy of a single char class partition by code.
int getClassCode(int codePoint) - Returns the code of the character class the specified character belongs to.
int[] getClassCodes(IntCharSet set, boolean negate) - Returns an array that contains the character class codes of all characters in the specified set of input characters.
getIntervals - Returns an array of all CharClassIntervals in this char class collection.
int getMaxCharCode() - Returns the greatest Unicode value of the current input character set.
int getNumClasses() - Returns the current number of character classes.
Pair<int[], int[]> getTables - Returns a two-level table structure for this char-class object.
void init - Provides space for classes of characters from 0 to maxCharCode.
boolean invariants() - Checks the invariants of this object.
void makeClass(int singleChar, boolean caseless) - Creates a new character class for the single character singleChar.
void makeClass(String str, boolean caseless) - Creates a new character class for each character of the specified String.
void makeClass(IntCharSet set, boolean caseless) - Updates the current partition, so that the specified set of characters gets a new character class.
void normalise() - Brings the partitions into a canonical order such that objects that implement the same partitions but in different order become equal.
void setMaxCharCode(int maxCharCode) - Sets the largest Unicode value of the current input character set.
toString()
toString(int theClass) - Returns a string representation of one char class.
-
Field Details
-
DEBUG
private static final boolean DEBUG
debug flag (for char classes only)
-
INT_CHAR_SET_COMPARATOR
for sorting disjoint IntCharSets -
maxChar
public static final int maxChar
the largest character that can be used in char classes
-
classes
the char classes -
maxCharUsed
private int maxCharUsed
the largest character actually used in a specification
-
unicodeProps
the UnicodeProperties the spec scanner used
-
-
Constructor Details
-
CharClasses
public CharClasses()
Constructs a new CharClasses object. CharClasses.init() is delayed until UnicodeProperties.init() has been called, since the max char code won't be known until then.
-
-
Method Details
-
init
Provides space for classes of characters from 0 to maxCharCode. Initially all characters are in class 0.
- Parameters:
maxCharCode
- the last character code to be considered (127 for 7bit lexers, 255 for 8bit lexers, and UnicodeProperties.getMaximumCodePoint() for Unicode lexers).
scanner
- the scanner containing the UnicodeProperties instance from which caseless
-
getMaxCharCode
public int getMaxCharCode()
Returns the greatest Unicode value of the current input character set.
- Returns:
- Unicode value.
-
setMaxCharCode
public void setMaxCharCode(int maxCharCode)
Sets the largest Unicode value of the current input character set.
- Parameters:
maxCharCode
- the largest character code used for the scanner (i.e. %7bit, %8bit, %16bit etc.)
-
getNumClasses
public int getNumClasses()
Returns the current number of character classes.
- Returns:
- number of character classes.
-
allClasses
- Returns:
- a deep-copy list of all char class partitions.
-
makeClass
Updates the current partition, so that the specified set of characters gets a new character class. Characters that are elements of set are not in the same equivalence class as characters that are not elements of set.
- Parameters:
set
- the set of characters to distinguish from the rest
caseless
- if true upper/lower/title case are considered equivalent
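
The partition update described above can be pictured as splitting every existing class that overlaps the new set into the part inside the set and the part outside it. The following is a minimal sketch of that idea only, not the JFlex implementation: it uses java.util.BitSet as a stand-in for IntCharSet, ignores the caseless flag, and the names PartitionSketch and refine are hypothetical. The order of the resulting classes may differ from what CharClasses produces.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration; BitSet stands in for IntCharSet.
class PartitionSketch {
  /** Splits every class that overlaps {@code set} into its part outside and its part inside the set. */
  static List<BitSet> refine(List<BitSet> classes, BitSet set) {
    List<BitSet> result = new ArrayList<>();
    for (BitSet cls : classes) {
      BitSet inside = (BitSet) cls.clone();
      inside.and(set);       // characters of cls that are in set
      BitSet outside = (BitSet) cls.clone();
      outside.andNot(set);   // characters of cls that are not in set
      if (!outside.isEmpty()) result.add(outside);
      if (!inside.isEmpty()) result.add(inside);
    }
    return result;
  }

  public static void main(String[] args) {
    BitSet all = new BitSet();
    all.set(0, 128);         // initially all characters 0..127 share one class
    List<BitSet> classes = new ArrayList<>();
    classes.add(all);

    BitSet digits = new BitSet();
    digits.set(48, 58);      // code points of '0'..'9'
    classes = refine(classes, digits);
    System.out.println(classes.size() + " classes after distinguishing the digits");
  }
}
```

After the call, no class contains both a digit and a non-digit, which is the property the method description states.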
-
getClassCode
public int getClassCode(int codePoint)
Returns the code of the character class the specified character belongs to.
- Parameters:
codePoint
- code point.
- Returns:
- code of the character class.
-
getCharClass
Returns a copy of a single char class partition by code.
- Parameters:
code
- the code of the char class partition to return.
- Returns:
- a copy of the char class with the specified code.
-
dump
public void dump()
Dumps charclasses to the dump output stream.
-
toString
Returns a string representation of one char class.
- Parameters:
theClass
- the index of the class to
- Returns:
- a String object.
-
toString
-
makeClass
public void makeClass(int singleChar, boolean caseless)
Creates a new character class for the single character singleChar.
- Parameters:
singleChar
- character.
caseless
- if true upper/lower/title case are considered equivalent
-
makeClass
Creates a new character class for each character of the specified String.
- Parameters:
str
- the String to iterate single char class creation over.
caseless
- if true upper/lower/title case are considered equivalent
-
getClassCodes
Returns an array that contains the character class codes of all characters in the specified set of input characters. -
invariants
public boolean invariants()
Checks the invariants of this object. All classes must be disjoint, and their union must be the entire input set.
- Returns:
- true when the invariants of this object hold.
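
The two conditions above (pairwise disjoint classes whose union is the whole input set) can be checked in a single pass. This is a sketch under assumptions, not the actual CharClasses code: BitSet replaces IntCharSet, the input set is taken to be 0..maxCharCode, and the names InvariantSketch and holds are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration; BitSet stands in for IntCharSet.
class InvariantSketch {
  /** True if the classes are pairwise disjoint and together cover exactly 0..maxCharCode. */
  static boolean holds(List<BitSet> classes, int maxCharCode) {
    BitSet seen = new BitSet(maxCharCode + 1);
    for (BitSet cls : classes) {
      if (cls.intersects(seen)) return false; // overlaps an earlier class
      seen.or(cls);
    }
    // All bits 0..maxCharCode are set and no bit above maxCharCode is set.
    return seen.cardinality() == maxCharCode + 1 && seen.length() == maxCharCode + 1;
  }

  public static void main(String[] args) {
    BitSet low = new BitSet();
    low.set(0, 10);
    BitSet high = new BitSet();
    high.set(10, 16);
    List<BitSet> classes = new ArrayList<>();
    classes.add(low);
    classes.add(high);
    System.out.println(holds(classes, 15));
  }
}
```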
-
normalise
public void normalise()
Brings the partitions into a canonical order such that objects that implement the same partitions but in different order become equal. For example, [ {0}, {1} ] and [ {1}, {0} ] implement the same partition of the set {0,1} but have different content. Different order will lead to different input assignments in the NFA and DFA phases and will make otherwise equal automata look distinct.
This is not needed for correctness, but it makes the comparison of output DFAs (e.g. in the test suite) for equivalence more robust.
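
The [ {0}, {1} ] versus [ {1}, {0} ] example can be reproduced with a small sketch. It assumes that sorting the disjoint, non-empty classes by their smallest member is an acceptable canonical order; the ordering that INT_CHAR_SET_COMPARATOR actually uses is not stated here, and NormaliseSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; TreeSet<Integer> stands in for IntCharSet.
class NormaliseSketch {
  /** Returns the classes sorted by smallest member (assumes non-empty, disjoint classes). */
  static List<TreeSet<Integer>> normalise(List<TreeSet<Integer>> classes) {
    List<TreeSet<Integer>> sorted = new ArrayList<>(classes);
    sorted.sort(Comparator.comparingInt((TreeSet<Integer> s) -> s.first()));
    return sorted;
  }

  public static void main(String[] args) {
    List<TreeSet<Integer>> a = new ArrayList<>();
    a.add(new TreeSet<Integer>(Arrays.asList(0)));
    a.add(new TreeSet<Integer>(Arrays.asList(1)));
    List<TreeSet<Integer>> b = new ArrayList<>();
    b.add(new TreeSet<Integer>(Arrays.asList(1)));
    b.add(new TreeSet<Integer>(Arrays.asList(0)));
    System.out.println(a.equals(b));                       // prints false
    System.out.println(normalise(a).equals(normalise(b))); // prints true
  }
}
```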
-
copyOf
Constructs a (deep) copy of the provided CharClasses object.
- Parameters:
c
- the CharClasses to copy
- Returns:
- a deep copy of c
-
getIntervals
Returns an array of all CharClassIntervals in this char class collection. The array is ordered by char code, i.e. result[i+1].start = result[i].end+1. Each CharClassInterval contains the number of the char class it belongs to.
- Returns:
- an array of all CharClassInterval in this char class collection.
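
To illustrate the ordering property result[i+1].start = result[i].end+1, the sketch below derives contiguous intervals from a per-character class-code array. It is an assumption-laden stand-in: the nested Interval type only mimics what a CharClassInterval is described as holding, the field names start, end and charClass are chosen for this example, and IntervalSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of contiguous, ordered class intervals.
class IntervalSketch {
  /** Stand-in for CharClassInterval: the range [start, end] and its class code. */
  static final class Interval {
    final int start;
    final int end;
    final int charClass;

    Interval(int start, int end, int charClass) {
      this.start = start;
      this.end = end;
      this.charClass = charClass;
    }
  }

  /** classOf[c] is the class code of character c; returns maximal runs of equal codes. */
  static List<Interval> intervals(int[] classOf) {
    List<Interval> result = new ArrayList<>();
    int start = 0;
    for (int c = 1; c <= classOf.length; c++) {
      // Close the current run at the end of input or when the class code changes.
      if (c == classOf.length || classOf[c] != classOf[start]) {
        result.add(new Interval(start, c - 1, classOf[start]));
        start = c;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    for (Interval i : intervals(new int[] {0, 0, 1, 1, 1, 0})) {
      System.out.println("[" + i.start + "-" + i.end + "] -> class " + i.charClass);
    }
  }
}
```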
-
computeTables
Computes a two-level table structure representing this CharClasses object, where second-level blocks are shared if equal. The hope is that this sharing happens (very) often with a large number of blocks being mapped to the same character class.
- Returns:
- a pair of a top-level table, and a list of second-level blocks for this char class object.
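
The block sharing described above can be sketched as follows. This is an illustration under assumptions rather than the JFlex code: blocks are plain int[] arrays instead of CMapBlock objects, BLOCK_BITS is set to 2 to keep the example small, the input length is assumed to be a multiple of the block size, and the top-level table is assumed to store the offset of each block in the flattened second level. TwoLevelSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a two-level table with shared second-level blocks.
class TwoLevelSketch {
  static final int BLOCK_BITS = 2;               // assumption: tiny blocks for readability
  static final int BLOCK_SIZE = 1 << BLOCK_BITS;

  final int[] top;          // for each input block: offset of its second-level block
  final List<int[]> blocks; // the distinct second-level blocks, in order of first use

  /** classOf[c] is the class code of character c; its length must be a multiple of BLOCK_SIZE. */
  TwoLevelSketch(int[] classOf) {
    top = new int[classOf.length / BLOCK_SIZE];
    blocks = new ArrayList<>();
    Map<List<Integer>, Integer> offsets = new HashMap<>();
    for (int b = 0; b < top.length; b++) {
      int[] block = Arrays.copyOfRange(classOf, b * BLOCK_SIZE, (b + 1) * BLOCK_SIZE);
      List<Integer> key = new ArrayList<>();
      for (int v : block) key.add(v);            // int[] has identity equality, so use a List key
      Integer offset = offsets.get(key);
      if (offset == null) {                      // first time this block content is seen
        offset = blocks.size() * BLOCK_SIZE;
        offsets.put(key, offset);
        blocks.add(block);
      }
      top[b] = offset;                           // equal blocks share the same offset
    }
  }

  public static void main(String[] args) {
    TwoLevelSketch t = new TwoLevelSketch(new int[] {0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0});
    System.out.println(Arrays.toString(t.top) + ", distinct blocks: " + t.blocks.size());
  }
}
```

In the example input, the first and third blocks have identical content, so only two second-level blocks are stored for three top-level entries.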
-
flattenBlocks
Turn a list of second-level blocks into a flat array. -
getTables
Returns a two-level table structure for this char-class object. The char class of input x is snd[fst[x >> BLOCK_BITS] | (x & BLOCK_MASK)], where BLOCK_MASK = BLOCK_SIZE - 1, and the index of the first block in the top level is guaranteed to be 0 (which means the fst lookup can be skipped if x <= BLOCK_MASK).
-
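
The lookup formula can be exercised with a small hand-built pair of tables. The sketch assumes BLOCK_BITS = 2 purely for readability (the value JFlex uses is not stated here) and uses the hypothetical names LookupSketch and charClass; fst and snd play the roles of the first and second component of the returned pair.

```java
// Hypothetical illustration of the two-level char class lookup.
class LookupSketch {
  static final int BLOCK_BITS = 2;               // assumption: tiny blocks for readability
  static final int BLOCK_SIZE = 1 << BLOCK_BITS;
  static final int BLOCK_MASK = BLOCK_SIZE - 1;

  /** Returns the char class of x from the top-level table fst and the flat block array snd. */
  static int charClass(int[] fst, int[] snd, int x) {
    if (x <= BLOCK_MASK) {
      return snd[x];                             // first block starts at offset 0, skip fst
    }
    // fst holds block offsets (multiples of BLOCK_SIZE), so OR-ing in the low bits is safe.
    return snd[fst[x >> BLOCK_BITS] | (x & BLOCK_MASK)];
  }

  public static void main(String[] args) {
    int[] fst = {0, 4, 0};                       // blocks 0 and 2 share the block at offset 0
    int[] snd = {0, 0, 1, 1, 2, 2, 2, 2};
    for (int x = 0; x < 12; x++) {
      System.out.println(x + " -> class " + charClass(fst, snd, x));
    }
  }
}
```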