java.lang

Class Character

public final class Character extends Object implements Serializable, Comparable<Character>

Wrapper class for the primitive char data type. In addition, this class allows one to retrieve property information and perform transformations on the characters defined in the Unicode Standard, Version 4.0.0. java.lang.Character does not hard-code this information: it retrieves it from a separate database, gnu.java.lang.CharData, which can be easily upgraded.

For predicates, boundaries are used to describe the set of characters for which the method will return true. Boundaries are written in conventional regular expression notation. See Section 5.13 of the Unicode Standard, Version 4.0, for the boundary specification.
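
As an illustration of how a boundary reads, the sketch below restates the boundary given for digit(char, int) later in this class, [Nd]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A, as a Java predicate. The names BoundaryDemo and inDigitBoundary are hypothetical and not part of java.lang.Character, and the sketch assumes that [Nd] denotes the DECIMAL_DIGIT_NUMBER general category as reported by getType.

```java
class BoundaryDemo {
    // True if cp lies inside [Nd]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A.
    // Hypothetical helper; not a method of java.lang.Character.
    static boolean inDigitBoundary(int cp) {
        return Character.getType(cp) == Character.DECIMAL_DIGIT_NUMBER // [Nd]
            || (cp >= 0x0041 && cp <= 0x005A)   // 'A'-'Z'
            || (cp >= 0x0061 && cp <= 0x007A)   // 'a'-'z'
            || (cp >= 0xFF21 && cp <= 0xFF3A)   // full-width 'A'-'Z'
            || (cp >= 0xFF41 && cp <= 0xFF5A);  // full-width 'a'-'z'
    }

    public static void main(String[] args) {
        System.out.println(inDigitBoundary('7')); // decimal digit: inside
        System.out.println(inDigitBoundary('g')); // ASCII letter: inside
        System.out.println(inDigitBoundary('+')); // math symbol: outside
    }
}
```

A boundary only describes which arguments a method considers; a character inside the boundary may still yield a "not valid" result, as 'g' does for digit with radix 16.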

See http://www.unicode.org for more information on the Unicode Standard.

Since: 1.0

See Also: CharData

Status: partly updated to 1.5; some things still missing

Nested Class Summary
static class Character.Subset
A subset of Unicode blocks.
static class Character.UnicodeBlock
A family of character subsets in the Unicode specification.
Field Summary
static byte COMBINING_SPACING_MARK
Mc = Mark, Spacing Combining (Normative).
static byte CONNECTOR_PUNCTUATION
Pc = Punctuation, Connector (Informative).
static byte CONTROL
Cc = Other, Control (Normative).
static byte CURRENCY_SYMBOL
Sc = Symbol, Currency (Informative).
static byte DASH_PUNCTUATION
Pd = Punctuation, Dash (Informative).
static byte DECIMAL_DIGIT_NUMBER
Nd = Number, Decimal Digit (Normative).
static byte DIRECTIONALITY_ARABIC_NUMBER
Weak bidirectional character type "AN".
static byte DIRECTIONALITY_BOUNDARY_NEUTRAL
Weak bidirectional character type "BN".
static byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Weak bidirectional character type "CS".
static byte DIRECTIONALITY_EUROPEAN_NUMBER
Weak bidirectional character type "EN".
static byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Weak bidirectional character type "ES".
static byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Weak bidirectional character type "ET".
static byte DIRECTIONALITY_LEFT_TO_RIGHT
Strong bidirectional character type "L".
static byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Strong bidirectional character type "LRE".
static byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Strong bidirectional character type "LRO".
static byte DIRECTIONALITY_NONSPACING_MARK
Weak bidirectional character type "NSM".
static byte DIRECTIONALITY_OTHER_NEUTRALS
Neutral bidirectional character type "ON".
static byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
Neutral bidirectional character type "B".
static byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Weak bidirectional character type "PDF".
static byte DIRECTIONALITY_RIGHT_TO_LEFT
Strong bidirectional character type "R".
static byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Strong bidirectional character type "AL".
static byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Strong bidirectional character type "RLE".
static byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Strong bidirectional character type "RLO".
static byte DIRECTIONALITY_SEGMENT_SEPARATOR
Neutral bidirectional character type "S".
static byte DIRECTIONALITY_UNDEFINED
Undefined bidirectional character type.
static byte DIRECTIONALITY_WHITESPACE
Neutral bidirectional character type "WS".
static byte ENCLOSING_MARK
Me = Mark, Enclosing (Normative).
static byte END_PUNCTUATION
Pe = Punctuation, Close (Informative).
static byte FINAL_QUOTE_PUNCTUATION
Pf = Punctuation, Final Quote (Informative).
static byte FORMAT
Cf = Other, Format (Normative).
static byte INITIAL_QUOTE_PUNCTUATION
Pi = Punctuation, Initial Quote (Informative).
static byte LETTER_NUMBER
Nl = Number, Letter (Normative).
static byte LINE_SEPARATOR
Zl = Separator, Line (Normative).
static byte LOWERCASE_LETTER
Ll = Letter, Lowercase (Informative).
static byte MATH_SYMBOL
Sm = Symbol, Math (Informative).
static int MAX_CODE_POINT
The maximum Unicode 4.0 code point, which is greater than the range of the char data type.
static char MAX_HIGH_SURROGATE
The maximum Unicode high surrogate code unit, or leading-surrogate, in the UTF-16 character encoding.
static char MAX_LOW_SURROGATE
The maximum Unicode low surrogate code unit, or trailing-surrogate, in the UTF-16 character encoding.
static int MAX_RADIX
Largest value allowed for radix arguments in Java.
static char MAX_SURROGATE
The maximum Unicode surrogate code unit in the UTF-16 character encoding.
static char MAX_VALUE
The maximum value the char data type can hold.
static int MIN_CODE_POINT
The minimum Unicode 4.0 code point.
static char MIN_HIGH_SURROGATE
The minimum Unicode high surrogate code unit, or leading-surrogate, in the UTF-16 character encoding.
static char MIN_LOW_SURROGATE
The minimum Unicode low surrogate code unit, or trailing-surrogate, in the UTF-16 character encoding.
static int MIN_RADIX
Smallest value allowed for radix arguments in Java.
static int MIN_SUPPLEMENTARY_CODE_POINT
The lowest possible supplementary Unicode code point (the first code point outside the basic multilingual plane (BMP)).
static char MIN_SURROGATE
The minimum Unicode surrogate code unit in the UTF-16 character encoding.
static char MIN_VALUE
The minimum value the char data type can hold.
static byte MODIFIER_LETTER
Lm = Letter, Modifier (Informative).
static byte MODIFIER_SYMBOL
Sk = Symbol, Modifier (Informative).
static byte NON_SPACING_MARK
Mn = Mark, Non-Spacing (Normative).
static byte OTHER_LETTER
Lo = Letter, Other (Informative).
static byte OTHER_NUMBER
No = Number, Other (Normative).
static byte OTHER_PUNCTUATION
Po = Punctuation, Other (Informative).
static byte OTHER_SYMBOL
So = Symbol, Other (Informative).
static byte PARAGRAPH_SEPARATOR
Zp = Separator, Paragraph (Normative).
static byte PRIVATE_USE
Co = Other, Private Use (Normative).
static int SIZE
The number of bits needed to represent a char.
static byte SPACE_SEPARATOR
Zs = Separator, Space (Normative).
static byte START_PUNCTUATION
Ps = Punctuation, Open (Informative).
static byte SURROGATE
Cs = Other, Surrogate (Normative).
static byte TITLECASE_LETTER
Lt = Letter, Titlecase (Informative).
static Class<Character> TYPE
Class object representing the primitive char data type.
static byte UNASSIGNED
Cn = Other, Not Assigned (Normative).
static byte UPPERCASE_LETTER
Lu = Letter, Uppercase (Informative).
Constructor Summary
Character(char value)
Wraps up a character.
Method Summary
static int charCount(int codePoint)
Returns the number of 16-bit characters required to represent the given code point.
char charValue()
Returns the character which has been wrapped by this class.
static int codePointAt(CharSequence sequence, int index)
Get the code point at the specified index in the CharSequence.
static int codePointAt(char[] chars, int index)
Get the code point at the specified index in the char array.
static int codePointAt(char[] chars, int index, int limit)
Get the code point at the specified index in the char array.
static int codePointBefore(char[] chars, int index)
Get the code point before the specified index.
static int codePointBefore(char[] chars, int index, int start)
Get the code point before the specified index.
static int codePointBefore(CharSequence sequence, int index)
Get the code point before the specified index.
static int codePointCount(CharSequence seq, int beginIndex, int endIndex)
Returns the number of Unicode code points in the specified range of the given CharSequence.
static int codePointCount(char[] a, int offset, int count)
Returns the number of Unicode code points in the specified range of the given char array.
int compareTo(Character anotherCharacter)
Compares another Character to this Character, numerically.
static int digit(char ch, int radix)
Converts a character into a digit of the specified radix.
static int digit(int codePoint, int radix)
Converts a character into a digit of the specified radix.
boolean equals(Object o)
Determines if an object is equal to this object.
static char forDigit(int digit, int radix)
Converts a digit into a character which represents that digit in a specified radix.
static byte getDirectionality(char ch)
Returns the Unicode directionality property of the character.
static byte getDirectionality(int codePoint)
Returns the Unicode directionality property of the character.
static int getNumericValue(char ch)
Returns the Unicode numeric value property of a character.
static int getNumericValue(int codePoint)
Returns the Unicode numeric value property of a character.
static int getType(char ch)
Returns the Unicode general category property of a character.
static int getType(int codePoint)
Returns the Unicode general category property of a character.
int hashCode()
Returns the numerical value (unsigned) of the wrapped character.
static boolean isDefined(char ch)
Determines if a character is part of the Unicode Standard.
static boolean isDefined(int codePoint)
Determines if a character is part of the Unicode Standard.
static boolean isDigit(char ch)
Determines if a character is a Unicode decimal digit.
static boolean isDigit(int codePoint)
Determines if a character is a Unicode decimal digit.
static boolean isHighSurrogate(char ch)
Returns true if the given character is a high surrogate.
static boolean isIdentifierIgnorable(char ch)
Determines if a character is ignorable in a Unicode identifier.
static boolean isIdentifierIgnorable(int codePoint)
Determines if a character is ignorable in a Unicode identifier.
static boolean isISOControl(char ch)
Determines if a character has the ISO Control property.
static boolean isISOControl(int codePoint)
Determines if the character is an ISO Control character.
static boolean isJavaIdentifierPart(char ch)
Determines if a character can follow the first letter in a Java identifier.
static boolean isJavaIdentifierPart(int codePoint)
Determines if a character can follow the first letter in a Java identifier.
static boolean isJavaIdentifierStart(char ch)
Determines if a character can start a Java identifier.
static boolean isJavaIdentifierStart(int codePoint)
Determines if a character can start a Java identifier.
static boolean isJavaLetter(char ch)
Determines if a character can start a Java identifier.
static boolean isJavaLetterOrDigit(char ch)
Determines if a character can follow the first letter in a Java identifier.
static boolean isLetter(char ch)
Determines if a character is a Unicode letter.
static boolean isLetter(int codePoint)
Determines if a character is a Unicode letter.
static boolean isLetterOrDigit(char ch)
Determines if a character is a Unicode letter or a Unicode digit.
static boolean isLetterOrDigit(int codePoint)
Determines if a character is a Unicode letter or a Unicode digit.
static boolean isLowerCase(char ch)
Determines if a character is a Unicode lowercase letter.
static boolean isLowerCase(int codePoint)
Determines if a character is a Unicode lowercase letter.
static boolean isLowSurrogate(char ch)
Returns true if the given character is a low surrogate.
static boolean isMirrored(char ch)
Determines whether the character is mirrored according to Unicode.
static boolean isMirrored(int codePoint)
Determines whether the character is mirrored according to Unicode.
static boolean isSpace(char ch)
Determines if a character is an ISO-LATIN-1 space.
static boolean isSpaceChar(char ch)
Determines if a character is a Unicode space character.
static boolean isSpaceChar(int codePoint)
Determines if a character is a Unicode space character.
static boolean isSupplementaryCodePoint(int codePoint)
Determines whether the specified code point is in the range 0x10000 ..
static boolean isSurrogatePair(char ch1, char ch2)
Returns true if the given characters compose a surrogate pair.
static boolean isTitleCase(char ch)
Determines if a character is a Unicode titlecase letter.
static boolean isTitleCase(int codePoint)
Determines if a character is a Unicode titlecase letter.
static boolean isUnicodeIdentifierPart(char ch)
Determines if a character can follow the first letter in a Unicode identifier.
static boolean isUnicodeIdentifierPart(int codePoint)
Determines if a character can follow the first letter in a Unicode identifier.
static boolean isUnicodeIdentifierStart(char ch)
Determines if a character can start a Unicode identifier.
static boolean isUnicodeIdentifierStart(int codePoint)
Determines if a character can start a Unicode identifier.
static boolean isUpperCase(char ch)
Determines if a character is a Unicode uppercase letter.
static boolean isUpperCase(int codePoint)
Determines if a character is a Unicode uppercase letter.
static boolean isValidCodePoint(int codePoint)
Determines whether the specified code point is in the range 0x0000 ..
static boolean isWhitespace(char ch)
Determines if a character is Java whitespace.
static boolean isWhitespace(int codePoint)
Determines if a character is Java whitespace.
static int offsetByCodePoints(CharSequence seq, int index, int codePointOffset)
Returns the index into the given CharSequence that is offset codePointOffset code points from index.
static int offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset)
Returns the index into the given char subarray that is offset codePointOffset code points from index.
static char reverseBytes(char val)
Reverse the bytes in val.
static char[] toChars(int codePoint)
Converts a Unicode code point to a UTF-16 representation of that code point.
static int toChars(int codePoint, char[] dst, int dstIndex)
Converts a Unicode code point to its UTF-16 representation.
static int toCodePoint(char high, char low)
Given a valid surrogate pair, this returns the corresponding code point.
static char toLowerCase(char ch)
Converts a Unicode character into its lowercase equivalent mapping.
static int toLowerCase(int codePoint)
Converts a Unicode character into its lowercase equivalent mapping.
String toString()
Converts the wrapped character into a String.
static String toString(char ch)
Returns a String of length 1 representing the specified character.
static char toTitleCase(char ch)
Converts a Unicode character into its titlecase equivalent mapping.
static int toTitleCase(int codePoint)
Converts a Unicode character into its titlecase equivalent mapping.
static char toUpperCase(char ch)
Converts a Unicode character into its uppercase equivalent mapping.
static int toUpperCase(int codePoint)
Converts a Unicode character into its uppercase equivalent mapping.
static Character valueOf(char val)
Returns a Character object wrapping the value.

Field Detail

COMBINING_SPACING_MARK

public static final byte COMBINING_SPACING_MARK
Mc = Mark, Spacing Combining (Normative).

Since: 1.1

CONNECTOR_PUNCTUATION

public static final byte CONNECTOR_PUNCTUATION
Pc = Punctuation, Connector (Informative).

Since: 1.1

CONTROL

public static final byte CONTROL
Cc = Other, Control (Normative).

Since: 1.1

CURRENCY_SYMBOL

public static final byte CURRENCY_SYMBOL
Sc = Symbol, Currency (Informative).

Since: 1.1

DASH_PUNCTUATION

public static final byte DASH_PUNCTUATION
Pd = Punctuation, Dash (Informative).

Since: 1.1

DECIMAL_DIGIT_NUMBER

public static final byte DECIMAL_DIGIT_NUMBER
Nd = Number, Decimal Digit (Normative).

Since: 1.1

DIRECTIONALITY_ARABIC_NUMBER

public static final byte DIRECTIONALITY_ARABIC_NUMBER
Weak bidirectional character type "AN".

Since: 1.4

DIRECTIONALITY_BOUNDARY_NEUTRAL

public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL
Weak bidirectional character type "BN".

Since: 1.4

DIRECTIONALITY_COMMON_NUMBER_SEPARATOR

public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Weak bidirectional character type "CS".

Since: 1.4

DIRECTIONALITY_EUROPEAN_NUMBER

public static final byte DIRECTIONALITY_EUROPEAN_NUMBER
Weak bidirectional character type "EN".

Since: 1.4

DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR

public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Weak bidirectional character type "ES".

Since: 1.4

DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR

public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Weak bidirectional character type "ET".

Since: 1.4

DIRECTIONALITY_LEFT_TO_RIGHT

public static final byte DIRECTIONALITY_LEFT_TO_RIGHT
Strong bidirectional character type "L".

Since: 1.4

DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING

public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Strong bidirectional character type "LRE".

Since: 1.4

DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE

public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Strong bidirectional character type "LRO".

Since: 1.4

DIRECTIONALITY_NONSPACING_MARK

public static final byte DIRECTIONALITY_NONSPACING_MARK
Weak bidirectional character type "NSM".

Since: 1.4

DIRECTIONALITY_OTHER_NEUTRALS

public static final byte DIRECTIONALITY_OTHER_NEUTRALS
Neutral bidirectional character type "ON".

Since: 1.4

DIRECTIONALITY_PARAGRAPH_SEPARATOR

public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
Neutral bidirectional character type "B".

Since: 1.4

DIRECTIONALITY_POP_DIRECTIONAL_FORMAT

public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Weak bidirectional character type "PDF".

Since: 1.4

DIRECTIONALITY_RIGHT_TO_LEFT

public static final byte DIRECTIONALITY_RIGHT_TO_LEFT
Strong bidirectional character type "R".

Since: 1.4

DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC

public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Strong bidirectional character type "AL".

Since: 1.4

DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING

public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Strong bidirectional character type "RLE".

Since: 1.4

DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE

public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Strong bidirectional character type "RLO".

Since: 1.4

DIRECTIONALITY_SEGMENT_SEPARATOR

public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR
Neutral bidirectional character type "S".

Since: 1.4

DIRECTIONALITY_UNDEFINED

public static final byte DIRECTIONALITY_UNDEFINED
Undefined bidirectional character type. Undefined char values have undefined directionality in the Unicode specification.

Since: 1.4

DIRECTIONALITY_WHITESPACE

public static final byte DIRECTIONALITY_WHITESPACE
Neutral bidirectional character type "WS".

Since: 1.4

ENCLOSING_MARK

public static final byte ENCLOSING_MARK
Me = Mark, Enclosing (Normative).

Since: 1.1

END_PUNCTUATION

public static final byte END_PUNCTUATION
Pe = Punctuation, Close (Informative).

Since: 1.1

FINAL_QUOTE_PUNCTUATION

public static final byte FINAL_QUOTE_PUNCTUATION
Pf = Punctuation, Final Quote (Informative).

Since: 1.4

FORMAT

public static final byte FORMAT
Cf = Other, Format (Normative).

Since: 1.1

INITIAL_QUOTE_PUNCTUATION

public static final byte INITIAL_QUOTE_PUNCTUATION
Pi = Punctuation, Initial Quote (Informative).

Since: 1.4

LETTER_NUMBER

public static final byte LETTER_NUMBER
Nl = Number, Letter (Normative).

Since: 1.1

LINE_SEPARATOR

public static final byte LINE_SEPARATOR
Zl = Separator, Line (Normative).

Since: 1.1

LOWERCASE_LETTER

public static final byte LOWERCASE_LETTER
Ll = Letter, Lowercase (Informative).

Since: 1.1

MATH_SYMBOL

public static final byte MATH_SYMBOL
Sm = Symbol, Math (Informative).

Since: 1.1

MAX_CODE_POINT

public static final int MAX_CODE_POINT
The maximum Unicode 4.0 code point, which is greater than the range of the char data type. This value is 0x10FFFF.

Since: 1.5

MAX_HIGH_SURROGATE

public static final char MAX_HIGH_SURROGATE
The maximum Unicode high surrogate code unit, or leading-surrogate, in the UTF-16 character encoding. This value is '\uDBFF'.

Since: 1.5

MAX_LOW_SURROGATE

public static final char MAX_LOW_SURROGATE
The maximum Unicode low surrogate code unit, or trailing-surrogate, in the UTF-16 character encoding. This value is '\uDFFF'.

Since: 1.5

MAX_RADIX

public static final int MAX_RADIX
Largest value allowed for radix arguments in Java. This value is 36.

See Also: digit(char, int), forDigit(int, int), Integer.toString(int, int), Integer.valueOf(String)

MAX_SURROGATE

public static final char MAX_SURROGATE
The maximum Unicode surrogate code unit in the UTF-16 character encoding. This value is '\uDFFF'.

Since: 1.5

MAX_VALUE

public static final char MAX_VALUE
The maximum value the char data type can hold. This value is '\uFFFF'.

MIN_CODE_POINT

public static final int MIN_CODE_POINT
The minimum Unicode 4.0 code point. This value is 0.

Since: 1.5

MIN_HIGH_SURROGATE

public static final char MIN_HIGH_SURROGATE
The minimum Unicode high surrogate code unit, or leading-surrogate, in the UTF-16 character encoding. This value is '\uD800'.

Since: 1.5

MIN_LOW_SURROGATE

public static final char MIN_LOW_SURROGATE
The minimum Unicode low surrogate code unit, or trailing-surrogate, in the UTF-16 character encoding. This value is '\uDC00'.

Since: 1.5

MIN_RADIX

public static final int MIN_RADIX
Smallest value allowed for radix arguments in Java. This value is 2.

See Also: digit(char, int), forDigit(int, int), Integer.toString(int, int), Integer.valueOf(String)

MIN_SUPPLEMENTARY_CODE_POINT

public static final int MIN_SUPPLEMENTARY_CODE_POINT
The lowest possible supplementary Unicode code point (the first code point outside the basic multilingual plane (BMP)). This value is 0x10000.

MIN_SURROGATE

public static final char MIN_SURROGATE
The minimum Unicode surrogate code unit in the UTF-16 character encoding. This value is '\uD800'.

Since: 1.5
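
The surrogate and code point constants above fit together through the UTF-16 encoding arithmetic. The following is a minimal sketch of that arithmetic; the class SurrogateDemo and its encode method are hypothetical names used only for illustration, and in practice toChars(int) does the same job.

```java
class SurrogateDemo {
    // Splits a supplementary code point into its high and low surrogate.
    // Hypothetical helper; Character.toChars(int) is the library equivalent.
    static char[] encode(int codePoint) {
        if (codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT
                || codePoint > Character.MAX_CODE_POINT) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - Character.MIN_SUPPLEMENTARY_CODE_POINT;
        // The upper 10 bits select the high surrogate, the lower 10 bits the low one.
        char high = (char) (Character.MIN_HIGH_SURROGATE + (offset >> 10));
        char low = (char) (Character.MIN_LOW_SURROGATE + (offset & 0x3FF));
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = encode(0x1D11E); // MUSICAL SYMBOL G CLEF
        System.out.println(Integer.toHexString(pair[0])); // prints "d834"
        System.out.println(Integer.toHexString(pair[1])); // prints "dd1e"
    }
}
```

Encoding MAX_CODE_POINT this way yields MAX_HIGH_SURROGATE followed by MAX_LOW_SURROGATE, which is why those constants bound the two ranges.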

MIN_VALUE

public static final char MIN_VALUE
The minimum value the char data type can hold. This value is '\u0000'.

MODIFIER_LETTER

public static final byte MODIFIER_LETTER
Lm = Letter, Modifier (Informative).

Since: 1.1

MODIFIER_SYMBOL

public static final byte MODIFIER_SYMBOL
Sk = Symbol, Modifier (Informative).

Since: 1.1

NON_SPACING_MARK

public static final byte NON_SPACING_MARK
Mn = Mark, Non-Spacing (Normative).

Since: 1.1

OTHER_LETTER

public static final byte OTHER_LETTER
Lo = Letter, Other (Informative).

Since: 1.1

OTHER_NUMBER

public static final byte OTHER_NUMBER
No = Number, Other (Normative).

Since: 1.1

OTHER_PUNCTUATION

public static final byte OTHER_PUNCTUATION
Po = Punctuation, Other (Informative).

Since: 1.1

OTHER_SYMBOL

public static final byte OTHER_SYMBOL
So = Symbol, Other (Informative).

Since: 1.1

PARAGRAPH_SEPARATOR

public static final byte PARAGRAPH_SEPARATOR
Zp = Separator, Paragraph (Normative).

Since: 1.1

PRIVATE_USE

public static final byte PRIVATE_USE
Co = Other, Private Use (Normative).

Since: 1.1

SIZE

public static final int SIZE
The number of bits needed to represent a char.

Since: 1.5

SPACE_SEPARATOR

public static final byte SPACE_SEPARATOR
Zs = Separator, Space (Normative).

Since: 1.1

START_PUNCTUATION

public static final byte START_PUNCTUATION
Ps = Punctuation, Open (Informative).

Since: 1.1

SURROGATE

public static final byte SURROGATE
Cs = Other, Surrogate (Normative).

Since: 1.1

TITLECASE_LETTER

public static final byte TITLECASE_LETTER
Lt = Letter, Titlecase (Informative).

Since: 1.1

TYPE

public static final Class<Character> TYPE
Class object representing the primitive char data type.

Since: 1.1

UNASSIGNED

public static final byte UNASSIGNED
Cn = Other, Not Assigned (Normative).

Since: 1.1

UPPERCASE_LETTER

public static final byte UPPERCASE_LETTER
Lu = Letter, Uppercase (Informative).

Since: 1.1
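
The general category constants above are the values returned by getType. As a rough sketch of how they might be used, the hypothetical helper below maps a few of them back to their two-letter Unicode abbreviations; CategoryDemo and abbreviation are illustrative names, and only a handful of categories are covered.

```java
class CategoryDemo {
    // Maps a code point to the abbreviation of its general category.
    // Only a few categories are handled; everything else yields "other".
    static String abbreviation(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:      return "Lu";
            case Character.LOWERCASE_LETTER:      return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER:  return "Nd";
            case Character.SPACE_SEPARATOR:       return "Zs";
            case Character.CONNECTOR_PUNCTUATION: return "Pc";
            case Character.CURRENCY_SYMBOL:       return "Sc";
            case Character.CONTROL:               return "Cc";
            default:                              return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(abbreviation('A')); // prints "Lu"
        System.out.println(abbreviation('_')); // prints "Pc"
        System.out.println(abbreviation('$')); // prints "Sc"
    }
}
```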

Constructor Detail

Character

public Character(char value)
Wraps up a character.

Parameters: value the character to wrap

Method Detail

charCount

public static int charCount(int codePoint)
Returns the number of 16-bit characters required to represent the given code point.

Parameters: codePoint a Unicode code point

Returns: 2 if codePoint >= 0x10000, 1 otherwise.

Since: 1.5
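
A typical use of charCount is sizing a UTF-16 buffer before encoding. The sketch below is one way to do that; CharCountDemo and utf16Length are hypothetical names, and the input is assumed to contain only valid code points.

```java
class CharCountDemo {
    // Number of chars needed to hold all the given code points in UTF-16.
    static int utf16Length(int[] codePoints) {
        int units = 0;
        for (int cp : codePoints) {
            units += Character.charCount(cp); // 1 for BMP, 2 for supplementary
        }
        return units;
    }

    public static void main(String[] args) {
        // 'A', EURO SIGN, MUSICAL SYMBOL G CLEF
        int[] sample = { 'A', 0x20AC, 0x1D11E };
        System.out.println(utf16Length(sample)); // prints 4
    }
}
```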

charValue

public char charValue()
Returns the character which has been wrapped by this class.

Returns: the character wrapped

codePointAt

public static int codePointAt(CharSequence sequence, int index)
Get the code point at the specified index in the CharSequence. This is like CharSequence.charAt(int), but if the character at index is a high surrogate, a following character exists, and that following character is a low surrogate completing the pair, then the corresponding supplementary code point is returned. Otherwise, the character at the index is returned.

Parameters: sequence - the CharSequence; index - the index of the code point to get, starting at 0

Returns: the code point at the specified index

Throws: IndexOutOfBoundsException if index is negative or >= length()

Since: 1.5
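
The usual way to walk a CharSequence by code point combines codePointAt with charCount. A minimal sketch follows; CodePointAtDemo and toCodePoints are hypothetical names chosen for this illustration.

```java
class CodePointAtDemo {
    // Collects the code points of text, reading surrogate pairs as one value.
    static int[] toCodePoints(CharSequence text) {
        int[] out = new int[Character.codePointCount(text, 0, text.length())];
        int n = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = Character.codePointAt(text, i);
            out[n++] = cp;
            i += Character.charCount(cp); // skip both halves of a pair
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', MUSICAL SYMBOL G CLEF (a surrogate pair), 'b'
        int[] cps = toCodePoints("a\uD834\uDD1Eb");
        for (int cp : cps) {
            System.out.println(Integer.toHexString(cp)); // prints 61, 1d11e, 62
        }
    }
}
```

An unpaired surrogate is returned as is, so the loop still advances by one char in that case.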

codePointAt

public static int codePointAt(char[] chars, int index)
Get the code point at the specified index in the char array. If the character at index is a high surrogate, a following character exists, and that following character is a low surrogate completing the pair, then the corresponding supplementary code point is returned. Otherwise, the character at the index is returned.

Parameters: chars - the character array in which to look; index - the index of the code point to get, starting at 0

Returns: the code point at the specified index

Throws: IndexOutOfBoundsException if index is negative or >= the length of the array

Since: 1.5

codePointAt

public static int codePointAt(char[] chars, int index, int limit)
Get the code point at the specified index in the char array. If the character at index is a high surrogate, a following character exists within the specified range, and that following character is a low surrogate completing the pair, then the corresponding supplementary code point is returned. Otherwise, the character at the index is returned.

Parameters: chars - the character array in which to look; index - the index of the code point to get, starting at 0; limit - the limit past which characters should not be examined

Returns: the code point at the specified index

Throws: IndexOutOfBoundsException if index is negative or >= limit, or if limit is negative or > the length of the array

Since: 1.5

codePointBefore

public static int codePointBefore(char[] chars, int index)
Get the code point before the specified index. This is like codePointAt(char[], int), but checks the characters at index-1 and index-2 to see if they form a supplementary code point. If they do not, the character at index-1 is returned.

Parameters: chars - the character array; index - the index just past the code point to get, starting at 0

Returns: the code point before the specified index

Throws: IndexOutOfBoundsException if index is less than 1 or > the length of the array

Since: 1.5

codePointBefore

public static int codePointBefore(char[] chars, int index, int start)
Get the code point before the specified index. This is like codePointAt(char[], int), but checks the characters at index-1 and index-2 to see if they form a supplementary code point. If they do not, the character at index-1 is returned. The start parameter is used to limit the range of the array which may be examined.

Parameters: chars - the character array; index - the index just past the code point to get, starting at 0; start - the index before which characters should not be examined

Returns: the code point before the specified index

Throws: IndexOutOfBoundsException if index is not greater than start or is > the length of the array, or if start is negative or >= the length of the array

Since: 1.5

codePointBefore

public static int codePointBefore(CharSequence sequence, int index)
Get the code point before the specified index. This is like codePointAt(CharSequence, int), but checks the characters at index-1 and index-2 to see if they form a supplementary code point. If they do not, the character at index-1 is returned.

Parameters: sequence - the CharSequence; index - the index just past the code point to get, starting at 0

Returns: the code point before the specified index

Throws: IndexOutOfBoundsException if index is less than 1 or > length()

Since: 1.5
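
codePointBefore is the natural building block for walking text backwards without splitting surrogate pairs. The sketch below shows one possible loop; CodePointBeforeDemo and reversedCodePoints are hypothetical names.

```java
class CodePointBeforeDemo {
    // Returns the code points of text in reverse order, keeping pairs intact.
    static int[] reversedCodePoints(CharSequence text) {
        int[] out = new int[Character.codePointCount(text, 0, text.length())];
        int n = 0;
        for (int i = text.length(); i > 0; ) {
            int cp = Character.codePointBefore(text, i);
            out[n++] = cp;
            i -= Character.charCount(cp); // step back over one or two chars
        }
        return out;
    }

    public static void main(String[] args) {
        int[] cps = reversedCodePoints("a\uD834\uDD1Eb");
        for (int cp : cps) {
            System.out.println(Integer.toHexString(cp)); // prints 62, 1d11e, 61
        }
    }
}
```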

codePointCount

public static int codePointCount(CharSequence seq, int beginIndex, int endIndex)
Returns the number of Unicode code points in the specified range of the given CharSequence. The first char in the range is at position beginIndex and the last one is at position endIndex - 1. Paired surrogates (supplementary characters are represented by a pair of chars - one from the high surrogates and one from the low surrogates) count as just one code point.

Parameters: seq - the CharSequence to inspect; beginIndex - the beginning of the range; endIndex - the end of the range

Returns: the number of Unicode code points in the given range of the sequence

Throws: NullPointerException if seq is null; IndexOutOfBoundsException if beginIndex is negative, endIndex is larger than the length of seq, or beginIndex is greater than endIndex.

Since: 1.5

codePointCount

public static int codePointCount(char[] a, int offset, int count)
Returns the number of Unicode code points in the specified range of the given char array. The first char in the range is at position offset and the length of the range is count. Paired surrogates (supplementary characters are represented by a pair of chars - one from the high surrogates and one from the low surrogates) count as just one code point.

Parameters: a - the char array to inspect; offset - the beginning of the range; count - the length of the range

Returns: the number of Unicode code points in the given range of the array

Throws: NullPointerException if a is null; IndexOutOfBoundsException if offset or count is negative, or if offset + count is larger than the length of a.

Since: 1.5
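
The two codePointCount overloads take their range differently: the CharSequence form uses a begin and an end index, while the char array form uses an offset and a length. The short sketch below contrasts them; CodePointCountDemo and its two methods are hypothetical names.

```java
class CodePointCountDemo {
    // Counts code points over a whole sequence (range given as begin/end).
    static int countInSequence(CharSequence s) {
        return Character.codePointCount(s, 0, s.length());
    }

    // Counts code points over a whole array (range given as offset/count).
    static int countInArray(char[] a) {
        return Character.codePointCount(a, 0, a.length);
    }

    public static void main(String[] args) {
        String text = "x\uD834\uDD1E"; // 'x' followed by one surrogate pair
        System.out.println(text.length());                      // prints 3
        System.out.println(countInSequence(text));              // prints 2
        System.out.println(countInArray(text.toCharArray()));   // prints 2
    }
}
```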

compareTo

public int compareTo(Character anotherCharacter)
Compares another Character to this Character, numerically.

Parameters: anotherCharacter Character to compare with this Character

Returns: a negative integer if this Character is less than anotherCharacter, zero if this Character is equal, and a positive integer if this Character is greater

Throws: NullPointerException if anotherCharacter is null

Since: 1.2
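
Because the comparison is numeric, uppercase ASCII letters order before lowercase ones. The following sketch picks the greatest element of an array using compareTo; CompareToDemo and max are hypothetical names, and the array is assumed to be non-empty and free of nulls.

```java
class CompareToDemo {
    // Returns the element with the greatest numeric char value.
    // Assumes values is non-empty and contains no null elements.
    static Character max(Character[] values) {
        Character best = values[0];
        for (Character c : values) {
            if (c.compareTo(best) > 0) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Character[] letters = { 'b', 'Z', 'a' }; // autoboxed chars
        // 'Z' is 0x5A, 'a' is 0x61, 'b' is 0x62
        System.out.println(max(letters)); // prints "b"
    }
}
```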

digit

public static int digit(char ch, int radix)
Converts a character into a digit of the specified radix. If the radix is less than MIN_RADIX or greater than MAX_RADIX, or if the result of getNumericValue(ch) is not less than the radix, or if ch is neither a decimal digit nor in the case-insensitive set of 'a'-'z', the result is -1.
character argument boundary = [Nd]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A

Parameters: ch - character to convert into a digit; radix - radix in which ch is a digit

Returns: digit which ch represents in radix, or -1 if ch is not a valid digit

See Also: MIN_RADIX MAX_RADIX Character Character Character

digit

public static int digit(int codePoint, int radix)
Converts a character into a digit of the specified radix. If the radix is less than MIN_RADIX or greater than MAX_RADIX, or if the result of getNumericValue(codePoint) is not less than the radix, or if codePoint is neither a decimal digit nor in the case-insensitive set of 'a'-'z', the result is -1.
character argument boundary = [Nd]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A

Parameters: codePoint - character to convert into a digit; radix - radix in which codePoint is a digit

Returns: digit which codePoint represents in radix, or -1 if codePoint is not a valid digit

See Also: MIN_RADIX MAX_RADIX Character Character Character
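
As a sketch of how digit is commonly used, the hypothetical parser below accumulates a non-negative number from a string in a given radix and reports -1 for any invalid character. DigitDemo and parse are illustrative names; the sketch does not guard against int overflow, so it suits short inputs only.

```java
class DigitDemo {
    // Parses text as a non-negative number in the given radix.
    // Returns -1 if text is empty or contains a character that is not a
    // valid digit in that radix. Overflow is not checked.
    static int parse(String text, int radix) {
        if (text.length() == 0) {
            return -1;
        }
        int value = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = Character.digit(text.charAt(i), radix);
            if (d < 0) {
                return -1; // invalid digit, or radix out of range
            }
            value = value * radix + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16)); // prints 255
        System.out.println(parse("FF", 16)); // prints 255 (letters are case insensitive)
        System.out.println(parse("19", 8));  // prints -1 ('9' is not an octal digit)
    }
}
```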

equals

public boolean equals(Object o)
Determines if an object is equal to this object. This is only true for another Character object wrapping the same value.

Parameters: o object to compare

Returns: true if o is a Character with the same value

forDigit

public static char forDigit(int digit, int radix)
Converts a digit into a character which represents that digit in a specified radix. If the radix is less than MIN_RADIX or greater than MAX_RADIX, or the digit is negative or not less than the radix, then the null character '\0' is returned. Otherwise the return value is in '0'-'9' and 'a'-'z'.
return value boundary = U+0030-U+0039|U+0061-U+007A

Parameters: digit - digit to be converted into a character; radix - radix of digit

Returns: character representing digit in radix, or '\0'

See Also: MIN_RADIX, MAX_RADIX, digit(char, int)
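
forDigit is the inverse of digit for lowercase output. A minimal sketch of formatting a non-negative int in an arbitrary radix follows; ForDigitDemo and toRadixString are hypothetical names, and Integer.toString(int, int) would normally be used instead.

```java
class ForDigitDemo {
    // Formats a non-negative value in the given radix using forDigit.
    static String toRadixString(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // value % radix is always a valid digit, so forDigit never returns '\0' here
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString(); // digits were produced least significant first
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16)); // prints "ff"
        System.out.println(toRadixString(5, 2));    // prints "101"
    }
}
```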

getDirectionality

public static byte getDirectionality(char ch)
Returns the Unicode directionality property of the character. This is used in the visual ordering of text.

Parameters: ch the character to look up

Returns: the directionality constant, or DIRECTIONALITY_UNDEFINED

Since: 1.4

See Also: DIRECTIONALITY_UNDEFINED DIRECTIONALITY_LEFT_TO_RIGHT DIRECTIONALITY_RIGHT_TO_LEFT DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC DIRECTIONALITY_EUROPEAN_NUMBER DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR DIRECTIONALITY_ARABIC_NUMBER DIRECTIONALITY_COMMON_NUMBER_SEPARATOR DIRECTIONALITY_NONSPACING_MARK DIRECTIONALITY_BOUNDARY_NEUTRAL DIRECTIONALITY_PARAGRAPH_SEPARATOR DIRECTIONALITY_SEGMENT_SEPARATOR DIRECTIONALITY_WHITESPACE DIRECTIONALITY_OTHER_NEUTRALS DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE DIRECTIONALITY_POP_DIRECTIONAL_FORMAT

getDirectionality

public static byte getDirectionality(int codePoint)
Returns the Unicode directionality property of the character. This is used in the visual ordering of text.

Parameters: codePoint the character to look up

Returns: the directionality constant, or DIRECTIONALITY_UNDEFINED

Since: 1.5

See Also: DIRECTIONALITY_UNDEFINED DIRECTIONALITY_LEFT_TO_RIGHT DIRECTIONALITY_RIGHT_TO_LEFT DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC DIRECTIONALITY_EUROPEAN_NUMBER DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR DIRECTIONALITY_ARABIC_NUMBER DIRECTIONALITY_COMMON_NUMBER_SEPARATOR DIRECTIONALITY_NONSPACING_MARK DIRECTIONALITY_BOUNDARY_NEUTRAL DIRECTIONALITY_PARAGRAPH_SEPARATOR DIRECTIONALITY_SEGMENT_SEPARATOR DIRECTIONALITY_WHITESPACE DIRECTIONALITY_OTHER_NEUTRALS DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
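
A minimal sketch of how the directionality constants might be used: it scans a string for characters with a strong right-to-left type. DirectionalityDemo and containsRightToLeft are hypothetical names chosen for this example.

```java
final class DirectionalityDemo {
    // Hypothetical helper: true if any code point in s has a strong
    // right-to-left directionality (bidirectional types "R" or "AL").
    static boolean containsRightToLeft(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            byte dir = Character.getDirectionality(cp);
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsRightToLeft("abc 123"));     // false
        System.out.println(containsRightToLeft("abc \u05D0"));  // true, Hebrew letter alef
    }
}
```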

getNumericValue

public static int getNumericValue(char ch)
Returns the Unicode numeric value property of a character. For example, '\u216C' (the Roman numeral fifty) returns 50.

This method also returns values for the letters A through Z (not specified by Unicode) in these ranges: 'A' through 'Z' (uppercase); 'a' through 'z' (lowercase); and '\uFF21' through '\uFF3A', '\uFF41' through '\uFF5A' (full width variants).

If the character lacks a numeric value property, -1 is returned. If the character has a numeric value property which is not representable as a nonnegative integer, such as a fraction, -2 is returned.
character argument boundary = [Nd]|[Nl]|[No]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A

Parameters: ch character from which the numeric value property will be retrieved

Returns: the numeric value property of ch, or -1 if it does not exist, or -2 if it is not representable as a nonnegative integer

Since: 1.1

See Also: Character Character Character

getNumericValue

public static int getNumericValue(int codePoint)
Returns the Unicode numeric value property of a character. For example, '\u216C' (the Roman numeral fifty) returns 50.

This method also returns values for the letters A through Z (not specified by Unicode) in these ranges: 'A' through 'Z' (uppercase); 'a' through 'z' (lowercase); and '\uFF21' through '\uFF3A', '\uFF41' through '\uFF5A' (full width variants).

If the character lacks a numeric value property, -1 is returned. If the character has a numeric value property which is not representable as a nonnegative integer, such as a fraction, -2 is returned.
character argument boundary = [Nd]|[Nl]|[No]|U+0041-U+005A|U+0061-U+007A|U+FF21-U+FF3A|U+FF41-U+FF5A

Parameters: codePoint character from which the numeric value property will be retrieved

Returns: the numeric value property of codePoint, or -1 if it does not exist, or -2 if it is not representable as a nonnegative integer

Since: 1.5

See Also: Character Character Character
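
The three kinds of return value described above (a nonnegative integer, -1, and -2) can be told apart as in the following sketch. NumericValueDemo and describe are illustrative names only.

```java
final class NumericValueDemo {
    // Hypothetical helper: turns the result of Character.getNumericValue
    // into a short description.
    static String describe(int codePoint) {
        int value = Character.getNumericValue(codePoint);
        if (value == -1) {
            return "no numeric value";
        }
        if (value == -2) {
            return "not a nonnegative integer";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(describe('7'));     // 7
        System.out.println(describe('a'));     // 10
        System.out.println(describe(0x216C));  // 50, Roman numeral fifty
        System.out.println(describe(0x00BD));  // not a nonnegative integer (vulgar fraction one half)
        System.out.println(describe('!'));     // no numeric value
    }
}
```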

getType

public static int getType(char ch)
Returns the Unicode general category property of a character.

Parameters: ch character from which the general category property will be retrieved

Returns: the character category property of ch as an integer

Since: 1.1

See Also: UNASSIGNED UPPERCASE_LETTER LOWERCASE_LETTER TITLECASE_LETTER MODIFIER_LETTER OTHER_LETTER NON_SPACING_MARK ENCLOSING_MARK COMBINING_SPACING_MARK DECIMAL_DIGIT_NUMBER LETTER_NUMBER OTHER_NUMBER SPACE_SEPARATOR LINE_SEPARATOR PARAGRAPH_SEPARATOR CONTROL FORMAT PRIVATE_USE SURROGATE DASH_PUNCTUATION START_PUNCTUATION END_PUNCTUATION CONNECTOR_PUNCTUATION OTHER_PUNCTUATION MATH_SYMBOL CURRENCY_SYMBOL MODIFIER_SYMBOL INITIAL_QUOTE_PUNCTUATION FINAL_QUOTE_PUNCTUATION

getType

public static int getType(int codePoint)
Returns the Unicode general category property of a character.

Parameters: codePoint character from which the general category property will be retrieved

Returns: the character category property of codePoint as an integer

Since: 1.5

See Also: UNASSIGNED UPPERCASE_LETTER LOWERCASE_LETTER TITLECASE_LETTER MODIFIER_LETTER OTHER_LETTER NON_SPACING_MARK ENCLOSING_MARK COMBINING_SPACING_MARK DECIMAL_DIGIT_NUMBER LETTER_NUMBER OTHER_NUMBER SPACE_SEPARATOR LINE_SEPARATOR PARAGRAPH_SEPARATOR CONTROL FORMAT PRIVATE_USE SURROGATE DASH_PUNCTUATION START_PUNCTUATION END_PUNCTUATION CONNECTOR_PUNCTUATION OTHER_PUNCTUATION MATH_SYMBOL CURRENCY_SYMBOL MODIFIER_SYMBOL INITIAL_QUOTE_PUNCTUATION FINAL_QUOTE_PUNCTUATION
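
Because the category constants are compile-time constants, the result of getType can be used directly in a switch. The sketch below maps a handful of categories to their two-letter abbreviations; GetTypeDemo and category are hypothetical names, and the list of cases is deliberately incomplete.

```java
final class GetTypeDemo {
    // Hypothetical helper: abbreviates a few general categories and
    // reports everything else as "other".
    static String category(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:      return "Lu";
            case Character.LOWERCASE_LETTER:      return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER:  return "Nd";
            case Character.SPACE_SEPARATOR:       return "Zs";
            case Character.CURRENCY_SYMBOL:       return "Sc";
            case Character.CONNECTOR_PUNCTUATION: return "Pc";
            case Character.UNASSIGNED:            return "Cn";
            default:                              return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(category('A'));   // Lu
        System.out.println(category('5'));   // Nd
        System.out.println(category('$'));   // Sc
        System.out.println(category('\n'));  // other (control character)
    }
}
```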

hashCode

public int hashCode()
Returns the numerical value (unsigned) of the wrapped character. Range of returned values: 0x0000-0xFFFF.

Returns: the value of the wrapped character

isDefined

public static boolean isDefined(char ch)
Determines if a character is part of the Unicode Standard. This is an evolving standard, but covers every character in the data file.
defined = not [Cn]

Parameters: ch character to test

Returns: true if ch is a Unicode character, else false

See Also: Character Character Character Character Character Character

isDefined

public static boolean isDefined(int codePoint)
Determines if a character is part of the Unicode Standard. This is an evolving standard, but covers every character in the data file.
defined = not [Cn]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode character, else false

Since: 1.5

See Also: Character Character Character Character Character

isDigit

public static boolean isDigit(char ch)
Determines if a character is a Unicode decimal digit. For example, '0' is a digit. A character is a Unicode digit if getType() returns DECIMAL_DIGIT_NUMBER.
Unicode decimal digit = [Nd]

Parameters: ch character to test

Returns: true if ch is a Unicode decimal digit, else false

See Also: Character Character Character

isDigit

public static boolean isDigit(int codePoint)
Determines if a character is a Unicode decimal digit. For example, '0' is a digit. A character is a Unicode digit if getType() returns DECIMAL_DIGIT_NUMBER.
Unicode decimal digit = [Nd]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode decimal digit, else false

Since: 1.5

See Also: Character Character
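
Since [Nd] covers more than the ASCII digits, a parser built on isDigit and digit accepts decimal digits from other scripts as well. The following is only a sketch: DecimalDigitDemo and parseDecimal are invented names, and overflow is not handled.

```java
final class DecimalDigitDemo {
    // Hypothetical helper: parses a non-empty run of [Nd] digits as a
    // non-negative int. Overflow is not checked in this sketch.
    static int parseDecimal(String s) {
        if (s.length() == 0) {
            throw new NumberFormatException("empty input");
        }
        int result = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isDigit(cp)) {
                throw new NumberFormatException("not a decimal digit at index " + i);
            }
            result = result * 10 + Character.digit(cp, 10);
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseDecimal("42"));            // 42
        System.out.println(parseDecimal("\u0664\u0662"));  // 42, Arabic-Indic digits four and two
    }
}
```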

isHighSurrogate

public static boolean isHighSurrogate(char ch)
Returns true if the given character is a high surrogate.

Parameters: ch the character

Returns: true if the character is a high surrogate character

Since: 1.5

isIdentifierIgnorable

public static boolean isIdentifierIgnorable(char ch)
Determines if a character is ignorable in a Unicode identifier. This includes the non-whitespace ISO control characters ('\u0000' through '\u0008', '\u000E' through '\u001B', and '\u007F' through '\u009F'), and FORMAT characters.
Unicode identifier ignorable = [Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: ch character to test

Returns: true if ch is ignorable in a Unicode or Java identifier

Since: 1.1

See Also: Character Character

isIdentifierIgnorable

public static boolean isIdentifierIgnorable(int codePoint)
Determines if a character is ignorable in a Unicode identifier. This includes the non-whitespace ISO control characters ('\u0000' through '\u0008', '\u000E' through '\u001B', and '\u007F' through '\u009F'), and FORMAT characters.
Unicode identifier ignorable = [Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: codePoint character to test

Returns: true if codePoint is ignorable in a Unicode or Java identifier

Since: 1.5

See Also: Character Character
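
One possible use of this predicate is to normalize an identifier by dropping its ignorable characters before comparing it. The sketch below assumes that is the goal; IgnorableDemo and stripIgnorable are hypothetical names.

```java
final class IgnorableDemo {
    // Hypothetical helper: returns s without any code points for which
    // Character.isIdentifierIgnorable is true.
    static String stripIgnorable(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isIdentifierIgnorable(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+007F is a non-whitespace control character; U+00AD (soft hyphen) is a FORMAT character.
        System.out.println(stripIgnorable("a\u007Fb\u00ADc"));  // abc
    }
}
```

Whitespace controls such as the tab character are outside the listed ranges, so they are kept.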

isISOControl

public static boolean isISOControl(char ch)
Determines if a character has the ISO Control property.
ISO Control = [Cc]

Parameters: ch character to test

Returns: true if ch is an ISO Control character, else false

Since: 1.1

See Also: Character Character

isISOControl

public static boolean isISOControl(int codePoint)
Determines if the character is an ISO Control character. This is true if the code point is in the range [0, 0x001F] or if it is in the range [0x007F, 0x009F].

Parameters: codePoint the character to check

Returns: true if the character is in one of the above ranges

Since: 1.5

isJavaIdentifierPart

public static boolean isJavaIdentifierPart(char ch)
Determines if a character can follow the first letter in a Java identifier. This is the combination of isJavaLetter (isLetter, type of LETTER_NUMBER, currency, connecting punctuation) and digit, numeric letter (like Roman numerals), combining marks, non-spacing marks, or isIdentifierIgnorable.
Java identifier extender = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]|[Mn]|[Mc]|[Nd]|[Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: ch character to test

Returns: true if ch can follow the first letter in a Java identifier

Since: 1.1

See Also: Character Character Character Character

isJavaIdentifierPart

public static boolean isJavaIdentifierPart(int codePoint)
Determines if a character can follow the first letter in a Java identifier. This is the combination of isJavaLetter (isLetter, type of LETTER_NUMBER, currency, connecting punctuation) and digit, numeric letter (like Roman numerals), combining marks, non-spacing marks, or isIdentifierIgnorable.
Java identifier extender = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]|[Mn]|[Mc]|[Nd]|[Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: codePoint character to test

Returns: true if codePoint can follow the first letter in a Java identifier

Since: 1.5

See Also: Character Character Character Character

isJavaIdentifierStart

public static boolean isJavaIdentifierStart(char ch)
Determines if a character can start a Java identifier. This is the combination of isLetter, any character where getType returns LETTER_NUMBER, currency symbols (like '$'), and connecting punctuation (like '_').
Java identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]

Parameters: ch character to test

Returns: true if ch can start a Java identifier, else false

Since: 1.1

See Also: Character Character Character

isJavaIdentifierStart

public static boolean isJavaIdentifierStart(int codePoint)
Determines if a character can start a Java identifier. This is the combination of isLetter, any character where getType returns LETTER_NUMBER, currency symbols (like '$'), and connecting punctuation (like '_').
Java identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Sc]|[Pc]

Parameters: codePoint character to test

Returns: true if codePoint can start a Java identifier, else false

Since: 1.5

See Also: Character Character Character
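
Taken together, isJavaIdentifierStart and isJavaIdentifierPart are enough to check the shape of an identifier. The sketch below does only that: it does not reject keywords or literals such as true and null. IdentifierDemo and isIdentifierSyntax are names made up for this example.

```java
final class IdentifierDemo {
    // Hypothetical helper: checks identifier syntax only. Reserved words
    // are not rejected by this sketch.
    static boolean isIdentifierSyntax(String s) {
        if (s.length() == 0) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifierSyntax("foo_1"));  // true
        System.out.println(isIdentifierSyntax("$x"));     // true
        System.out.println(isIdentifierSyntax("1abc"));   // false, a digit cannot start an identifier
        System.out.println(isIdentifierSyntax("a-b"));    // false
    }
}
```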

isJavaLetter

public static boolean isJavaLetter(char ch)

Deprecated: Replaced by isJavaIdentifierStart(char)

Determines if a character can start a Java identifier. This is the combination of isLetter, any character where getType returns LETTER_NUMBER, currency symbols (like '$'), and connecting punctuation (like '_').

Parameters: ch character to test

Returns: true if ch can start a Java identifier, else false

See Also: Character Character Character Character Character Character

isJavaLetterOrDigit

public static boolean isJavaLetterOrDigit(char ch)

Deprecated: Replaced by isJavaIdentifierPart(char)

Determines if a character can follow the first letter in a Java identifier. This is the combination of isJavaLetter (isLetter, type of LETTER_NUMBER, currency, connecting punctuation) and digit, numeric letter (like Roman numerals), combining marks, non-spacing marks, or isIdentifierIgnorable.

Parameters: ch character to test

Returns: true if ch can follow the first letter in a Java identifier

See Also: Character Character Character Character Character Character Character

isLetter

public static boolean isLetter(char ch)
Determines if a character is a Unicode letter. Not all letters have case, so this may return true when isLowerCase and isUpperCase return false. A character is a Unicode letter if getType() returns one of UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER, or OTHER_LETTER.
letter = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]

Parameters: ch character to test

Returns: true if ch is a Unicode letter, else false

See Also: Character Character Character Character Character Character Character Character Character

isLetter

public static boolean isLetter(int codePoint)
Determines if a character is a Unicode letter. Not all letters have case, so this may return true when isLowerCase and isUpperCase return false. A character is a Unicode letter if getType() returns one of UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER, or OTHER_LETTER.
letter = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode letter, else false

Since: 1.5

See Also: Character Character Character Character Character Character Character Character
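
The remark that not all letters have case can be checked directly. In this sketch, LetterDemo and isCaselessLetter are hypothetical names; the Hiragana letter U+3042 serves as an example of an OTHER_LETTER.

```java
final class LetterDemo {
    // Hypothetical helper: true for letters that are neither uppercase,
    // lowercase, nor titlecase.
    static boolean isCaselessLetter(int codePoint) {
        return Character.isLetter(codePoint)
                && !Character.isUpperCase(codePoint)
                && !Character.isLowerCase(codePoint)
                && !Character.isTitleCase(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(isCaselessLetter(0x3042));  // true, Hiragana letter A
        System.out.println(isCaselessLetter('a'));     // false, lowercase
        System.out.println(isCaselessLetter('1'));     // false, not a letter
    }
}
```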

isLetterOrDigit

public static boolean isLetterOrDigit(char ch)
Determines if a character is a Unicode letter or a Unicode digit. This is the combination of isLetter and isDigit.
letter or digit = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nd]

Parameters: ch character to test

Returns: true if ch is a Unicode letter or a Unicode digit, else false

See Also: Character Character Character Character Character Character

isLetterOrDigit

public static boolean isLetterOrDigit(int codePoint)
Determines if a character is a Unicode letter or a Unicode digit. This is the combination of isLetter and isDigit.
letter or digit = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nd]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode letter or a Unicode digit, else false

Since: 1.5

See Also: Character Character Character Character Character

isLowerCase

public static boolean isLowerCase(char ch)
Determines if a character is a Unicode lowercase letter. For example, 'a' is lowercase. Returns true if getType() returns LOWERCASE_LETTER.
lowercase = [Ll]

Parameters: ch character to test

Returns: true if ch is a Unicode lowercase letter, else false

See Also: Character Character Character Character

isLowerCase

public static boolean isLowerCase(int codePoint)
Determines if a character is a Unicode lowercase letter. For example, 'a' is lowercase. Returns true if getType() returns LOWERCASE_LETTER.
lowercase = [Ll]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode lowercase letter, else false

Since: 1.5

See Also: Character Character Character

isLowSurrogate

public static boolean isLowSurrogate(char ch)
Returns true if the given character is a low surrogate.

Parameters: ch the character

Returns: true if the character is a low surrogate character

Since: 1.5

isMirrored

public static boolean isMirrored(char ch)
Determines whether the character is mirrored according to Unicode. For example, '\u0028' (LEFT PARENTHESIS) appears as '(' in left-to-right text, but ')' in right-to-left text.

Parameters: ch the character to look up

Returns: true if the character is mirrored

Since: 1.4

isMirrored

public static boolean isMirrored(int codePoint)
Determines whether the character is mirrored according to Unicode. For example, '\u0028' (LEFT PARENTHESIS) appears as '(' in left-to-right text, but ')' in right-to-left text.

Parameters: codePoint the character to look up

Returns: true if the character is mirrored

Since: 1.5
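
As a small illustration, the sketch below counts the mirrored characters in a string; brackets and parentheses are mirrored, while letters and the equals sign are not. MirroredDemo and countMirrored are invented names.

```java
final class MirroredDemo {
    // Hypothetical helper: counts the chars in s that Unicode marks as mirrored.
    static int countMirrored(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isMirrored(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMirrored("f(x) = [a]"));  // 4
    }
}
```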

isSpace

public static boolean isSpace(char ch)

Deprecated: Replaced by isWhitespace(char)

Determines if a character is an ISO-LATIN-1 space. This is only the five characters '\t', '\n', '\f', '\r', and ' '.
Java space = U+0020|U+0009|U+000A|U+000C|U+000D

Parameters: ch character to test

Returns: true if ch is a space, else false

See Also: Character Character

isSpaceChar

public static boolean isSpaceChar(char ch)
Determines if a character is a Unicode space character. This includes SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR.
Unicode space = [Zs]|[Zp]|[Zl]

Parameters: ch character to test

Returns: true if ch is a Unicode space, else false

Since: 1.1

See Also: Character

isSpaceChar

public static boolean isSpaceChar(int codePoint)
Determines if a character is a Unicode space character. This includes SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR.
Unicode space = [Zs]|[Zp]|[Zl]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode space, else false

Since: 1.5

See Also: Character

isSupplementaryCodePoint

public static boolean isSupplementaryCodePoint(int codePoint)
Determines whether the specified code point is in the range 0x10000 .. 0x10FFFF, i.e. the character is within the Unicode supplementary character range.

Parameters: codePoint a Unicode code point

Returns: true if code point is in supplementary range

Since: 1.5

isSurrogatePair

public static boolean isSurrogatePair(char ch1, char ch2)
Returns true if the given characters compose a surrogate pair. This is true if the first character is a high surrogate and the second character is a low surrogate.

Parameters: ch1 - the first character; ch2 - the second character

Returns: true if the characters compose a surrogate pair

Since: 1.5

isTitleCase

public static boolean isTitleCase(char ch)
Determines if a character is a Unicode titlecase letter. For example, the character '\u01C8' (Latin capital letter L with small letter j) is titlecase. True if getType() returns TITLECASE_LETTER.
titlecase = [Lt]

Parameters: ch character to test

Returns: true if ch is a Unicode titlecase letter, else false

See Also: Character Character Character Character

isTitleCase

public static boolean isTitleCase(int codePoint)
Determines if a character is a Unicode titlecase letter. For example, the character '\u01C8' (Latin capital letter L with small letter j) is titlecase. True if getType() returns TITLECASE_LETTER.
titlecase = [Lt]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode titlecase letter, else false

Since: 1.5

See Also: Character Character Character

isUnicodeIdentifierPart

public static boolean isUnicodeIdentifierPart(char ch)
Determines if a character can follow the first letter in a Unicode identifier. This includes letters, connecting punctuation, digits, numeric letters, combining marks, non-spacing marks, and isIdentifierIgnorable.
Unicode identifier extender = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Mn]|[Mc]|[Nd]|[Pc]|[Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: ch character to test

Returns: true if ch can follow the first letter in a Unicode identifier

Since: 1.1

See Also: Character Character Character Character

isUnicodeIdentifierPart

public static boolean isUnicodeIdentifierPart(int codePoint)
Determines if a character can follow the first letter in a Unicode identifier. This includes letters, connecting punctuation, digits, numeric letters, combining marks, non-spacing marks, and isIdentifierIgnorable.
Unicode identifier extender = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]|[Mn]|[Mc]|[Nd]|[Pc]|[Cf]|U+0000-U+0008|U+000E-U+001B|U+007F-U+009F

Parameters: codePoint character to test

Returns: true if codePoint can follow the first letter in a Unicode identifier

Since: 1.5

See Also: Character Character Character Character

isUnicodeIdentifierStart

public static boolean isUnicodeIdentifierStart(char ch)
Determines if a character can start a Unicode identifier. Only letters can start a Unicode identifier, but this includes characters in LETTER_NUMBER.
Unicode identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]

Parameters: ch character to test

Returns: true if ch can start a Unicode identifier, else false

Since: 1.1

See Also: Character Character Character

isUnicodeIdentifierStart

public static boolean isUnicodeIdentifierStart(int codePoint)
Determines if a character can start a Unicode identifier. Only letters can start a Unicode identifier, but this includes characters in LETTER_NUMBER.
Unicode identifier start = [Lu]|[Ll]|[Lt]|[Lm]|[Lo]|[Nl]

Parameters: codePoint character to test

Returns: true if codePoint can start a Unicode identifier, else false

Since: 1.5

See Also: Character Character Character

isUpperCase

public static boolean isUpperCase(char ch)
Determines if a character is a Unicode uppercase letter. For example, 'A' is uppercase. Returns true if getType() returns UPPERCASE_LETTER.
uppercase = [Lu]

Parameters: ch character to test

Returns: true if ch is a Unicode uppercase letter, else false

See Also: Character Character Character Character

isUpperCase

public static boolean isUpperCase(int codePoint)
Determines if a character is a Unicode uppercase letter. For example, 'A' is uppercase. Returns true if getType() returns UPPERCASE_LETTER.
uppercase = [Lu]

Parameters: codePoint character to test

Returns: true if codePoint is a Unicode uppercase letter, else false

Since: 1.5

See Also: Character Character Character

isValidCodePoint

public static boolean isValidCodePoint(int codePoint)
Determines whether the specified code point is in the range 0x0000 .. 0x10FFFF, i.e. it is a valid Unicode code point.

Parameters: codePoint a Unicode code point

Returns: true if code point is valid

Since: 1.5

isWhitespace

public static boolean isWhitespace(char ch)
Determines if a character is Java whitespace. This includes Unicode space characters (SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR) except the non-breaking spaces ('\u00A0', '\u2007', and '\u202F'); and these characters: '\u0009', '\u000A', '\u000B', '\u000C', '\u000D', '\u001C', '\u001D', '\u001E', and '\u001F'.
Java whitespace = ([Zs] not Nb)|[Zl]|[Zp]|U+0009-U+000D|U+001C-U+001F

Parameters: ch character to test

Returns: true if ch is Java whitespace, else false

Since: 1.1

See Also: Character

isWhitespace

public static boolean isWhitespace(int codePoint)
Determines if a character is Java whitespace. This includes Unicode space characters (SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR) except the non-breaking spaces ('\u00A0', '\u2007', and '\u202F'); and these characters: '\u0009', '\u000A', '\u000B', '\u000C', '\u000D', '\u001C', '\u001D', '\u001E', and '\u001F'.
Java whitespace = ([Zs] not Nb)|[Zl]|[Zp]|U+0009-U+000D|U+001C-U+001F

Parameters: codePoint character to test

Returns: true if codePoint is Java whitespace, else false

Since: 1.5

See Also: Character
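
The difference between isSpaceChar and isWhitespace is easiest to see side by side. In the sketch below, WhitespaceDemo and classify are hypothetical names; code points are written as hexadecimal integers.

```java
final class WhitespaceDemo {
    // Hypothetical helper: reports which of the two predicates accept the code point.
    static String classify(int codePoint) {
        boolean spaceChar = Character.isSpaceChar(codePoint);
        boolean whitespace = Character.isWhitespace(codePoint);
        if (spaceChar && whitespace) {
            return "both";
        }
        if (spaceChar) {
            return "space char only";
        }
        if (whitespace) {
            return "whitespace only";
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));     // both
        System.out.println(classify(0x00A0));  // space char only (no-break space)
        System.out.println(classify('\t'));    // whitespace only (control character)
        System.out.println(classify('x'));     // neither
    }
}
```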

offsetByCodePoints

public static int offsetByCodePoints(CharSequence seq, int index, int codePointOffset)
Returns the index into the given CharSequence that is offset codePointOffset code points from index.

Parameters: seq - the CharSequence; index - the start position in the CharSequence; codePointOffset - the number of code points offset from the start position

Returns: the index into the CharSequence that is codePointOffset code points offset from index

Throws: NullPointerException if seq is null; IndexOutOfBoundsException if index is negative or greater than the length of the sequence; IndexOutOfBoundsException if codePointOffset is positive and the subsequence from index to the end of seq has fewer than codePointOffset code points; IndexOutOfBoundsException if codePointOffset is negative and the subsequence from the start of seq to index has fewer than (-codePointOffset) code points

Since: 1.5

offsetByCodePoints

public static int offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset)
Returns the index into the given char subarray that is offset codePointOffset code points from index.

Parameters: a - the char array; start - the start index of the subarray; count - the length of the subarray; index - the index to be offset; codePointOffset - the number of code points offset from index

Returns: the index into the char array

Throws: NullPointerException if a is null; IndexOutOfBoundsException if start or count is negative or if start + count is greater than the length of the array; IndexOutOfBoundsException if index is less than start or larger than start + count; IndexOutOfBoundsException if codePointOffset is positive and the subarray from index to start + count - 1 has fewer than codePointOffset code points; IndexOutOfBoundsException if codePointOffset is negative and the subarray from start to index - 1 has fewer than (-codePointOffset) code points

Since: 1.5
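
A typical use of offsetByCodePoints is to translate positions counted in code points into char indices. The sketch below takes a substring by code point position; CodePointOffsetDemo and codePointSubstring are names assumed for this example.

```java
final class CodePointOffsetDemo {
    // Hypothetical helper: substring addressed by code point positions
    // (cpBegin inclusive, cpEnd exclusive) rather than char indices.
    static String codePointSubstring(String s, int cpBegin, int cpEnd) {
        int begin = Character.offsetByCodePoints(s, 0, cpBegin);
        int end = Character.offsetByCodePoints(s, begin, cpEnd - cpBegin);
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        // 'a', U+1D11E (encoded as a surrogate pair), 'b': four chars, three code points.
        String s = "a\uD834\uDD1Eb";
        System.out.println(Character.offsetByCodePoints(s, 0, 2));   // 3
        System.out.println(Character.offsetByCodePoints(s, 3, -1));  // 1
        System.out.println(codePointSubstring(s, 2, 3));             // b
    }
}
```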

reverseBytes

public static char reverseBytes(char val)
Reverses the order of the two bytes in val.

Since: 1.5
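
For illustration, the sketch below compares reverseBytes with a hand-written byte swap. ReverseBytesDemo and swapManually are hypothetical names used only here.

```java
final class ReverseBytesDemo {
    // Hypothetical cross-check: swaps the high and low byte by hand.
    static char swapManually(char c) {
        return (char) (((c & 0xFF) << 8) | ((c >> 8) & 0xFF));
    }

    public static void main(String[] args) {
        char value = (char) 0x1234;
        System.out.println(Integer.toHexString(Character.reverseBytes(value)));  // 3412
        System.out.println(Character.reverseBytes(value) == swapManually(value)); // true
    }
}
```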

toChars

public static char[] toChars(int codePoint)
Converts a Unicode code point to a UTF-16 representation of that code point.

Parameters: codePoint the Unicode code point

Returns: the UTF-16 representation of that code point

Throws: IllegalArgumentException if the code point is not a valid Unicode code point

Since: 1.5

toChars

public static int toChars(int codePoint, char[] dst, int dstIndex)
Converts a Unicode code point to its UTF-16 representation.

Parameters: codePoint - the Unicode code point; dst - the target char array; dstIndex - the start index for the target

Returns: number of characters written to dst

Throws: IllegalArgumentException if codePoint is not a valid Unicode code point; NullPointerException if dst is null; IndexOutOfBoundsException if dstIndex is not valid in dst or if the UTF-16 representation does not fit into dst

Since: 1.5
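
The three-argument form is convenient when several code points are written into one array, because its return value advances the write index. The following sketch assumes an int array of valid code points as input; ToCharsDemo and encode are invented names.

```java
final class ToCharsDemo {
    // Hypothetical helper: encodes an array of code points as UTF-16.
    // Character.toChars throws IllegalArgumentException for invalid code points.
    static char[] encode(int[] codePoints) {
        int length = 0;
        for (int cp : codePoints) {
            length += Character.charCount(cp);
        }
        char[] dst = new char[length];
        int index = 0;
        for (int cp : codePoints) {
            // The return value is the number of chars written (1 or 2).
            index += Character.toChars(cp, dst, index);
        }
        return dst;
    }

    public static void main(String[] args) {
        char[] units = encode(new int[] { 'a', 0x1D11E, 'b' });
        System.out.println(units.length);                   // 4
        System.out.println(Character.toChars('a').length);  // 1
    }
}
```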

toCodePoint

public static int toCodePoint(char high, char low)
Given a valid surrogate pair, this returns the corresponding code point.

Parameters: high - the high character of the pair; low - the low character of the pair

Returns: the corresponding code point

Since: 1.5
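
The surrogate methods are usually combined: test for a pair first, then decode it. The sketch below decodes the first code point of a char array that way; SurrogateDemo and decodeFirst are hypothetical names, and a non-empty array is assumed.

```java
final class SurrogateDemo {
    // Hypothetical helper: returns the code point starting at units[0].
    // Assumes units is non-empty; an unpaired surrogate is returned as-is.
    static int decodeFirst(char[] units) {
        if (units.length >= 2 && Character.isSurrogatePair(units[0], units[1])) {
            return Character.toCodePoint(units[0], units[1]);
        }
        return units[0];
    }

    public static void main(String[] args) {
        char[] clef = Character.toChars(0x1D11E);  // supplementary code point
        System.out.println(Integer.toHexString(clef[0]));              // d834
        System.out.println(Integer.toHexString(clef[1]));              // dd1e
        System.out.println(Integer.toHexString(decodeFirst(clef)));    // 1d11e
        System.out.println(Character.isSupplementaryCodePoint('a'));   // false
    }
}
```

Checking isSurrogatePair before calling toCodePoint matters because toCodePoint is only documented for a valid surrogate pair.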

toLowerCase

public static char toLowerCase(char ch)
Converts a Unicode character into its lowercase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isLowerCase(toLowerCase(ch)) does not always return true.

Parameters: ch character to convert to lowercase

Returns: lowercase mapping of ch, or ch if lowercase mapping does not exist

See Also: Character Character Character Character

toLowerCase

public static int toLowerCase(int codePoint)
Converts a Unicode character into its lowercase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isLowerCase(toLowerCase(codePoint)) does not always return true.

Parameters: codePoint character to convert to lowercase

Returns: lowercase mapping of codePoint, or codePoint if lowercase mapping does not exist

Since: 1.5

See Also: Character Character Character

toString

public String toString()
Converts the wrapped character into a String.

Returns: a String containing one character -- the wrapped character of this instance

toString

public static String toString(char ch)
Returns a String of length 1 representing the specified character.

Parameters: ch the character to convert

Returns: a String containing the character

Since: 1.4

toTitleCase

public static char toTitleCase(char ch)
Converts a Unicode character into its titlecase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isTitleCase(toTitleCase(ch)) does not always return true.

Parameters: ch character to convert to titlecase

Returns: titlecase mapping of ch, or ch if titlecase mapping does not exist

See Also: Character Character Character

toTitleCase

public static int toTitleCase(int codePoint)
Converts a Unicode character into its titlecase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isTitleCase(toTitleCase(codePoint)) does not always return true.

Parameters: codePoint character to convert to titlecase

Returns: titlecase mapping of codePoint, or codePoint if titlecase mapping does not exist

Since: 1.5

See Also: Character Character
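
The note about isTitleCase(toTitleCase(ch)) can be seen with ordinary letters: the titlecase mapping of 'a' is 'A', which is an uppercase letter rather than a titlecase one. The sketch below also capitalizes a word with toTitleCase; TitleCaseDemo and capitalize are names assumed for this example.

```java
final class TitleCaseDemo {
    // Hypothetical helper: maps the first code point of s to titlecase
    // and leaves the rest unchanged.
    static String capitalize(String s) {
        if (s.length() == 0) {
            return s;
        }
        int first = s.codePointAt(0);
        StringBuilder sb = new StringBuilder(s.length());
        sb.appendCodePoint(Character.toTitleCase(first));
        sb.append(s.substring(Character.charCount(first)));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello"));                                  // Hello
        System.out.println(Character.isTitleCase(Character.toTitleCase('a')));    // false, 'A' is uppercase
        System.out.println(Character.isTitleCase(Character.toTitleCase(0x01C9))); // true, maps to U+01C8
    }
}
```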

toUpperCase

public static char toUpperCase(char ch)
Converts a Unicode character into its uppercase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isUpperCase(toUpperCase(ch)) does not always return true.

Parameters: ch character to convert to uppercase

Returns: uppercase mapping of ch, or ch if uppercase mapping does not exist

See Also: Character Character Character Character

toUpperCase

public static int toUpperCase(int codePoint)
Converts a Unicode character into its uppercase equivalent mapping. If a mapping does not exist, then the character passed is returned. Note that isUpperCase(toUpperCase(codePoint)) does not always return true.

Parameters: codePoint character to convert to uppercase

Returns: uppercase mapping of codePoint, or codePoint if uppercase mapping does not exist

Since: 1.5

See Also: Character Character Character
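
The caveat that the mapped character need not pass the matching predicate follows from the fallback rule: a character without a mapping is returned unchanged. A minimal sketch, with CaseMappingDemo and its two helpers as hypothetical names:

```java
final class CaseMappingDemo {
    // Hypothetical helper: does the uppercase mapping yield an uppercase letter?
    static boolean upperCaseRoundTrips(int codePoint) {
        return Character.isUpperCase(Character.toUpperCase(codePoint));
    }

    // Hypothetical helper: does the lowercase mapping yield a lowercase letter?
    static boolean lowerCaseRoundTrips(int codePoint) {
        return Character.isLowerCase(Character.toLowerCase(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(upperCaseRoundTrips('a'));     // true
        System.out.println(upperCaseRoundTrips('1'));     // false, '1' has no mapping and is not a letter
        System.out.println(upperCaseRoundTrips(0x00DF));  // false, sharp s has no single-character uppercase mapping
        System.out.println(lowerCaseRoundTrips('!'));     // false
    }
}
```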

valueOf

public static Character valueOf(char val)
Returns a Character object wrapping the value. In contrast to the Character constructor, this method will cache some values. It is used by boxing conversion.

Parameters: val the value to wrap

Returns: the Character

Since: 1.5
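
Because only some values are cached, it is safer to compare wrapped characters with equals than with ==. The sketch below shows that usage; ValueOfDemo and sameWrappedValue are hypothetical names, and no assumption is made about which values a given implementation caches.

```java
final class ValueOfDemo {
    // Hypothetical helper: compares two boxed characters by value.
    // Reference comparison (==) would depend on the implementation's cache.
    static boolean sameWrappedValue(char a, char b) {
        return Character.valueOf(a).equals(Character.valueOf(b));
    }

    public static void main(String[] args) {
        Character boxed = Character.valueOf('x');
        System.out.println(sameWrappedValue('x', 'x'));  // true
        System.out.println(sameWrappedValue('x', 'y'));  // false
        System.out.println(boxed.hashCode() == 'x');     // true, hashCode is the char value
    }
}
```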