Definition and Characteristics

Definition

CUBRID supports the following four types of character strings: CHAR, VARCHAR, NCHAR, and NCHAR VARYING.

The following rules apply when using the character string types.

Characteristics

Length

For a CHAR or VARCHAR type, specify the length of the character string in bytes. For a NCHAR or NCHAR VARYING type, specify the length as a number of characters.
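The difference between a length in bytes and a length in characters matters for multi-byte character sets. The following is a minimal Python sketch, not CUBRID code: the helper `lengths` is hypothetical, and Python's `latin-1` and `euc_kr` codecs are assumed to stand in for the two character sets named in this section.

```python
# Compare the character count of a string with its encoded byte length.
# `lengths` is a hypothetical helper for illustration; it is not part of CUBRID.

def lengths(text, codec):
    """Return (number of characters, number of bytes) for text in codec."""
    return len(text), len(text.encode(codec))

latin = lengths("cafe", "latin-1")   # one byte per character
korean = lengths("한글", "euc_kr")    # two bytes per Hangul syllable

print(latin)   # (4, 4)
print(korean)  # (2, 4)
```

Under these assumptions, a two-character Korean string needs four bytes, so a byte-based length and a character-based length for the same data are not interchangeable.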

When an entered character string is longer than the specified length, the characters beyond that length are truncated.

For a fixed-length character string type such as CHAR or NCHAR, the length is fixed at the declared length. Therefore, when the string is stored, the remaining space on the right is filled with space characters (trailing spaces). For a variable-length character string type such as VARCHAR or NCHAR VARYING, only the entered character string is stored, and the remaining space is not padded.
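The truncation and padding rules above can be modeled in a few lines of Python. This is only a sketch of the described behavior, not CUBRID code: `store_fixed` and `store_varying` are hypothetical names, and lengths are counted in characters, which matches a byte count only for single-byte characters.

```python
# Model of how a value is stored in fixed-length and variable-length types.
# Both functions are hypothetical illustrations of the rules in the text.

def store_fixed(value, length):
    """Fixed-length model: truncate to length, then pad with trailing spaces."""
    return value[:length].ljust(length)

def store_varying(value, length):
    """Variable-length model: truncate to length, with no padding."""
    return value[:length]

fixed = store_fixed("abc", 5)          # padded to five characters
varying = store_varying("abc", 5)      # stored as entered
too_long = store_fixed("abcdefgh", 5)  # excess characters are dropped

print(repr(fixed))     # 'abc  '
print(repr(varying))   # 'abc'
print(repr(too_long))  # 'abcde'
```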

The maximum length that can be specified for a CHAR or VARCHAR type is 1,073,741,823. For a NCHAR or NCHAR VARYING type, it is 536,870,911. The maximum length that can be input or output in a CSQL statement is 8,192 KB.
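As a quick arithmetic check, both limits are one less than a power of two. The Python sketch below only verifies the numbers quoted above. The constant names and the helper `is_valid_length` are hypothetical, and the conversion of 8,192 KB assumes 1 KB = 1,024 bytes, which the text does not state.

```python
# Arithmetic check of the quoted limits. Names are illustrative only.

CHAR_MAX = 1_073_741_823    # CHAR / VARCHAR limit quoted in the text
NCHAR_MAX = 536_870_911     # NCHAR / NCHAR VARYING limit quoted in the text
CSQL_MAX_BYTES = 8_192 * 1_024  # assumption: 1 KB = 1,024 bytes

def is_valid_length(declared, national=False):
    """Return True if a declared length is within the quoted limit."""
    limit = NCHAR_MAX if national else CHAR_MAX
    return 1 <= declared <= limit

print(CHAR_MAX == 2**30 - 1)    # True
print(NCHAR_MAX == 2**29 - 1)   # True
print(CSQL_MAX_BYTES)           # 8388608
```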

Character Set, charset

A character set (charset) is a set of rules that defines which codes are used to encode given characters (symbols) when they are stored in a computer.

CUBRID supports the following character sets, which you can specify with the CUBRID_LANG environment variable. You can store data in other character sets (e.g. UTF-8), but string functions and LIKE searches are not supported for them.

Character Set              CUBRID_LANG
8-bit ISO 8859-1 Latin     en_US
KSC 5601-1992 (EUC-KR)     ko_KR.euckr

Any characters from the above character sets can be included in a character string (the NULL character is represented as '\0').

Collating Character Sets

A collation is a set of rules used for comparing characters to search or sort values stored in the database when a certain character set is specified. Therefore, such rules are applied only to character string data types such as CHAR or VARCHAR. For a national character string type such as NCHAR() or NCHAR VARYING(), the sorting rules are determined according to the encoding algorithm of the specified character set.
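One way to picture sorting that follows the encoding is to order strings by the byte values of their encoded form. The Python sketch below illustrates that idea only. It assumes that "determined according to the encoding algorithm" means code-value order, which the text does not spell out, and `sort_by_code_value` is a hypothetical helper rather than a CUBRID function.

```python
# Sort strings by the byte values of their encoded form (code-value order).
# This models one reading of the text; it is not CUBRID's implementation.

def sort_by_code_value(strings, codec):
    """Return strings sorted by their bytes in the given codec."""
    return sorted(strings, key=lambda s: s.encode(codec))

result = sort_by_code_value(["b", "Z", "é", "a"], "latin-1")
print(result)  # ['Z', 'a', 'b', 'é']
```

In this model the uppercase "Z" (0x5A) sorts before the lowercase "a" (0x61), and "é" (0xE9) sorts last, which differs from a dictionary-style ordering.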

Character String Coercion

Automatic coercion takes place between fixed-length and variable-length character strings when two strings are compared, and it applies only to strings that belong to the same character set. For example, when you extract a column value of the CHAR(5) data type and insert it into a column of the CHAR(10) data type, the value is automatically coerced to CHAR(10). To coerce a character string explicitly, use the CAST operator (see CAST Operator).
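The CHAR(5) to CHAR(10) example can be modeled as re-padding the value to the new declared length. This Python sketch is an illustration under that assumption, not CUBRID code: `coerce_char` is a hypothetical helper, and lengths are counted in characters for simplicity.

```python
# Model of coercing a fixed-length value to a longer fixed-length type.
# `coerce_char` is a hypothetical helper for illustration.

def coerce_char(value, target_length):
    """Truncate or pad value so that it is exactly target_length long."""
    return value[:target_length].ljust(target_length)

source = "ab   "                   # a CHAR(5) value: 'ab' plus three spaces
coerced = coerce_char(source, 10)  # the same value held as CHAR(10)

print(len(source))   # 5
print(len(coerced))  # 10
```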