8.3. Character Types

PostgreSQL 17.5 Documentation

Table 8.4. Character Types

Name	Description
`character varying(n)`, `varchar(n)`	variable-length with limit
`character(n)`, `char(n)`, `bpchar(n)`	fixed-length, blank-padded
`bpchar`	variable unlimited length, blank-trimmed
`text`	variable unlimited length

Table 8.4 shows the general-purpose character types available in PostgreSQL.

SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types results in an error, unless the excess characters are all spaces, in which case the string is truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) However, if a value is explicitly cast to character varying(n) or character(n), an over-length value is truncated to n characters without raising an error. (This too is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character are space-padded; values of type character varying simply store the shorter string.

In addition, PostgreSQL provides the text type, which stores strings of any length. Although the text type is not in the SQL standard, several other SQL database management systems have it as well. text is PostgreSQL's native string data type, in that most built-in functions operating on strings are declared to take or return text, not character varying. For many purposes, character varying acts as though it were a domain over text.
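As a small illustration of text being the native string type, the following sketch passes a character varying value to the built-in upper function and inspects the types with pg_typeof. The literal 'abc' and the column aliases are arbitrary choices for this example.

```sql
-- upper() is declared to take and return text, so the varchar argument
-- is coerced implicitly and the result comes back as text.
SELECT pg_typeof('abc'::varchar(10))        AS input_type,
       pg_typeof(upper('abc'::varchar(10))) AS result_type;

--     input_type     | result_type
-- -------------------+-------------
--  character varying | text
```

If the result must be character varying again, it has to be cast back explicitly.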

The type name varchar is an alias for character varying, while bpchar (with length specifier) and char are aliases for character. The varchar and char aliases are defined in the SQL standard; bpchar is a PostgreSQL extension.

If specified, the length n must be greater than zero and cannot exceed 10,485,760. If character varying (or varchar) is used without a length specifier, the type accepts strings of any length. If bpchar lacks a length specifier, it also accepts strings of any length, but trailing spaces are semantically insignificant. If character (or char) lacks a specifier, it is equivalent to character(1).
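The rules for a missing length specifier can be checked with a few casts. This is a minimal sketch; the literals and the table name too_wide are hypothetical and chosen only for illustration.

```sql
-- character without a length means character(1); the explicit cast truncates.
SELECT 'abc'::character AS c;                     -- a

-- character varying without a length accepts a string of any length.
SELECT 'abc'::character varying AS v;             -- abc

-- bpchar without a length keeps the whole string, but trailing spaces
-- are insignificant when two values are compared.
SELECT 'abc   '::bpchar = 'abc'::bpchar AS same;  -- t

-- A declared length above 10,485,760 is rejected with an error.
CREATE TABLE too_wide (s varchar(10485761));
```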

Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, trailing spaces are treated as semantically insignificant and are disregarded when comparing two values of type character. In collations where whitespace is significant, this behavior can produce unexpected results; for example, SELECT 'a '::CHAR(2) collate "C" < E'a\n'::CHAR(2) returns true, even though the C locale would consider a space to be greater than a newline. Trailing spaces are removed when converting a character value to one of the other string types. Note that trailing spaces are semantically significant in character varying and text values, and when using pattern matching, that is, LIKE and regular expressions.
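The difference in how trailing spaces are treated can be seen by comparing the same strings under each type. The following is a sketch with arbitrary literals, not an exhaustive test of every operator.

```sql
-- Comparisons between character values ignore the padding;
-- character varying and text treat the trailing space as significant.
SELECT 'a'::char(2) = 'a  '::char(5)      AS char_equal,     -- t
       'a '::varchar(2) = 'a'::varchar(2) AS varchar_equal,  -- f
       'a '::text = 'a'::text             AS text_equal;     -- f

-- Converting character to text removes the trailing spaces.
SELECT length('ab'::char(4)::text) AS len_after_cast;        -- 2

-- Pattern matching on text sees the trailing space.
SELECT 'a '::text LIKE 'a'  AS exact_match,                  -- f
       'a '::text LIKE 'a%' AS prefix_match;                 -- t
```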

The characters that can be stored in any of these data types are determined by the database character set, which is selected when the database is created. Regardless of the specific character set, the character with code zero (sometimes called NUL) cannot be stored. For more information refer to Section 23.3.
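To see which character set a database uses and to confirm that code zero is rejected, something like the following can be run in a session. The value reported by SHOW depends on how the database was created, so no output is shown for it.

```sql
-- Character set (encoding) selected when the database was created.
SHOW server_encoding;

-- Code zero cannot be turned into a character value; this raises an error.
SELECT chr(0);
```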

The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, which includes the space padding in the case of character. Longer strings have 4 bytes of overhead instead of 1. Long strings are compressed by the system automatically, so the physical requirement on disk might be less. Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values. In any case, the longest possible character string that can be stored is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than that. It wouldn't be useful to change this because with multibyte character encodings the number of characters and bytes can be quite different. If you desire to store long strings with no specific upper limit, use text or character varying without a length specifier, rather than making up an arbitrary length limit.)
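One way to observe the storage cost of padding is to store the same short string in columns of each type and ask pg_column_size how many bytes each stored value occupies. This is a sketch only: the table size_demo and its column names are hypothetical, and the exact numbers are left for the reader to check on their own system.

```sql
-- Hypothetical scratch table holding the same string in three column types.
CREATE TEMP TABLE size_demo (
    c character(10),
    v character varying(10),
    t text
);

INSERT INTO size_demo VALUES ('ok', 'ok', 'ok');

-- The character(10) value includes its space padding, so c_bytes is
-- expected to be larger than v_bytes and t_bytes.
SELECT pg_column_size(c) AS c_bytes,
       pg_column_size(v) AS v_bytes,
       pg_column_size(t) AS t_bytes
FROM size_demo;
```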

Tip

There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead.

Refer to Section 4.1.2.1 for information about the syntax of string literals, and to Chapter 9 for information about available operators and functions.

Example 8.1. Using the Character Types

CREATE TABLE test1 (a character(4));
INSERT INTO test1 VALUES ('ok');
SELECT a, char_length(a) FROM test1; -- (1)

  a   | char_length
------+-------------
 ok   |           2


CREATE TABLE test2 (b varchar(5));
INSERT INTO test2 VALUES ('ok');
INSERT INTO test2 VALUES ('good      ');
INSERT INTO test2 VALUES ('too long');
ERROR:  value too long for type character varying(5)

INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
SELECT b, char_length(b) FROM test2;

   b   | char_length
-------+-------------
 ok    |           2
 good  |           5
 too l |           5

(1)	The `char_length` function is discussed in Section 9.4.

There are two other fixed-length character types in PostgreSQL, shown in Table 8.5. These are not intended for general-purpose use, only for use in the internal system catalogs. The name type is used to store identifiers. Its length is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced using the constant NAMEDATALEN in C source code. The length is set at compile time (and is therefore adjustable for special uses); the default maximum length might change in a future release. The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage, and therefore can store only a single ASCII character. It is used in the system catalogs as a simplistic enumeration type.

Table 8.5. Special Character Types

Name	Storage Size	Description
`"char"`	1 byte	single-byte internal type
`name`	64 bytes	internal type for object names
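Both special types can be seen in the system catalogs. As an illustrative sketch, the query below inspects two columns of pg_class, and the SHOW command reports the usable identifier length on a server built with the default NAMEDATALEN.

```sql
-- relname is stored as name, relkind as the one-byte "char" type.
SELECT pg_typeof(relname) AS relname_type,
       pg_typeof(relkind) AS relkind_type
FROM pg_class
LIMIT 1;

--  relname_type | relkind_type
-- --------------+--------------
--  name         | "char"

-- Usable identifier length: NAMEDATALEN minus the terminator.
SHOW max_identifier_length;  -- 63 on a default build
```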
