14.1. EXPLAINの利用

PostgreSQL 18.0文書
		第14章性能に関するヒント	誤訳等の報告
前へ	上へ	14.1. `EXPLAIN`の利用	次へ

14.1. `EXPLAIN`の利用 #

<title>Using <command>EXPLAIN</command></title>

<productname>PostgreSQL</productname> devises a <firstterm>query plan</firstterm> for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex <firstterm>planner</firstterm> that tries to choose good plans. You can use the <link linkend="sql-explain"><command>EXPLAIN</command></link> command to see what query plan the planner creates for any query. Plan-reading is an art that requires some experience to master, but this section attempts to cover the basics. PostgreSQLは受理した問い合わせから問い合わせ計画を作り出します。問い合わせの構造と含まれるデータの性質に適した正しい問い合わせ計画を選択することが、良い性能を得るために非常に重要になります。ですので、システムには優れた計画の選択を試みる複雑なプランナが存在します。 EXPLAINコマンドを使えば、任意の問い合わせに対してプランナがどのような問い合わせ計画を作ったのかわかります。問い合わせ計画を読みこなすには、ある程度の経験が必要です。本節ではその基本を提供しようと考えます。

Examples in this section are drawn from the regression test database after doing a <command>VACUUM ANALYZE</command>, using v18 development sources. You should be able to get similar results if you try the examples yourself, but your estimated costs and row counts might vary slightly because <command>ANALYZE</command>'s statistics are random samples rather than exact, and because costs are inherently somewhat platform-dependent. 本節の例は、v18の開発版ソースを用いてVACUUM ANALYZEを実行した後でリグレッションテストデータベースから取り出したものです。実際にこの例を試すと、似たような結果になるはずですが、おそらく推定コストや行数は多少異なることになるでしょう。 ANALYZEによる統計情報は厳密なものではなくランダムなサンプリングを行った結果であり、また、コストは本質的にプラットフォームに何かしら依存するためです。

The examples use <command>EXPLAIN</command>'s default <quote>text</quote> output format, which is compact and convenient for humans to read. If you want to feed <command>EXPLAIN</command>'s output to a program for further analysis, you should use one of its machine-readable output formats (XML, JSON, or YAML) instead. 例では、簡潔で人が読みやすいEXPLAINのデフォルトの「text」出力書式を使用します。今後の解析でEXPLAINの出力をプログラムに渡すことを考えているのであれば、代わりに機械読み取りが容易な出力書式（XML、JSON、YAML）のいずれかを使用する必要があります。

14.1.1. `EXPLAIN`の基本 #

<title><command>EXPLAIN</command> Basics</title>

The structure of a query plan is a tree of <firstterm>plan nodes</firstterm>. Nodes at the bottom level of the tree are scan nodes: they return raw rows from a table. There are different types of scan nodes for different table access methods: sequential scans, index scans, and bitmap index scans. There are also non-table row sources, such as <literal>VALUES</literal> clauses and set-returning functions in <literal>FROM</literal>, which have their own scan node types. If the query requires joining, aggregation, sorting, or other operations on the raw rows, then there will be additional nodes above the scan nodes to perform these operations. Again, there is usually more than one possible way to do these operations, so different node types can appear here too. The output of <command>EXPLAIN</command> has one line for each node in the plan tree, showing the basic node type plus the cost estimates that the planner made for the execution of that plan node. Additional lines might appear, indented from the node's summary line, to show additional properties of the node. The very first line (the summary line for the topmost node) has the estimated total execution cost for the plan; it is this number that the planner seeks to minimize. 問い合わせ計画は計画ノードのツリー構造です。ツリー構造の最下層ノードはスキャンノードで、テーブルから行そのものを返します。シーケンシャルスキャン、インデックススキャン、ビットマップインデックススキャンといったテーブルアクセスメソッドの違いに応じ、スキャンノードの種類に違いがあります。また、VALUES句やFROM内の集合を返す関数など独自のスキャンノード種類を持つ、テーブル行を元にしないものがあります。問い合わせが結合、集約、ソートなど、行そのものに対する操作を必要としている場合、スキャンノードの上位に更に、これらの操作を行うためのノードが追加されます。これらの操作の実現方法にも通常複数の方法がありますので、異なった種類のノードがここに出現することもあり得ます。 EXPLAINには計画ツリー内の各ノードにつき1行の出力があり、基本ノード種類とプランナが生成したその計画ノードの実行に要するコスト推定値を示します。さらに、ノードの追加属性を表示するためにノードの要約行からインデント付けされた行が出力される可能性があります。最初の1行目（最上位ノード）には、計画全体の実行コスト推定値が含まれます。プランナはこの値が最小になるように動作します。

Here is a trivial example, just to show what the output looks like: どのような出力となるのかを示すためだけに、ここで簡単な例を示します。

EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..445.00 rows=10000 width=244)

Since this query has no <literal>WHERE</literal> clause, it must scan all the rows of the table, so the planner has chosen to use a simple sequential scan plan. The numbers that are quoted in parentheses are (left to right): この問い合わせにはWHERE句がありませんので、テーブル行をすべてスキャンしなければなりません。このためプランナは単純なシーケンシャルスキャン計画を使用することを選びました。（左から右に）括弧で囲まれた数値には以下のものがあります。

Estimated start-up cost. This is the time expended before the output phase can begin, e.g., time to do the sorting in a sort node. 初期処理の推定コスト。出力段階が開始できるようになる前に消費される時間、例えば、SORTノードで実行されるソート処理の時間です。
Estimated total cost. This is stated on the assumption that the plan node is run to completion, i.e., all available rows are retrieved. In practice a node's parent node might stop short of reading all available rows (see the <literal>LIMIT</literal> example below). 全体推定コスト。これは計画ノードが実行完了である、つまりすべての利用可能な行を受け取ることを前提として示されます。実際には、ノードの親ノードはすべての利用可能な行を読む前に停止する可能性があります（以下のLIMITの例を参照）。
Estimated number of rows output by this plan node. Again, the node is assumed to be run to completion. この計画ノードが出力する行の推定数。ここでも、ノードが実行を完了することを前提としています。
Estimated average width of rows output by this plan node (in bytes). この計画ノードが出力する行の（バイト単位での）推定平均幅。

The costs are measured in arbitrary units determined by the planner's cost parameters (see <xref linkend="runtime-config-query-constants"/>). Traditional practice is to measure the costs in units of disk page fetches; that is, <xref linkend="guc-seq-page-cost"/> is conventionally set to <literal>1.0</literal> and the other cost parameters are set relative to that. The examples in this section are run with the default cost parameters. コストはプランナのコストパラメータ（19.7.2参照）によって決まる任意の単位で測定されます。取り出すディスクページ単位でコストを測定することが、伝統的な方式です。つまり、seq_page_costを慣習的に1.0に設定し、他のコストパラメータを相対的に設定します。本節の例では、デフォルトのコストパラメータで実行しています。

It's important to understand that the cost of an upper-level node includes the cost of all its child nodes. It's also important to realize that the cost only reflects things that the planner cares about. In particular, the cost does not consider the time spent to convert output values to text form or to transmit them to the client, which could be important factors in the real elapsed time; but the planner ignores those costs because it cannot change them by altering the plan. (Every correct plan will output the same row set, we trust.) 上位ノードのコストには、すべての子ノードのコストもその中に含まれていることを理解することは重要です。このコストはプランナが関与するコストのみ反映する点もまた重要です。とりわけ、出力値をテキスト形式に変換したり、クライアントに送信したりするために要する時間はこのコストでは考慮されません。これらは実際の経過時間の重要な要素となる可能性がありますが、計画を変更しても変えることができないため、プランナはこれらのコストを無視します。（全ての正しい計画は同じ行セットを出力するものと期待されます。）

The <literal>rows</literal> value is a little tricky because it is not the number of rows processed or scanned by the plan node, but rather the number emitted by the node. This is often less than the number scanned, as a result of filtering by any <literal>WHERE</literal>-clause conditions that are being applied at the node. Ideally the top-level rows estimate will approximate the number of rows actually returned, updated, or deleted by the query. rowsの値は、計画ノードによって処理あるいはスキャンされた行数を表しておらず、ノードによって発行された行数を表すので、多少扱いにくくなっています。該当ノードに適用されるすべてのWHERE句条件によるフィルタ処理の結果、スキャンされる行より少ない行数になることがよくあります。理想的には、最上位の行数の推定値は、実際に問い合わせによって返され、更新され、あるいは削除された概算の行数となります。

Returning to our example: 例に戻ります。

EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..445.00 rows=10000 width=244)

These numbers are derived very straightforwardly. If you do: これらの数値はとても素直に導かれます。以下を実行すると、

SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';

you will find that <classname>tenk1</classname> has 345 disk pages and 10000 rows. The estimated cost is computed as (disk pages read * <xref linkend="guc-seq-page-cost"/>) + (rows scanned * <xref linkend="guc-cpu-tuple-cost"/>). By default, <varname>seq_page_cost</varname> is 1.0 and <varname>cpu_tuple_cost</varname> is 0.01, so the estimated cost is (345 * 1.0) + (10000 * 0.01) = 445. tenk1には345のディスクページと10000の行があることがわかります。推定コストは（ディスクページ読み取り * seq_page_cost）+（スキャンした行 * cpu_tuple_cost）と計算されます。デフォルトでは、seq_page_costは1.0、cpu_tuple_costは0.01です。ですから、推定コストは(345 * 1.0) + (10000 * 0.01) = 445となります。

Now let's modify the query to add a <literal>WHERE</literal> condition: では、WHERE条件を加えて、問い合わせを変更してみます。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 7000;

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..470.00 rows=7000 width=244)
   Filter: (unique1 < 7000)

Notice that the <command>EXPLAIN</command> output shows the <literal>WHERE</literal> clause being applied as a <quote>filter</quote> condition attached to the Seq Scan plan node. This means that the plan node checks the condition for each row it scans, and outputs only the ones that pass the condition. The estimate of output rows has been reduced because of the <literal>WHERE</literal> clause. However, the scan will still have to visit all 10000 rows, so the cost hasn't decreased; in fact it has gone up a bit (by 10000 * <xref linkend="guc-cpu-operator-cost"/>, to be exact) to reflect the extra CPU time spent checking the <literal>WHERE</literal> condition. EXPLAINの出力が、Seq Scan計画ノードに付随する「フィルタ」条件として適用されるWHERE句を表示していることに注意してください。これは、この計画ノードがスキャンした各行に対してその条件を検査することを意味し、その条件を通過したもののみが出力されます。 WHERE句があるため、推定出力行数が小さくなっています。しかし、依然として10000行すべてをスキャンする必要があるため、コストは小さくなっていません。実際には、WHERE条件を検査するためにCPU時間が余計にかかることを反映して、ほんの少し（正確には10000 * cpu_operator_cost）ですがコストが上昇しています。

The actual number of rows this query would select is 7000, but the <literal>rows</literal> estimate is only approximate. If you try to duplicate this experiment, you may well get a slightly different estimate; moreover, it can change after each <command>ANALYZE</command> command, because the statistics produced by <command>ANALYZE</command> are taken from a randomized sample of the table. この問い合わせが選択する実際の行数は7000です。しかし、rowsの推定行数は概算値に過ぎません。この実験を2回実行した場合、多少異なる推定値を得るかもしれません。もっと言うと、これはANALYZEコマンドを行う度に変化することがあり得ます。なぜなら、ANALYZEで生成される統計情報は、テーブルのランダムな標本から取り出されるからです。

Now, let's make the condition more restrictive: では、条件をより強く制限してみます。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=5.06..224.98 rows=100 width=244)
   Recheck Cond: (unique1 < 100)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
         Index Cond: (unique1 < 100)

Here the planner has decided to use a two-step plan: the child plan node visits an index to find the locations of rows matching the index condition, and then the upper plan node actually fetches those rows from the table itself. Fetching rows separately is much more expensive than reading them sequentially, but because not all the pages of the table have to be visited, this is still cheaper than a sequential scan. (The reason for using two plan levels is that the upper plan node sorts the row locations identified by the index into physical order before reading them, to minimize the cost of separate fetches. The <quote>bitmap</quote> mentioned in the node names is the mechanism that does the sorting.) ここでは、プランナは2段階の計画を使用することを決定しました。子の計画ノードは、インデックスを使用して、インデックス条件(index condition)に合う行の場所を検索します。そして、上位計画ノードが実際にテーブル自体からこれらの行を取り出します。行を別々に取り出すことは、シーケンシャルな読み取りに比べ非常に高価です。しかし、テーブルのすべてのページを読み取る必要はありませんので、シーケンシャルスキャンより安価になります。（2段階の計画を使用する理由は、別々に行を取り出すコストを最小にするために、上位の計画ノードがインデックスにより識別された行の位置を読み取る前に物理的な順序でソートすることです。ノードで記載されている「bitmap」は、ソートを行う機構の名前です。）

Now let's add another condition to the <literal>WHERE</literal> clause: ここでWHERE句に別の条件を付与してみましょう。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND stringu1 = 'xxx';

                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=5.04..225.20 rows=1 width=244)
   Recheck Cond: (unique1 < 100)
   Filter: (stringu1 = 'xxx'::name)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
         Index Cond: (unique1 < 100)

The added condition <literal>stringu1 = 'xxx'</literal> reduces the output row count estimate, but not the cost because we still have to visit the same set of rows. That's because the <literal>stringu1</literal> clause cannot be applied as an index condition, since this index is only on the <literal>unique1</literal> column. Instead it is applied as a filter on the rows retrieved using the index. Thus the cost has actually gone up slightly to reflect this extra checking. 追加されたstringu1 = 'xxx'条件は出力行数推定値を減らしますが、同じ行集合にアクセスしなければなりませんので、コストは減りません。これは、このインデックスがunique1列に対してのみ存在していて、stringu1の句をインデックス条件として適用できないためです。代わりに、インデックスによって取り出される行に対するフィルタとして適用されます。これにより、追加の検査分を反映するため、コストは実際には少し上がります。

In some cases the planner will prefer a <quote>simple</quote> index scan plan: 場合によってはプランナは「単純な」インデックススキャン計画を選択します。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 = 42;

                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using tenk1_unique1 on tenk1  (cost=0.29..8.30 rows=1 width=244)
   Index Cond: (unique1 = 42)

In this type of plan the table rows are fetched in index order, which makes them even more expensive to read, but there are so few that the extra cost of sorting the row locations is not worth it. You'll most often see this plan type for queries that fetch just a single row. It's also often used for queries that have an <literal>ORDER BY</literal> condition that matches the index order, because then no extra sorting step is needed to satisfy the <literal>ORDER BY</literal>. In this example, adding <literal>ORDER BY unique1</literal> would use the same plan because the index already implicitly provides the requested ordering. この種の計画では、テーブル行はインデックス順で取り出されます。このため読み取りがより高価になりますが、この場合取り出す行数が少ないため、改めて行位置をソートし直すための追加コストは割に合いません。単一の行のみを取り出す問い合わせでは、この計画種類がよく現れます。また、ORDER BYを満たすために必要となる余分な必要なソート処理がないため、インデックスの順序に一致するORDER BY条件を持つ問い合わせでよく使用されます。この例では、ORDER BY unique1を追加すると、要求された順序がインデックスによってすでに暗黙的に提供されているため、同じ計画が使用されます。

The planner may implement an <literal>ORDER BY</literal> clause in several ways. The above example shows that such an ordering clause may be implemented implicitly. The planner may also add an explicit <literal>Sort</literal> step: プランナはORDER BY句をいくつかの方法で実装できます。上の例ではこのようなORDER BY句を暗黙的に実装できることを示しています。プランナは明示的なSortステップを追加もします。

EXPLAIN SELECT * FROM tenk1 ORDER BY unique1;

                            QUERY PLAN
-------------------------------------------------------------------
 Sort  (cost=1109.39..1134.39 rows=10000 width=244)
   Sort Key: unique1
   ->  Seq Scan on tenk1  (cost=0.00..445.00 rows=10000 width=244)

If a part of the plan guarantees an ordering on a prefix of the required sort keys, then the planner may instead decide to use an <literal>Incremental Sort</literal> step: ソートキーで必要な接頭辞の順序がプランの一部で保証されている場合、インクリメンタルソート(Incremental Sort)ステップを使用することを決定できます。

EXPLAIN SELECT * FROM tenk1 ORDER BY hundred, ten LIMIT 100;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------
 Limit  (cost=19.35..39.49 rows=100 width=244)
   ->  Incremental Sort  (cost=19.35..2033.39 rows=10000 width=244)
         Sort Key: hundred, ten
         Presorted Key: hundred
         ->  Index Scan using tenk1_hundred on tenk1  (cost=0.29..1574.20 rows=10000 width=244)

Compared to regular sorts, sorting incrementally allows returning tuples before the entire result set has been sorted, which particularly enables optimizations with <literal>LIMIT</literal> queries. It may also reduce memory usage and the likelihood of spilling sorts to disk, but it comes at the cost of the increased overhead of splitting the result set into multiple sorting batches. 通常のソートと比較して、インクリメンタルソートは、結果セット全体がソートされる前にタプルを戻すことができます。これにより、特にLIMITがある問い合わせで最適化が可能になります。また、メモリ使用量が削減され、ソートがディスクにオーバーフローする可能性が減少しますが、結果セットを複数のソートバッチに分割するオーバーヘッドが増加という代償を払うことになります。

If there are separate indexes on several of the columns referenced in <literal>WHERE</literal>, the planner might choose to use an AND or OR combination of the indexes: WHERE句で参照される複数の列に対して別々のインデックスが存在する場合、プランナはインデックスをANDやORで組み合わせて使用することを選択する可能性があります。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=25.07..60.11 rows=10 width=244)
   Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
   ->  BitmapAnd  (cost=25.07..25.07 rows=10 width=0)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
               Index Cond: (unique1 < 100)
         ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0)
               Index Cond: (unique2 > 9000)

But this requires visiting both indexes, so it's not necessarily a win compared to using just one index and treating the other condition as a filter. If you vary the ranges involved you'll see the plan change accordingly. しかし、これは両方のインデックスを参照する必要があります。そのため、インデックスを1つ使用し、他の条件についてはフィルタとして扱う方法と比べて常に勝るとは限りません。含まれる範囲を変更すると、それに伴い計画も変わることが分かるでしょう。

Here is an example showing the effects of <literal>LIMIT</literal>: 以下にLIMITの影響を示す例を示します。

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;

                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Limit  (cost=0.29..14.28 rows=2 width=244)
   ->  Index Scan using tenk1_unique2 on tenk1  (cost=0.29..70.27 rows=10 width=244)
         Index Cond: (unique2 > 9000)
         Filter: (unique1 < 100)

This is the same query as above, but we added a <literal>LIMIT</literal> so that not all the rows need be retrieved, and the planner changed its mind about what to do. Notice that the total cost and row count of the Index Scan node are shown as if it were run to completion. However, the Limit node is expected to stop after retrieving only a fifth of those rows, so its total cost is only a fifth as much, and that's the actual estimated cost of the query. This plan is preferred over adding a Limit node to the previous plan because the Limit could not avoid paying the startup cost of the bitmap scan, so the total cost would be something over 25 units with that approach. これは上と同じ問い合わせですが、すべての行を取り出す必要がないためLIMITを付けています。プランナはどうすべきかについて考えを変えました。インデックススキャンノードの総コストと総行数があたかも実行完了したかのように表示されていることに注意してください。しかしLimitノードが、これらの行の1/5だけを取り出した後で停止することが想定されています。そのため総コストは1/5程度のみとなり、これが問い合わせの実際の推定コストとなります。この計画は、以前の計画にLimitノードを追加することより好まれます。以前の計画へのLimit追加でのLimitはビットマップスキャンの起動コストを払うことを避けることができないため、総コストは25単位を超えてしまうからです。

Let's try joining two tables, using the columns we have been discussing: 今まで説明に使ってきた列を使って2つのテーブルを結合してみましょう。

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;

                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=4.65..118.50 rows=10 width=488)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.38 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.90 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)

In this plan, we have a nested-loop join node with two table scans as inputs, or children. The indentation of the node summary lines reflects the plan tree structure. The join's first, or <quote>outer</quote>, child is a bitmap scan similar to those we saw before. Its cost and row count are the same as we'd get from <literal>SELECT ... WHERE unique1 < 10</literal> because we are applying the <literal>WHERE</literal> clause <literal>unique1 < 10</literal> at that node. The <literal>t1.unique2 = t2.unique2</literal> clause is not relevant yet, so it doesn't affect the row count of the outer scan. The nested-loop join node will run its second, or <quote>inner</quote> child once for each row obtained from the outer child. Column values from the current outer row can be plugged into the inner scan; here, the <literal>t1.unique2</literal> value from the outer row is available, so we get a plan and costs similar to what we saw above for a simple <literal>SELECT ... WHERE t2.unique2 = <replaceable>constant</replaceable></literal> case. (The estimated cost is actually a bit lower than what was seen above, as a result of caching that's expected to occur during the repeated index scans on <literal>t2</literal>.) The costs of the loop node are then set on the basis of the cost of the outer scan, plus one repetition of the inner scan for each outer row (10 * 7.90, here), plus a little CPU time for join processing. この計画では、入力または子として２つのテーブルスキャンを持つネステッドループ結合ノードがあります。計画のツリー構造を反映して、ノード要約行はインデント付けされます。結合の先頭、「外部」、子は以前に説明したものと似たビットマップスキャンです。そのコストと行数は、該当ノードにunique1 < 10 WHERE句が適用されるため、SELECT ... WHERE unique1 < 10で得られたものと同じです。この段階ではt1.unique2 = t2.unique2句は関係しておらず、外部スキャンにおける出力行数に影響していません。ネステッドループ結合ノードは、外部の子から得られた行毎に、その２番目または「内部の」子を一回実行します。現在の外部の行からの列の値は内部スキャンに組み込まれます。ここで、外部行からのt1.unique2の値が利用できますので、上述の単純なSELECT ... WHERE t2.unique2 = constantの場合に示したものと似た計画とコストが得られます。（実際、推定コストは、t2に対するインデックススキャンが繰り返される間に発生することが想定されるキャッシュの結果、上で示した値よりわずかに低くなります。）ループノードのコストは、外部スキャンのコストと、各々の外部の行に対して内部スキャンが繰り返されることによるコスト（ここでは10 * 7.90）を加え、さらに結合処理を行うための少々のCPU時間を加えたものになります。

In this example the join's output row count is the same as the product of the two scans' row counts, but that's not true in all cases because there can be additional <literal>WHERE</literal> clauses that mention both tables and so can only be applied at the join point, not to either input scan. Here's an example: この例では、結合の出力行数は2つのスキャンの出力行数の積に等しくなっていますが、いつもそうなるわけではありません。 2つのテーブルに関係するWHERE句は、入力スキャン時ではなく、結合を行う際に適用されるからです。以下が例です。

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t2.unique2 < 10 AND t1.hundred < t2.hundred;

                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.65..49.36 rows=33 width=488)
   Join Filter: (t1.hundred < t2.hundred)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.38 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Materialize  (cost=0.29..8.51 rows=10 width=244)
         ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
               Index Cond: (unique2 < 10)

The condition <literal>t1.hundred < t2.hundred</literal> can't be tested in the <literal>tenk2_unique2</literal> index, so it's applied at the join node. This reduces the estimated output row count of the join node, but does not change either input scan. 条件t1.hundred < t2.hundredはtenk2_unique2インデックスの中では試験されません。このため結合ノードで適用されます。これは結合ノードの推定出力行数を減らしはしますが、入力スキャンには影響しません。

Notice that here the planner has chosen to <quote>materialize</quote> the inner relation of the join, by putting a Materialize plan node atop it. This means that the <literal>t2</literal> index scan will be done just once, even though the nested-loop join node needs to read that data ten times, once for each row from the outer relation. The Materialize node saves the data in memory as it's read, and then returns the data from memory on each subsequent pass. ここではプランナが、具体化計画ノードをその上に挿入することで、結合の内部リレーションの「具体化」を選択していることに注意してください。これは、たとえネステッドループ結合ノードが外部リレーションから各行につき一度、そのデータを10回読む必要があったとしても、t2インデックススキャンが一度だけ行なわれることを意味します。具体化ノードはそのデータを読んだときにメモリに保存し、その後の読み出しではそのデータをメモリから返します。

When dealing with outer joins, you might see join plan nodes with both <quote>Join Filter</quote> and plain <quote>Filter</quote> conditions attached. Join Filter conditions come from the outer join's <literal>ON</literal> clause, so a row that fails the Join Filter condition could still get emitted as a null-extended row. But a plain Filter condition is applied after the outer-join rules and so acts to remove rows unconditionally. In an inner join there is no semantic difference between these types of filters. 外部結合を扱う時、「結合フィルタ」および通常の「フィルタ」の両方が付随する結合計画ノードが現れる可能性があります。結合フィルタ条件は外部結合のON句を元にしますので、結合フィルタ条件に合わない行がNULLで展開された行として発行され続けます。しかし通常のフィルタ条件が外部結合規則の後に適用され、条件に合わない行は削除されます。内部結合では、これらのフィルタ種類の間に意味的な違いはありません。

If we change the query's selectivity a bit, we might get a very different join plan: 問い合わせの選択性を少し変更すると、非常に異なる結合計画が得られるかもしれません。

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Hash Join  (cost=226.23..709.73 rows=100 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  Seq Scan on tenk2 t2  (cost=0.00..445.00 rows=10000 width=244)
   ->  Hash  (cost=224.98..224.98 rows=100 width=244)
         ->  Bitmap Heap Scan on tenk1 t1  (cost=5.06..224.98 rows=100 width=244)
               Recheck Cond: (unique1 < 100)
               ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
                     Index Cond: (unique1 < 100)

Here, the planner has chosen to use a hash join, in which rows of one table are entered into an in-memory hash table, after which the other table is scanned and the hash table is probed for matches to each row. Again note how the indentation reflects the plan structure: the bitmap scan on <literal>tenk1</literal> is the input to the Hash node, which constructs the hash table. That's then returned to the Hash Join node, which reads rows from its outer child plan and searches the hash table for each one. ここでプランナはハッシュ結合の使用を選択しました。片方のテーブルの行がメモリ内のハッシュテーブルに格納され、もう片方のテーブルがスキャンされた後、各行に対して一致するかどうかハッシュテーブルを探索します。再度、インデントが計画の構造が表されていることに注意してください。 tenk1に対するビットマップスキャンはハッシュノードへの入力です。外部の子計画から行を読み取り、各行に対してハッシュテーブルを検索します。

Another possible type of join is a merge join, illustrated here: 他にも、以下に示すようなマージ結合という結合があり得ます。

EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Merge Join  (cost=0.56..233.49 rows=10 width=488)
   Merge Cond: (t1.unique2 = t2.unique2)
   ->  Index Scan using tenk1_unique2 on tenk1 t1  (cost=0.29..643.28 rows=100 width=244)
         Filter: (unique1 < 100)
   ->  Index Scan using onek_unique2 on onek t2  (cost=0.28..166.28 rows=1000 width=244)

Merge join requires its input data to be sorted on the join keys. In this example each input is sorted by using an index scan to visit the rows in the correct order; but a sequential scan and sort could also be used. (Sequential-scan-and-sort frequently beats an index scan for sorting many rows, because of the nonsequential disk access required by the index scan.) マージ結合は、結合キーでソートされる入力データを必要とします。この例では、各入力がインデックススキャンを使用して正しい順序で行にアクセスすることでソートされますが、シーケンシャルスキャンとソートも使用できます。（多くの行をソートする場合、インデックススキャンでは非シーケンシャルなディスクアクセスが必要となるため、シーケンシャルスキャンとソートの方がインデックススキャンより優ります。）

One way to look at variant plans is to force the planner to disregard whatever strategy it thought was the cheapest, using the enable/disable flags described in <xref linkend="runtime-config-query-enable"/>. (This is a crude tool, but useful. See also <xref linkend="explicit-joins"/>.) For example, if we're unconvinced that merge join is the best join type for the previous example, we could try 19.7.1に記載したenable/disableフラグを使用して、プランナが最も良いと考えている戦略を強制的に無視させる方法により、異なった計画を観察することができます。（非常に原始的なツールですが、利用価値があります。 14.3も参照してください。）例えば、マージ結合が前の例の最善の結合タイプであると確信できない場合は、以下を試みることができます。

SET enable_mergejoin = off;

EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Hash Join  (cost=226.23..344.08 rows=10 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  Seq Scan on onek t2  (cost=0.00..114.00 rows=1000 width=244)
   ->  Hash  (cost=224.98..224.98 rows=100 width=244)
         ->  Bitmap Heap Scan on tenk1 t1  (cost=5.06..224.98 rows=100 width=244)
               Recheck Cond: (unique1 < 100)
               ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
                     Index Cond: (unique1 < 100)

which shows that the planner thinks that hash join would be nearly 50% more expensive than merge join for this case. Of course, the next question is whether it's right about that. We can investigate that using <command>EXPLAIN ANALYZE</command>, as discussed <link linkend="using-explain-analyze">below</link>. これは、このケースではハッシュ結合がマージ結合よりも50%近く高価になるとプランナが考えていることを示しています。当然ながら、次の疑問はこれが正しいかどうかでしょう。以下で説明するように、EXPLAIN ANALYZEを使って調べることができます。

When using the enable/disable flags to disable plan node types, many of the flags only discourage the use of the corresponding plan node and don't outright disallow the planner's ability to use the plan node type. This is by design so that the planner still maintains the ability to form a plan for a given query. When the resulting plan contains a disabled node, the <command>EXPLAIN</command> output will indicate this fact. enable/disableフラグを使用して計画ノード型を無効化する場合、そのフラグの多くは、対応する計画ノードの使用を妨げるだけで、プランナがその計画ノード型を使用することを完全に禁止するわけではありません。これは、与えられた問い合わせの計画を作成する機能をプランナが維持するための設計です。作成された計画に無効化したノードが含まれている場合、EXPLAINの出力にはその事実が示されます。

SET enable_seqscan = off;
EXPLAIN SELECT * FROM unit;

                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
   Disabled: true

Because the <literal>unit</literal> table has no indexes, there is no other means to read the table data, so the sequential scan is the only option available to the query planner. unitテーブルにはインデックスがなく、他にテーブルデータを読み取る方法がないため、問い合わせプランナが使用できるのはシーケンシャルスキャンのみです。

Some query plans involve <firstterm>subplans</firstterm>, which arise from sub-<literal>SELECT</literal>s in the original query. Such queries can sometimes be transformed into ordinary join plans, but when they cannot be, we get plans like: 一部の問い合わせ計画には、元の問い合わせ内の副SELECTから発生する 副計画（subplans）が含まれています。このような問い合わせは、通常の結合プランに変換できる場合もありますが、変換できない場合は、次のような計画が生成されます。

EXPLAIN VERBOSE SELECT unique1
FROM tenk1 t
WHERE t.ten < ALL (SELECT o.ten FROM onek o WHERE o.four = t.four);

                               QUERY PLAN
-------------------------------------------------------------------------
 Seq Scan on public.tenk1 t  (cost=0.00..586095.00 rows=5000 width=4)
   Output: t.unique1
   Filter: (ALL (t.ten < (SubPlan 1).col1))
   SubPlan 1
     ->  Seq Scan on public.onek o  (cost=0.00..116.50 rows=250 width=4)
           Output: o.ten
           Filter: (o.four = t.four)

This rather artificial example serves to illustrate a couple of points: values from the outer plan level can be passed down into a subplan (here, <literal>t.four</literal> is passed down) and the results of the sub-select are available to the outer plan. Those result values are shown by <command>EXPLAIN</command> with notations like <literal>(<replaceable>subplan_name</replaceable>).col<replaceable>N</replaceable></literal>, which refers to the <replaceable>N</replaceable>'th output column of the sub-<literal>SELECT</literal>. このかなり人工的な例は、二つの点を説明するのに役立ちます。すなわち、外側の計画階層からの値は副計画に渡すことができて（ここではt.fourが渡されます）、副SELECTの結果は外側の計画で利用できます。これらの結果値は、EXPLAINで(subplan_name).colNのような記法で表示され、これは副SELECTのN番目の出力列を指します。

In the example above, the <literal>ALL</literal> operator runs the subplan again for each row of the outer query (which accounts for the high estimated cost). Some queries can use a <firstterm>hashed subplan</firstterm> to avoid that: 上の例で、ALL演算子は、外側の問い合わせの各行に対して副計画を実行します（高い推定コストの原因となります）。一部の問い合わせでは、それを回避するためにハッシュされた副計画（hashed subplan）を使用できます。

EXPLAIN SELECT *
FROM tenk1 t
WHERE t.unique1 NOT IN (SELECT o.unique1 FROM onek o);

                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Seq Scan on tenk1 t  (cost=61.77..531.77 rows=5000 width=244)
   Filter: (NOT (ANY (unique1 = (hashed SubPlan 1).col1)))
   SubPlan 1
     ->  Index Only Scan using onek_unique1 on onek o  (cost=0.28..59.27 rows=1000 width=4)
(4 rows)

Here, the subplan is run a single time and its output is loaded into an in-memory hash table, which is then probed by the outer <literal>ANY</literal> operator. This requires that the sub-<literal>SELECT</literal> not reference any variables of the outer query, and that the <literal>ANY</literal>'s comparison operator be amenable to hashing. ここでは、副計画が1回実行され、その出力がメモリ内のハッシュテーブルに置かれ、次に外側のANY演算子によって検査されます。これには、副SELECTが外側の問い合わせの変数を参照しないことと、ANYの比較演算子がハッシュ適用できることが必要です。

If, in addition to not referencing any variables of the outer query, the sub-<literal>SELECT</literal> cannot return more than one row, it may instead be implemented as an <firstterm>initplan</firstterm>: 外側の問い合わせの変数を参照しないことに加えて、副SELECTが複数行を返せない場合は、代わりにinitplanとして次のように計画作成されます。

EXPLAIN VERBOSE SELECT unique1
FROM tenk1 t1 WHERE t1.ten = (SELECT (random() * 10)::integer);

                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on public.tenk1 t1  (cost=0.02..470.02 rows=1000 width=4)
   Output: t1.unique1
   Filter: (t1.ten = (InitPlan 1).col1)
   InitPlan 1
     ->  Result  (cost=0.00..0.02 rows=1 width=4)
           Output: ((random() * '10'::double precision))::integer

An initplan is run only once per execution of the outer plan, and its results are saved for re-use in later rows of the outer plan. So in this example <literal>random()</literal> is evaluated only once and all the values of <literal>t1.ten</literal> are compared to the same randomly-chosen integer. That's quite different from what would happen without the sub-<literal>SELECT</literal> construct. initplanは外側の計画の実行ごとに1回だけ実行され、その結果は外側の計画の後の行で再利用するために保存されます。したがって、この例ではrandom()は1回だけ評価され、t1.10のすべての値がランダムに選択された同じ整数と比較されます。これは、副SELECT構造がない場合とは大きく異なります。

14.1.2. `EXPLAIN ANALYZE` #

It is possible to check the accuracy of the planner's estimates by using <command>EXPLAIN</command>'s <literal>ANALYZE</literal> option. With this option, <command>EXPLAIN</command> actually executes the query, and then displays the true row counts and true run time accumulated within each plan node, along with the same estimates that a plain <command>EXPLAIN</command> shows. For example, we might get a result like this: EXPLAINのANALYZEオプションを使用して、プランナが推定するコストの精度を点検することができます。このオプションを付けるとEXPLAINは実際にその問い合わせを実行し、計画ノードごとに実際の行数と要した実際の実行時間を、普通のEXPLAINが示すものと同じ推定値と一緒に表示します。例えば、以下のような結果を得ることができます。

EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.65..118.50 rows=10 width=488) (actual time=0.017..0.051 rows=10.00 loops=1)
   Buffers: shared hit=36 read=6
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.38 rows=10 width=244) (actual time=0.009..0.017 rows=10.00 loops=1)
         Recheck Cond: (unique1 < 10)
         Heap Blocks: exact=10
         Buffers: shared hit=3 read=5 written=4
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10.00 loops=1)
               Index Cond: (unique1 < 10)
               Index Searches: 1
               Buffers: shared hit=2
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1.00 loops=10)
         Index Cond: (unique2 = t1.unique2)
         Index Searches: 10
         Buffers: shared hit=24 read=6
 Planning:
   Buffers: shared hit=15 dirtied=9
 Planning Time: 0.485 ms
 Execution Time: 0.073 ms

Note that the <quote>actual time</quote> values are in milliseconds of real time, whereas the <literal>cost</literal> estimates are expressed in arbitrary units; so they are unlikely to match up. The thing that's usually most important to look for is whether the estimated row counts are reasonably close to reality. In this example the estimates were all dead-on, but that's quite unusual in practice. 「actual time」値は実時間をミリ秒単位で表されていること、cost推定値は何らかの単位で表されていることに注意してください。ですからそのまま比較することはできません。注目すべきもっとも重要な点は通常、推定行数が実際の値と合理的に近いかどうかです。この例では、推定はすべて正確ですが、現実的にはあまりありません。

In some query plans, it is possible for a subplan node to be executed more than once. For example, the inner index scan will be executed once per outer row in the above nested-loop plan. In such cases, the <literal>loops</literal> value reports the total number of executions of the node, and the actual time and rows values shown are averages per-execution. This is done to make the numbers comparable with the way that the cost estimates are shown. Multiply by the <literal>loops</literal> value to get the total time actually spent in the node. In the above example, we spent a total of 0.030 milliseconds executing the index scans on <literal>tenk2</literal>. 問い合わせ計画の中には、何回も副計画ノードを実行する可能性のあるものがあります。例えば、上述のネステッドループの計画では、内部インデックススキャンは外部の行ごとに一度行われます。このような場合、loops値はそのノードを実行する総回数を報告し、表示される実際の時間と行数は1実行当たりの平均です。これで値を表示された推定コストと比較できるようになります。 loops値をかけることで、そのノードで実際に費やされた総時間を得ることができます。上の例では、tenk2に対するインデックススキャンの実行のために合計0.030ミリ秒要しています。

In some cases <command>EXPLAIN ANALYZE</command> shows additional execution statistics beyond the plan node execution times and row counts. For example, Sort and Hash nodes provide extra information: 場合によっては、EXPLAIN ANALYZEは計画ノードの実行時間と行数以上の実行統計情報をさらに表示します。例えば、ソートとハッシュノードでは以下のような追加情報を提供します。

EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;

                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=713.05..713.30 rows=100 width=488) (actual time=2.995..3.002 rows=100.00 loops=1)
   Sort Key: t1.fivethous
   Sort Method: quicksort  Memory: 74kB
   Buffers: shared hit=440
   ->  Hash Join  (cost=226.23..709.73 rows=100 width=488) (actual time=0.515..2.920 rows=100.00 loops=1)
         Hash Cond: (t2.unique2 = t1.unique2)
         Buffers: shared hit=437
         ->  Seq Scan on tenk2 t2  (cost=0.00..445.00 rows=10000 width=244) (actual time=0.026..1.790 rows=10000.00 loops=1)
               Buffers: shared hit=345
         ->  Hash  (cost=224.98..224.98 rows=100 width=244) (actual time=0.476..0.477 rows=100.00 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 35kB
               Buffers: shared hit=92
               ->  Bitmap Heap Scan on tenk1 t1  (cost=5.06..224.98 rows=100 width=244) (actual time=0.030..0.450 rows=100.00 loops=1)
                     Recheck Cond: (unique1 < 100)
                     Heap Blocks: exact=90
                     Buffers: shared hit=92
                     ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100.00 loops=1)
                           Index Cond: (unique1 < 100)
                           Index Searches: 1
                           Buffers: shared hit=2
 Planning:
   Buffers: shared hit=12
 Planning Time: 0.187 ms
 Execution Time: 3.036 ms

The Sort node shows the sort method used (in particular, whether the sort was in-memory or on-disk) and the amount of memory or disk space needed. The Hash node shows the number of hash buckets and batches as well as the peak amount of memory used for the hash table. (If the number of batches exceeds one, there will also be disk space usage involved, but that is not shown.) ソートノードは使用されるソート方式（具体的にはソートがメモリ内かディスク上か）および必要なメモリまたはディスクの容量を表示します。ハッシュノードでは、ハッシュバケット数とバッチ数、ハッシュテーブルで使用されるメモリのピーク容量が表示されます。（バッチ数が１を超える場合、同時にディスクの使用容量も含まれますが、表示はされません。）

Index Scan nodes (as well as Bitmap Index Scan and Index-Only Scan nodes) show an <quote>Index Searches</quote> line that reports the total number of searches across <emphasis>all</emphasis> node executions/<literal>loops</literal>: インデックススキャンノード（およびビットマップインデックススキャンとインデックスオンリースキャンノード）には、すべてのノード実行とloopsにわたる検索の合計数を報告する「Index Searches」の行が表示されます。

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 500, 700, 999);
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=9.45..73.44 rows=40 width=244) (actual time=0.012..0.028 rows=40.00 loops=1)
   Recheck Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
   Heap Blocks: exact=39
   Buffers: shared hit=47
   ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..9.44 rows=40 width=0) (actual time=0.009..0.009 rows=40.00 loops=1)
         Index Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
         Index Searches: 4
         Buffers: shared hit=8
 Planning Time: 0.029 ms
 Execution Time: 0.034 ms

Here we see a Bitmap Index Scan node that needed 4 separate index searches. The scan had to search the index from the <structname>tenk1_thous_tenthous</structname> index root page once per <type>integer</type> value from the predicate's <literal>IN</literal> construct. However, the number of index searches often won't have such a simple correspondence to the query predicate: ここでは、4つの個別のインデックス検索を必要とするビットマップインデックススキャンノードがあります。このスキャンは、述語のIN構文におけるintegerの値につき一度、tenk1_thous_tenthousインデックスのルートページからインデックスを検索する必要がありました。ただし、インデックス検索の数は、問い合わせの述語とそれほど単純に対応していないことがよくあります。

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 2, 3, 4);
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=9.45..73.44 rows=40 width=244) (actual time=0.009..0.019 rows=40.00 loops=1)
   Recheck Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
   Heap Blocks: exact=38
   Buffers: shared hit=40
   ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..9.44 rows=40 width=0) (actual time=0.005..0.005 rows=40.00 loops=1)
         Index Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
         Index Searches: 1
         Buffers: shared hit=2
 Planning Time: 0.029 ms
 Execution Time: 0.026 ms

This variant of our <literal>IN</literal> query performed only 1 index search. It spent less time traversing the index (compared to the original query) because its <literal>IN</literal> construct uses values matching index tuples stored next to each other, on the same <structname>tenk1_thous_tenthous</structname> index leaf page. この異なるINを含む問い合わせでは、1つのインデックス検索しか実行されていません。このIN構文では、同じtenk1_thous_tenthousインデックスリーフページ上にお互い隣り合って格納されているインデックスタプルと一致する値を使用しているため、（元の問い合わせと比較して）インデックスを探索する時間が短くなりました。

The <quote>Index Searches</quote> line is also useful with B-tree index scans that apply the <firstterm>skip scan</firstterm> optimization to more efficiently traverse through an index: 「Index Searches」の行は、スキップスキャン最適化を適用してインデックスをより効果的に探索するB-treeインデックススキャンでも役立ちます。

EXPLAIN ANALYZE SELECT four, unique1 FROM tenk1 WHERE four BETWEEN 1 AND 3 AND unique1 = 42;
                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using tenk1_four_unique1_idx on tenk1  (cost=0.29..6.90 rows=1 width=8) (actual time=0.006..0.007 rows=1.00 loops=1)
   Index Cond: ((four >= 1) AND (four <= 3) AND (unique1 = 42))
   Heap Fetches: 0
   Index Searches: 3
   Buffers: shared hit=7
 Planning Time: 0.029 ms
 Execution Time: 0.012 ms

Here we see an Index-Only Scan node using <structname>tenk1_four_unique1_idx</structname>, a multi-column index on the <structname>tenk1</structname> table's <structfield>four</structfield> and <structfield>unique1</structfield> columns. The scan performs 3 searches that each read a single index leaf page: <quote><literal>four = 1 AND unique1 = 42</literal></quote>, <quote><literal>four = 2 AND unique1 = 42</literal></quote>, and <quote><literal>four = 3 AND unique1 = 42</literal></quote>. This index is generally a good target for skip scan, since, as discussed in <xref linkend="indexes-multicolumn"/>, its leading column (the <structfield>four</structfield> column) contains only 4 distinct values, while its second/final column (the <structfield>unique1</structfield> column) contains many distinct values. ここでは、tenk1テーブルのfour列とunique1列にある複数列インデックスのtenk1_four_unique1_idxを使用するインデックスオンリースキャンノードがあります。このスキャンでは3回の検索が実行され、それぞれ1つのインデックスリーフページを読みます。「four = 1 AND unique1 = 42」、「four = 2 AND unique1 = 42」、そして「four = 3 AND unique1 = 42」です。このインデックスは、先頭の列（four列）には異なる値が4つしかありませんが、2番目であり最後の列（unique1列）には多くの異なる値が含まれるため、11.3で説明されているように、スキップスキャンの対象として一般的に適しています。

Another type of extra information is the number of rows removed by a filter condition: 他の種類の追加情報はフィルタ条件によって除外される行数があります。

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE ten < 7;

                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..470.00 rows=7000 width=244) (actual time=0.030..1.995 rows=7000.00 loops=1)
   Filter: (ten < 7)
   Rows Removed by Filter: 3000
   Buffers: shared hit=345
 Planning Time: 0.102 ms
 Execution Time: 2.145 ms

These counts can be particularly valuable for filter conditions applied at join nodes. The <quote>Rows Removed</quote> line only appears when at least one scanned row, or potential join pair in the case of a join node, is rejected by the filter condition. 特に結合ノードで適用されるフィルタ条件ではこれらの数が有用です。「Rows Removed」行は、少なくともスキャンされた１行、結合ノードにおける結合組み合わせの可能性がフィルタ条件によって拒絶された時にのみ現れます。

A case similar to filter conditions occurs with <quote>lossy</quote> index scans. For example, consider this search for polygons containing a specific point: 「非可逆」インデックススキャンはフィルタ条件に似た状況です。例えば、特定の点を含有する多角形の検索を考えてみます。

EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on polygon_tbl  (cost=0.00..1.09 rows=1 width=85) (actual time=0.023..0.023 rows=0.00 loops=1)
   Filter: (f1 @> '((0.5,2))'::polygon)
   Rows Removed by Filter: 7
   Buffers: shared hit=1
 Planning Time: 0.039 ms
 Execution Time: 0.033 ms

The planner thinks (quite correctly) that this sample table is too small to bother with an index scan, so we have a plain sequential scan in which all the rows got rejected by the filter condition. But if we force an index scan to be used, we see: プランナは（ほぼ正確に）、インデックススキャンを考慮するには例のテーブルが小さ過ぎるとみなします。このため、フィルタ条件によってすべての行が拒絶される、普通のシーケンシャルスキャンとなります。しかしインデックススキャンの使用を強制するのであれば、以下のようにします。

SET enable_seqscan TO off;

EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';

                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using gpolygonind on polygon_tbl  (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0.00 loops=1)
   Index Cond: (f1 @> '((0.5,2))'::polygon)
   Rows Removed by Index Recheck: 1
   Index Searches: 1
   Buffers: shared hit=1
 Planning Time: 0.039 ms
 Execution Time: 0.098 ms

Here we can see that the index returned one candidate row, which was then rejected by a recheck of the index condition. This happens because a GiST index is <quote>lossy</quote> for polygon containment tests: it actually returns the rows with polygons that overlap the target, and then we have to do the exact containment test on those rows. ここで、インデックスが１つの候補行を返し、それがインデックス条件の再検査により拒絶されることが分かります。多角形の含有試験ではGiSTインデックスが「非可逆」であるため、これは発生します。実際には対象と重なる多角形を持つ行を返し、そしてこれらの行が正確に含有関係であることを試験しなければなりません。

<command>EXPLAIN</command> has a <literal>BUFFERS</literal> option which provides additional detail about I/O operations performed during the planning and execution of the given query. The buffer numbers displayed show the count of the non-distinct buffers hit, read, dirtied, and written for the given node and all of its child nodes. The <literal>ANALYZE</literal> option implicitly enables the <literal>BUFFERS</literal> option. If this is undesired, <literal>BUFFERS</literal> may be explicitly disabled: EXPLAINには、与えられた問い合わせのプランニング中と実行中に行われるI/O操作に関して追加の詳細情報を提供するBUFFERSオプションがあります。表示されるバッファ数は、与えられたノードとその全ての子ノードについて、ヒット、読み取り、ダーティ化、そして書き込みが行われた個別のバッファの数を示します。 ANALYZEオプションは暗黙的にBUFFERSオプションを有効化します。これが望ましくない場合は、BUFFERSを明示的に無効化できます。

EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=25.07..60.11 rows=10 width=244) (actual time=0.105..0.114 rows=10.00 loops=1)
   Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
   Heap Blocks: exact=10
   ->  BitmapAnd  (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0.00 loops=1)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100.00 loops=1)
               Index Cond: (unique1 < 100)
               Index Searches: 1
         ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999.00 loops=1)
               Index Cond: (unique2 > 9000)
               Index Searches: 1
 Planning Time: 0.162 ms
 Execution Time: 0.143 ms

Keep in mind that because <command>EXPLAIN ANALYZE</command> actually runs the query, any side-effects will happen as usual, even though whatever results the query might output are discarded in favor of printing the <command>EXPLAIN</command> data. If you want to analyze a data-modifying query without changing your tables, you can roll the command back afterwards, for example: EXPLAIN ANALYZEが実際に問い合わせを実行しますので、EXPLAINのデータを出力することを優先して問い合わせの出力が破棄されたとしても、何らかの副作用が通常通り発生することに注意してください。テーブルを変更すること無くデータ変更問い合わせの解析を行いたければ、以下の例のように、実行後コマンドをロールバックしてください。

BEGIN;

EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;

                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Update on tenk1  (cost=5.06..225.23 rows=0 width=0) (actual time=1.634..1.635 rows=0.00 loops=1)
   ->  Bitmap Heap Scan on tenk1  (cost=5.06..225.23 rows=100 width=10) (actual time=0.065..0.141 rows=100.00 loops=1)
         Recheck Cond: (unique1 < 100)
         Heap Blocks: exact=90
         Buffers: shared hit=4 read=2
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100.00 loops=1)
               Index Cond: (unique1 < 100)
               Index Searches: 1
               Buffers: shared read=2
 Planning Time: 0.151 ms
 Execution Time: 1.856 ms

ROLLBACK;

As seen in this example, when the query is an <command>INSERT</command>, <command>UPDATE</command>, <command>DELETE</command>, or <command>MERGE</command> command, the actual work of applying the table changes is done by a top-level Insert, Update, Delete, or Merge plan node. The plan nodes underneath this node perform the work of locating the old rows and/or computing the new data. So above, we see the same sort of bitmap table scan we've seen already, and its output is fed to an Update node that stores the updated rows. It's worth noting that although the data-modifying node can take a considerable amount of run time (here, it's consuming the lion's share of the time), the planner does not currently add anything to the cost estimates to account for that work. That's because the work to be done is the same for every correct query plan, so it doesn't affect planning decisions. この例で分かるように、問い合わせがINSERT、UPDATE、DELETE、MERGEである場合、テーブル変更を行うための実作業は最上位のInsert、Update、Delete、Merge計画ノードで行われます。このノード以下にある計画ノードは、古い行の検索、新しいデータの計算、あるいはその両方を行います。このため、前に述べたものと同じ種類のビットマップテーブルスキャンがあり、その出力が更新される行を格納するUpdateノードに渡されることが分かります。データ変更ノードが実行時間の多くを費やす可能性があります（現在これが一番多くの時間を費やしています）が、プランナは現在その作業を考慮してコスト推定に何も加えません。これは、行われる作業がすべての正確な問い合わせ計画の作業と同一であるためであり、このため計画の決定に影響を与えません。

When an <command>UPDATE</command>, <command>DELETE</command>, or <command>MERGE</command> command affects a partitioned table or inheritance hierarchy, the output might look like this: UPDATE、DELETE、MERGEコマンドがパーティションテーブルや継承階層に影響する場合には、出力は以下のようになるでしょう。

EXPLAIN UPDATE gtest_parent SET f1 = CURRENT_DATE WHERE f2 = 101;

                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Update on gtest_parent  (cost=0.00..3.06 rows=0 width=0)
   Update on gtest_child gtest_parent_1
   Update on gtest_child2 gtest_parent_2
   Update on gtest_child3 gtest_parent_3
   ->  Append  (cost=0.00..3.06 rows=3 width=14)
         ->  Seq Scan on gtest_child gtest_parent_1  (cost=0.00..1.01 rows=1 width=14)
               Filter: (f2 = 101)
         ->  Seq Scan on gtest_child2 gtest_parent_2  (cost=0.00..1.01 rows=1 width=14)
               Filter: (f2 = 101)
         ->  Seq Scan on gtest_child3 gtest_parent_3  (cost=0.00..1.01 rows=1 width=14)
               Filter: (f2 = 101)

In this example the Update node needs to consider three child tables, but not the originally-mentioned partitioned table (since that never stores any data). So there are three input scanning subplans, one per table. For clarity, the Update node is annotated to show the specific target tables that will be updated, in the same order as the corresponding subplans. この例では、Updateノードは3つの子テーブルを考慮しますが、元の問い合わせに記述されていたパーティションテーブルを考慮しません（そこにデータ格納されることはないため）。そのため、テーブル毎に3つの入力スキャン副計画があります。明確にするため、Updateノードには対応する副計画と同じ順に更新される特定の対象テーブルを示す注釈が付けられています。

The <literal>Planning time</literal> shown by <command>EXPLAIN ANALYZE</command> is the time it took to generate the query plan from the parsed query and optimize it. It does not include parsing or rewriting. EXPLAIN ANALYZEで表示されるPlanning timeは、解析された問い合わせから問い合わせ計画を生成し最適化するのに掛かった時間です。解析と書き換えは含みません。

The <literal>Execution time</literal> shown by <command>EXPLAIN ANALYZE</command> includes executor start-up and shut-down time, as well as the time to run any triggers that are fired, but it does not include parsing, rewriting, or planning time. Time spent executing <literal>BEFORE</literal> triggers, if any, is included in the time for the related Insert, Update, or Delete node; but time spent executing <literal>AFTER</literal> triggers is not counted there because <literal>AFTER</literal> triggers are fired after completion of the whole plan. The total time spent in each trigger (either <literal>BEFORE</literal> or <literal>AFTER</literal>) is also shown separately. Note that deferred constraint triggers will not be executed until end of transaction and are thus not considered at all by <command>EXPLAIN ANALYZE</command>. EXPLAIN ANALYZEで表示されるExecution time（実行時間）にはエグゼキュータの起動、停止時間、発行される何らかのトリガの実行時間も含まれますが、解析や書き換え、計画作成の時間は含まれません。 BEFOREトリガがあればその実行時間は関連するInsert、Update、Deleteノード用の時間に含まれます。しかし、AFTERトリガは計画全体が完了した後に発行されますので、AFTERトリガの実行時間は計上されません。また、各トリガ（BEFORE、AFTERのいずれか）で費やされる総時間は別々に表示されます。しかし、遅延制約トリガはトランザクションが終わるまで実行されませんので、EXPLAIN ANALYZEでは考慮されないことに注意してください。

The time shown for the top-level node does not include any time needed to convert the query's output data into displayable form or to send it to the client. While <command>EXPLAIN ANALYZE</command> will never send the data to the client, it can be told to convert the query's output data to displayable form and measure the time needed for that, by specifying the <literal>SERIALIZE</literal> option. That time will be shown separately, and it's also included in the total <literal>Execution time</literal>. 最上位ノードに表示される時間には、問い合わせの出力データを表示可能な形式に変換したり、それをクライアントに送信したりするのに要する時間は含まれません。 EXPLAIN ANALYZEはデータをクライアントに送信することはありませんが、SERIALIZEオプションを指定することで、問い合わせの出力データを表示可能な形式に変換し、そのために必要な時間を測定するように指示できます。その時間は分けて表示され、また、それは合計のExecution timeにも含まれます。

14.1.3. 警告 #

<title>Caveats</title>

There are two significant ways in which run times measured by <command>EXPLAIN ANALYZE</command> can deviate from normal execution of the same query. First, since no output rows are delivered to the client, network transmission costs are not included. I/O conversion costs are not included either unless <literal>SERIALIZE</literal> is specified. Second, the measurement overhead added by <command>EXPLAIN ANALYZE</command> can be significant, especially on machines with slow <function>gettimeofday()</function> operating-system calls. You can use the <xref linkend="pgtesttiming"/> tool to measure the overhead of timing on your system. EXPLAIN ANALYZEにより測定される実行時間が同じ問い合わせを普通に実行する場合と大きくそれる可能性がある、２つの重大な点があります。１つ目は、出力行がクライアントに配信されませんので、ネットワーク転送コストが含まれないことです。 I/O変換のコストも、SERIALIZEが指定されない限り、含まれません。２つ目は、EXPLAIN ANALYZEによって加わる測定オーバーヘッドが大きくなることが、特にgettimeofday()オペレーティングシステムコールが低速なマシンであり得ることです。 pg_test_timingを用いて、使用中のシステムの時間測定にかかるオーバーヘッドを測ることができます。

<command>EXPLAIN</command> results should not be extrapolated to situations much different from the one you are actually testing; for example, results on a toy-sized table cannot be assumed to apply to large tables. The planner's cost estimates are not linear and so it might choose a different plan for a larger or smaller table. An extreme example is that on a table that only occupies one disk page, you'll nearly always get a sequential scan plan whether indexes are available or not. The planner realizes that it's going to take one disk page read to process the table in any case, so there's no value in expending additional page reads to look at an index. (We saw this happening in the <literal>polygon_tbl</literal> example above.) EXPLAINの結果を試験を行ったものと大きく異なる状況の推定に使ってはいけません。例えば、小さなテーブルの結果は、巨大なテーブルに適用できるとは仮定できません。プランナの推定コストは線形ではなく、そのため、テーブルの大小によって異なる計画を選択する可能性があります。極端な例ですが、テーブルが1ディスクページしか占めない場合、インデックスが使用できる、できないに関係なく、ほとんど常にシーケンシャルスキャン計画を得ることになります。プランナは、どのような場合でもテーブルを処理するために1ディスクページ読み取りを行うので、インデックスを参照するための追加的ページ読み取りを行う価値がないことを知っています。（上述のpolygon_tblの例でこれが起こることを示しています。）

There are cases in which the actual and estimated values won't match up well, but nothing is really wrong. One such case occurs when plan node execution is stopped short by a <literal>LIMIT</literal> or similar effect. For example, in the <literal>LIMIT</literal> query we used before, 実際の値と推定値がうまく合わないが本当は間違ったものがない場合があります。こうした状況の１つは、LIMITや同様な効果により計画ノードの実行が短時間で終わる時に起こります。例えば、以前に使用したLIMIT問い合わせでは

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..14.33 rows=2 width=244) (actual time=0.051..0.071 rows=2.00 loops=1)
   Buffers: shared hit=16
   ->  Index Scan using tenk1_unique2 on tenk1  (cost=0.29..70.50 rows=10 width=244) (actual time=0.051..0.070 rows=2.00 loops=1)
         Index Cond: (unique2 > 9000)
         Filter: (unique1 < 100)
         Rows Removed by Filter: 287
         Index Searches: 1
         Buffers: shared hit=16
 Planning Time: 0.077 ms
 Execution Time: 0.086 ms

the estimated cost and row count for the Index Scan node are shown as though it were run to completion. But in reality the Limit node stopped requesting rows after it got two, so the actual row count is only 2 and the run time is less than the cost estimate would suggest. This is not an estimation error, only a discrepancy in the way the estimates and true values are displayed. インデックススキャンノードの推定コストと行数が実行完了したかのように表示されます。しかし現実では、Limitノードが２行を取り出した後に行の要求を停止します。このため実際の行数は２行のみであり、実行時間は提示された推定コストより小さくなります。これは推定間違いではなく、単なる推定値と本当の値を表示する方法における矛盾です。

Merge joins also have measurement artifacts that can confuse the unwary. A merge join will stop reading one input if it's exhausted the other input and the next key value in the one input is greater than the last key value of the other input; in such a case there can be no more matches and so no need to scan the rest of the first input. This results in not reading all of one child, with results like those mentioned for <literal>LIMIT</literal>. Also, if the outer (first) child contains rows with duplicate key values, the inner (second) child is backed up and rescanned for the portion of its rows matching that key value. <command>EXPLAIN ANALYZE</command> counts these repeated emissions of the same inner rows as if they were real additional rows. When there are many outer duplicates, the reported actual row count for the inner child plan node can be significantly larger than the number of rows that are actually in the inner relation. またマージ結合には、注意しないと混乱を招く測定上の乱れがあります。マージ結合は他の入力が使い尽くされ、ある入力の次のキー値が他の入力の最後のキー値より大きい場合、その入力の読み取りを停止します。このような場合、これ以上一致することはあり得ず、最初の入力の残りをスキャンする必要がありません。この結果、子のすべては読み取られず、LIMITの説明のようになります。また、外部（最初）の子が重複するキー値を持つ行を含む場合、内部（２番目）の子はバックアップされ、そのキー値が一致する行部分を再度スキャンされます。 EXPLAIN ANALYZEはこうした繰り返される同じ内部行の排出を実際の追加される行と同様に計上します。外部で多くの重複がある場合、内部の子計画ノードで繰り返される実際の行数は、内部リレーションにおける実際の行数より非常に多くなることがあり得ます。

BitmapAnd and BitmapOr nodes always report their actual row counts as zero, due to implementation limitations. 実装上の制限のため、BitmapAndおよびBitmapOrノードは常に実際の行数をゼロと報告します。

Normally, <command>EXPLAIN</command> will display every plan node created by the planner. However, there are cases where the executor can determine that certain nodes need not be executed because they cannot produce any rows, based on parameter values that were not available at planning time. (Currently this can only happen for child nodes of an Append or MergeAppend node that is scanning a partitioned table.) When this happens, those plan nodes are omitted from the <command>EXPLAIN</command> output and a <literal>Subplans Removed: <replaceable>N</replaceable></literal> annotation appears instead. 通常EXPLAINはプランナが生成したすべてのプランノードを表示します。しかし、プラン時にパラメータ値が入手できずそのノードが行を生成できないために、エグゼキュータがあるノードが実行不要であると判断できるケースがあります。（今の所、これはパーティションテーブルをスキャンしているAppendあるいはMergeAppendノードの子ノードでのみ起きることがあります。）これが起きると、これらのプランノードはEXPLAINの出力から削除され、Subplans Removed: Nという注釈が代わりに表示されます。