PostgreSQL Source Code Walkthrough (98) - Partitioned Tables #4 (Query Routing #1: "Expanding" the Partitioned Table)

When a partitioned table is queried, how does PG determine which partitions to scan, and what mechanism does it rely on? The next several installments cover this step by step; this one is the first part.

0. Implementation Mechanism

Let's start with an example: two ordinary tables, t_normal_1 and t_normal_2, combined with UNION ALL:

drop table if exists t_normal_1;
drop table if exists t_normal_2;
create table t_normal_1 (c1 int not null,c2  varchar(40),c3 varchar(40));
create table t_normal_2 (c1 int not null,c2  varchar(40),c3 varchar(40));

insert into t_normal_1(c1,c2,c3) VALUES(0,'HASH0','HAHS0');
insert into t_normal_2(c1,c2,c3) VALUES(0,'HASH0','HAHS0');

testdb=# explain verbose select * from t_normal_1 where c1 = 0
testdb-# union all
testdb-# select * from t_normal_2 where c1 <> 0;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Append  (cost=0.00..34.00 rows=350 width=200)
   ->  Seq Scan on public.t_normal_1  (cost=0.00..14.38 rows=2 width=200)
         Output: t_normal_1.c1, t_normal_1.c2, t_normal_1.c3
         Filter: (t_normal_1.c1 = 0)
   ->  Seq Scan on public.t_normal_2  (cost=0.00..14.38 rows=348 width=200)
         Output: t_normal_2.c1, t_normal_2.c2, t_normal_2.c3
         Filter: (t_normal_2.c1 <> 0)
(7 rows)

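For a UNION ALL of two ordinary tables, PG uses an Append node to concatenate the result set of the sequential scan on t_normal_1 with the result set of the sequential scan on t_normal_2, and returns the combination as the final result.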

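A query on a partitioned table uses a similar mechanism: the result sets of the individual partitions are appended together and returned as the final result, as the following example shows: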

testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Append  (cost=0.00..30.53 rows=6 width=200)
   ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
         Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
   ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
         Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
(7 rows)

The query on the partitioned table t_hash_partition uses the condition c1 = 1 OR c1 = 2. The plan shows that the result sets of the sequential scans on t_hash_partition_1 and t_hash_partition_3 are appended together and returned as the final result.
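
As a rough illustration of what "appending" means here, the C sketch below drains a list of child scans one after another. The types ToyScan and ToyAppend and the function toy_append_next are hypothetical names invented for this example; they are not PostgreSQL's executor structures, and the real Append node is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; NOT PostgreSQL's executor structures. */
typedef struct ToyScan
{
    const int  *rows;       /* rows produced by this child scan */
    size_t      nrows;      /* number of rows in the array */
    size_t      pos;        /* index of the next row to return */
} ToyScan;

typedef struct ToyAppend
{
    ToyScan    *children;   /* child scans, in plan order */
    size_t      nchildren;
    size_t      current;    /* child currently being drained */
} ToyAppend;

/*
 * Store the next row in *out and return 1, or return 0 once every
 * child scan has been exhausted.
 */
static int
toy_append_next(ToyAppend *node, int *out)
{
    assert(node != NULL && out != NULL);

    while (node->current < node->nchildren)
    {
        ToyScan    *scan = &node->children[node->current];

        if (scan->pos < scan->nrows)
        {
            *out = scan->rows[scan->pos++];
            return 1;
        }
        node->current++;    /* this child is done; move on to the next */
    }
    return 0;
}
```

The caller keeps calling toy_append_next until it returns 0, which yields all rows of the first child followed by all rows of the second, and so on.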

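Several problems have to be solved here:
1. Recognize the partitioned table and find all of its partitions (child tables);
2. Use the query's restriction clauses to identify which partitions actually need to be scanned, which is done for performance reasons;
3. Append the result sets and return them as the final output.
This installment describes how PG recognizes a partitioned table and finds all of its partitions; the function that does this is expand_inherited_tables.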

1. Data Structures

AppendRelInfo
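Append-relation information.
When an inheritable table (partitioned table) or a UNION ALL subquery is expanded into an "append relation" (essentially a list of child RTEs), an AppendRelInfo is built for each child RTE.
The list of AppendRelInfos indicates which child RTEs must be included when expanding the parent, and each node carries all the information needed to translate Vars that reference the parent into Vars that reference that child.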

/*
 * Append-relation info.
 * 
 * When we expand an inheritable table or a UNION-ALL subselect into an
 * "append relation" (essentially, a list of child RTEs), we build an
 * AppendRelInfo for each child RTE.  The list of AppendRelInfos indicates
 * which child RTEs must be included when expanding the parent, and each node
 * carries information needed to translate Vars referencing the parent into
 * Vars referencing that child.
 * 
 * These structs are kept in the PlannerInfo node's append_rel_list.
 * Note that we just throw all the structs into one list, and scan the
 * whole list when desiring to expand any one parent.  We could have used
 * a more complex data structure (eg, one list per parent), but this would
 * be harder to update during operations such as pulling up subqueries,
 * and not really any easier to scan.  Considering that typical queries
 * will not have many different append parents, it doesn't seem worthwhile
 * to complicate things.
 * 
 * Note: after completion of the planner prep phase, any given RTE is an
 * append parent having entries in append_rel_list if and only if its
 * "inh" flag is set.  We clear "inh" for plain tables that turn out not
 * to have inheritance children, and (in an abuse of the original meaning
 * of the flag) we set "inh" for subquery RTEs that turn out to be
 * flattenable UNION ALL queries.  This lets us avoid useless searches
 * of append_rel_list.
 * 
 * Note: the data structure assumes that append-rel members are single
 * baserels.  This is OK for inheritance, but it prevents us from pulling
 * up a UNION ALL member subquery if it contains a join.  While that could
 * be fixed with a more complex data structure, at present there's not much
 * point because no improvement in the plan could result.
 */

typedef struct AppendRelInfo
{
    NodeTag     type;

    /*
     * These fields uniquely identify this append relationship.  There can be
     * (in fact, always should be) multiple AppendRelInfos for the same
     * parent_relid, but never more than one per child_relid, since a given
     * RTE cannot be a child of more than one append parent.
     */
    Index       parent_relid;   /* RT index of append parent rel */
    Index       child_relid;    /* RT index of append child rel */

    /*
     * For an inheritance appendrel, the parent and child are both regular
     * relations, and we store their rowtype OIDs here for use in translating
     * whole-row Vars.  For a UNION-ALL appendrel, the parent and child are
     * both subqueries with no named rowtype, and we store InvalidOid here.
     */
    Oid         parent_reltype; /* OID of parent's composite type */
    Oid         child_reltype;  /* OID of child's composite type */

    /*
     * The N'th element of this list is a Var or expression representing the
     * child column corresponding to the N'th column of the parent. This is
     * used to translate Vars referencing the parent rel into references to
     * the child.  A list element is NULL if it corresponds to a dropped
     * column of the parent (this is only possible for inheritance cases, not
     * UNION ALL).  The list elements are always simple Vars for inheritance
     * cases, but can be arbitrary expressions in UNION ALL cases.
     *
     * Notice we only store entries for user columns (attno > 0).  Whole-row
     * Vars are special-cased, and system columns (attno < 0) need no special
     * translation since their attnos are the same for all tables.
     *
     * Caution: the Vars have varlevelsup = 0.  Be careful to adjust as needed
     * when copying into a subquery.
     */
    List       *translated_vars;    /* Expressions in the child's Vars */

    /*
     * We store the parent table's OID here for inheritance, or InvalidOid for
     * UNION ALL.  This is only needed to help in generating error messages if
     * an attempt is made to reference a dropped parent column.
     */
    Oid         parent_reloid;  /* OID of parent relation */
} AppendRelInfo;
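
To make the role of translated_vars more concrete, here is a minimal sketch of translating a parent column reference into a child column reference. It is a deliberate simplification and an assumption of this example: the real translated_vars is a List of Var or expression nodes, whereas the hypothetical ToyAppendRelInfo below stores plain child attribute numbers, with 0 standing in for the NULL entry of a dropped parent column. ToyVar and toy_translate_var are likewise invented names, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplification of AppendRelInfo; NOT the real struct. */
typedef struct ToyAppendRelInfo
{
    unsigned    parent_relid;       /* RT index of the parent */
    unsigned    child_relid;        /* RT index of the child */
    const int  *translated_attnos;  /* entry N-1 = child attno for parent
                                     * column N; 0 = dropped column */
    int         natts;              /* number of parent user columns */
} ToyAppendRelInfo;

/* Hypothetical stand-in for a Var node. */
typedef struct ToyVar
{
    unsigned    varno;      /* RT index the Var refers to */
    int         varattno;   /* attribute number */
} ToyVar;

/*
 * Rewrite var in place so that it refers to the child instead of the parent.
 * Returns 0 on success (including Vars that do not reference the parent,
 * which are left untouched) and -1 if the column cannot be translated.
 */
static int
toy_translate_var(const ToyAppendRelInfo *info, ToyVar *var)
{
    int         child_attno;

    assert(info != NULL && var != NULL);

    if (var->varno != info->parent_relid)
        return 0;               /* not a reference to the parent */

    if (var->varattno < 0)
    {
        /* System columns have the same attno in every table. */
        var->varno = info->child_relid;
        return 0;
    }

    /* Whole-row references (attno 0) are special-cased in PostgreSQL;
     * this toy simply reports them as unsupported. */
    if (var->varattno == 0 || var->varattno > info->natts)
        return -1;

    child_attno = info->translated_attnos[var->varattno - 1];
    if (child_attno == 0)
        return -1;              /* parent column was dropped */

    var->varno = info->child_relid;
    var->varattno = child_attno;
    return 0;
}
```

With a mapping of {2, 0, 1}, a reference to parent column 1 becomes child column 2, while a reference to parent column 2 is rejected because that column was dropped.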

PlannerInfo
This data structure holds the per-query information used during planning/optimization.

/*----------
 * PlannerInfo
 *      Per-query information for planning/optimization
 * 
 * This struct is conventionally called "root" in all the planner routines.
 * It holds links to all of the planner's working state, in addition to the
 * original Query.  Note that at present the planner extensively modifies
 * the passed-in Query data structure; someday that should stop.
 *----------
 */
struct AppendRelInfo;

typedef struct PlannerInfo
{
    NodeTag     type;
    Query      *parse;          /* the Query being planned */
    PlannerGlobal *glob;        /* global info for current planner run */
    Index       query_level;    /* 1 at the outermost Query */
    struct PlannerInfo *parent_root;    /* NULL at outermost Query */

    /*
     * plan_params contains the expressions that this query level needs to
     * make available to a lower query level that is currently being planned.
     * outer_params contains the paramIds of PARAM_EXEC Params that outer
     * query levels will make available to this query level.
     */
    List       *plan_params;    /* list of PlannerParamItems, see below */
    Bitmapset  *outer_params;

    /*
     * simple_rel_array holds pointers to "base rels" and "other rels" (see
     * comments for RelOptInfo for more info).  It is indexed by rangetable
     * index (so entry 0 is always wasted).  Entries can be NULL when an RTE
     * does not correspond to a base relation, such as a join RTE or an
     * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
     */
    // Array of RelOptInfos for the "base rels" (base tables, subqueries, etc.).
    // It parallels the range table and is indexed from 1, so entry [0] is unused.
    struct RelOptInfo **simple_rel_array;   /* All 1-rel RelOptInfos */
    int         simple_rel_array_size;  /* allocated size of array */

    /*
     * simple_rte_array is the same length as simple_rel_array and holds
     * pointers to the associated rangetable entries.  This lets us avoid
     * rt_fetch(), which can be a bit slow once large inheritance sets have
     * been expanded.
     */
    RangeTblEntry **simple_rte_array;   /* rangetable as an array */

    /*
     * append_rel_array is the same length as the above arrays, and holds
     * pointers to the corresponding AppendRelInfo entry indexed by
     * child_relid, or NULL if none.  The array itself is not allocated if
     * append_rel_list is empty.
     */
    // Used when handling set operations such as UNION ALL and when handling partitioned tables
    struct AppendRelInfo **append_rel_array;

    /*
     * all_baserels is a Relids set of all base relids (but not "other"
     * relids) in the query; that is, the Relids identifier of the final join
     * we need to form.  This is computed in make_one_rel, just before we
     * start making Paths.
     */
    Relids      all_baserels;

    /*
     * nullable_baserels is a Relids set of base relids that are nullable by
     * some outer join in the jointree; these are rels that are potentially
     * nullable below the WHERE clause, SELECT targetlist, etc.  This is
     * computed in deconstruct_jointree.
     */
    Relids      nullable_baserels;

    /*
     * join_rel_list is a list of all join-relation RelOptInfos we have
     * considered in this planning run.  For small problems we just scan the
     * list to do lookups, but when there are many join relations we build a
     * hash table for faster lookups.  The hash table is present and valid
     * when join_rel_hash is not NULL.  Note that we still maintain the list
     * even when using the hash table for lookups; this simplifies life for
     * GEQO.
     */
    List       *join_rel_list;  /* list of join-relation RelOptInfos */
    struct HTAB *join_rel_hash; /* optional hashtable for join relations */

    /*
     * When doing a dynamic-programming-style join search, join_rel_level[k]
     * is a list of all join-relation RelOptInfos of level k, and
     * join_cur_level is the current level.  New join-relation RelOptInfos are
     * automatically added to the join_rel_level[join_cur_level] list.
     * join_rel_level is NULL if not in use.
     */
    List      **join_rel_level; /* lists of join-relation RelOptInfos */
    int         join_cur_level; /* index of list being extended */
    List       *init_plans;     /* init SubPlans for query */
    List       *cte_plan_ids;   /* per-CTE-item list of subplan IDs */
    List       *multiexpr_params;   /* List of Lists of Params for MULTIEXPR
                                     * subquery outputs */
    List       *eq_classes;     /* list of active EquivalenceClasses */
    List       *canon_pathkeys; /* list of "canonical" PathKeys */
    List       *left_join_clauses;  /* list of RestrictInfos for mergejoinable
                                     * outer join clauses w/nonnullable var on
                                     * left */
    List       *right_join_clauses; /* list of RestrictInfos for mergejoinable
                                     * outer join clauses w/nonnullable var on
                                     * right */
    List       *full_join_clauses;  /* list of RestrictInfos for mergejoinable
                                     * full join clauses */
    List       *join_info_list; /* list of SpecialJoinInfos */
    List       *append_rel_list;    /* list of AppendRelInfos */
    List       *rowMarks;       /* list of PlanRowMarks */
    List       *placeholder_list;   /* list of PlaceHolderInfos */
    List       *fkey_list;      /* list of ForeignKeyOptInfos */
    List       *query_pathkeys; /* desired pathkeys for query_planner() */
    List       *group_pathkeys; /* groupClause pathkeys, if any */
    List       *window_pathkeys;    /* pathkeys of bottom window, if any */
    List       *distinct_pathkeys;  /* distinctClause pathkeys, if any */
    List       *sort_pathkeys;  /* sortClause pathkeys, if any */
    List       *part_schemes;   /* Canonicalised partition schemes used in the
                                 * query. */
    List       *initial_rels;   /* RelOptInfos we are now trying to join */

    /* Use fetch_upper_rel() to get any particular upper rel */
    List       *upper_rels[UPPERREL_FINAL + 1]; /*  upper-rel RelOptInfos */

    /* Result tlists chosen by grouping_planner for upper-stage processing */
    struct PathTarget *upper_targets[UPPERREL_FINAL + 1];

    /*
     * grouping_planner passes back its final processed targetlist here, for
     * use in relabeling the topmost tlist of the finished Plan.
     */
    List       *processed_tlist;

    /* Fields filled during create_plan() for use in setrefs.c */
    AttrNumber *grouping_map;   /* for GroupingFunc fixup */
    List       *minmax_aggs;    /* List of MinMaxAggInfos */
    MemoryContext planner_cxt;  /* context holding PlannerInfo */
    double      total_table_pages;  /* # of pages in all tables of query */
    double      tuple_fraction; /* tuple_fraction passed to query_planner */
    double      limit_tuples;   /* limit_tuples passed to query_planner */
    Index       qual_security_level;    /* minimum security_level for quals */
    /* Note: qual_security_level is zero if there are no securityQuals */

    InheritanceKind inhTargetKind;  /* indicates if the target relation is an
                                     * inheritance child or partition or a
                                     * partitioned table */
    bool        hasJoinRTEs;    /* true if any RTEs are RTE_JOIN kind */
    bool        hasLateralRTEs; /* true if any RTEs are marked LATERAL */
    bool        hasDeletedRTEs; /* true if any RTE was deleted from jointree */
    bool        hasHavingQual;  /* true if havingQual was non-null */
    bool        hasPseudoConstantQuals; /* true if any RestrictInfo has
                                         * pseudoconstant = true */
    bool        hasRecursion;   /* true if planning a recursive WITH item */

    /* These fields are used only when hasRecursion is true: */
    int         wt_param_id;    /* PARAM_EXEC ID for the work table */
    struct Path *non_recursive_path;    /* a path for non-recursive term */

    /* These fields are workspace for createplan.c */
    Relids      curOuterRels;   /* outer rels above current node */
    List       *curOuterParams; /* not-yet-assigned NestLoopParams */

    /* optional private data for join_search_hook, e.g., GEQO */
    void       *join_search_private;

    /* Does this query modify any partition key columns? */
    bool        partColsUpdated;
} PlannerInfo;
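
The comment on append_rel_array describes an array that is indexed by child_relid and holds NULL where a range table entry is not an append child. The sketch below shows one way such an array could be derived from a flat list of append-relation entries. It is only an illustration under stated assumptions: ToyAppendRelInfo and toy_build_append_rel_array are hypothetical names, the list is modeled as a plain C array, and plain calloc is used where the planner would use its own memory contexts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for AppendRelInfo. */
typedef struct ToyAppendRelInfo
{
    unsigned    parent_relid;   /* RT index of the append parent */
    unsigned    child_relid;    /* RT index of the append child */
} ToyAppendRelInfo;

/*
 * Build an array indexed by child_relid.  Slot 0 is never used because
 * range table indexes start at 1.  Returns NULL when the list is empty
 * (mirroring "the array itself is not allocated if append_rel_list is
 * empty") or when allocation fails.  The caller frees the result.
 */
static const ToyAppendRelInfo **
toy_build_append_rel_array(const ToyAppendRelInfo *list, size_t nlist,
                           size_t array_size)
{
    const ToyAppendRelInfo **arr;
    size_t      i;

    if (nlist == 0)
        return NULL;

    arr = calloc(array_size, sizeof(*arr));
    if (arr == NULL)
        return NULL;

    for (i = 0; i < nlist; i++)
    {
        unsigned    child = list[i].child_relid;

        assert(child > 0 && child < array_size);
        assert(arr[child] == NULL);     /* at most one entry per child */
        arr[child] = &list[i];
    }
    return arr;
}
```

Looking up the entry for a given child then costs a single array access instead of a scan over the whole list.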

2. Source Code Walkthrough

The expand_inherited_tables function expands each range table entry that represents an inheritance set into an "append relation".

/*
 * expand_inherited_tables
 *      Expand each rangetable entry that represents an inheritance set
 *      into an "append relation".  At the conclusion of this process,
 *      the "inh" flag is set in all and only those RTEs that are append
 *      relation parents.
 */
void
expand_inherited_tables(PlannerInfo *root)
{
    Index       nrtes;
    Index       rti;
    ListCell   *rl;

    /*
     * expand_inherited_rtentry may add RTEs to parse->rtable. The function is
     * expected to recursively handle any RTEs that it creates with inh=true.
     * So just scan as far as the original end of the rtable list.
     */
    nrtes = list_length(root->parse->rtable);
    rl = list_head(root->parse->rtable);
    for (rti = 1; rti <= nrtes; rti++)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);

        expand_inherited_rtentry(root, rte, rti);
        rl = lnext(rl);
    }
}
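
The loop above records the length of the range table before it starts, because expanding an entry appends new entries to the same list. The toy model below isolates that pattern. Everything in it is hypothetical (ToyRte, ToyRangeTable, toy_expand_one, toy_expand_all), and it assumes for simplicity that appended children are always leaves, whereas the real code expects expand_inherited_rtentry to handle any inh = true children recursively.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RTES 16

/* Hypothetical toy range table entry. */
typedef struct ToyRte
{
    int         inh;        /* 1 if the entry may be an inheritance parent */
    int         nchildren;  /* number of children to append when expanded */
} ToyRte;

/* Hypothetical fixed-capacity range table. */
typedef struct ToyRangeTable
{
    ToyRte      rtes[TOY_MAX_RTES];
    size_t      len;
} ToyRangeTable;

/* Expand one entry: append a leaf entry per child, or clear inh if childless. */
static void
toy_expand_one(ToyRangeTable *rt, size_t idx)
{
    ToyRte     *rte = &rt->rtes[idx];
    int         i;

    if (!rte->inh)
        return;
    if (rte->nchildren == 0)
    {
        rte->inh = 0;       /* childless: not an inheritance set after all */
        return;
    }
    for (i = 0; i < rte->nchildren; i++)
    {
        assert(rt->len < TOY_MAX_RTES);
        rt->rtes[rt->len].inh = 0;
        rt->rtes[rt->len].nchildren = 0;
        rt->len++;
    }
}

/* Visit only the entries that existed before expansion began. */
static void
toy_expand_all(ToyRangeTable *rt)
{
    size_t      original_len = rt->len;     /* snapshot before appending */
    size_t      i;

    for (i = 0; i < original_len; i++)
        toy_expand_one(rt, i);
}
```

Had the loop compared against rt->len directly, it would also walk the entries it had just appended, which is the extra work the original comment says can be skipped.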

/*
 * expand_inherited_rtentry
 *      Check whether a rangetable entry represents an inheritance set.
 *      If so, add entries for all the child tables to the query's
 *      rangetable, and build AppendRelInfo nodes for all the child tables
 *      and add them to root->append_rel_list.  If not, clear the entry's
 *      "inh" flag to prevent later code from looking for AppendRelInfos.
 *
 * Note that the original RTE is considered to represent the whole
 * inheritance set.  The first of the generated RTEs is an RTE for the same
 * table, but with inh = false, to represent the parent table in its role
 * as a simple member of the inheritance set.
 *
 * A childless table is never considered to be an inheritance set. For
 * regular inheritance, a parent RTE must always have at least two associated
 * AppendRelInfos: one corresponding to the parent table as a simple member of
 * inheritance set and one or more corresponding to the actual children.
 * Since a partitioned table is not scanned, it might have only one associated
 * AppendRelInfo.
 */
static void
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
    Oid         parentOID;
    PlanRowMark *oldrc;
    Relation    oldrelation;
    LOCKMODE    lockmode;
    List       *inhOIDs;
    ListCell   *l;

    /* Does RT entry allow inheritance? */
    if (!rte->inh)
        return;
    /* Ignore any already-expanded UNION ALL nodes */
    if (rte->rtekind != RTE_RELATION)
    {
        Assert(rte->rtekind == RTE_SUBQUERY);
        return;
    }
    /* Fast path for common case of childless table */
    parentOID = rte->relid;
    if (!has_subclass(parentOID))
    {
        /* Clear flag before returning */
        rte->inh = false;
        return;
    }

    /*
     * The rewriter should already have obtained an appropriate lock on each
     * relation named in the query.  However, for each child relation we add
     * to the query, we must obtain an appropriate lock, because this will be
     * the first use of those relations in the parse/rewrite/plan pipeline.
     * Child rels should use the same lockmode as their parent.
     */
    lockmode = rte->rellockmode;

    /* Scan for all members of inheritance set, acquire needed locks */
    inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);

    /*
     * Check that there's at least one descendant, else treat as no-child
     * case.  This could happen despite above has_subclass() check, if table
     * once had a child but no longer does.
     */
    if (list_length(inhOIDs) < 2)
    {
        /* Clear flag before returning */
        rte->inh = false;
        return;
    }

    /*
     * If parent relation is selected FOR UPDATE/SHARE, we need to mark its
     * PlanRowMark as isParent = true, and generate a new PlanRowMark for each
     * child.
     */
    oldrc = get_plan_rowmark(root->rowMarks, rti);
    if (oldrc)
        oldrc->isParent = true;

    /*
     * Must open the parent relation to examine its tupdesc.  We need not lock
     * it; we assume the rewriter already did.
     */
    oldrelation = heap_open(parentOID, NoLock);

    /* Scan the inheritance set and expand it */
    if (RelationGetPartitionDesc(oldrelation) != NULL)
    {
        Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);

        /*
         * If this table has partitions, recursively expand them in the order
         * in which they appear in the PartitionDesc.  While at it, also
         * extract the partition key columns of all the partitioned tables.
         */
        expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
                                   lockmode, &root->append_rel_list);
    }
    else
    {
        // No partition descriptor was found, so this is plain inheritance
        List       *appinfos = NIL;
        RangeTblEntry *childrte;
        Index       childRTindex;

        /*
         * This table has no partitions.  Expand any plain inheritance
         * children in the order the OIDs were returned by
         * find_all_inheritors.
         */
        foreach(l, inhOIDs)
        {
            Oid         childOID = lfirst_oid(l);
            Relation    newrelation;

            /* Open rel if needed; we already have required locks */
            if (childOID != parentOID)
                newrelation = heap_open(childOID, NoLock);
            else
                newrelation = oldrelation;

            /*
             * It is possible that the parent table has children that are temp
             * tables of other backends.  We cannot safely access such tables
             * (because of buffering issues), and the best thing to do seems
             * to be to silently ignore them.
             */
            if (childOID != parentOID && RELATION_IS_OTHER_TEMP(newrelation))
            {
                heap_close(newrelation, lockmode);
                continue;
            }

            expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc,
                                            newrelation,
                                            &appinfos, &childrte,
                                            &childRTindex);

            /* Close child relations, but keep locks */
            if (childOID != parentOID)
                heap_close(newrelation, NoLock);
        }

        /*
         * If all the children were temp tables, pretend it's a
         * non-inheritance situation; we don't need Append node in that case.
         * The duplicate RTE we added for the parent table is harmless, so we
         * don't bother to get rid of it; ditto for the useless PlanRowMark
         * node.
         */
        if (list_length(appinfos) < 2)
            rte->inh = false;
        else
            root->append_rel_list = list_concat(root->append_rel_list,
                                                appinfos);

    }

    heap_close(oldrelation, NoLock);
}
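
For a partitioned parent, the branch above hands off to expand_partitioned_rtentry, which visits the partitioned table itself first and then each partition in PartitionDesc order, recursing into partitions that are themselves partitioned. The toy model below shows only that depth-first visiting order. It is a sketch under stated assumptions: ToyRel and toy_expand_partitioned are hypothetical names, a relation counts as partitioned when nparts > 0 (the real code checks relkind), and each relation is recorded once, whereas the real function also creates a separate child entry for a sub-partitioned table before recursing into it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PARTS 4

/* Hypothetical toy relation: an id plus an ordered list of partitions. */
typedef struct ToyRel
{
    int         id;
    int         nparts;     /* > 0 only for partitioned tables */
    const struct ToyRel *parts[TOY_MAX_PARTS];
} ToyRel;

/*
 * Record relation ids in out[] in the order they are visited, starting at
 * position n.  Returns the new number of recorded ids.
 */
static size_t
toy_expand_partitioned(const ToyRel *parent, int *out, size_t n, size_t cap)
{
    int         i;

    assert(n < cap);
    out[n++] = parent->id;      /* the partitioned table itself comes first */

    for (i = 0; i < parent->nparts; i++)
    {
        const ToyRel *child = parent->parts[i];

        if (child->nparts > 0)
        {
            /* sub-partitioned child: recurse before the next sibling */
            n = toy_expand_partitioned(child, out, n, cap);
        }
        else
        {
            assert(n < cap);
            out[n++] = child->id;   /* leaf partition */
        }
    }
    return n;
}
```

Because the recursion happens before moving to the next sibling, all descendants of a sub-partitioned table are recorded contiguously, right after that table.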

/*
 * expand_partitioned_rtentry
 *      Recursively expand an RTE for a partitioned table.
 */
static void
expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
                           Index parentRTindex, Relation parentrel,
                           PlanRowMark *top_parentrc, LOCKMODE lockmode,
                           List **appinfos)
{
    int         i;
    RangeTblEntry *childrte;
    Index       childRTindex;
    PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

    check_stack_depth();

    /* A partitioned table should always have a partition descriptor. */
    Assert(partdesc);

    Assert(parentrte->inh);

    /*
     * Note down whether any partition key cols are being updated. Though it's
     * the root partitioned table's updatedCols we are interested in, we
     * instead use parentrte to get the updatedCols. This is convenient
     * because parentrte already has the root partrel's updatedCols translated
     * to match the attribute ordering of parentrel.
     */
    if (!root->partColsUpdated)
        root->partColsUpdated =
            has_partition_attrs(parentrel, parentrte->updatedCols, NULL);

    /* First expand the partitioned table itself. */
    expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,
                                    top_parentrc, parentrel,
                                    appinfos, &childrte, &childRTindex);

    /*
     * If the partitioned table has no partitions, treat this as the
     * non-inheritance case.
     */
    if (partdesc->nparts == 0)
    {
        parentrte->inh = false;
        return;
    }

    for (i = 0; i < partdesc->nparts; i++)
    {
        Oid         childOID = partdesc->oids[i];
        Relation    childrel;

        /* Open rel; we already have required locks */
        childrel = heap_open(childOID, NoLock);

        /*
         * Temporary partitions belonging to other sessions should have been
         * disallowed at definition, but for paranoia's sake, let's double
         * check.
         */
        if (RELATION_IS_OTHER_TEMP(childrel))
            elog(ERROR, "temporary relation from another session found as partition");
        /* Expand this partition. */
        expand_single_inheritance_child(root, parentrte, parentRTindex,
                                        parentrel, top_parentrc, childrel,
                                        appinfos, &childrte, &childRTindex);

        /* If this child is itself partitioned, recurse */
        if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
            expand_partitioned_rtentry(root, childrte, childRTindex,
                                       childrel, top_parentrc, lockmode,
                                       appinfos);

        /* Close child relation, but keep locks */
        heap_close(childrel, NoLock);
    }
}


/*
 * expand_single_inheritance_child
 *      Build a RangeTblEntry and an AppendRelInfo, if appropriate, plus
 *      maybe a PlanRowMark.
 *
 * We now expand the partition hierarchy level by level, creating a
 * corresponding hierarchy of AppendRelInfos and RelOptInfos, where each
 * partitioned descendant acts as a parent of its immediate partitions.
 * (This is a difference from what older versions of PostgreSQL did and what
 * is still done in the case of table inheritance for unpartitioned tables,
 * where the hierarchy is flattened during RTE expansion.)
 *
 * PlanRowMarks still carry the top-parent's RTI, and the top-parent's
 * allMarkTypes field still accumulates values from all descendents.
 * 
 * "parentrte" and "parentRTindex" are immediate parent's RTE and
 * RTI. "top_parentrc" is top parent's PlanRowMark.
 *
 * The child RangeTblEntry and its RTI are returned in "childrte_p" and
 * "childRTindex_p" resp.
 */
static void
expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte,
                                Index parentRTindex, Relation parentrel,
                                PlanRowMark *top_parentrc, Relation childrel,
                                List **appinfos, RangeTblEntry **childrte_p,
                                Index *childRTindex_p)
{
    Query      *parse = root->parse;
    Oid         parentOID = RelationGetRelid(parentrel);    // OID of the parent relation
    Oid         childOID = RelationGetRelid(childrel);      // OID of the child relation
    RangeTblEntry *childrte;
    Index       childRTindex;
    AppendRelInfo *appinfo;

    /*
     * Build an RTE for the child, and attach to query's rangetable list. We
     * copy most fields of the parent's RTE, but replace relation OID and
     * relkind, and set inh = false.  Also, set requiredPerms to zero since
     * all required permissions checks are done on the original RTE. Likewise,
     * set the child's securityQuals to empty, because we only want to apply
     * the parent's RLS conditions regardless of what RLS properties
     * individual children may have.  (This is an intentional choice to make
     * inherited RLS work like regular permissions checks.) The parent
     * securityQuals will be propagated to children along with other base
     * restriction clauses, so we don't need to do it here.
     */
    childrte = copyObject(parentrte);
    *childrte_p = childrte;
    childrte->relid = childOID;
    childrte->relkind = childrel->rd_rel->relkind;
    /* A partitioned child will need to be expanded further. */
    if (childOID != parentOID &&
        childrte->relkind == RELKIND_PARTITIONED_TABLE)
        childrte->inh = true;
    else
        childrte->inh = false;
    childrte->requiredPerms = 0;
    childrte->securityQuals = NIL;
    parse->rtable = lappend(parse->rtable, childrte);
    childRTindex = list_length(parse->rtable);
    *childRTindex_p = childRTindex;

    /*
     * We need an AppendRelInfo if paths will be built for the child RTE. If
     * childrte->inh is true, then we'll always need to generate append paths
     * for it.  If childrte->inh is false, we must scan it if it's not a
     * partitioned table; but if it is a partitioned table, then it never has
     * any data of its own and need not be scanned.
     */
    if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
    {
        appinfo = makeNode(AppendRelInfo);
        appinfo->parent_relid = parentRTindex;
        appinfo->child_relid = childRTindex;
        appinfo->parent_reltype = parentrel->rd_rel->reltype;
        appinfo->child_reltype = childrel->rd_rel->reltype;
        make_inh_translation_list(parentrel, childrel, childRTindex,
                                  &appinfo->translated_vars);
        appinfo->parent_reloid = parentOID;
        *appinfos = lappend(*appinfos, appinfo);

        /*
         * Translate the column permissions bitmaps to the child's attnums (we
         * have to build the translated_vars list before we can do this). But
         * if this is the parent table, leave copyObject's result alone.
         *
         * Note: we need to do this even though the executor won't run any
         * permissions checks on the child RTE.  The insertedCols/updatedCols
         * bitmaps may be examined for trigger-firing purposes.
         */
        if (childOID != parentOID)
        {
            childrte->selectedCols = translate_col_privs(parentrte->selectedCols,
                                                         appinfo->translated_vars);
            childrte->insertedCols = translate_col_privs(parentrte->insertedCols,
                                                         appinfo->translated_vars);
            childrte->updatedCols = translate_col_privs(parentrte->updatedCols,
                                                        appinfo->translated_vars);
        }
    }

    /*
     * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE.
     */
    if (top_parentrc)
    {
        PlanRowMark *childrc = makeNode(PlanRowMark);

        childrc->rti = childRTindex;
        childrc->prti = top_parentrc->rti;
        childrc->rowmarkId = top_parentrc->rowmarkId;
        /* Reselect rowmark type, because relkind might not match parent */
        childrc->markType = select_rowmark_type(childrte,
                                                top_parentrc->strength);
        childrc->allMarkTypes = (1 << childrc->markType);
        childrc->strength = top_parentrc->strength;
        childrc->waitPolicy = top_parentrc->waitPolicy;

        /*
         * We mark RowMarks for partitioned child tables as parent RowMarks so
         * that the executor ignores them (except their existence means that
         * the child tables be locked using appropriate mode).
         */
        childrc->isParent = (childrte->relkind == RELKIND_PARTITIONED_TABLE);

        /* Include child's rowmark type in top parent's allMarkTypes */
        top_parentrc->allMarkTypes |= childrc->allMarkTypes;

        root->rowMarks = lappend(root->rowMarks, childrc);
    }
}
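
The two functions above can be modeled in a few dozen lines. What follows is a minimal sketch, not PostgreSQL code: every Toy* type and toy_* function is hypothetical, and locking, RLS handling, column-privilege translation and row marks are left out. It only shows how range-table indexes are handed out and when an AppendRelInfo is created. The parent OID 16986 and the first partition's OID 16989 are taken from the trace in this article; the other five OIDs are made up.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RTES 32

/* 'r' and 'p' are the real pg_class.relkind codes; the macro names are local. */
#define TOY_RELKIND_RELATION     'r'
#define TOY_RELKIND_PARTITIONED  'p'

typedef struct ToyRel {
    unsigned              oid;
    char                  relkind;
    int                   nparts;
    const struct ToyRel **parts;    /* direct partitions, NULL for a leaf */
} ToyRel;

typedef struct ToyRTE { unsigned relid; char relkind; int inh; } ToyRTE;
typedef struct ToyAppInfo { int parent_relid; int child_relid; } ToyAppInfo;

typedef struct ToyPlanner {
    ToyRTE     rtable[MAX_RTES];    /* slot i holds RT index i + 1 */
    int        nrtes;
    ToyAppInfo appinfos[MAX_RTES];
    int        nappinfos;
} ToyPlanner;

/* Models expand_single_inheritance_child: one new RTE, maybe one AppendRelInfo. */
static int
toy_expand_child(ToyPlanner *p, int parent_rti,
                 const ToyRel *parent, const ToyRel *child)
{
    ToyRTE *rte;
    int     child_rti;

    assert(p->nrtes < MAX_RTES);
    rte = &p->rtable[p->nrtes++];
    rte->relid = child->oid;
    rte->relkind = child->relkind;
    /* Only a partitioned child other than the parent itself is expanded further. */
    rte->inh = (child->oid != parent->oid &&
                child->relkind == TOY_RELKIND_PARTITIONED);
    child_rti = p->nrtes;           /* RT indexes are 1-based */

    /* A partitioned table with inh = false holds no rows, so it is not scanned. */
    if (rte->relkind != TOY_RELKIND_PARTITIONED || rte->inh)
    {
        ToyAppInfo *ai = &p->appinfos[p->nappinfos++];

        ai->parent_relid = parent_rti;
        ai->child_relid = child_rti;
    }
    return child_rti;
}

/* Models expand_partitioned_rtentry: the parent first, then each partition. */
static void
toy_expand_partitioned(ToyPlanner *p, int parent_rti, const ToyRel *parent)
{
    int i;

    toy_expand_child(p, parent_rti, parent, parent);
    for (i = 0; i < parent->nparts; i++)
    {
        const ToyRel *child = parent->parts[i];
        int           child_rti = toy_expand_child(p, parent_rti, parent, child);

        if (child->relkind == TOY_RELKIND_PARTITIONED)
            toy_expand_partitioned(p, child_rti, child);
    }
}

/* One partitioned parent at RT index 1 with six leaf partitions. */
const ToyPlanner *
toy_demo(void)
{
    static ToyRel        leaves[6];
    static const ToyRel *parts[6];
    static ToyRel        parent;
    static ToyPlanner    planner;
    int                  i;

    for (i = 0; i < 6; i++)
    {
        leaves[i].oid = 16989u + 3u * (unsigned) i;  /* only 16989 is from the trace */
        leaves[i].relkind = TOY_RELKIND_RELATION;
        leaves[i].nparts = 0;
        leaves[i].parts = NULL;
        parts[i] = &leaves[i];
    }
    parent.oid = 16986u;
    parent.relkind = TOY_RELKIND_PARTITIONED;
    parent.nparts = 6;
    parent.parts = parts;

    planner.nrtes = 0;
    planner.nappinfos = 0;
    planner.rtable[planner.nrtes++] = (ToyRTE){16986u, TOY_RELKIND_PARTITIONED, 1};
    toy_expand_partitioned(&planner, 1, &parent);
    return &planner;
}
```

Calling toy_demo() leaves eight range-table entries and six AppendRelInfos. RT index 2 is the duplicate RTE of the parent (inh = false, no AppendRelInfo), so the first partition gets child_relid = 3, which is the value the gdb session prints for appinfo.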

3. Trace Analysis

The test query and its plan are as follows:

testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Append  (cost=0.00..30.53 rows=6 width=200)
   ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
         Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
   ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
         Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
(7 rows)

Start gdb and set a breakpoint:

(gdb) b expand_inherited_tables
Breakpoint 1 at 0x7e53ba: file prepunion.c, line 1483.
(gdb) c
Continuing.

Breakpoint 1, expand_inherited_tables (root=0x28fcdc8) at prepunion.c:1483
1483        nrtes = list_length(root->parse->rtable);

Get the number of RTEs and the head of the range table list:

(gdb) n
1484        rl = list_head(root->parse->rtable);
(gdb) 
1485        for (rti = 1; rti <= nrtes; rti++)
(gdb) p nrtes
$1 = 1
(gdb) p *rl
$2 = {data = {ptr_value = 0x28d83d0, int_value = 42828752, oid_value = 42828752}, next = 0x0}
(gdb) 

Process the RTEs in a loop:

(gdb) n
1487            RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
(gdb) 
1489            expand_inherited_rtentry(root, rte, rti);
(gdb) p *rte
$3 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28d84e8, lateral = false, 
  inh = true, inFromCl = true, requiredPerms = 2, checkAsUser = 0, selectedCols = 0x28d8c40, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
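
gdb prints a char field as its decimal code followed by the character, so relkind = 112 'p' marks a partitioned table, and the 114 'r' that shows up for the partitions later in the trace marks an ordinary table. A small sketch of that decoding; the sketch_* names are hypothetical, only the two character codes are PostgreSQL's (RELKIND_PARTITIONED_TABLE and RELKIND_RELATION).

```c
#include <assert.h>
#include <string.h>

/* Same character codes as PostgreSQL's pg_class.relkind; names are local. */
enum {
    SKETCH_RELKIND_RELATION = 'r',
    SKETCH_RELKIND_PARTITIONED_TABLE = 'p'
};

/* Turn a relkind code, as printed by gdb, into a readable label. */
const char *
sketch_relkind_name(char relkind)
{
    switch (relkind)
    {
        case SKETCH_RELKIND_RELATION:
            return "ordinary table";
        case SKETCH_RELKIND_PARTITIONED_TABLE:
            return "partitioned table";
        default:
            return "other";
    }
}

/* The two decimal/character pairs that appear in the trace (ASCII). */
void
sketch_check_codes(void)
{
    assert(SKETCH_RELKIND_PARTITIONED_TABLE == 112);
    assert(SKETCH_RELKIND_RELATION == 114);
}
```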

Step into expand_inherited_rtentry:

(gdb) step
expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:1517
1517        Query      *parse = root->parse;

expand_inherited_rtentry -> the inh flag of the partitioned table's RTE is true

1526        if (!rte->inh)
(gdb) p rte->inh
$4 = true

expand_inherited_rtentry -> run the preliminary checks

(gdb) n
1529        if (rte->rtekind != RTE_RELATION)
(gdb) p rte->rtekind
$5 = RTE_RELATION
(gdb) n
1535        parentOID = rte->relid;
(gdb) 
1536        if (!has_subclass(parentOID))
(gdb) p parentOID
$6 = 16986
(gdb) n
1556        oldrc = get_plan_rowmark(root->rowMarks, rti);
(gdb) 
1557        if (rti == parse->resultRelation)
(gdb) p *oldrc
Cannot access memory at address 0x0
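
oldrc is NULL here because the test query has no FOR UPDATE/SHARE clause, so top_parentrc stays NULL further down and no PlanRowMark is built for any child. For a query that does lock rows, the row-mark branch of expand_single_inheritance_child folds every child's mark type into the top parent's allMarkTypes bitmask. A minimal sketch of that bookkeeping, assuming mark types are small enum values used as bit positions; the Sk* names are hypothetical, the enum only mirrors the order of PostgreSQL's RowMarkType, and the child's mark type is passed in instead of being computed by select_rowmark_type.

```c
#include <assert.h>
#include <stddef.h>

/* Modeled on PostgreSQL's RowMarkType; names and layout are local to this sketch. */
typedef enum SkRowMarkType {
    SK_ROW_MARK_EXCLUSIVE,
    SK_ROW_MARK_NOKEYEXCLUSIVE,
    SK_ROW_MARK_SHARE,
    SK_ROW_MARK_KEYSHARE,
    SK_ROW_MARK_REFERENCE,
    SK_ROW_MARK_COPY
} SkRowMarkType;

typedef struct SkRowMark {
    int           rti;          /* RT index of the marked relation */
    int           prti;         /* RT index of the top parent */
    SkRowMarkType markType;
    unsigned      allMarkTypes; /* OR of (1 << markType) over all members */
} SkRowMark;

/* Build a child row mark and record its type in the top parent's bitmask. */
SkRowMark
sk_make_child_rowmark(SkRowMark *top, int child_rti, SkRowMarkType child_type)
{
    SkRowMark child;

    assert(top != NULL);
    child.rti = child_rti;
    child.prti = top->rti;      /* children keep pointing at the top parent */
    child.markType = child_type;
    child.allMarkTypes = 1u << child_type;
    top->allMarkTypes |= child.allMarkTypes;
    return child;
}

/* Hypothetical scenario: one child locked exclusively, one only copied. */
unsigned
sk_demo_all_mark_types(void)
{
    SkRowMark top = {1, 1, SK_ROW_MARK_EXCLUSIVE, 1u << SK_ROW_MARK_EXCLUSIVE};

    (void) sk_make_child_rowmark(&top, 3, SK_ROW_MARK_EXCLUSIVE);
    (void) sk_make_child_rowmark(&top, 4, SK_ROW_MARK_COPY);
    return top.allMarkTypes;
}
```

In the demo the top parent ends up with the bits for both mark types set, so later stages can tell from the parent alone which kinds of marks exist among its children.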

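expand_inherited_rtentry -> scan all members of the inheritance set, acquire the required locks, and build the list of OIDs (here 7 entries: the parent plus its 6 partitions)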

(gdb) n
1559        else if (oldrc && RowMarkRequiresRowShareLock(oldrc->markType))
(gdb) 
1562            lockmode = AccessShareLock;
(gdb) 
1565        inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
(gdb) 
1572        if (list_length(inhOIDs) < 2)
(gdb) p inhOIDs
$7 = (List *) 0x28fd208
(gdb) p *inhOIDs
$8 = {type = T_OidList, length = 7, head = 0x28fd1e0, tail = 0x28fd778}
(gdb) 

expand_inherited_rtentry -> open the relation

(gdb) n
1584        if (oldrc)
(gdb) 
1591        oldrelation = heap_open(parentOID, NoLock);

expand_inherited_rtentry -> the partition descriptor is available, so call expand_partitioned_rtentry

(gdb) 
1594        if (RelationGetPartitionDesc(oldrelation) != NULL)
(gdb) 
1596            Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);
(gdb) 
1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
(gdb) 

expand_inherited_rtentry -> step into expand_partitioned_rtentry

(gdb) step
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1684
1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

expand_partitioned_rtentry -> fetch the partition descriptor

1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
(gdb) n
1686        check_stack_depth();
(gdb) p *partdesc
$9 = {nparts = 6, oids = 0x298e4f8, boundinfo = 0x298e530}

expand_partitioned_rtentry -> run the sanity checks

(gdb) n
1689        Assert(partdesc);
(gdb) 
1691        Assert(parentrte->inh);
(gdb) 
1700        if (!root->partColsUpdated)
(gdb) 
1702                has_partition_attrs(parentrel, parentrte->updatedCols, NULL);
(gdb) 
1701            root->partColsUpdated =
(gdb) 
1705        expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,

expand_partitioned_rtentry -> first expand the partitioned table itself; step into expand_single_inheritance_child

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, childrel=0x7f4e66827980, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4)
    at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> initialize childrte

(gdb) n
1779        Oid         parentOID = RelationGetRelid(parentrel);
(gdb) 
1780        Oid         childOID = RelationGetRelid(childrel);
(gdb) 
1797        childrte = copyObject(parentrte);
(gdb) p parentOID
$10 = 16986
(gdb) p childOID
$11 = 16986
(gdb) n
1798        *childrte_p = childrte;
(gdb) 
1799        childrte->relid = childOID;
(gdb) 
1800        childrte->relkind = childrel->rd_rel->relkind;
(gdb) 
1802        if (childOID != parentOID &&
(gdb) 
1806            childrte->inh = false;
(gdb) 
1807        childrte->requiredPerms = 0;
(gdb) 
1808        childrte->securityQuals = NIL;
(gdb) 
1809        parse->rtable = lappend(parse->rtable, childrte);
(gdb) 
1810        childRTindex = list_length(parse->rtable);
(gdb) 
1811        *childRTindex_p = childRTindex;
(gdb) p *childrte --> relid = 16986: this RTE is still the partitioned table itself
$12 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd268, lateral = false, 
  inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fd898, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childRTindex_p --> still 0, because line 1811 has not executed yet
$13 = 0

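expand_single_inheritance_child -> the expansion of the partitioned table itself is done (no AppendRelInfo is built, because the RTE is a partitioned table with inh = false); return to expand_partitioned_rtentry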

(gdb) n
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb) 
1855        if (top_parentrc)
(gdb) 
1881    }
(gdb) 
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1713
1713        if (partdesc->nparts == 0)

expand_partitioned_rtentry -> start iterating over the partitions in the partition descriptor

1713        if (partdesc->nparts == 0)
(gdb) n
1719        for (i = 0; i < partdesc->nparts; i++)
(gdb) 
1721            Oid         childOID = partdesc->oids[i];
(gdb) 
1725            childrel = heap_open(childOID, NoLock);
(gdb) 
1732            if (RELATION_IS_OTHER_TEMP(childrel))
(gdb) 
1735            expand_single_inheritance_child(root, parentrte, parentRTindex,
(gdb) p childOID
$14 = 16989 
----------------------------------------
testdb=# select relname from pg_class where oid=16989;
      relname       
--------------------
 t_hash_partition_1
(1 row)
----------------------------------------

expand_partitioned_rtentry -> step into expand_single_inheritance_child again, this time for the first partition

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, childrel=0x7f4e668306a0, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4)
    at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> start building the AppendRelInfo

...
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb) 
1822            appinfo = makeNode(AppendRelInfo);
(gdb) p *childrte
$17 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16989, relkind = 114 'r', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd9d0, lateral = false, 
  inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fdbc8, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childrte->relkind
Cannot access memory at address 0x72
(gdb) p childrte->relkind
$18 = 114 'r'
(gdb) p childrte->inh
$19 = false

expand_single_inheritance_child -> construction is complete; inspect the AppendRelInfo struct

(gdb) n
1823            appinfo->parent_relid = parentRTindex;
(gdb) 
1824            appinfo->child_relid = childRTindex;
(gdb) 
1825            appinfo->parent_reltype = parentrel->rd_rel->reltype;
(gdb) 
1826            appinfo->child_reltype = childrel->rd_rel->reltype;
(gdb) 
1827            make_inh_translation_list(parentrel, childrel, childRTindex,
(gdb) 
1829            appinfo->parent_reloid = parentOID;
(gdb) 
1830            *appinfos = lappend(*appinfos, appinfo);
(gdb) 
1841            if (childOID != parentOID)
(gdb) 
1843                childrte->selectedCols = translate_col_privs(parentrte->selectedCols,
(gdb) 
1845                childrte->insertedCols = translate_col_privs(parentrte->insertedCols,
(gdb) 
1847                childrte->updatedCols = translate_col_privs(parentrte->updatedCols,
(gdb) 
1855        if (top_parentrc)
(gdb) p *appinfo
$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, 
  translated_vars = 0x28fdc90, parent_reloid = 16986}
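
translated_vars maps each parent column to the corresponding child column, and translate_col_privs uses it to rewrite the selectedCols/insertedCols/updatedCols bitmaps in terms of the child's attribute numbers. The sketch here illustrates the idea with a plain 32-bit mask and an integer array. It is a simplification under stated assumptions: the real function works on a Bitmapset offset by FirstLowInvalidHeapAttributeNumber and also handles system columns and whole-row references, none of which is modeled. All sk_* names and the column layout in the demo are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit (n - 1) of a mask stands for user column number n.
 * translated[i] is the child's column number for parent column i + 1,
 * or 0 if that parent column has no counterpart (e.g. it was dropped).
 */
uint32_t
sk_translate_col_privs(uint32_t parent_privs, const int *translated, int nattrs)
{
    uint32_t child_privs = 0;
    int      i;

    assert(nattrs >= 0 && nattrs <= 32);
    for (i = 0; i < nattrs; i++)
    {
        int child_attno = translated[i];

        if ((parent_privs & (UINT32_C(1) << i)) == 0)
            continue;           /* parent column not referenced */
        if (child_attno <= 0)
            continue;           /* nothing to map to */
        assert(child_attno <= 32);
        child_privs |= UINT32_C(1) << (child_attno - 1);
    }
    return child_privs;
}

/*
 * Hypothetical layout: parent columns (c1, c2, c3), child stores (c2, c1, c3).
 * The query references c1 and c3 of the parent.
 */
uint32_t
sk_demo_translate(void)
{
    static const int translated[3] = {2, 1, 3};
    uint32_t         parent_privs = 0x5;    /* bits for c1 and c3 */

    return sk_translate_col_privs(parent_privs, translated, 3);
}
```

In the traced table the partitions have the same column order as the parent, so the mapping is the identity and the translated bitmaps equal the parent's. The translation only changes anything when a child's physical column order differs, as in the made-up layout of sk_demo_translate.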

expand_single_inheritance_child -> the call is complete; return

(gdb) 
1855        if (top_parentrc)
(gdb) n
1881    }
(gdb) 
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
1740            if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)

expand_inherited_rtentry -> finish the expand_partitioned_rtentry call (the remaining five partitions are expanded the same way) and return to expand_inherited_rtentry

(gdb) finish
Run till exit from #0  expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, 
    parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
0x00000000007e55e3 in expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:1603
1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
(gdb) 

expand_inherited_rtentry -> the call is complete; return to expand_inherited_tables

(gdb) n
1665        heap_close(oldrelation, NoLock);
(gdb) 
1666    }
(gdb) 
expand_inherited_tables (root=0x28fcdc8) at prepunion.c:1490
1490            rl = lnext(rl);
(gdb) 

expand_inherited_tables -> the call is complete; return to subquery_planner

(gdb) n
1485        for (rti = 1; rti <= nrtes; rti++)
(gdb) 
1492    }
(gdb) 
subquery_planner (glob=0x28fcd30, parse=0x28d82b8, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:719
719     root->hasHavingQual = (parse->havingQual != NULL);
(gdb) 

DONE!

