This section describes the FormPartitionKeyDatum function, reached through the call chain ExecPrepareTupleRouting -> ExecFindPartition -> FormPartitionKeyDatum. It extracts the partition key values of a tuple.
I. Data structures
ModifyTable
The ModifyTable node applies the rows produced by its subplan(s) to the result table(s) by inserting, updating, or deleting.
/* ----------------
 *   ModifyTable node -
 *      Apply rows produced by subplan(s) to result table(s),
 *      by inserting, updating, or deleting.
 *
 * If the originally named target table is a partitioned table, both
 * nominalRelation and rootRelation contain the RT index of the partition
 * root, which is not otherwise mentioned in the plan.  Otherwise rootRelation
 * is zero.  However, nominalRelation will always be set, as it's the rel that
 * EXPLAIN should claim is the INSERT/UPDATE/DELETE target.
 *
 * Note that rowMarks and epqParam are presumed to be valid for all the
 * subplan(s); they can't contain any info that varies across subplans.
 * ----------------
 */
typedef struct ModifyTable
{
    Plan        plan;
    CmdType     operation;          /* INSERT, UPDATE, or DELETE */
    bool        canSetTag;          /* do we set the command tag/es_processed? */
    Index       nominalRelation;    /* Parent RT index for use of EXPLAIN */
    Index       rootRelation;       /* Root RT index, if target is partitioned */
    bool        partColsUpdated;    /* some part key in hierarchy updated */
    List       *resultRelations;    /* integer list of RT indexes */
    int         resultRelIndex;     /* index of first resultRel in plan's list */
    int         rootResultRelIndex; /* index of the partitioned table root */
    List       *plans;              /* plan(s) producing source data */
    List       *withCheckOptionLists;   /* per-target-table WCO lists */
    List       *returningLists;     /* per-target-table RETURNING lists */
    List       *fdwPrivLists;       /* per-target-table FDW private data lists */
    Bitmapset  *fdwDirectModifyPlans;   /* indexes of FDW DM plans */
    List       *rowMarks;           /* PlanRowMarks (non-locking only) */
    int         epqParam;           /* ID of Param for EvalPlanQual re-eval */
    OnConflictAction onConflictAction;  /* ON CONFLICT action */
    List       *arbiterIndexes;     /* List of ON CONFLICT arbiter index OIDs */
    List       *onConflictSet;      /* SET for INSERT ON CONFLICT DO UPDATE */
    Node       *onConflictWhere;    /* WHERE for ON CONFLICT UPDATE */
    Index       exclRelRTI;         /* RTI of the EXCLUDED pseudo relation */
    List       *exclRelTlist;       /* tlist of the EXCLUDED pseudo relation */
} ModifyTable;
ResultRelInfo
ResultRelInfo structure
Whenever an existing relation is updated, its indexes must be updated as well, and triggers may have to be fired. ResultRelInfo holds all the information needed about a result relation, including its indexes.
/*
 * ResultRelInfo
 *
 * Whenever we update an existing relation, we have to update indexes on the
 * relation, and perhaps also fire triggers.  ResultRelInfo holds all the
 * information needed about a result relation, including indexes.
 *
 * Normally, a ResultRelInfo refers to a table that is in the query's
 * range table; then ri_RangeTableIndex is the RT index and ri_RelationDesc
 * is just a copy of the relevant es_relations[] entry.  But sometimes,
 * in ResultRelInfos used only for triggers, ri_RangeTableIndex is zero
 * and ri_RelationDesc is a separately-opened relcache pointer that needs
 * to be separately closed.  See ExecGetTriggerResultRel.
 */
typedef struct ResultRelInfo
{
    NodeTag     type;

    /* result relation's range table index, or 0 if not in range table */
    Index       ri_RangeTableIndex;

    /* relation descriptor for result relation */
    Relation    ri_RelationDesc;

    /* # of indices existing on result relation */
    int         ri_NumIndices;

    /* array of relation descriptors for indices (an index is itself a relation) */
    RelationPtr ri_IndexRelationDescs;

    /* array of key/attr info for indices */
    IndexInfo **ri_IndexRelationInfo;

    /* triggers to be fired, if any */
    TriggerDesc *ri_TrigDesc;

    /* cached lookup info for trigger functions */
    FmgrInfo   *ri_TrigFunctions;

    /* array of trigger WHEN expr states */
    ExprState **ri_TrigWhenExprs;

    /* optional runtime measurements for triggers */
    Instrumentation *ri_TrigInstrument;

    /* FDW callback functions, if foreign table */
    struct FdwRoutine *ri_FdwRoutine;

    /* available to save private state of FDW */
    void       *ri_FdwState;

    /* true when modifying foreign table directly */
    bool        ri_usesFdwDirectModify;

    /* list of WithCheckOption's to be checked */
    List       *ri_WithCheckOptions;

    /* list of WithCheckOption expr states */
    List       *ri_WithCheckOptionExprs;

    /* array of constraint-checking expr states */
    ExprState **ri_ConstraintExprs;

    /* for removing junk attributes from tuples */
    JunkFilter *ri_junkFilter;

    /* list of RETURNING expressions */
    List       *ri_returningList;

    /* for computing a RETURNING list */
    ProjectionInfo *ri_projectReturning;

    /* list of arbiter indexes to use to check conflicts */
    List       *ri_onConflictArbiterIndexes;

    /* ON CONFLICT evaluation state */
    OnConflictSetState *ri_onConflict;

    /* partition check expression */
    List       *ri_PartitionCheck;

    /* partition check expression state */
    ExprState  *ri_PartitionCheckExpr;

    /* relation descriptor for root partitioned table */
    Relation    ri_PartitionRoot;

    /* Additional information specific to partition tuple routing */
    struct PartitionRoutingInfo *ri_PartitionInfo;
} ResultRelInfo;
PartitionRoutingInfo
PartitionRoutingInfo structure
PartitionRoutingInfo holds the additional result relation information that is specific to routing tuples to a table partition.
/*
 * PartitionRoutingInfo
 *
 * Additional result relation information specific to routing tuples to a
 * table partition.
 */
typedef struct PartitionRoutingInfo
{
    /*
     * Map for converting tuples in root partitioned table format into
     * partition format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_RootToPartitionMap;

    /*
     * Map for converting tuples in partition format into the root partitioned
     * table format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_PartitionToRootMap;

    /*
     * Slot to store tuples in partition format, or NULL when no translation
     * is required between root and partition.
     */
    TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;
TupleConversionMap
TupleConversionMap structure, used to store tuple conversion mapping information
typedef struct TupleConversionMap
{
    TupleDesc   indesc;         /* descriptor of source row type */
    TupleDesc   outdesc;        /* descriptor of result row type */
    AttrNumber *attrMap;        /* indexes of input fields, or 0 for null */
    Datum      *invalues;       /* workspace for constructing source */
    bool       *inisnull;       /* is-null flags for invalues */
    Datum      *outvalues;      /* workspace for constructing result */
    bool       *outisnull;      /* is-null flags for outvalues */
} TupleConversionMap;
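To make the role of attrMap more concrete, here is a minimal, self-contained sketch of attribute-map conversion. It is not PostgreSQL code: ToyDatum, toy_convert_tuple and the 1-based indexing convention are assumptions made for illustration, chosen to mirror the "index of input field, or 0 for null" comment in the struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's Datum. */
typedef long ToyDatum;

/*
 * Build the output row from the input row using attr_map.
 * attr_map[i] is the 1-based input column that feeds output column i,
 * or 0 when the output column has no source and must be NULL.
 */
static void
toy_convert_tuple(const int *attr_map, int out_natts,
                  const ToyDatum *invalues, const bool *inisnull,
                  int in_natts,
                  ToyDatum *outvalues, bool *outisnull)
{
    assert(attr_map != NULL && out_natts >= 0);

    for (int i = 0; i < out_natts; i++)
    {
        int src = attr_map[i];

        assert(src >= 0 && src <= in_natts);
        if (src == 0)
        {
            /* No source column: emit NULL. */
            outvalues[i] = 0;
            outisnull[i] = true;
        }
        else
        {
            /* Copy value and null flag from the mapped input column. */
            outvalues[i] = invalues[src - 1];
            outisnull[i] = inisnull[src - 1];
        }
    }
}
```

With attr_map = {3, 0, 2}, the output row takes its first column from input column 3, leaves its second column NULL, and takes its third column from input column 2. A NULL map pointer in PartitionRoutingInfo corresponds to the case where the two row formats already match and this step can be skipped.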
II. Source code interpretation
FormPartitionKeyDatum extracts the partition key values of a tuple and returns them through two output arrays: values[] receives the key Datums and isnull[] receives the matching null flags.
/* ----------------
 *      FormPartitionKeyDatum
 *          Construct values[] and isnull[] arrays for the partition key
 *          of a tuple.
 *
 *  pd              Partition dispatch object of the partitioned table
 *  slot            Heap tuple from which to extract partition key
 *  estate          executor state for evaluating any partition key
 *                  expressions (must be non-NULL)
 *  values          Array of partition key Datums (output area)
 *  isnull          Array of is-null indicators (output area)
 *
 * the ecxt_scantuple slot of estate's per-tuple expr context must point to
 * the heap tuple passed in.
 * ----------------
 */
static void
FormPartitionKeyDatum(PartitionDispatch pd,
                      TupleTableSlot *slot,
                      EState *estate,
                      Datum *values,
                      bool *isnull)
{
    ListCell   *partexpr_item;
    int         i;

    if (pd->key->partexprs != NIL && pd->keystate == NIL)
    {
        /* Check caller has set up context correctly */
        Assert(estate != NULL &&
               GetPerTupleExprContext(estate)->ecxt_scantuple == slot);

        /* First time through, set up expression evaluation state */
        pd->keystate = ExecPrepareExprList(pd->key->partexprs, estate);
    }

    /* Start at the first partition key expression state */
    partexpr_item = list_head(pd->keystate);
    /* Loop over the partition key columns */
    for (i = 0; i < pd->key->partnatts; i++)
    {
        /* Attribute number of this key column */
        AttrNumber  keycol = pd->key->partattrs[i];
        /* typedef uintptr_t Datum; sizeof(Datum) == sizeof(void *), 4 or 8 bytes */
        Datum       datum;
        bool        isNull;

        if (keycol != 0)
        {
            /* Plain column; get the value directly from the heap tuple */
            datum = slot_getattr(slot, keycol, &isNull);
        }
        else
        {
            /* Expression; need to evaluate it */
            if (partexpr_item == NULL)
                elog(ERROR, "wrong number of partition key expressions");
            datum = ExecEvalExprSwitchContext((ExprState *) lfirst(partexpr_item),
                                              GetPerTupleExprContext(estate),
                                              &isNull);
            /* Advance to the next expression state */
            partexpr_item = lnext(partexpr_item);
        }
        values[i] = datum;
        isnull[i] = isNull;
    }

    /* Leftover expression states mean the key definition is inconsistent */
    if (partexpr_item != NULL)
        elog(ERROR, "wrong number of partition key expressions");
}

/*
 * slot_getattr - fetch one attribute of the slot's contents.
 */
static inline Datum
slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull)
{
    AssertArg(attnum > 0);

    if (attnum > slot->tts_nvalid)
        slot_getsomeattrs(slot, attnum);

    *isnull = slot->tts_isnull[attnum - 1];

    return slot->tts_values[attnum - 1];
}

/*
 * This function forces the entries of the slot's Datum/isnull arrays to be
 * valid at least up through the attnum'th entry.
 */
static inline void
slot_getsomeattrs(TupleTableSlot *slot, int attnum)
{
    if (slot->tts_nvalid < attnum)
        slot_getsomeattrs_int(slot, attnum);
}

/*
 * slot_getsomeattrs_int - workhorse for slot_getsomeattrs()
 */
void
slot_getsomeattrs_int(TupleTableSlot *slot, int attnum)
{
    /* Check for caller errors */
    Assert(slot->tts_nvalid < attnum);  /* slot_getsomeattr checked */
    Assert(attnum > 0);

    if (unlikely(attnum > slot->tts_tupleDescriptor->natts))
        elog(ERROR, "invalid attribute number %d", attnum);

    /* Fetch as many attributes as possible from the underlying tuple. */
    slot->tts_ops->getsomeattrs(slot, attnum);

    /*
     * If the underlying tuple doesn't have enough attributes, tuple descriptor
     * must have the missing attributes.
     */
    if (unlikely(slot->tts_nvalid < attnum))
    {
        slot_getmissingattrs(slot, slot->tts_nvalid, attnum);
        slot->tts_nvalid = attnum;
    }
}
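The core of FormPartitionKeyDatum is the split between plain key columns and key expressions. The following standalone sketch imitates that control flow outside the executor. Everything in it (ToyDatum, ToyKeyExpr, ToyPartitionKey, toy_form_partition_key, toy_expr_sum_first_two) is hypothetical and only illustrates the pattern; the real function works on PartitionDispatch, TupleTableSlot and ExprState objects and reports errors with elog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's Datum. */
typedef long ToyDatum;

/* Hypothetical key expression: computes one key value from the row. */
typedef ToyDatum (*ToyKeyExpr) (const ToyDatum *row, const bool *rownull,
                                bool *isnull);

typedef struct ToyPartitionKey
{
    int         partnatts;      /* number of key columns */
    const int  *partattrs;      /* 1-based column number, or 0 for an expression */
    const ToyKeyExpr *partexprs;    /* expressions, in key order */
    int         nexprs;         /* number of entries in partexprs */
} ToyPartitionKey;

/* Example expression: sum of the first two columns, NULL if either is NULL. */
static ToyDatum
toy_expr_sum_first_two(const ToyDatum *row, const bool *rownull, bool *isnull)
{
    *isnull = rownull[0] || rownull[1];
    return *isnull ? 0 : row[0] + row[1];
}

/*
 * Fill values[] and isnull[] for every key column.
 * Returns false when the number of expressions does not match the key
 * definition (the real code raises an error instead).
 */
static bool
toy_form_partition_key(const ToyPartitionKey *key,
                       const ToyDatum *row, const bool *rownull,
                       ToyDatum *values, bool *isnull)
{
    int         next_expr = 0;

    assert(key != NULL && row != NULL && rownull != NULL);

    for (int i = 0; i < key->partnatts; i++)
    {
        int         keycol = key->partattrs[i];

        if (keycol != 0)
        {
            /* Plain column: read it straight from the row. */
            values[i] = row[keycol - 1];
            isnull[i] = rownull[keycol - 1];
        }
        else
        {
            /* Expression: consume the next one in order. */
            if (next_expr >= key->nexprs)
                return false;
            values[i] = key->partexprs[next_expr++] (row, rownull, &isnull[i]);
        }
    }

    /* Every expression must have been consumed exactly once. */
    return next_expr == key->nexprs;
}
```

For a key defined as (column 2, expression), the sketch fills values[0] from the row and values[1] from the callback. The final count check plays the same role as the second "wrong number of partition key expressions" test in the original function.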
III. Trace analysis
The test script is as follows:
-- Hash Partition
drop table if exists t_hash_partition;
create table t_hash_partition (c1 int not null,c2 varchar(40),c3 varchar(40)) partition by hash(c1);
create table t_hash_partition_1 partition of t_hash_partition for values with (modulus 6,remainder 0);
create table t_hash_partition_2 partition of t_hash_partition for values with (modulus 6,remainder 1);
create table t_hash_partition_3 partition of t_hash_partition for values with (modulus 6,remainder 2);
create table t_hash_partition_4 partition of t_hash_partition for values with (modulus 6,remainder 3);
create table t_hash_partition_5 partition of t_hash_partition for values with (modulus 6,remainder 4);
create table t_hash_partition_6 partition of t_hash_partition for values with (modulus 6,remainder 5);

insert into t_hash_partition(c1,c2,c3) VALUES(20,'HASH0','HAHS0');
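The modulus and remainder clauses in the script decide which partition accepts a row: a partition matches when the hash of the key, divided by its modulus, leaves its remainder. The sketch below illustrates only that arithmetic. toy_hash is a made-up stand-in and not PostgreSQL's hash function, so whatever partition it picks for c1 = 20 says nothing about where the server actually stores that row. ToyHashBound and toy_find_hash_partition are likewise hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one "FOR VALUES WITH (modulus, remainder)". */
typedef struct ToyHashBound
{
    uint64_t    modulus;
    uint64_t    remainder;
} ToyHashBound;

/* Made-up mixing function; NOT the hash PostgreSQL uses for int columns. */
static uint64_t
toy_hash(int64_t key)
{
    uint64_t    h = (uint64_t) key;

    h ^= h >> 33;
    h *= UINT64_C(0xff51afd7ed558ccd);
    h ^= h >> 33;
    return h;
}

/*
 * Return the index of the first partition whose bound accepts the hash,
 * or -1 when no partition covers that remainder.
 */
static int
toy_find_hash_partition(const ToyHashBound *bounds, int nparts, uint64_t hash)
{
    assert(bounds != NULL && nparts >= 0);

    for (int i = 0; i < nparts; i++)
    {
        assert(bounds[i].modulus > 0);
        if (hash % bounds[i].modulus == bounds[i].remainder)
            return i;
    }
    return -1;
}
```

With the six bounds from the script (modulus 6, remainders 0 to 5), every hash value maps to exactly one partition. If some remainders were left undefined, the lookup would return -1 for rows hashing to them, which corresponds to an insert that finds no partition for the row.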
Start gdb and set the breakpoints:
(gdb) b FormPartitionKeyDatum
Breakpoint 5 at 0x6e30d2: file execPartition.c, line 1087.
(gdb) b slot_getattr
Breakpoint 6 at 0x489d9b: file heaptuple.c, line 1510.
(gdb) c
Continuing.

Breakpoint 5, FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, isnull=0x7fff4e240780)
    at execPartition.c:1087
1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)
Enter the loop, which fetches the value of each partition key column:
1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)
(gdb) n
1097        partexpr_item = list_head(pd->keystate);
(gdb)
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb)
1100            AttrNumber  keycol = pd->key->partattrs[i];
(gdb)
1104            if (keycol != 0)
(gdb)
1107                datum = slot_getattr(slot, keycol, &isNull);
Step into slot_getattr:
(gdb) step

Breakpoint 6, slot_getattr (slot=0x2e1b8a0, attnum=1, isnull=0x7fff4e240735) at heaptuple.c:1510
1510        HeapTuple   tuple = slot->tts_tuple;
Inspect the result: the partition key value is 20.
...
(gdb) p *isnull
$31 = false
(gdb) p slot->tts_values[attnum - 1]
$32 = 20
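The value printed from slot->tts_values[attnum - 1] is available because the slot is filled lazily: only the leading attributes up to the requested one are extracted, and tts_nvalid records how far that has gone. A minimal sketch of this idea follows. ToySlot, toy_getsomeattrs and toy_getattr are hypothetical stand-ins that copy from a plain array instead of deforming a stored tuple, and the deform_calls counter exists only so the laziness can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ATTS 8

/* Hypothetical stand-in for PostgreSQL's Datum. */
typedef long ToyDatum;

typedef struct ToySlot
{
    int         natts;          /* number of attributes in the row */
    int         nvalid;         /* leading entries of values[] that are valid */
    const ToyDatum *raw;        /* stand-in for the stored tuple's values */
    const bool *rawnull;        /* stand-in for the stored tuple's null flags */
    ToyDatum    values[TOY_MAX_ATTS];
    bool        isnull[TOY_MAX_ATTS];
    int         deform_calls;   /* instrumentation for this example only */
} ToySlot;

/* Make values[]/isnull[] valid at least up through the attnum'th entry. */
static void
toy_getsomeattrs(ToySlot *slot, int attnum)
{
    assert(slot->natts <= TOY_MAX_ATTS);
    assert(attnum > 0 && attnum <= slot->natts);

    if (slot->nvalid >= attnum)
        return;                 /* already extracted far enough */

    for (int i = slot->nvalid; i < attnum; i++)
    {
        slot->values[i] = slot->raw[i];
        slot->isnull[i] = slot->rawnull[i];
    }
    slot->nvalid = attnum;
    slot->deform_calls++;
}

/* Fetch one attribute (1-based), extracting more of the row only if needed. */
static ToyDatum
toy_getattr(ToySlot *slot, int attnum, bool *isnull)
{
    assert(attnum > 0);

    if (attnum > slot->nvalid)
        toy_getsomeattrs(slot, attnum);

    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}
```

Fetching attribute 1 twice extracts data only once, and a later request for attribute 3 continues from where the first extraction stopped. For a partition key on the first column, as in the test table, this means the remaining columns do not need to be touched to compute the key.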
Return to FormPartitionKeyDatum:
(gdb) n
1593    }
(gdb)
FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, isnull=0x7fff4e240780)
    at execPartition.c:1119
1119            values[i] = datum;
The call completes and control returns to ExecFindPartition:
1119            values[i] = datum;
(gdb) n
1120            isnull[i] = isNull;
(gdb)
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb)
1123        if (partexpr_item != NULL)
(gdb)
1125    }
(gdb)
ExecFindPartition (resultRelInfo=0x2e1b108, pd=0x2e1c5b8, slot=0x2e1b8a0, estate=0x2e1aeb8) at execPartition.c:282
282         if (partdesc->nparts == 0)
DONE!
IV. References
PG 11.1 Source Code.
Note: the source code published on doxygen is inconsistent with PG 11.1. This section is based on 11.1.