PostgreSQL source code interpretation (96) - partition table (3) (data insertion route (3) - get partition key)

Keywords: Attribute

This section describes the FormPartitionKeyDatum function, reached through the call chain ExecPrepareTupleRouting -> ExecFindPartition -> FormPartitionKeyDatum. It extracts the partition key values of a tuple.

I. Data structures

ModifyTable
Apply the rows generated by the subplan to the result table by inserting, updating, or deleting.

/* ----------------
 *   ModifyTable node -
 *      Apply rows produced by subplan(s) to result table(s),
 *      by inserting, updating, or deleting.
 *
 * If the originally named target table is a partitioned table, both
 * nominalRelation and rootRelation contain the RT index of the partition
 * root, which is not otherwise mentioned in the plan.  Otherwise rootRelation
 * is zero.  However, nominalRelation will always be set, as it's the rel that
 * EXPLAIN should claim is the INSERT/UPDATE/DELETE target.
 * 
 * Note that rowMarks and epqParam are presumed to be valid for all the
 * subplan(s); they can't contain any info that varies across subplans.
 * ----------------
 */
typedef struct ModifyTable
{
    Plan        plan;
    CmdType     operation;      /* Operation type; INSERT, UPDATE, or DELETE */
    bool        canSetTag;      /* do we set the command tag/es_processed? */
    Index       nominalRelation;    /* Parent RT index for use of EXPLAIN */
    Index       rootRelation;   /* Root RT index, if target is partitioned */
    bool        partColsUpdated;    /* some part key in hierarchy updated */
    List       *resultRelations;    /* integer list of RT indexes */
    int         resultRelIndex; /* index of first resultRel in plan's list */
    int         rootResultRelIndex; /* index of the partitioned table root */
    List       *plans;          /* plan(s) producing source data */
    List       *withCheckOptionLists;   /* per-target-table WCO lists */
    List       *returningLists; /* per-target-table RETURNING lists */
    List       *fdwPrivLists;   /* per-target-table FDW private data lists */
    Bitmapset  *fdwDirectModifyPlans;   /* indices of FDW DM plans */
    List       *rowMarks;       /* PlanRowMarks (non-locking only) */
    int         epqParam;       /* ID of Param for EvalPlanQual re-eval */
    OnConflictAction onConflictAction;  /* ON CONFLICT action */
    List       *arbiterIndexes; /* List of ON CONFLICT arbiter index OIDs  */
    List       *onConflictSet;  /* SET for INSERT ON CONFLICT DO UPDATE */
    Node       *onConflictWhere;    /* WHERE for ON CONFLICT UPDATE */
    Index       exclRelRTI;     /* RTI of the EXCLUDED pseudo relation */
    List       *exclRelTlist;   /* tlist of the EXCLUDED pseudo relation */
} ModifyTable;

ResultRelInfo
ResultRelInfo structure
Whenever an existing relation is updated, its indexes must be updated too, and triggers may need to be fired. ResultRelInfo holds all the information needed about a result relation, including its indexes.

/*
 * ResultRelInfo
 *
 * Whenever we update an existing relation, we have to update indexes on the
 * relation, and perhaps also fire triggers.  ResultRelInfo holds all the
 * information needed about a result relation, including indexes.
 * 
 * Normally, a ResultRelInfo refers to a table that is in the query's
 * range table; then ri_RangeTableIndex is the RT index and ri_RelationDesc
 * is just a copy of the relevant es_relations[] entry.  But sometimes,
 * in ResultRelInfos used only for triggers, ri_RangeTableIndex is zero
 * and ri_RelationDesc is a separately-opened relcache pointer that needs
 * to be separately closed.  See ExecGetTriggerResultRel.
 */
typedef struct ResultRelInfo
{
    NodeTag     type;

    /* result relation's range table index, or 0 if not in range table */
    //RTE index
    Index       ri_RangeTableIndex;

    /* relation descriptor for result relation */
    //Descriptor of result / target relation
    Relation    ri_RelationDesc;

    /* # of indices existing on result relation */
    //Number of indexes on the target relation
    int         ri_NumIndices;

    /* array of relation descriptors for indices */
    //Index's relation descriptor array (index is treated as a relation)
    RelationPtr ri_IndexRelationDescs;

    /* array of key/attr info for indices */
    //Key/attribute info array for the indexes
    IndexInfo **ri_IndexRelationInfo;

    /* triggers to be fired, if any */
    //Trigger descriptor: the triggers to fire, if any
    TriggerDesc *ri_TrigDesc;

    /* cached lookup info for trigger functions */
    //Cached lookup info for the trigger functions
    FmgrInfo   *ri_TrigFunctions;

    /* array of trigger WHEN expr states */
    //Array of expression states for trigger WHEN clauses
    ExprState **ri_TrigWhenExprs;

    /* optional runtime measurements for triggers */
    //Optional runtime instrumentation for triggers
    Instrumentation *ri_TrigInstrument;

    /* FDW callback functions, if foreign table */
    //FDW callback function
    struct FdwRoutine *ri_FdwRoutine;

    /* available to save private state of FDW */
    //Can be used to store the private state of FDW
    void       *ri_FdwState;

    /* true when modifying foreign table directly */
    //True when the foreign table is modified directly
    bool        ri_usesFdwDirectModify;

    /* list of WithCheckOption's to be checked */
    //WithCheckOption linked list
    List       *ri_WithCheckOptions;

    /* list of WithCheckOption expr states */
    //List of WithCheckOption expression states
    List       *ri_WithCheckOptionExprs;

    /* array of constraint-checking expr states */
    //Constraint check expression state array
    ExprState **ri_ConstraintExprs;

    /* for removing junk attributes from tuples */
    //Used to remove the junk attribute from a tuple
    JunkFilter *ri_junkFilter;

    /* list of RETURNING expressions */
    //RETURNING expression list
    List       *ri_returningList;

    /* for computing a RETURNING list */
    //Used to calculate the RETURNING list
    ProjectionInfo *ri_projectReturning;

    /* list of arbiter indexes to use to check conflicts */
    //List of arbiter indexes used to check for conflicts
    List       *ri_onConflictArbiterIndexes;

    /* ON CONFLICT evaluation state */
    //ON CONFLICT evaluation state
    OnConflictSetState *ri_onConflict;

    /* partition check expression */
    //Partition check expression list
    List       *ri_PartitionCheck;

    /* partition check expression state */
    //Partition check expression state
    ExprState  *ri_PartitionCheckExpr;

    /* relation descriptor for root partitioned table */
    //Partition root table descriptor
    Relation    ri_PartitionRoot;

    /* Additional information specific to partition tuple routing */
    //Additional partition tuple routing information
    struct PartitionRoutingInfo *ri_PartitionInfo;
} ResultRelInfo;

PartitionRoutingInfo
PartitionRoutingInfo structure
Additional result relation information that is specific to routing tuples to a table partition.

/*
 * PartitionRoutingInfo
 * 
 * Additional result relation information specific to routing tuples to a
 * table partition.
 */
typedef struct PartitionRoutingInfo
{
    /*
     * Map for converting tuples in root partitioned table format into
     * partition format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_RootToPartitionMap;

    /*
     * Map for converting tuples in partition format into the root partitioned
     * table format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_PartitionToRootMap;

    /*
     * Slot to store tuples in partition format, or NULL when no translation
     * is required between root and partition.
     */
    TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;

TupleConversionMap
TupleConversionMap structure, used to store tuple conversion mapping information


typedef struct TupleConversionMap
{
    TupleDesc   indesc;         /* Descriptor of source row type */
    TupleDesc   outdesc;        /* Descriptor for result row type */
    AttrNumber *attrMap;        /* indexes of input fields, or 0 for null */
    Datum      *invalues;       /* workspace for deconstructing source */
    bool       *inisnull;       //Is-null flags matching invalues
    Datum      *outvalues;      /* workspace for constructing result */
    bool       *outisnull;      //Is-null flags matching outvalues
} TupleConversionMap;
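
The role of attrMap can be illustrated with a small stand-alone sketch. It is not the PostgreSQL implementation: ToyDatum, ToyAttrNumber and toy_convert_row are hypothetical names, and the real conversion code works on TupleDesc and Datum. The sketch only assumes what the struct comments state, namely that each attrMap entry is the index of an input field, or 0 when the output column is null.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PostgreSQL's Datum and AttrNumber types. */
typedef long ToyDatum;
typedef short ToyAttrNumber;

/*
 * Fill outvalues[]/outisnull[] from invalues[]/inisnull[].
 * attr_map[i] is the 1-based input column that feeds output column i,
 * or 0 when the output column has no source and must be NULL.
 */
static void
toy_convert_row(const ToyAttrNumber *attr_map, int out_natts,
                const ToyDatum *invalues, const bool *inisnull, int in_natts,
                ToyDatum *outvalues, bool *outisnull)
{
    for (int i = 0; i < out_natts; i++)
    {
        ToyAttrNumber src = attr_map[i];

        assert(src >= 0 && src <= in_natts);
        if (src == 0)
        {
            /* No source column: the output column is NULL. */
            outvalues[i] = 0;
            outisnull[i] = true;
        }
        else
        {
            /* Copy the value and its null flag from the mapped input column. */
            outvalues[i] = invalues[src - 1];
            outisnull[i] = inisnull[src - 1];
        }
    }
}
```

With a map of {2, 0, 1}, the first output column takes the second input column, the second output column is NULL, and the third takes the first input column. This is the kind of reordering needed when a partition's column order differs from the root table's.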

II. Source code interpretation

The FormPartitionKeyDatum function extracts the partition key values of a tuple. It fills the values[] array with the key values and the isnull[] array with the matching is-null flags.


/* ----------------
 *      FormPartitionKeyDatum
 *          Construct values[] and isnull[] arrays for the partition key
 *          of a tuple.
 *
 *  pd              Partition dispatch object of the partitioned table
 *
 *  slot            Heap tuple from which to extract partition key
 *
 *  estate          executor state for evaluating any partition key
 *                  expressions (must be non-NULL)
 *
 *  values          Array of partition key Datums (output area)
 *  isnull          Array of is-null indicators (output area)
 *
 * the ecxt_scantuple slot of estate's per-tuple expr context must point to
 * the heap tuple passed in.
 * ----------------
 */
static void
FormPartitionKeyDatum(PartitionDispatch pd,
                      TupleTableSlot *slot,
                      EState *estate,
                      Datum *values,
                      bool *isnull)
{
    ListCell   *partexpr_item;
    int         i;

    if (pd->key->partexprs != NIL && pd->keystate == NIL)
    {
        /* Check caller has set up context correctly */
        //Check that the caller has set up the expression context correctly
        Assert(estate != NULL &&
               GetPerTupleExprContext(estate)->ecxt_scantuple == slot);

        /* First time through, set up expression evaluation state */
        //First call: set up the expression evaluation state
        pd->keystate = ExecPrepareExprList(pd->key->partexprs, estate);
    }

    partexpr_item = list_head(pd->keystate);//First partition key expression state
    for (i = 0; i < pd->key->partnatts; i++)//Loop over the partition key columns
    {
        AttrNumber  keycol = pd->key->partattrs[i];//Attribute number of the key column
        Datum       datum;// typedef uintptr_t Datum;sizeof(Datum) == sizeof(void *) == 4 or 8
        bool        isNull;//Whether the value is NULL

        if (keycol != 0)//Nonzero attribute number: a plain column
        {
            /* Plain column; get the value directly from the heap tuple */
            //Plain column: fetch the value directly from the heap tuple
            datum = slot_getattr(slot, keycol, &isNull);
        }
        else
        {
            /* Expression; need to evaluate it */
            //Expression: it has to be evaluated
            if (partexpr_item == NULL)//Ran out of expression states: report an error
                elog(ERROR, "wrong number of partition key expressions");
            //Evaluate the expression
            datum = ExecEvalExprSwitchContext((ExprState *) lfirst(partexpr_item),
                                              GetPerTupleExprContext(estate),
                                              &isNull);
            //Advance to the next expression state
            partexpr_item = lnext(partexpr_item);
        }
        values[i] = datum;//Store the result
        isnull[i] = isNull;
    }

    if (partexpr_item != NULL)//Unused expression states remain: report an error
        elog(ERROR, "wrong number of partition key expressions");
}
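
The loop above can be mirrored in a stand-alone sketch that compiles without the PostgreSQL headers. Everything here is a hypothetical simplification: ToyPartitionKey, ToyKeyExpr and toy_form_partition_key are illustration names, expressions are plain function pointers instead of ExprState nodes, and the toy expressions never return NULL. What it keeps from the real function is the convention that a zero entry in partattrs means "take the next expression", and the check that the number of expressions matches the number of zeros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef long ToyDatum;

/* Hypothetical expression callback: computes one key value from a row. */
typedef ToyDatum (*ToyKeyExpr)(const ToyDatum *row);

typedef struct ToyPartitionKey
{
    int         partnatts;      /* number of key columns */
    const int  *partattrs;      /* 1-based column number, or 0 for an expression */
    const ToyKeyExpr *exprs;    /* one entry per 0 in partattrs, in order */
    int         nexprs;         /* number of entries in exprs */
} ToyPartitionKey;

/* Returns false when the expression count does not match the zeros in partattrs. */
static bool
toy_form_partition_key(const ToyPartitionKey *key, const ToyDatum *row,
                       const bool *row_isnull, ToyDatum *values, bool *isnull)
{
    int         next_expr = 0;

    for (int i = 0; i < key->partnatts; i++)
    {
        int         keycol = key->partattrs[i];

        if (keycol != 0)
        {
            /* Plain column: copy the value straight from the row. */
            values[i] = row[keycol - 1];
            isnull[i] = row_isnull[keycol - 1];
        }
        else
        {
            /* Expression: consume the next callback in order. */
            if (next_expr >= key->nexprs)
                return false;
            values[i] = key->exprs[next_expr++](row);
            isnull[i] = false;  /* toy expressions never yield NULL */
        }
    }
    /* Leftover expressions also indicate a mismatch. */
    return next_expr == key->nexprs;
}

/* Example expression: the sum of the first two columns. */
static ToyDatum
toy_sum_first_two(const ToyDatum *row)
{
    return row[0] + row[1];
}
```

For a key defined as (column 2, expression) over the row {20, 5, 7}, the sketch yields the key values {5, 25}. The real function reports the mismatch cases with elog(ERROR, ...) instead of returning false.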



/*
 * slot_getattr - fetch one attribute of the slot's contents.
 */
static inline Datum
slot_getattr(TupleTableSlot *slot, int attnum,
             bool *isnull)
{
    AssertArg(attnum > 0);

    if (attnum > slot->tts_nvalid)
        slot_getsomeattrs(slot, attnum);

    *isnull = slot->tts_isnull[attnum - 1];

    return slot->tts_values[attnum - 1];
}


/*
 * This function forces the entries of the slot's Datum/isnull arrays to be
 * valid at least up through the attnum'th entry.
 */
static inline void
slot_getsomeattrs(TupleTableSlot *slot, int attnum)
{
    if (slot->tts_nvalid < attnum)
        slot_getsomeattrs_int(slot, attnum);
}


/*
 * slot_getsomeattrs_int - workhorse for slot_getsomeattrs()
 */
void
slot_getsomeattrs_int(TupleTableSlot *slot, int attnum)
{
    /* Check for caller errors */
    //Check whether the caller's input parameters are correct
    Assert(slot->tts_nvalid < attnum); /* slot_getsomeattr checked */
    Assert(attnum > 0);
    //Reject attnum values beyond the tuple descriptor's column count
    if (unlikely(attnum > slot->tts_tupleDescriptor->natts))
        elog(ERROR, "invalid attribute number %d", attnum);

    /* Fetch as many attributes as possible from the underlying tuple. */
    //Get as many attributes as possible from the tuple.
    slot->tts_ops->getsomeattrs(slot, attnum);

    /*
     * If the underlying tuple doesn't have enough attributes, tuple descriptor
     * must have the missing attributes.
     */
    if (unlikely(slot->tts_nvalid < attnum))
    {
        slot_getmissingattrs(slot, slot->tts_nvalid, attnum);
        slot->tts_nvalid = attnum;
    }
}
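
The three functions above cooperate to extract attributes lazily: tts_nvalid records how many leading entries of tts_values[] are already filled in, and extraction only happens when a later attribute is requested. The following sketch models that idea under simplifying assumptions. ToySlot, toy_getsomeattrs and toy_getattr are hypothetical names, the "tuple" is a plain array, and columns missing from the tuple are simply reported as NULL, whereas the real slot_getmissingattrs consults the tuple descriptor.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ATTS 8

typedef long ToyDatum;

typedef struct ToySlot
{
    const ToyDatum *stored;     /* stand-in for the physical tuple */
    int         stored_natts;   /* columns physically present in the tuple */
    int         natts;          /* columns in the descriptor (<= TOY_MAX_ATTS) */
    int         nvalid;         /* how many entries of values[] are filled in */
    int         deform_calls;   /* counts how often extraction was needed */
    ToyDatum    values[TOY_MAX_ATTS];
    bool        isnull[TOY_MAX_ATTS];
} ToySlot;

/* Extract columns nvalid..attnum-1; columns absent from the tuple read as NULL. */
static void
toy_getsomeattrs(ToySlot *slot, int attnum)
{
    assert(attnum > 0 && attnum <= slot->natts);
    slot->deform_calls++;
    for (int i = slot->nvalid; i < attnum; i++)
    {
        if (i < slot->stored_natts)
        {
            slot->values[i] = slot->stored[i];
            slot->isnull[i] = false;
        }
        else
        {
            slot->values[i] = 0;
            slot->isnull[i] = true;
        }
    }
    slot->nvalid = attnum;
}

/* Fetch one attribute, extracting from the tuple only when it is not cached yet. */
static ToyDatum
toy_getattr(ToySlot *slot, int attnum, bool *isnull)
{
    if (attnum > slot->nvalid)
        toy_getsomeattrs(slot, attnum);
    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}
```

Requesting attribute 2 and then attribute 1 triggers a single extraction, because the first request already made both entries valid. This is why fetching partition key columns repeatedly from the same slot is cheap.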

III. Trace analysis

The test script is as follows:

-- Hash Partition
drop table if exists t_hash_partition;
create table t_hash_partition (c1 int not null,c2  varchar(40),c3 varchar(40)) partition by hash(c1);
create table t_hash_partition_1 partition of t_hash_partition for values with (modulus 6,remainder 0);
create table t_hash_partition_2 partition of t_hash_partition for values with (modulus 6,remainder 1);
create table t_hash_partition_3 partition of t_hash_partition for values with (modulus 6,remainder 2);
create table t_hash_partition_4 partition of t_hash_partition for values with (modulus 6,remainder 3);
create table t_hash_partition_5 partition of t_hash_partition for values with (modulus 6,remainder 4);
create table t_hash_partition_6 partition of t_hash_partition for values with (modulus 6,remainder 5);

insert into t_hash_partition(c1,c2,c3) VALUES(20,'HASH0','HAHS0');
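
The script above creates six hash partitions with modulus 6 and remainders 0 to 5. A row goes to the partition whose remainder equals the hash of its partition key modulo the modulus. The sketch below only illustrates that rule: toy_hash_int is a placeholder hash invented for this example, not the hash function PostgreSQL applies to an int column, so the remainder it produces for c1 = 20 says nothing about which partition the server actually picks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Placeholder hash, NOT PostgreSQL's: the server uses the key type's
 * hash support function, so real partition assignments will differ.
 */
static uint64_t
toy_hash_int(int32_t key)
{
    uint64_t    h = (uint64_t) (uint32_t) key;

    h ^= h >> 16;
    h *= UINT64_C(0x9E3779B97F4A7C15);
    h ^= h >> 32;
    return h;
}

/* A row belongs to the partition whose remainder equals hash % modulus. */
static int
toy_hash_partition_remainder(int32_t key, int modulus)
{
    assert(modulus > 0);
    return (int) (toy_hash_int(key) % (uint64_t) modulus);
}
```

Calling toy_hash_partition_remainder(20, 6) always returns the same value in the range 0 to 5, which is the property the routing relies on: the same key always lands in the same partition.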

Start gdb and set the breakpoints

(gdb) b FormPartitionKeyDatum
Breakpoint 5 at 0x6e30d2: file execPartition.c, line 1087.
(gdb) b slot_getattr
Breakpoint 6 at 0x489d9b: file heaptuple.c, line 1510.
(gdb) c
Continuing.

Breakpoint 5, FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, 
    isnull=0x7fff4e240780) at execPartition.c:1087
1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)

Loop over the partition key columns and fetch the value for each one

1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)
(gdb) n
1097        partexpr_item = list_head(pd->keystate);
(gdb) 
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb) 
1100            AttrNumber  keycol = pd->key->partattrs[i];
(gdb) 
1104            if (keycol != 0)
(gdb) 
1107                datum = slot_getattr(slot, keycol, &isNull);

Step into the function slot_getattr

(gdb) step

Breakpoint 6, slot_getattr (slot=0x2e1b8a0, attnum=1, isnull=0x7fff4e240735) at heaptuple.c:1510
1510        HeapTuple   tuple = slot->tts_tuple;

Get the result: the partition key value is 20

...
(gdb) p *isnull
$31 = false
(gdb) p slot->tts_values[attnum - 1]
$32 = 20

Return to FormPartitionKeyDatum function

(gdb) n
1593    }
(gdb) 
FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, isnull=0x7fff4e240780)
    at execPartition.c:1119
1119            values[i] = datum;

The call completes and control returns to ExecFindPartition

1119            values[i] = datum;
(gdb) n
1120            isnull[i] = isNull;
(gdb) 
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb) 
1123        if (partexpr_item != NULL)
(gdb) 
1125    }
(gdb) 
ExecFindPartition (resultRelInfo=0x2e1b108, pd=0x2e1c5b8, slot=0x2e1b8a0, estate=0x2e1aeb8) at execPartition.c:282
282         if (partdesc->nparts == 0)

DONE!

IV. References

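PG 11.1 source code.
Note: the source code on doxygen is inconsistent with PG 11.1 (for example, the slot_getattr listing above does not match the heaptuple.c lines shown in the gdb session). This section is based on 11.1.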

Posted by Ryyo on Thu, 05 Dec 2019 19:19:20 -0800