Preface
In the previous blog, HUDI preCombinedField summary, I summarized preCombineField. At that time my understanding of the source code was not deep enough, so the analysis was incomplete. Having since read the source code further, I am supplementing that summary here.
Historical comparison value
In the previous summary:
DF: regardless of whether the ts value of the new record is greater than the ts value of the historical record, the record is overwritten and updated directly. SQL: when writing data, the record is updated only if its ts value is greater than or equal to the historical ts value; if it is less than the historical value, the record is not updated.
Here is the explanation. First, for Spark SQL the default value of PAYLOAD_CLASS_NAME is ExpressionPayload, which extends DefaultHoodieRecordPayload:
class ExpressionPayload(record: GenericRecord, orderingVal: Comparable[_]) extends DefaultHoodieRecordPayload(record, orderingVal) {
needUpdatingPersistedRecord in DefaultHoodieRecordPayload is what enables the comparison with the historical value. The specific implementation is analyzed below.
For Spark DF in Hudi 0.9.0, the default value of PAYLOAD_CLASS_NAME is OverwriteWithLatestAvroPayload, which is the parent class of DefaultHoodieRecordPayload and does not compare with historical values. To compare with historical values and stay consistent with Spark SQL, I submitted a PR that changes the Spark DF default to DefaultHoodieRecordPayload. The PR has been merged and the change will be included in version 0.10.0.
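For reference, if you are still on 0.9.0 and want Spark DF writes to behave like Spark SQL, you can set the payload class explicitly when writing. The following is only a minimal sketch: the DataFrame, table name, path, and column names are made up for illustration, while the option keys are the standard Hudi datasource keys.

// Hypothetical DataFrame `df` with columns id, name, ts, dt
df.write.format("hudi")
  .option("hoodie.table.name", "test_table")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.datasource.write.precombine.field", "ts")
  .option("hoodie.datasource.write.partitionpath.field", "dt")
  .option("hoodie.datasource.write.operation", "upsert")
  // Use DefaultHoodieRecordPayload so that upserts compare the incoming ts with the historical ts
  .option("hoodie.datasource.write.payload.class",
    "org.apache.hudi.common.model.DefaultHoodieRecordPayload")
  .mode("append")
  .save("/tmp/hudi/test_table")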
Historical value comparison implementation
Let's briefly analyze the source code. First, the configuration item that controls the historical comparison value is:
HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY = "hoodie.payload.ordering.field"
Its default value is ts. So the ordering field is not the same configuration as preCombineField, but they share the same default value and both take effect inside the payload class, which makes them feel like the same thing; that is why they are summarized together.
HoodieMergeHandle
When Hudi merges small files during upsert, it goes through the write method of HoodieMergeHandle:
/**
 * Go through an old record. Here if we detect a newer version shows up, we write the new one to the file.
 */
public void write(GenericRecord oldRecord) {
  // Key of the historical record
  String key = KeyGenUtils.getRecordKeyFromGenericRecord(oldRecord, keyGeneratorOpt);
  boolean copyOldRecord = true;
  if (keyToNewRecords.containsKey(key)) { // If the new records contain this key, perform the merge logic
    // If we have duplicate records that we are updating, then the hoodie record will be deflated after
    // writing the first record. So make a copy of the record to be merged
    HoodieRecord<T> hoodieRecord = new HoodieRecord<>(keyToNewRecords.get(key));
    try {
      // The combineAndGetUpdateValue method of PAYLOAD_CLASS_NAME is called here
      Option<IndexedRecord> combinedAvroRecord =
          hoodieRecord.getData().combineAndGetUpdateValue(oldRecord,
            useWriterSchema ? tableSchemaWithMetaFields : tableSchema,
              config.getPayloadConfig().getProps());

      if (combinedAvroRecord.isPresent() && combinedAvroRecord.get().equals(IGNORE_RECORD)) {
        // If it is an IGNORE_RECORD, just copy the old record, and do not update the new record.
        copyOldRecord = true;
      } else if (writeUpdateRecord(hoodieRecord, oldRecord, combinedAvroRecord)) {
        /*
         * ONLY WHEN 1) we have an update for this key AND 2) We are able to successfully
         * write the combined new value
         *
         * We no longer need to copy the old record over.
         */
        copyOldRecord = false;
      }
      writtenRecordKeys.add(key);
    } catch (Exception e) {
      throw new HoodieUpsertException("Failed to combine/merge new record with old value in storage, for new record {"
          + keyToNewRecords.get(key) + "}, old value {" + oldRecord + "}", e);
    }
  }

  if (copyOldRecord) {
    // this should work as it is, since this is an existing record
    try {
      fileWriter.writeAvro(key, oldRecord);
    } catch (IOException | RuntimeException e) {
      String errMsg = String.format("Failed to merge old record into new file for key %s from old file %s to new file %s with writerSchema %s",
          key, getOldFilePath(), newFilePath, writeSchemaWithMetaFields.toString(true));
      LOG.debug("Old record is " + oldRecord);
      throw new HoodieUpsertException(errMsg, e);
    }
    recordsWritten++;
  }
}
combineAndGetUpdateValue method
Take a look at combineAndGetUpdateValue in DefaultHoodieRecordPayload:
/**
 * currentValue: the current value, i.e. the historical value
 *
 * Option<IndexedRecord> combinedAvroRecord =
 *     hoodieRecord.getData().combineAndGetUpdateValue(oldRecord,
 *       useWriterSchema ? tableSchemaWithMetaFields : tableSchema,
 *         config.getPayloadConfig().getProps());
 */
@Override
public Option<IndexedRecord> combineAndGetUpdateValue(IndexedRecord currentValue, Schema schema, Properties properties) throws IOException {
  // recordBytes holds the bytes of the new data
  if (recordBytes.length == 0) {
    return Option.empty();
  }
  // Convert recordBytes to an Avro GenericRecord
  GenericRecord incomingRecord = bytesToAvro(recordBytes, schema);

  // Null check is needed here to support schema evolution. The record in storage may be from old schema where
  // the new ordering column might not be present and hence returns null.
  // If the persisted record does not need updating, return the historical value
  if (!needUpdatingPersistedRecord(currentValue, incomingRecord, properties)) {
    return Option.of(currentValue);
  }

  /*
   * We reached a point where the value on disk is older than the incoming record.
   */
  eventTime = updateEventTime(incomingRecord, properties);

  /*
   * Now check if the incoming record is a delete record.
   */
  return isDeleteRecord(incomingRecord) ? Option.empty() : Option.of(incomingRecord);
}
As for the assignment of recordBytes, it happens in the parent class BaseAvroPayload. When writing data, we first construct a GenericRecord, then pass the record as a parameter to the payload, finally build a List<HoodieRecord>, and call HoodieJavaWriteClient.upsert(List<HoodieRecord> records, String instantTime):
public BaseAvroPayload(GenericRecord record, Comparable orderingVal) {
  this.recordBytes = record != null ? HoodieAvroUtils.avroToBytes(record) : new byte[0];
  this.orderingVal = orderingVal;
  if (orderingVal == null) {
    throw new HoodieException("Ordering value is null for record: " + record);
  }
}
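To make this concrete, here is a small Scala sketch of how such a payload and HoodieRecord could be built by hand. The Avro schema, field values, and partition path below are invented for illustration; only the class and constructor usage follows the code above.

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.hudi.common.model.{HoodieKey, HoodieRecord, OverwriteWithLatestAvroPayload}

// A made-up Avro schema with id, name, ts fields
val schema = new Schema.Parser().parse(
  """{"type":"record","name":"demo","fields":[
    |  {"name":"id","type":"string"},
    |  {"name":"name","type":"string"},
    |  {"name":"ts","type":"long"}]}""".stripMargin)

// 1. Build the GenericRecord (one row of data)
val record: GenericRecord = new GenericData.Record(schema)
record.put("id", "id1")
record.put("name", "hudi")
record.put("ts", 1000L)

// 2. Pass the record and the preCombineField value (orderingVal) to the payload;
//    the constructor stores the record as recordBytes and keeps orderingVal for preCombine
val payload = new OverwriteWithLatestAvroPayload(record, java.lang.Long.valueOf(1000L))

// 3. Wrap it into a HoodieRecord keyed by (recordKey, partitionPath)
val hoodieRecord = new HoodieRecord(new HoodieKey("id1", "2021-11-30"), payload)

// A java.util.List of such records is what upsert(records, instantTime) receives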
needUpdatingPersistedRecord
The comparison with the historical value is here:
protected boolean needUpdatingPersistedRecord(IndexedRecord currentValue,
                                              IndexedRecord incomingRecord, Properties properties) {
  /*
   * Combining strategy here returns currentValue on disk if incoming record is older.
   * The incoming record can be either a delete (sent as an upsert with _hoodie_is_deleted set to true)
   * or an insert/update record. In any case, if it is older than the record in disk, the currentValue
   * in disk is returned (to be rewritten with new commit time).
   *
   * NOTE: Deletes sent via EmptyHoodieRecordPayload and/or Delete operation type do not hit this code path
   * and need to be dealt with separately.
   */
  // Historical ts value
  Object persistedOrderingVal = getNestedFieldVal((GenericRecord) currentValue,
      properties.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY), true);
  // ts value of the new data
  Comparable incomingOrderingVal = (Comparable) getNestedFieldVal((GenericRecord) incomingRecord,
      properties.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY), false);
  // If the historical value is null, or the historical value is less than or equal to the new value,
  // return true, meaning the historical value will be overwritten by the update; otherwise no update happens
  return persistedOrderingVal == null || ((Comparable) persistedOrderingVal).compareTo(incomingOrderingVal) <= 0;
}
PAYLOAD_ORDERING_FIELD_PROP_KEY default
You can see that the properties parameter passed in HoodieMergeHandle above is config.getPayloadConfig().getProps(). getPayloadConfig returns a HoodiePayloadConfig, and the default value of PAYLOAD_ORDERING_FIELD_PROP_KEY defined in HoodiePayloadConfig is ts:
public HoodiePayloadConfig getPayloadConfig() {
  return hoodiePayloadConfig;
}

public class HoodiePayloadConfig extends HoodieConfig {

  public static final ConfigProperty<String> ORDERING_FIELD = ConfigProperty
      .key(PAYLOAD_ORDERING_FIELD_PROP_KEY)
      .defaultValue("ts")
      .withDocumentation("Table column/field name to order records that have the same key, before "
          + "merging and writing to storage.");
Pre-combine implementation
First, the pre-combine is implemented in OverwriteWithLatestAvroPayload.preCombine:
public class OverwriteWithLatestAvroPayload extends BaseAvroPayload
    implements HoodieRecordPayload<OverwriteWithLatestAvroPayload> {

  public OverwriteWithLatestAvroPayload(GenericRecord record, Comparable orderingVal) {
    super(record, orderingVal);
  }

  public OverwriteWithLatestAvroPayload(Option<GenericRecord> record) {
    this(record.isPresent() ? record.get() : null, 0); // natural order
  }

  @Override
  public OverwriteWithLatestAvroPayload preCombine(OverwriteWithLatestAvroPayload oldValue) {
    if (oldValue.recordBytes.length == 0) {
      // use natural order for delete record
      return this;
    }
    // If the orderingVal of the old value is greater than that of the new value, return the old value;
    // otherwise return the current new value, i.e. the record with the larger ordering value wins
    if (oldValue.orderingVal.compareTo(orderingVal) > 0) {
      // pick the payload with greatest ordering value
      return oldValue;
    } else {
      return this;
    }
  }
Therefore, both Spark SQL and Spark DF support pre-combine by default: ExpressionPayload and DefaultHoodieRecordPayload both extend OverwriteWithLatestAvroPayload, so all three payloads can perform the pre-combine. The key is how the payload is constructed; a small sketch of the preCombine behavior follows below.
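As a rough illustration of the preCombine semantics, the following Scala sketch exercises OverwriteWithLatestAvroPayload directly. The schema, field names, and values are made up; only the preCombine call reflects the code above.

import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload

val schema = new Schema.Parser().parse(
  """{"type":"record","name":"demo","fields":[
    |  {"name":"id","type":"string"},{"name":"ts","type":"long"}]}""".stripMargin)

def row(ts: Long) = {
  val r = new GenericData.Record(schema)
  r.put("id", "id1")
  r.put("ts", ts)
  r
}

// Two payloads for the same key but different ordering values (ts)
val older = new OverwriteWithLatestAvroPayload(row(1L), java.lang.Long.valueOf(1L))
val newer = new OverwriteWithLatestAvroPayload(row(2L), java.lang.Long.valueOf(2L))

// preCombine keeps the payload with the larger orderingVal, regardless of argument order
assert(older.preCombine(newer) eq newer)
assert(newer.preCombine(older) eq newer)
// When the ordering values are equal, the compareTo > 0 check means the current ("new") payload wins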
Construct the payload
From the code above we can see that OverwriteWithLatestAvroPayload has two constructors, one taking a single parameter and one taking two. The single-parameter constructor cannot support pre-combine, because the preCombine method needs orderingVal for the comparison, so OverwriteWithLatestAvroPayload should be constructed with the two-parameter constructor, where orderingVal is the value of the column specified by preCombineField and record is one row of data. Whether via Spark SQL or Spark DF, HoodieSparkSqlWriter.write is called in the end, and the payload is constructed inside this write method:
// Convert to RDD[HoodieRecord]
// First, convert the DataFrame to RDD[GenericRecord]
val genericRecords: RDD[GenericRecord] = HoodieSparkUtils.createRdd(df, structName, nameSpace, reconcileSchema,
  org.apache.hudi.common.util.Option.of(schema))
// Determine whether pre-combine is required
val shouldCombine = parameters(INSERT_DROP_DUPS.key()).toBoolean ||
  operation.equals(WriteOperationType.UPSERT) ||
  parameters.getOrElse(HoodieWriteConfig.COMBINE_BEFORE_INSERT.key(),
    HoodieWriteConfig.COMBINE_BEFORE_INSERT.defaultValue()).toBoolean
val hoodieAllIncomingRecords = genericRecords.map(gr => {
  val processedRecord = getProcessedRecord(partitionColumns, gr, dropPartitionColumns)
  val hoodieRecord = if (shouldCombine) { // If pre-combine is required
    // Get the value of PRECOMBINE_FIELD from the record. If the value does not exist, an exception is thrown,
    // because the pre-combine field is not allowed to contain null values
    val orderingVal = HoodieAvroUtils.getNestedFieldVal(gr, hoodieConfig.getString(PRECOMBINE_FIELD), false)
      .asInstanceOf[Comparable[_]]
    // Then construct the payload corresponding to PAYLOAD_CLASS_NAME via reflection
    DataSourceUtils.createHoodieRecord(processedRecord,
      orderingVal, keyGenerator.getKey(gr),
      hoodieConfig.getString(PAYLOAD_CLASS_NAME))
  } else {
    // If pre-combine is not required, the payload is also constructed via reflection,
    // but without the orderingVal parameter
    DataSourceUtils.createHoodieRecord(processedRecord, keyGenerator.getKey(gr),
      hoodieConfig.getString(PAYLOAD_CLASS_NAME))
  }
  hoodieRecord
}).toJavaRDD()
As the commented source above shows, if pre-combine is required, the PRECOMBINE_FIELD value orderingVal is first taken from the record, and then the payload is constructed, that is:
new OverwriteWithLatestAvroPayload(record, orderingVal)
Now the payload has been constructed, so where is the pre-combine finally carried out?
Call preCombine
Take the upsert of a COW table as an example, i.e. HoodieJavaCopyOnWriteTable.upsert:
// HoodieJavaCopyOnWriteTable
@Override
public HoodieWriteMetadata<List<WriteStatus>> upsert(HoodieEngineContext context,
                                                     String instantTime,
                                                     List<HoodieRecord<T>> records) {
  return new JavaUpsertCommitActionExecutor<>(context, config,
      this, instantTime, records).execute();
}

// JavaUpsertCommitActionExecutor
@Override
public HoodieWriteMetadata<List<WriteStatus>> execute() {
  return JavaWriteHelper.newInstance().write(instantTime, inputRecords, context, table,
      config.shouldCombineBeforeUpsert(), config.getUpsertShuffleParallelism(), this, true);
}

// AbstractWriteHelper
public HoodieWriteMetadata<O> write(String instantTime, I inputRecords, HoodieEngineContext context,
                                    HoodieTable<T, I, K, O> table, boolean shouldCombine,
                                    int shuffleParallelism, BaseCommitActionExecutor<T, I, K, O, R> executor,
                                    boolean performTagging) {
  try {
    // De-dupe/merge if needed
    I dedupedRecords =
        combineOnCondition(shouldCombine, inputRecords, shuffleParallelism, table);

    Instant lookupBegin = Instant.now();
    I taggedRecords = dedupedRecords;
    if (performTagging) {
      // perform index loop up to get existing location of records
      taggedRecords = tag(dedupedRecords, context, table);
    }
    Duration indexLookupDuration = Duration.between(lookupBegin, Instant.now());

    HoodieWriteMetadata<O> result = executor.execute(taggedRecords);
    result.setIndexLookupDuration(indexLookupDuration);
    return result;
  } catch (Throwable e) {
    if (e instanceof HoodieUpsertException) {
      throw (HoodieUpsertException) e;
    }
    throw new HoodieUpsertException("Failed to upsert for commit time " + instantTime, e);
  }
}

public I combineOnCondition(
    boolean condition, I records, int parallelism, HoodieTable<T, I, K, O> table) {
  return condition ? deduplicateRecords(records, table, parallelism) : records;
}

/**
 * Deduplicate Hoodie records, using the given deduplication function.
 *
 * @param records     hoodieRecords to deduplicate
 * @param parallelism parallelism or partitions to be used while reducing/deduplicating
 * @return Collection of HoodieRecord already be deduplicated
 */
public I deduplicateRecords(
    I records, HoodieTable<T, I, K, O> table, int parallelism) {
  return deduplicateRecords(records, table.getIndex(), parallelism);
}

// SparkWriteHelper
@Override
public JavaRDD<HoodieRecord<T>> deduplicateRecords(
    JavaRDD<HoodieRecord<T>> records, HoodieIndex<T, ?, ?, ?> index, int parallelism) {
  boolean isIndexingGlobal = index.isGlobal();
  return records.mapToPair(record -> {
    HoodieKey hoodieKey = record.getKey();
    // If index used is global, then records are expected to differ in their partitionPath
    // Get the key of the record
    Object key = isIndexingGlobal ? hoodieKey.getRecordKey() : hoodieKey;
    // Return (key, record)
    return new Tuple2<>(key, record);
  }).reduceByKey((rec1, rec2) -> {
    @SuppressWarnings("unchecked")
    // For records with the same key, preCombine returns the one with the larger preCombineField value
    T reducedData = (T) rec2.getData().preCombine(rec1.getData());
    HoodieKey reducedKey = rec1.getData().equals(reducedData) ? rec1.getKey() : rec2.getKey();

    return new HoodieRecord<T>(reducedKey, reducedData);
  }, parallelism).map(Tuple2::_2);
}
This is how the pre-combine takes effect; a rough end-to-end sketch is shown below.
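The following hypothetical local Spark example (path, table name, and data are invented) shows the effect: two rows with the same key in a single batch are deduplicated by preCombine, and the one with the larger ts survives.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("precombine-demo").master("local[*]")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .getOrCreate()
import spark.implicits._

// Two rows with the same record key "1" in one batch; the row with ts = 2 should win
val df = Seq((1, "a1_old", 1L, "2021-11-30"), (1, "a1_new", 2L, "2021-11-30"))
  .toDF("id", "name", "ts", "dt")

df.write.format("hudi")
  .option("hoodie.table.name", "precombine_demo")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.datasource.write.precombine.field", "ts")
  .option("hoodie.datasource.write.partitionpath.field", "dt")
  .option("hoodie.datasource.write.operation", "upsert")
  .mode("overwrite")
  .save("/tmp/hudi/precombine_demo")

// Only one row remains for id = 1, and its name should be "a1_new"
spark.read.format("hudi").load("/tmp/hudi/precombine_demo")
  .select("id", "name", "ts").show()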
Modify the historical comparison value
Finally, let's talk about how to modify the historical comparison value. In fact, with Spark SQL and Spark DF there is usually no need to modify it explicitly, because by default it is kept in sync with preCombineField. Let's see how the program keeps them in sync. Both SQL and DF eventually call HoodieSparkSqlWriter.write:
// Create a HoodieWriteClient & issue the delete.
val client = hoodieWriteClient.getOrElse(DataSourceUtils.createHoodieClient(jsc,
  null, path, tblName,
  mapAsJavaMap(parameters - HoodieWriteConfig.AUTO_COMMIT_ENABLE.key)))
  .asInstanceOf[SparkRDDWriteClient[HoodieRecordPayload[Nothing]]]

public static SparkRDDWriteClient createHoodieClient(JavaSparkContext jssc, String schemaStr, String basePath,
                                                     String tblName, Map<String, String> parameters) {
  return new SparkRDDWriteClient<>(new HoodieSparkEngineContext(jssc), createHoodieConfig(schemaStr, basePath, tblName, parameters));
}

public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
                                                   String tblName, Map<String, String> parameters) {
  boolean asyncCompact = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_COMPACT_ENABLE().key()));
  boolean inlineCompact = !asyncCompact && parameters.get(DataSourceWriteOptions.TABLE_TYPE().key())
      .equals(DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL());
  boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
  boolean inlineClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.INLINE_CLUSTERING_ENABLE().key()));
  // insert/bulk-insert combining to be true, if filtering for duplicates
  boolean combineInserts = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.INSERT_DROP_DUPS().key()));
  HoodieWriteConfig.Builder builder = HoodieWriteConfig.newBuilder()
      .withPath(basePath).withAutoCommit(false).combineInput(combineInserts, true);
  if (schemaStr != null) {
    builder = builder.withSchema(schemaStr);
  }

  return builder.forTable(tblName)
      .withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(IndexType.BLOOM).build())
      .withCompactionConfig(HoodieCompactionConfig.newBuilder()
          .withPayloadClass(parameters.get(DataSourceWriteOptions.PAYLOAD_CLASS_NAME().key()))
          .withInlineCompaction(inlineCompact).build())
      .withClusteringConfig(HoodieClusteringConfig.newBuilder()
          .withInlineClustering(inlineClusteringEnabled)
          .withAsyncClustering(asyncClusteringEnabled).build())
      // Here the payload ordering field is set to the value of PRECOMBINE_FIELD,
      // so by default it changes in sync with PRECOMBINE_FIELD
      .withPayloadConfig(HoodiePayloadConfig.newBuilder()
          .withPayloadOrderingField(parameters.get(DataSourceWriteOptions.PRECOMBINE_FIELD().key()))
          .build())
      // override above with Hoodie configs specified as options.
      .withProps(parameters).build();
}
If you really want to set the ordering field to a value different from PRECOMBINE_FIELD:
SQL:
set hoodie.payload.ordering.field=ts;
DF:
.option("hoodie.payload.ordering.field", "ts") or .option(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY, "ts")
This article was written by Dong Kelun and published on Lun Shao's blog under the Attribution-NonCommercial-NoDerivatives 3.0 license.
For non-commercial reprint, please credit the author and the source. For commercial reprint, please contact the author.
Title: HUDI preCombinedField summary (II) - source code analysis
Link to this article: https://dongkelun.com/2021/11/30/hudiPreCombineField2/