2021SC@SDUSC HBase project code analysis - flush

Keywords: HBase


        In the fourth article, we explored how cacheFlusher is initialized. Now let's look at how it handles flush requests.

        From the previous article, we know that MemStoreFlusher keeps flush requests in two structures: flushQueue, the queue of pending requests, and regionsInQueue, a map from each HRegion to the FlushRegionEntry that wraps it. MemStoreFlusher provides a requestFlush() method for adding requests:

public void requestFlush(HRegion r) {
   synchronized (regionsInQueue) {
      if (!regionsInQueue.containsKey(r)) {
        FlushRegionEntry fqe = new FlushRegionEntry(r);
        this.regionsInQueue.put(r, fqe);
        this.flushQueue.add(fqe);
      }
   }
}

        The requestFlush() method adds a flush request for a region to MemStoreFlusher's internal queue. The main steps are:
        1. Synchronize on regionsInQueue with synchronized, so that only one thread at a time can check and update it

        2. Check whether the HRegion is already in regionsInQueue. If it is, a flush request for it is already pending, so return without doing anything

        3. Wrap r, of type HRegion, in fqe, of type FlushRegionEntry

        4. Add the HRegion -> FlushRegionEntry mapping to regionsInQueue

        5. Add the FlushRegionEntry to flushQueue
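        The steps above can be sketched in a small standalone class. This is only an illustration of the "check the map, then add to both the map and the queue" pattern, not HBase code: FlushRequestSketch, Entry and pendingRequests() are hypothetical names, a region is represented by a plain String, and the boolean return value is added only to make the behaviour visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for MemStoreFlusher; a region is just a name here.
class FlushRequestSketch {

    // Hypothetical stand-in for FlushRegionEntry: it never has any delay.
    static final class Entry implements Delayed {
        final String regionName;

        Entry(String regionName) {
            this.regionName = regionName;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return 0;
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    private final Map<String, Entry> regionsInQueue = new HashMap<>();
    private final DelayQueue<Entry> flushQueue = new DelayQueue<>();

    // Returns true if a new request was queued, false if one was already pending.
    boolean requestFlush(String regionName) {
        synchronized (regionsInQueue) {                      // step 1
            if (regionsInQueue.containsKey(regionName)) {    // step 2
                return false;
            }
            Entry entry = new Entry(regionName);             // step 3
            regionsInQueue.put(regionName, entry);           // step 4
            flushQueue.add(entry);                           // step 5
            return true;
        }
    }

    int pendingRequests() {
        return flushQueue.size();
    }
}
```

        Requesting a flush twice for the same region leaves only one entry in the queue, which is the point of keeping the map next to the queue.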

        At this point the flush request is in flushQueue. This is a producer-consumer arrangement: the producer has produced an item, which now waits to be consumed, and the consumer is the FlushHandler thread. Since FlushHandler is a thread, its processing logic is in its run() method. Before studying run(), let's look at what is stored in flushQueue:

        Recall the definition of flushQueue: it is a DelayQueue that stores FlushQueueEntry objects. FlushQueueEntry is an interface that extends Delayed, and it has two implementations, FlushRegionEntry and WakeupFlushThread. Before introducing them, let's look at the queue type behind flushQueue, Java's DelayQueue.

        DelayQueue is an unbounded BlockingQueue whose elements must implement the Delayed interface, which is why FlushQueueEntry extends java.util.concurrent.Delayed. The defining property of this queue is that an element can be taken out only after its delay has expired, and that the elements are ordered so that the one whose delay expired furthest in the past is at the head. How is expiration determined? An element has expired when its getDelay() method returns a value less than or equal to 0.

        Because the elements of a DelayQueue are ordered, a class that implements Delayed must also be comparable: Delayed extends Comparable<Delayed>, so compareTo() has to be implemented, along with the getDelay() method described above, which tells the queue whether an element has expired.
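        To make the DelayQueue behaviour concrete, here is a minimal sketch using only the JDK. DelayQueueSketch, Task and demo() are hypothetical names made up for this example. It shows that poll() returns only elements whose getDelay() is at or below 0, while the timed poll() waits for the head to expire.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class DelayQueueSketch {

    // Hypothetical element type: expires delayMillis after it is created.
    static final class Task implements Delayed {
        final String name;
        final long expireAtMillis;

        Task(String name, long delayMillis) {
            this.name = name;
            this.expireAtMillis = System.currentTimeMillis() + delayMillis;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining time; zero or negative means "expired".
            return unit.convert(expireAtMillis - System.currentTimeMillis(),
                TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    private static String nameOf(Task task) {
        return task == null ? "none" : task.name;
    }

    // Returns the names obtained by three successive polls.
    static String[] demo() {
        DelayQueue<Task> queue = new DelayQueue<>();
        queue.add(new Task("later", 500));
        queue.add(new Task("now", 0));
        try {
            Task first = queue.poll();                     // "now" has already expired
            Task second = queue.poll();                    // "later" has not expired yet
            Task third = queue.poll(5, TimeUnit.SECONDS);  // waits until "later" expires
            return new String[] { nameOf(first), nameOf(second), nameOf(third) };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

        Although "later" is added first, "now" comes out first because the queue is ordered by compareTo(), and the second poll() returns null because the remaining element has not expired yet.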

         Next, let's study WakeupFlushThread and FlushRegionEntry.

         WakeupFlushThread code is as follows:

static class WakeupFlushThread implements FlushQueueEntry {
    @Override
    public long getDelay(TimeUnit unit) {
      return 0;
    }
 
    @Override
    public int compareTo(Delayed o) {
      return -1;
    }
 
    @Override
    public boolean equals(Object obj) {
      return (this == obj);
    }
}

        This class is inserted into the flush queue as a placeholder, or token, to ensure that the FlushHandler does not sleep. Its getDelay() method returns 0, meaning it has no delay: as soon as it is enqueued it can be dequeued. Its compareTo() method always returns -1, so it sorts ahead of whatever entry it is compared with and moves to the head of the queue when it is inserted.
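        The effect of "getDelay() returns 0, compareTo() returns -1" can be checked with a small JDK-only sketch. The names WakeupTokenSketch, QueueEntry, Token and DelayedEntry are hypothetical stand-ins, not the HBase classes. A token is added behind an entry that will not expire for a minute, and the token is still the first thing that comes out.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class WakeupTokenSketch {

    // Hypothetical stand-in for FlushQueueEntry.
    interface QueueEntry extends Delayed {}

    // Hypothetical stand-in for WakeupFlushThread.
    static final class Token implements QueueEntry {
        @Override
        public long getDelay(TimeUnit unit) {
            return 0;   // never delayed
        }

        @Override
        public int compareTo(Delayed other) {
            return -1;  // always claims to sort before the other element
        }
    }

    // An ordinary entry that expires delayMillis after it is created.
    static final class DelayedEntry implements QueueEntry {
        final long expireAtMillis;

        DelayedEntry(long delayMillis) {
            this.expireAtMillis = System.currentTimeMillis() + delayMillis;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expireAtMillis - System.currentTimeMillis(),
                TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    static boolean tokenComesOutFirst() {
        DelayQueue<QueueEntry> queue = new DelayQueue<>();
        queue.add(new DelayedEntry(60_000));  // will not expire during this demo
        queue.add(new Token());               // moves to the head on insertion
        QueueEntry head = queue.poll();       // the token, available immediately
        QueueEntry next = queue.poll();       // null: the other entry is still delayed
        return head instanceof Token && next == null;
    }
}
```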

         FlushRegionEntry is defined as follows:

  static class FlushRegionEntry implements FlushQueueEntry {
    private final HRegion region;
    private final long createTime;
    private long whenToExpire;
    private int requeueCount = 0;
 
    FlushRegionEntry(final HRegion r) {
      this.region = r;
      this.createTime = EnvironmentEdgeManager.currentTime();
      this.whenToExpire = this.createTime;
    }
 
    /**
     * @param maximumWait
     * @return True if we have been delayed > <code>maximumWait</code> milliseconds.
     */
    public boolean isMaximumWait(final long maximumWait) {
      return (EnvironmentEdgeManager.currentTime() - this.createTime) > maximumWait;
    }
 
    /**
     * @return Count of times {@link #requeue(long)} was called; i.e this is
     * number of times we've been requeued.
     */
    public int getRequeueCount() {
      return this.requeueCount;
    }
 
    /**
     * @param when When to expire, when to come up out of the queue.
     * Specify in milliseconds.  This method adds EnvironmentEdgeManager.currentTime()
     * to whatever you pass.
     * @return This.
     */
    public FlushRegionEntry requeue(final long when) {
      this.whenToExpire = EnvironmentEdgeManager.currentTime() + when;
      this.requeueCount++;
      return this;
    }
    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(this.whenToExpire - EnvironmentEdgeManager.currentTime(),
          TimeUnit.MILLISECONDS);
    }
    @Override
    public int compareTo(Delayed other) {
      // Delay is compared first. If there is a tie, compare region's hash code
      int ret = Long.valueOf(getDelay(TimeUnit.MILLISECONDS) -
        other.getDelay(TimeUnit.MILLISECONDS)).intValue();
      if (ret != 0) {
        return ret;
      }
      FlushQueueEntry otherEntry = (FlushQueueEntry) other;
      return hashCode() - otherEntry.hashCode();
    }
 
    @Override
    public String toString() {
      return "[flush region " + Bytes.toStringBinary(region.getRegionName()) + "]";
    }
 
    @Override
    public int hashCode() {
      int hash = (int) getDelay(TimeUnit.MILLISECONDS);
      return hash ^ region.hashCode();
    }
 
    @Override
    public boolean equals(Object obj) {
      if (this == obj) {
        return true;
      }
      if (obj == null || getClass() != obj.getClass()) {
        return false;
      }
      Delayed other = (Delayed) obj;
      return compareTo(other) == 0;
    }
  }

        When a FlushRegionEntry is constructed, createTime is set to the current time and whenToExpire is set to the same value. getDelay() returns whenToExpire minus the current time, so a new entry has a delay of 0 or less and can be dequeued right away. compareTo() orders entries by getDelay() first; if the delays are equal, it falls back to hashCode(). The class also provides requeue(), which puts off the entry's expiry: it increments requeueCount by 1 and sets whenToExpire to the current time plus the parameter when.
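        The timing arithmetic is easier to follow with concrete numbers. The sketch below copies only the time-related fields of FlushRegionEntry into a hypothetical RequeueSketch.Entry and replaces EnvironmentEdgeManager.currentTime() with a manual clock, nowMillis, which is an assumption made so that the example is deterministic. delayMillis() is a hypothetical shortcut for getDelay(TimeUnit.MILLISECONDS).

```java
import java.util.concurrent.TimeUnit;

class RequeueSketch {

    // Hypothetical manual clock standing in for EnvironmentEdgeManager.currentTime().
    static long nowMillis = 1_000L;

    // Hypothetical copy of the time-related parts of FlushRegionEntry.
    static final class Entry {
        private final long createTime = nowMillis;
        private long whenToExpire = createTime;
        private int requeueCount = 0;

        // Push the expiry to "now + when" and count the requeue.
        Entry requeue(long when) {
            this.whenToExpire = nowMillis + when;
            this.requeueCount++;
            return this;
        }

        long getDelay(TimeUnit unit) {
            return unit.convert(whenToExpire - nowMillis, TimeUnit.MILLISECONDS);
        }

        long delayMillis() {
            return getDelay(TimeUnit.MILLISECONDS);
        }

        // True once the entry has existed for longer than maximumWait milliseconds.
        boolean isMaximumWait(long maximumWait) {
            return (nowMillis - createTime) > maximumWait;
        }

        int getRequeueCount() {
            return requeueCount;
        }
    }
}
```

        With the clock at 1000, a new entry has a delay of 0. After requeue(500) the delay is 500 and getRequeueCount() is 1. Moving the clock to 1500 brings the delay back to 0, and isMaximumWait(400) is then true because the entry has existed for 500 milliseconds.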

        Finally, let's take a look at the actual processing flow of the flush request:

@Override
public void run() {
      while (!server.isStopped()) {
        FlushQueueEntry fqe = null;
        
        try {
          
          wakeupPending.set(false);

          fqe = flushQueue.poll(threadWakeFrequency, TimeUnit.MILLISECONDS);
          
          if (fqe == null || fqe instanceof WakeupFlushThread) {
          
            if (isAboveLowWaterMark()) {

              LOG.debug("Flush thread woke up because memory above low water="
                  + StringUtils.humanReadableInt(globalMemStoreLimitLowMark));
              
              if (!flushOneForGlobalPressure()) {
                Thread.sleep(1000);
                wakeUpIfBlocking();
              }
              wakeupFlushThread();
            }
            continue;
          }
          
          FlushRegionEntry fre = (FlushRegionEntry) fqe;
          
          if (!flushRegion(fre)) {
            break;
          }
        } catch (InterruptedException ex) {
          continue;
        } catch (ConcurrentModificationException ex) {
          continue;
        } catch (Exception ex) {
          LOG.error("Cache flusher failed for entry " + fqe, ex);
          if (!server.checkFileSystem()) {
            break;
          }
        }
      }

      synchronized (regionsInQueue) {
        regionsInQueue.clear();
        flushQueue.clear();
      }
 
      wakeUpIfBlocking();
      
      LOG.info(getName() + " exiting");
}

        The processing flow is as follows:

        1. The loop keeps running as long as the HRegionServer is not stopped

        2. Set wakeupPending to false

        3. Poll a FlushQueueEntry from flushQueue, waiting at most threadWakeFrequency milliseconds. If the result is null or a WakeupFlushThread, call isAboveLowWaterMark() to check the global MemStore size. If it is above the low water mark, call flushOneForGlobalPressure(), which chooses one HRegion according to its selection policy and flushes its MemStore to reduce memory use; if that call returns false, the thread sleeps for one second and wakes up any blocked waiters. Either way, wakeupFlushThread() then enqueues another token so that the thread checks again soon. If the result is neither null nor a WakeupFlushThread, cast it to FlushRegionEntry and call flushRegion(); if that returns false, break out of the loop

        4. After the loop ends, clear regionsInQueue and flushQueue

        5. Wake up all waiters so that they can see the close flag

        6. Log that the thread is exiting
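        The branching in step 3 can be sketched as a single loop iteration. This is a simplified model, not the HBase implementation: FlushLoopSketch and its members are hypothetical, the aboveLowWaterMark flag stands in for isAboveLowWaterMark(), and the actual flushing is replaced by adding a line to a list so that the chosen branch can be observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class FlushLoopSketch {

    interface QueueEntry extends Delayed {}

    // Hypothetical stand-in for WakeupFlushThread.
    static final class WakeupToken implements QueueEntry {
        @Override
        public long getDelay(TimeUnit unit) { return 0; }

        @Override
        public int compareTo(Delayed other) { return -1; }
    }

    // Hypothetical stand-in for FlushRegionEntry, with no delay.
    static final class RegionEntry implements QueueEntry {
        final String regionName;

        RegionEntry(String regionName) { this.regionName = regionName; }

        @Override
        public long getDelay(TimeUnit unit) { return 0; }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    final DelayQueue<QueueEntry> flushQueue = new DelayQueue<>();
    final List<String> actions = new ArrayList<>();
    boolean aboveLowWaterMark = false;  // stands in for isAboveLowWaterMark()

    // One iteration of the consumer loop.
    void step(long pollTimeoutMillis) {
        QueueEntry entry;
        try {
            entry = flushQueue.poll(pollTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        if (entry == null || entry instanceof WakeupToken) {
            // Timed out or woken by a token: flush only under memory pressure.
            if (aboveLowWaterMark) {
                actions.add("flush one region for global pressure");
            }
            return;
        }
        // A real request: flush the region it names.
        actions.add("flush " + ((RegionEntry) entry).regionName);
    }
}
```

        A region entry leads to a flush of that region, while a token or a timeout leads to a pressure flush only when the flag is set; otherwise the iteration does nothing.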

        To sum up, WakeupFlushThread is inserted into flushQueue as a placeholder, or token, so that the FlushHandler does not sleep. It also has a second role. While the FlushHandler thread keeps polling flushQueue, getting a WakeupFlushThread triggers a check of whether the global MemStore size of the RegionServer is above the low water mark. If it is not, the WakeupFlushThread has served only as a placeholder. If it is, the FlushHandler selects a Region on the RegionServer according to its selection policy and flushes its MemStore to relieve the RegionServer's memory pressure.

        If you find any mistakes, corrections are welcome.

Posted by archbeta on Thu, 28 Oct 2021 06:35:10 -0700