[RocketMQ] Message Storage Notes

Summary

Message middleware generally stores messages in one of three ways. The first keeps messages purely in memory, which is fast but loses messages if the system goes down. The second keeps messages in memory and writes them to a database periodically; messages are persistent, but database read/write throughput becomes the bottleneck of the MQ. The third combines memory and disk, periodically saving messages to disk files; here the design of the storage mechanism determines whether the MQ can be highly concurrent and highly available.

These notes read the RocketMQ source code to answer two questions:

  • How RocketMQ designs its storage mechanism
  • Which techniques it uses to make storage efficient

Storage mechanism

Messages are stored in files, so something has to manage those files; that is the job of MappedFile. The MappedFiles are in turn managed by MappedFileQueue, which acts like a folder and maintains a CopyOnWriteArrayList<MappedFile> mappedFiles.

public class MappedFile {

    //Position up to which messages have been written into memory
    protected final AtomicInteger wrotePosition = new AtomicInteger(0);
    //Position up to which data has been committed to the FileChannel
    protected final AtomicInteger committedPosition = new AtomicInteger(0);
    //Position up to which data has been flushed to the physical file
    private final AtomicInteger flushedPosition = new AtomicInteger(0);
    //File size; the CommitLog default is 1G
    protected int fileSize;
    //NIO channel of the underlying file
    protected FileChannel fileChannel;
    //The underlying file
    private File file;
    //Start offset of this file, parsed from the file name
    private long fileFromOffset;
    //In-memory buffer that temporarily holds written messages (may be null)
    protected ByteBuffer writeBuffer = null;
    //Memory-mapped view of the file
    protected MappedByteBuffer mappedByteBuffer = null;


    private void init(final String fileName, final int fileSize) throws IOException {
        this.fileSize = fileSize;
        this.file = new File(fileName);
        //The file name is the start offset of the file
        this.fileFromOffset = Long.parseLong(this.file.getName());
        ensureDirOK(this.file.getParent());
        this.fileChannel = new RandomAccessFile(this.file, "rw").getChannel();
        this.mappedByteBuffer = this.fileChannel.map(MapMode.READ_WRITE, 0, fileSize);
    }
}
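
To make the memory-mapping step in init more concrete, here is a minimal, standalone sketch that uses only the JDK. It is not RocketMQ code: the class name MmapSketch, the method roundTrip, and the temporary file are assumptions made for illustration. It maps a file in READ_WRITE mode, writes through the mapping, forces the pages to disk, and reads the bytes back through the channel.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of RocketMQ.
final class MmapSketch {

    /** Writes text through a memory mapping, then reads it back through the channel. */
    static String roundTrip(String text, int fileSize) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path path = Files.createTempFile("mmap-sketch", ".bin");
            path.toFile().deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
                 FileChannel channel = raf.getChannel()) {
                // Mapping in READ_WRITE mode grows the file to fileSize if needed.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
                mapped.put(data);   // write into the mapped region
                mapped.force();     // ask the OS to write the dirty pages to disk

                // Read the same bytes back via the channel, starting at offset 0.
                ByteBuffer readBack = ByteBuffer.allocate(data.length);
                long pos = 0;
                while (readBack.hasRemaining()) {
                    int n = channel.read(readBack, pos);
                    if (n < 0) {
                        break;
                    }
                    pos += n;
                }
                return new String(readBack.array(), 0, readBack.position(),
                        StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", 4096));
    }
}
```

The point of the sketch is that a write to the MappedByteBuffer is visible through the FileChannel of the same file without an explicit write call, which is why MappedFile can hold both.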

MappedFile names (file.getName()) look like 00000000000000000000, 00000000001073741824, 00000000002147483648, following fileName[n] = fileName[n - 1] + mappedFileSize. Each file is named after its starting offset: 00000000001073741824 corresponds to an offset of 1G (1073741824 bytes), so the file name is the start offset of the file.
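
The naming rule can be sketched in a few lines. This is an illustration rather than the RocketMQ implementation: CommitLogFileNames, fileNameFor and nthFileName are hypothetical names, and String.format is used here as a stand-in for whatever formatting utility the broker actually uses.

```java
// Hypothetical helper, not part of RocketMQ.
final class CommitLogFileNames {

    /** Formats a start offset as a 20-digit, zero-padded file name. */
    static String fileNameFor(long startOffset) {
        return String.format("%020d", startOffset);
    }

    /** Name of the n-th file (0-based) when every file has the same size. */
    static String nthFileName(int n, long mappedFileSize) {
        return fileNameFor(n * mappedFileSize);
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        for (int n = 0; n < 3; n++) {
            String name = nthFileName(n, oneGiB);
            // Parsing the name gives back the start offset, as init does.
            System.out.println(name + " -> " + Long.parseLong(name));
        }
    }
}
```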

MappedFile provides three kinds of operations: appending messages, committing messages to the FileChannel, and flushing to disk. They are exposed through four methods:

1. AppendMessageResult appendMessagesInner(MessageExt messageExt, final AppendMessageCallback cb)

2. boolean appendMessage(final byte[] data, final int offset, final int length)

3. int commit(final int commitLeastPages)

4. int flush(final int flushLeastPages)

First, look at the append operation.

MappedFile#appendMessagesInner

public AppendMessageResult appendMessagesInner(final MessageExt messageExt, final AppendMessageCallback cb) {

        int currentPos = this.wrotePosition.get();

        if (currentPos < this.fileSize) {
            ByteBuffer byteBuffer = writeBuffer != null ? writeBuffer.slice() : this.mappedByteBuffer.slice();
            byteBuffer.position(currentPos);
            AppendMessageResult result = cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, messageExt);
            this.wrotePosition.addAndGet(result.getWroteBytes());
            this.storeTimestamp = result.getStoreTimestamp();
            return result;
        }
       .......
    }

1. Get the current write position and take a slice of the buffer (writeBuffer if it is present, otherwise mappedByteBuffer).

2. Set the slice's position to the current write position, so that writing continues right after the last message.

3. The AppendMessageCallback callback does the actual writing. It is provided by CommitLog and adds extra fields to the message, such as the total message length and timestamps. The stored layout is as follows:

No.  Field                          Description                                            Data type      Bytes
1    MsgLen                         Total message length                                   Int            4
2    MagicCode                      MESSAGE_MAGIC_CODE                                     Int            4
3    BodyCRC                        CRC of the message body                                Int            4
4    QueueId                        Message queue number                                   Int            4
5    Flag                           Flag                                                   Int            4
6    QueueOffset                    Position in the message queue                          Long           8
7    PhysicalOffset                 Physical position (sequential offset in CommitLog)     Long           8
8    SysFlag                        MessageSysFlag                                         Int            4
9    BornTimestamp                  Timestamp at which the message was produced            Long           8
10   BornHost                       Address + port of the message producer                 Long           8
11   StoreTimestamp                 Timestamp at which the message was stored              Long           8
12   StoreHost                      Address + port of the storing broker                   Long           8
13   ReconsumeTimes                 Number of times the message has been reconsumed        Int            4
14   PreparedTransactionOffset      Prepared transaction offset                            Long           8
15   BodyLength + Body              Body length + body                                     Int + Bytes    4 + bodyLength
16   TopicLength + Topic            Topic length + topic                                   Byte + Bytes   1 + topicLength
17   PropertiesLength + Properties  Extended field length + extended fields                Short + Bytes  2 + propertiesLength
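
Summing the byte counts in the table gives the total record size that goes into MsgLen. The sketch below only restates the table as arithmetic; MessageLengthSketch and calcMsgLength are hypothetical names, and the field widths are taken from the table as given, not checked against a specific RocketMQ version.

```java
// Hypothetical helper that mirrors the layout table above.
final class MessageLengthSketch {

    /** Total stored length of one message for the given variable-length parts. */
    static int calcMsgLength(int bodyLength, int topicLength, int propertiesLength) {
        return 4                        // MsgLen
                + 4                     // MagicCode
                + 4                     // BodyCRC
                + 4                     // QueueId
                + 4                     // Flag
                + 8                     // QueueOffset
                + 8                     // PhysicalOffset
                + 4                     // SysFlag
                + 8                     // BornTimestamp
                + 8                     // BornHost
                + 8                     // StoreTimestamp
                + 8                     // StoreHost
                + 4                     // ReconsumeTimes
                + 8                     // PreparedTransactionOffset
                + 4 + bodyLength        // BodyLength + Body
                + 1 + topicLength       // TopicLength + Topic
                + 2 + propertiesLength; // PropertiesLength + Properties
    }

    public static void main(String[] args) {
        // Fixed overhead is 91 bytes; a 100-byte body, 9-byte topic and
        // 20 bytes of properties therefore need 220 bytes.
        System.out.println(calcMsgLength(0, 0, 0));
        System.out.println(calcMsgLength(100, 9, 20));
    }
}
```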

After encoding, the message is written to the buffer as a byte array. The callback returns the number of bytes written, and wrotePosition is advanced by that length (WroteBytes). In other words, the append operation works at the message and ByteBuffer level.
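
The append steps can be sketched with a much simpler record format. This is a simplified model under stated assumptions, not the CommitLog callback: AppendSketch is a hypothetical class, each record is just a 4-byte length followed by the body, and a heap ByteBuffer stands in for writeBuffer or mappedByteBuffer.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model of MappedFile's append path.
final class AppendSketch {
    private final ByteBuffer buffer;
    private final AtomicInteger wrotePosition = new AtomicInteger(0);

    AppendSketch(int fileSize) {
        this.buffer = ByteBuffer.allocate(fileSize);
    }

    /** Appends one length-prefixed record; returns bytes written, or -1 if it does not fit. */
    int append(byte[] body) {
        int currentPos = wrotePosition.get();
        int recordLength = 4 + body.length;          // length field + body
        if (recordLength > buffer.capacity() - currentPos) {
            return -1;                               // not enough room left in this file
        }
        ByteBuffer slice = buffer.slice();           // independent position, shared content
        slice.position(currentPos);                  // continue after the last record
        slice.putInt(recordLength);
        slice.put(body);
        wrotePosition.addAndGet(recordLength);       // advance by the bytes just written
        return recordLength;
    }

    int getWrotePosition() {
        return wrotePosition.get();
    }

    public static void main(String[] args) {
        AppendSketch file = new AppendSketch(16);
        System.out.println(file.append(new byte[4]) + " " + file.getWrotePosition());
        System.out.println(file.append(new byte[4]) + " " + file.getWrotePosition());
        System.out.println(file.append(new byte[1]) + " " + file.getWrotePosition());
    }
}
```

Using a slice keeps the shared buffer's own position untouched, so the only state that tracks progress is wrotePosition.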

commit operation

public int commit(final int commitLeastPages) {

        if (this.isAbleToCommit(commitLeastPages)) {
            if (this.hold()) {
                commit0(commitLeastPages);
                this.release();
            } 
        }
        // All dirty data has been committed to FileChannel.
        if (writeBuffer != null && this.transientStorePool != null && this.fileSize == this.committedPosition.get()) {
            this.transientStorePool.returnBuffer(writeBuffer);
            this.writeBuffer = null;
        }
        return this.committedPosition.get();
}

protected void commit0(final int commitLeastPages) {
        int writePos = this.wrotePosition.get();
        int lastCommittedPosition = this.committedPosition.get();

        if (writePos - this.committedPosition.get() > 0) {
            try {
                ByteBuffer byteBuffer = writeBuffer.slice();
                byteBuffer.position(lastCommittedPosition);
                byteBuffer.limit(writePos);
                this.fileChannel.position(lastCommittedPosition);
                this.fileChannel.write(byteBuffer);
                this.committedPosition.set(writePos);
            } catch (Throwable e) {
                log.error("Error occurred when commit data to FileChannel.", e);
            }
        }
}
/**
 * Can we commit? True if any of the following holds:
 * 1. The mapped file is full
 * 2. commitLeastPages > 0 && the uncommitted portion spans at least commitLeastPages pages
 * 3. commitLeastPages = 0 && there is newly written data
 * @param commitLeastPages minimum number of pages to commit
 * @return whether a commit can be performed
 */
 protected boolean isAbleToCommit(final int commitLeastPages) {
        int flush = this.committedPosition.get();
        int write = this.wrotePosition.get();
        if (this.isFull()) {  //this.fileSize == this.wrotePosition.get()
            return true;
        }
        if (commitLeastPages > 0) {
            return ((write / OS_PAGE_SIZE) - (flush / OS_PAGE_SIZE)) >= commitLeastPages;
        }
        return write > flush;
}

The commit operation consists of the three methods above. isAbleToCommit decides whether a commit should happen: when commitLeastPages > 0, the uncommitted data must span at least commitLeastPages OS pages (4KB each). commit0 writes the buffer content between the last committed position and the latest written position into the FileChannel and updates committedPosition. The commit operation works at the FileChannel level.
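
The page arithmetic in isAbleToCommit is easy to misread, so here is a standalone restatement with the positions passed in as parameters. CommitThresholdSketch is a hypothetical class, and OS_PAGE_SIZE is assumed to be 4KB as stated in the text.

```java
// Hypothetical restatement of the isAbleToCommit check with explicit inputs.
final class CommitThresholdSketch {
    static final int OS_PAGE_SIZE = 4 * 1024;

    static boolean isAbleToCommit(int committed, int wrote, int fileSize, int commitLeastPages) {
        if (wrote == fileSize) {
            return true;                 // file is full: always commit
        }
        if (commitLeastPages > 0) {
            // Compare page indexes, not byte counts.
            return (wrote / OS_PAGE_SIZE) - (committed / OS_PAGE_SIZE) >= commitLeastPages;
        }
        return wrote > committed;        // any new data is enough
    }

    public static void main(String[] args) {
        int fileSize = 1024 * 1024 * 1024;
        System.out.println(isAbleToCommit(0, 16383, fileSize, 4));   // just under 4 pages
        System.out.println(isAbleToCommit(0, 16384, fileSize, 4));   // exactly 4 pages
        System.out.println(isAbleToCommit(4095, 4096, fileSize, 1)); // 1 byte, but crosses a page
    }
}
```

Because the check compares page indexes, a single uncommitted byte that crosses a page boundary already counts as one page, as the third call shows.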

flush operation

public int flush(final int flushLeastPages) {
        if (this.isAbleToFlush(flushLeastPages)) {
            if (this.hold()) {
                int value = getReadPosition();
                if (writeBuffer != null || this.fileChannel.position() != 0) {
                    this.fileChannel.force(false);
                } else {
                    this.mappedByteBuffer.force();
                }
              
                this.flushedPosition.set(value);
                this.release();
            } else {
                log.warn("in flush, hold failed, flush offset = " + this.flushedPosition.get());
                this.flushedPosition.set(getReadPosition());
            }
        }
        return this.getFlushedPosition();
 }

isAbleToFlush works like isAbleToCommit: when flushLeastPages > 0, it requires at least flushLeastPages pages (4KB each) of unflushed data. After flushing to disk, flushedPosition is updated to record how far the physical file has been written. The flush operation works at the physical file level.
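
To see how the three positions trail one another, the following sketch models append, commit and flush on a temporary file. It is a simplified model, not RocketMQ code: ThreePositionsSketch and its methods are hypothetical, it always uses a heap write buffer (the writeBuffer != null path), and it omits the page thresholds, reference counting and locking of the real class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical, simplified model of wrotePosition -> committedPosition -> flushedPosition.
final class ThreePositionsSketch implements AutoCloseable {
    private final ByteBuffer writeBuffer;
    private Path path;
    private FileChannel channel;
    int wrotePosition;
    int committedPosition;
    int flushedPosition;

    ThreePositionsSketch(int fileSize) {
        this.writeBuffer = ByteBuffer.allocate(fileSize);
        try {
            this.path = Files.createTempFile("three-positions", ".bin");
            this.channel = FileChannel.open(path,
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 1: write into the in-memory buffer only. */
    void append(byte[] data) {
        ByteBuffer slice = writeBuffer.slice();
        slice.position(wrotePosition);
        slice.put(data);
        wrotePosition += data.length;
    }

    /** Step 2: copy [committedPosition, wrotePosition) into the FileChannel. */
    void commit() {
        if (wrotePosition > committedPosition) {
            ByteBuffer slice = writeBuffer.slice();
            slice.position(committedPosition);
            slice.limit(wrotePosition);
            try {
                channel.position(committedPosition);
                while (slice.hasRemaining()) {
                    channel.write(slice);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            committedPosition = wrotePosition;
        }
    }

    /** Step 3: force committed data to the storage device. */
    void flush() {
        try {
            channel.force(false);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flushedPosition = committedPosition;
    }

    long bytesInFile() {
        try {
            return channel.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            channel.close();
            Files.deleteIfExists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After append, only wrotePosition moves; commit brings committedPosition up to it and makes the bytes visible in the file; flush then brings flushedPosition up to committedPosition.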

Next, look at how CommitLog drives commit and flush.

FlushCommitLogService extends ServiceThread, which runs on its own Thread, so it executes asynchronously.

Service                 Scenario                                                     Message insert performance
CommitRealTimeService   Asynchronous flush with the in-memory byte buffer enabled    First
FlushRealTimeService    Asynchronous flush with the in-memory byte buffer disabled   Second
GroupCommitService      Synchronous flush                                            Third

CommitRealTimeService periodically calls mappedFileQueue.commit(commitDataLeastPages) to perform the commit. After committing, it wakes up flushCommitLogService to flush the data to disk.

[MappedFileQueue]
 public boolean commit(final int commitLeastPages) {
        boolean result = true;
        MappedFile mappedFile = findMappedFileByOffset(committedWhere,committedWhere == 0);
        if (mappedFile != null) {
            int offset = mappedFile.commit(commitLeastPages);
            // Position after this commit, i.e. where the next commit starts
            long where = mappedFile.getFileFromOffset() + offset;
            // result is true when the position did not move (nothing was committed); false means data was committed
            result = where == this.committedWhere;
            this.committedWhere = where;
        }

        return result;
 }

findMappedFileByOffset first locates the file to commit. The index of the file in the collection is index = (committedWhere - startOffset) / fileSize, where committedWhere is the position to commit from. For example, with committedWhere = 4000, startOffset = 0 and fileSize = 1024, index = 3 (integer division), so the fourth MappedFile is taken from the queue, and it is responsible for committing its buffer to the FileChannel.
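
The lookup formula can be checked with a small sketch. FileIndexSketch and indexFor are hypothetical names; the method only restates the formula from the paragraph above and leaves out the bounds checks a real lookup would need.

```java
// Hypothetical restatement of the index formula used to locate a MappedFile.
final class FileIndexSketch {

    /** Index of the file that contains the given offset. */
    static int indexFor(long offset, long startOffset, int fileSize) {
        return (int) ((offset - startOffset) / fileSize);
    }

    public static void main(String[] args) {
        // The example from the text: offset 4000 with 1024-byte files falls in the fourth file.
        System.out.println(indexFor(4000, 0, 1024));
        // With 1G files, any offset just past the first gigabyte falls in the second file.
        System.out.println(indexFor(1073741824L + 5, 0, 1024 * 1024 * 1024));
    }
}
```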

FlushRealTimeService likewise flushes content to the physical file periodically and updates flushedWhere afterwards; the main steps are similar to commit.

 public boolean flush(final int flushLeastPages) {
        boolean result = true;
        MappedFile mappedFile = this.findMappedFileByOffset(this.flushedWhere, this.flushedWhere == 0);
        if (mappedFile != null) {
            long tmpTimeStamp = mappedFile.getStoreTimestamp();
            int offset = mappedFile.flush(flushLeastPages);
            long where = mappedFile.getFileFromOffset() + offset;
            result = where == this.flushedWhere;
            this.flushedWhere = where;
            if (0 == flushLeastPages) {
                this.storeTimestamp = tmpTimeStamp;
            }
        }

        return result;
    }

Posted by depraved on Thu, 09 May 2019 23:24:40 -0700