Summary
Message middleware storage falls into three types. The first keeps messages in memory only, which is fast but loses messages when the system goes down. The second keeps messages in memory and writes them to a database periodically; the messages are persistent, but database reads and writes become the bottleneck of the MQ. The third is memory + disk, which saves messages to disk periodically; how well the storage mechanism is designed determines whether the MQ achieves high concurrency and high availability.
Reading the RocketMQ source code answers the following questions:
- How RocketMQ designs its storage mechanism
- Which techniques it uses to keep storage efficient
Storage mechanism
Messages are stored in files, and something has to manage those files; this is the job of MappedFile. The MappedFiles themselves are managed by MappedFileQueue, which acts like a folder and maintains a CopyOnWriteArrayList<MappedFile> mappedFiles.

    public class MappedFile {
        // Position after each message is written to memory
        protected final AtomicInteger wrotePosition = new AtomicInteger(0);
        // Position after each commit to FileChannel
        protected final AtomicInteger committedPosition = new AtomicInteger(0);
        // Position after each flush to the physical file
        private final AtomicInteger flushedPosition = new AtomicInteger(0);
        // File size, 1 GB by default
        protected int fileSize;
        // NIO channel of the corresponding file
        protected FileChannel fileChannel;
        // Start offset of this file, parsed from the file name
        private long fileFromOffset;
        // The corresponding file
        private File file;
        // In-memory buffer that temporarily holds written messages
        protected ByteBuffer writeBuffer = null;
        // Memory-mapped view of the file
        protected MappedByteBuffer mappedByteBuffer = null;

        private void init(final String fileName, final int fileSize) throws IOException {
            this.fileSize = fileSize;
            this.file = new File(fileName);
            this.fileFromOffset = Long.parseLong(this.file.getName());
            ensureDirOK(this.file.getParent());
            this.fileChannel = new RandomAccessFile(this.file, "rw").getChannel();
            this.mappedByteBuffer = this.fileChannel.map(MapMode.READ_WRITE, 0, fileSize);
        }
    }

The names of the MappedFiles (file.getName) are 00000000000000000000, 00000000001073741824, 00000000002147483648, and so on, following fileName[n] = fileName[n - 1] + mappedFileSize. Each file is named directly after its starting offset: 00000000001073741824 is 1073741824 bytes, i.e. 1 GB, which is where the second file begins. In other words, the file name is the start offset of the file.
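The naming rule can be illustrated with a small sketch. The class and method names below are hypothetical and are not RocketMQ APIs; the sketch only assumes the 1 GB default file size and the 20-digit, zero-padded names shown above.

```java
import java.util.Locale;

// Illustrative only: class and method names are hypothetical, not RocketMQ APIs.
final class OffsetFileNames {
    // 1 GB, the default file size mentioned above
    static final long MAPPED_FILE_SIZE = 1024L * 1024 * 1024;

    /** Zero-pads a start offset to the 20-digit file-name form. */
    static String fileNameFor(long startOffset) {
        return String.format(Locale.ROOT, "%020d", startOffset);
    }

    /** Recovers the start offset from a file name. */
    static long startOffsetOf(String fileName) {
        return Long.parseLong(fileName);
    }

    public static void main(String[] args) {
        // Prints the names of the first three files in the queue.
        for (int n = 0; n < 3; n++) {
            System.out.println(fileNameFor(n * MAPPED_FILE_SIZE));
        }
    }
}
```

Because the name encodes the offset, the start offset of any file can be recovered by parsing its name, with no separate index file.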
MappedFile provides three operations: appending messages (two methods), committing messages to FileChannel, and flushing to disk. They are exposed through four methods:
1. AppendMessageResult appendMessagesInner(MessageExt messageExt, final AppendMessageCallback cb)
2. boolean appendMessage(final byte[] data, final int offset, final int length)
3. int commit(final int commitLeastPages)
4. int flush(final int flushLeastPages)
First, look at the append operation
MappedFile#appendMessagesInner

    public AppendMessageResult appendMessagesInner(final MessageExt messageExt, final AppendMessageCallback cb) {
        int currentPos = this.wrotePosition.get();
        if (currentPos < this.fileSize) {
            ByteBuffer byteBuffer = writeBuffer != null ? writeBuffer.slice() : this.mappedByteBuffer.slice();
            byteBuffer.position(currentPos);
            AppendMessageResult result =
                cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, messageExt);
            this.wrotePosition.addAndGet(result.getWroteBytes());
            this.storeTimestamp = result.getStoreTimestamp();
            return result;
        }
        // .......
    }

1. Get the last write position and take a slice of the buffer.
2. Set the position of the slice to the last write position, so that writing starts right after the previous message.
3. The callback AppendMessageCallback does the actual writing. It is provided by CommitLog, and its logic adds extra fields to the message, such as the message length and timestamps. The record layout is as follows:
No. | Field | Description | Data type | Bytes |
---|---|---|---|---|
1 | MsgLen | Total message length | Int | 4 |
2 | MagicCode | MESSAGE_MAGIC_CODE | Int | 4 |
3 | BodyCRC | CRC of the message body | Int | 4 |
4 | QueueId | Message queue number | Int | 4 |
5 | Flag | Flag | Int | 4 |
6 | QueueOffset | Offset within the message queue | Long | 8 |
7 | PhysicalOffset | Physical offset, i.e. the sequential storage position in CommitLog | Long | 8 |
8 | SysFlag | MessageSysFlag | Int | 4 |
9 | BornTimestamp | Timestamp when the message was produced | Long | 8 |
10 | BornHost | Address + port of the host that produced the message | Long | 8 |
11 | StoreTimestamp | Timestamp when the message was stored | Long | 8 |
12 | StoreHost | Address + port of the host that stored the message | Long | 8 |
13 | ReconsumeTimes | Number of times the message has been re-consumed | Int | 4 |
14 | PreparedTransactionOffset | | Long | 8 |
15 | BodyLength + Body | Body length + body | Int + Bytes | 4 + bodyLength |
16 | TopicLength + Topic | Topic length + topic | Byte + Bytes | 1 + topicLength |
17 | PropertiesLength + Properties | Extended field length + extended fields | Short + Bytes | 2 + propertiesLength |
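Summing the byte widths in the table gives the total record size. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes exactly the field widths listed in the table.

```java
// Illustrative only: names are hypothetical; widths are taken from the table above.
final class MessageLengthSketch {
    /** Total record length in bytes for the given variable-length parts. */
    static int calcMsgLength(int bodyLength, int topicLength, int propertiesLength) {
        return 4   // MsgLen
             + 4   // MagicCode
             + 4   // BodyCRC
             + 4   // QueueId
             + 4   // Flag
             + 8   // QueueOffset
             + 8   // PhysicalOffset
             + 4   // SysFlag
             + 8   // BornTimestamp
             + 8   // BornHost
             + 8   // StoreTimestamp
             + 8   // StoreHost
             + 4   // ReconsumeTimes
             + 8   // PreparedTransactionOffset
             + 4 + bodyLength        // BodyLength + Body
             + 1 + topicLength       // TopicLength + Topic
             + 2 + propertiesLength; // PropertiesLength + Properties
    }

    public static void main(String[] args) {
        // Fixed overhead is 91 bytes; the rest depends on body, topic and properties.
        System.out.println(calcMsgLength(10, 5, 20));
    }
}
```

With these widths the fixed overhead is 91 bytes per message, so small messages are dominated by header fields rather than by the body.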
Once the message has been encoded this way, it is written to the buffer as a byte array. The callback returns the number of bytes written, and wrotePosition is advanced by that amount (wroteBytes). The append operation therefore works at the message level, on the ByteBuffer.
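The slice, position, write and advance steps can be sketched in isolation. This is a minimal illustration, not RocketMQ code: the class is hypothetical, a heap ByteBuffer stands in for writeBuffer or mappedByteBuffer, and the record is reduced to a 4-byte length prefix plus the body.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a simplified stand-in for the append path, not RocketMQ code.
final class AppendSketch {
    private final ByteBuffer buffer; // stands in for writeBuffer / mappedByteBuffer
    private final AtomicInteger wrotePosition = new AtomicInteger(0);

    AppendSketch(int fileSize) {
        this.buffer = ByteBuffer.allocate(fileSize);
    }

    /** Appends one length-prefixed record; returns bytes written, or -1 if it does not fit. */
    int append(byte[] body) {
        int currentPos = wrotePosition.get();
        int needed = 4 + body.length;
        if (needed > buffer.capacity() - currentPos) {
            return -1; // not enough room left in this "file"
        }
        ByteBuffer slice = buffer.slice(); // shares content, has its own position
        slice.position(currentPos);        // start right after the previous record
        slice.putInt(needed);              // total record length
        slice.put(body);
        wrotePosition.addAndGet(needed);   // advance by the bytes written
        return needed;
    }

    int wrotePosition() {
        return wrotePosition.get();
    }

    /** Reads the length prefix of the record that starts at pos. */
    int readLengthAt(int pos) {
        return buffer.getInt(pos);
    }

    public static void main(String[] args) {
        AppendSketch file = new AppendSketch(64);
        file.append("hello".getBytes(StandardCharsets.UTF_8));
        file.append("hi".getBytes(StandardCharsets.UTF_8));
        System.out.println(file.wrotePosition());
    }
}
```

Working on a slice keeps the shared buffer's own position untouched, which is why the write position has to be tracked separately in wrotePosition.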
commit operation

    public int commit(final int commitLeastPages) {
        if (this.isAbleToCommit(commitLeastPages)) {
            if (this.hold()) {
                commit0(commitLeastPages);
                this.release();
            }
        }
        // All dirty data has been committed to FileChannel.
        if (writeBuffer != null && this.transientStorePool != null
            && this.fileSize == this.committedPosition.get()) {
            this.transientStorePool.returnBuffer(writeBuffer);
            this.writeBuffer = null;
        }
        return this.committedPosition.get();
    }

    protected void commit0(final int commitLeastPages) {
        int writePos = this.wrotePosition.get();
        int lastCommittedPosition = this.committedPosition.get();
        if (writePos - this.committedPosition.get() > 0) {
            try {
                ByteBuffer byteBuffer = writeBuffer.slice();
                byteBuffer.position(lastCommittedPosition);
                byteBuffer.limit(writePos);
                this.fileChannel.position(lastCommittedPosition);
                this.fileChannel.write(byteBuffer);
                this.committedPosition.set(writePos);
            } catch (Throwable e) {
                log.error("Error occurred when commit data to FileChannel.", e);
            }
        }
    }

    /**
     * Can we commit? Yes, if one of the following conditions holds:
     * 1. The mapped file is full
     * 2. commitLeastPages > 0 && the uncommitted portion spans at least commitLeastPages pages
     * 3. commitLeastPages = 0 && there is newly written data
     * @param commitLeastPages minimum number of pages to commit
     * @return whether a commit is possible
     */
    protected boolean isAbleToCommit(final int commitLeastPages) {
        int flush = this.committedPosition.get();
        int write = this.wrotePosition.get();
        if (this.isFull()) { // this.fileSize == this.wrotePosition.get()
            return true;
        }
        if (commitLeastPages > 0) {
            return ((write / OS_PAGE_SIZE) - (flush / OS_PAGE_SIZE)) >= commitLeastPages;
        }
        return write > flush;
    }

The commit operation consists of the three methods above. isAbleToCommit decides whether a commit should happen: the file is full, or the uncommitted data spans at least commitLeastPages OS pages (4 KB each), or, when commitLeastPages is 0, any new data has been written. commit0 writes the buffer contents between the last committed position and the last written position into FileChannel, then updates committedPosition. The commit operation works at the FileChannel level.
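The page-threshold check and the buffer-to-channel copy can be tried out in a standalone sketch. It is a simplification under stated assumptions: the class and its demo method are hypothetical, the "file is full" case is left out, and a temporary file stands in for a commit log file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: a simplified stand-in for the commit path, not RocketMQ code.
final class CommitSketch {
    static final int OS_PAGE_SIZE = 4 * 1024;

    /** Page-threshold check; the "file is full" case is omitted here. */
    static boolean isAbleToCommit(int wrote, int committed, int commitLeastPages) {
        if (commitLeastPages > 0) {
            return (wrote / OS_PAGE_SIZE) - (committed / OS_PAGE_SIZE) >= commitLeastPages;
        }
        return wrote > committed;
    }

    /** Writes buffer bytes [committed, wrote) to the channel; returns the new committed position. */
    static int commit(ByteBuffer writeBuffer, FileChannel channel, int committed, int wrote)
            throws IOException {
        if (wrote <= committed) {
            return committed; // nothing new to commit
        }
        ByteBuffer view = writeBuffer.duplicate();
        view.clear();             // position = 0, limit = capacity
        view.position(committed);
        view.limit(wrote);
        long filePos = committed;
        while (view.hasRemaining()) {
            filePos += channel.write(view, filePos); // positional write
        }
        return wrote;
    }

    /** Commits 5000 pretend bytes to a temp file and returns the resulting file size. */
    static long demo() {
        try {
            Path file = Files.createTempFile("commit-sketch", ".bin");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(2 * OS_PAGE_SIZE);
                int wrote = 5000; // pretend 5000 bytes were appended
                int committed = 0;
                if (isAbleToCommit(wrote, committed, 1)) {
                    committed = commit(buf, ch, committed, wrote);
                }
                return ch.size();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the threshold compares page indexes rather than byte counts, so 4095 uncommitted bytes starting at offset 0 do not qualify with commitLeastPages = 1, while 4096 do.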
flush operation

    public int flush(final int flushLeastPages) {
        if (this.isAbleToFlush(flushLeastPages)) {
            if (this.hold()) {
                int value = getReadPosition();
                if (writeBuffer != null || this.fileChannel.position() != 0) {
                    this.fileChannel.force(false);
                } else {
                    this.mappedByteBuffer.force();
                }
                this.flushedPosition.set(value);
                this.release();
            } else {
                log.warn("in flush, hold failed, flush offset = " + this.flushedPosition.get());
                this.flushedPosition.set(getReadPosition());
            }
        }
        return this.getFlushedPosition();
    }

When flushing, isAbleToFlush plays the same role as isAbleToCommit: it requires the unflushed data to span at least flushLeastPages pages of 4 KB. After the data has been forced to disk, flushedPosition is updated to record the last position written to the physical file. The flush operation works at the physical file level.
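The two flush paths in the code above map onto two standard NIO calls: FileChannel.force(false) when data went through the channel, and MappedByteBuffer.force() when data was written through the memory mapping. The following sketch exercises both against temporary files; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: shows the two NIO flush calls, not RocketMQ code.
final class FlushSketch {
    /** Writes through a memory-mapped buffer, then forces the mapped pages to disk. */
    static byte[] flushViaMappedBuffer(byte[] data) {
        try {
            Path file = Files.createTempFile("flush-mapped", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
                mapped.put(data);
                mapped.force(); // flush the mapped region to the storage device
            }
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes through the channel, then forces file content (not metadata) to disk. */
    static byte[] flushViaChannel(byte[] data) {
        try {
            Path file = Files.createTempFile("flush-channel", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer src = ByteBuffer.wrap(data);
                while (src.hasRemaining()) {
                    ch.write(src);
                }
                ch.force(false); // false: file metadata need not be forced
            }
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        System.out.println(flushViaMappedBuffer(data).length);
        System.out.println(flushViaChannel(data).length);
    }
}
```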
Next, look at how CommitLog drives commit and flush.
FlushCommitLogService extends ServiceThread, which runs on its own thread, so these services execute asynchronously.
Thread service | Scenario | Message insert performance |
---|---|---|
CommitRealTimeService | Asynchronous flush, in-memory byte buffer enabled | First |
FlushRealTimeService | Asynchronous flush, in-memory byte buffer disabled | Second |
GroupCommitService | Synchronous flush | Third |
CommitRealTimeService periodically calls mappedFileQueue.commit(commitDataLeastPages) to perform the commit. After committing, it wakes up flushCommitLogService to flush the data to disk.
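The commit-then-wake-up hand-off between the two background services can be sketched with plain threads. Everything here is an assumption made for illustration: the class, the counters that stand in for real commit and flush work, the Semaphore used as the wake-up signal, and the 5 ms and 50 ms intervals.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: counters stand in for real commit/flush work; not RocketMQ code.
final class CommitFlushServicesSketch {
    final AtomicLong wrote = new AtomicLong();
    final AtomicLong committed = new AtomicLong();
    final AtomicLong flushed = new AtomicLong();

    private final Semaphore flushWakeup = new Semaphore(0);
    private volatile boolean commitStopped;
    private volatile boolean flushStopped;
    private final Thread commitService = new Thread(this::commitLoop, "commit-service");
    private final Thread flushService = new Thread(this::flushLoop, "flush-service");

    private void commitOnce() {
        long w = wrote.get();
        if (w > committed.get()) {
            committed.set(w);      // stands in for mappedFileQueue.commit(...)
            flushWakeup.release(); // wake up the flush service
        }
    }

    private void commitLoop() {
        while (!commitStopped) {
            commitOnce();
            try {
                Thread.sleep(5); // periodic commit interval (arbitrary)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        commitOnce(); // final pass on shutdown
    }

    private void flushLoop() {
        while (!flushStopped) {
            try {
                // Wait for a wake-up, but also flush periodically on timeout.
                flushWakeup.tryAcquire(50, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            flushed.set(committed.get()); // stands in for mappedFileQueue.flush(...)
        }
        flushed.set(committed.get()); // final pass on shutdown
    }

    void start() {
        commitService.start();
        flushService.start();
    }

    /** Stops the commit service first, then the flush service, so nothing is left behind. */
    void shutdown() {
        try {
            commitStopped = true;
            commitService.join();
            flushStopped = true;
            flushWakeup.release();
            flushService.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        CommitFlushServicesSketch services = new CommitFlushServicesSketch();
        services.start();
        services.wrote.addAndGet(128);
        services.shutdown();
        System.out.println(services.flushed.get());
    }
}
```

The writer only advances the write counter and never waits for disk I/O, which is the reason asynchronous flushing ranks ahead of synchronous flushing for insert performance.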
[MappedFileQueue]

    public boolean commit(final int commitLeastPages) {
        boolean result = true;
        MappedFile mappedFile = findMappedFileByOffset(committedWhere, committedWhere == 0);
        if (mappedFile != null) {
            int offset = mappedFile.commit(commitLeastPages);
            // Position after this commit, i.e. where the next commit starts
            long where = mappedFile.getFileFromOffset() + offset;
            // If where differs from committedWhere, new data was committed (result = false).
            // If nothing new was committed, where still equals committedWhere (result = true).
            result = where == this.committedWhere;
            this.committedWhere = where;
        }
        return result;
    }

findMappedFileByOffset first locates the file to commit. The index of the file in the collection is index = (committedWhere - startOffset) / fileSize, using integer division, where committedWhere is the position to commit from. For example, with committedWhere = 4000, startOffset = 0 and fileSize = 1024, index = 3, so the fourth MappedFile is taken from the queue, and it is responsible for committing its buffer to FileChannel.
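The index formula above can be checked with a few lines of code. The class and method names are hypothetical; only the arithmetic is taken from the text.

```java
// Illustrative only: names are hypothetical; the arithmetic follows the formula above.
final class FileIndexSketch {
    /** Index of the file that contains the given offset, using integer division. */
    static int indexOf(long offset, long firstFileStartOffset, int fileSize) {
        return (int) ((offset - firstFileStartOffset) / fileSize);
    }

    public static void main(String[] args) {
        // The worked example from the text: the offset falls into the fourth file.
        System.out.println(indexOf(4000, 0, 1024));
    }
}
```

Subtracting the start offset of the first file matters once old files have been deleted, because the first file in the queue then no longer starts at offset 0.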
FlushRealTimeService likewise flushes content to the physical file periodically and updates flushedWhere after a successful flush. The main steps are similar to commit.

    public boolean flush(final int flushLeastPages) {
        boolean result = true;
        MappedFile mappedFile = this.findMappedFileByOffset(this.flushedWhere, this.flushedWhere == 0);
        if (mappedFile != null) {
            long tmpTimeStamp = mappedFile.getStoreTimestamp();
            int offset = mappedFile.flush(flushLeastPages);
            long where = mappedFile.getFileFromOffset() + offset;
            result = where == this.flushedWhere;
            this.flushedWhere = where;
            if (0 == flushLeastPages) {
                this.storeTimestamp = tmpTimeStamp;
            }
        }
        return result;
    }

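The boolean returned by these queue-level methods is easy to misread: it is true when the position did not move, that is, when nothing new was flushed. A minimal sketch of just that bookkeeping, with a hypothetical class and method name, looks like this:

```java
// Illustrative only: isolates the "where == flushedWhere" bookkeeping; not RocketMQ code.
final class FlushProgressSketch {
    private long flushedWhere;

    /**
     * Records the new global flush position and returns true when it did not move,
     * i.e. when nothing new was flushed.
     */
    boolean advance(long fileFromOffset, int flushedOffsetInFile) {
        long where = fileFromOffset + flushedOffsetInFile; // global position
        boolean noProgress = where == flushedWhere;
        flushedWhere = where;
        return noProgress;
    }

    long flushedWhere() {
        return flushedWhere;
    }

    public static void main(String[] args) {
        FlushProgressSketch queue = new FlushProgressSketch();
        System.out.println(queue.advance(0, 4096)); // position moved
        System.out.println(queue.advance(0, 4096)); // position unchanged
    }
}
```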