1, RequestProcessor
Request processors are chained together to process transactions. Requests are always processed in order. Standalone servers and leader/follower (quorum) servers chain together slightly different request processors. Requests always move forward through the request processor chain. A request is passed to a RequestProcessor through processRequest. In general, a processor's methods are invoked by a single thread. When shutdown is called, the request processor should also shut down any request processors it is connected to.
public interface RequestProcessor {
    void processRequest(Request request) throws RequestProcessorException;

    void shutdown();
}
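To make the chaining and shutdown contract concrete, here is a minimal sketch. It is not ZooKeeper code: SimpleProcessor, LoggingStage, and ChainDemo are hypothetical names, the request is a plain String, and the checked RequestProcessorException is left out. It only illustrates that each stage forwards the request to the next stage and that shutdown propagates down the chain.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for RequestProcessor (assumption: String requests, no checked exception).
interface SimpleProcessor {
    void processRequest(String request);

    void shutdown();
}

// Hypothetical stage: records what it saw, then hands the request to the next stage.
class LoggingStage implements SimpleProcessor {
    private final String name;
    private final SimpleProcessor next; // null marks the end of the chain
    private final List<String> events;

    LoggingStage(String name, SimpleProcessor next, List<String> events) {
        this.name = name;
        this.next = next;
        this.events = events;
    }

    @Override
    public void processRequest(String request) {
        events.add(name + ":" + request);
        if (next != null) {
            next.processRequest(request); // requests only ever move forward
        }
    }

    @Override
    public void shutdown() {
        events.add(name + ":shutdown");
        if (next != null) {
            next.shutdown(); // also shut down the processor we are connected to
        }
    }
}

class ChainDemo {
    static List<String> run() {
        List<String> events = new ArrayList<>();
        // The chain is built back to front because each stage needs its successor.
        SimpleProcessor last = new LoggingStage("final", null, events);
        SimpleProcessor sync = new LoggingStage("sync", last, events);
        SimpleProcessor first = new LoggingStage("prep", sync, events);
        first.processRequest("create /a");
        first.shutdown();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Building the chain from the last stage backwards mirrors the constructors in the following sections, where every processor except the final one receives its nextProcessor as an argument.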
2, PrepRequestProcessor
This request processor is generally at the start of a request processor chain. It sets up any transactions associated with requests that change the state of the system. It relies on ZooKeeperServer to update outstandingRequests, so that it can take into account transactions that are still in the queue waiting to be applied when it generates a new transaction.
It is usually the first processor in the request processing chain. It extends ZooKeeperCriticalThread, implements the RequestProcessor interface, maintains a LinkedBlockingQueue internally, and holds the Request objects submitted by clients. The thread's run method takes requests from the queue and passes each one to pRequest.
public class PrepRequestProcessor extends ZooKeeperCriticalThread implements RequestProcessor {
    // Queue of submitted requests
    LinkedBlockingQueue<Request> submittedRequests = new LinkedBlockingQueue<Request>();
    // Next processor in the chain
    private final RequestProcessor nextProcessor;
    // The ZooKeeper server this processor belongs to
    ZooKeeperServer zks;

    // The constructor requires the next RequestProcessor
    public PrepRequestProcessor(ZooKeeperServer zks, RequestProcessor nextProcessor) {
        super("ProcessThread(sid:" + zks.getServerId() + " cport:" + zks.getClientPort() + "):",
              zks.getZooKeeperServerListener());
        this.nextProcessor = nextProcessor;
        this.zks = zks;
    }

    // processRequest: enqueue the request for the processor thread
    public void processRequest(Request request) {
        // Record the time the request entered the prep queue
        request.prepQueueStartTime = Time.currentElapsedTime();
        // Add to queue
        submittedRequests.add(request);
        ServerMetrics.getMetrics().PREP_PROCESSOR_QUEUED.add(1);
    }

    // shutdown: clear the queue, stop the thread, and call shutdown on nextProcessor
    public void shutdown() {
        LOG.info("Shutting down");
        submittedRequests.clear();
        // Add Request.requestOfDeath so the thread leaves its loop
        submittedRequests.add(Request.requestOfDeath);
        nextProcessor.shutdown();
    }

    // Thread body: runs until the death request is seen
    public void run() {
        try {
            while (true) {
                ServerMetrics.getMetrics().PREP_PROCESSOR_QUEUE_SIZE.add(submittedRequests.size());
                Request request = submittedRequests.take();
                ServerMetrics.getMetrics().PREP_PROCESSOR_QUEUE_TIME.add(
                    Time.currentElapsedTime() - request.prepQueueStartTime);
                long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
                if (request.type == OpCode.ping) {
                    traceMask = ZooTrace.CLIENT_PING_TRACE_MASK;
                }
                if (LOG.isTraceEnabled()) {
                    ZooTrace.logRequest(LOG, traceMask, 'P', request, "");
                }
                // Check whether this is the shutdown marker
                if (Request.requestOfDeath == request) {
                    break;
                }
                // Record the time preprocessing started
                request.prepStartTime = Time.currentElapsedTime();
                // Preprocess the request
                pRequest(request);
            }
        } catch (RequestProcessorException e) {
            if (e.getCause() instanceof XidRolloverException) {
                LOG.info(e.getCause().getMessage());
            }
            handleException(this.getName(), e);
        } catch (Exception e) {
            handleException(this.getName(), e);
        }
        LOG.info("PrepRequestProcessor exited loop!");
    }
}
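The queue-plus-sentinel pattern used by run and shutdown above can be sketched in isolation. The names Req, PoisonPillWorker, and PoisonPillDemo are made up for this illustration, and all ZooKeeper-specific work (metrics, tracing, pRequest) is replaced by recording the request name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for ZooKeeper's Request.
final class Req {
    final String name;

    Req(String name) {
        this.name = name;
    }
}

// Simplified worker: a queue, one consumer thread, and a sentinel that ends the loop.
class PoisonPillWorker extends Thread {
    // Compared by identity, in the spirit of Request.requestOfDeath.
    static final Req REQUEST_OF_DEATH = new Req("death");

    private final LinkedBlockingQueue<Req> submitted = new LinkedBlockingQueue<>();
    private final List<String> processed = Collections.synchronizedList(new ArrayList<>());

    void submit(Req request) {
        submitted.add(request);
    }

    // Drops anything still queued, then wakes the thread with the sentinel.
    void shutdown() {
        submitted.clear();
        submitted.add(REQUEST_OF_DEATH);
    }

    int processedCount() {
        return processed.size();
    }

    List<String> processed() {
        return new ArrayList<>(processed);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Req request = submitted.take(); // blocks until a request arrives
                if (request == REQUEST_OF_DEATH) {
                    break;
                }
                processed.add(request.name); // stands in for pRequest(request)
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class PoisonPillDemo {
    static List<String> demo() {
        PoisonPillWorker worker = new PoisonPillWorker();
        worker.start();
        worker.submit(new Req("a"));
        worker.submit(new Req("b"));
        try {
            // Wait until both requests are handled, because shutdown() clears the queue.
            while (worker.processedCount() < 2) {
                Thread.sleep(1);
            }
            worker.shutdown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.processed();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

One consequence worth noting: because shutdown clears the queue before adding the sentinel, any request still waiting at that moment is dropped rather than processed.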
3, SyncRequestProcessor
This request processor logs requests to disk, batching them so that the I/O is done efficiently. A request is not passed to the next request processor until its log entry has been synced to disk.
SyncRequestProcessor is used in the following three scenarios:
1. Leader - syncs the request to disk and forwards it to AckRequestProcessor, which sends the ack back to the leader itself.
2. Follower - syncs the request to disk and forwards it to SendAckRequestProcessor, which sends the packets to the leader. SendAckRequestProcessor is flushable, which allows packets to be force-pushed to the leader.
3. Observer - syncs committed requests to disk (received as INFORM packets). It never sends an ack back to the leader, so nextProcessor is null. This changes the semantics of the txnlog on the observer, because it contains only committed txns.
SyncRequestProcessor implements the RequestProcessor interface and extends the ZooKeeperCriticalThread class. Its constructor takes the ZooKeeperServer and nextProcessor, and it maintains the queuedRequests queue internally. The thread's run method continuously takes requests from the queue, appends them to the log, and then flushes the log to disk.
public class SyncRequestProcessor extends ZooKeeperCriticalThread implements RequestProcessor {
    // Queue of incoming requests
    private final BlockingQueue<Request> queuedRequests = new LinkedBlockingQueue<Request>();
    // Semaphore that allows only one snapshot thread at a time
    private final Semaphore snapThreadMutex = new Semaphore(1);
    // The ZooKeeper server
    private final ZooKeeperServer zks;
    // Next RequestProcessor
    private final RequestProcessor nextProcessor;
    // Requests appended to the log and waiting to be flushed as a batch
    private final Queue<Request> toFlush;

    // Constructor
    public SyncRequestProcessor(ZooKeeperServer zks, RequestProcessor nextProcessor) {
        super("SyncThread:" + zks.getServerId(), zks.getZooKeeperServerListener());
        this.zks = zks;
        this.nextProcessor = nextProcessor;
        this.toFlush = new ArrayDeque<>(zks.getMaxBatchSize());
    }

    // shutdown method
    public void shutdown() {
        LOG.info("Shutting down");
        // Enqueue the death request so the run loop exits
        queuedRequests.add(REQUEST_OF_DEATH);
        try {
            // Wait for the sync thread to finish
            this.join();
            // Flush whatever is still pending
            this.flush();
        } catch (InterruptedException e) {
            LOG.warn("Interrupted while wating for " + this + " to finish");
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            LOG.warn("Got IO exception during shutdown");
        } catch (RequestProcessorException e) {
            LOG.warn("Got request processor exception during shutdown");
        }
        if (nextProcessor != null) {
            // Propagate shutdown to the next processor
            nextProcessor.shutdown();
        }
    }

    // processRequest: enqueue the request for the sync thread
    public void processRequest(final Request request) {
        Objects.requireNonNull(request, "Request cannot be null");
        request.syncQueueStartTime = Time.currentElapsedTime();
        // Add Request to queue
        queuedRequests.add(request);
        ServerMetrics.getMetrics().SYNC_PROCESSOR_QUEUED.add(1);
    }

    // Thread run method
    public void run() {
        try {
            resetSnapshotStats();
            lastFlushTime = Time.currentElapsedTime();
            while (true) {
                ServerMetrics.getMetrics().SYNC_PROCESSOR_QUEUE_SIZE.add(queuedRequests.size());
                long pollTime = Math.min(zks.getMaxWriteQueuePollTime(), getRemainingDelay());
                // Wait up to pollTime for the next Request
                Request si = queuedRequests.poll(pollTime, TimeUnit.MILLISECONDS);
                if (si == null) {
                    // Nothing arrived within pollTime: flush the pending batch to disk
                    flush();
                    // Then block until the next request arrives
                    si = queuedRequests.take();
                }
                // Shutdown marker: leave the loop
                if (si == REQUEST_OF_DEATH) {
                    break;
                }
                long startProcessTime = Time.currentElapsedTime();
                ServerMetrics.getMetrics().SYNC_PROCESSOR_QUEUE_TIME.add(
                    startProcessTime - si.syncQueueStartTime);
                // track the number of records written to the log
                // Append to the transaction log; true means a record was written
                if (zks.getZKDatabase().append(si)) {
                    // Check whether a snapshot is due
                    if (shouldSnapshot()) {
                        resetSnapshotStats();
                        // Roll over to a new transaction log file
                        zks.getZKDatabase().rollLog();
                        // Try to acquire the snapshot semaphore
                        if (!snapThreadMutex.tryAcquire()) {
                            LOG.warn("Too busy to snap, skipping");
                        } else {
                            // Take the snapshot in a separate thread
                            new ZooKeeperThread("Snapshot Thread") {
                                public void run() {
                                    try {
                                        zks.takeSnapshot();
                                    } catch (Exception e) {
                                        LOG.warn("Unexpected exception", e);
                                    } finally {
                                        // Done: release the semaphore
                                        snapThreadMutex.release();
                                    }
                                }
                            }.start();
                        }
                    }
                } else if (toFlush.isEmpty()) {
                    // Nothing was appended (for example a read request) and no writes
                    // are pending, so pass the request straight to the next processor
                    if (nextProcessor != null) {
                        nextProcessor.processRequest(si);
                        if (nextProcessor instanceof Flushable) {
                            ((Flushable) nextProcessor).flush();
                        }
                    }
                    continue;
                }
                // Add to the toFlush queue
                toFlush.add(si);
                if (shouldFlush()) {
                    // The batch condition is met: flush
                    flush();
                }
                ServerMetrics.getMetrics().SYNC_PROCESS_TIME.add(
                    Time.currentElapsedTime() - startProcessTime);
            }
        } catch (Throwable t) {
            handleException(this.getName(), t);
        }
        LOG.info("SyncRequestProcessor exited!");
    }

    // Batch flush
    private void flush() throws IOException, RequestProcessorException {
        if (this.toFlush.isEmpty()) {
            return;
        }
        ServerMetrics.getMetrics().BATCH_SIZE.add(toFlush.size());
        long flushStartTime = Time.currentElapsedTime();
        // Commit (sync) the transaction log to disk
        zks.getZKDatabase().commit();
        ServerMetrics.getMetrics().SYNC_PROCESSOR_FLUSH_TIME.add(
            Time.currentElapsedTime() - flushStartTime);
        if (this.nextProcessor == null) {
            this.toFlush.clear();
        } else {
            while (!this.toFlush.isEmpty()) {
                // Hand each flushed Request to nextProcessor
                final Request i = this.toFlush.remove();
                long latency = Time.currentElapsedTime() - i.syncQueueStartTime;
                ServerMetrics.getMetrics().SYNC_PROCESSOR_QUEUE_AND_FLUSH_TIME.add(latency);
                this.nextProcessor.processRequest(i);
            }
            if (this.nextProcessor instanceof Flushable) {
                ((Flushable) this.nextProcessor).flush();
            }
            lastFlushTime = Time.currentElapsedTime();
        }
    }
}
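The batching idea in run and flush can be reduced to a small single-threaded sketch. BatchingSketch is a hypothetical class, not part of ZooKeeper. It models only a size-based flush trigger (the body of shouldFlush is not shown in the listing, so this trigger is an assumption), and an explicit flush() call stands in for the poll-timeout path. The commits counter stands in for zks.getZKDatabase().commit().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, single-threaded model of "append, batch, commit once, then forward".
class BatchingSketch {
    private final int maxBatchSize;
    private final Queue<String> toFlush = new ArrayDeque<>();
    private final List<String> forwarded = new ArrayList<>();
    private int commits = 0;

    BatchingSketch(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    // Buffer the request; flush once the batch is full (assumed size-based trigger).
    void append(String request) {
        toFlush.add(request);
        if (toFlush.size() >= maxBatchSize) {
            flush();
        }
    }

    // One commit covers the whole batch; only afterwards are requests forwarded.
    void flush() {
        if (toFlush.isEmpty()) {
            return;
        }
        commits++; // stands in for the single disk sync
        while (!toFlush.isEmpty()) {
            forwarded.add(toFlush.remove()); // stands in for nextProcessor.processRequest
        }
    }

    int commits() {
        return commits;
    }

    List<String> forwarded() {
        return forwarded;
    }

    public static void main(String[] args) {
        BatchingSketch sketch = new BatchingSketch(3);
        for (String r : new String[] {"r1", "r2", "r3", "r4"}) {
            sketch.append(r);
        }
        sketch.flush(); // models the idle-timeout flush for the leftover request
        System.out.println(sketch.commits() + " commits for " + sketch.forwarded());
    }
}
```

The point of the design is that the expensive disk sync is paid once per batch instead of once per request, while the ordering guarantee (nothing is forwarded before it is durable) is preserved.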
4, FinalRequestProcessor
This request processor actually applies any transaction associated with a request and services any queries. It is always at the end of the request processor chain, so it does not have a nextProcessor member. It relies on ZooKeeperServer to populate the outstanding-request members of ZooKeeperServer (outstandingChanges and outstandingChangesForPath in the code that follows).
public class FinalRequestProcessor implements RequestProcessor {
    private static final Logger LOG = LoggerFactory.getLogger(FinalRequestProcessor.class);

    ZooKeeperServer zks;

    public FinalRequestProcessor(ZooKeeperServer zks) {
        this.zks = zks;
    }

    // Process Request
    public void processRequest(Request request) {
        if (LOG.isDebugEnabled()) {
            LOG.debug("Processing request:: " + request);
        }
        long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
        if (request.type == OpCode.ping) {
            traceMask = ZooTrace.SERVER_PING_TRACE_MASK;
        }
        // Result of applying the transaction
        ProcessTxnResult rc = null;
        // Lock the server's outstanding changes queue
        synchronized (zks.outstandingChanges) {
            // Call zks to process the request transaction
            rc = zks.processTxn(request);
            if (request.getHdr() != null) {
                TxnHeader hdr = request.getHdr();
                long zxid = hdr.getZxid();
                while (!zks.outstandingChanges.isEmpty()
                       && zks.outstandingChanges.peek().zxid <= zxid) {
                    ChangeRecord cr = zks.outstandingChanges.remove();
                    ServerMetrics.getMetrics().OUTSTANDING_CHANGES_REMOVED.add(1);
                    if (cr.zxid < zxid) {
                        LOG.warn("Zxid outstanding " + cr.zxid + " is less than current " + zxid);
                    }
                    if (zks.outstandingChangesForPath.get(cr.path) == cr) {
                        zks.outstandingChangesForPath.remove(cr.path);
                    }
                }
            }
            // do not add non quorum packets to the queue.
            if (request.isQuorum()) {
                zks.getZKDatabase().addCommittedProposal(request);
            }
        }
        if (request.type == OpCode.closeSession && connClosedByClient(request)) {
            if (closeSession(zks.serverCnxnFactory, request.sessionId)
                || closeSession(zks.secureServerCnxnFactory, request.sessionId)) {
                return;
            }
        }
        if (request.getHdr() != null) {
            long propagationLatency = Time.currentWallTime() - request.getHdr().getTime();
            if (propagationLatency > 0) {
                ServerMetrics.getMetrics().PROPAGATION_LATENCY.add(propagationLatency);
            }
        }
        if (request.cnxn == null) {
            return;
        }
        ServerCnxn cnxn = request.cnxn;
        long lastZxid = zks.getZKDatabase().getDataTreeLastProcessedZxid();
        String lastOp = "NA";
        zks.decInProcess();
        Code err = Code.OK;
        Record rsp = null;
        String path = null;
        try {
            if (request.getHdr() != null && request.getHdr().getType() == OpCode.error) {
                if (request.getException() != null) {
                    throw request.getException();
                } else {
                    throw KeeperException.create(
                        KeeperException.Code.get(((ErrorTxn) request.getTxn()).getErr()));
                }
            }
            KeeperException ke = request.getException();
            if (ke instanceof SessionMovedException) {
                throw ke;
            }
            if (ke != null && request.type != OpCode.multi) {
                throw ke;
            }
            if (LOG.isDebugEnabled()) {
                LOG.debug("{}", request);
            }
            // Dispatch on the type of request
            switch (request.type) {
            case OpCode.ping: {
                lastOp = "PING";
                updateStats(request, lastOp, lastZxid);
                // For a ping, send the response immediately
                cnxn.sendResponse(new ReplyHeader(-2, lastZxid, 0), null, "response");
                return;
            }
            case OpCode.createSession: {
                lastOp = "SESS";
                updateStats(request, lastOp, lastZxid);
                // For session creation, finish the session initialization
                zks.finishSessionInit(request.cnxn, true);
                return;
            }
            case OpCode.multi: {
                lastOp = "MULT";
                rsp = new MultiResponse();
                for (ProcessTxnResult subTxnResult : rc.multiResult) {
                    OpResult subResult;
                    switch (subTxnResult.type) {
                    case OpCode.check:
                        subResult = new CheckResult();
                        break;
                    case OpCode.create:
                        subResult = new CreateResult(subTxnResult.path);
                        break;
                    case OpCode.create2:
                    case OpCode.createTTL:
                    case OpCode.createContainer:
                        subResult = new CreateResult(subTxnResult.path, subTxnResult.stat);
                        break;
                    case OpCode.delete:
                    case OpCode.deleteContainer:
                        subResult = new DeleteResult();
                        break;
                    case OpCode.setData:
                        subResult = new SetDataResult(subTxnResult.stat);
                        break;
                    case OpCode.error:
                        subResult = new ErrorResult(subTxnResult.err);
                        if (subTxnResult.err == Code.SESSIONMOVED.intValue()) {
                            throw new SessionMovedException();
                        }
                        break;
                    default:
                        throw new IOException("Invalid type of op");
                    }
                    ((MultiResponse) rsp).add(subResult);
                }
                break;
            }
            // For create, a CreateResponse is returned
            case OpCode.create: {
                lastOp = "CREA";
                rsp = new CreateResponse(rc.path);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.create2:
            case OpCode.createTTL:
            case OpCode.createContainer: {
                lastOp = "CREA";
                rsp = new Create2Response(rc.path, rc.stat);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.delete:
            case OpCode.deleteContainer: {
                lastOp = "DELE";
                err = Code.get(rc.err);
                break;
            }
            case OpCode.setData: {
                lastOp = "SETD";
                rsp = new SetDataResponse(rc.stat);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.reconfig: {
                lastOp = "RECO";
                rsp = new GetDataResponse(
                    ((QuorumZooKeeperServer) zks).self.getQuorumVerifier().toString().getBytes(),
                    rc.stat);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.setACL: {
                lastOp = "SETA";
                rsp = new SetACLResponse(rc.stat);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.closeSession: {
                lastOp = "CLOS";
                err = Code.get(rc.err);
                break;
            }
            case OpCode.sync: {
                lastOp = "SYNC";
                SyncRequest syncRequest = new SyncRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, syncRequest);
                rsp = new SyncResponse(syncRequest.getPath());
                break;
            }
            case OpCode.check: {
                lastOp = "CHEC";
                rsp = new SetDataResponse(rc.stat);
                err = Code.get(rc.err);
                break;
            }
            case OpCode.exists: {
                lastOp = "EXIS";
                // TODO we need to figure out the security requirement for this!
                ExistsRequest existsRequest = new ExistsRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, existsRequest);
                path = existsRequest.getPath();
                if (path.indexOf('\0') != -1) {
                    throw new KeeperException.BadArgumentsException();
                }
                Stat stat = zks.getZKDatabase().statNode(path,
                    existsRequest.getWatch() ? cnxn : null);
                rsp = new ExistsResponse(stat);
                break;
            }
            case OpCode.getData: {
                lastOp = "GETD";
                GetDataRequest getDataRequest = new GetDataRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getDataRequest);
                path = getDataRequest.getPath();
                DataNode n = zks.getZKDatabase().getNode(path);
                if (n == null) {
                    throw new KeeperException.NoNodeException();
                }
                PrepRequestProcessor.checkACL(zks, request.cnxn, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ, request.authInfo, path, null);
                Stat stat = new Stat();
                byte b[] = zks.getZKDatabase().getData(path, stat,
                    getDataRequest.getWatch() ? cnxn : null);
                rsp = new GetDataResponse(b, stat);
                break;
            }
            case OpCode.setWatches: {
                lastOp = "SETW";
                SetWatches setWatches = new SetWatches();
                // XXX We really should NOT need this!!!!
                request.request.rewind();
                ByteBufferInputStream.byteBuffer2Record(request.request, setWatches);
                long relativeZxid = setWatches.getRelativeZxid();
                zks.getZKDatabase().setWatches(relativeZxid,
                    setWatches.getDataWatches(),
                    setWatches.getExistWatches(),
                    setWatches.getChildWatches(), cnxn);
                break;
            }
            case OpCode.getACL: {
                lastOp = "GETA";
                GetACLRequest getACLRequest = new GetACLRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getACLRequest);
                path = getACLRequest.getPath();
                DataNode n = zks.getZKDatabase().getNode(path);
                if (n == null) {
                    throw new KeeperException.NoNodeException();
                }
                PrepRequestProcessor.checkACL(zks, request.cnxn, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ | ZooDefs.Perms.ADMIN, request.authInfo, path, null);
                Stat stat = new Stat();
                List<ACL> acl = zks.getZKDatabase().getACL(path, stat);
                try {
                    PrepRequestProcessor.checkACL(zks, request.cnxn,
                        zks.getZKDatabase().aclForNode(n),
                        ZooDefs.Perms.ADMIN, request.authInfo, path, null);
                    rsp = new GetACLResponse(acl, stat);
                } catch (KeeperException.NoAuthException e) {
                    List<ACL> acl1 = new ArrayList<ACL>(acl.size());
                    for (ACL a : acl) {
                        if ("digest".equals(a.getId().getScheme())) {
                            Id id = a.getId();
                            Id id1 = new Id(id.getScheme(), id.getId().replaceAll(":.*", ":x"));
                            acl1.add(new ACL(a.getPerms(), id1));
                        } else {
                            acl1.add(a);
                        }
                    }
                    rsp = new GetACLResponse(acl1, stat);
                }
                break;
            }
            case OpCode.getChildren: {
                lastOp = "GETC";
                GetChildrenRequest getChildrenRequest = new GetChildrenRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getChildrenRequest);
                path = getChildrenRequest.getPath();
                DataNode n = zks.getZKDatabase().getNode(path);
                if (n == null) {
                    throw new KeeperException.NoNodeException();
                }
                PrepRequestProcessor.checkACL(zks, request.cnxn, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ, request.authInfo, path, null);
                List<String> children = zks.getZKDatabase().getChildren(
                    path, null, getChildrenRequest.getWatch() ? cnxn : null);
                rsp = new GetChildrenResponse(children);
                break;
            }
            case OpCode.getAllChildrenNumber: {
                lastOp = "GETACN";
                GetAllChildrenNumberRequest getAllChildrenNumberRequest =
                    new GetAllChildrenNumberRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request,
                    getAllChildrenNumberRequest);
                path = getAllChildrenNumberRequest.getPath();
                DataNode n = zks.getZKDatabase().getNode(path);
                if (n == null) {
                    throw new KeeperException.NoNodeException();
                }
                PrepRequestProcessor.checkACL(zks, request.cnxn, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ, request.authInfo, path, null);
                int number = zks.getZKDatabase().getAllChildrenNumber(path);
                rsp = new GetAllChildrenNumberResponse(number);
                break;
            }
            case OpCode.getChildren2: {
                lastOp = "GETC";
                GetChildren2Request getChildren2Request = new GetChildren2Request();
                ByteBufferInputStream.byteBuffer2Record(request.request, getChildren2Request);
                Stat stat = new Stat();
                path = getChildren2Request.getPath();
                DataNode n = zks.getZKDatabase().getNode(path);
                if (n == null) {
                    throw new KeeperException.NoNodeException();
                }
                PrepRequestProcessor.checkACL(zks, request.cnxn, zks.getZKDatabase().aclForNode(n),
                    ZooDefs.Perms.READ, request.authInfo, path, null);
                List<String> children = zks.getZKDatabase().getChildren(
                    path, stat, getChildren2Request.getWatch() ? cnxn : null);
                rsp = new GetChildren2Response(children, stat);
                break;
            }
            case OpCode.checkWatches: {
                lastOp = "CHKW";
                CheckWatchesRequest checkWatches = new CheckWatchesRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, checkWatches);
                WatcherType type = WatcherType.fromInt(checkWatches.getType());
                path = checkWatches.getPath();
                boolean containsWatcher = zks.getZKDatabase().containsWatcher(path, type, cnxn);
                if (!containsWatcher) {
                    String msg = String.format(Locale.ENGLISH, "%s (type: %s)", path, type);
                    throw new KeeperException.NoWatcherException(msg);
                }
                break;
            }
            case OpCode.removeWatches: {
                lastOp = "REMW";
                RemoveWatchesRequest removeWatches = new RemoveWatchesRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, removeWatches);
                WatcherType type = WatcherType.fromInt(removeWatches.getType());
                path = removeWatches.getPath();
                boolean removed = zks.getZKDatabase().removeWatch(path, type, cnxn);
                if (!removed) {
                    String msg = String.format(Locale.ENGLISH, "%s (type: %s)", path, type);
                    throw new KeeperException.NoWatcherException(msg);
                }
                break;
            }
            case OpCode.getEphemerals: {
                lastOp = "GETE";
                GetEphemeralsRequest getEphemerals = new GetEphemeralsRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getEphemerals);
                String prefixPath = getEphemerals.getPrefixPath();
                Set<String> allEphems =
                    zks.getZKDatabase().getDataTree().getEphemerals(request.sessionId);
                List<String> ephemerals = new ArrayList<>();
                if (StringUtils.isBlank(prefixPath) || "/".equals(prefixPath.trim())) {
                    ephemerals.addAll(allEphems);
                } else {
                    for (String p : allEphems) {
                        if (p.startsWith(prefixPath)) {
                            ephemerals.add(p);
                        }
                    }
                }
                rsp = new GetEphemeralsResponse(ephemerals);
                break;
            }
            }
        } catch (SessionMovedException e) {
            cnxn.sendCloseSession();
            return;
        } catch (KeeperException e) {
            err = e.code();
        } catch (Exception e) {
            // log at error level as we are returning a marshalling
            // error to the user
            LOG.error("Failed to process " + request, e);
            StringBuilder sb = new StringBuilder();
            ByteBuffer bb = request.request;
            bb.rewind();
            while (bb.hasRemaining()) {
                sb.append(Integer.toHexString(bb.get() & 0xff));
            }
            LOG.error("Dumping request buffer: 0x" + sb.toString());
            err = Code.MARSHALLINGERROR;
        }
        ReplyHeader hdr = new ReplyHeader(request.cxid, lastZxid, err.intValue());
        updateStats(request, lastOp, lastZxid);
        try {
            if (request.type == OpCode.getData && path != null && rsp != null) {
                // Serialized read responses could be cached by the connection object.
                // Cache entries are identified by their path and last modified zxid,
                // so these values are passed along with the response.
                GetDataResponse getDataResponse = (GetDataResponse) rsp;
                Stat stat = null;
                if (getDataResponse.getStat() != null) {
                    stat = getDataResponse.getStat();
                }
                cnxn.sendResponse(hdr, rsp, "response", path, stat);
            } else {
                cnxn.sendResponse(hdr, rsp, "response");
            }
            if (request.type == OpCode.closeSession) {
                cnxn.sendCloseSession();
            }
        } catch (IOException e) {
            LOG.error("FIXMSG", e);
        }
    }

    // shutdown method
    public void shutdown() {
        // we are the final link in the chain
        LOG.info("shutdown of request processor complete");
    }
}
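The bookkeeping at the top of processRequest, where change records up to the applied zxid are drained from outstandingChanges, is easy to miss in the full listing. The sketch below isolates it. OutstandingChangesSketch and its nested ChangeRecord are simplified, hypothetical stand-ins that keep only a zxid and a path. The metrics and the warning log are omitted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the outstandingChanges cleanup in FinalRequestProcessor.
class OutstandingChangesSketch {
    // Simplified change record: only the fields the cleanup loop looks at.
    static final class ChangeRecord {
        final long zxid;
        final String path;

        ChangeRecord(long zxid, String path) {
            this.zxid = zxid;
            this.path = path;
        }
    }

    final Deque<ChangeRecord> outstandingChanges = new ArrayDeque<>();
    final Map<String, ChangeRecord> outstandingChangesForPath = new HashMap<>();

    // Models an earlier stage registering a pending change; the map keeps the latest per path.
    void add(long zxid, String path) {
        ChangeRecord cr = new ChangeRecord(zxid, path);
        outstandingChanges.add(cr);
        outstandingChangesForPath.put(path, cr);
    }

    // Called once the transaction with this zxid has been applied; returns how many were removed.
    int applied(long zxid) {
        int removed = 0;
        synchronized (outstandingChanges) {
            while (!outstandingChanges.isEmpty() && outstandingChanges.peek().zxid <= zxid) {
                ChangeRecord cr = outstandingChanges.remove();
                removed++;
                // Only drop the per-path entry if it still points at this exact record.
                if (outstandingChangesForPath.get(cr.path) == cr) {
                    outstandingChangesForPath.remove(cr.path);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        OutstandingChangesSketch s = new OutstandingChangesSketch();
        s.add(1, "/a");
        s.add(2, "/b");
        s.add(3, "/a");
        System.out.println("removed: " + s.applied(2));
        System.out.println("paths still pending: " + s.outstandingChangesForPath.keySet());
    }
}
```

The identity check matters when the same path has a newer pending change: in the example, applying zxid 2 removes the records for zxid 1 and 2, but the map entry for /a survives because it already refers to the record with zxid 3.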