A Look at Storm's AssignmentDistributionService

Keywords: Programming supervisor Java Apache socket

Preface

This article looks at Storm's AssignmentDistributionService.

AssignmentDistributionService

storm-2.0.0/storm-server/src/main/java/org/apache/storm/nimbus/AssignmentDistributionService.java

/**
 * A service for distributing master assignments to supervisors, this service makes the assignments notification
 * asynchronous.
 *
 * <p>We support multiple working threads to distribute assignment, every thread has a queue buffer.
 *
 * <p>Master will shuffle its node request to the queues, if the target queue is full, we just discard the request,
 * let the supervisors sync instead.
 *
 * <p>Caution: this class is not thread safe.
 *
 * <pre>{@code
 * Working mode
 *                      +--------+         +-----------------+
 *                      | queue1 |   ==>   | Working thread1 |
 * +--------+ shuffle   +--------+         +-----------------+
 * | Master |   ==>
 * +--------+           +--------+         +-----------------+
 *                      | queue2 |   ==>   | Working thread2 |
 *                      +--------+         +-----------------+
 * }
 * </pre>
 */
public class AssignmentDistributionService implements Closeable {
    //......
    private ExecutorService service;

    /**
     * Assignments request queue.
     */
    private volatile Map<Integer, LinkedBlockingQueue<NodeAssignments>> assignmentsQueue;

    /**
     * Add an assignments for a node/supervisor for distribution.
     * @param node node id of supervisor.
     * @param host host name for the node.
     * @param serverPort node thrift server port.
     * @param assignments the {@link org.apache.storm.generated.SupervisorAssignments}
     */
    public void addAssignmentsForNode(String node, String host, Integer serverPort, SupervisorAssignments assignments) {
        try {
            //For some reasons, we can not get supervisor port info, eg: supervisor shutdown,
            //Just skip for this scheduling round.
            if (serverPort == null) {
                LOG.warn("Discard an assignment distribution for node {} because server port info is missing.", node);
                return;
            }

            boolean success = nextQueue().offer(NodeAssignments.getInstance(node, host, serverPort, assignments), 5L, TimeUnit.SECONDS);
            if (!success) {
                LOG.warn("Discard an assignment distribution for node {} because the target sub queue is full.", node);
            }

        } catch (InterruptedException e) {
            LOG.error("Add node assignments interrupted: {}", e.getMessage());
            throw new RuntimeException(e);
        }
    }

    private LinkedBlockingQueue<NodeAssignments> nextQueue() {
        return this.assignmentsQueue.get(nextQueueId());
    }
}
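  • Nimbus notifies supervisors of task assignment results by calling addAssignmentsForNode on AssignmentDistributionService.
  • addAssignmentsForNode wraps the SupervisorAssignments in a NodeAssignments and offers it to one of the queues in assignmentsQueue. If the server port is unknown, or the offer does not succeed within 5 seconds because the queue is full, the request is discarded with a warning.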
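
The discard behaviour follows from LinkedBlockingQueue.offer with a timeout, which returns false instead of blocking indefinitely when the queue stays full. Here is a minimal, self-contained sketch of that idea. The class BoundedOfferSketch and the method tryEnqueue are illustrative names, not Storm code, and the timeout is shortened from Storm's 5 seconds so the demo finishes quickly. Unlike Storm, which wraps the InterruptedException in a RuntimeException, this sketch simply reports the item as not queued.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative only: mirrors the "offer with timeout, discard on failure" idea.
class BoundedOfferSketch {
    /** Returns true if the item was queued, false if it was discarded because the queue stayed full. */
    static boolean tryEnqueue(LinkedBlockingQueue<String> queue, String item, long timeoutMillis) {
        try {
            return queue.offer(item, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and treat the item as not queued
            return false;
        }
    }

    public static void main(String[] args) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(1); // capacity 1
        System.out.println(tryEnqueue(queue, "node-a", 50)); // true: the queue had room
        System.out.println(tryEnqueue(queue, "node-b", 50)); // false: full and nobody is draining it
    }
}
```

A discarded request is not lost for good: as the class Javadoc says, the supervisor is expected to sync by itself later.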

AssignmentDistributionService.getInstance

storm-2.0.0/storm-server/src/main/java/org/apache/storm/nimbus/AssignmentDistributionService.java

    /**
     * Factory method for initialize a instance.
     * @param conf config.
     * @return an instance of {@link AssignmentDistributionService}
     */
    public static AssignmentDistributionService getInstance(Map conf) {
        AssignmentDistributionService service = new AssignmentDistributionService();
        service.prepare(conf);
        return service;
    }

    /**
     * Function for initialization.
     *
     * @param conf config
     */
    public void prepare(Map conf) {
        this.conf = conf;
        this.random = new Random(47);

        this.threadsNum = ObjectReader.getInt(conf.get(DaemonConfig.NIMBUS_ASSIGNMENTS_SERVICE_THREADS), 10);
        this.queueSize = ObjectReader.getInt(conf.get(DaemonConfig.NIMBUS_ASSIGNMENTS_SERVICE_THREAD_QUEUE_SIZE), 100);

        this.assignmentsQueue = new HashMap<>();
        for (int i = 0; i < threadsNum; i++) {
            this.assignmentsQueue.put(i, new LinkedBlockingQueue<NodeAssignments>(queueSize));
        }
        //start the thread pool
        this.service = Executors.newFixedThreadPool(threadsNum);
        this.active = true;
        //start the threads
        for (int i = 0; i < threadsNum; i++) {
            this.service.submit(new DistributeTask(this, i));
        }
        // for local cluster
        localSupervisors = new HashMap<>();
        if (ConfigUtils.isLocalMode(conf)) {
            isLocalMode = true;
        }
    }
  • getInstance creates a new AssignmentDistributionService and calls prepare to initialize it.
  • prepare creates threadsNum LinkedBlockingQueue instances (DaemonConfig.NIMBUS_ASSIGNMENTS_SERVICE_THREADS, falling back to 10), each with a capacity of queueSize (DaemonConfig.NIMBUS_ASSIGNMENTS_SERVICE_THREAD_QUEUE_SIZE, falling back to 100).
  • It also creates a thread pool via Executors.newFixedThreadPool(threadsNum) and submits threadsNum DistributeTask instances, so each queue is served by exactly one DistributeTask.
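
The queue layout can be sketched in isolation as follows. This is a simplified illustration, not Storm code: ShardedQueuesSketch and its methods are hypothetical names, and the random selection in nextQueueId is an assumption based on the Javadoc's "shuffle" wording and the Random(47) field, since the body of Storm's nextQueueId() is not shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: one bounded queue per worker index, chosen at random on enqueue.
class ShardedQueuesSketch {
    private final Map<Integer, LinkedBlockingQueue<String>> queues = new HashMap<>();
    private final Random random = new Random(47); // fixed seed, as in prepare()
    private final int workers;

    ShardedQueuesSketch(int workers, int queueSize) {
        this.workers = workers;
        for (int i = 0; i < workers; i++) {
            queues.put(i, new LinkedBlockingQueue<>(queueSize));
        }
    }

    /** Assumed selection policy: a uniform random index in [0, workers). */
    int nextQueueId() {
        return random.nextInt(workers);
    }

    /** Non-blocking enqueue onto a randomly chosen queue; returns its index, or -1 if that queue was full. */
    int enqueue(String item) {
        int id = nextQueueId();
        return queues.get(id).offer(item) ? id : -1;
    }

    int size(int id) {
        return queues.get(id).size();
    }

    public static void main(String[] args) {
        ShardedQueuesSketch sketch = new ShardedQueuesSketch(3, 100);
        for (int i = 0; i < 9; i++) {
            sketch.enqueue("assignment-" + i);
        }
        for (int i = 0; i < 3; i++) {
            System.out.println("queue " + i + " holds " + sketch.size(i));
        }
    }
}
```

Giving every worker its own queue means the workers never contend with each other when taking items; only the producer touches more than one queue.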

DistributeTask

storm-2.0.0/storm-server/src/main/java/org/apache/storm/nimbus/AssignmentDistributionService.java

    /**
     * Task to distribute assignments.
     */
    static class DistributeTask implements Runnable {
        private AssignmentDistributionService service;
        private Integer queueIndex;

        DistributeTask(AssignmentDistributionService service, Integer index) {
            this.service = service;
            this.queueIndex = index;
        }

        @Override
        public void run() {
            while (service.isActive()) {
                try {
                    NodeAssignments nodeAssignments = this.service.nextAssignments(queueIndex);
                    sendAssignmentsToNode(nodeAssignments);
                } catch (InterruptedException e) {
                    if (service.isActive()) {
                        LOG.error("Get an unexpected interrupt when distributing assignments to node, {}", e.getCause());
                    } else {
                        // service is off now just interrupt it.
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }

        private void sendAssignmentsToNode(NodeAssignments assignments) {
            if (this.service.isLocalMode) {
                //local node
                Supervisor supervisor = this.service.localSupervisors.get(assignments.getNode());
                if (supervisor != null) {
                    supervisor.sendSupervisorAssignments(assignments.getAssignments());
                } else {
                    LOG.error("Can not find node {} for assignments distribution", assignments.getNode());
                    throw new RuntimeException("null for node " + assignments.getNode() + " supervisor instance.");
                }
            } else {
                // distributed mode
                try (SupervisorClient client = SupervisorClient.getConfiguredClient(service.getConf(),
                                                                                    assignments.getHost(), assignments.getServerPort())) {
                    try {
                        client.getClient().sendSupervisorAssignments(assignments.getAssignments());
                    } catch (Exception e) {
                        //just ignore the exception.
                        LOG.error("Exception when trying to send assignments to node {}: {}", assignments.getNode(), e.getMessage());
                    }
                } catch (Throwable e) {
                    //just ignore any error/exception.
                    LOG.error("Exception to create supervisor client for node {}: {}", assignments.getNode(), e.getMessage());
                }

            }
        }
    }

    /**
     * Get an assignments from the target queue with the specific index.
     * @param queueIndex index of the queue
     * @return an {@link NodeAssignments}
     * @throws InterruptedException
     */
    public NodeAssignments nextAssignments(Integer queueIndex) throws InterruptedException {
        NodeAssignments target = null;
        while (true) {
            target = getQueueById(queueIndex).poll();
            if (target != null) {
                return target;
            }
            Time.sleep(100L);
        }
    }
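  • During prepare, AssignmentDistributionService submits the DistributeTask instances to the thread pool.
  • DistributeTask.run loops while the service is active: it takes a NodeAssignments from its own queue via nextAssignments (which polls the queue and sleeps 100 ms whenever it is empty) and then calls sendAssignmentsToNode.
  • In distributed mode, sendAssignmentsToNode calls client.getClient().sendSupervisorAssignments(assignments.getAssignments()) on a SupervisorClient; in local mode it calls sendSupervisorAssignments on the local Supervisor instance directly. Failures of the remote call are logged and otherwise ignored.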
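
The worker loop can be sketched on its own like this. It is a simplified stand-in rather than Storm's implementation: PollingWorkerSketch and its methods are hypothetical, the "send" step just records the item in a list, and a timed poll replaces the poll-then-Time.sleep(100L) combination used by nextAssignments.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative only: one worker draining one queue until the service is switched off.
class PollingWorkerSketch {
    private volatile boolean active = true;
    private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(100);
    private final List<String> delivered = new CopyOnWriteArrayList<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(1);

    void start() {
        pool.submit(() -> {
            while (active) {
                try {
                    // Timed poll: wakes up at least every 100 ms to re-check the active flag.
                    String item = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (item != null) {
                        delivered.add(item); // stand-in for sendAssignmentsToNode
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // the service is shutting down
                    return;
                }
            }
        });
    }

    boolean submit(String item) {
        return queue.offer(item);
    }

    /** Waits until at least n items were delivered or the timeout elapses, then returns the delivered items. */
    List<String> awaitDelivered(int n, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (delivered.size() < n && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return delivered;
    }

    void close() {
        active = false;
        pool.shutdownNow(); // interrupts the worker if it is blocked in poll()
    }

    public static void main(String[] args) {
        PollingWorkerSketch worker = new PollingWorkerSketch();
        worker.start();
        worker.submit("node-a");
        worker.submit("node-b");
        System.out.println(worker.awaitDelivered(2, 5000)); // [node-a, node-b]
        worker.close();
    }
}
```

As in DistributeTask, the loop condition is a flag owned by the service, so turning the flag off and interrupting the thread is enough to stop the worker.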

Supervisor.launchSupervisorThriftServer

storm-2.0.0/storm-server/src/main/java/org/apache/storm/daemon/supervisor/Supervisor.java

    private void launchSupervisorThriftServer(Map<String, Object> conf) throws IOException {
        // validate port
        int port = getThriftServerPort();
        try {
            ServerSocket socket = new ServerSocket(port);
            socket.close();
        } catch (BindException e) {
            LOG.error("{} is not available. Check if another process is already listening on {}", port, port);
            throw new RuntimeException(e);
        }

        TProcessor processor = new org.apache.storm.generated.Supervisor.Processor(
            new org.apache.storm.generated.Supervisor.Iface() {
                @Override
                public void sendSupervisorAssignments(SupervisorAssignments assignments)
                    throws AuthorizationException, TException {
                    checkAuthorization("sendSupervisorAssignments");
                    LOG.info("Got an assignments from master, will start to sync with assignments: {}", assignments);
                    SynchronizeAssignments syn = new SynchronizeAssignments(getSupervisor(), assignments,
                                                                            getReadClusterState());
                    getEventManger().add(syn);
                }

                //......
            });
        this.thriftServer = new ThriftServer(conf, processor, ThriftConnectionType.SUPERVISOR);
        this.thriftServer.serve();
    }
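  • Supervisor.launchSupervisorThriftServer first checks that the Thrift port is free by briefly binding a ServerSocket to it. It then registers a TProcessor whose sendSupervisorAssignments handler wraps the received SupervisorAssignments in a SynchronizeAssignments and adds it to the EventManager.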
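
The port validation step is a general technique that can be tried outside Storm. The sketch below uses hypothetical names (PortCheckSketch, isPortAvailable, probeWhileHeld) and treats any IOException as "not available", whereas the Storm code above only catches BindException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Illustrative only: probe a port by binding a throwaway listening socket to it.
class PortCheckSketch {
    /** Returns true if a listening socket could be bound to the port at this moment. */
    static boolean isPortAvailable(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) { // BindException is a subclass of IOException
            return false;
        }
    }

    /** Holds an OS-chosen port open and probes it, to show the "already in use" case. */
    static boolean probeWhileHeld() {
        try (ServerSocket held = new ServerSocket(0)) { // 0 = let the OS pick a free port
            return isPortAvailable(held.getLocalPort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("available while held: " + probeWhileHeld());
    }
}
```

A check like this is only a courtesy: another process can still grab the port between the probe and the real bind, so the server start itself must also be prepared to fail.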

SynchronizeAssignments.run

storm-2.0.0/storm-server/src/main/java/org/apache/storm/daemon/supervisor/timer/SynchronizeAssignments.java

/**
 * A runnable which will synchronize assignments to node local and then worker processes.
 */
public class SynchronizeAssignments implements Runnable {
    //......
    @Override
    public void run() {
        // first sync assignments to local, then sync processes.
        if (null == assignments) {
            getAssignmentsFromMaster(this.supervisor.getConf(), this.supervisor.getStormClusterState(), this.supervisor.getAssignmentId());
        } else {
            assignedAssignmentsToLocal(this.supervisor.getStormClusterState(), assignments);
        }
        this.readClusterState.run();
    }

    private static void assignedAssignmentsToLocal(IStormClusterState clusterState, SupervisorAssignments assignments) {
        if (null == assignments) {
            //unknown error, just skip
            return;
        }
        Map<String, byte[]> serAssignments = new HashMap<>();
        for (Map.Entry<String, Assignment> entry : assignments.get_storm_assignment().entrySet()) {
            serAssignments.put(entry.getKey(), Utils.serialize(entry.getValue()));
        }
        clusterState.syncRemoteAssignments(serAssignments);
    }
}
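  • If assignments is non-null (Nimbus pushed them), run calls assignedAssignmentsToLocal; if it is null, run calls getAssignmentsFromMaster instead. In both cases it then triggers this.readClusterState.run().
  • assignedAssignmentsToLocal serializes each Assignment and calls clusterState.syncRemoteAssignments(serAssignments).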
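
The control flow of run (use what was pushed, otherwise pull, then trigger the follow-up) can be sketched generically. Everything here is hypothetical: PushOrPullSketch is not a Storm class, plain strings stand in for serialized assignments, and the Supplier stands in for getAssignmentsFromMaster, whose body is not shown above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative only: prefer pushed data, fall back to pulling, then run a follow-up step.
class PushOrPullSketch implements Runnable {
    private final Map<String, String> pushed;                   // null means nothing was pushed
    private final Supplier<Map<String, String>> pullFromMaster; // stand-in for asking the master
    private final Map<String, String> localState;               // stand-in for the local backend
    private final Runnable afterSync;                           // stand-in for readClusterState.run()

    PushOrPullSketch(Map<String, String> pushed, Supplier<Map<String, String>> pullFromMaster,
                     Map<String, String> localState, Runnable afterSync) {
        this.pushed = pushed;
        this.pullFromMaster = pullFromMaster;
        this.localState = localState;
        this.afterSync = afterSync;
    }

    @Override
    public void run() {
        Map<String, String> source = (pushed != null) ? pushed : pullFromMaster.get();
        localState.clear();          // assumed: local state is replaced wholesale
        localState.putAll(source);
        afterSync.run();
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        Supplier<Map<String, String>> master = () -> Collections.singletonMap("topology-1", "pulled");
        Runnable followUp = () -> System.out.println("local state is now " + local);

        new PushOrPullSketch(Collections.singletonMap("topology-1", "pushed"), master, local, followUp).run();
        new PushOrPullSketch(null, master, local, followUp).run();
    }
}
```

The same Runnable therefore serves both paths: the Thrift handler constructs it with the pushed assignments, while a caller without any can pass null and let it pull.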

StormClusterStateImpl.syncRemoteAssignments

storm-2.0.0/storm-client/src/jvm/org/apache/storm/cluster/StormClusterStateImpl.java

    @Override
    public void syncRemoteAssignments(Map<String, byte[]> remote) {
        if (null != remote) {
            this.assignmentsBackend.syncRemoteAssignments(remote);
        } else {
            Map<String, byte[]> tmp = new HashMap<>();
            List<String> stormIds = this.stateStorage.get_children(ClusterUtils.ASSIGNMENTS_SUBTREE, false);
            for (String stormId : stormIds) {
                byte[] assignment = this.stateStorage.get_data(ClusterUtils.assignmentPath(stormId), false);
                tmp.put(stormId, assignment);
            }
            this.assignmentsBackend.syncRemoteAssignments(tmp);
        }
    }
  • syncRemoteAssignments writes the serialized assignments to assignmentsBackend (local memory).
  • If remote is null, the assignments are read from ZooKeeper and then written to memory; the ZooKeeper path is ClusterUtils.assignmentPath(stormId), i.e. /assignments/{topologyId}.

ReadClusterState.run

storm-2.0.0/storm-server/src/main/java/org/apache/storm/daemon/supervisor/ReadClusterState.java

    @Override
    public synchronized void run() {
        try {
            List<String> stormIds = stormClusterState.assignments(null);
            Map<String, Assignment> assignmentsSnapshot = getAssignmentsSnapshot(stormClusterState);

            Map<Integer, LocalAssignment> allAssignments = readAssignments(assignmentsSnapshot);
            if (allAssignments == null) {
                //Something odd happened try again later
                return;
            }
            Map<String, List<ProfileRequest>> topoIdToProfilerActions = getProfileActions(stormClusterState, stormIds);

            HashSet<Integer> assignedPorts = new HashSet<>();
            LOG.debug("Synchronizing supervisor");
            LOG.debug("All assignment: {}", allAssignments);
            LOG.debug("Topology Ids -> Profiler Actions {}", topoIdToProfilerActions);
            for (Integer port : allAssignments.keySet()) {
                if (iSuper.confirmAssigned(port)) {
                    assignedPorts.add(port);
                }
            }
            HashSet<Integer> allPorts = new HashSet<>(assignedPorts);
            iSuper.assigned(allPorts);
            allPorts.addAll(slots.keySet());

            Map<Integer, Set<TopoProfileAction>> filtered = new HashMap<>();
            for (Entry<String, List<ProfileRequest>> entry : topoIdToProfilerActions.entrySet()) {
                String topoId = entry.getKey();
                if (entry.getValue() != null) {
                    for (ProfileRequest req : entry.getValue()) {
                        NodeInfo ni = req.get_nodeInfo();
                        if (host.equals(ni.get_node())) {
                            Long port = ni.get_port().iterator().next();
                            Set<TopoProfileAction> actions = filtered.get(port.intValue());
                            if (actions == null) {
                                actions = new HashSet<>();
                                filtered.put(port.intValue(), actions);
                            }
                            actions.add(new TopoProfileAction(topoId, req));
                        }
                    }
                }
            }

            for (Integer port : allPorts) {
                Slot slot = slots.get(port);
                if (slot == null) {
                    slot = mkSlot(port);
                    slots.put(port, slot);
                    slot.start();
                }
                slot.setNewAssignment(allAssignments.get(port));
                slot.addProfilerActions(filtered.get(port));
            }

        } catch (Exception e) {
            LOG.error("Failed to Sync Supervisor", e);
            throw new RuntimeException(e);
        }
    }
  • For each port, run calls slot.setNewAssignment, which sets the slot's AtomicReference<LocalAssignment> newAssignment; a Slot is created and started first if the port does not have one yet.
  • Slot's run method polls newAssignment through stateMachineStep and updates nextState accordingly.
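
The handoff between ReadClusterState and Slot is a plain AtomicReference pattern: one side publishes the desired value, the other side notices it on its next polling step. The sketch below is a heavily reduced illustration with hypothetical names (SlotHandoffSketch, step); Storm's real stateMachineStep is more involved and is not shown in this article.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: a producer publishes the desired assignment, a poller picks it up.
class SlotHandoffSketch {
    private final AtomicReference<String> newAssignment = new AtomicReference<>();
    private String currentAssignment; // only touched by the polling side

    /** Called by the producer (the role ReadClusterState.run plays in Storm). */
    void setNewAssignment(String assignment) {
        newAssignment.set(assignment);
    }

    /** One polling step: returns true if the slot changed what it is running. */
    boolean step() {
        String wanted = newAssignment.get();
        if (Objects.equals(wanted, currentAssignment)) {
            return false; // nothing to do
        }
        currentAssignment = wanted; // a real slot would stop and start workers here
        return true;
    }

    String current() {
        return currentAssignment;
    }

    public static void main(String[] args) {
        SlotHandoffSketch slot = new SlotHandoffSketch();
        System.out.println(slot.step()); // false: nothing assigned yet
        slot.setNewAssignment("topology-1");
        System.out.println(slot.step()); // true: picked up the new assignment
        System.out.println(slot.step()); // false: no further change
    }
}
```

Because the producer only sets a reference, it never blocks on the slot; if several updates arrive between two polling steps, the slot simply sees the latest one.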

Summary

  • Nimbus notifies supervisors of task assignment results by calling addAssignmentsForNode on AssignmentDistributionService.
    • addAssignmentsForNode puts the SupervisorAssignments, wrapped in a NodeAssignments, into assignmentsQueue. On initialization, AssignmentDistributionService creates a thread pool with the configured number of threads, along with the same number of queues and DistributeTask instances.
    • Each DistributeTask loops, pulling NodeAssignments from its own queue and calling sendAssignmentsToNode to notify the supervisor.
  • At startup the Supervisor launches its Thrift server via launchSupervisorThriftServer and registers a processor that handles sendSupervisorAssignments; the handler wraps the received SupervisorAssignments in a SynchronizeAssignments and adds it to the EventManager.
    • When the EventManager processes a SynchronizeAssignments, it executes its run method, which calls assignedAssignmentsToLocal and then triggers this.readClusterState.run().
    • assignedAssignmentsToLocal calls clusterState.syncRemoteAssignments(serAssignments) to write the assignments to local memory. readClusterState.run() mainly updates each slot's newAssignment; the Slot's polling loop then detects the change and triggers the corresponding processing.

Posted by SirChick on Fri, 01 Feb 2019 22:21:17 -0800