Storm DRPC example

Preface

This article walks through an example of Storm DRPC (distributed RPC).

Configuration

version: '2'
services:
    supervisor:
        image: storm
        container_name: supervisor
        command: storm supervisor -c storm.local.hostname="192.168.99.100" -c drpc.servers='["192.168.99.100"]' -c drpc.port=3772 -c drpc.invocations.port=3773 -c drpc.http.port=3774
        depends_on:
            - nimbus
            - zookeeper
        links:
            - nimbus
            - zookeeper
        restart: always
        ports:
            - 6700:6700
            - 6701:6701
            - 6702:6702
            - 6703:6703
            - 8000:8000
    drpc:
        image: storm
        container_name: drpc
        command: storm drpc -c storm.local.hostname="192.168.99.100" -c drpc.port=3772 -c drpc.invocations.port=3773 -c drpc.http.port=3774
        depends_on:
            - nimbus
            - supervisor
            - zookeeper
        links:
            - nimbus
            - supervisor
            - zookeeper
        restart: always
        ports:
            - 3772:3772
            - 3773:3773
            - 3774:3774
  • Here, drpc.servers, drpc.port, and drpc.invocations.port are configured on the supervisor so that its workers can reach the DRPC node through drpc.invocations.port.
  • The drpc service exposes drpc.port (for external DRPCClient access), drpc.invocations.port (for worker access), and drpc.http.port (for HTTP access).
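
To make the two port roles concrete, here is a toy in-memory model of the request flow in plain Java, with no Storm dependency. The class ToyDrpcServer and its demo method are hypothetical names used only for illustration. The methods execute, fetchRequest, and result mirror the roles of the calls that clients and workers make against a DRPC server, but this is a sketch of the idea under those assumptions, not Storm's implementation (for instance, everything runs in one JVM and fetchRequest simply blocks).

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: everything lives in one JVM, no Thrift, no network.
class ToyDrpcServer {
    /** One pending call: request id plus the argument string. */
    static final class Request {
        final String id;
        final String args;

        Request(String id, String args) {
            this.id = id;
            this.args = args;
        }
    }

    private final AtomicLong nextId = new AtomicLong();
    // One queue of pending requests per function name.
    private final Map<String, BlockingQueue<Request>> pending = new ConcurrentHashMap<>();
    // Futures that blocked clients are waiting on, keyed by request id.
    private final Map<String, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();

    private BlockingQueue<Request> queueFor(String function) {
        return pending.computeIfAbsent(function, f -> new LinkedBlockingQueue<>());
    }

    // Client-facing side (the role of drpc.port): enqueue the call, block until a worker answers.
    String execute(String function, String args) {
        Request request = new Request(Long.toString(nextId.incrementAndGet()), args);
        CompletableFuture<String> future = new CompletableFuture<>();
        waiting.put(request.id, future);
        queueFor(function).add(request);
        return future.join();
    }

    // Worker-facing side (the role of drpc.invocations.port): take the next call for a function.
    Request fetchRequest(String function) {
        try {
            return queueFor(function).take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a request", e);
        }
    }

    // Worker-facing side: hand the result back so the blocked client can return.
    void result(String id, String value) {
        CompletableFuture<String> future = waiting.remove(id);
        if (future != null) {
            future.complete(value);
        }
    }

    // Runs one worker thread that answers a single "words" call with the number of words.
    static String demo() {
        ToyDrpcServer server = new ToyDrpcServer();
        Thread worker = new Thread(() -> {
            Request request = server.fetchRequest("words");
            server.result(request.id, Integer.toString(request.args.split(" ").length));
        });
        worker.setDaemon(true);
        worker.start();
        return server.execute("words", "cat dog the man");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client only ever touches execute, and the worker only ever touches fetchRequest and result, which is why the compose file gives the supervisor the invocations port while external callers need only drpc.port.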

TridentTopology

    @Test
    public void testDeployDRPCStateQuery() throws InterruptedException, TException {
        TridentTopology topology = new TridentTopology();
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
                new Values("the cow jumped over the moon"),
                new Values("the man went to the store and bought some candy"),
                new Values("four score and seven years ago"),
                new Values("how many apples can you eat"));
        spout.setCycle(true);
        TridentState wordCounts =
                topology.newStream("spout1", spout)
                        .each(new Fields("sentence"), new Split(), new Fields("word"))
                        .groupBy(new Fields("word"))
                        // NOTE: persistentAggregate turns the grouped stream into a TridentState object
                        .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
                        .parallelismHint(6);

        topology.newDRPCStream("words")
                .each(new Fields("args"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                .stateQuery(wordCounts, new Fields("word"), new MapGet(), new Fields("count"))
                .each(new Fields("count"), new FilterNull())
                .aggregate(new Fields("count"), new Sum(), new Fields("sum"));

        StormTopology stormTopology = topology.build();

        // For remote submission, build the jar first: mvn clean package -Dmaven.test.skip=true
        // StormSubmitter reads the jar path from System.getProperty("storm.jar"); submission fails if it is not set.
        System.setProperty("storm.jar", TOPOLOGY_JAR);

        Config conf = new Config();
        conf.put(Config.NIMBUS_SEEDS, Arrays.asList("192.168.99.100")); // nimbus host address(es)
        conf.put(Config.NIMBUS_THRIFT_PORT, 6627); // nimbus Thrift port, default 6627
        conf.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList("192.168.99.100")); // ZooKeeper host addresses; pass multiple hosts as a list
        conf.put(Config.STORM_ZOOKEEPER_PORT, 2181); // ZooKeeper port, default 2181

        StormSubmitter.submitTopology("DRPCStateQuery", conf, stormTopology);
    }
  • Here, newStream followed by persistentAggregate creates a TridentState, and newDRPCStream creates a DRPC stream whose stateQuery reads from that TridentState.
  • Because the TridentState keeps the word counts in a MemoryMapState, the DRPC stream looks them up with stateQuery each time a DRPC request arrives.
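
As a rough illustration of what the two streams compute, the following plain-Java sketch replays the same logic without Storm: one method builds the word counts (the role of Split, groupBy, and persistentAggregate with Count), and another answers a query (the role of Split, MapGet, FilterNull, and Sum). The class and method names are hypothetical, and the sketch counts each sentence only once, whereas the real spout cycles through its sentences, so the counts in a running topology keep growing.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the Trident topology above; no Storm classes are used.
class DrpcStateQuerySketch {
    // Plays the role of the MemoryMapState: word -> running count.
    static Map<String, Long> countWords(String... sentences) {
        Map<String, Long> counts = new HashMap<>();
        for (String sentence : sentences) {
            for (String word : sentence.split(" ")) {   // Split
                counts.merge(word, 1L, Long::sum);      // groupBy + Count
            }
        }
        return counts;
    }

    // Plays the role of the DRPC stream: split the args, look up each word, sum the hits.
    static long query(Map<String, Long> counts, String args) {
        long sum = 0;
        for (String word : args.split(" ")) {           // Split
            Long count = counts.get(word);              // MapGet
            if (count != null) {                        // FilterNull
                sum += count;                           // Sum
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countWords(
                "the cow jumped over the moon",
                "the man went to the store and bought some candy",
                "four score and seven years ago",
                "how many apples can you eat");
        // "the" appears 4 times and "man" once; "cat" and "dog" are absent.
        System.out.println(query(counts, "cat dog the man")); // prints 5
    }
}
```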

DRPCClient

    @Test
    public void testLaunchDrpcClient() throws TException {
        Config conf = new Config();
        // NOTE: Config.DRPC_THRIFT_TRANSPORT_PLUGIN must be set; otherwise the client throws a NullPointerException.
        conf.put(Config.DRPC_THRIFT_TRANSPORT_PLUGIN,SimpleTransportPlugin.class.getName());
        conf.put(Config.STORM_NIMBUS_RETRY_TIMES,3);
        conf.put(Config.STORM_NIMBUS_RETRY_INTERVAL,10000);
        conf.put(Config.STORM_NIMBUS_RETRY_INTERVAL_CEILING,10000);
        conf.put(Config.DRPC_MAX_BUFFER_SIZE, 104857600); // 100M
        DRPCClient client = new DRPCClient(conf, "192.168.99.100", 3772);
        System.out.println(client.execute("words", "cat dog the man"));
    }
  • None of these configuration items can be omitted here; otherwise a NullPointerException is thrown.
  • Config.DRPC_THRIFT_TRANSPORT_PLUGIN is set to SimpleTransportPlugin.class.getName(), which is deprecated but still works.
  • Because SimpleTransportPlugin is used, Config.DRPC_MAX_BUFFER_SIZE is also configured here.
  • DRPCClient is constructed with the address and port (drpc.port) of the DRPC server.
  • client.execute takes the function name that was passed to newDRPCStream, followed by the argument string.

Summary

  • To use DRPC, start a DRPC server node with storm drpc and expose three ports: drpc.port for external DRPCClient calls (DRPCClient uses the Thrift protocol), drpc.invocations.port for worker access, and drpc.http.port for calls over HTTP.
  • The supervisor is configured with drpc.servers and drpc.invocations.port so that workers can reach the DRPC server.
  • DRPCClient connects on the port given by drpc.port, and client.execute takes the function name passed to newDRPCStream.
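
For the HTTP route through drpc.http.port, a request URL can be built with the JDK alone. The sketch below assumes that the DRPC HTTP server accepts GET requests at /drpc/<function>/<args>; that path layout is not shown in this article and should be checked against the Storm version in use. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class DrpcHttpUriSketch {
    // Builds http://host:port/drpc/<function>/<args>. The path layout is an assumption to verify.
    static URI drpcUri(String host, int port, String function, String args) {
        try {
            // The multi-argument constructor percent-encodes characters that are illegal
            // in a path, such as the spaces between the words. A "/" inside args would
            // be kept as a path separator, so this sketch expects args without slashes.
            return new URI("http", null, host, port, "/drpc/" + function + "/" + args, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build DRPC URI", e);
        }
    }

    public static void main(String[] argv) {
        // Same host, function, and arguments as the DRPCClient example, but on the HTTP port.
        System.out.println(drpcUri("192.168.99.100", 3774, "words", "cat dog the man"));
    }
}
```

The resulting URI can be handed to any HTTP client; the Thrift-based DRPCClient shown earlier remains the route this article actually tests.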

Posted by temidayo on Sun, 27 Jan 2019 14:33:15 -0800