Preface
This article mainly demonstrates an example of Storm DRPC.
Configuration
```yaml
version: '2'
services:
  supervisor:
    image: storm
    container_name: supervisor
    command: storm supervisor -c storm.local.hostname="192.168.99.100" -c drpc.servers='["192.168.99.100"]' -c drpc.port=3772 -c drpc.invocations.port=3773 -c drpc.http.port=3774
    depends_on:
      - nimbus
      - zookeeper
    links:
      - nimbus
      - zookeeper
    restart: always
    ports:
      - 6700:6700
      - 6701:6701
      - 6702:6702
      - 6703:6703
      - 8000:8000
  drpc:
    image: storm
    container_name: drpc
    command: storm drpc -c storm.local.hostname="192.168.99.100" -c drpc.port=3772 -c drpc.invocations.port=3773 -c drpc.http.port=3774
    depends_on:
      - nimbus
      - supervisor
      - zookeeper
    links:
      - nimbus
      - supervisor
      - zookeeper
    restart: always
    ports:
      - 3772:3772
      - 3773:3773
      - 3774:3774
```
- Here, drpc.servers, drpc.port, and drpc.invocations.port are configured on the supervisor so that workers can reach the DRPC node through drpc.invocations.port.
- The DRPC service exposes drpc.port (for external DRPCClient access) and drpc.invocations.port (for worker access).
TridentTopology
```java
@Test
public void testDeployDRPCStateQuery() throws InterruptedException, TException {
    TridentTopology topology = new TridentTopology();
    FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
            new Values("the cow jumped over the moon"),
            new Values("the man went to the store and bought some candy"),
            new Values("four score and seven years ago"),
            new Values("how many apples can you eat"));
    spout.setCycle(true);
    TridentState wordCounts = topology.newStream("spout1", spout)
            .each(new Fields("sentence"), new Split(), new Fields("word"))
            .groupBy(new Fields("word"))
            // NOTE persistentAggregate transforms a Stream into a TridentState object
            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
            .parallelismHint(6);
    topology.newDRPCStream("words")
            .each(new Fields("args"), new Split(), new Fields("word"))
            .groupBy(new Fields("word"))
            .stateQuery(wordCounts, new Fields("word"), new MapGet(), new Fields("count"))
            .each(new Fields("count"), new FilterNull())
            .aggregate(new Fields("count"), new Sum(), new Fields("sum"));
    StormTopology stormTopology = topology.build();

    // For remote submission, first package the topology: mvn clean package -Dmaven.test.skip=true
    // StormSubmitter reads System.getProperty("storm.jar") to locate the jar; if it is not set, submission fails.
    System.setProperty("storm.jar", TOPOLOGY_JAR);
    Config conf = new Config();
    conf.put(Config.NIMBUS_SEEDS, Arrays.asList("192.168.99.100"));  // nimbus host address, such as 192.168.10.1
    conf.put(Config.NIMBUS_THRIFT_PORT, 6627);                       // nimbus thrift port, default 6627
    conf.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList("192.168.99.100")); // zookeeper hosts; a list, so multiple hosts can be given
    conf.put(Config.STORM_ZOOKEEPER_PORT, 2181);                     // zookeeper port, default 2181
    StormSubmitter.submitTopology("DRPCStateQuery", conf, stormTopology);
}
```
- Here, newStream creates a TridentState, and newDRPCStream creates a DRPCStream whose stateQuery points at the TridentState created earlier.
- Because the TridentState stores its results in a MemoryMapState, the DRPCStream queries that state via DRPC.
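The Split function used in both streams above is not shown; it is essentially a whitespace tokenizer that emits one tuple per word (the stock Split in the Storm examples does the same). A minimal plain-Java sketch of its core logic, without the Trident BaseFunction scaffolding:

```java
import java.util.Arrays;
import java.util.List;

public class SplitSketch {
    // In the real Split (a Trident BaseFunction), this logic lives in
    // execute(TridentTuple tuple, TridentCollector collector), and each
    // word is emitted to the collector as a separate one-field tuple.
    static List<String> split(String sentence) {
        return Arrays.asList(sentence.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(split("the cow jumped over the moon"));
    }
}
```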
DRPCClient
```java
@Test
public void testLaunchDrpcClient() throws TException {
    Config conf = new Config();
    // NOTE the Config.DRPC_THRIFT_TRANSPORT_PLUGIN property must be set, otherwise the client throws a NullPointerException.
    conf.put(Config.DRPC_THRIFT_TRANSPORT_PLUGIN, SimpleTransportPlugin.class.getName());
    conf.put(Config.STORM_NIMBUS_RETRY_TIMES, 3);
    conf.put(Config.STORM_NIMBUS_RETRY_INTERVAL, 10000);
    conf.put(Config.STORM_NIMBUS_RETRY_INTERVAL_CEILING, 10000);
    conf.put(Config.DRPC_MAX_BUFFER_SIZE, 104857600); // 100M
    DRPCClient client = new DRPCClient(conf, "192.168.99.100", 3772);
    System.out.println(client.execute("words", "cat dog the man"));
}
```
- Note that none of these configuration items can be omitted, otherwise a NullPointerException is thrown.
- Config.DRPC_THRIFT_TRANSPORT_PLUGIN is set to SimpleTransportPlugin.class.getName(); this plugin is deprecated, but it still works.
- Because SimpleTransportPlugin.class is used, Config.DRPC_MAX_BUFFER_SIZE is also configured here.
- DRPCClient is constructed with the address and port of the DRPC server.
- client.execute is passed the function name that was specified in newDRPCStream.
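The string returned by client.execute is the result serialized as JSON; for the "words" stream above, which emits a single sum tuple, it has the shape [[<sum>]]. A minimal sketch that pulls the number out with a regex (the sample string "[[12]]" is an assumed illustration, not output captured from the cluster):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DrpcResultSketch {
    // Extracts the first integer from a DRPC result string such as "[[12]]".
    static long extractSum(String drpcResult) {
        Matcher m = Pattern.compile("-?\\d+").matcher(drpcResult);
        if (!m.find()) {
            throw new IllegalArgumentException("no number in: " + drpcResult);
        }
        return Long.parseLong(m.group());
    }

    public static void main(String[] args) {
        // "[[12]]" is a hypothetical result string for illustration.
        System.out.println(extractSum("[[12]]"));
    }
}
```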
Summary
- When using DRPC, a DRPC server node must be started via storm drpc, exposing drpc.port (for external DRPCClient calls; DRPCClient uses the thrift protocol), drpc.invocations.port (for worker access), and drpc.http.port (for calls over HTTP).
- The supervisor is configured with drpc.servers and drpc.invocations.port so that workers can reach the DRPC server.
- DRPCClient connects to the port specified by drpc.port, and client.execute is passed the function name specified in newDRPCStream.
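For the drpc.http.port path mentioned above, Storm's DRPC HTTP server answers requests of the form http://<host>:<http-port>/drpc/<function>/<args>. A hedged sketch that only builds that URL (the host and port follow the compose file above; the endpoint layout is my understanding of Storm's HTTP DRPC server, and actually issuing the GET is left out so the snippet stays self-contained):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DrpcHttpUrlSketch {
    // Builds the request URL for Storm's DRPC HTTP endpoint:
    // http://<host>:<drpc.http.port>/drpc/<function>/<url-encoded args>
    static String drpcUrl(String host, int httpPort, String function, String args)
            throws UnsupportedEncodingException {
        // URLEncoder encodes spaces as '+', which is form encoding;
        // in a URL path segment a space should be %20, so fix that up.
        String encodedArgs = URLEncoder.encode(args, "UTF-8").replace("+", "%20");
        return "http://" + host + ":" + httpPort + "/drpc/" + function + "/" + encodedArgs;
    }

    public static void main(String[] args) throws Exception {
        // Same host and drpc.http.port as the docker-compose file above.
        System.out.println(drpcUrl("192.168.99.100", 3774, "words", "cat dog the man"));
    }
}
```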