Common errors in Flink on yarn

Keywords: Java Apache Scala Hadoop

1 Retrying connect to server
2 Unable to get ClusterClient status from Application Client
3 Cannot instantiate user function
4 Could not resolve substitution to a value: ${akka.stream.materializer}
5 java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer


1 Retrying connect to server
Flink on yarn relies on the hadoop cluster to execute the Flink startup command directly before starting hadoop

./bin/yarn-session.sh -n 1 -jm 2048 -tm 4096
1
The result is that flink can't connect to ResourceManager and the script is stuck for retry.

2018-05-19 14:36:08,062 INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032  
2018-05-19 14:36:09,231 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
2018-05-19 14:36:10,234 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
2018-05-19 14:36:11,235 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
2018-05-19 14:36:12,238 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
2018-05-19 14:36:13,240 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
2018-05-19 14:36:14,247 INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  


So don't worry. Start hadoop before you start Flink.

2 Unable to get ClusterClient status from Application Client
hadoop has been started, so execute the Flink startup command

./bin/yarn-session.sh -n 1 -jm 1024 -tm 4096

Flink did not start successfully

2018-05-19 15:30:10,456 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 
2018-05-19 15:30:21,680 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Could not retrieve the current cluster status. Skipping current retrieval attempt ...
java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:253)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.runInteractiveCli(FlinkYarnSessionCli.java:443)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:720)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
        at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:862)
        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
        ... 9 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:79)
        at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:857)
        ... 10 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at scala.concurrent.Await.result(package.scala)
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:77)
        ... 11 more
2018-05-19 15:30:21,691 WARN  org.apache.flink.yarn.YarnClusterClient                       - YARN reported application state FAILED
2018-05-19 15:30:21,692 WARN  org.apache.flink.yarn.YarnClusterClient                       - Diagnostics: Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt.
Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1521277661809_0006_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m  -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner  1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err 
        |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
The YARN cluster has failed
2018-05-19 15:30:21,693 INFO  org.apache.flink.yarn.YarnClusterClient                       - Sending shutdown request to the Application Master
2018-05-19 15:30:21,695 WARN  org.apache.flink.yarn.YarnClusterClient                       - YARN reported application state FAILED
2018-05-19 15:30:21,695 WARN  org.apache.flink.yarn.YarnClusterClient                       - Diagnostics: Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt.
Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1521277661809_0006_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m  -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner  1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err 
        |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
2018-05-19 15:30:21,697 INFO  org.apache.flink.yarn.ApplicationClient                       - Sending StopCluster request to JobManager.
2018-05-19 15:30:21,726 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: Connection refused: hadoop100/192.168.99.100:55053
2018-05-19 15:30:21,733 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://flink@hadoop100:55053]] Caused by: [Connection refused: hadoop100/192.168.99.100:55053]
2018-05-19 15:30:31,707 WARN  org.apache.flink.yarn.YarnClusterClient                       - Error while stopping YARN cluster.
java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:157)
        at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
        at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.ready(package.scala:169)
        at scala.concurrent.Await.ready(package.scala)
        at org.apache.flink.yarn.YarnClusterClient.shutdownCluster(YarnClusterClient.java:377)
        at org.apache.flink.yarn.YarnClusterClient.finalizeCluster(YarnClusterClient.java:347)
        at org.apache.flink.client.program.ClusterClient.shutdown(ClusterClient.java:263)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.runInteractiveCli(FlinkYarnSessionCli.java:466)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:720)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511)
2018-05-19 15:30:31,711 INFO  org.apache.flink.yarn.YarnClusterClient                       - Deleted Yarn properties file at /tmp/.yarn-properties-root
2018-05-19 15:30:31,881 INFO  org.apache.flink.yarn.YarnClusterClient                       - Application application_1521277661809_0006 finished with state FAILED and final state FAILED at 1521294610146
2018-05-19 15:30:31,882 WARN  org.apache.flink.yarn.YarnClusterClient                       - Application failed. Diagnostics Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt.
Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1521277661809_0006_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m  -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner  1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err 
        |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
2018-05-19 15:30:31,884 WARN  org.apache.flink.yarn.YarnClusterClient                       - If log aggregation is activated in the Hadoop cluster, we recommend to retrieve the full application log using this command:
        yarn logs -applicationId application_1521277661809_0006
(It sometimes takes a few seconds until the logs are aggregated)
2018-05-19 15:30:31,885 INFO  org.apache.flink.yarn.YarnClusterClient                       - YARN Client is shutting down
2018-05-19 15:30:31,909 INFO  org.apache.flink.yarn.ApplicationClient                       - Stopped Application client.
2018-05-19 15:30:31,911 INFO  org.apache.flink.yarn.ApplicationClient                       - Disconnect from JobManager Actor[akka.tcp://flink@hadoop100:55053/user/jobmanager#119148826].
2018-05-19 15:30:31,916 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2018-05-19 15:30:31,926 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: Connection refused: hadoop100/192.168.99.100:55053
2018-05-19 15:30:31,935 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://flink@hadoop100:55053]] Caused by: [Connection refused: hadoop100/192.168.99.100:55053]
2018-05-19 15:30:31,935 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2018-05-19 15:30:34,979 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Stopping interactive command line interface, YARN cluster has been stopped.


Such errors are generally caused by insufficient hadoop cluster resources (memory, disk, virtual memory, etc.).  
And most of the cases are due to the allocation of virtual memory beyond the limit, two ways to solve:
(1) Turn off hadoop's checking virtual memory. As follows:

<property>    
    <name>yarn.nodemanager.vmem-check-enabled</name>    
    <value>false</value>    
</property>  

(2) Turn down the allocated memory and try to change it to 800 to start normally. This method is not very long, running for a period of time, container s will eventually be kill ed.

AM Container for appattempt_1526107053244_0016_000001 exited with exitCode: -103
For more detailed output, check application tracking page:http://xxx:8099/cluster/app/application_1526107053244_0016Then, click on links to logs of each attempt.
Diagnostics: Container [pid=28987,containerID=container_1526107053244_0016_01_000001] is running beyond virtual memory limits. Current usage: 366.0 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1526107053244_0016_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 28987 28985 28987 28987 (bash) 0 0 108650496 299 /bin/bash -c /opt/jdk/jdk1.8.0_25/bin/java -Xmx200m -Dlog.file=/opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1> /opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.out 2> /opt/bl07637/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.err
|- 29009 28987 28987 28987 (java) 5094 780 2186571776 93395 /opt/jdk/jdk1.8.0_25/bin/java -Xmx200m -Dlog.file=/opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt



3 Cannot instantiate user function
After submit jar:

org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:235)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:231)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: cannot assign instance of org.apache.commons.collections.map.LinkedMap to field org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.pendingOffsetsToCommit of type org.apache.commons.collections.map.LinkedMap in instance of org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2006)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:437)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:424)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:412)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:373)
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:220)



After checking, most of them think that the commons-collections package conflict is caused. Flnk uses an older version (3.2.2). It is possible to introduce a higher version, but there is no reference in its jar. There has been no clue.  
Later, it was found that Flink always died because of insufficient resources, considering whether it was due to insufficient resources, resulting in Flink did not start and complete. After increasing the start-up resource parameters of flink, the jar is resubmitted to run perfectly.

That's the reason why I made a mistake.

4 Could not resolve substitution to a value: ${akka.stream.materializer}
After submit jar on the interface, report:

Exception in thread "main" com.typesafe.config.ConfigException$UnresolvedSubstitution: reference.conf @ jar:file:/D:/Workspace/Work/middleware/kafka2es/target/kafka2es-0.1.0-SNAPSHOT.jar!/reference.conf: 804: Could not resolve substitution to a value: ${akka.stream.materializer}
        at com.typesafe.config.impl.ConfigReference.resolveSubstitutions(ConfigReference.java:108)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379)
        at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312)
        at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379)
        at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312)
        at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379)
        at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312)
        at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379)
        at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312)
        at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379)
        at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312)
        at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398)
        at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142)
        at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:231)
        at com.typesafe.config.impl.SimpleConfig.resolveWith(SimpleConfig.java:74)
        at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:64)
        at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:59)
        at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:37)
        at com.typesafe.config.impl.ConfigImpl$1.call(ConfigImpl.java:374)
        at com.typesafe.config.impl.ConfigImpl$1.call(ConfigImpl.java:367)
        at com.typesafe.config.impl.ConfigImpl$LoaderCache.getOrElseUpdate(ConfigImpl.java:65)
        at com.typesafe.config.impl.ConfigImpl.computeCachedConfig(ConfigImpl.java:92)
        at com.typesafe.config.impl.ConfigImpl.defaultReference(ConfigImpl.java:367)
        at com.typesafe.config.ConfigFactory.defaultReference(ConfigFactory.java:413)
        at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:307)
        at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:683)
        at akka.actor.ActorSystem$.apply(ActorSystem.scala:245)
        at akka.actor.ActorSystem$.apply(ActorSystem.scala:288)
        at akka.actor.ActorSystem$.apply(ActorSystem.scala:263)
        at akka.actor.ActorSystem$.create(ActorSystem.scala:191)
        at org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:106)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.startJobManagerActorSystem(FlinkMiniCluster.scala:300)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.singleActorSystem$lzycompute$1(FlinkMiniCluster.scala:329)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.org$apache$flink$runtime$minicluster$FlinkMiniCluster$$singleActorSystem$1(FlinkMiniCluster.scala:329)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:343)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:341)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.Range.foreach(Range.scala:160)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:341)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:323)
        at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:107)
        at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1501)


Increase in the POM - > Maven - shaded - plugin - > configuration area of your own jar project:

<transformers>
    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
        <resource>reference.conf</resource>
    </transformer>
</transformers>


5 java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
After submit jar on the interface, report:

java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Could not run the jar.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:90)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.util.FlinkException: Could not run the jar.
    ... 9 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program caused an error: 
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:93)
    at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:334)
    at org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:87)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:69)
    ... 8 more
Caused by: java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.setDeserializer(FlinkKafkaConsumer09.java:286)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.<init>(FlinkKafkaConsumer09.java:213)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.<init>(FlinkKafkaConsumer09.java:152)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010.<init>(FlinkKafkaConsumer010.java:128)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010.<init>(FlinkKafkaConsumer010.java:112)
    at com.best.middleware.search.kafka2es.xngmonitor.flink.Main.main(Main.java:77)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
    ... 11 more


NoClassDefFoundError knows what's going on, compiles pass, but no classes are found at runtime. Roughly speaking, the reasons are very detailed on the internet.  


This time, under yarn's lib, other partners placed inconsistent versions of kafka packages, causing conflicts.


--------------------- 
Original: https://blog.csdn.net/weitry/article/details/80375446

Posted by pandhandp on Tue, 23 Jul 2019 02:12:34 -0700