1. Yarn Common Commands:
[rachel@bigdata-senior01 bin]$ ./yarn
Usage: yarn [--config confdir] COMMAND
where COMMAND is one of:
  resourcemanager      run the ResourceManager
  nodemanager          run a nodemanager on each slave
  timelineserver       run the timeline server
  rmadmin              admin tools
  version              print the version
  jar <jar>            run a jar file
  application          prints application(s) report/kill application
  applicationattempt   prints applicationattempt(s) report
  container            prints container(s) report
  node                 prints node report(s)
  logs                 dump container logs
  classpath            prints the class path needed to get the Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
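A few of these subcommands come up constantly in day-to-day work. The invocations below are a minimal sketch; the application ID is a placeholder, not one from this cluster, and yarn logs only works once log aggregation is enabled.

# list the NodeManagers in the cluster
yarn node -list
# fetch the aggregated logs of a finished application (placeholder ID)
yarn logs -applicationId application_1234567890123_0001
# print the classpath Yarn uses
yarn classpath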
1.1 Yarn application command
usage: application
 -appStates <States>            Works with -list to filter applications based on input
                                comma-separated list of application states. The valid
                                application state can be one of the following:
                                ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING,
                                FINISHED, FAILED, KILLED
 -appTypes <Types>              Works with -list to filter applications based on input
                                comma-separated list of application types.
 -help                          Displays help for all commands.
 -kill <Application ID>         Kills the application.
 -list                          List applications. Supports optional use of -appTypes to
                                filter applications based on application type, and
                                -appStates to filter applications based on application state.
 -movetoqueue <Application ID>  Moves the application to a different queue.
 -queue <Queue Name>            Works with the movetoqueue command to specify which queue
                                to move an application to.
 -status <Application ID>       Prints the status of the application.
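As a quick sketch of how these options combine, listing only the currently running applications and then inspecting and killing one of them looks roughly like this (the application ID is a placeholder):

yarn application -list -appStates RUNNING
yarn application -status application_1234567890123_0002
yarn application -kill application_1234567890123_0002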
1.2 hadoop jar == yarn jar
1.3 mapred vs. yarn
Yarn is a general scheduling platform: it runs not only MapReduce, but also Spark, Hive, and other workloads.
Consequently, yarn application covers a wider range of jobs than mapred job.
mapred job -list
mapred job -kill <job-id>
yarn application -list
yarn application -kill <Application ID>
2. Yarn Tuning
On a 64G production machine, roughly 25% of the memory is left for the Linux system and 75% goes to the big data processes.
2.1 Data locality
The DN (DataNode) and NM (NodeManager) are deployed on the same node to get data locality; most companies co-locate them on the same machine.
When a task runs, it reads its data directly from the local DataNode, which avoids the time spent shipping data over the network (the DataNode is the process that stores the data).
If the DN and NM are deployed on separate machines, the purpose is to separate computation from storage.
2.2 Memory tuning for DN and NM
Bigger DN memory is not automatically better. First check the DN's default heap: as shown below it is 1000m (note: not 1G; 1G = 1024M, while the value here is 1000M).
[rachel@bigdata-senior02 bin]$ ps -ef|grep datanode
rachel 3597 1 0 07:18 ? 00:03:56 /opt/modules/jdk1.7.0_67/bin/java -Dproc_datanode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/modules/hadoop-2.5.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/modules/hadoop-2.5.0 -Dhadoop.id.str=rachel -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/modules/hadoop-2.5.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/modules/hadoop-2.5.0/logs -Dhadoop.log.file=hadoop-rachel-datanode-bigdata-senior02.rachel.com.log -Dhadoop.home.dir=/opt/modules/hadoop-2.5.0 -Dhadoop.id.str=rachel -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/modules/hadoop-2.5.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
rachel 7618 7524 0 16:22 pts/5 00:00:00 grep datanode
To tune the DN heap, set HADOOP_DATANODE_OPTS in hadoop-env.sh; in production this is commonly set to about 4G.
-Xmx is the maximum heap size and -Xms is the initial (minimum) heap size.
export HADOOP_DATANODE_OPTS="-Xmx1024m -Xms1024m -Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
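As a sketch of the 4G production setting mentioned above (the exact value should come from your own capacity planning, not from this example), the hadoop-env.sh line could look like:

# hypothetical 4G DataNode heap; adjust to the node's actual memory budget
export HADOOP_DATANODE_OPTS="-Xms4096m -Xmx4096m -Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"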
Similarly, look at the NM's default heap, which is also 1000M:
[rachel@bigdata-senior02 hadoop]$ ps -ef|grep nodemanager
rachel 4297 1 0 08:04 ? 00:02:58 /opt/modules/jdk1.7.0_67/bin/java -Dproc_nodemanager -Xmx1000m -Dhadoop.log.dir=/opt/modules/hadoop-2.5.0/logs -Dyarn.log.dir=/opt/modules/hadoop-2.5.0/logs -Dhadoop.log.file=yarn-rachel-nodemanager-bigdata-senior02.rachel.com.log -Dyarn.log.file=yarn-rachel-nodemanager-bigdata-senior02.rachel.com.log -Dyarn.home.dir= -Dyarn.id.str=rachel -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/modules/hadoop-2.5.0/lib/native -Dyarn.policy.file=hadoop-policy.xml -server -Dhadoop.log.dir=/opt/modules/hadoop-2.5.0/logs -Dyarn.log.dir=/opt/modules/hadoop-2.5.0/logs -Dhadoop.log.file=yarn-rachel-nodemanager-bigdata-senior02.rachel.com.log -Dyarn.log.file=yarn-rachel-nodemanager-bigdata-senior02.rachel.com.log -Dyarn.home.dir=/opt/modules/hadoop-2.5.0 -Dhadoop.home.dir=/opt/modules/hadoop-2.5.0 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/modules/hadoop-2.5.0/lib/native -classpath /opt/modules/hadoop-2.5.0/etc/hadoop:/opt/modules/hadoop-2.5.0/etc/hadoop:/opt/modules/hadoop-2.5.0/etc/hadoop:/opt/modules/hadoop-2.5.0/share/hadoop/common/lib/*:/opt/modules/hadoop-2.5.0/share/hadoop/common/*:/opt/modules/hadoop-2.5.0/share/hadoop/hdfs:/opt/modules/hadoop-2.5.0/share/hadoop/hdfs/lib/*:/opt/modules/hadoop-2.5.0/share/hadoop/hdfs/*:/opt/modules/hadoop-2.5.0/share/hadoop/yarn/lib/*:/opt/modules/hadoop-2.5.0/share/hadoop/yarn/*:/opt/modules/hadoop-2.5.0/share/hadoop/mapreduce/lib/*:/opt/modules/hadoop-2.5.0/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/contrib/capacity-scheduler/*.jar:/opt/modules/hadoop-2.5.0/share/hadoop/yarn/*:/opt/modules/hadoop-2.5.0/share/hadoop/yarn/lib/*:/opt/modules/hadoop-2.5.0/etc/hadoop/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager
rachel 7669 7524 0 16:26 pts/5 00:00:00 grep nodemanager
The NM heap is set in yarn-env.sh; in production this parameter is commonly about 3G.
export YARN_NODEMANAGER_OPTS="-Xms3072m -Xmx3072m"
2.3 Container memory allocation
Container: a logical concept, Yarn's abstraction of resources. It encapsulates a slice of a node's multi-dimensional resources such as memory, CPU, and disk.
When the AM (ApplicationMaster) requests resources from the RM (ResourceManager), the resources the RM hands back to the AM are represented as containers.
Continuing the calculation above: 0.75 * 64 - 4 (DN) - 3 (NM) = 41G.
In practice only about 40G of that is handed to containers, because the processes sharing the 75% are not just the DN and NM; Kafka, ZooKeeper, and other services may also need part of that memory.
# Total memory a node can allocate to containers
yarn.nodemanager.resource.memory-mb      40G
# Minimum memory allocated per container --> 40/2 = up to 20 containers can run concurrently
yarn.scheduler.minimum-allocation-mb     2G (official default 1G)
# Maximum memory allocated per container --> 40/40 = at least 1 container can run concurrently
yarn.scheduler.maximum-allocation-mb     40G (official default 8G)
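These three properties live in yarn-site.xml. A minimal sketch matching the 40G example above (values are in MB and should be adapted to your own hardware):

<!-- yarn-site.xml: memory this node hands to containers (40G = 40960 MB) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value>
</property>
<!-- smallest container the scheduler will grant -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2048</value>
</property>
<!-- largest single container the scheduler will grant -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>40960</value>
</property>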
2.4 Tuning the container memory check parameters on Yarn
These parameters come up most often with Spark jobs.
# Check for physical memory overruns; when one is found, the container is killed and its process killed with it
yarn.nodemanager.pmem-check-enabled
# Check for virtual memory overruns; same effect when triggered
yarn.nodemanager.vmem-check-enabled
# Ratio of virtual memory to physical memory; in production the ratio is 2.1 : 1
yarn.nodemanager.vmem-pmem-ratio
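A minimal yarn-site.xml sketch with the official default values. Turning off the virtual-memory check is a workaround sometimes applied when Spark containers are killed for vmem overruns, but that is a judgment call, not a rule:

<!-- yarn-site.xml: enforce the physical memory limit (default true) -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>true</value>
</property>
<!-- enforce the virtual memory limit (default true); sometimes disabled for Spark workloads -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>true</value>
</property>
<!-- virtual memory allowed per unit of physical memory (default 2.1) -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>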
3. CPU optimization
Vcore (virtual CPU core) is a concept introduced by Yarn itself: machines differ in performance, so a physical core on one machine does not represent the same amount of compute as a core on another. Yarn introduced the vcore to normalize across these different machines.
By convention, one physical core maps to two vcores.
In production, two physical cores are usually reserved for the system, i.e. four vcores are subtracted; see the sketch after the parameters below.
# Number of vcores a node can allocate to containers (official default 8)
yarn.nodemanager.resource.cpu-vcores     8
# Minimum number of vcores allocated per container
yarn.scheduler.minimum-allocation-vcores
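As a sketch of the reservation rule above, assume a hypothetical machine with 12 physical cores: (12 - 2 reserved) * 2 vcores per core = 20 vcores left for containers, which would go into yarn-site.xml as:

<!-- yarn-site.xml: (12 physical cores - 2 reserved) * 2 vcores per core = 20 -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>20</value>
</property>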
4. Yarn schedulers
FIFO: first in, first out; jobs queued behind a long job wait a long time.
Capacity Scheduler:
Yarn estimates the size of each task and assigns it to a different queue.
Queue B is a small-task queue that pre-reserves part of the cluster's resources,
so that multiple tasks can run in parallel.
The downside is that job1 cannot use all of the cluster's resources, because part of them must stay allocated to the small-task queue.
Fair Scheduler:
Resources are adjusted dynamically between jobs.
While job1 is running, it can be made to give up part of its resources
so that a second job can run.
When job2 finishes, its resources are released
and job1 re-occupies them.
This both keeps resource utilization high and ensures that small tasks finish in time.
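A minimal sketch of switching the ResourceManager to the Fair Scheduler; the queue names and weights in the fair-scheduler.xml fragment are purely illustrative:

<!-- yarn-site.xml: use the Fair Scheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

<!-- fair-scheduler.xml: two illustrative queues, a big-job queue and a small-job queue -->
<allocations>
  <queue name="big">
    <weight>3.0</weight>
  </queue>
  <queue name="small">
    <weight>1.0</weight>
  </queue>
</allocations>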