Deploying YDB in a minimal CentOS environment based on Hadoop, ZooKeeper and Kafka
YDB introduction
YDB, full name Yanyun YDB, is a real-time, multi-dimensional, interactive query, statistics and analysis engine built on the Hadoop distributed architecture. It delivers second-level query performance at the scale of trillions of records, with enterprise-grade stability and reliability.
Objective of this article
Build a virtual-machine version of YDB for functional evaluation only, with no requirements on performance or stability (in short, it just needs to run), so that the virtual machine can run on a mainstream desktop machine.
If you need an efficient and stable stand-alone installation of YDB, use the Yanyun YDB Easy Edition instead.
Hardware and Operating System
- CPU: 1x2 core
- MEM: 4G
- HDD: 64G
- SYS: CentOS 6.6 x64 minimal
System Environment Configuration
Necessary packages
yum install openssh-clients unzip
system configuration
For system configuration, refer to the YDB Programming Guide or the document "Detailed Operating System Environment Dependencies of YDB".
software package
- jdk-8u60-linux-x64.tar.gz
- hadoop-2.7.3.tar.gz
- zookeeper-3.4.6.tar.gz
- kafka_2.11-0.10.0.1.tgz
- spark1.6.3_hadoop2.7.3.tar.gz (www.ycloud.net.cn)
- ya100-1.1.8.11.0710.1988.413.stable.zip (www.ycloud.net.cn)
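Assuming the packages have been downloaded to /opt (any directory will do; /opt is only an example), they can be unpacked with:

cd /opt
tar -zxf jdk-8u60-linux-x64.tar.gz
tar -zxf hadoop-2.7.3.tar.gz
tar -zxf zookeeper-3.4.6.tar.gz
tar -zxf kafka_2.11-0.10.0.1.tgz
tar -zxf spark1.6.3_hadoop2.7.3.tar.gz
unzip ya100-1.1.8.11.0710.1988.413.stable.zip

In the sketches below, $HADOOP_HOME refers to the resulting hadoop-2.7.3 directory.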
hdfs
Create data directories
mkdir -p /data
mkdir -p /data/tmp/hadoop
mkdir -p /data/hadoop/hdfs/nn
mkdir -p /data/hadoop/hdfs/dn
mkdir -p /data/hadoop/hdfs/sn
core-site.xml
fs.defaultFS - hdfs://<hostname>
hadoop.tmp.dir - /data/tmp/hadoop
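The two settings above correspond to a core-site.xml like the following minimal sketch (written as a heredoc; replace <hostname> with the real host name before running):

cat > $HADOOP_HOME/etc/hadoop/core-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://<hostname></value></property>
  <property><name>hadoop.tmp.dir</name><value>/data/tmp/hadoop</value></property>
</configuration>
EOF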
hdfs-site.xml
dfs.replication - 1
dfs.namenode.name.dir - /data/hadoop/hdfs/nn
dfs.datanode.data.dir - /data/hadoop/hdfs/dn
dfs.namenode.checkpoint.dir - /data/hadoop/hdfs/sn
dfs.namenode.secondary.http-address - <hostname>:50090
dfs.permissions.enabled - false
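Likewise, a minimal hdfs-site.xml for this single-node setup might look like the following sketch (same placeholder conventions as above):

cat > $HADOOP_HOME/etc/hadoop/hdfs-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property><name>dfs.replication</name><value>1</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/data/hadoop/hdfs/nn</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/data/hadoop/hdfs/dn</value></property>
  <property><name>dfs.namenode.checkpoint.dir</name><value>/data/hadoop/hdfs/sn</value></property>
  <property><name>dfs.namenode.secondary.http-address</name><value><hostname>:50090</value></property>
  <property><name>dfs.permissions.enabled</name><value>false</value></property>
</configuration>
EOF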
slaves
<hostname>
hadoop-env.sh
export JAVA_HOME=<jdk home>
export HADOOP_HEAPSIZE=64
export HADOOP_NAMENODE_INIT_HEAPSIZE=64
Disable the Secondary NameNode
In sbin/start-dfs.sh and sbin/stop-dfs.sh, clear the SECONDARY_NAMENODES variable (unset it, or set SECONDARY_NAMENODES= to empty) so the secondary namenode is neither started nor stopped.
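One way to do this, as a sketch: Hadoop 2.7.3 assigns SECONDARY_NAMENODES at the start of a line in both scripts, so the assignment can be blanked with sed (verify the line in your copy first; backups are written to *.bak):

sed -i.bak 's/^SECONDARY_NAMENODES=.*/SECONDARY_NAMENODES=/' \
    $HADOOP_HOME/sbin/start-dfs.sh $HADOOP_HOME/sbin/stop-dfs.sh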
Format
hdfs namenode -format
Start and stop
sbin/start-dfs.sh
sbin/start-yarn.sh
yarn
yarn-site.xml
yarn.resourcemanager.hostname - <hostname>
yarn.nodemanager.resource.memory-mb - 2048
yarn.nodemanager.resource.cpu-vcores - 4
yarn.scheduler.minimum-allocation-mb - 8
yarn.scheduler.maximum-allocation-mb - 2048
yarn.scheduler.minimum-allocation-vcores - 1
yarn.scheduler.maximum-allocation-vcores - 4
yarn.nodemanager.vmem-check-enabled - false
yarn.nodemanager.pmem-check-enabled - false
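As with the HDFS files, a heredoc sketch of yarn-site.xml with these values (replace <hostname> first):

cat > $HADOOP_HOME/etc/hadoop/yarn-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property><name>yarn.resourcemanager.hostname</name><value><hostname></value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>2048</value></property>
  <property><name>yarn.nodemanager.resource.cpu-vcores</name><value>4</value></property>
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>8</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>2048</value></property>
  <property><name>yarn.scheduler.minimum-allocation-vcores</name><value>1</value></property>
  <property><name>yarn.scheduler.maximum-allocation-vcores</name><value>4</value></property>
  <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
  <property><name>yarn.nodemanager.pmem-check-enabled</name><value>false</value></property>
</configuration>
EOF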
yarn-env.sh
YARN_RESOURCEMANAGER_HEAPSIZE=64
YARN_NODEMANAGER_HEAPSIZE=64
Start and stop
sbin/stop-yarn.sh
sbin/stop-dfs.sh
zookeeper
zookeeper-env.sh
export JAVA_HOME=<jdk_home>
zoo.cfg
tickTime=2000
dataDir=/data/zookeeper
clientPort=2181
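A quick way to put this in place, as a sketch run from the unpacked zookeeper-3.4.6 directory (the dataDir is created up front):

mkdir -p /data/zookeeper
cat > conf/zoo.cfg <<'EOF'
tickTime=2000
dataDir=/data/zookeeper
clientPort=2181
EOF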
Start and stop
bin/zkServer.sh start
bin/zkServer.sh stop
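To check that the single node came up (the second line assumes the nc utility is installed):

bin/zkServer.sh status            # should report "Mode: standalone"
echo ruok | nc <hostname> 2181    # should answer "imok"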
kafka
server.properties
zookeeper.connect=<hostname>:2181
log.cleaner.dedupe.buffer.size=5242880
kafka-server-start.sh
export KAFKA_HEAP_OPTS="-Xmx64m -Xms64m"
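In Kafka 0.10.0.1, bin/kafka-server-start.sh sets KAFKA_HEAP_OPTS to "-Xmx1G -Xms1G" by default; one way to shrink it to the value above is a direct edit such as the following sketch (check the exact line in your copy first; a backup is written to *.bak):

sed -i.bak 's/-Xmx1G -Xms1G/-Xmx64m -Xms64m/' bin/kafka-server-start.sh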
Start and stop
bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-stop.sh
ydb
ya100_env.sh
export HADOOP_CONF_DIR=<hadoop_conf_dir>
export SPARK_HOME=<spark_home>
export YA100_EXECUTORS=1
export YA100_MEMORY=512m
export YA100_CORES=1
export YA100_DRIVER_MEMORY=256m
export HDFS_USER=root
ydb_site.yaml
storm.zookeeper.servers: "<hostname>"
ydb.ya100.hb.connuser: root
bootstrap.servers.ydb_syslog: "<hostname>"
create topic
bin/kafka-topics.sh --create --zookeeper zvm:2181 --replication-factor 1 --partitions 1 --topic bcp003
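To confirm the topic was created (zvm is the example hostname used here):

bin/kafka-topics.sh --list --zookeeper zvm:2181
bin/kafka-topics.sh --describe --zookeeper zvm:2181 --topic bcp003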
Start and stop
bin/start-all.sh
bin/stop-all.sh
Summary
netstat
netstat -anp | grep LISTEN | grep 50070   # NN WEB
netstat -anp | grep LISTEN | grep 8020    # NN
netstat -anp | grep LISTEN | grep 50010   # DN
netstat -anp | grep LISTEN | grep 8088    # RM WEB
netstat -anp | grep LISTEN | grep 8042    # NM
netstat -anp | grep LISTEN | grep 2181    # ZK
netstat -anp | grep LISTEN | grep 9092    # KF
netstat -anp | grep LISTEN | grep 1210    # YDB WEB
netstat -anp | grep LISTEN | grep 10009   # YDB JDBC
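jps gives a quicker overview of the running JVMs; with everything up you should at least see the following (the YDB/ya100 side shows up as Spark driver and executor JVMs, whose exact names depend on the release):

jps
# Expected, roughly:
#   NameNode, DataNode            (HDFS)
#   ResourceManager, NodeManager  (YARN)
#   QuorumPeerMain                (ZooKeeper)
#   Kafka
#   plus the Spark JVMs launched by ya100/YDB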
http
http://<hostname>:50070  NN
http://<hostname>:8088   RM
http://<hostname>:8042   NM
http://<hostname>:1210   YDB