ClickHouse High Availability Cluster

Keywords: ZooKeeper, CentOS, XML

After chatting with a few people who use ClickHouse, I was surprised to find that many small and medium-sized companies still run the stand-alone version. Perhaps it is because a single ClickHouse node is already fast enough, returning results in seconds, and with a small amount of data the cost of data recovery is not high.

Topic: today we build a 2*2 ClickHouse cluster with two shards, each with two replicas, four nodes in total. The logical topology is as follows.

host     shard  replica
hadoop2  01     01
hadoop5  01     02
hadoop6  02     01
hadoop7  02     02

1. Install clickhouse-server

Install clickhouse-server on every node; for the installation steps, refer to "clickhouse install on centos".

2. Modify configuration config.xml

Three sections are involved: remote_servers, zookeeper, and macros. The remote_servers and zookeeper sections are identical on every node; only macros differs. Each node sets the shard and replica values according to its role in the topology above (for example, hadoop7 uses shard 02 and replica 02). The configuration for the hadoop2 node is given below.

 <remote_servers incl="clickhouse_remote_servers">
        <perftest_2shards_2replicas>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>hadoop2</host>
                    <port>9666</port>
                </replica>
                <replica>
                    <host>hadoop5</host>
                    <port>9666</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>hadoop6</host>
                    <port>9666</port>
                </replica>
                <replica>
                    <host>hadoop7</host>
                    <port>9666</port>
                </replica>
            </shard>
        </perftest_2shards_2replicas>
    </remote_servers>


    <zookeeper>
        <node index="1">
            <host>hadoop1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>hadoop2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>hadoop3</host>
            <port>2181</port>
        </node>
        <node index="4">
            <host>hadoop4</host>
            <port>2181</port>
        </node>
        <node index="5">
            <host>hadoop5</host>
            <port>2181</port>
        </node>

    </zookeeper>

    <macros>
        <shard>01</shard>
        <replica>01</replica>
    </macros>

Start clickhouse-server on all nodes:

service clickhouse-server start
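
Once the servers are up, a quick sanity check is to confirm on any node that the cluster definition and the macros were picked up; system.clusters and system.macros are built-in system tables, so no extra setup should be needed:

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'perftest_2shards_2replicas'

SELECT * FROM system.macros

The first query should list all four replicas of perftest_2shards_2replicas; the second shows the shard and replica values of the local node.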

3. Create tables

Create the local table t_s2_r2 on each node using the template below, replacing {shard} and {replica} with that node's values (ClickHouse can also substitute them automatically from the macros defined above).

CREATE TABLE t_s2_r2
(
    dt Date,
    path String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/t_s2_r2', '{replica}', dt, dt, 8192)

For example, on the hadoop2 node (shard 01, replica 01):

CREATE TABLE t_s2_r2
(
    dt Date,
    path String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/t_s2_r2', '01', dt, dt, 8192)
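
Likewise, on hadoop6 (shard 02, replica 01) the template expands to the statement below; hadoop5 and hadoop7 follow the same pattern with their own shard and replica values from the topology table.

CREATE TABLE t_s2_r2
(
    dt Date,
    path String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/02/t_s2_r2', '01', dt, dt, 8192)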

Create a distributed table t_s2_r2_all; it can be created on any node. t_s2_r2_all acts like a view pointing at all shards; the data itself is stored in the local t_s2_r2 table on each node.

CREATE TABLE t_s2_r2_all AS t_s2_r2 ENGINE = Distributed(perftest_2shards_2replicas, default, t_s2_r2, rand())
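
If you would rather have t_s2_r2_all exist on every node, so that it can be queried from any of them, distributed DDL can create it everywhere in one statement; a minimal sketch, assuming the default distributed_ddl section of config.xml is left enabled:

CREATE TABLE t_s2_r2_all ON CLUSTER perftest_2shards_2replicas AS t_s2_r2
ENGINE = Distributed(perftest_2shards_2replicas, default, t_s2_r2, rand())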

Insert some data (the rand() sharding key decides which shard each row lands on):

insert into t_s2_r2_all values('2019-07-21','path1')
insert into t_s2_r2_all values('2019-07-22','path1')
insert into t_s2_r2_all values('2019-07-23','path1')
insert into t_s2_r2_all values('2019-07-24','path1')

View data

hadoop7 :) select * from t_s2_r2_all

SELECT *
FROM t_s2_r2_all

┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
│ 2019-07-22 │ path1 │
│ 2019-07-24 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

4 rows in set. Elapsed: 0.009 sec.
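
As an aside, adding the built-in hostName() function shows which node served each block of rows, which makes the shard layout visible at a glance (the replica chosen for each shard may vary between runs):

SELECT hostName(), * FROM t_s2_r2_all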

View the local t_s2_r2 data on each node:

hadoop2 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

hadoop5 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

1 rows in set. Elapsed: 0.007 sec.

hadoop6 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
│ 2019-07-22 │ path1 │
│ 2019-07-24 │ path1 │
└────────────┴───────┘

3 rows in set. Elapsed: 0.002 sec.

hadoop7 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
│ 2019-07-22 │ path1 │
│ 2019-07-24 │ path1 │
└────────────┴───────┘

3 rows in set. Elapsed: 0.002 sec.

You can see that hadoop2 and hadoop5 hold identical data, and hadoop6 and hadoop7 hold identical data.

4. Test high availability

Now let's verify that the data on hadoop2 is highly available by taking that node out of the cluster. Stop clickhouse-server on hadoop2:

service clickhouse-server stop

The data in t_s2_r2_all is still fully available, because shard 01 still has a surviving replica on hadoop5.

hadoop7 :) select * from t_s2_r2_all

SELECT *
FROM t_s2_r2_all

┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
│ 2019-07-22 │ path1 │
│ 2019-07-24 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

4 rows in set. Elapsed: 0.008 sec.

Data can still be inserted

insert into t_s2_r2_all values('2019-07-29','path2')

This row lands on shard 01. Because internal_replication is set to true, the Distributed table writes it only to the surviving replica (hadoop5), and ReplicatedMergeTree will replicate it to hadoop2 through ZooKeeper once that node comes back.

hadoop5 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-29 │ path2 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

2 rows in set. Elapsed: 0.002 sec.

Now start clickhouse-server on the hadoop2 node again. The row inserted while it was down is automatically synchronized to t_s2_r2 on hadoop2.

hadoop2 :) select * from t_s2_r2

SELECT *
FROM t_s2_r2

┌─────────dt─┬─path──┐
│ 2019-07-29 │ path2 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

2 rows in set. Elapsed: 0.003 sec.
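
Rather than eyeballing the data, you can also check replication state through the system.replicas table; a minimal sketch using a few of its standard columns:

SELECT database, table, is_readonly, queue_size, total_replicas, active_replicas
FROM system.replicas
WHERE table = 't_s2_r2'

A queue_size of 0 generally means the replica has nothing left to fetch; to block until it is fully in sync, SYSTEM SYNC REPLICA t_s2_r2 can be run on the recovered node.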

If clickhouse-server on both hadoop2 and hadoop5 goes down at the same time, shard 01 has no surviving replica and t_s2_r2_all becomes unavailable. That should be easy to understand, so it is not tested here.

End

Posted by djtozz on Mon, 29 Jul 2019 02:39:08 -0700