Cassandra best practice configuration

Keywords: Operation & Maintenance Apache network SSL

This article introduces the main configuration options of Cassandra, grouped into two dimensions: the cluster and the single node. Not every option is listed here; for the rest, refer to the detailed comments in cassandra.yaml. All of these options are set in cassandra.yaml under the conf directory and are read when the node starts. The file must follow YAML syntax rules. The version discussed here is 3.11.4.

Cluster dimension

cluster_name: //The name of the cluster, 'Test Cluster' by default; the value is enclosed in single quotes. Nodes with different cluster names cannot form a cluster.

num_tokens: 256 //The number of tokens assigned to a single node in the cluster, that is, its number of vnodes; with vnodes, each token is generated randomly. If vnodes are not used, each node is instead given one pre-assigned initial token, configured as follows:

initial_token: //If the cluster does not use vnodes, you must calculate and configure the token of each node by hand. When expanding such a cluster, doubling its capacity is recommended; with vnodes this is not necessary.
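
As a rough sketch of the manual-token case, the fragment below shows what one node's cassandra.yaml might contain in a hypothetical 4-node cluster using Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1. The node count and the even spacing are assumptions for illustration; each value is -2^63 + i * (2^64 / 4).

```yaml
# Hypothetical 4-node cluster without vnodes (assumed for illustration).
# Evenly spaced Murmur3 tokens: -2^63 + i * (2^64 / 4), for i = 0..3
#   node 0: -9223372036854775808
#   node 1: -4611686018427387904
#   node 2: 0
#   node 3: 4611686018427387904
num_tokens: 1
initial_token: -4611686018427387904   # this file belongs to node 1
```

Doubling this cluster to 8 nodes would place each new node at the midpoint between two existing tokens, so the ring stays balanced without moving the tokens of the existing nodes.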

partitioner: //The cluster's data distribution algorithm, that is, the component of consistent hashing that computes the hash. The default is org.apache.cassandra.dht.Murmur3Partitioner, the currently recommended Murmur3 hash strategy. The alternatives are RandomPartitioner (MD5), and OrderPreservingPartitioner and ByteOrderedPartitioner (lexical order), which are friendlier to range scans but can cause key skew.

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "127.0.0.1"
//This setting is important: it lists the seed nodes of the cluster. Every node should use the same list, and the number of seeds should stay small relative to the cluster size. For a 10-node cluster, 2 seeds are enough.
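
For instance, a 10-node cluster with two seeds might use the following block on every node. The IP addresses are placeholders, not values from a real deployment.

```yaml
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Two seed nodes, identical on all 10 nodes (addresses are made up).
          - seeds: "10.0.0.1,10.0.0.2"
```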

endpoint_snitch: //The cluster's snitch strategy, which tells Cassandra about the network topology and so affects how replica nodes are chosen and managed. The default is SimpleSnitch.
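
A minimal sketch, assuming a deployment that needs datacenter and rack awareness: GossipingPropertyFileSnitch reads each node's datacenter and rack from cassandra-rackdc.properties. Whether this snitch suits a given cluster is a judgment call, not something this article prescribes.

```yaml
# SimpleSnitch (the default) is only appropriate for single-datacenter deployments.
# Assumed alternative for topology-aware clusters:
endpoint_snitch: GossipingPropertyFileSnitch
```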

Single node dimension

Related configurations involving a single node:

 hints_directory: //The directory where hints are stored, /var/lib/cassandra/hints by default. When a node goes down, the node responsible for recording hints for it writes them under this directory.
 
 authenticator: //The authentication backend. The default is AllowAllAuthenticator, which lets every client through; PasswordAuthenticator enables username/password authentication. All nodes in the cluster must use the same setting.
 
 authorizer: //The authorization backend. By default (AllowAllAuthorizer) every user may perform any operation; CassandraAuthorizer enforces per-user permissions. The setting must be the same on every node.
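
A sketch of a locked-down setup, assuming the built-in PasswordAuthenticator and CassandraAuthorizer classes are the ones wanted:

```yaml
# Require username/password logins and enforce per-user permissions.
# Use the same values on every node in the cluster.
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
```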
 
 data_file_directories: //The directories for data files, /var/lib/cassandra/data by default. Using several SSDs together is recommended.
 
 commitlog_directory: //The commitlog directory, /var/lib/cassandra/commitlog by default. A fast disk is recommended.
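
The two directory settings might be combined as follows. The mount points are hypothetical, and putting the commitlog on its own disk is an assumption about what a "better disk configuration" looks like.

```yaml
# One entry per data disk (mount points are hypothetical).
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
# Assumed layout: the commitlog gets its own fast disk.
commitlog_directory: /mnt/ssd3/cassandra/commitlog
```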
 
 cdc_enabled: //Whether to enable change data capture (CDC); off by default. If it is enabled, CDC must also be turned on for each table that uses it.
 
 disk_failure_policy: //How the node reacts when a data disk fails. The default is stop: the node shuts down gossip and stops serving clients, but it can still be inspected through JMX.
 
 commit_failure_policy: //How the node reacts when the commitlog disk fails. The default is stop.
 
 commitlog_sync: //The commitlog sync policy, which affects write performance. The default is periodic.
 commitlog_sync_period_in_ms: //The sync interval in periodic mode, 10000 ms by default.
 commitlog_sync: //The alternative policy is batch; choose either periodic or batch, not both.
 commitlog_sync_batch_window_in_ms: 2 //The sync window in batch mode, in milliseconds.
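
Since only one sync policy can be active at a time, the two options can be laid out side by side, with the unused one commented out. This is a sketch using the values quoted above:

```yaml
# Option 1 (default): periodic sync. Writes are acknowledged immediately
# and the commitlog is synced every commitlog_sync_period_in_ms.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

# Option 2: batch sync. Writes are not acknowledged until the commitlog
# has been fsynced to disk.
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 2
```

Periodic mode favors write latency at the cost of a small window of possible data loss on a crash; batch mode makes the opposite trade.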
 
 commitlog_segment_size_in_mb: //The size of a single commitlog segment file, 32 MB by default.
 commitlog_compression: //Commitlog compression is supported, for example with LZ4Compressor; if the option is omitted, the commitlog is written uncompressed.
 
 concurrent_reads: //Size of the read thread pool, 32 by default. Reads are I/O bound; 16 * number of disks is recommended.
 concurrent_writes: //Size of the write thread pool, 32 by default. Writes are CPU bound; 8 * number of CPU cores is recommended.
 concurrent_counter_writes: //Size of the counter write thread pool, 32 by default; size it the same way as reads.
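
To make the sizing rules concrete, here is a worked example for a hypothetical node with 4 data disks and 16 CPU cores. The hardware figures are assumptions chosen for the arithmetic.

```yaml
# Hypothetical node: 4 data disks, 16 CPU cores.
concurrent_reads: 64           # 16 * 4 disks
concurrent_writes: 128         # 8 * 16 cores
concurrent_counter_writes: 64  # same rule as reads
```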
 
 memtable_allocation_type: //How memtable memory is managed. The default is heap_buffers, that is, on-heap buffers.
 
 listen_address: //The address the local node binds to and listens on for other nodes. If per-node addresses make the configuration file hard to manage, you can bind by network interface instead: listen_interface: eth0
 rpc_address: //The address for client connections; before 4.0 this also covers Thrift. The default, localhost, can be kept.
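
A sketch of the interface-based variant, assuming every node exposes its cluster network on an interface named eth0:

```yaml
# Bind by interface so the same file can be shipped to every node.
# Set listen_address or listen_interface, not both.
# listen_address: localhost
listen_interface: eth0
```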
 
 

Other

Of course, if the cluster needs other settings, such as security-related ones (client-to-server and server-to-server SSL), you can also configure them in cassandra.yaml.

System level configuration is described in other chapters.

Posted by mysqlnewjack on Mon, 23 Mar 2020 09:13:20 -0700