Git Product home page Git Product logo

clickhouse-setup's Introduction

Tutorial for set up clickhouse server

Single server with docker

  • Run server
docker run -d --name clickhouse-server -p 9000:9000 --ulimit nofile=262144:262144 yandex/clickhouse-server

  • Run client
docker run -it --rm --link clickhouse-server:clickhouse-server yandex/clickhouse-client  --host clickhouse-server

Now you can see if it success setup or not.

Setup Cluster

This part we will setup

  • 1 cluster, with 3 shards
  • Each shard has 2 replica server
  • Use ReplicatedMergeTree & Distributed table to setup our table.

Cluster

Let's see our docker-compose.yml first.

version: '3'

services:
    clickhouse-zookeeper:
        image: zookeeper
        ports:
            - "2181:2181"
            - "2182:2182"
        container_name: clickhouse-zookeeper
        hostname: clickhouse-zookeeper

    clickhouse-01:
        image: yandex/clickhouse-server
        hostname: clickhouse-01
        container_name: clickhouse-01
        ports:
            - 9001:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-01.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-01:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"

    clickhouse-02:
        image: yandex/clickhouse-server
        hostname: clickhouse-02
        container_name: clickhouse-02
        ports:
            - 9002:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-02.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-02:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"

    clickhouse-03:
        image: yandex/clickhouse-server
        hostname: clickhouse-03
        container_name: clickhouse-03
        ports:
            - 9003:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-03.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-03:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"

    clickhouse-04:
        image: yandex/clickhouse-server
        hostname: clickhouse-04
        container_name: clickhouse-04
        ports:
            - 9004:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-04.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-04:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"

    clickhouse-05:
        image: yandex/clickhouse-server
        hostname: clickhouse-05
        container_name: clickhouse-05
        ports:
            - 9005:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-05.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-05:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"

    clickhouse-06:
        image: yandex/clickhouse-server
        hostname: clickhouse-06
        container_name: clickhouse-06
        ports:
            - 9006:9000
        volumes:
                - ./config/clickhouse_config.xml:/etc/clickhouse-server/config.xml
                - ./config/clickhouse_metrika.xml:/etc/clickhouse-server/metrika.xml
                - ./config/macros/macros-06.xml:/etc/clickhouse-server/config.d/macros.xml
                # - ./data/server-06:/var/lib/clickhouse
        ulimits:
            nofile:
                soft: 262144
                hard: 262144
        depends_on:
            - "clickhouse-zookeeper"
networks:
    default:
        external:
            name: clickhouse-net

We have 6 clickhouse server container and one zookeeper container.

To enable replication ZooKeeper is required. ClickHouse will take care of data consistency on all replicas and run restore procedure after failure automatically. It's recommended to deploy ZooKeeper cluster to separate servers.

ZooKeeper is not a requirement — in some simple cases you can duplicate the data by writing it into all the replicas from your application code. This approach is not recommended — in this case ClickHouse is not able to guarantee data consistency on all replicas. This remains the responsibility of your application.

Let's see config file.

./config/clickhouse_config.xml is the default config file in docker, we copy it out and add this line

    <!-- If element has 'incl' attribute, then for it's value will be used corresponding substitution from another file.
         By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.
         Values for substitutions are specified in /yandex/name_of_substitution elements in that file.
      -->
    <include_from>/etc/clickhouse-server/metrika.xml</include_from>

So lets see metrika.xml

<yandex>
	<clickhouse_remote_servers>
		<cluster_1>
			<shard>
                                <weight>1</weight>
                                <internal_replication>true</internal_replication>
				<replica>
					<host>clickhouse-01</host>
					<port>9000</port>
				</replica>
				<replica>
					<host>clickhouse-06</host>
					<port>9000</port>
				</replica>
			</shard>
			<shard>
                                <weight>1</weight>
                                <internal_replication>true</internal_replication>
				<replica>
					<host>clickhouse-02</host>
					<port>9000</port>
				</replica>
				<replica>
					<host>clickhouse-03</host>
					<port>9000</port>
				</replica>
			</shard>
			<shard>
                                <weight>1</weight>
                                <internal_replication>true</internal_replication>

				<replica>
					<host>clickhouse-04</host>
					<port>9000</port>
				</replica>
				<replica>
					<host>clickhouse-05</host>
					<port>9000</port>
				</replica>
			</shard>
		</cluster_1>
	</clickhouse_remote_servers>
        <zookeeper-servers>
            <node index="1">
                <host>clickhouse-zookeeper</host>
                <port>2181</port>
            </node>
        </zookeeper-servers>
        <networks>
            <ip>::/0</ip>
        </networks>
        <clickhouse_compression>
            <case>
                <min_part_size>10000000000</min_part_size>
                <min_part_size_ratio>0.01</min_part_size_ratio>
                <method>lz4</method>
            </case>
        </clickhouse_compression>
</yandex>

and macros.xml, each instances has there own macros settings, like server 1:

<yandex>
    <macros>
        <replica>clickhouse-01</replica>
        <shard>01</shard>
        <layer>01</layer>
    </macros>
</yandex>

Make sure your macros settings is equal to remote server settings in metrika.xml

So now you can start the server.

docker network create clickhouse-net
docker-compose up -d

Conn to server and see if the cluster settings fine;

docker run -it --rm --network="clickhouse-net" --link clickhouse-01:clickhouse-server yandex/clickhouse-client --host clickhouse-server
clickhouse-01 :) select * from system.clusters;

SELECT *
FROM system.clusters 

┌─cluster─────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┐
│ cluster_1                   │         111 │ clickhouse-01172.21.0.490001 │ default │                  │
│ cluster_1                   │         112 │ clickhouse-06172.21.0.590001 │ default │                  │
│ cluster_1                   │         211 │ clickhouse-02172.21.0.890000 │ default │                  │
│ cluster_1                   │         212 │ clickhouse-03172.21.0.690000 │ default │                  │
│ cluster_1                   │         311 │ clickhouse-04172.21.0.790000 │ default │                  │
│ cluster_1                   │         312 │ clickhouse-05172.21.0.390000 │ default │                  │
│ test_shard_localhost        │         111 │ localhost     │ 127.0.0.190001 │ default │                  │
│ test_shard_localhost_secure │         111 │ localhost     │ 127.0.0.194400 │ default │                  │
└─────────────────────────────┴───────────┴──────────────┴─────────────┴───────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┘

If you see this, it means cluster's settings work well(but not conn fine).

Replica Table

So now we have a cluster and replica settings. For clickhouse, we need to create ReplicatedMergeTree Table as a local table in every server.

CREATE TABLE ttt (id Int32) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/ttt', '{replica}') PARTITION BY id ORDER BY id

and Create Distributed Table conn to local table

CREATE TABLE ttt_all as ttt ENGINE = Distributed(cluster_1, default, ttt, rand());

Insert and test

gen some data and test.

# docker exec into client server 1 and
for ((idx=1;idx<=100;++idx)); do clickhouse-client --host clickhouse-server --query "Insert into default.ttt_all values ($idx)"; done;

For Distributed table.

select count(*) from ttt_all;

For loacl table.

select count(*) from ttt;

Authentication

Please see config/users.xml

  • Conn
docker run -it --rm --network="clickhouse-net" --link clickhouse-01:clickhouse-server yandex/clickhouse-client --host clickhouse-server -u user1 --password 123456

Source

clickhouse-setup's People

Contributors

jneo8 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

clickhouse-setup's Issues

Support for on cluster '{cluster}'

Thanks for creating this repo !

While creating replicated tables / DBs with something like,

CREATE DATABASE trails on cluster '{cluster}';

I get a missing macro warning.

Code: 62. DB::Exception: Received from clickhouse-server:9000. 
DB::Exception: No macro 'cluster' in config while processing substitutions in '{cluster}' at '1' 
or macro is not supported here.

This goes away if I add <cluster>cluster_1</cluster> to all the macros

<yandex>
    <macros>
        <cluster>cluster_1</cluster>       -- << specify cluster macro
        <replica>clickhouse-06</replica>
        <shard>01</shard>
        <layer>01</layer>
    </macros>
</yandex>

and the command now works,

clickhouse-01 :) CREATE DATABASE trails on cluster '{cluster}';

CREATE DATABASE trails ON CLUSTER `{cluster}`

Query id: 97728a04-2f0e-49c3-8a66-49b25e891c1f

┌─host──────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse-0690000 │       │                   50 │
│ clickhouse-0290000 │       │                   40 │
│ clickhouse-0390000 │       │                   30 │
│ clickhouse-0590000 │       │                   20 │
│ clickhouse-0490000 │       │                   10 │
│ clickhouse-0190000 │       │                   00 │
└───────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

6 rows in set. Elapsed: 0.177 sec.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.