Apache ZooKeeper Cluster Installation Guide

Introduction

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable.

Because these services are difficult to implement, applications usually skimp on them at first, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

ZooKeeper’s architecture supports high availability through redundant services. If one ZooKeeper server fails, clients can connect to another server in the cluster. ZooKeeper stores its data in a hierarchical namespace, much like a file system or a tree data structure. ZooKeeper is used by many well-known companies, including Yahoo, Odnoklassniki, Reddit, NetApp, eBay and many others.

This article focuses on how to install and configure an Apache ZooKeeper cluster on Linux.

Hardware requirements

  • For a reliable ZooKeeper service, you should deploy ZooKeeper in a cluster known as an ensemble. As long as a majority of the ensemble is up, the service will be available. Because ZooKeeper requires a majority, it is best to use an odd number of machines. For example, with four machines ZooKeeper can only handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority. However, with five machines ZooKeeper can handle the failure of two machines.
  • The ZooKeeper documentation describes a typical deployment on dedicated RHEL servers with dual-core processors, 2GB of RAM, and 80GB IDE hard drives.
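
The majority rule above can be checked with a little arithmetic. The following is a minimal sketch, not part of ZooKeeper itself; the function names majority and tolerated_failures are made up for this illustration. It prints how many servers form a majority and how many failures an ensemble of a given size can survive.

```shell
#!/bin/sh
# Quorum arithmetic for an ensemble of n servers (illustrative helper names).
#   majority(n)           = floor(n / 2) + 1
#   tolerated_failures(n) = n - majority(n)

majority() {
    echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes, including the four- and five-machine cases.
for n in 3 4 5 6; do
    echo "servers=$n majority=$(majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Four servers tolerate the same single failure as three, which is why an even-sized ensemble adds hardware without adding fault tolerance.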

Software requirements

  • A 64-bit CentOS 7 or RHEL operating system.
  • Java SE Development Kit 6 or greater.

Installation

Create a User for ZooKeeper

  • As root, create a user called zookeeper

$ useradd zookeeper

  • Set password

$ passwd zookeeper

  • The zookeeper user is now ready. Switch to it.

$ su - zookeeper

Download latest stable ZooKeeper

  • Download the 3.4.8 release into the zookeeper user's home directory.

$ wget http://apache.spd.co.il/zookeeper/stable/zookeeper-3.4.8.tar.gz

  • Unpack the downloaded archive.

$ tar -xzf zookeeper-3.4.8.tar.gz

Configure the ZooKeeper Server

  • Apply the configuration below on every node in the cluster.
  • In the data directory /var/zookeeper/, create a file named myid that contains the server's unique identifier. For example, the myid file of server 1 would contain the text "1" and nothing more.
  • Create the file conf/zoo.cfg with the following configuration:

tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.<id1>=<address1>:2888:3888
server.<id2>=<address2>:2888:3888
. . .

Note: ZooKeeper uses the zoo.cfg file to discover the other nodes in the cluster. In most cases the zoo.cfg file is the same on all nodes; only the myid file differs.
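
To make the server lines concrete, here is a small sketch that generates zoo.cfg and myid for one node of a three-node ensemble. The host names zk1.example.com through zk3.example.com, the HOSTS, MY_ID and OUT_DIR variables, and the write_zoo_cfg helper are assumptions made for this example. The files are written to a scratch directory so that nothing on the system is overwritten; copy them into conf/ and /var/zookeeper/ once they look right.

```shell
#!/bin/sh
# Sketch: generate zoo.cfg and myid for one node of a three-node ensemble.
# The host names below are hypothetical placeholders; replace them with your own.
HOSTS="zk1.example.com zk2.example.com zk3.example.com"
MY_ID="${MY_ID:-1}"                  # this node's id; must match its server.N line
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"   # scratch directory for the generated files

write_zoo_cfg() {
    # $1 = output file. The first five settings are copied from this guide.
    {
        echo "tickTime=2000"
        echo "dataDir=/var/zookeeper"
        echo "clientPort=2181"
        echo "initLimit=5"
        echo "syncLimit=2"
        id=1
        for host in $HOSTS; do
            # One line per ensemble member, numbered from 1.
            echo "server.$id=$host:2888:3888"
            id=$((id + 1))
        done
    } > "$1"
}

write_zoo_cfg "$OUT_DIR/zoo.cfg"
echo "$MY_ID" > "$OUT_DIR/myid"
echo "Wrote $OUT_DIR/zoo.cfg and $OUT_DIR/myid"
```

Running it with MY_ID=2 on the second node produces the same zoo.cfg but a myid file containing 2.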

  • As root, grant write permissions on the /var/zookeeper directory.

$ chmod -R 777 /var/zookeeper
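
Mode 777 lets every account on the machine write to the data directory. A tighter alternative, sketched here as an assumption and not as the method this guide uses, is to hand the directory to the zookeeper user and restrict access to that owner. The helper name prepare_data_dir is made up for this example.

```shell
#!/bin/sh
# Sketch: create a data directory that only its owner can write to.
# prepare_data_dir is an illustrative helper, not a ZooKeeper tool.
#   $1 = directory to create, $2 = account that should own it
prepare_data_dir() {
    mkdir -p "$1" || return 1
    chmod 750 "$1" || return 1
    # Changing the owner to another account requires root.
    chown -R "$2" "$1"
}
```

On a cluster node it would be called as root, for example prepare_data_dir /var/zookeeper zookeeper.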

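Starting the server

  • As the zookeeper user, run the following command from the unpacked zookeeper-3.4.8 directory on each node.

$ bin/zkServer.sh start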

ZooKeeper cluster status check

  • To check the status of the ZooKeeper cluster, run the following command on each node.

$ bin/zkServer.sh status
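
In a healthy ensemble, the status output of each node includes a line beginning with "Mode:", reporting leader on one node and follower on the others. The sketch below pulls out that word so the check is easy to script; the function name zk_mode is made up for this example, and it assumes the "Mode:" line format printed by the 3.4.x zkServer.sh.

```shell
#!/bin/sh
# Sketch: extract the role from zkServer.sh status output read on stdin.
# Assumes a line of the form "Mode: <role>"; prints nothing if no such line exists.
zk_mode() {
    awk '/^Mode:/ { print $2 }'
}

# Usage on a cluster node, from the unpacked ZooKeeper directory:
#   bin/zkServer.sh status 2>&1 | zk_mode
```

An empty result usually means the server is not running or has not yet joined a quorum.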
