Kafka cluster installation

Sun, Jan 9, 2022

We use Kafka at work, and I wanted a home cluster where I could try things out and learn more about it.

To make it easy to rebuild the cluster after I mess everything up, I wrote a script that automates the installation for my needs.

Prerequisites: create 4 VMs, each with:

4 CPU
8GB RAM
16GB Disk
100GB Disk

Install Debian on all of them, and make sure the 16GB disk shows up as /dev/sda and the 100GB disk as /dev/sdb: the script partitions /dev/sdb and mounts it at /kafka as the root of the Kafka installation.
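Because the script hard-codes /dev/sdb, it may be worth confirming which device is the large disk before running it. The following is a minimal sketch, assuming lsblk is installed (it is part of util-linux on Debian); the helper name largest_disk is hypothetical and not part of the install script.

```shell
#!/bin/sh
# largest_disk: read "NAME SIZE_IN_BYTES" lines on stdin and print the
# name of the device with the largest size.
largest_disk() {
    awk '$2 + 0 > max { max = $2 + 0; name = $1 }
         END { if (name != "") print name }'
}

# Usage on a VM: whole disks only (-d), sizes in bytes (-b), no header (-n).
# lsblk -b -d -n -o NAME,SIZE | largest_disk
```

If this prints anything other than sdb, the device names in the script would need adjusting before it is run.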

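Name them kafka-zoo1, kafka-broker1, kafka-broker2 and kafka-broker3, and update DNS so the names resolve. The script points the brokers at kafka-zoo1.root.dom:2181, so adjust that host name in the script if your domain differs.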
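A quick way to check name resolution from any of the VMs is sketched below. It assumes getent is available (it ships with glibc on Debian); the function name unresolved_hosts and the RESOLVER variable are hypothetical helpers for illustration.

```shell
#!/bin/sh
# RESOLVER is a hypothetical knob so the lookup command can be swapped out.
# By default it uses getent, which consults /etc/hosts as well as DNS.
RESOLVER=${RESOLVER:-getent hosts}

# unresolved_hosts: print every host name argument that does not resolve.
unresolved_hosts() {
    for host in "$@"; do
        $RESOLVER "$host" >/dev/null 2>&1 || echo "$host"
    done
}

# Usage:
# unresolved_hosts kafka-zoo1 kafka-broker1 kafka-broker2 kafka-broker3
```

No output means every name resolved; any name that is printed still needs a DNS record or an /etc/hosts entry.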

SSH to kafka-zoo1, create kafka-install.sh in the current directory, paste in the script from the bottom of this post, and make it executable.
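A freshly pasted file is not executable, and a paste can silently drop characters. The sketch below marks the file as executable and runs a parse-only syntax check; prepare_script is a hypothetical helper name, not something the install script provides.

```shell
#!/bin/sh
# prepare_script: make a pasted script executable and syntax-check it
# without running it (sh -n only parses the file).
prepare_script() {
    script=$1
    chmod +x "$script" && sh -n "$script"
}

# Usage:
# prepare_script ./kafka-install.sh
```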

Then run the script:

sudo ./kafka-install.sh zookeeper

Wait for the installation to complete, then verify that ZooKeeper is running:

sudo systemctl status kafka-zookeeper

If ZooKeeper is running, systemctl shows a status similar to this:

● kafka-zookeeper.service - Apache Zookeeper server (Kafka)
     Loaded: loaded (/kafka/kafka-zookeeper.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-01-09 13:26:59 CET; 2h 13min ago
       Docs: http://zookeeper.apache.org
   Main PID: 2437 (java)
      Tasks: 38 (limit: 9470)
     Memory: 99.3M
        CPU: 16.634s
     CGroup: /system.slice/kafka-zookeeper.service
             └─2437 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLeve>

You are good to continue.
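The service can take a moment to come up, so a single status check right after the install may be too early. A small retry helper is sketched below; wait_until is a hypothetical name, and the try count and delay in the usage line are arbitrary.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: run CMD up to TRIES times, sleeping DELAY
# seconds between attempts, and succeed as soon as CMD succeeds.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Usage: poll the unit for up to about a minute.
# wait_until 30 2 systemctl is-active --quiet kafka-zookeeper
```

The same helper works for the broker unit by swapping in kafka-broker as the unit name.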

SSH to each of the broker servers, create kafka-install.sh in the current directory, and paste in the script from the bottom of this post.

Then run the script with the broker's number as the last argument: 1 on kafka-broker1, 2 on kafka-broker2, and so forth:

sudo ./kafka-install.sh broker 1
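Since the broker number matches the digit at the end of the host name, it could also be derived instead of typed. This is a sketch under that naming assumption; broker_id_from_hostname is a hypothetical helper.

```shell
#!/bin/sh
# broker_id_from_hostname: print the trailing digits of a host name,
# e.g. kafka-broker2 -> 2. Returns non-zero if the name has no trailing digits.
broker_id_from_hostname() {
    name=${1%%.*}    # drop any domain suffix such as .root.dom
    num=$(printf '%s\n' "$name" | sed -n 's/^.*[^0-9]\([0-9][0-9]*\)$/\1/p')
    [ -n "$num" ] && printf '%s\n' "$num"
}

# Usage:
# sudo ./kafka-install.sh broker "$(broker_id_from_hostname "$(hostname)")"
```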

Wait for the installation to complete, then verify that the broker service is running:

sudo systemctl status kafka-broker

The broker is running if systemctl shows a status similar to this (the description and docs lines are left over from the ZooKeeper unit file):

● kafka-broker.service - Apache Zookeeper server (Kafka)
     Loaded: loaded (/lib/systemd/system/kafka-broker.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-01-09 14:45:24 CET; 21s ago
       Docs: http://zookeeper.apache.org
   Main PID: 440 (java)
      Tasks: 72 (limit: 9470)
     Memory: 404.5M
        CPU: 6.484s
     CGroup: /system.slice/kafka-broker.service
             └─440 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 >
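Once all three brokers are up, it can be useful to confirm that each one registered itself in ZooKeeper. The sketch below counts the ids returned by the zookeeper-shell.sh tool that ships with Kafka. It assumes the tool prints the children of /brokers/ids as a bracketed list such as [1, 2, 3] on its own line; count_broker_ids is a hypothetical helper.

```shell
#!/bin/sh
# count_broker_ids: read zookeeper-shell output on stdin, take the last
# line that looks like a bracketed list, and print how many ids it holds.
count_broker_ids() {
    grep '^\[.*]$' | tail -n 1 | sed 's/[][ ]//g' | tr ',' '\n' | grep -c '[0-9]'
}

# Usage on kafka-zoo1 (expected to print 3 once every broker has joined):
# /kafka/kafka/bin/zookeeper-shell.sh kafka-zoo1:2181 ls /brokers/ids | count_broker_ids
```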

This is a guide for myself.

Installation script

This needs to be created on all servers as an executable file called kafka-install.sh.

#!/bin/sh
type=$1
id=$2

base_install()
{
    echo 'Creating user kafka'
    # No -p option: useradd -p expects an already-hashed password, not plain text.
    sudo useradd -d /kafka -U kafka
    echo 'Creating kafka partition'
    sudo fdisk /dev/sdb <<EOT
n
p



w
EOT

    echo 'Creating filesystem'
    sudo mkfs -t ext4 /dev/sdb1

    echo 'creating mount point & homedir /kafka'
    sudo mkdir -p /kafka

    echo 'mounting /dev/sdb1 to /kafka'
    sudo mount -t auto /dev/sdb1 /kafka

    echo 'Making mount point persistent'
    uuid=$(sudo blkid -s UUID -o export /dev/sdb1 | grep '^UUID=')
    # fsck pass 2: non-root filesystems are checked after the root filesystem
    sudo sh -c "echo '$uuid /kafka               ext4    errors=remount-ro 0       2' >> /etc/fstab"

    echo 'Changing ownership of /kafka to kafka user'
    sudo chown -R kafka:kafka /kafka

    cd /kafka
    sudo chmod a+rwx /kafka

    echo 'installing java'
    sudo apt-get update
    sudo apt-get install -y openjdk-11-jre-headless
}


install_broker()
{
	broker_id=$1
	echo 'Downloading kafka package'
	# Older releases are moved from dlcdn.apache.org to https://archive.apache.org/dist/kafka/
	sudo -u kafka wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
	echo 'Uncompressing kafka package'
	sudo -u kafka tar -xvpf kafka_2.13-3.0.0.tgz
	echo 'creating symbolic link to /kafka/kafka'
	sudo -u kafka ln -s ./kafka_2.13-3.0.0 ./kafka
	cd /kafka/kafka
	echo 'Updating configuration to zookeeper connection details'
	sudo -u kafka sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=kafka-zoo1.root.dom:2181/g' config/server.properties


	echo "Changing broker id to $broker_id"
	reg="s/^broker\.id=0/broker.id=$broker_id/"
	sudo -u kafka sed -i "$reg" config/server.properties
	echo 'Changing log directory to /kafka/kafka-logs'
	sudo -u kafka sed -i 's*log.dirs=/tmp/kafka-logs*log.dirs=/kafka/kafka-logs*g' config/server.properties

	echo 'Creating systemd service file'
	# Overwrite rather than append, so re-running the script does not duplicate the unit
	cat <<EOT > /kafka/kafka-broker.service
[Unit]
Description=Apache Kafka broker
Documentation=https://kafka.apache.org/documentation/
Requires=network.target remote-fs.target kafka.mount
After=network.target remote-fs.target kafka.mount

[Service]
Type=simple
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/kafka/kafka/bin/kafka-server-start.sh /kafka/kafka/config/server.properties
ExecStop=/kafka/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target
EOT

	sudo cp /kafka/kafka-broker.service /usr/lib/systemd/system/kafka-broker.service
	
	echo 'enabling and starting broker service'
	sudo systemctl daemon-reload
	sudo systemctl enable kafka-broker.service
	sudo systemctl start kafka-broker
}

install_zookeeper()
{
	echo 'Downloading kafka package'
	# Older releases are moved from dlcdn.apache.org to https://archive.apache.org/dist/kafka/
	sudo -u kafka wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz

	echo 'Uncompressing kafka package'
	sudo -u kafka tar -xvpf kafka_2.13-3.0.0.tgz

	echo 'creating symbolic link to /kafka/kafka'
	sudo -u kafka  ln -s ./kafka_2.13-3.0.0 ./kafka
	cd /kafka/kafka

	echo 'Changing datadir for zookeeper'
	sudo -u kafka sed -i 's*dataDir=/tmp/zookeeper*dataDir=/kafka/zookeeper-data*g' config/zookeeper.properties

	echo 'Creating systemd service file'
	# Overwrite rather than append, so re-running the script does not duplicate the unit
	cat <<EOT > /kafka/kafka-zookeeper.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target kafka.mount
After=network.target remote-fs.target kafka.mount

[Service]
Type=simple
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/kafka/kafka/bin/zookeeper-server-start.sh /kafka/kafka/config/zookeeper.properties
ExecStop=/kafka/kafka/bin/zookeeper-server-stop.sh

[Install]
WantedBy=multi-user.target
EOT

	sudo cp /kafka/kafka-zookeeper.service /usr/lib/systemd/system/kafka-zookeeper.service

	echo 'enabling and starting zookeeper service'
	sudo systemctl daemon-reload
	sudo systemctl enable kafka-zookeeper.service
	sudo systemctl start kafka-zookeeper

}

case "$type" in
    "broker") echo "Broker installation starting"
	base_install
	install_broker "$id"
	;;
    "zookeeper") echo "Zookeeper installation starting"
	base_install
	install_zookeeper
	;;
    *) echo "Provide either broker or zookeeper as first argument" >&2
	exit 1
	;;
esac
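The script passes the broker number straight into sed, so running it as "broker" with a missing or non-numeric second argument would leave server.properties without a matching broker.id rewrite, or with an invalid value. A possible guard is sketched below; is_valid_broker_id is a hypothetical helper that is not part of the script above.

```shell
#!/bin/sh
# is_valid_broker_id: succeed only if the argument is a non-empty string
# made up entirely of digits.
is_valid_broker_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage inside the "broker" branch, before base_install:
# is_valid_broker_id "$id" || { echo "Broker id must be a number" >&2; exit 1; }
```

Checking before base_install matters because that function partitions and formats the disk, which is not something to do on a run that is going to fail anyway.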