Kafka cluster installation
We use Kafka at work, and I wanted a home cluster so I could try things out and learn more about Kafka.
To make it easy to start over after I mess everything up, I have made a script that automates the installation of the cluster for my needs.
Prerequisites: Create 4 VMs, each with
4 CPU
8GB RAM
16GB Disk
100GB Disk
Install Debian on all of them, and make sure that the 16GB disk shows up as /dev/sda and the 100GB disk as /dev/sdb, since the script partitions /dev/sdb and mounts it at /kafka as the root of the Kafka installation.
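Because the script repartitions /dev/sdb without asking, it may be worth confirming that /dev/sdb really is the big disk before running it. The following is only a sketch: the helper names and the 90 GB threshold are my own assumptions and are not part of the install script.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of kafka-install.sh.

# bytes_to_gb BYTES: convert a byte count to whole gigabytes (10^9 bytes).
bytes_to_gb() {
    echo $(( $1 / 1000000000 ))
}

# looks_like_data_disk BYTES: succeed if the size is at least 90 GB,
# an assumed threshold that separates the 100GB disk from the 16GB one.
looks_like_data_disk() {
    [ "$(bytes_to_gb "$1")" -ge 90 ]
}

# check_data_disk DEVICE: read the device size in bytes with lsblk and test it.
check_data_disk() {
    size=$(lsblk -bdno SIZE "$1") || return 1
    looks_like_data_disk "$size"
}
```

On a VM you would call it as: check_data_disk /dev/sdb || echo "/dev/sdb does not look like the 100GB disk"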
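Name them kafka-zoo1, kafka-broker1, kafka-broker2 and kafka-broker3, and update DNS so the names resolve. The script points the brokers at kafka-zoo1.root.dom, so change that name in the script if your domain is different.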
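A quick way to confirm the names resolve is to ask getent for each of them. This is a minimal sketch; the function name is made up for illustration, and it assumes getent is available, as it is on a standard Debian install.

```shell
#!/bin/sh
# unresolved_hosts NAME...: print every name that getent cannot resolve.
# Prints nothing when all names resolve.
unresolved_hosts() {
    for name in "$@"; do
        getent hosts "$name" >/dev/null 2>&1 || echo "$name"
    done
}
```

Run it from each VM as: unresolved_hosts kafka-zoo1 kafka-zoo1.root.dom kafka-broker1 kafka-broker2 kafka-broker3
Any name it prints still needs a DNS record.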
SSH to kafka-zoo1, create kafka-install.sh in the current directory, paste in the script from the bottom of this post, and make it executable with chmod +x kafka-install.sh.
Then run the script:
sudo ./kafka-install.sh zookeeper
Wait for the installation to complete, then verify that ZooKeeper is running:
sudo systemctl status kafka-zookeeper
If the zookeeper is running, i.e. systemctl shows a status similar to:
● kafka-zookeeper.service - Apache Zookeeper server (Kafka)
Loaded: loaded (/kafka/kafka-zookeeper.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-01-09 13:26:59 CET; 2h 13min ago
Docs: http://zookeeper.apache.org
Main PID: 2437 (java)
Tasks: 38 (limit: 9470)
Memory: 99.3M
CPU: 16.634s
CGroup: /system.slice/kafka-zookeeper.service
└─2437 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLeve>
then you are good to continue.
SSH to each of the broker servers, create kafka-install.sh in the current directory, paste in the script from the bottom of this post, and make it executable.
Then run the script with the broker's number as the second argument: 1 on broker1, 2 on broker2, and so forth. The script writes this number into broker.id, so it must be different on every broker:
sudo ./kafka-install.sh broker 1
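Since the broker number matches the digit at the end of the hostname, it could also be derived instead of typed. The helper below is a sketch that assumes the kafka-brokerN naming used above; the function name is hypothetical and the install script does not do this itself.

```shell
#!/bin/sh
# broker_id_from_hostname NAME: print the trailing digits of a host name
# such as kafka-broker2 or kafka-broker2.root.dom. Prints nothing and
# returns non-zero if the short name does not end in digits.
broker_id_from_hostname() {
    short=${1%%.*}    # drop any domain part
    bid=$(printf '%s\n' "$short" | sed -n 's/^.*[^0-9]\([0-9][0-9]*\)$/\1/p')
    [ -n "$bid" ] && echo "$bid"
}
```

On a broker this would be used as: sudo ./kafka-install.sh broker "$(broker_id_from_hostname "$(hostname)")"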
Wait for the installation to complete, then verify that the broker service is running:
sudo systemctl status kafka-broker
If the broker is running, i.e. systemctl shows a status similar to:
● kafka-broker.service - Apache Zookeeper server (Kafka)
Loaded: loaded (/lib/systemd/system/kafka-broker.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-01-09 14:45:24 CET; 21s ago
Docs: http://zookeeper.apache.org
Main PID: 440 (java)
Tasks: 72 (limit: 9470)
Memory: 404.5M
CPU: 6.484s
CGroup: /system.slice/kafka-broker.service
└─440 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 >
then the broker is up and you can move on to the next one.
This is a guide for myself.
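Once all three brokers report as running, a short smoke test can confirm that they registered with ZooKeeper and can host a replicated topic. This is a sketch under a few assumptions: the brokers listen on Kafka's default port 9092 (the install script does not change the listener settings), the topic name smoke-test is made up, and the tools are run from /kafka/kafka/bin on one of the servers.

```shell
#!/bin/sh
# bootstrap_servers COUNT: build a host:port list for kafka-broker1..COUNT,
# assuming the default listener port 9092.
bootstrap_servers() {
    n=1
    list=""
    while [ "$n" -le "$1" ]; do
        list="${list:+$list,}kafka-broker$n:9092"
        n=$((n + 1))
    done
    echo "$list"
}

# smoke_test: list the broker ids registered in ZooKeeper, then create and
# describe a topic replicated across all three brokers.
smoke_test() {
    kafka_home=/kafka/kafka
    servers=$(bootstrap_servers 3)
    "$kafka_home/bin/zookeeper-shell.sh" kafka-zoo1:2181 ls /brokers/ids
    "$kafka_home/bin/kafka-topics.sh" --bootstrap-server "$servers" \
        --create --topic smoke-test --partitions 3 --replication-factor 3
    "$kafka_home/bin/kafka-topics.sh" --bootstrap-server "$servers" \
        --describe --topic smoke-test
}
```

Calling smoke_test should list the three broker ids and show each partition of smoke-test with three replicas; if the topic creation fails with a replication factor error, one of the brokers has not joined.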
Installation script
This needs to be created on all servers as an executable file called kafka-install.sh.
#!/bin/sh
type=$1
id=$2
base_install()
{
echo 'Creating user kafka'
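# No -p here: useradd expects an already hashed password for -p, so a plain-text value never gives a working login
sudo useradd -d /kafka -U kafka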
echo 'Creating kafka partition'
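# fdisk answers: new (n), primary (p), partition 1, default first sector, default last sector, write (w)
printf 'n\np\n1\n\n\nw\n' | sudo fdisk /dev/sdb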
echo 'Creating filesystem'
sudo mkfs -t ext4 /dev/sdb1
echo 'creating mount point & homedir /kafka'
sudo mkdir -p /kafka
echo 'mounting /dev/sdb1 to /kafka'
sudo mount -t auto /dev/sdb1 /kafka
echo 'Making mount point persistent'
uuid=$(sudo blkid /dev/sdb1 -o export -s UUID |grep UUID)
# fsck pass 2: pass 1 is reserved for the root filesystem
sudo sh -c "echo '$uuid /kafka ext4 errors=remount-ro 0 2' >> /etc/fstab"
echo 'Changing ownership of /kafka to kafka user'
sudo chown -R kafka:kafka /kafka
cd /kafka
sudo chmod a+rwx /kafka
echo 'installing java'
sudo apt-get update
sudo apt-get install -y openjdk-11-jre-headless wget
}
install_broker()
{
broker_id=$1
echo 'Downloading kafka package'
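# Older releases such as 3.0.0 are served from archive.apache.org rather than dlcdn.apache.org
sudo -u kafka wget https://archive.apache.org/dist/kafka/3.0.0/kafka_2.13-3.0.0.tgz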
echo 'Uncompressing kafka package'
sudo -u kafka tar -xvpf kafka_2.13-3.0.0.tgz
echo 'creating symbolic link to /kafka/kafka'
sudo -u kafka ln -s ./kafka_2.13-3.0.0 ./kafka
cd /kafka/kafka
echo 'Updating configuration to zookeeper connection details'
sudo -u kafka sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=kafka-zoo1.root.dom:2181/g' config/server.properties
echo "Changing broker id to $broker_id"
reg="s/broker.id=0/broker.id=$broker_id/g"
sudo -u kafka sed -i "$reg" config/server.properties
echo 'Changing log directory to /kafka/kafka-logs'
sudo -u kafka sed -i 's*log.dirs=/tmp/kafka-logs*log.dirs=/kafka/kafka-logs*g' config/server.properties
echo 'Creating systemd service file'
# Overwrite rather than append, so a re-run does not duplicate the unit contents
cat <<EOT > /kafka/kafka-broker.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target kafka.mount
After=network.target remote-fs.target kafka.mount
[Service]
Type=simple
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/kafka/kafka/bin/kafka-server-start.sh /kafka/kafka/config/server.properties
ExecStop=/kafka/kafka/bin/kafka-server-stop.sh
[Install]
WantedBy=multi-user.target
EOT
sudo cp /kafka/kafka-broker.service /usr/lib/systemd/system/kafka-broker.service
echo 'enabling and starting broker service'
sudo systemctl daemon-reload
sudo systemctl enable kafka-broker.service
sudo systemctl start kafka-broker
}
install_zookeeper()
{
echo 'Downloading kafka package'
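# Older releases such as 3.0.0 are served from archive.apache.org rather than dlcdn.apache.org
sudo -u kafka wget https://archive.apache.org/dist/kafka/3.0.0/kafka_2.13-3.0.0.tgz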
echo 'Uncompressing kafka package'
sudo -u kafka tar -xvpf kafka_2.13-3.0.0.tgz
echo 'creating symbolic link to /kafka/kafka'
sudo -u kafka ln -s ./kafka_2.13-3.0.0 ./kafka
cd /kafka/kafka
echo 'Changing datadir for zookeeper'
sudo -u kafka sed -i 's*dataDir=/tmp/zookeeper*dataDir=/kafka/zookeeper-data*g' config/zookeeper.properties
echo 'Creating systemd service file'
# Overwrite rather than append, so a re-run does not duplicate the unit contents
cat <<EOT > /kafka/kafka-zookeeper.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target kafka.mount
After=network.target remote-fs.target kafka.mount
[Service]
Type=simple
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/kafka/kafka/bin/zookeeper-server-start.sh /kafka/kafka/config/zookeeper.properties
ExecStop=/kafka/kafka/bin/zookeeper-server-stop.sh
[Install]
WantedBy=multi-user.target
EOT
sudo cp /kafka/kafka-zookeeper.service /usr/lib/systemd/system/kafka-zookeeper.service
echo 'enabling and starting zookeeper service'
sudo systemctl daemon-reload
sudo systemctl enable kafka-zookeeper.service
sudo systemctl start kafka-zookeeper
}
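case "$type" in
"broker")
# The broker id is mandatory; without it sed would write an empty broker.id
if [ -z "$id" ]; then
echo "Provide a numeric broker id as second argument" >&2
exit 1
fi
echo "Broker installation starting"
base_install
install_broker "$id"
;;
"zookeeper") echo "Zookeeper installation starting"
base_install
install_zookeeper
;;
*) echo "Provide either broker or zookeeper as first argument" >&2
exit 1
;;
esac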
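One thing to keep in mind: the sed commands in the script only match the stock default values (broker.id=0, log.dirs=/tmp/kafka-logs and so on), so on a second run, or with a Kafka version that ships different defaults, they silently change nothing. A helper that sets a key regardless of its current value could look like the sketch below. The function name is hypothetical and it is not what the script above does.

```shell
#!/bin/sh
# set_property FILE KEY VALUE: replace an existing "KEY=..." line, or append
# one if the key is missing. Uses '|' as the sed delimiter so values that
# contain '/' (paths) work; values containing '|' or '&' are not handled.
# The '.' in a key is treated as a regex wildcard, which is good enough here.
set_property() {
    file=$1
    key=$2
    value=$3
    if grep -q "^$key=" "$file"; then
        # Write to a temp file and move it back, so no in-place sed is needed
        sed "s|^$key=.*|$key=$value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, set_property config/server.properties broker.id 2 would set the id whether the file currently says broker.id=0 or anything else. Note that moving the temp file into place can change the file's owner, so in the install script it would have to run as the kafka user.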