🚀 Building Infrastructure with Docker — Part 2
Kafka + Zookeeper with Dedicated Dockerfiles, Better Debugging, and Real Observability
In Part 1, we created a reusable PostgreSQL module with a disciplined scaffold: pinned versions, Makefile lifecycle, health checks, and local bind-mount volumes.
In this part, we focus on Kafka + Zookeeper — but not as a bare-minimum broker. We’ll run Kafka as an independent, reusable module, enriched with:
- Extra debugging tools inside the container
- Prometheus-friendly JMX metrics
- Scripted topic initialization for realistic usage
- The same canonical project structure and Makefile contract
This module is designed to be dropped into any project as a ready-to-use event backbone.
🎯 Objectives for the Kafka Module
From the requirements, the Kafka setup must satisfy:
| Area | Requirement |
|---|---|
| Deployment | Local via Docker Compose |
| Modularity | Kafka lives in its own subfolder, sharing a Docker bridge network |
| Persistence | Message logs in ./docker-volume/kafka/ |
| Observability | JMX exporter enabled, Prometheus-ready metrics exposed |
| Debuggability | Container image enriched with basic debugging tools |
| Security | Prepared for future mTLS / JWT / hardened configs (not forced yet) |
We keep Kafka fully independent as a module, but it’s ready to plug into other infra (Prometheus, Grafana, Debezium, etc.) later.
📦 High-Level Architecture
The compose layout builds both Kafka and Zookeeper from dedicated Dockerfiles. The startup sequence looks like this:
Zookeeper starts first → Kafka waits for readiness → Kafka registers properly with Zookeeper → JMX metrics become available → Topics can be initialized.
This sequence ensures deterministic startup.
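The compose file later in this post uses a plain depends_on; if you want Compose itself to enforce this ordering, a minimal sketch using a healthcheck-gated dependency (Compose v2 syntax, service names from this module) would be:
services:
  kafka-bank:
    depends_on:
      zookeeper-bank:
        condition: service_healthy   # wait for Zookeeper's healthcheck, not just container start
Note that this only adds value if the Zookeeper healthcheck actually probes the service; the placeholder `exit 0` check shown later always reports healthy.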
🧱 Folder Structure (Canonical, Clean, Uniform)
modules/kafka/
infra/
docker-compose.yml
Dockerfile.kafka
Dockerfile.zookeeper
docker-volume/
kafka/
zookeeper/
init/
init-topics.sh
jmx-exporter/
kafka-2_0_0.yml
scripts/
test_health.sh
.env.example
Makefile
Jenkinsfile
docs/
README.md
requirements.md
design-intent.md
diagrams/
This matches the structure defined in Part-0 and Part-1 — every module in this series follows this consistent pattern so that once you learn one, you master them all.
🧱 Components in This Module
From the requirement spec:
| Service | Source | Role |
|---|---|---|
| Kafka Broker | Custom image built on Bitnami Kafka (Dockerfile.kafka) | Core message broker + debug tooling |
| Zookeeper | Custom image built on Bitnami Zookeeper (Dockerfile.zookeeper) | Coordination and metadata |
| JMX Exporter Conf | jmx-exporter/kafka-2_0_0.yml | Exposes Kafka metrics to Prometheus |
Key points:
- Kafka runs from a custom Dockerfile that includes additional debug tools.
- Debug tools are installed via `requirements_debug.sh` at image build time, providing utilities like `curl`, `net-tools`, `ping`, `lsof`, `procps`, `htop`, etc., for live troubleshooting inside the container.
- JMX metrics are enabled and configured using `kafka-2_0_0.yml`, so Prometheus can scrape Kafka with minimal additional setup.
🛠 Debug-Enriched Kafka Image
The Kafka broker image is built from a dedicated Dockerfile (in this module), which uses `requirements_debug.sh` to install a curated set of CLI tools:
- `curl`
- `net-tools` (e.g., `netstat`)
- `iputils-ping`
- `dnsutils`
- `iproute2`
- `lsof`
- `procps` (`ps`, `top`)
- `less`, `vim`, `nano`
- `htop`
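The script itself isn't reproduced in this post; here's a minimal sketch of what it might contain, assuming a Debian-based base image (Bitnami images are built on minideb):
#!/usr/bin/env bash
# requirements_debug.sh -- install debug tooling at image build time (sketch)
set -euo pipefail

apt-get update
apt-get install -y --no-install-recommends \
    curl net-tools iputils-ping dnsutils iproute2 \
    lsof procps less vim nano htop
    # add netcat-openbsd here if you rely on the nc-based Zookeeper ruok check shown later

# keep the image lean: drop apt caches after installing
rm -rf /var/lib/apt/lists/*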
This means when something goes wrong, you don’t have to rebuild images just to run basic diagnostics; you can:
- Inspect network connectivity directly from inside the broker container.
- Test DNS resolution and connectivity to other services.
- Inspect open ports and processes.
For infra education and local troubleshooting, this is a big win.
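For example, a typical live troubleshooting session (container names are the ones used in the compose file below) might look like:
# open a shell inside the broker container
docker exec -it kafka-bank bash

# inside the container:
ping -c 3 zookeeper-bank     # network reachability (iputils-ping)
nslookup zookeeper-bank      # DNS resolution on the bridge network (dnsutils)
ss -tlnp                     # listening ports: 9092, 9094, JMX (iproute2)
top                          # live process/CPU view (procps)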
🛠 Kafka as a Custom Image (Dockerfile.kafka)
Our Kafka Dockerfile now:
- Extends the version-pinned Bitnami Kafka base image
- Installs debugging utilities using `requirements_debug.sh`
- Enables the JMX exporter for Prometheus
- Uses `.env` to configure listener host, ports, and broker ID
- Uses `docker-volume/kafka/` for logs and persistent storage
Why a custom Dockerfile?
Because real clusters aren’t built from “tutorial images.”
We need:
- Predictability
- Debuggability
- Observability
- Reusability across projects
This aligns exactly with enterprise design constraints.
FROM bitnamilegacy/kafka:3.4
# Copy the JMX config directory
COPY jmx-exporter/ /jmx_exporter/
# Copy Kafka topic initialization scripts
#COPY init/ /opt/kafka-init/
# Ensure permissions are correct
USER root
# debugging steps start ---
COPY requirements_debug.sh .
# Conditionally install debug tools
RUN if [ -f requirements_debug.sh ]; then \
echo "[INFO] Found requirements_debug.sh. Executing..."; \
chmod +x requirements_debug.sh && ./requirements_debug.sh; \
else \
echo "[INFO] No debug script found. Skipping..."; \
fi
# debugging steps end ---
#RUN chmod +x /opt/kafka-init/init-topics.sh
RUN chown -R 1001:1001 /jmx_exporter
RUN chown -R 1001:1001 /bitnami/kafka
#RUN chown -R 1001:1001 /opt/kafka-init
USER 1001
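The Makefile below drives builds through Compose, but you can also build the images directly. Run this from the folder that contains the Dockerfiles, jmx-exporter/, and requirements_debug.sh, since the build context must include everything the COPY instructions reference:
docker build -f Dockerfile.kafka -t kafka-bank .
docker build -f Dockerfile.zookeeper -t zookeeper-bank .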
🦍 Zookeeper as a Custom Image (Dockerfile.zookeeper)
We extend a version-pinned Bitnami Zookeeper base image with our own Dockerfile rather than running the stock image as-is.
Our Zookeeper Dockerfile:
- Pins the base image version explicitly
- Applies the same debugging-tools strategy
- Exposes a metrics-friendly configuration
- Uses `docker-volume/zookeeper/` for its data
- Keeps everything deterministic
This also unlocks the option to add:
- Zookeeper JMX metrics
- ACLs
- Multi-node quorum setups
in future parts if needed.
FROM bitnamilegacy/zookeeper:3.8
# Ensure permissions are correct
USER root
# debugging steps start ---
COPY requirements_debug.sh .
# Conditionally install debug tools
RUN if [ -f requirements_debug.sh ]; then \
echo "[INFO] Found requirements_debug.sh. Executing..."; \
chmod +x requirements_debug.sh && ./requirements_debug.sh; \
else \
echo "[INFO] No debug script found. Skipping..."; \
fi
# debugging steps end ---
RUN chown -R 1001:1001 /bitnami/zookeeper
USER 1001
📦 docker-compose.yml (using the Dockerfiles)
The compose file wires both custom-built images together on a shared external network:
services:
zookeeper-bank:
build:
context: .
dockerfile: Dockerfile.zookeeper
image: zookeeper-bank
container_name: zookeeper-bank
# restart: unless-stopped
# user: "1001:1001"
ports:
- "${ZOOKEEPER_CLIENT_PORT}:2181"
env_file:
- .env
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
volumes:
- ./docker-volume/zookeeper/data:/bitnami/zookeeper
extra_hosts:
- "host.docker.internal:172.17.0.1"
healthcheck:
test: ["CMD-SHELL", "exit 0"]
interval: 10s
timeout: 5s
retries: 5
start_period: 20s
networks:
- bankingnet
kafka-bank:
build:
context: .
dockerfile: Dockerfile.kafka
image: kafka-bank
container_name: kafka-bank
# restart: unless-stopped
# user: "1001:1001"
depends_on:
- zookeeper-bank
ports:
- "${KAFKA_LISTENER_PORT}:${KAFKA_LISTENER_PORT}"
- 9094:9094
- "${KAFKA_JMX_PORT}:${KAFKA_JMX_PORT}" # for JMX_exporter
env_file:
- .env
environment:
- KAFKA_BROKER_ID=1
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper-bank:2181
- KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,EXTERNAL://0.0.0.0:9094
- KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka-bank:9092,EXTERNAL://localhost:9094
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
- KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=false
- ALLOW_PLAINTEXT_LISTENER=yes
- KAFKA_JMX_PORT=${KAFKA_JMX_PORT}
- KAFKA_OPTS=-javaagent:/jmx_exporter/jmx_prometheus_javaagent-0.18.0.jar=${KAFKA_JMX_PORT}:/jmx_exporter/kafka-2_0_0.yml
volumes:
- ./docker-volume/kafka/data:/bitnami/kafka
extra_hosts:
- "host.docker.internal:172.17.0.1"
healthcheck:
test: ["CMD-SHELL", "exit 0"]
interval: 10s
timeout: 5s
retries: 5
start_period: 30s
networks:
- bankingnet
networks:
bankingnet:
external: true
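One operational detail: bankingnet is declared external, so Compose will not create it for you. Create it once before the first startup (an idempotent one-liner):
docker network inspect bankingnet >/dev/null 2>&1 || docker network create bankingnet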
📡 JMX Exporter for Kafka Metrics
Kafka runs the JMX Prometheus Java agent (`jmx_prometheus_javaagent-0.18.0.jar`) on a configured port (e.g., 9404).
We mount `jmx-exporter/kafka-2_0_0.yml` as the exporter configuration:
lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
  - pattern: "kafka.server<type=(.+), name=(.+)PerSec\\w*, topic=(.+)><>Count"
    name: "kafka_server_$1_$2_per_sec"
    labels:
      topic: "$3"
    type: COUNTER
  - pattern: "kafka.server<type=(.+), name=(.+), topic=(.+)><>Value"
    name: "kafka_server_$1_$2"
    labels:
      topic: "$3"
    type: GAUGE
  - pattern: "kafka.server<type=(.+), name=(.+)><>Value"
    name: "kafka_server_$1_$2"
    type: GAUGE
  - pattern: "kafka.log<type=Log, name=(.+), topic=(.+), partition=(.+)><>Value"
    name: "kafka_log_$1"
    labels:
      topic: "$2"
      partition: "$3"
    type: GAUGE
  - pattern: "kafka.network<type=(.+), name=(.+)><>Value"
    name: "kafka_network_$1_$2"
    type: GAUGE
  - pattern: "kafka.controller<type=(.+), name=(.+)><>Value"
    name: "kafka_controller_$1_$2"
    type: GAUGE
  - pattern: "java.lang<type=Memory><HeapMemoryUsage>used"
    name: "jvm_memory_heap"
    type: GAUGE
  - pattern: "java.lang<type=GarbageCollector, name=(.+)><>CollectionCount"
    name: "jvm_gc_collection_count"
    labels:
      gc: "$1"
    type: COUNTER
This gives us ready-to-scrape metrics:
- Message in/out
- Request latency
- Topic partition stats
- Consumer lag
- Controller operations
This is essential when we integrate Prometheus and Grafana later in the series.
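Once the broker is running, you can verify the exporter from the host; assuming `KAFKA_JMX_PORT=9404` as in the `.env` example below:
curl -s http://localhost:9404/metrics | grep -m 5 '^kafka_server_'
And when Prometheus joins the stack, the scrape config is a single stanza (a sketch, assuming Prometheus runs on the same bankingnet network):
scrape_configs:
  - job_name: "kafka"
    static_configs:
      - targets: ["kafka-bank:9404"]   # container name + KAFKA_JMX_PORT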
🔧 Configuration via .env.example
Your .env file controls:
KAFKA_BROKER_ID=1
KAFKA_LISTENER_PORT=9092
KAFKA_ADVERTISED_LISTENER=PLAINTEXT://kafka-broker:9092
ZOOKEEPER_CLIENT_PORT=2181
KAFKA_JMX_PORT=9404
KAFKA_AUTO_CREATE_TOPICS=false
With clear design intent:
- No auto topic creation
- No uncontrolled image version drift
- Cleaner network behavior via a stable advertised broker hostname
Note: the advertised hostname must match the broker's name on the Docker network. The compose file above names the container `kafka-bank`, so keep `.env`, the scripts, and the compose service name in sync if you rename any of them.
🧪 Topic Initialization Script (Realistic Test Topics)
Instead of manually creating topics on the CLI each time, the module ships with init/init-topics.sh, which:
- Lists existing topics, and
- Creates a set of predefined topics with realistic names and partition counts:
topics=(
"transaction_events:3"
"account_changes:2"
"audit_logs:1"
"document_uploaded:1"
"metrics.service_health:1"
"metrics.db_health:1"
"metrics.kafka_health:1"
)
For each <name>:<partitions> pair, it runs:
docker run --rm -it --network "${NETWORK}" -e KAFKA_JMX_OPTS="" bitnami/kafka:3.4 \
kafka-topics.sh --create --if-not-exists \
--bootstrap-server "$BOOTSTRAP_SERVER" \
--replication-factor 1 \
--partitions "$partitions" \
--topic "$name"
using:
- `BOOTSTRAP_SERVER="kafka-broker:9092"`
- `NETWORK="bankingnet"`
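Putting those pieces together, the loop around that command presumably looks like this (a sketch; the actual script ships in init/init-topics.sh):
#!/usr/bin/env bash
set -euo pipefail

BOOTSTRAP_SERVER="kafka-broker:9092"
NETWORK="bankingnet"

# topics=( "name:partitions" ... ) as listed above
for entry in "${topics[@]}"; do
  name="${entry%%:*}"         # text before the colon
  partitions="${entry##*:}"   # text after the colon
  echo "Ensuring topic '${name}' with ${partitions} partition(s)..."
  docker run --rm -it --network "${NETWORK}" -e KAFKA_JMX_OPTS="" bitnami/kafka:3.4 \
    kafka-topics.sh --create --if-not-exists \
      --bootstrap-server "$BOOTSTRAP_SERVER" \
      --replication-factor 1 \
      --partitions "$partitions" \
      --topic "$name"
done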
This approach has key advantages:
- Topic initialization is idempotent (`--if-not-exists`).
- No need to exec into the broker; everything is driven via ephemeral Kafka CLI containers on the same Docker network.
- The topics match a realistic banking/microplatform use case: transactions, account changes, audit, and metrics streams.
You can run the script any time you reset Kafka.
🩺 Smoke Testing (Health + Topic Listing)
The module uses an improved test_health.sh which validates:
- Zookeeper health
- Kafka health
- Kafka CLI connectivity
- Topic listing or existence checks
This is the same pattern used in Part-1 for PostgreSQL, but extended for Kafka’s more complex lifecycle.
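Beyond container-level health, a quick end-to-end check is to produce and consume one message through a real topic. A sketch using the console tools shipped in the broker image (`audit_logs` is one of the topics created by init-topics.sh):
# produce a single test message
echo "smoke-test-$(date +%s)" | docker exec -i kafka-bank \
  kafka-console-producer.sh --bootstrap-server localhost:9092 --topic audit_logs

# read it back (exits after one message, or after 10s if nothing arrives)
docker exec -i kafka-bank \
  kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic audit_logs \
    --from-beginning --max-messages 1 --timeout-ms 10000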
🔧 Makefile Lifecycle (Same Pattern as Part-1)
# Kafka module Makefile
# ---- Config ----
ENV_FILE ?= .env
COMPOSE_FILE ?= infra/docker-compose.yml
PROJECT_NAME ?= kafka-module
DC := docker compose --env-file $(ENV_FILE) -f $(COMPOSE_FILE) -p $(PROJECT_NAME)
# ---- Phony Targets ----
.PHONY: help init build up down restart logs ps test clean topics-init topics-list
help:
@echo "Kafka module targets:"
@echo " init - prepare env file, folders, and pull/build images"
@echo " build - build Kafka and Zookeeper images"
@echo " up - start Kafka + Zookeeper in background"
@echo " down - stop containers"
@echo " restart - restart stack"
@echo " logs - follow logs for all services"
@echo " ps - show container status"
@echo " test - run health + smoke checks"
@echo " topics-init - create standard test topics (init-topics.sh)"
@echo " topics-list - list topics using kafka-topics.sh"
@echo " clean - stop stack and delete data volumes (with confirmation)"
init:
@if [ ! -f "$(ENV_FILE)" ]; then \
if [ -f ".env.example" ]; then \
echo "Creating $(ENV_FILE) from .env.example"; \
cp .env.example $(ENV_FILE); \
else \
echo "ERROR: .env.example not found. Create it first."; \
exit 1; \
fi \
else \
echo "$(ENV_FILE) already exists, not overwriting."; \
fi
@mkdir -p docker-volume/kafka docker-volume/zookeeper
@echo "Pulling / building images (if required)..."
@$(DC) pull || true
@$(DC) build --pull
build:
@$(DC) build --pull
up:
@$(DC) up -d
down:
@$(DC) down
restart: down up
logs:
@$(DC) logs -f
ps:
@$(DC) ps
test:
@./scripts/test_health.sh $(ENV_FILE)
topics-init:
@./init/init-topics.sh
topics-list:
@if [ ! -f "$(ENV_FILE)" ]; then \
echo "Env file $(ENV_FILE) not found. Run 'make init' first."; \
exit 1; \
fi; \
. "$(ENV_FILE)"; \
KAFKA_CONTAINER_NAME="$${KAFKA_CONTAINER_NAME:-kafka-broker}"; \
KAFKA_LISTENER_PORT="$${KAFKA_LISTENER_PORT:-9092}"; \
echo "Listing topics via $$KAFKA_CONTAINER_NAME on port $$KAFKA_LISTENER_PORT"; \
docker exec -it "$$KAFKA_CONTAINER_NAME" \
kafka-topics.sh --bootstrap-server localhost:$$KAFKA_LISTENER_PORT --list || true
clean:
@echo "WARNING: This will stop Kafka + Zookeeper and DELETE docker-volume data."
@read -p "Continue? (y/N) " ans; \
if [ "$$ans" = "y" ] || [ "$$ans" = "Y" ]; then \
$(DC) down; \
rm -rf docker-volume/kafka docker-volume/zookeeper; \
echo "Data removed."; \
else \
echo "Aborted."; \
fi
Key points:
- Uses `PROJECT_NAME` so this stack doesn't clash with others.
- `init`: copies `.env.example` → `.env` (one-time), creates the `docker-volume/` folders, and pulls/builds images.
- `test` just delegates to `scripts/test_health.sh`.
- `topics-list` uses the env file to determine container name and port.
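A typical first run of the module, end to end:
make init          # create .env, folders, build images
make up            # start Zookeeper + Kafka
make test          # health + smoke checks
make topics-init   # create the standard topics
make topics-list   # verify they exist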
modules/kafka/scripts/test_health.sh
#!/usr/bin/env bash
set -euo pipefail
ENV_FILE="${1:-.env}"
if [ ! -f "$ENV_FILE" ]; then
echo "Env file '$ENV_FILE' not found. Run 'make init' first."
exit 1
fi
# Load env (ignore comments / empty lines)
# shellcheck disable=SC2046
export $(grep -v '^\s*#' "$ENV_FILE" | grep -v '^\s*$' | xargs)
KAFKA_CONTAINER_NAME="${KAFKA_CONTAINER_NAME:-kafka-broker}"
ZOOKEEPER_CONTAINER_NAME="${ZOOKEEPER_CONTAINER_NAME:-zookeeper-bank}"
KAFKA_LISTENER_PORT="${KAFKA_LISTENER_PORT:-9092}"
ZOOKEEPER_CLIENT_PORT="${ZOOKEEPER_CLIENT_PORT:-2181}"
echo "Using containers:"
echo " Kafka : ${KAFKA_CONTAINER_NAME} (port ${KAFKA_LISTENER_PORT})"
echo " Zookeeper : ${ZOOKEEPER_CONTAINER_NAME} (port ${ZOOKEEPER_CLIENT_PORT})"
echo
# -------- Healthcheck: container state --------
check_container_health() {
local name="$1"
local label="$2"
local status
status=$(docker inspect --format='{{.State.Health.Status}}' "$name" 2>/dev/null || echo "no_healthcheck")
if [ "$status" = "healthy" ]; then
echo "[OK] $label container health: $status"
elif [ "$status" = "no_healthcheck" ]; then
echo "[WARN] $label container has no Docker healthcheck. Skipping health status check."
else
echo "[ERROR] $label container health: $status"
echo "Hint: docker logs $name"
exit 1
fi
}
echo "Checking Docker health status..."
check_container_health "$ZOOKEEPER_CONTAINER_NAME" "Zookeeper"
check_container_health "$KAFKA_CONTAINER_NAME" "Kafka"
echo
# -------- Zookeeper sanity: ruok --------
echo "Running Zookeeper sanity check (ruok)..."
docker exec -i "$ZOOKEEPER_CONTAINER_NAME" \
bash -c "echo ruok | nc localhost ${ZOOKEEPER_CLIENT_PORT} || exit 1" | grep -q "imok" \
&& echo "[OK] Zookeeper responded with 'imok'" \
|| { echo "[ERROR] Zookeeper did not respond with 'imok'"; exit 1; }
echo
# -------- Kafka sanity: list topics --------
echo "Running Kafka topic list sanity check..."
docker exec -i "$KAFKA_CONTAINER_NAME" \
kafka-topics.sh --bootstrap-server "localhost:${KAFKA_LISTENER_PORT}" --list || {
echo "[ERROR] Failed to list Kafka topics."
exit 1
}
echo "[OK] Kafka topic list command executed successfully."
echo
echo "Health + smoke checks completed successfully."
Notes:
- Expects (in `.env`):
  - `KAFKA_CONTAINER_NAME` (optional, defaults to `kafka-broker`)
  - `ZOOKEEPER_CONTAINER_NAME` (optional, defaults to `zookeeper-bank`)
  - `KAFKA_LISTENER_PORT` (optional, defaults to `9092`)
  - `ZOOKEEPER_CLIENT_PORT` (optional, defaults to `2181`)
- If your actual container names differ, either set them in `.env`, e.g.:
  KAFKA_CONTAINER_NAME=kafka-broker
  ZOOKEEPER_CONTAINER_NAME=zookeeper-bank
  or adjust the defaults at the top of the script.
- Zookeeper check: sends `ruok` and expects `imok`.
- Kafka check: lists topics via `kafka-topics.sh` inside the broker container.
Finally:
chmod +x modules/kafka/scripts/test_health.sh
🧭 Real-World Readiness
This new Kafka module is now:
- Production-friendly
- Debuggable in real time
- Observable with JMX exporter
- Independent for multi-team reuse
- Mapped directly to Kubernetes / OpenShift patterns
The separate Dockerfiles reflect how teams package infra images in real enterprises.
You now have a Kafka module that belongs in a real system — not just a tutorial.
📝 Summary of Part 2
You now have a:
- Custom Kafka image with debugging tools
- Custom Zookeeper image with debugging tools
- Prometheus-ready Kafka JMX metrics
- Topic initialization script modeling realistic event domains
- Standardized Makefile and folder structure
- Canonical docker-volume/ layout
- Deterministic and reusable Kafka Infra Module
This is the Kafka every backend platform team wishes they had ready to go.
GitHub Repository Link
🔗 Project Repo: https://github.com/KathiravanMuthaiah/infrastructureWithDocker
Building Infrastructure with Docker Series: post links
🔗 Building Infrastructure with Docker — Part 0:
🔗 Building Infrastructure with Docker — Part 1:
“Technically authored by me, accelerated with insights from ChatGPT by OpenAI.” Refer: Leverage ChatGPT
Happy Learning