Apache Kafka — Interview Questions
Stack context: This system uses Apache Kafka (Confluent 7.6.1, KRaft mode) as the event-streaming backbone for asynchronous inter-service communication. Topics include
order.created,payment.processed, andorder.status.updated. Avro + Confluent Schema Registry is used for serialization, and Spring's@RetryableTopichandles retry/DLT flows.
Q1 — What is Apache Kafka and how does it differ from a traditional message queue? junior
Answer: Apache Kafka is a distributed, append-only event log. Unlike a traditional message queue (RabbitMQ, ActiveMQ) where a message is removed once consumed, Kafka retains messages on disk for a configurable retention period. Multiple independent consumers can read the same messages at different offsets.
Key differences:
- Persistence: Kafka stores messages regardless of consumption; queues delete after acknowledgment.
- Multiple consumers: Kafka supports independent consumer groups, each with their own offset. Queues deliver each message to one consumer only.
- Ordering: Kafka guarantees ordering within a partition. Queues typically offer FIFO but not strict partition-level ordering.
- Throughput: Kafka is designed for millions of messages per second using sequential disk I/O and batching.
Real use case in this system: The order.created topic is consumed independently by payment-processors and notification-processors. Both groups receive every event without interfering with each other.
Q2 — What are topics, partitions, and offsets? junior
Answer:
- Topic: A named category/stream of messages (e.g.,
order.created). Think of it as a database table. - Partition: A topic is split into N ordered, immutable sub-logs. Each partition is stored independently and can be hosted on a different broker. This enables horizontal scalability.
- Offset: A monotonically increasing integer assigned to each message within a partition. Each consumer group tracks its own committed offset per partition, enabling independent replay and resume-on-restart.
order.created (3 partitions):
Partition 0: [offset-0, offset-1, offset-2, ...]
Partition 1: [offset-0, offset-1, ...]
Partition 2: [offset-0, ...]
Real use case: In this system order.created has 3 partitions. Orders are keyed by orderId, so all events for the same order always land in the same partition, guaranteeing per-order processing order.
Q3 — What is a consumer group and how does partition assignment work? junior
Answer: A consumer group is a logical grouping of consumers that share the work of consuming a topic. Kafka assigns each partition to exactly one consumer within a group, ensuring each message is processed once per group.
Rules:
- If consumers < partitions: some consumers handle multiple partitions.
- If consumers = partitions: 1:1 assignment.
- If consumers > partitions: extra consumers are idle (no partition assigned).
When a consumer joins or leaves, Kafka triggers a rebalance to redistribute partitions.
Real use case: payment-processors group has consumers equal to the 3 partitions of order.created. Each consumer handles one partition. If a payment-service pod crashes, Kafka rebalances and the remaining consumers absorb its partition.
Q4 — What is auto.offset.reset and when would you use earliest vs latest? junior
Answer:
auto.offset.reset determines where a new consumer group (no committed offset yet) starts reading:
earliest: Read from the very beginning of the topic (offset 0 of each partition).latest: Skip all existing messages; start reading only new ones produced after the consumer starts.
When to use each:
earliest: Testing, data reprocessing, event sourcing catch-up. Used in this system for local dev so new service pods can process all historic events.latest: Production consumers that should only handle live traffic and not backfill historical data (e.g., live dashboards).
Q5 — Explain the @KafkaListener annotation and how it works in Spring. junior
Answer:
@KafkaListener marks a method as a Kafka message handler. Spring's KafkaListenerContainerFactory creates a MessageListenerContainer that manages the consumer loop, offset commit, and thread management.
@KafkaListener(
topics = "order.created",
groupId = "payment-processors",
containerFactory = "kafkaListenerContainerFactory"
)
public void handleOrder(OrderCreatedEvent event) { ... }
Key behaviors:
- Deserialization happens via configured
Deserializer(Avro in this system). - By default, offsets are committed after the method returns successfully (at-least-once delivery).
- Exceptions propagate to the configured
ErrorHandler(or@RetryableTopicif configured).
Q6 — What is Avro and why use it instead of JSON for Kafka messages? junior
Answer: Apache Avro is a binary serialization format with a schema defined in JSON. Compared to JSON:
| Feature | Avro | JSON |
|---|---|---|
| Size | Compact binary (~5-10× smaller) | Verbose text |
| Schema enforcement | Required at produce/consume | Optional |
| Evolution | Supports backward/forward compatibility | No built-in rules |
| Performance | Fast binary encode/decode | Slower text parse |
Schema Registry integration: Avro schemas are registered with Confluent Schema Registry. Only the schema ID (4 bytes) is sent in the message payload. Consumers fetch the schema by ID to deserialize.
Real use case: This system uses Avro for OrderCreatedEvent, PaymentProcessedEvent, and OrderStatusUpdatedEvent. Schema Registry prevents a producer from publishing a breaking schema change.
Q7 — What is the Confluent Schema Registry and how does it prevent breaking changes? junior
Answer: Schema Registry is a centralized service that stores and versions Avro (or Protobuf/JSON Schema) schemas. Producers register a schema before publishing; consumers retrieve schemas by ID.
Compatibility levels (configured per subject):
- BACKWARD: New schema can read data written by the previous schema (consumers can upgrade first).
- FORWARD: Previous schema can read data written by the new schema (producers can upgrade first).
- FULL: Both directions.
If a proposed schema violates the compatibility rule (e.g., removing a required field), the Registry rejects it and the producer fails to start — preventing runtime deserialization errors.
Q8 — What is @RetryableTopic and how does Spring implement retry logic with it? junior
Answer:
@RetryableTopic is a Spring Kafka annotation that configures non-blocking retry using separate retry topics instead of blocking the main consumer thread.
@RetryableTopic(
attempts = "4",
backoff = @Backoff(delay = 1000, multiplier = 2.0),
dltTopicSuffix = ".DLT"
)
@KafkaListener(topics = "order.created", groupId = "payment-processors")
public void processPayment(OrderCreatedEvent event) { ... }
Spring auto-creates order.created-retry-0, order.created-retry-1, order.created-retry-2, and order.created.DLT. On exception, the message is forwarded to the next retry topic with a header indicating the delay. After all attempts are exhausted, the message lands in the DLT.
Advantage over blocking retry: The main consumer continues processing other messages while retries are scheduled, preventing a single bad message from stalling the entire partition.
Q9 — What is a Dead Letter Topic (DLT) and how should it be handled in production? senior
Answer: A DLT receives messages that failed all retry attempts. It serves as a "poison pill parking lot" that preserves failed events for investigation and reprocessing.
Production handling strategy:
- Monitor: Alert on DLT consumer-group lag > 0 (PagerDuty/Datadog).
- @DltHandler: Annotate a method to process DLT messages — log structured context (orderId, error type, attempt count, headers).
- Manual replay: After fixing the root cause, re-publish DLT messages back to the main topic.
- Automatic replay (optional): A scheduled job re-publishes DLT messages after a cooldown period.
@DltHandler
public void handleDlt(OrderCreatedEvent event, @Header KafkaHeaders.RECEIVED_TOPIC String topic) {
log.error("DLT received: orderId={}, topic={}", event.getOrderId(), topic);
// persist to dead_letters table for ops dashboard
}
This system: order.created.DLT has 3 partitions. The @DltHandler logs the failure; ops teams can replay via the admin endpoint.
Q10 — What is the Transactional Outbox Pattern and why is it needed? senior
Answer: The Transactional Outbox solves the dual-write problem: you cannot atomically write to both a database and Kafka in a single transaction (they are different systems).
Problem: order-service creates an order in PostgreSQL and publishes to Kafka. If the app crashes after the DB commit but before the Kafka publish, the order exists but no event is sent — downstream services never know.
Solution: Write the event to an outbox table in the same DB transaction as the order. A separate OutboxPublisher polls the outbox, publishes to Kafka, then marks the row as sent.
BEGIN TRANSACTION
INSERT INTO orders (...)
INSERT INTO outbox (event_type, payload, published=false)
COMMIT
-- Separate thread:
SELECT * FROM outbox WHERE published = false
→ KafkaTemplate.send(...)
→ UPDATE outbox SET published = true WHERE id = ?
Guarantees: At-least-once delivery (idempotent consumers needed). The order and event are always consistent.
This system: order-service implements this with OutboxPublisher polling every 100 ms.
Q11 — Explain Kafka's delivery semantics: at-most-once, at-least-once, exactly-once. senior
Answer:
| Semantic | Description | Config | Risk |
|---|---|---|---|
| At-most-once | Commit offset before processing | enable.auto.commit=true, early commit |
Message loss on consumer crash |
| At-least-once | Commit after processing | Manual commit after handler returns | Duplicate processing on crash |
| Exactly-once | Idempotent producer + transactions | enable.idempotence=true, isolation.level=read_committed |
Higher latency, complexity |
At-least-once + idempotent consumers is the pragmatic choice for most systems. This system uses it: @RetryableTopic may redeliver the same event, so payment-service checks if an order payment already exists before processing.
Exactly-once is appropriate for financial ledgers or inventory deductions where duplicates are unacceptable and the overhead is justified.
Q12 — How does Kafka KRaft mode differ from ZooKeeper-based Kafka? senior
Answer:
KRaft (Kafka Raft) replaces ZooKeeper as the metadata store in Kafka 3.x+. The cluster elects a Controller Quorum using the Raft consensus algorithm, and metadata is stored in an internal __cluster_metadata topic.
Benefits of KRaft:
- Eliminates ZooKeeper operational overhead (separate JVM cluster to maintain).
- Faster metadata propagation (metadata stored in Kafka itself).
- Single system to monitor, secure, and scale.
- Supports larger clusters (ZK had practical limits ~tens of thousands of partitions; KRaft supports millions).
- Simplifies Docker/Kubernetes deployments (one container type instead of two).
This system: Uses KRaft mode (docker-compose.yml runs Kafka with KAFKA_KRAFT_MODE=true, no ZooKeeper container).
Q13 — What is consumer lag and how do you monitor and alert on it? senior
Answer: Consumer lag is the difference between the latest produced offset and the last committed consumer offset for a partition:
lag = latest_offset - committed_offset
High lag means consumers are falling behind producers — events are queuing up.
Monitoring:
kafka-consumer-groups.sh --describe(CLI)- JMX metrics:
kafka.consumer:type=consumer-fetch-manager-metrics,attribute=records-lag-max - Kafka Exporter + Prometheus + Grafana (common production setup)
Alerting thresholds (example):
- Warning: lag > 1,000 for more than 2 minutes
- Critical: lag > 10,000 or lag growing at > 500/min
Remediation:
- Add more consumers (up to partition count)
- Increase consumer processing throughput (tune
max.poll.records, optimize DB calls) - Investigate DLT — stuck DLT processing can block main topic commits
Q14 — How does Kafka guarantee ordering and what are its limitations? senior
Answer: Kafka guarantees ordering within a partition. Messages with the same key always go to the same partition (via key hash % partition count). A single consumer per partition in a consumer group processes them in order.
Limitations:
- Cross-partition ordering: No guarantee. If an order has events in partition 0 and partition 1, they may be processed out of sequence (shouldn't happen with key-based routing, but edge cases exist on partition count change).
- Rebalance window: During consumer rebalance, the previous consumer may not have committed its last offset. The new consumer re-processes from the last commit — potential duplicates.
- Repartitioning: Increasing partition count breaks key-to-partition mapping for existing keys until all messages are consumed. Plan partition count upfront.
Q15 — What are idempotent producers and when are they important? senior
Answer:
An idempotent producer ensures that retrying a failed send() does not result in duplicate messages in Kafka. Each producer is assigned a Producer ID (PID), and each message gets a sequence number. Kafka brokers track PID+sequence and deduplicate retried messages.
Enable: enable.idempotence=true (default in Kafka 3+).
When important:
- Network blips where the broker receives a message but the ACK is lost, causing the producer to retry.
- Combined with
acks=allandretries=Integer.MAX_VALUEfor strongest durability.
Transactional producers: Extend idempotency across partitions and topics — required for exactly-once semantics.
Q16 — Explain acks configuration and its impact on durability vs. throughput. senior
Answer:
acks |
Meaning | Durability | Throughput |
|---|---|---|---|
0 |
Fire and forget — no ACK waited | Lowest | Highest |
1 |
Leader ACK only | Medium | Medium |
all (-1) |
All in-sync replicas (ISR) ACK | Highest | Lowest |
acks=all combined with min.insync.replicas=2 ensures that at least 2 replicas have the message before the producer gets success. If the leader crashes before replication, the message isn't lost.
This system: Order events use acks=all (configured via Spring spring.kafka.producer.acks=all) because losing an order event is unacceptable. Payment notifications can tolerate acks=1.
Q17 — What is a Kafka compacted topic and when would you use one? senior
Answer: A compacted topic retains only the latest value for each key, removing older versions during log compaction. It behaves like a changelog or snapshot store rather than an event log.
Use cases:
- CDC (Change Data Capture): Latest state of a database row keyed by primary key.
- Configuration store: Downstream services subscribe to get current configuration.
- Materialized views: Reconstruct current state without replaying entire history.
How compaction works: A background thread (log cleaner) periodically scans log segments, keeps the latest record per key, and discards older ones. Tombstones (null value) signal deletion.
This system: Does not use compacted topics currently, but product.catalog.updated would be a candidate — consumers need the latest product state, not full history.
Q18 — How do you handle schema evolution in Avro without breaking consumers? senior
Answer: Schema Registry enforces compatibility rules. For backward compatibility (consumers can read new schema with the old one):
- Safe: Adding optional fields with defaults.
- Safe: Removing optional fields.
- Unsafe: Removing required fields, renaming fields, changing types.
Evolution example:
// V1
{"name": "OrderCreatedEvent", "fields": [{"name": "orderId", "type": "string"}]}
// V2 — backward compatible (new optional field with default)
{"name": "OrderCreatedEvent", "fields": [
{"name": "orderId", "type": "string"},
{"name": "priority", "type": ["null", "string"], "default": null}
]}
Rolling upgrade strategy:
- Register V2 schema.
- Deploy consumers first (they handle V1 messages with default for missing field).
- Deploy producers (publish V2).
This system: Deploy order-service consumers before order-service producers when adding new event fields.
Q19 — What is the difference between poll() timeout and max.poll.interval.ms? senior
Answer:
poll()timeout: The durationpoll()blocks waiting for records. If records are available immediately, it returns sooner. Set it to a value that balances responsiveness and CPU.max.poll.interval.ms: The maximum time between consecutivepoll()calls. If a consumer doesn't callpoll()within this window (e.g., processing takes too long), the broker considers it dead and triggers a rebalance.
Common mistake: Setting max.poll.records high and having slow message processing. The consumer processes 500 records but takes 6 minutes, exceeding max.poll.interval.ms=300000 (5 min default) → rebalance → duplicate processing.
Fix: Reduce max.poll.records, increase max.poll.interval.ms accordingly, or process records asynchronously (with care around offset commit ordering).
Q20 — How does partition rebalance work with the cooperative sticky assignor? senior
Answer: The default Eager Assignor (Range/RoundRobin) triggers a "stop-the-world" rebalance: all consumers revoke ALL their partitions, then re-assign. This causes processing gaps.
The Cooperative Sticky Assignor (available since Kafka 2.4) implements incremental rebalancing:
- Consumers propose their current assignments.
- The Group Coordinator identifies partitions that need to move.
- Only those partitions are revoked and reassigned; other partitions continue processing.
Benefits: No processing pause for partitions that don't move. Reduces consumer lag spike on rebalance.
Configure: partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor
Q21 — How would you implement exactly-once processing end-to-end? senior
Answer: Full exactly-once requires coordination at three layers:
- Idempotent producer (
enable.idempotence=true): Deduplicates retried produce calls within a session. - Transactional producer: Groups multiple sends into an atomic unit.
- Consumer isolation (
isolation.level=read_committed): Consumers only see committed transactional messages, not in-flight or aborted ones.
producer.initTransactions();
producer.beginTransaction();
try {
producer.send(record1);
producer.send(record2);
producer.sendOffsetsToTransaction(offsets, consumerGroupMetadata);
producer.commitTransaction();
} catch (Exception e) {
producer.abortTransaction();
}
Trade-offs: ~10-20% throughput reduction. Only appropriate for financial or inventory systems where duplicates are catastrophic. Most systems accept at-least-once + idempotent consumers.
Q22 — What is the role of min.insync.replicas and how does it interact with acks=all? senior
Answer:
min.insync.replicas (broker/topic config) defines the minimum number of replicas that must acknowledge a write for it to succeed when acks=all.
Formula: replication.factor >= min.insync.replicas + 1 (leave room for one broker to be offline).
Example (replication.factor=3, min.insync.replicas=2):
- Broker 1 (leader) + Broker 2 (ISR) = 2 replicas acknowledge → write succeeds.
- Broker 1 (leader) only → write fails with
NotEnoughReplicasException.
Meaning: With this config, you can lose 1 broker and still write. You cannot lose 2 brokers simultaneously.
This system: Using default single-node Kafka in Docker for dev/test. Production would use replication.factor=3, min.insync.replicas=2.
Q23 — How would you debug a consumer that is not receiving messages? junior
Answer: Systematic debugging steps:
- Check consumer group status:
kafka-consumer-groups.sh --describe --group payment-processors— verify members and assigned partitions. - Check lag: Is lag 0 (caught up) or growing?
- Check topic existence:
kafka-topics.sh --list— verify topic name (typo in config?). - Check
auto.offset.reset: Is itlatestand the consumer started before messages were produced? - Check deserializer: Deserialization errors cause the consumer to log and skip (or throw, depending on error handler).
- Check Spring logs: Enable
logging.level.org.springframework.kafka=DEBUG. - Produce a test message manually:
kafka-console-producer.sh→ see if consumer receives it. - Check ACL: Is the consumer authorized to read the topic (in secured clusters)?
Q24 — What is the difference between commitSync and commitAsync? junior
Answer:
commitSync(): Blocks until the broker acknowledges the offset commit. Retries on failure. Guarantees the commit succeeded before returning — safe but adds latency.commitAsync(): Non-blocking. Fires the commit and returns immediately with a callback for success/failure. Higher throughput but riskier on shutdown.
Best practice: Use commitAsync() during normal processing for throughput, and commitSync() in the finally block on shutdown to ensure the last batch is committed.
Spring default: AckMode.BATCH — commits offsets after each poll batch returns. This is a form of async commit optimized for throughput.
Q25 — How do you test Kafka consumers and producers in unit/integration tests? junior
Answer: Unit tests (no broker needed):
- Use
MockProducerandMockConsumerfromkafka-clients. - Test message handling logic directly by invoking the listener method with a mock payload.
Integration tests (with Testcontainers):
@Testcontainers
class OrderEventConsumerTest {
@Container
static KafkaContainer kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.6.1"));
@Test
void shouldProcessOrderCreated() {
// produce test message → await consumer invocation → assert side effects
}
}
This system: Uses @EmbeddedKafka (Spring) for per-service tests and DockerComposeContainer for full E2E tests. KafkaTestUtils provides getSingleRecord() helper.
Q26 — What is a Kafka Streams application and how does it differ from a consumer group? senior
Answer: Kafka Streams is a library for stateful, exactly-once stream processing within a JVM application:
- Joins, aggregations, windowing across multiple topics.
- Maintains local state stores (RocksDB) backed by changelog topics.
- Fault-tolerant: state can be rebuilt from changelog on restart.
- No separate cluster: runs inside your application.
vs. Consumer group:
| Consumer Group | Kafka Streams | |
|---|---|---|
| Processing | Stateless per-message | Stateful (joins, windows) |
| State | External (DB) | Local state store + changelog |
| Exactly-once | Complex to achieve | Built-in with processing.guarantee=exactly_once_v2 |
| Complexity | Low | Higher |
This system: Does not use Kafka Streams. Payment and notification services use simple consumer groups because they don't require stateful aggregation. Streams would be appropriate for "calculate total order value per customer per hour" analytics.
Q27 — How does Kafka handle backpressure? senior
Answer: Kafka does not have built-in backpressure signaling from consumer to producer. Instead, backpressure is managed through:
- Consumer lag accumulation: Messages queue up in the topic. Producers write at their rate; consumers read at their rate. The gap (lag) absorbs the burst.
- Producer blocking: If the producer's internal buffer (
buffer.memory) fills up,send()blocks formax.block.msbefore throwing an exception. - Application-level throttling: Rate limit producers based on consumer lag metrics (adaptive rate limiting).
- Spring reactive WebFlux: In this system's API Gateway, reactive backpressure via Project Reactor
Flux.limitRate()can throttle upstream requests.
This system: The order.created topic acts as a buffer. If payment-service is slow (DB contention during sales peak), lag grows in the topic. No messages are lost; they process when capacity is available.
Q28 — Explain Kafka's log segment structure and how retention works. senior
Answer: Each partition is stored as a series of log segments on disk:
order.created-0/
00000000000000000000.log ← message data
00000000000000000000.index ← offset → byte position index
00000000000000000000.timeindex ← timestamp → offset index
00000000000001000000.log ← next segment (after roll)
A new segment is created when the active segment reaches log.segment.bytes (default 1 GB) or log.roll.ms (default 7 days).
Retention modes:
- Time-based (
log.retention.hours=168): Delete segments older than N hours. - Size-based (
log.retention.bytes): Delete oldest segments when total partition size exceeds limit. - Compaction (per-topic): Retain only latest value per key (described in Q17).
This system: Default 7-day retention. Sufficient for retry/DLT scenarios. A payment DLT message is never older than days before ops investigates.
Q29 — How would you increase Kafka throughput for a high-volume producer? senior
Answer: Key producer throughput tunings:
| Config | Default | High-throughput Value |
|---|---|---|
batch.size |
16 KB | 65536 (64 KB) or higher |
linger.ms |
0 | 10-50 ms |
compression.type |
none | lz4 or snappy |
buffer.memory |
32 MB | 128 MB+ |
acks |
all |
1 (if durability trade-off acceptable) |
linger.ms > 0 causes the producer to wait briefly before sending, allowing more messages to batch together. Combined with larger batch.size, this dramatically increases throughput at a small latency cost.
Consumer throughput:
- Increase
fetch.min.bytesandfetch.max.wait.msto reduce fetch round-trips. - Increase
max.poll.records. - Scale consumers up to partition count.
Q30 — How would you design a multi-tenant Kafka setup where tenant data must be isolated? senior
Answer: Three main strategies:
1. Topic-per-tenant:
- Topic naming:
tenantA.order.created,tenantB.order.created - Full isolation, simple ACLs.
- Scaling problem: 1,000 tenants × 10 topics = 10,000 topics → ZooKeeper/KRaft overhead.
2. Partition-per-tenant:
- Single topic with partition key = tenantId.
- Consumers filter by tenant header.
- Limited to partition count tenants; no ACL isolation.
3. Kafka cluster-per-tenant (dedicated cluster):
- Full isolation: storage, throughput, security.
- Highest cost and operational burden.
- Required for strict data residency compliance (GDPR, HIPAA).
Recommended (SaaS):
- Small/medium tenants: shared topics with tenant key routing + ACL filtering.
- Large/enterprise tenants: dedicated topics or clusters.
- Use Kafka's built-in ACLs (
kafka-acls.sh) to restrict consumer group access per topic.
This system: Single-tenant. No isolation needed. Would use topic-per-tenant if extended to multi-tenant SaaS.
Q31 — What is Kafka Connect and when would you use it? junior
Answer: Kafka Connect is a framework for streaming data between Kafka and external systems (databases, object stores, search engines, etc.) using reusable connectors — without writing custom producer/consumer code.
Two connector types:
- Source connector: Reads from an external system → publishes to Kafka topic (e.g., Debezium reads PostgreSQL WAL →
order.eventstopic). - Sink connector: Reads from Kafka topic → writes to external system (e.g., Elasticsearch Sink Connector → search index).
// Example: Debezium PostgreSQL source connector config
{
"name": "orders-cdc-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "postgres",
"database.port": "5432",
"database.dbname": "orders_db",
"database.server.name": "orders",
"table.include.list": "public.orders",
"plugin.name": "pgoutput"
}
}
When to use: CDC pipelines, ETL into data warehouses, syncing Kafka topics to Elasticsearch for search. Avoids writing boilerplate producer/consumer code for standard integrations.
This system: Not used directly, but Debezium Connect would be the natural choice to replace the manual OutboxPublisher polling with CDC-based outbox relay.
Q32 — How do you choose the right number of partitions for a topic? senior
Answer: Partitions determine parallelism and throughput. Over-partitioning wastes resources; under-partitioning limits throughput.
Formula starting point:
partitions = max(target_throughput / producer_throughput_per_partition,
target_throughput / consumer_throughput_per_partition)
Practical guidelines:
- Start with
partitions = max_parallelism_needed(e.g., number of consumer instances you plan to run). - A single partition typically supports 10–50 MB/s write throughput depending on hardware.
- For low-volume topics: 1–3 partitions is fine.
- For high-volume topics: start at 6–12, benchmark, then scale.
Factors to consider:
| Factor | Impact |
|---|---|
| Target consumer parallelism | 1 partition per consumer max |
| Message ordering requirements | More partitions → harder to maintain global order |
| Replication overhead | partitions × replication.factor log files on each broker |
| Retention size | Each partition is stored separately |
Warning: You can increase partitions later, but it changes the key-to-partition mapping. This breaks ordering guarantees for key-based producers. Plan conservatively upfront for order-sensitive topics.
This system: order.created uses 3 partitions — matching the 3 payment-service pods. If we scale to 6 pods, we'd increase to 6 partitions during a maintenance window with consumer group re-assignment.
Q33 — What is the difference between key-based partitioning and round-robin? junior
Answer:
- Key-based (default): The producer hashes the message key using
murmur2(key) % numPartitions. All messages with the same key go to the same partition, guaranteeing ordering per key. - Round-robin (
nullkey orRoundRobinPartitioner): Messages are distributed evenly across partitions in rotation. No ordering guarantee across messages.
// Key-based — orderId ensures all events for the same order go to the same partition
ProducerRecord<String, OrderCreatedEvent> record =
new ProducerRecord<>("order.created", order.getId().toString(), event);
// Round-robin — no key
ProducerRecord<String, LogEvent> logRecord =
new ProducerRecord<>("app.logs", null, logEvent);
When to use round-robin: Logs, metrics, or any data where ordering doesn't matter and you want maximum throughput distribution.
When to use key-based: Domain events where processing order matters (e.g., all events for orderId=123 must process in sequence: created → paid → shipped).
This system: order.created, payment.processed, and order.status.updated all use orderId as the key.
Q34 — How does Kafka handle message compression and what are the tradeoffs? junior
Answer: Kafka supports producer-side compression. The producer compresses a batch of messages; the broker stores the compressed batch; consumers decompress.
Supported codecs:
| Codec | Compression Ratio | CPU Cost | Best For |
|---|---|---|---|
none |
1× | None | Low-volume or pre-compressed data |
gzip |
4–8× | High | High ratio needed, CPU available |
snappy |
2–4× | Low | Balanced (Google's default) |
lz4 |
2–4× | Very low | Lowest latency |
zstd |
4–8× | Medium | Best ratio/speed tradeoff (Kafka 2.1+) |
# Spring Boot producer config
spring:
kafka:
producer:
compression-type: lz4
batch-size: 65536 # 64 KB batch
linger-ms: 10 # wait up to 10ms to fill batch
Key insight: Compression works best with larger batches (linger.ms > 0). Compressing single tiny messages has overhead without benefit.
This system: Uses lz4 for Avro-serialized order events. Avro is already compact binary, so compression gain is modest (~20%), but latency savings from smaller network payloads justify it.
Q35 — What are Kafka message headers and how are they used? junior
Answer: Kafka headers are key-value metadata attached to a message, separate from the message key and value. They don't affect partitioning and are stored as byte arrays.
Common uses:
- Tracing: propagate
X-Correlation-ID,X-Trace-IDacross services. - Retry metadata:
@RetryableTopicuses headers likekafka_original-topic,kafka_exception-message,kafka_backoff-timestamp. - Schema version: custom schema version indicator.
- Source system: identify which service produced the message.
// Producer: add correlation ID header
@Autowired KafkaTemplate<String, OrderCreatedEvent> kafkaTemplate;
public void publish(OrderCreatedEvent event, String correlationId) {
var record = new ProducerRecord<>("order.created", event.getOrderId(), event);
record.headers().add("X-Correlation-ID", correlationId.getBytes(StandardCharsets.UTF_8));
kafkaTemplate.send(record);
}
// Consumer: read header
@KafkaListener(topics = "order.created", groupId = "payment-processors")
public void handle(
OrderCreatedEvent event,
@Header(value = "X-Correlation-ID", required = false) byte[] correlationIdBytes
) {
String correlationId = correlationIdBytes != null
? new String(correlationIdBytes, StandardCharsets.UTF_8)
: UUID.randomUUID().toString();
MDC.put("correlationId", correlationId);
// process...
}
This system: Propagates X-Correlation-ID from the HTTP request through Kafka events to enable distributed tracing across order → payment → notification flow.
Q36 — How do you produce messages with Spring Boot's KafkaTemplate? junior
Answer:
KafkaTemplate is Spring Kafka's high-level producer abstraction that wraps the native Kafka Producer. It handles serialization, threading, and async callbacks.
@Service
public class OrderEventPublisher {
private final KafkaTemplate<String, OrderCreatedEvent> kafkaTemplate;
public OrderEventPublisher(KafkaTemplate<String, OrderCreatedEvent> kafkaTemplate) {
this.kafkaTemplate = kafkaTemplate;
}
public void publish(OrderCreatedEvent event) {
CompletableFuture<SendResult<String, OrderCreatedEvent>> future =
kafkaTemplate.send("order.created", event.getOrderId(), event);
future.whenComplete((result, ex) -> {
if (ex != null) {
log.error("Failed to publish order event: orderId={}", event.getOrderId(), ex);
} else {
RecordMetadata meta = result.getRecordMetadata();
log.info("Published to partition={} offset={}",
meta.partition(), meta.offset());
}
});
}
}
# application.yml
spring:
kafka:
producer:
bootstrap-servers: localhost:9092
key-serializer: org.apache.kafka.common.serialization.StringSerializer
value-serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
acks: all
retries: 3
enable-idempotence: true
properties:
schema.registry.url: http://localhost:8081
Key methods:
kafkaTemplate.send(topic, key, value)— async send, returnsCompletableFuture.kafkaTemplate.sendDefault(key, value)— sends to the configured default topic.kafkaTemplate.executeInTransaction(ops -> ...)— transactional batch send.
Q37 — What is the purpose of fetch.min.bytes and fetch.max.wait.ms? senior
Answer: These two configs control how the broker responds to consumer fetch requests, trading latency vs. throughput:
fetch.min.bytes(default: 1 byte): The broker waits until it has at least this many bytes of data before responding to a fetch request. Increasing it allows the broker to batch more data per response, reducing fetch round-trips.fetch.max.wait.ms(default: 500 ms): The maximum time the broker will wait iffetch.min.byteshasn't been satisfied yet. Acts as a ceiling so consumers don't wait forever on low-volume topics.
Consumer sends FETCH request
→ Broker checks: do I have ≥ fetch.min.bytes?
YES → respond immediately
NO → wait up to fetch.max.wait.ms, then respond with what's available
Tuning for throughput (high-volume topics):
spring:
kafka:
consumer:
fetch-min-size: 102400 # 100 KB — wait for 100 KB before responding
fetch-max-wait: 500 # but no more than 500ms
max-poll-records: 500
Tuning for low latency (real-time alerting):
spring:
kafka:
consumer:
fetch-min-size: 1 # respond immediately with any data
fetch-max-wait: 50 # max 50ms wait
This system: Notification service uses low fetch-min-size=1 for near-real-time delivery of order.status.updated events.
Q38 — What is the __consumer_offsets internal topic? senior
Answer:
__consumer_offsets is a Kafka-internal compacted topic that stores committed consumer group offsets. Before Kafka 0.9, offsets were stored in ZooKeeper; since 0.9, they're stored in this topic for better scalability and availability.
Structure:
- Key:
(group_id, topic, partition)— identifies a specific consumer group's position in a partition. - Value:
offset + metadata + timestamp— the committed offset and optional metadata string.
50 partitions by default (controlled by offsets.topic.num.partitions). Each consumer group is assigned to a partition of __consumer_offsets based on hash(group_id) % 50.
How to inspect (debugging):
# List all consumer groups
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
# Describe a group (shows committed offsets, lag, and assigned members)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group payment-processors
# Output:
# GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
# payment-processors order.created 0 1523 1523 0
# payment-processors order.created 1 1201 1205 4 ← lag!
This system: Used for monitoring. A lag > 0 alert on payment-processors group triggers investigation into payment-service pod health.
Q39 — How do you implement consumer pause and resume for rate limiting? senior
Answer:
Kafka consumers support pause(partitions) and resume(partitions) to temporarily stop fetching from specific partitions without leaving the consumer group or triggering a rebalance.
Use cases:
- Rate limit downstream services (e.g., external payment gateway has a 100 req/s limit).
- Circuit breaker: pause consumption when downstream is degraded.
- Backpressure from a slow database write path.
@Service
public class RateLimitedPaymentConsumer {
private final RateLimiter rateLimiter = RateLimiter.create(100); // 100 events/sec
@KafkaListener(topics = "order.created", groupId = "payment-processors",
containerFactory = "kafkaListenerContainerFactory")
public void handle(OrderCreatedEvent event,
Acknowledgment ack,
Consumer<?, ?> consumer) {
// Pause if rate limit exceeded
if (!rateLimiter.tryAcquire()) {
Collection<TopicPartition> assigned = consumer.assignment();
consumer.pause(assigned);
// Resume after a short delay (Spring will call poll() again)
// In practice, use a scheduled task or AOP-based circuit breaker
}
processPayment(event);
ack.acknowledge();
// Resume paused partitions after processing
if (!consumer.paused().isEmpty()) {
consumer.resume(consumer.paused());
}
}
}
Spring approach: Use KafkaListenerEndpointRegistry to pause/resume the entire listener container:
@Autowired KafkaListenerEndpointRegistry registry;
// Pause
registry.getListenerContainer("paymentListener").pause();
// Resume
registry.getListenerContainer("paymentListener").resume();
Q40 — What is static group membership (group.instance.id) and why does it matter? senior
Answer:
By default, Kafka consumers have dynamic membership: each poll() session gets a new member ID, and any disconnect (pod restart, GC pause) triggers a rebalance.
Static membership (group.instance.id) assigns a stable identity to a consumer. When a static member disconnects and reconnects within session.timeout.ms, Kafka recognizes it by instance ID and skips the rebalance, reassigning the same partitions.
# application.yml — give each pod a stable instance ID
spring:
kafka:
consumer:
group-instance-id: payment-processor-pod-${HOSTNAME}
session-timeout-ms: 60000 # give pod 60s to restart before rebalance
Benefits:
- Eliminate unnecessary rebalances during rolling deployments (Kubernetes pod restarts).
- Reduce consumer lag spikes — no stop-the-world rebalance during deploys.
- State preservation — Kafka Streams benefits most: local state stores stay with the same instance ID.
Tradeoff: If a pod dies permanently (not just restarting), its partitions won't be reassigned until session.timeout.ms expires. Set it shorter than your deployment window.
This system: payment-service pods in Kubernetes use HOSTNAME-based group.instance.id. A rolling deploy of 3 pods takes ~90 seconds but causes zero rebalances.
Q41 — How do you handle large messages in Kafka? senior
Answer:
Kafka's default max.message.bytes is 1 MB. For larger payloads, you have three strategies:
1. Increase message size limits (simplest, not recommended past ~10 MB):
# Broker topic config
log.message.max.bytes=10485760 # 10 MB
# Producer
spring.kafka.producer.properties.max.request.size=10485760
# Consumer
spring.kafka.consumer.properties.fetch.message.max.bytes=10485760
2. External reference pattern (recommended for large blobs): Store the large payload in object storage (S3, MinIO) and send only a reference in the Kafka message:
// Producer
String s3Key = "orders/large-payload/" + orderId + ".json";
s3Client.putObject(bucket, s3Key, largePayload);
LargeOrderEvent event = LargeOrderEvent.newBuilder()
.setOrderId(orderId)
.setPayloadRef(s3Key) // reference, not data
.build();
kafkaTemplate.send("order.large", orderId, event);
// Consumer
LargeOrderEvent event = ...;
String payload = s3Client.getObject(bucket, event.getPayloadRef());
3. Kafka chunking (complex, avoid if possible): Split the message into multiple Kafka records with sequence headers, reassemble on the consumer side.
This system: Order events are small Avro records (~200 bytes). If image uploads were included, the external reference pattern would be used with MinIO.
Q42 — What is the difference between subscribe() and assign() in a Kafka consumer? junior
Answer:
subscribe() |
assign() |
|
|---|---|---|
| Usage | consumer.subscribe(List.of("order.created")) |
consumer.assign(List.of(new TopicPartition("order.created", 0))) |
| Partition assignment | Kafka Group Coordinator manages it | Manual — you specify exact partitions |
| Rebalance | Triggered automatically on membership changes | No rebalancing — you own the assignment |
| Consumer group | Participates in a group | No group coordination (no offset commit to broker group) |
| Offset management | Group offsets via commitSync/commitAsync |
Manual or commitSync with assigned partitions |
When to use assign():
- Exactly-once replay of specific partitions for testing or backfill.
- Custom partition assignment strategies outside Kafka's group protocol.
- Reading a partition at a specific offset for debugging.
// Direct partition assignment for reprocessing partition 0 from offset 500
TopicPartition tp = new TopicPartition("order.created", 0);
consumer.assign(List.of(tp));
consumer.seek(tp, 500L); // start from specific offset
while (true) {
ConsumerRecords<String, OrderCreatedEvent> records = consumer.poll(Duration.ofMillis(100));
// process...
}
Spring context: @KafkaListener always uses subscribe() internally. For assign() semantics, use KafkaConsumer directly.
Q43 — How does Spring Kafka's ConcurrentMessageListenerContainer achieve parallelism? senior
Answer:
ConcurrentMessageListenerContainer spins up N internal KafkaMessageListenerContainer instances, each running its own poll loop in a dedicated thread. Each thread is an independent consumer in the same consumer group.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, OrderCreatedEvent>
kafkaListenerContainerFactory(ConsumerFactory<String, OrderCreatedEvent> consumerFactory) {
ConcurrentKafkaListenerContainerFactory<String, OrderCreatedEvent> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory);
factory.setConcurrency(3); // 3 threads = 3 consumers in the group
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
return factory;
}
Or via annotation:
@KafkaListener(
topics = "order.created",
groupId = "payment-processors",
concurrency = "3" // overrides factory default for this listener
)
public void handle(OrderCreatedEvent event, Acknowledgment ack) { ... }
Thread safety: Each thread has its own KafkaConsumer instance (not thread-safe). Do NOT share a single KafkaConsumer across threads.
Partition assignment: With concurrency=3 and 3 partitions, each thread gets 1 partition. With concurrency=6 and 3 partitions, 3 threads are idle.
This system: payment-service uses concurrency=3 matching the 3 partitions of order.created. All 3 pods × 3 threads = 9 consumers, but only 3 partitions means 6 consumers are idle. Correct configuration: 1 thread per pod = 3 pods = 3 consumers = 1 per partition.
Q44 — What is Kafka MirrorMaker 2 and how does it enable cross-datacenter replication? senior
Answer: MirrorMaker 2 (MM2) is a Kafka Connect-based tool for replicating topics between Kafka clusters. It's the production replacement for the original MirrorMaker.
Architecture:
Primary DC (us-east) Secondary DC (eu-west)
┌─────────────────┐ ┌─────────────────────────┐
│ Kafka Cluster A │──MM2────▶│ Kafka Cluster B │
│ order.created │ │ us-east.order.created │
│ payment.processed│ │ us-east.payment.processed│
└─────────────────┘ └─────────────────────────┘
MM2 prefixes replicated topics with the source cluster alias (us-east.) to avoid naming conflicts. It replicates:
- Topic data (messages + partitioning)
- Consumer group offsets (translated to the target cluster's offset space)
- Topic configurations
Use cases:
- Disaster recovery: Failover to secondary cluster on primary outage.
- Geo-replication: Serve EU consumers from EU cluster to reduce latency.
- Data aggregation: Aggregate events from multiple regional clusters into a global cluster for analytics.
Consumer offset sync: MM2 translates offsets using a checkpointing mechanism, enabling consumer groups to resume on the secondary cluster with minimal lag after failover.
This system: Single-region Docker setup. MM2 would be added if deployed to multi-region production (primary: us-east-1, DR: us-west-2).
Q45 — How do you implement message filtering in a Kafka consumer without creating separate topics? junior
Answer: Two approaches for filtering without extra topics:
1. RecordFilterStrategy (Spring Kafka):
Configures a filter at the container level. Filtered records are acknowledged but not passed to the listener method.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, OrderCreatedEvent>
filteredListenerContainerFactory(ConsumerFactory<String, OrderCreatedEvent> cf) {
var factory = new ConcurrentKafkaListenerContainerFactory<String, OrderCreatedEvent>();
factory.setConsumerFactory(cf);
// Filter: only process HIGH_VALUE orders (> $500)
factory.setRecordFilterStrategy(record ->
record.value().getTotalAmount().compareTo(new BigDecimal("500")) < 0
);
factory.setAckDiscarded(true); // auto-ack filtered messages
return factory;
}
2. Header-based routing in the listener method:
@KafkaListener(topics = "order.created", groupId = "fraud-detectors")
public void handle(
OrderCreatedEvent event,
@Header("order-region") String region
) {
if (!"EU".equals(region)) {
return; // skip non-EU orders silently
}
runFraudCheck(event);
}
Tradeoff: Consumer still fetches all messages from Kafka — filtering is application-level. For high-volume topics where only a small fraction is relevant, consider a dedicated filtered topic (produced by a streaming processor) to reduce consumer load.
Q46 — What are the tradeoffs between Avro, Protobuf, and JSON Schema with Schema Registry? senior
Answer:
| Avro | Protobuf | JSON Schema | |
|---|---|---|---|
| Wire format | Binary | Binary | JSON text |
| Schema language | JSON | .proto IDL |
JSON Schema |
| Size | Smallest | Smallest | Largest (5–10×) |
| Speed | Fast | Fastest | Slowest |
| Language support | Wide | Widest | Universal |
| Schema evolution | Backward/forward with defaults | Field number-based, robust | Limited enforcement |
| Tooling | Strong with Confluent | Strong everywhere | Basic |
| Human readable | No | No | Yes |
When to choose:
- Avro: Default choice for Kafka with Confluent Schema Registry. Best schema enforcement + compact binary. Ideal for internal services.
- Protobuf: When you already use gRPC/Protobuf for service communication or need maximum performance. Field number-based evolution is more robust than Avro's name-based.
- JSON Schema: When consumers are external systems, browser clients, or teams that need human-readable payloads without binary tooling.
Schema Registry supports all three with the same compatibility enforcement and schema ID embedding.
This system: Uses Avro throughout internal services. If an external partner API needed to consume events, a separate JSON Schema topic or a transformation layer would be added.
Q47 — How does sendOffsetsToTransaction enable consume-transform-produce exactly-once? senior
Answer: In a read-process-write pipeline (consume from topic A, transform, produce to topic B), exactly-once requires that the consumer offset commit and the producer send are atomic — either both commit or neither does.
sendOffsetsToTransaction() atomically includes the consumer offset commit inside the producer transaction. If the transaction aborts, the offset commit is rolled back too.
@Bean
public KafkaTransactionManager<String, TransformedEvent> txManager(
ProducerFactory<String, TransformedEvent> pf) {
return new KafkaTransactionManager<>(pf);
}
// Transactional consume-transform-produce
@Transactional("kafkaTransactionManager")
@KafkaListener(topics = "order.created", groupId = "transformer-group",
containerFactory = "exactlyOnceFactory")
public void transformAndForward(
ConsumerRecord<String, OrderCreatedEvent> record,
@Autowired KafkaTemplate<String, TransformedEvent> producerTemplate
) {
TransformedEvent transformed = transform(record.value());
// Within the same transaction: send + offset commit
producerTemplate.send("order.transformed", record.key(), transformed);
// sendOffsetsToTransaction is called automatically by Spring
// when using ChainedKafkaTransactionManager or TransactionalKafkaMessageListenerContainer
}
Key configs:
spring:
kafka:
producer:
transaction-id-prefix: "transformer-txn-"
consumer:
isolation-level: read_committed # don't read uncommitted transactional messages
listener:
ack-mode: MANUAL # Spring manages offset commit inside transaction
This system: Not implemented (payment-service is not a consume-transform-produce pipeline). Would be applicable if building a stream processor that enriches order.created before forwarding to order.enriched.
Q48 — What is consumer group heartbeat and how do session.timeout.ms and heartbeat.interval.ms interact? junior
Answer:
A Kafka consumer sends periodic heartbeats to the Group Coordinator broker to signal it is alive. If the coordinator doesn't receive a heartbeat within session.timeout.ms, it removes the consumer from the group and triggers a rebalance.
heartbeat.interval.ms(default: 3000 ms): How often the consumer sends a heartbeat. Must be< session.timeout.ms / 3.session.timeout.ms(default: 45000 ms): How long the coordinator waits for a heartbeat before declaring the consumer dead.
Consumer ──heartbeat──▶ Coordinator (every heartbeat.interval.ms)
Coordinator tracks last-seen timestamp per consumer
If now - last_seen > session.timeout.ms → consumer presumed dead → rebalance
Heartbeats vs. max.poll.interval.ms:
Heartbeats are sent by a background thread — a consumer can be heartbeating but still trigger a rebalance if it doesn't call poll() within max.poll.interval.ms. These are two independent liveness checks:
session.timeout.ms/heartbeat.interval.ms: Network/process liveness.max.poll.interval.ms: Processing liveness (consumer isn't stuck in business logic).
Recommended configuration:
spring:
kafka:
consumer:
heartbeat-interval: 3000 # 3s heartbeat
session-timeout-ms: 15000 # 15s session timeout (detect fast failures)
properties:
max.poll.interval.ms: 300000 # 5 min for slow processing
Q49 — How would you perform a zero-downtime Kafka topic partition increase? senior
Answer: Increasing partitions is a one-way operation — you cannot reduce them without recreating the topic. It also breaks key-to-partition mapping for existing keys, which can violate ordering guarantees.
Procedure for zero-downtime:
Step 1: Increase partitions during a low-traffic window:
kafka-topics.sh --bootstrap-server localhost:9092 \
--alter --topic order.created \
--partitions 6
Step 2: Wait for all existing messages in old partitions (0–2) to be consumed. Check lag:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group payment-processors
Step 3: Scale consumers to match new partition count (e.g., from 3 to 6 pods).
Step 4: Verify new partitions (3–5) are being assigned and consumed.
Ordering caveat: After the change, murmur2("orderId-123") % 6 maps to a different partition than murmur2("orderId-123") % 3. In-flight messages for a key may be split across the old and new partition. Strategy:
- Pre-drain: Ensure all in-flight messages complete before routing new ones to new partitions.
- Sticky partitioner: Continue routing existing key traffic to old partitions temporarily using a custom partitioner.
- Accept brief reordering: For non-critical ordering requirements.
This system: order.created partition increase from 3 → 6 would require a maintenance window to drain existing messages and update the payment-service deployment count.
Q50 — How does Kafka's isolation.level protect consumers in transactional topics? senior
Answer:
isolation.level controls whether consumers read messages from transactional producers before those transactions are committed.
read_uncommitted (default):
- Consumers see all messages immediately as they are appended, including those from in-progress or subsequently aborted transactions.
- Risk: consumer processes a message from an aborted transaction (a "dirty read").
read_committed:
- Consumers only see messages from committed transactions AND non-transactional messages.
- Aborted transaction messages are never visible.
- Consumer may see a read gap: the consumer pauses at the Last Stable Offset (LSO) — the offset up to which all transactions are decided — until in-flight transactions commit or abort.
# Consumer config for exactly-once processing
spring:
kafka:
consumer:
properties:
isolation.level: read_committed
Last Stable Offset (LSO) vs. High Watermark (HW):
Partition log:
offset 0: msg (non-transactional) ✓ visible to all
offset 1: msg (txn A - committed) ✓ visible to read_committed
offset 2: msg (txn B - in-flight) ✗ read_committed pauses here (LSO=2)
offset 3: msg (txn A - committed) ✓ becomes visible after txn B resolves
This system: Payment and notification consumers use read_uncommitted (default) since order events are not published transactionally. If the Outbox Publisher were upgraded to use Kafka transactions, consumers would switch to read_committed to avoid processing rolled-back order records.