Microservices Architecture — Interview Questions

Stack context: This system is a six-service ecommerce platform (api-gateway, user-service, product-service, order-service, payment-service, notification-service) using event-driven architecture with Apache Kafka. Key patterns implemented: transactional outbox, saga (choreography), DLT (dead-letter topic), API gateway, session-based auth, circuit breaker, rate limiting, and distributed tracing.


Q1 — What is microservices architecture and what problems does it solve? junior

Answer: Microservices is an architectural style where an application is decomposed into small, independently deployable services, each focused on a specific business capability.

Problems it solves:

Trade-offs introduced:


Q2 — What is the Transactional Outbox Pattern and why is it needed? junior

Answer: The Transactional Outbox Pattern solves the "dual write" problem: how to atomically update a database AND publish an event to Kafka.

The problem: A network failure after DB commit but before Kafka publish leaves the system inconsistent — order saved but no event sent, payment never processed.

The solution:

  1. Write the business entity (order) AND an outbox event record to the same database in one transaction.
  2. A separate poller reads unprocessed outbox events and publishes them to Kafka.
  3. On successful Kafka publish, mark the outbox event as processed.

DB Transaction:
  INSERT INTO orders (id, ...) VALUES (...)
  INSERT INTO outbox_events (id, event_type, payload, status) VALUES (...)
  
  -- Committed atomically. If Kafka is down, the outbox has the event.

OutboxPoller (scheduled):
  SELECT * FROM outbox_events WHERE status = 'PENDING'
  FOR EACH event:
    kafkaTemplate.send(event.topic, event.payload)
    UPDATE outbox_events SET status = 'PUBLISHED' WHERE id = event.id

This system: order-service implements the outbox pattern. The OrderCreatedEvent is written to the outbox table within the same transaction that creates the Order entity, ensuring at-least-once delivery to Kafka.
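
The poller's contract (mark an event as published only after the send succeeds) can be sketched in plain Java. This is a minimal in-memory illustration, not the order-service code: `OutboxSketch`, its list-backed "table", and the `publisher` callback are hypothetical stand-ins for the real outbox table and Kafka producer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory stand-in for the outbox table (hypothetical, for illustration only). */
final class OutboxSketch {
    enum Status { PENDING, PUBLISHED }

    static final class OutboxEvent {
        final String id;
        final String payload;
        Status status = Status.PENDING;

        OutboxEvent(String id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    private final List<OutboxEvent> table = new ArrayList<>();

    /** Stands in for the INSERT that commits together with the order row. */
    void saveInSameTransaction(String id, String payload) {
        table.add(new OutboxEvent(id, payload));
    }

    /** One poll cycle: publish each PENDING event and mark it PUBLISHED only after the send succeeds. */
    int pollOnce(Consumer<OutboxEvent> publisher) {
        int published = 0;
        for (OutboxEvent event : table) {
            if (event.status != Status.PENDING) {
                continue;
            }
            try {
                publisher.accept(event);          // may throw if the broker is unreachable
                event.status = Status.PUBLISHED;
                published++;
            } catch (RuntimeException e) {
                // Leave the row PENDING so the next cycle retries it.
            }
        }
        return published;
    }

    long pendingCount() {
        return table.stream().filter(e -> e.status == Status.PENDING).count();
    }
}
```

A crash between the send and the status update would republish the event on the next cycle, which is why this gives at-least-once rather than exactly-once delivery.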


Q3 — What is the Saga Pattern and when is it used? junior

Answer: The Saga Pattern manages long-running business transactions that span multiple services without a distributed transaction (2PC). Each service executes a local transaction and publishes an event. If a step fails, compensating transactions undo previous steps.

Two implementations:

Choreography (event-driven, this system):

order-service: ORDER_CREATED event →
payment-service: processes payment → PAYMENT_PROCESSED event →
order-service: updates order to CONFIRMED →
notification-service: sends confirmation email

Orchestration (centralized):

Compensation: If payment fails:

payment-service: PAYMENT_FAILED event →
order-service: updates order to FAILED (compensation) →
product-service: releases reserved stock (compensation) →
notification-service: sends failure notification
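
As a rough sketch of how order-service might react to payment events in a choreographed saga, the class below models the order as a small state machine. The class name and the emitted event names (`ORDER_CONFIRMED`, `ORDER_FAILED`) are assumptions made for illustration; the source does not name them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of order-service's side of the saga. */
final class OrderSagaSketch {
    enum OrderStatus { PENDING, CONFIRMED, FAILED }
    enum Event { PAYMENT_PROCESSED, PAYMENT_FAILED }

    private OrderStatus status = OrderStatus.PENDING;
    private final List<String> emitted = new ArrayList<>();

    /** Applies a payment event to the order and records the follow-up event it would publish. */
    void on(Event event) {
        if (status != OrderStatus.PENDING) {
            return; // already settled: ignore late or duplicate events
        }
        switch (event) {
            case PAYMENT_PROCESSED -> {
                status = OrderStatus.CONFIRMED;
                emitted.add("ORDER_CONFIRMED");
            }
            case PAYMENT_FAILED -> {
                status = OrderStatus.FAILED;      // compensating local transaction
                emitted.add("ORDER_FAILED");      // lets other services release stock and notify
            }
        }
    }

    OrderStatus status() {
        return status;
    }

    List<String> emitted() {
        return List.copyOf(emitted);
    }
}
```

Ignoring events once the order has left PENDING keeps the handler safe under redelivery, which matters because each step is only a local transaction.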

Q4 — What is a Dead Letter Topic (DLT) and how is it used? junior

Answer: A Dead Letter Topic (DLT) is a Kafka topic where messages that cannot be processed after all retry attempts are sent. It prevents poison messages from blocking normal message processing.

Flow:

order.created → consumer processes → fails
                → retry 1 (100ms wait) → fails
                → retry 2 (200ms wait) → fails
                → retry 3 (400ms wait) → fails (4th attempt total)
                → send to order.created.DLT

Spring Kafka @RetryableTopic:

@RetryableTopic(
    attempts = "4",
    backoff = @Backoff(delay = 100, multiplier = 2.0),
    dltTopicSuffix = ".DLT"
)
@KafkaListener(topics = "order.created")
public void processOrder(OrderCreatedEvent event) {
    // On 4th failure, message goes to order.created.DLT
}

DLT monitoring: Operators monitor the DLT. A non-empty DLT triggers an alert. Messages are analyzed, root cause fixed, and messages replayed manually or via an automated dead letter processor.

This system: order.created.DLT collects failed order events for manual review and replay.
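
The retry-then-dead-letter flow can be illustrated without Kafka. The sketch below is a simplified in-process loop with hypothetical names; Spring Kafka's @RetryableTopic achieves the same outcome with separate retry topics rather than a loop, and a real consumer would actually wait between attempts instead of only recording the delay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical in-memory model of retry with exponential backoff and a dead-letter list. */
final class DeadLetterSketch {
    final List<String> deadLetters = new ArrayList<>();
    final List<Long> waitsMs = new ArrayList<>();

    /** Tries the handler up to maxAttempts times; after the final failure the message is dead-lettered. */
    void consume(String message, Consumer<String> handler,
                 int maxAttempts, long delayMs, double multiplier) {
        long wait = delayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return;                           // processed successfully
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    deadLetters.add(message);     // retries exhausted
                } else {
                    waitsMs.add(wait);            // delay that would precede the next attempt
                    wait = (long) (wait * multiplier);
                }
            }
        }
    }
}
```

With four attempts, a 100 ms initial delay, and a multiplier of 2, a message that always fails records waits of 100, 200, and 400 ms before landing in the dead-letter list, matching the flow shown above.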


Q5 — What is the API Gateway Pattern and what responsibilities does it have? junior

Answer: An API Gateway is a single entry point for all client requests. It handles cross-cutting concerns that would otherwise be duplicated across all services.

Responsibilities:

Concern | Without Gateway | With Gateway
Authentication | Each service validates JWT | Gateway validates once, injects headers
Rate limiting | Each service implements it | Gateway enforces per-user limits
Routing | Client knows all service URLs | Client talks only to gateway
SSL termination | Each service needs certificate | Gateway handles SSL
CORS | Each service configures CORS | Gateway handles CORS
Request tracing | Each service adds trace ID | Gateway injects correlation ID

This system (api-gateway on port 8080):


Q6 — What is eventual consistency and how do you handle it in microservices? senior

Answer: Eventual consistency means that after a series of updates, all replicas/services will eventually reach the same state — but there's no guarantee of immediate consistency.

In this system: After order placement:

  1. order-service creates order with status PENDING.
  2. payment-service processes payment (may take 100ms-10s).
  3. order-service updates order to CONFIRMED.

Between steps 1 and 3, a user reading their order status sees PENDING. This is eventual consistency.

Handling strategies:

CAP theorem: When a network partition occurs, a distributed system must choose between Consistency and Availability; it cannot guarantee both. Microservices typically favor Availability under partition (AP), accepting eventual consistency.


Q7 — What is idempotency and how do you implement it? senior

Answer: An operation is idempotent if calling it multiple times with the same input produces the same result as calling it once. Essential for safe retries in event-driven systems.

Why needed: With the usual consumer setup (offsets committed after processing), Kafka provides at-least-once delivery, so the same OrderCreatedEvent may be delivered multiple times. Payment should only be processed once.

Implementation strategies:

1. Idempotency key in DB (most reliable):

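@Entity
public class ProcessedEvent {
    @Id
    private String eventId;     // must be unique per event, e.g. a UUID carried in the event
    private Instant processedAt;

    protected ProcessedEvent() {}  // required by JPA

    public ProcessedEvent(String eventId) {
        this.eventId = eventId;
        this.processedAt = Instant.now();
    }
}

@Transactional
public void processPayment(PaymentRequest request) {
    if (processedEventRepo.existsById(request.getEventId())) {
        return;  // duplicate: skip
    }
    // If two deliveries race past the check, the primary key rejects the second insert
    // and that transaction rolls back.
    processedEventRepo.save(new ProcessedEvent(request.getEventId()));
    // ... process payment
}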

2. Unique constraint in DB:

CREATE UNIQUE INDEX idx_payment_order ON payments(order_id);
-- Second attempt to insert payment for same order_id throws exception → rollback → safe

3. Redis idempotency (shorter TTL, distributed):

Boolean firstTime = redisTemplate.opsForValue()
    .setIfAbsent("processed:" + eventId, "1", Duration.ofHours(24));
if (Boolean.FALSE.equals(firstTime)) return; // already processed
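
All three strategies share one idea: record the event ID atomically and act only on the first successful insert. The following plain-Java sketch shows that idea with an in-memory set; `IdempotentPaymentHandler` is a hypothetical name, and in a real service the set would be a database table or Redis key written in the same transaction as the payment.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory idempotent handler: each event ID takes effect at most once. */
final class IdempotentPaymentHandler {
    private final Set<String> processedEventIds = new HashSet<>();
    private int paymentsMade = 0;

    /** Returns true if the payment was executed, false if the event was a duplicate. */
    synchronized boolean handle(String eventId) {
        if (!processedEventIds.add(eventId)) {
            return false;                 // add() returned false: ID already recorded
        }
        paymentsMade++;                   // stands in for the real payment call
        return true;
    }

    synchronized int paymentsMade() {
        return paymentsMade;
    }
}
```

Redelivering the same event ID any number of times leaves `paymentsMade()` unchanged after the first call.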

Q8 — What is the Circuit Breaker Pattern and how does it prevent cascading failures? senior

Answer: The Circuit Breaker prevents a system from repeatedly calling a failing service, allowing it time to recover.

States:

CLOSED → OPEN (after N failures or X% failure rate)
        → HALF_OPEN (after wait duration) → probes with limited requests
        → CLOSED (if probe succeeds) or OPEN (if probe fails)

Resilience4j in Spring Boot:

@CircuitBreaker(name = "payment-service", fallbackMethod = "paymentFallback")
public PaymentResult processPayment(PaymentRequest request) {
    return paymentClient.process(request);
}

public PaymentResult paymentFallback(PaymentRequest request, Exception e) {
    log.warn("Circuit open for payment-service: {}", e.getMessage());
    return PaymentResult.pending("PAYMENT_SERVICE_UNAVAILABLE");
}

resilience4j.circuitbreaker.instances.payment-service:
  slidingWindowSize: 10
  failureRateThreshold: 50        # open when 50% of 10 requests fail
  waitDurationInOpenState: 10s
  permittedNumberOfCallsInHalfOpenState: 3

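Cascading failure scenario: Without a circuit breaker, payment-service is slow → order-service holds threads waiting → order-service becomes slow → gateway threads exhausted → entire system down. With a circuit breaker, once the failure threshold is reached (slow calls count toward it when timeouts or a slow-call threshold are configured), the breaker opens, calls fail fast with the fallback, and the system remains responsive.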
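
To make the state transitions concrete, here is a deliberately small circuit breaker in plain Java. It is a sketch under simplifying assumptions, not how Resilience4j is implemented: it counts consecutive failures rather than a failure rate over a sliding window, allows a single probe in HALF_OPEN, and takes the clock as a parameter so the behavior is easy to test.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Simplified circuit breaker (hypothetical): opens after N consecutive failures. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openMillis) {
                return fallback.get();            // fail fast without calling downstream
            }
            state = State.HALF_OPEN;              // wait elapsed: let one probe through
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;                 // success (or successful probe) closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;               // trip, or re-open after a failed probe
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    synchronized State state() {
        return state;
    }
}
```

While the breaker is OPEN the downstream action is never invoked, which is what frees caller threads during an outage.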


Q9 — How does service-to-service authentication work in a microservices system? senior

Answer: When service A calls service B, service B needs to verify the call is legitimate and identify the calling service.

Strategy 1: Trust gateway-injected headers (this system):

Strategy 2: Service-to-service JWT (mutual auth):

Strategy 3: mTLS (mutual TLS):

Strategy 4: API keys:

Best practice: Trust gateway-injected headers for user context (as this system does), plus mTLS or service JWTs for service identity verification.


Q10 — What is service discovery and how does it work? junior

Answer: Service discovery allows services to find each other dynamically without hardcoded IP addresses, which change frequently in containerized environments.

Two types:

Client-side discovery (Spring Cloud + Eureka):

Server-side discovery (Kubernetes / NGINX / AWS ELB):

# Spring Cloud Eureka registration
eureka:
  client:
    service-url.defaultZone: http://eureka-server:8761/eureka
  instance:
    instance-id: ${spring.application.name}:${server.port}

This system: Uses static URIs (http://order-service:8083) because Docker Compose DNS resolves service names. In Kubernetes, http://order-service.default.svc.cluster.local:8083 or just http://order-service:8083 works via K8s DNS.


Q11 — What is Bounded Context and how does it apply to microservice design? senior

Answer: Bounded Context (from Domain-Driven Design) defines the explicit boundary within which a specific domain model applies. Each microservice should ideally correspond to one bounded context.

Product in multiple contexts:

Service | What "Product" means | Fields
product-service | Catalog item | name, description, price, images, category
order-service | Ordered item (snapshot) | productId, name, price (at time of order), quantity
inventory-service | Stock unit | productId, warehouseId, quantity, location

Each service has its own Product model. The order-service's OrderItem snapshots the product price at order time; it does NOT depend on the live product price in product-service.

Why separate: If order-service called product-service for every order lookup, it would create tight coupling, network dependency, and cascade failures.

Anti-pattern: A single shared Product class imported by all services. This creates a monolith disguised as microservices.


Q12 — What is CQRS and when is it used? senior

Answer: CQRS (Command Query Responsibility Segregation) separates write (command) and read (query) operations into different models and potentially different data stores.

Structure:

// Command (write)
public class CreateOrderCommand {
    UUID productId;
    int quantity;
    UUID customerId;
}

// Query model (read, denormalized)
public record OrderSummaryView(
    UUID orderId, String customerEmail, String productName,
    BigDecimal total, OrderStatus status, Instant createdAt) {}

Event Sourcing + CQRS: Commands create events stored in an event log. Read models are projections built by replaying events.

When to use CQRS: High read/write ratio disparity, complex query requirements, multiple read models needed, event sourcing. Overkill for simple CRUD services.

This system: order-service uses a simple JPA model. CQRS would be valuable if complex order dashboards or reporting are added.


Q13 — What is the Strangler Fig Pattern for migrating a monolith to microservices? senior

Answer: The Strangler Fig Pattern gradually extracts functionality from a monolith into microservices, without a risky big-bang rewrite.

Strategy:

  1. Place a gateway/proxy in front of the monolith.
  2. Extract one bounded context (e.g., product catalog) into a new microservice.
  3. Route /products/** requests to the new microservice; all other traffic still goes to the monolith.
  4. Migrate data from the monolith DB to the new service's DB.
  5. Repeat for each bounded context until the monolith is replaced.

Phase 1: All traffic → Monolith
Phase 2: /products → ProductService; rest → Monolith
Phase 3: /products, /orders → Services; rest → Monolith (shrinking)
Phase N: Monolith retired
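
The routing rule at the heart of the pattern is small enough to sketch directly. The class below is a hypothetical illustration, not a gateway implementation: paths under an extracted prefix go to the new service, and everything else falls through to the monolith. The backend URLs used with it are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical prefix router for a strangler-fig migration. */
final class StranglerRouter {
    private final Map<String, String> extractedPrefixes = new LinkedHashMap<>();
    private final String monolithUrl;

    StranglerRouter(String monolithUrl) {
        this.monolithUrl = monolithUrl;
    }

    /** Registers a bounded context that has been extracted into its own service. */
    void extract(String pathPrefix, String serviceUrl) {
        extractedPrefixes.put(pathPrefix, serviceUrl);
    }

    /** Returns the backend for a path: an extracted service if a prefix matches, else the monolith. */
    String route(String path) {
        for (Map.Entry<String, String> entry : extractedPrefixes.entrySet()) {
            String prefix = entry.getKey();
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return entry.getValue();
            }
        }
        return monolithUrl;
    }
}
```

Each migration phase is then a single `extract(...)` call, and removing the entry rolls traffic back to the monolith.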

Challenges:


Q14 — How do you handle distributed tracing across microservices? junior

Answer: Distributed tracing tracks a single request as it flows through multiple services, allowing you to see the full execution path and latency breakdown.

Core concepts:

This system (Micrometer Tracing + Zipkin):

management:
  tracing:
    sampling:
      probability: 1.0    # trace 100% of requests (use 0.1 in production)
  zipkin:
    tracing:
      endpoint: http://zipkin:9411/api/v2/spans

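Spring Boot 3.x auto-configures trace context propagation, so when the gateway calls order-service the trace headers are added automatically. The default format is W3C Trace Context (the traceparent header). B3 propagation (the single b3 header, or the X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId multi-header form) can be selected with the management.tracing.propagation.type property.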

Zipkin UI: Shows waterfall view of spans — identify which service added latency. http://localhost:9411
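
Whatever header format carries it, the propagated context follows one rule: the trace ID stays constant across hops, and each hop gets a new span ID whose parent is the caller's span. The record below sketches that rule only; `SpanContext` is a hypothetical type, not the Micrometer Tracing API, and the ID lengths (32 hex characters for traces, 16 for spans) follow the W3C Trace Context format.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical trace context: illustrates how IDs relate across service hops. */
record SpanContext(String traceId, String spanId, String parentSpanId) {

    /** Starts a new trace, as the first service to see a request would. */
    static SpanContext newRoot() {
        return new SpanContext(randomHex16() + randomHex16(), randomHex16(), null);
    }

    /** Context for a downstream call: same trace, new span, parent set to this span. */
    SpanContext child() {
        return new SpanContext(traceId, randomHex16(), spanId);
    }

    private static String randomHex16() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }
}
```

A tracing backend rebuilds the waterfall view by grouping spans on `traceId` and linking each span to its `parentSpanId`.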


Q15 — What is health check and readiness/liveness probes in microservices? junior

Answer:

Liveness probe: Is the service running? If it fails, the container is restarted.

Readiness probe: Is the service ready to accept traffic? If it fails, the service is removed from the load balancer but not restarted.

# Kubernetes probes
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Spring Boot Actuator:

management:
  health:
    livenessstate.enabled: true
    readinessstate.enabled: true
  endpoint.health.probes.enabled: true

Custom health indicator (Redis connectivity):

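@Component
public class RedisHealthIndicator implements HealthIndicator {
    private final StringRedisTemplate redisTemplate;

    public RedisHealthIndicator(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    @Override
    public Health health() {
        try {
            // Any read proves connectivity; the key does not need to exist.
            redisTemplate.opsForValue().get("health-check");
            return Health.up().withDetail("redis", "connected").build();
        } catch (Exception e) {
            return Health.down().withDetail("redis", e.getMessage()).build();
        }
    }
}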

Q16 — What is the Sidecar Pattern in microservices? senior

Answer: The Sidecar Pattern deploys a helper process alongside the main service container, sharing the same lifecycle, network, and resources. The sidecar handles cross-cutting concerns without modifying the main application.

Common sidecars:

Sidecar | Function
Envoy / Istio proxy | mTLS, load balancing, circuit breaking, retries
Log shipper (Fluentd/Filebeat) | Collect and forward logs to centralized logging
Metrics agent (Prometheus exporter) | Expose application metrics
Config reloader | Hot-reload config without restart
Secret manager | Inject secrets from Vault/AWS SM

In Kubernetes (Service Mesh):

containers:
  - name: order-service
    image: order-service:latest
  - name: envoy-proxy         # sidecar
    image: envoy:latest
    # Intercepts all network traffic, handles mTLS, retries, etc.

Benefit: Order-service needs little or no security or observability plumbing in its own code; the sidecar handles it. The security posture can be upgraded by updating the sidecar, without changing or rebuilding the service itself.


Q17 — What is an anti-corruption layer (ACL) and when do you need one? senior

Answer: An Anti-Corruption Layer (ACL) is a translation layer between two bounded contexts with different domain models. It prevents the concepts of one context from polluting another.

Scenario: order-service needs product information from product-service, but their models differ.

// product-service model
record CatalogProduct(UUID productId, String productName, Money listPrice,
                       String categoryCode, List<String> imageUrls) {}

// order-service needs
record OrderLineProduct(UUID id, String name, BigDecimal price) {}

// Anti-corruption layer: translates between contexts
@Service
public class ProductAcl {
    private final ProductServiceClient client;

    public OrderLineProduct getForOrder(UUID productId) {
        CatalogProduct catalog = client.getProduct(productId);
        return new OrderLineProduct(
            catalog.productId(),
            catalog.productName(),
            catalog.listPrice().amount()  // extract scalar from Money value object
        );
    }
}

When needed:


Q18 — How do you handle database migrations in microservices? junior

Answer: Each microservice owns its own database. Schema changes are managed with a migration tool (Flyway or Liquibase) that runs on service startup.

Flyway (used in this system):

src/main/resources/db/migration/
  V1__create_orders_table.sql
  V2__add_order_status_index.sql
  V3__add_outbox_events_table.sql

spring:
  flyway:
    enabled: true
    locations: classpath:db/migration
    baseline-on-migrate: false

Zero-downtime migration patterns (blue-green deployments):

  1. Expand: Add new column as nullable (old and new service version can run together).
  2. Migrate: Backfill the column in a background job.
  3. Contract: Remove the old column after all instances use the new version.

Never: Rename a column in one migration — it breaks existing running instances. Always expand-migrate-contract.

This system: Flyway runs on application startup. When several instances start at once, keep spring.flyway.out-of-order=false (the default) and ensure only one instance runs migrations (Flyway's own database lock, or an init container in K8s).


Q19 — What is the Bulkhead Pattern and how does it prevent resource exhaustion? senior

Answer: The Bulkhead Pattern isolates resources (thread pools, connection pools) per integration point, preventing one slow downstream from exhausting all resources.

Without bulkhead: payment-service is slow → order-service spawns threads waiting for payment → thread pool exhausted → order-service cannot process any requests (including unrelated ones).

With bulkhead: Concurrency is capped per downstream service, either with a semaphore (shown first) or with a dedicated thread pool.

@Bulkhead(name = "payment-service", fallbackMethod = "paymentFallback")
@CircuitBreaker(name = "payment-service")
public PaymentResult processPayment(PaymentRequest req) {
    return paymentClient.process(req);
}

resilience4j.bulkhead.instances.payment-service:
  maxConcurrentCalls: 20        # max 20 concurrent payment calls
  maxWaitDuration: 500ms        # wait 500ms for a slot before rejecting

Thread pool bulkhead (Resilience4j @Bulkhead with type = Bulkhead.Type.THREADPOOL; the annotated method must return a CompletableFuture):

resilience4j.thread-pool-bulkhead.instances.payment-service:
  maxThreadPoolSize: 10
  coreThreadPoolSize: 5
  queueCapacity: 5

Note with virtual threads: Virtual threads largely remove platform-thread exhaustion, since each blocking call can run on its own cheap virtual thread. Other limits such as connection pools and memory still apply, and bulkheads still prevent overloading the downstream service.
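
The semaphore variant of a bulkhead is simple to sketch with the JDK alone. The class below is an illustrative stand-in, not Resilience4j code: it behaves like a bulkhead whose wait duration is zero, rejecting immediately when all permits are taken.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore bulkhead: at most maxConcurrentCalls run at once. */
final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get();        // bulkhead full: reject without waiting
        }
        try {
            return action.get();
        } finally {
            permits.release();            // always return the permit, even on failure
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Giving each downstream its own instance means a slow payment-service can consume only its own permits, leaving calls to other services unaffected.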


Q20 — How do you implement distributed locking in microservices? senior

Answer: Distributed locks coordinate access to shared resources across multiple service instances.

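Redis-based lock (single-node SET NX with expiry; Redlock extends this idea to a majority of independent Redis nodes):

// Acquire lock (SET NX EX) with a value unique to this holder
String lockValue = UUID.randomUUID().toString();
Boolean acquired = redisTemplate.opsForValue()
    .setIfAbsent("lock:order:" + orderId, lockValue, Duration.ofSeconds(10));

if (!Boolean.TRUE.equals(acquired)) {
    throw new ConcurrencyException("Order " + orderId + " is already being processed");
}

try {
    // Critical section: only one instance executes this while the lock is held.
    // It must finish within the 10s expiry, or another instance can acquire the lock.
    processOrderPayment(orderId);
} finally {
    // Plain delete is shown for brevity. If the lock expired and another instance acquired it,
    // this deletes that instance's lock; a compare-and-delete Lua script avoids that.
    redisTemplate.delete("lock:order:" + orderId);
}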

Atomic release with Lua (prevents releasing another instance's lock):

String luaScript = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("del", KEYS[1])
    else
        return 0
    end
    """;
redisTemplate.execute(new DefaultRedisScript<>(luaScript, Long.class),
    List.of("lock:order:" + orderId), lockValue);

Database-based lock:

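// PostgreSQL advisory lock. A session-level pg_advisory_lock must be released on the same
// connection that took it, which a connection pool does not guarantee across separate calls.
// The transaction-scoped variant, run inside a @Transactional method, is released at commit or rollback.
jdbcTemplate.execute("SELECT pg_advisory_xact_lock(" + orderId.hashCode() + ")");
// ... critical section, in the same transaction
// Note: distinct IDs with equal hash codes share one lock.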
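
Why the release must check ownership is easiest to see in a small model. The class below is a hypothetical in-memory imitation of the Redis semantics (set-if-absent with expiry, then compare-and-delete); it is not a distributed lock itself, and the clock is injected so expiry can be demonstrated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical in-memory model of an owner-checked lock with expiry. */
final class OwnedLockSketch {
    private record Entry(String owner, long expiresAt) {}

    private final Map<String, Entry> locks = new HashMap<>();
    private final LongSupplier clock;

    OwnedLockSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Like SET key owner NX PX ttl: succeeds only if the key is absent or expired. */
    synchronized boolean tryLock(String key, String owner, long ttlMillis) {
        long now = clock.getAsLong();
        Entry current = locks.get(key);
        if (current != null && current.expiresAt() > now) {
            return false;
        }
        locks.put(key, new Entry(owner, now + ttlMillis));
        return true;
    }

    /** Like the compare-and-delete script: releases only if the caller still owns the lock. */
    synchronized boolean unlock(String key, String owner) {
        Entry current = locks.get(key);
        if (current == null || !current.owner().equals(owner)) {
            return false;
        }
        locks.remove(key);
        return true;
    }
}
```

If holder A overruns the expiry and holder B acquires the lock, A's late `unlock` is refused, so B's lock survives; an unconditional delete would have removed it.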

Q21 — What is consumer-driven contract testing? senior

Answer: Consumer-Driven Contract (CDC) testing verifies that a service's API matches what its consumers expect, without requiring both services to run simultaneously.

How it works:

  1. Consumer writes a contract (pact file) defining what it expects from the provider API.
  2. Provider runs the contract in its tests to verify it satisfies all consumer expectations.
  3. Contracts are stored in a shared repository (Pact Broker or Maven repo).

Spring Cloud Contract (used in this system via common module):

// contract defined by order-service (consumer)
Contract.make {
    request {
        method 'GET'
        url '/products/123e4567-e89b-12d3-a456-426614174000'
    }
    response {
        status 200
        body([id: $(anyUuid()), name: $(anyNonBlankString()), price: $(anyPositiveInt())])
        headers { contentType applicationJson() }
    }
}

Benefits:

This system: common module contains stubs generated from contracts, used in integration tests with WireMock.


Q22 — What is graceful shutdown in microservices and how is it implemented? junior

Answer: Graceful shutdown ensures a service completes in-flight requests before stopping, preventing data loss or client errors.

Spring Boot graceful shutdown:

server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # wait up to 30s for in-flight requests

How it works:

  1. Kubernetes (or Docker) sends SIGTERM to the process.
  2. Spring stops accepting new requests (readiness probe fails → removed from LB).
  3. Spring waits for in-flight requests to complete (up to 30s).
  4. Spring closes DB connections, Kafka producers (flushes in-flight messages).
  5. Process exits with code 0.

Kafka producer flush (critical for outbox poller):

@PreDestroy
public void shutdown() {
    log.info("Shutting down outbox poller...");
    running = false;
    kafkaTemplate.flush();  // ensure all buffered messages are sent
}

This system: All services configure graceful shutdown. The outbox poller has @PreDestroy to stop polling and flush Kafka before JVM exits.
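
The same stop, drain, then force-stop sequence can be shown with a plain JDK executor. This is a sketch with hypothetical names (`GracefulWorker`, `shutdownGracefully`), not Spring's shutdown machinery; the sleep stands in for an in-flight request.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical worker that drains in-flight tasks before stopping. */
final class GracefulWorker {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    final AtomicInteger completed = new AtomicInteger();

    void submit(long workMillis) {
        pool.submit(() -> {
            try {
                Thread.sleep(workMillis);         // simulated in-flight request
                completed.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Stops accepting work, waits up to the timeout, then interrupts whatever is left. */
    boolean shutdownGracefully(long timeoutMillis) {
        pool.shutdown();                          // reject new tasks, keep running queued ones
        boolean drained;
        try {
            drained = pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            drained = false;
        }
        if (!drained) {
            pool.shutdownNow();                   // timeout elapsed: force-stop
        }
        return drained;
    }
}
```

The timeout plays the role of timeout-per-shutdown-phase: it bounds how long shutdown can be delayed by slow requests.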


Q23 — How does the Rate Limiting pattern work in microservices? junior

Answer: Rate limiting controls the number of requests a client can make in a time window, protecting services from overload and abuse.

Algorithms:

Algorithm | Description | Burst handling
Token bucket | Bucket with N tokens; each request consumes one; bucket refills at rate R | Allows burst up to bucket size
Fixed window | Count requests per fixed time window (e.g., 100/min) | Burst at window boundary
Sliding window log | Track timestamps of recent requests | Smooth, but memory-heavy
Leaky bucket | Requests queue and are processed at fixed rate | No burst; smooths traffic

This system (Token Bucket):

HTTP 429 Too Many Requests with Retry-After header tells clients how long to wait.
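
A token bucket fits in a few lines of plain Java. The sketch below is illustrative only and is not the gateway's implementation: the class name, the per-instance (rather than per-user, distributed) state, and the injected millisecond clock are all assumptions made to keep it self-contained.

```java
import java.util.function.LongSupplier;

/** Hypothetical single-node token bucket. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerMilli;
    private final LongSupplier clockMillis;
    private double tokens;
    private long lastRefill;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier clockMillis) {
        this.capacity = capacity;
        this.refillPerMilli = refillPerSecond / 1000.0;
        this.clockMillis = clockMillis;
        this.tokens = capacity;                   // start full, so an initial burst is allowed
        this.lastRefill = clockMillis.getAsLong();
    }

    /** Returns true if the request may proceed; false means the caller should answer 429. */
    synchronized boolean tryConsume() {
        long now = clockMillis.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMilli);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Capacity sets the largest burst, while the refill rate sets the sustained request rate; a limiter shared across gateway instances would keep this state in a shared store instead.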


Q24 — What is the Inbox/Outbox Pattern and how does it differ from just using Kafka? senior

Answer: The Outbox Pattern (write side) ensures events are reliably published from the producer side. The Inbox Pattern (read side) ensures events are idempotently processed on the consumer side.

Combined (full reliability):

Producer (order-service):
  DB Transaction:
    INSERT INTO orders(...)         -- business record
    INSERT INTO outbox(event_id, topic, payload, processed=false)  -- event record
  
  Poller: publish outbox events → mark processed

Consumer (payment-service):
  Receive Kafka message
  DB Transaction:
    IF NOT EXISTS (SELECT 1 FROM inbox WHERE event_id = ?)
      INSERT INTO inbox(event_id, processed_at)  -- idempotency record
      INSERT INTO payments(...)                   -- business record

Without Outbox: Producer can fail between DB commit and Kafka publish → event lost.

Without Inbox: Consumer can process the same event twice → duplicate payment.

With both: Each event takes effect exactly once at the business level, even though Kafka delivery is only at-least-once.

This system: Outbox pattern is implemented in order-service. Inbox pattern (explicit idempotency table) would be implemented in payment-service for production robustness.


Q25 — How do you handle versioning of Kafka event schemas? senior

Answer: Event schema versioning is critical for evolving events without breaking consumers.

Avro + Schema Registry (this system):

Compatibility types:

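Type | Meaning | Allowed changes
BACKWARD | Consumers using the new schema can read data written with the old schema | Delete fields; add optional fields (with defaults)
FORWARD | Consumers using the old schema can read data written with the new schema | Add fields; delete optional fields
FULL | Both BACKWARD and FORWARD | Add or delete optional fields only
NONE | No compatibility check | Any change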

Schema evolution rules (BACKWARD compat):

Alternative (versioned topics): Create new topic order.created.v2 for breaking changes. Run consumers for both v1 and v2 during migration.


Q26 — What is a service mesh and how does it differ from API gateway? senior

Answer:

Aspect | API Gateway | Service Mesh
Location | Edge (north-south traffic) | Internal (east-west traffic)
Manages | External client → services | Service → service
Features | Routing, auth, rate limiting, SSL termination | mTLS, observability, retries, circuit breaking
Implementation | Application layer (Spring Cloud Gateway, NGINX) | Infrastructure layer (Istio, Linkerd sidecar proxies)
Requires code changes | Some | None (transparent proxy)

Service mesh sidecars intercept all network traffic between services:

order-service → [Envoy sidecar] → [network] → [Envoy sidecar] → payment-service

Envoy handles retries, circuit breakers, mTLS, and tracing — without any code changes in order-service or payment-service.

This system: No service mesh currently. All cross-cutting concerns are in application code (Resilience4j, Micrometer). A service mesh (Istio) would replace much of this code in a Kubernetes deployment.


Q27 — How do you implement feature flags in microservices? senior

Answer: Feature flags (feature toggles) allow enabling/disabling features without deploying new code, enabling canary releases, A/B testing, and safe rollbacks.

Implementation options:

1. Configuration-based (simple):

features:
  new-payment-flow: true
  loyalty-points: false

@Value("${features.new-payment-flow}")
private boolean newPaymentFlowEnabled;

if (newPaymentFlowEnabled) {
    return newPaymentService.process(request);
}
return legacyPaymentService.process(request);

2. Redis-based (dynamic, no restart):

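// Assumes a StringRedisTemplate; a missing key parses as false, so the flag defaults to off.
boolean enabled = Boolean.parseBoolean(
    redisTemplate.opsForValue().get("feature:new-payment-flow"));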

3. External flag service (LaunchDarkly, Unleash):

Canary with feature flags: Enable new checkout flow for 1% of users. Monitor error rates. Increase to 10%, 50%, 100%. Instant rollback if issues arise.
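
A percentage rollout needs a stable way to decide which users are in the canary group. One common approach, sketched here with a hypothetical helper, hashes the flag name and user ID into a bucket from 0 to 99; flag services such as LaunchDarkly or Unleash have their own bucketing schemes, so treat this as an illustration of the idea only.

```java
/** Hypothetical percentage-based rollout check. */
final class PercentageRollout {
    private PercentageRollout() {
    }

    /** True if the user falls inside the rollout percentage (0 to 100) for the given flag. */
    static boolean isEnabled(String flag, String userId, int percent) {
        // String.hashCode() is specified by the JDK, so the bucket is stable across restarts.
        int bucket = Math.floorMod((flag + ":" + userId).hashCode(), 100);
        return bucket < percent;
    }
}
```

Because the bucket is fixed per user, raising the percentage from 1 to 10 to 50 only adds users; nobody who already had the feature loses it mid-rollout.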


Q28 — What is the Retry Pattern and when should you not retry? junior

Answer: The Retry Pattern automatically re-executes a failed operation with the assumption that transient failures (network blip, momentary service overload) will resolve.

Spring Retry:

@Retryable(
    retryFor = {HttpServerErrorException.class, ResourceAccessException.class},
    maxAttempts = 3,
    backoff = @Backoff(delay = 500, multiplier = 2, maxDelay = 5000)
)
public ProductDto getProduct(UUID id) {
    return productClient.get(id);
}

Exponential backoff with jitter (avoid retry storms):

Retry 1: wait 500ms
Retry 2: wait 1000ms ± random(0-200ms)
Retry 3: wait 2000ms ± random(0-400ms)
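
The delay schedule can be computed with a small helper. This is a sketch rather than Spring Retry's implementation: the method name is hypothetical, and applying jitter of up to ±20% to every retry (including the first) is an assumption chosen to match the ranges shown above.

```java
import java.util.Random;

/** Hypothetical helper for exponential backoff with jitter. */
final class BackoffSketch {
    private BackoffSketch() {
    }

    /**
     * Delay before retry n (1-based): base * multiplier^(n-1), capped at maxDelayMs,
     * then shifted by a random amount within ±20% of the capped value.
     */
    static long delayMillis(int retry, long baseMs, double multiplier,
                            long maxDelayMs, Random random) {
        double exponential = baseMs * Math.pow(multiplier, retry - 1);
        long capped = (long) Math.min(exponential, (double) maxDelayMs);
        long jitter = (long) (capped * 0.2 * (2 * random.nextDouble() - 1));
        return capped + jitter;
    }
}
```

Jitter spreads out clients that failed at the same moment, so they do not all retry in the same instant and overload the recovering service again.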

Do NOT retry:

This system: @RetryableTopic in Kafka consumers retries failed message processing. HTTP calls between services use circuit breaker + retry only on idempotent operations.


Q29 — How do you design for observability in microservices? senior

Answer: Observability = ability to understand internal state from external outputs. Three pillars:

1. Metrics (Micrometer + Prometheus + Grafana):

// Custom business metric
Counter orderCounter = Counter.builder("orders.created")
    .tag("status", "success")
    .register(meterRegistry);
orderCounter.increment();

// Timer for latency (call publishPercentileHistogram() on the builder to export histogram buckets)
Timer.Sample timer = Timer.start(meterRegistry);
processOrder(request);
timer.stop(Timer.builder("order.processing.time").register(meterRegistry));

2. Logs (structured JSON, correlated by traceId):

{"timestamp":"2026-05-31T14:30:00Z","level":"INFO","service":"order-service",
 "traceId":"abc123","spanId":"def456","orderId":"uuid","message":"Order created"}

3. Traces (Zipkin/Jaeger — this system uses Zipkin):

Alerting:


Q30 — What are anti-patterns in microservices to avoid? senior

Answer: Common microservices anti-patterns and their solutions:

1. Distributed Monolith: Services are deployed separately but are tightly coupled (shared DB, synchronous chains). Every change requires coordinating multiple services. Fix: Define proper bounded contexts, use events for loose coupling.

2. Chatty services: Service A makes 10 synchronous calls to Service B per request. Fix: Batch APIs, events, or API composition at the gateway.

3. Shared database: Multiple services read/write the same DB table. Fix: Each service owns its data; others request via API or events.

4. Too many services: CRUD microservices (UserMicroservice with 3 endpoints). Overhead exceeds benefit. Fix: Keep services at bounded-context granularity, not function granularity.

5. Missing idempotency: Retries cause duplicate orders/payments. Fix: Idempotency keys in every state-changing operation.

6. Synchronous event notification: Using REST calls instead of events for async workflows. Fix: Use Kafka events for order→payment→notification flow.

7. Ignoring eventual consistency: UI shows stale data without feedback. Fix: Optimistic UI updates, polling, SSE.

8. Logging without correlation: Impossible to trace a request across services. Fix: Correlation IDs, structured logging, Zipkin.