Distributed Tracing with Spring Cloud Sleuth + Zipkin + Kafka

Modern microservices architectures rely on multiple distributed components working together to deliver seamless functionality. However, troubleshooting errors, debugging bottlenecks, or understanding the flow of requests across services in such environments is often challenging. This is where distributed tracing becomes invaluable.

Spring Cloud Sleuth, coupled with Zipkin, provides a straightforward way to trace requests across microservices. When you add Kafka as the messaging backbone, these tools work together to propagate trace information seamlessly through event streams, giving you complete visibility into service interactions.

This guide explores how distributed tracing works with Sleuth, Zipkin, and Kafka. We’ll cover trace and span propagation, Kafka integration, using Zipkin for trace visualization, and correlating traces with logs for precise debugging.

What is Distributed Tracing and Why It Matters
Trace ID and Span ID Propagation
Kafka Integration for Trace Propagation
Viewing Traces in Zipkin
Correlating Traces with Logs
Summary

What is Distributed Tracing and Why It Matters

What is Distributed Tracing?

Distributed tracing tracks the flow of requests across multiple services and components in a system. It assigns trace IDs to requests and tracks their progression through the system, providing insights into latency, bottlenecks, and failed service calls.

Why It’s Important

Understand Service Interactions: Trace the full lifecycle of requests across your microservices.
Identify Performance Bottlenecks: Visualize latency at each service level.
Debug Failures Faster: Pinpoint the exact service or component causing issues using correlated logs and traces.
Monitor SLAs: Ensure that service-level agreements are respected in multistep workflows.

Example: When a user submits an order, the trace might span multiple services, including Order Service, Inventory Service, and Payment Service, giving you a holistic view of the request flow.

Tools for Distributed Tracing

Spring Cloud Sleuth: Adds trace management capabilities, such as generating trace IDs and span IDs.
Zipkin: Collects and visualizes trace data.
Kafka: Carries events between microservices. Kafka does not trace anything itself; Sleuth passes the trace context along in Kafka message headers.

Trace ID and Span ID Propagation

What Are Trace IDs and Span IDs?

Trace ID: A unique identifier assigned to a request as it travels through the system.
Span ID: A unique identifier for a single unit of work within a trace. Each service or component creates a new span under the same trace.
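The relationship between the two IDs can be sketched in plain Java. This is an illustrative model only: the class name SpanContextSketch and its methods are hypothetical and are not part of Sleuth's API. It shows that a child span keeps the trace ID, receives a new span ID, and remembers the span that created it.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model for illustration; not a Sleuth class.
final class SpanContextSketch {
    final String traceId;   // shared by every span in the trace
    final String spanId;    // unique per unit of work
    final String parentId;  // null for the first (root) span

    private SpanContextSketch(String traceId, String spanId, String parentId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
    }

    // Random 64-bit ID rendered as 16 lower-case hex characters.
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // The first span of a request starts a new trace.
    static SpanContextSketch newRoot() {
        return new SpanContextSketch(newId(), newId(), null);
    }

    // A downstream unit of work keeps the trace ID and gets a fresh span ID.
    SpanContextSketch newChild() {
        return new SpanContextSketch(traceId, newId(), spanId);
    }

    public static void main(String[] args) {
        SpanContextSketch order = newRoot();
        SpanContextSketch payment = order.newChild();
        System.out.println("trace=" + order.traceId + " span=" + order.spanId);
        System.out.println("trace=" + payment.traceId + " span=" + payment.spanId
                + " parent=" + payment.parentId);
    }
}
```

Running the sketch prints two lines with the same trace ID and different span IDs, which is the pattern you would expect to see across services in a real trace.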

How Trace Propagation Works

Trace and span IDs are embedded in HTTP headers (e.g., X-B3-TraceId, X-B3-SpanId) or Kafka message headers.
Sleuth intercepts incoming requests, continues the trace if the request already carries trace headers (or starts a new one if it does not), and propagates the context downstream.
Processing in each service generates spans, which are reported to a tracing backend like Zipkin.
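The steps above can be sketched with a plain map standing in for HTTP headers. The header names are the standard B3 names; the class B3HeaderSketch and its methods are hypothetical helpers written for this example, not Sleuth code, and reusing the new span ID as the trace ID for a brand-new trace is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; Sleuth does this work for you.
final class B3HeaderSketch {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";
    static final String PARENT_SPAN_ID = "X-B3-ParentSpanId";

    // Caller side: copy the current context into the outgoing headers.
    static Map<String, String> inject(String traceId, String spanId) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        return headers;
    }

    // Callee side: build the context for the work done in this service.
    // Note: real HTTP header names are case-insensitive; this map is not.
    static Map<String, String> continueTrace(Map<String, String> incoming, String newSpanId) {
        Map<String, String> current = new LinkedHashMap<>();
        String traceId = incoming.get(TRACE_ID);
        if (traceId == null) {
            // No trace arrived, so start a new one (simplified for this sketch).
            current.put(TRACE_ID, newSpanId);
            current.put(SPAN_ID, newSpanId);
            return current;
        }
        current.put(TRACE_ID, traceId);                       // same trace
        current.put(SPAN_ID, newSpanId);                      // new unit of work
        current.put(PARENT_SPAN_ID, incoming.get(SPAN_ID));   // link to the caller
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> fromOrderService = inject("463ac35c9f6413ad", "a2fb4a1d1a96d312");
        System.out.println(continueTrace(fromOrderService, "00f067aa0ba902b7"));
    }
}
```

The map returned by continueTrace is what the receiving service would in turn pass to inject-style logic when it calls the next service.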

Example in Spring Boot

When using Sleuth, propagating trace and span IDs is automatic.

Add Spring Cloud Sleuth to your project:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

Trace data is automatically added to log entries:

2025-06-14 12:34:56 [INFO] [trace-id=abc123] [span-id=xyz456] Order placed successfully.

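Spring Cloud Sleuth places these IDs in the logging context (MDC), so they appear in log output, and sends them along with outgoing HTTP requests. The exact layout depends on your logging pattern; the line above is illustrative.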

Customizing Sleuth Trace IDs

You can customize the default tracing behavior:

spring.sleuth.trace-id128=true
spring.sleuth.sampler.probability=0.5

This configuration enables 128-bit trace IDs and samples 50% of traces for reporting to the tracing backend, which reduces overhead in production.
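To make the sampling probability concrete, here is a minimal sketch of a probability-based sampling decision. It is not Sleuth's actual sampler: the class ProbabilitySamplerSketch and the bucket count of 10,000 are assumptions chosen for illustration. The idea it shows is that the decision is derived from the trace ID, so the same trace always gets the same answer.

```java
// Hypothetical sampler for illustration; Sleuth ships its own implementation.
final class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000L;
    private final long threshold;

    ProbabilitySamplerSketch(double probability) {
        if (!(probability >= 0.0 && probability <= 1.0)) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        // 0.5 becomes 5000 of 10000 buckets.
        this.threshold = Math.round(probability * BUCKETS);
    }

    // Deterministic: a given trace ID always maps to the same bucket.
    boolean isSampled(long traceId) {
        return Math.floorMod(traceId, BUCKETS) < threshold;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.5);
        int sampled = 0;
        for (long traceId = 0; traceId < BUCKETS; traceId++) {
            if (sampler.isSampled(traceId)) {
                sampled++;
            }
        }
        System.out.println(sampled + " of " + BUCKETS + " traces sampled");
    }
}
```

In a real deployment the decision is made once, where the trace starts, and travels with the trace context so that downstream services do not sample independently.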

Kafka Integration for Trace Propagation

Kafka introduces challenges in distributed tracing because messages often move asynchronously between producers and consumers. With Sleuth, however, trace metadata can seamlessly flow through Kafka headers.

How Sleuth Propagates Traces via Kafka

Producer Side: Sleuth intercepts Kafka message creation and attaches trace metadata (trace ID, span ID) as message headers.
Consumer Side: Sleuth retrieves and uses the metadata from Kafka headers to continue the trace.
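The producer and consumer steps can be sketched without a broker. In this example a map of byte arrays stands in for a Kafka record's headers (Kafka header values are raw bytes), and the context is encoded in the B3 single-header format, traceId-spanId-samplingState, under the header name b3. Which header format Sleuth actually writes depends on its version and configuration, so treat the format as an assumption; the class KafkaHeaderPropagationSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; Sleuth's instrumentation does this for you.
final class KafkaHeaderPropagationSketch {
    static final String B3 = "b3";

    // Producer side: encode the current context into a record header.
    static void attach(Map<String, byte[]> recordHeaders,
                       String traceId, String spanId, boolean sampled) {
        String value = traceId + "-" + spanId + "-" + (sampled ? "1" : "0");
        recordHeaders.put(B3, value.getBytes(StandardCharsets.UTF_8));
    }

    // Consumer side: decode {traceId, parentSpanId, samplingState},
    // or return null when the record carries no usable trace context.
    static String[] restore(Map<String, byte[]> recordHeaders) {
        byte[] raw = recordHeaders.get(B3);
        if (raw == null) {
            return null;
        }
        String[] parts = new String(raw, StandardCharsets.UTF_8).split("-");
        return parts.length >= 2 ? parts : null;
    }

    public static void main(String[] args) {
        Map<String, byte[]> headers = new HashMap<>();
        attach(headers, "463ac35c9f6413ad", "a2fb4a1d1a96d312", true);
        String[] context = restore(headers);
        System.out.println("continuing trace " + context[0] + " under parent span " + context[1]);
    }
}
```

Because the context rides inside the record, it survives however long the message waits in the topic, which is what keeps a trace connected across asynchronous hops.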

Setting Up Kafka Tracing with Sleuth

Add Spring Kafka and Sleuth dependencies:

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

Example Producer

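import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class KafkaProducer {
    // Inject the Spring-managed KafkaTemplate bean so Sleuth's instrumentation applies.
    private final KafkaTemplate<String, String> kafkaTemplate;

    public KafkaProducer(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publishEvent(String topic, String message) {
        // No tracing code is needed here; the trace headers are added for you.
        kafkaTemplate.send(topic, message);
    }
}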

Example Consumer

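import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class KafkaConsumer {
    private static final Logger log = LoggerFactory.getLogger(KafkaConsumer.class);

    @KafkaListener(topics = "order-events", groupId = "order-consumer")
    public void listen(String message) {
        // Logging through SLF4J (not System.out) lets the trace and span IDs appear on this line.
        log.info("Traceable message consumed: {}", message);
    }
}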

Sleuth instruments the Spring-managed Kafka producer and listener, so trace context propagation needs no extra code in either class.

Viewing Traces in Zipkin

What is Zipkin?

Zipkin collects traces produced by Sleuth and visualizes them in a web UI, showing the timing of each span and the parent-child relationships between spans.

Setting Up Zipkin

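Add the Zipkin reporter to your Spring Boot project (this starter applies to Sleuth 2.x; on Sleuth 3.x, use spring-cloud-sleuth-zipkin alongside spring-cloud-starter-sleuth instead):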

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

Configure the Zipkin server URL:

spring.zipkin.base-url=http://localhost:9411

Start the Zipkin server by running:

docker run -d -p 9411:9411 openzipkin/zipkin

Viewing Traces in Zipkin UI

Access Zipkin at http://localhost:9411.
Search by trace ID or service name to view individual span details.
Use Zipkin’s timeline view to inspect bottlenecks and errors.
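Traces can also be fetched programmatically. The sketch below builds a request for Zipkin's v2 HTTP API, which exposes a single trace at /api/v2/trace/{traceId}. It uses the JDK's built-in HTTP client (Java 11 or later); the class name ZipkinTraceLookupSketch and the sample trace ID are hypothetical, and the base URL assumes the local server started earlier.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration.
final class ZipkinTraceLookupSketch {

    // Builds (but does not send) a GET request for one trace.
    static HttpRequest traceRequest(String zipkinBaseUrl, String traceId) {
        // Tolerate a trailing slash on the configured base URL.
        String base = zipkinBaseUrl.replaceAll("/+$", "");
        URI uri = URI.create(base + "/api/v2/trace/" + traceId);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = traceRequest("http://localhost:9411", "463ac35c9f6413ad");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the response body is a JSON array of the spans in that trace.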

Correlating Traces with Logs

Why is Logging Important for Tracing?

Logs provide detailed service-level insights that traces alone may not capture. Correlating traces with logs allows granular troubleshooting.

How to Correlate Traces and Logs

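Use an SLF4J-based logging framework (e.g., Logback); Sleuth places the current trace and span IDs in its logging context (MDC).
With Spring Boot's default log pattern, Sleuth appends traceId and spanId to each log entry; a custom pattern has to reference them explicitly.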
Use these IDs to correlate trace timelines in Zipkin with logs.

Example Log Output:

2025-06-14 12:34:56 [INFO] [trace-id=abc123] [span-id=xyz456] Order confirmed for user John Doe.

Searching Logs by Trace ID

Use Elasticsearch (for example, through Kibana) or another log search tool to find all logs associated with a specific trace ID:

GET logs/_search
{
  "query": {
    "match": {
      "trace-id": "abc123"
    }
  }
}

This helps pinpoint the exact location and cause of issues.
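When logs are not yet in a search backend, the same lookup can be done over raw log lines. The following is a minimal sketch that assumes the bracketed [trace-id=...] layout used in this guide's sample log lines; the class TraceLogFilterSketch is hypothetical, and a different log pattern would need a different regular expression.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration.
final class TraceLogFilterSketch {
    // Captures the value inside "[trace-id=...]".
    private static final Pattern TRACE_ID = Pattern.compile("\\[trace-id=([^\\]]+)\\]");

    // Returns, in original order, the lines whose trace ID matches exactly.
    static List<String> linesForTrace(List<String> logLines, String traceId) {
        return logLines.stream()
                .filter(line -> {
                    Matcher matcher = TRACE_ID.matcher(line);
                    return matcher.find() && matcher.group(1).equals(traceId);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2025-06-14 12:34:56 [INFO] [trace-id=abc123] [span-id=xyz456] Order placed successfully.",
                "2025-06-14 12:34:57 [INFO] [trace-id=def789] [span-id=uvw000] Inventory reserved.");
        linesForTrace(lines, "abc123").forEach(System.out::println);
    }
}
```

Matching on the exact ID, not a substring, avoids pulling in unrelated traces whose IDs happen to share a prefix.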

Summary

Distributed tracing with Spring Cloud Sleuth, Zipkin, and Kafka unlocks powerful debugging and monitoring capabilities in microservices architectures. Here’s a quick recap:

Trace Propagation: Sleuth automatically generates and propagates trace and span IDs across HTTP requests and Kafka messages.
Kafka Integration: Sleuth attaches trace metadata to Kafka message headers, keeping traces continuous across asynchronous hops.
Zipkin Visualization: Provides detailed trace analysis to uncover service interactions, latency, and failures.
Log Correlation: Combines trace IDs with logs for deep-dive debugging into specific events.

By implementing distributed tracing, you gain a comprehensive view of your system’s behavior, enabling faster troubleshooting, better user experiences, and improved service reliability. Start leveraging these tools today to bring unmatched observability to your microservices!
