OpenTracing

Overview

OpenTracing is a vendor-agnostic API for distributed tracing in a microservice architecture.

Distributed tracing follows a single request along its whole journey, from its source to its destination. For example, if a frontend makes a request to a server, the server runs a stored procedure in a database, and the result comes back to the server and triggers further events, all of these steps appear in the same trace.

  • Normal traces only track a request through a single application domain.
  • Distributed tracing, by contrast, stitches together multiple requests across multiple systems. The stitching is usually done with one or more correlation IDs, and the trace itself is usually a set of recorded, structured log events from all of the systems, stored in a central place.
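
The stitching idea can be illustrated with a small sketch. The event fields (`system`, `correlation_id`, `ts`, `msg`) and the `stitch` helper are made up for this example and are not part of any tracing API; the sketch only assumes that every system writes the same correlation ID into its structured log events.

```python
from collections import defaultdict

# Hypothetical structured log events recorded by three separate systems
# and collected in one central place.
events = [
    {"system": "frontend", "correlation_id": "req-1", "ts": 1, "msg": "POST /orders"},
    {"system": "server", "correlation_id": "req-1", "ts": 2, "msg": "handling order"},
    {"system": "frontend", "correlation_id": "req-2", "ts": 3, "msg": "GET /status"},
    {"system": "database", "correlation_id": "req-1", "ts": 4, "msg": "stored procedure ran"},
]


def stitch(events):
    """Group events by correlation ID and order each group by timestamp."""
    traces = defaultdict(list)
    for event in events:
        traces[event["correlation_id"]].append(event)
    return {cid: sorted(group, key=lambda e: e["ts"]) for cid, group in traces.items()}


traces = stitch(events)
# The journey of request "req-1" across systems, in time order.
systems_for_req_1 = [e["system"] for e in traces["req-1"]]
```

Each value in `traces` is one request's path through all of the systems, which is roughly what a tracing backend reconstructs for you.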

In OpenTracing, a trace is a directed acyclic graph of Spans with References that may look like this:

[Span A]  ←←←(the root span)
            |
     +------+------+
     |             |
 [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
     |             |
 [Span D]      +---+-------+
               |           |
           [Span E]    [Span F] >>> [Span G] >>> [Span H]
                                       ↑
                                       ↑
                                       ↑
                         (Span G `FollowsFrom` Span F)

This allows us to model how our application calls out to other applications, internal functions, asynchronous jobs, and so on. All of these can be modeled as Spans.

Span

The “span” is the primary building block of a distributed trace, representing an individual unit of work done in a distributed system.

  • Each component of the distributed system contributes a span: a named, timed operation representing a piece of the workflow.

A Span often represents a call to a separate software system over messaging or HTTP, but it can just as well represent an internal function or an asynchronous job.

Spans can (and generally do) contain “References” to other spans, which allows multiple Spans to be assembled into one complete Trace - a visualization of the life of a request as it moves through a distributed system.

Each span encapsulates the following state according to the OpenTracing specification:

  • An operation name
  • A start timestamp and finish timestamp
  • A set of key:value span Tags
  • A set of key:value span Logs
  • A SpanContext
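
As a rough illustration, the state listed above could be modeled as follows. This is a toy data structure written for this page, not the OpenTracing library's own classes; the field names and the `set_tag`, `log_kv`, and `finish` methods are assumptions chosen to mirror the list.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class SpanContext:
    trace_id: str
    span_id: str
    baggage: Dict[str, str] = field(default_factory=dict)


@dataclass
class Span:
    operation_name: str                      # an operation name
    context: SpanContext                     # a SpanContext
    start_time: float = field(default_factory=time.time)
    finish_time: Optional[float] = None      # set when the span finishes
    tags: Dict[str, Any] = field(default_factory=dict)
    # Each log entry is (timestamp, key:value fields).
    logs: List[Tuple[float, Dict[str, Any]]] = field(default_factory=list)

    def set_tag(self, key, value):
        self.tags[key] = value
        return self

    def log_kv(self, key_values):
        self.logs.append((time.time(), dict(key_values)))
        return self

    def finish(self):
        self.finish_time = time.time()


span = Span("db_query", SpanContext(trace_id="abc123", span_id="xyz789"))
span.set_tag("db.instance", "customers")
span.log_kv({"message": "query started"})
span.finish()
```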

Spans can connect to each other through two types of relationship: ChildOf and FollowsFrom. In a ChildOf relationship, the parent Span depends on the work of the child Span; for example, an ordering website sends child requests to both a payment system and an inventory system and waits for their results. In a FollowsFrom relationship, the new Span is caused by an earlier Span, but the earlier Span does not depend on its result. So, a FollowsFrom Span is just saying, “I started after this other Span, and it is not waiting for me.”
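
The example trace from the diagram in the overview can be rebuilt as a small graph to show how References link spans together. This is a minimal sketch: the `Span` class, the `start_span` helper, and its `child_of` and `follows_from` parameters are defined here for illustration and are not taken from a tracer library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

CHILD_OF = "child_of"
FOLLOWS_FROM = "follows_from"


@dataclass
class Span:
    name: str
    # Each reference is (reference_type, referenced_span).
    references: List[Tuple[str, "Span"]] = field(default_factory=list)


def start_span(name, child_of=None, follows_from=None):
    """Create a span that optionally references an earlier span."""
    span = Span(name)
    if child_of is not None:
        span.references.append((CHILD_OF, child_of))
    if follows_from is not None:
        span.references.append((FOLLOWS_FROM, follows_from))
    return span


# Rebuild the trace from the diagram: A is the root span.
a = start_span("A")
b = start_span("B", child_of=a)
c = start_span("C", child_of=a)
d = start_span("D", child_of=b)
e = start_span("E", child_of=c)
f = start_span("F", child_of=c)
g = start_span("G", follows_from=f)
h = start_span("H", follows_from=g)


def path_to_root(span):
    """Follow the first reference of each span back to the root."""
    path = [span.name]
    while span.references:
        _, span = span.references[0]
        path.append(span.name)
    return path


path = path_to_root(h)
```

Because every span only points back at spans that already exist, the references can never form a cycle, which is why the trace is a directed acyclic graph.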

Parts of a Span

Tag

Tags are key:value pairs that let users annotate spans so that trace data can be queried, filtered, and understood.

Examples include tag keys like db.instance to identify a database instance, http.status_code to represent the HTTP response code, or error, which can be set to true if the operation represented by the Span fails.

Example

  • db.instance:"customers"
  • db.statement:"SELECT * FROM mytable WHERE foo='bar'"
  • peer.address:"mysql://127.0.0.1:3306/customers"

Log

Logs are key:value pairs that capture span-specific logging messages and other debugging or informational output from the application itself.

Logs are useful for documenting a specific moment or event within the span (in contrast to tags, which should apply to the span as a whole).
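
The difference can be sketched with plain dictionaries: tags are what you filter spans by, while logs record what happened at a given moment inside one span. The span layout and the `find_by_tag` helper below are hypothetical and only serve the illustration.

```python
# Hypothetical finished spans, represented as plain dicts for illustration.
spans = [
    {
        "operation": "db_query",
        "tags": {"db.instance": "customers", "error": True},
        # Each log is (seconds since span start, key:value fields).
        "logs": [(0.012, {"event": "error", "message": "connection refused"})],
    },
    {
        "operation": "http_get",
        "tags": {"http.status_code": 200},
        "logs": [],
    },
]


def find_by_tag(spans, key, value):
    """Return the spans whose tag `key` equals `value`."""
    return [s for s in spans if s["tags"].get(key) == value]


# Tags answer "which spans?": here, every span that failed.
failed = find_by_tag(spans, "error", True)
failed_operations = [s["operation"] for s in failed]

# Logs answer "what happened, and when?" inside one span.
first_log_offset, first_log_fields = failed[0]["logs"][0]
```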

SpanContext

Carries data across process boundaries.

It has two major components:

  1. Implementation-dependent state that identifies the distinct span within a trace
    • i.e., the implementing Tracer’s definition of spanID and traceID
  2. Any Baggage Items
    • These are key:value pairs that cross process boundaries.
    • They are useful for making some data available throughout the whole trace.
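
Crossing a process boundary usually means writing the SpanContext into a carrier, such as HTTP headers, on one side and reading it back on the other. The sketch below shows the idea under stated assumptions: the header names and the `inject` and `extract` functions are invented for this example, and real tracers define their own wire format.

```python
from dataclasses import dataclass, field
from typing import Dict

# Header names are made up for this sketch; real tracers define their own.
TRACE_HEADER = "x-trace-id"
SPAN_HEADER = "x-span-id"
BAGGAGE_PREFIX = "x-baggage-"


@dataclass
class SpanContext:
    trace_id: str
    span_id: str
    baggage: Dict[str, str] = field(default_factory=dict)


def inject(context, carrier):
    """Write the span context into a dict of outgoing headers."""
    carrier[TRACE_HEADER] = context.trace_id
    carrier[SPAN_HEADER] = context.span_id
    for key, value in context.baggage.items():
        carrier[BAGGAGE_PREFIX + key] = value


def extract(carrier):
    """Rebuild the span context from a dict of incoming headers."""
    baggage = {
        key[len(BAGGAGE_PREFIX):]: value
        for key, value in carrier.items()
        if key.startswith(BAGGAGE_PREFIX)
    }
    return SpanContext(carrier[TRACE_HEADER], carrier[SPAN_HEADER], baggage)


# Sending side: serialize the context into the request headers.
outgoing = SpanContext("abc123", "xyz789", {"special_id": "vsid1738"})
headers = {}
inject(outgoing, headers)

# Receiving side: recover the same context, baggage included.
incoming = extract(headers)
```

Since baggage travels with every downstream call, it is probably best kept small.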

Example Span:

    t=0            operation name: db_query               t=x

     +-----------------------------------------------------+
     | · · · · · · · · · ·    Span     · · · · · · · · · · |
     +-----------------------------------------------------+

Tags:
- db.instance:"customers"
- db.statement:"SELECT * FROM mytable WHERE foo='bar'"
- peer.address:"mysql://127.0.0.1:3306/customers"

Logs:
- message:"Can't connect to mysql server on '127.0.0.1'(10061)"

SpanContext:
- trace_id:"abc123"
- span_id:"xyz789"
- Baggage Items:
  - special_id:"vsid1738"

Tips for Implementation

  • Use dependency injection where possible. This will make things easily testable and configurable.
  • Follow the idioms of your language and frameworks as much as possible. This will let your team members easily onboard into OpenTracing and tracing in general.
  • Many frameworks provide extensibility points around units of work. For example, Spring Boot has pre- and post-request handlers for web requests. Leverage these as much as possible to save you effort when instrumenting tracing.
  • If you don’t have a framework or can’t use its extensibility points, keep most of the tracing instrumentation as isolated from the business logic as possible. You can use patterns like Decorator and Chain of Responsibility for this, or even aspect-oriented programming. This will make your code’s intent clearer and easier to read. Exceptions to this include when you need to add tags or log an event; these are commonly specific to the business logic your code is executing.
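
The Decorator idea from the last tip might look like the sketch below. The `traced` decorator and the `recorded_spans` list are stand-ins invented for this example; with a real tracer, the wrapper would start and finish a span through the tracer instead of appending a dict to a list.

```python
import functools
import time

# Stand-in for a real tracer: finished spans are appended to this list.
recorded_spans = []


def traced(operation_name):
    """Wrap a function in a span without touching its business logic."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            span = {"operation": operation_name, "tags": {}, "start": time.time()}
            try:
                return func(*args, **kwargs)
            except Exception:
                # Mark the span as failed, then let the error propagate.
                span["tags"]["error"] = True
                raise
            finally:
                span["finish"] = time.time()
                recorded_spans.append(span)
        return wrapper
    return decorator


@traced("calculate_total")
def calculate_total(prices):
    # Pure business logic: no tracing code in here.
    return sum(prices)


total = calculate_total([5, 10, 15])
```

The function body stays free of instrumentation, so the tracing can be changed or removed by editing only the decorator.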
