Call-Chain Monitoring with Dubbo and Zipkin

Collector abstraction

Zipkin accepts span data over either HTTP or Kafka, so the configuration needs a layer of abstraction.

AbstractZipkinCollectorConfiguration

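This class defines the configuration shared by the two collection methods below. Its core is the abstract Sender accessor, because HTTP and Kafka are two entirely different implementations.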

public abstract Sender getSender();

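Next comes a convenience constructor that gathers the parameters needed to build the collector.

zipkinUrl

For HTTP collection this is the Zipkin API host; for Kafka it is the address of the Kafka cluster.

topic

Only used when the collection method is Kafka; pass null for HTTP.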

public AbstractZipkinCollectorConfiguration(String serviceName, String zipkinUrl, String topic) {
    this.zipkinUrl = zipkinUrl;
    this.serviceName = serviceName;
    this.topic = topic;
    this.tracing = this.tracing();
}

Configure the reporter. Spans are always reported asynchronously here, and a close timeout is set on the reporter.

protected AsyncReporter<Span> spanReporter() {
    return AsyncReporter
            .builder(getSender())
            .closeTimeout(500, TimeUnit.MILLISECONDS)
            .build(SpanBytesEncoder.JSON_V2);
}

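The two methods below are used by the application when it builds spans.

Pay attention to the sampler() call: only sampled spans are reported, so it is set explicitly to Sampler.ALWAYS_SAMPLE here to make sure every span actually reaches the Zipkin server.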

protected Tracing tracing() {
    this.tracing = Tracing
            .newBuilder()
            .localServiceName(this.serviceName)
            .sampler(Sampler.ALWAYS_SAMPLE)
            .spanReporter(spanReporter())
            .build();
    return this.tracing;
}

protected Tracing getTracing() {
    return this.tracing;
}

HttpZipkinCollectorConfiguration

The main job here is to implement getSender. OkHttpSender makes this quick, and the v2 API is used.

public class HttpZipkinCollectorConfiguration extends AbstractZipkinCollectorConfiguration {

    public HttpZipkinCollectorConfiguration(String serviceName, String zipkinUrl) {
        super(serviceName, zipkinUrl, null);
    }

    @Override
    public Sender getSender() {
        return OkHttpSender.create(super.getZipkinUrl() + "/api/v2/spans");
    }
}

OkHttpSender requires this dependency:

<dependency>
    <groupId>io.zipkin.reporter2</groupId>
    <artifactId>zipkin-sender-okhttp3</artifactId>
    <version>${zipkin-reporter2.version}</version>
</dependency>
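The HTTP sender above builds its endpoint by plain string concatenation, so a configured zipkinUrl that ends in a slash would produce a double slash in the path. The following is a minimal sketch of a helper that normalizes the base URL first. ZipkinEndpoints and spansEndpoint are hypothetical names that do not appear in the original article or in zipkin-reporter, and http://localhost:9411 is only an example address.

```java
// Hypothetical helper for illustration; not part of the article or of zipkin-reporter.
final class ZipkinEndpoints {
    private ZipkinEndpoints() {}

    /** Joins a base URL with the v2 span path, tolerating trailing slashes on the base. */
    static String spansEndpoint(String baseUrl) {
        if (baseUrl == null || baseUrl.trim().isEmpty()) {
            throw new IllegalArgumentException("zipkinUrl must not be empty");
        }
        String trimmed = baseUrl.trim();
        // Strip every trailing slash so the joined path has exactly one separator.
        while (trimmed.endsWith("/")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return trimmed + "/api/v2/spans";
    }
}
```

With such a helper, getSender could pass ZipkinEndpoints.spansEndpoint(getZipkinUrl()) to OkHttpSender.create instead of concatenating inline.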

KafkaZipkinCollectorConfiguration

Likewise, this class implements getSender.

public class KafkaZipkinCollectorConfiguration extends AbstractZipkinCollectorConfiguration {

    public KafkaZipkinCollectorConfiguration(String serviceName, String zipkinUrl, String topic) {
        super(serviceName, zipkinUrl, topic);
    }

    @Override
    public Sender getSender() {
        return KafkaSender
                .newBuilder()
                .bootstrapServers(super.getZipkinUrl())
                .topic(super.getTopic())
                .encoding(Encoding.JSON)
                .build();
    }
}

KafkaSender requires this dependency:

<dependency>
    <groupId>io.zipkin.reporter2</groupId>
    <artifactId>zipkin-sender-kafka11</artifactId>
    <version>${zipkin-reporter2.version}</version>
</dependency>

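Collector factory

Two collector configuration classes were created above, but only one of them can be active at a time, so the running instance has to be created dynamically from configuration. ZipkinCollectorConfigurationFactory is responsible for creating that instance.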

private final AbstractZipkinCollectorConfiguration zipkinCollectorConfiguration;

@Autowired
public ZipkinCollectorConfigurationFactory(TraceConfig traceConfig) {
    if (Objects.equal("kafka", traceConfig.getZipkinSendType())) {
        zipkinCollectorConfiguration = new KafkaZipkinCollectorConfiguration(
                traceConfig.getApplicationName(),
                traceConfig.getZipkinUrl(),
                traceConfig.getZipkinKafkaTopic());
    }
    else {
        zipkinCollectorConfiguration = new HttpZipkinCollectorConfiguration(
                traceConfig.getApplicationName(),
                traceConfig.getZipkinUrl());
    }
}

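The TraceConfig configuration class is injected through the constructor, and the instance is built according to the configured send type. A helper method is also provided: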

public Tracing getTracing() {
    return this.zipkinCollectorConfiguration.getTracing();
}
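The factory depends on TraceConfig, which the article uses but never shows. The following is a rough sketch of what such a class might look like, inferred only from the getters called above and in the filters. The class name TraceConfigSketch, the constructor shape, usesKafka, and the rule that Kafka requires a topic are all assumptions made for illustration, not the author's code.

```java
// Hypothetical sketch; the real TraceConfig is not shown in the article.
final class TraceConfigSketch {
    private final boolean enabled;
    private final String applicationName;
    private final String zipkinSendType;   // "kafka" selects Kafka; anything else falls back to HTTP
    private final String zipkinUrl;        // Zipkin base URL for HTTP, bootstrap servers for Kafka
    private final String zipkinKafkaTopic; // only meaningful for Kafka

    TraceConfigSketch(boolean enabled, String applicationName, String zipkinSendType,
                      String zipkinUrl, String zipkinKafkaTopic) {
        this.enabled = enabled;
        this.applicationName = applicationName;
        this.zipkinSendType = zipkinSendType;
        this.zipkinUrl = zipkinUrl;
        this.zipkinKafkaTopic = zipkinKafkaTopic;
        // Assumed rule: fail fast instead of building a Kafka sender without a topic.
        if (usesKafka() && (zipkinKafkaTopic == null || zipkinKafkaTopic.isEmpty())) {
            throw new IllegalArgumentException("a topic is required when the send type is kafka");
        }
    }

    boolean isEnabled() { return enabled; }
    String getApplicationName() { return applicationName; }
    String getZipkinSendType() { return zipkinSendType; }
    String getZipkinUrl() { return zipkinUrl; }
    String getZipkinKafkaTopic() { return zipkinKafkaTopic; }

    /** Mirrors the case-sensitive "kafka" comparison used by the factory. */
    boolean usesKafka() { return "kafka".equals(zipkinSendType); }
}
```

In a Spring application the values would normally be bound from properties; the constructor form is used here only to keep the sketch self-contained.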

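Filters

Reporting data from a Dubbo filter is fairly simple: record data before invoke runs, then record again once it has completed. That logic is unchanged. What changes is how the data is reported. The previous version sent HTTP requests by hand, which required custom encoding; now the span provided by Brave can do the work for us. Two methods matter:

finish

The source is shown below. On completion it fills in the finish timestamp and reports the span. This is the usual choice for synchronous calls.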

public void finish(TraceContext context, long finishTimestamp) {
    MutableSpan span = this.spanMap.remove(context);
    if (span != null && !this.noop.get()) {
        synchronized (span) {
            span.finish(Long.valueOf(finishTimestamp));
            this.reporter.report(span.toSpan());
        }
    }
}

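flush

Unlike finish above, flush reports the span without a finish timestamp. It is presumably meant for asynchronous calls whose result is not needed, such as the oneway invocations Dubbo offers.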

public void flush(TraceContext context) {
    MutableSpan span = this.spanMap.remove(context);
    if (span != null && !this.noop.get()) {
        synchronized (span) {
            span.finish((Long) null);
            this.reporter.report(span.toSpan());
        }
    }
}
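To make the difference between the two methods concrete, here is a toy model written against plain Java rather than Brave. ToySpan and its fields are hypothetical; the sketch only mirrors the idea described above, namely that finish records a duration while flush reports the span without one.

```java
// Toy model for illustration only; this is not Brave's MutableSpan.
final class ToySpan {
    private final long startMicros;
    private Long durationMicros;   // stays null when the span is flushed
    private boolean reported;

    ToySpan(long startMicros) {
        this.startMicros = startMicros;
    }

    /** Records the elapsed time, then reports the span. */
    void finish(long finishMicros) {
        this.durationMicros = finishMicros - startMicros;
        this.reported = true;
    }

    /** Reports the span as-is, without ever computing a duration. */
    void flush() {
        this.reported = true;
    }

    Long durationMicros() { return durationMicros; }
    boolean isReported() { return reported; }
}
```

A flushed span still shows up in the trace, but nothing can be said about how long the remote work took.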

Consumer

On the consumer side, one core responsibility is to pass the traceId and spanId on to the service provider. This is still done through Dubbo's attachments.

@Override
public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
    if (!RpcTraceContext.getTraceConfig().isEnabled()) {
        return invoker.invoke(invocation);
    }
    ZipkinCollectorConfigurationFactory zipkinCollectorConfigurationFactory =
            SpringContextUtils.getApplicationContext().getBean(ZipkinCollectorConfigurationFactory.class);
    Tracer tracer = zipkinCollectorConfigurationFactory.getTracing().tracer();
    if (null == RpcTraceContext.getTraceId()) {
        RpcTraceContext.start();
        RpcTraceContext.setTraceId(IdUtils.get());
        RpcTraceContext.setParentId(null);
        RpcTraceContext.setSpanId(IdUtils.get());
    }
    else {
        RpcTraceContext.setParentId(RpcTraceContext.getSpanId());
        RpcTraceContext.setSpanId(IdUtils.get());
    }
    TraceContext traceContext = TraceContext.newBuilder()
            .traceId(RpcTraceContext.getTraceId())
            .parentId(RpcTraceContext.getParentId())
            .spanId(RpcTraceContext.getSpanId())
            .sampled(true)
            .build();
    Span span = tracer.toSpan(traceContext).start();
    invocation.getAttachments().put(RpcTraceContext.TRACE_ID_KEY, String.valueOf(span.context().traceId()));
    invocation.getAttachments().put(RpcTraceContext.SPAN_ID_KEY, String.valueOf(span.context().spanId()));
    Result result = invoker.invoke(invocation);
    span.finish();
    return result;
}
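The consumer filter relies on RpcTraceContext and IdUtils, neither of which is shown in the article. The following sketch guesses at how a thread-local holder of that kind could work. RpcTraceContextSketch, IdUtilsSketch, nextSpan, and clear are hypothetical names, and random positive longs are just one possible way to generate ids.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch; the article's RpcTraceContext and IdUtils are not shown.
final class RpcTraceContextSketch {
    private static final class Ids {
        Long traceId;
        Long parentId;
        Long spanId;
    }

    private static final ThreadLocal<Ids> CURRENT = new ThreadLocal<>();

    private RpcTraceContextSketch() {}

    /** Begins a fresh context for the current thread. */
    static void start() { CURRENT.set(new Ids()); }

    /** Drops the context so pooled threads do not leak ids into later requests. */
    static void clear() { CURRENT.remove(); }

    static Long getTraceId() { Ids ids = CURRENT.get(); return ids == null ? null : ids.traceId; }
    static Long getParentId() { Ids ids = CURRENT.get(); return ids == null ? null : ids.parentId; }
    static Long getSpanId() { Ids ids = CURRENT.get(); return ids == null ? null : ids.spanId; }

    static void setTraceId(Long traceId) { require().traceId = traceId; }
    static void setParentId(Long parentId) { require().parentId = parentId; }
    static void setSpanId(Long spanId) { require().spanId = spanId; }

    /** Makes the current span the parent and installs a new span id, as the filter's else branch does. */
    static void nextSpan(long newSpanId) {
        Ids ids = require();
        ids.parentId = ids.spanId;
        ids.spanId = newSpanId;
    }

    private static Ids require() {
        Ids ids = CURRENT.get();
        if (ids == null) {
            ids = new Ids();
            CURRENT.set(ids);
        }
        return ids;
    }
}

final class IdUtilsSketch {
    private IdUtilsSketch() {}

    /** Returns a random positive id in the range [1, Long.MAX_VALUE). */
    static long get() {
        return ThreadLocalRandom.current().nextLong(1, Long.MAX_VALUE);
    }
}
```

Because Dubbo runs filters on pooled threads, a holder like this would need clear() to be called once the outermost request completes; otherwise a later request on the same thread could inherit a stale traceId.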

Provider

@Override
public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
    if (!RpcTraceContext.getTraceConfig().isEnabled()) {
        return invoker.invoke(invocation);
    }
    Map<String, String> attaches = invocation.getAttachments();
    if (!attaches.containsKey(RpcTraceContext.TRACE_ID_KEY)) {
        return invoker.invoke(invocation);
    }
    Long traceId = Long.valueOf(attaches.get(RpcTraceContext.TRACE_ID_KEY));
    Long spanId = Long.valueOf(attaches.get(RpcTraceContext.SPAN_ID_KEY));
    attaches.remove(RpcTraceContext.TRACE_ID_KEY);
    attaches.remove(RpcTraceContext.SPAN_ID_KEY);
    RpcTraceContext.start();
    RpcTraceContext.setTraceId(traceId);
    RpcTraceContext.setParentId(spanId);
    RpcTraceContext.setSpanId(IdUtils.get());
    ZipkinCollectorConfigurationFactory zipkinCollectorConfigurationFactory =
            SpringContextUtils.getApplicationContext().getBean(ZipkinCollectorConfigurationFactory.class);
    Tracer tracer = zipkinCollectorConfigurationFactory.getTracing().tracer();
    TraceContext traceContext = TraceContext.newBuilder()
            .traceId(RpcTraceContext.getTraceId())
            .parentId(RpcTraceContext.getParentId())
            .spanId(RpcTraceContext.getSpanId())
            .sampled(true)
            .build();
    Span span = tracer.toSpan(traceContext).start();
    Result result = invoker.invoke(invocation);
    span.finish();
    return result;
}
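The provider filter parses the ids with Long.valueOf, which throws NumberFormatException if an attachment is missing or malformed, and it only checks for the presence of the trace id key. The following is a small sketch of a more defensive read, assuming the ids travel as decimal strings as in the consumer filter. TraceAttachments, readId, and the key values "traceId" and "spanId" are hypothetical; the article does not show the actual key constants.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper for illustration; key names are assumed, not taken from the article.
final class TraceAttachments {
    static final String TRACE_ID_KEY = "traceId";
    static final String SPAN_ID_KEY = "spanId";

    private TraceAttachments() {}

    /** Reads a decimal id from the attachments, or returns empty when it is absent or not a number. */
    static OptionalLong readId(Map<String, String> attachments, String key) {
        String raw = attachments.get(key);
        if (raw == null) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

A filter using this could fall back to an untraced invoker.invoke(invocation) whenever either id is empty, so a bad attachment never fails the business call.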

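Exception flow

Neither the consumer filter nor the provider filter above handles the case where invoker.invoke throws. If it does, the span.finish call that follows never runs, and nothing is recorded. To fix this, call finish in a finally block.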

try {
    result = invoker.invoke(invocation);
}
finally {
    span.finish();
}
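The effect of the finally block can be checked without Dubbo or Brave. The sketch below uses a stand-in span that only records what happened to it; FinishInFinallyDemo, RecordingSpan, and traced are hypothetical names. Recording the error message is an optional extra that goes beyond the article; with Brave it could be done with something like span.tag("error", message).

```java
import java.util.concurrent.Callable;

// Stand-in types for illustration; no Dubbo or Brave classes are involved.
final class FinishInFinallyDemo {
    static final class RecordingSpan {
        boolean finished;
        String error;

        void finish() { finished = true; }
    }

    private FinishInFinallyDemo() {}

    /** Runs the call and always finishes the span, even when the call throws. */
    static <T> T traced(RecordingSpan span, Callable<T> call) throws Exception {
        try {
            return call.call();
        } catch (Exception e) {
            span.error = e.getMessage();   // optional: remember why the call failed
            throw e;                       // the caller still sees the original exception
        } finally {
            span.finish();
        }
    }
}
```

The exception is rethrown unchanged, so the tracing code does not alter the behavior the caller observes.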

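Asynchronous calls on the consumer side

The span.finish calls in the filters above all assume synchronous invocation, but Dubbo offers two other invocation modes besides synchronous: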

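Asynchronous call

An asynchronous invocation driven by a callback mechanism.

oneway

An asynchronous call that only sends the request and does not wait for a result, so there is no callback.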

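To record the real service time accurately across these two asynchronous modes as well as synchronous calls, the consumer filter needs the following handling.

First, create a callback handler class. Its purpose is to record the finish time when the callback fires, regardless of whether the call succeeded or failed.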

private class AsyncSpanCallback implements ResponseCallback {

    private Span span;

    public AsyncSpanCallback(Span span) {
        this.span = span;
    }

    @Override
    public void done(Object o) {
        span.finish();
    }

    @Override
    public void caught(Throwable throwable) {
        span.finish();
    }
}

Then, when calling invoke: for a oneway call, end the span with flush; for a synchronous call, call finish directly; for an asynchronous call, call finish from the callback.

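Result result = null;
boolean isOneway = RpcUtils.isOneway(invoker.getUrl(), invocation);
// isAsync was not declared in the original snippet; it is assumed to come from RpcUtils.isAsync
boolean isAsync = RpcUtils.isAsync(invoker.getUrl(), invocation);
try {
    result = invoker.invoke(invocation);
}
finally {
    if (isOneway) {
        span.flush();
    }
    else if (!isAsync) {
        span.finish();
    }
    // async calls: the span is finished later by AsyncSpanCallback
}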
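The three-way decision can be exercised in isolation with a stand-in span and a CompletableFuture playing the role of Dubbo's callback. This is only a sketch of the control flow: SpanCompletionDemo, RecordingSpan, Mode, and complete are hypothetical names, and the way a real filter attaches AsyncSpanCallback to Dubbo's response future is not shown here.

```java
import java.util.concurrent.CompletableFuture;

// Stand-in types for illustration; CompletableFuture substitutes for Dubbo's callback mechanism.
final class SpanCompletionDemo {
    enum Mode { SYNC, ASYNC, ONEWAY }

    static final class RecordingSpan {
        String state = "started";

        void finish() { state = "finished"; }
        void flush() { state = "flushed"; }
    }

    private SpanCompletionDemo() {}

    /** Ends the span in the way that matches the invocation mode. */
    static void complete(RecordingSpan span, Mode mode, CompletableFuture<?> pending) {
        switch (mode) {
            case ONEWAY:
                span.flush();   // no response will arrive, so report without a duration
                break;
            case ASYNC:
                // finish only when the response or the failure comes back
                pending.whenComplete((value, error) -> span.finish());
                break;
            default:
                span.finish();  // synchronous: the call has already returned
        }
    }
}
```

In the asynchronous case the span stays open after the filter returns, which is what allows its duration to cover the time until the response or failure actually arrives.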


Source: oschina

Link: https://my.oschina.net/u/3959468/blog/2236850
