Lightweight Log Tracing with Alibaba TTL + Log4j2 + MDC: No More Needle-in-a-Haystack Log Searches
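Pain points in log-based troubleshooting
In day-to-day development and operations, these situations come up again and again:
- Something breaks in production and you need to find out quickly which user's request failed
- The logs interleave many concurrent requests, and you cannot tell which line belongs to which request
- You need to follow a single request from the moment it enters the system until it completes
- In a distributed system one request passes through several services, and its logs are scattered across all of them
Conventional logging shows only isolated fragments; it gives no end-to-end view of a request.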
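Solution outline
This article shows how to build lightweight log tracing with Alibaba TTL + Log4j2 + MDC.
The core ideas are:
- Request tracing: generate a unique identifier for every request
- Context propagation: keep that identifier available for as long as the request is being processed
- Log correlation: attach the identifier to every log line
- Cross-thread propagation: make sure the trace information survives asynchronous processing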
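Technology choices
- Alibaba TTL (TransmittableThreadLocal): propagates ThreadLocal values to tasks that run on pooled threads
- Log4j2: a high-performance logging framework
- MDC (Mapped Diagnostic Context): a per-thread key/value map that the logging framework can print on every line
- Spring Boot: for quick integration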
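To see why a plain ThreadLocal, or even an InheritableThreadLocal, is not enough once a thread pool is involved, the following stdlib-only sketch may help. It does not use TTL itself; the class name `PoolContextLossDemo` and the sample IDs are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolContextLossDemo {
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    static final InheritableThreadLocal<String> INHERITED = new InheritableThreadLocal<>();

    /** Returns {plain value on worker, inherited value for request 1, inherited value for request 2}. */
    static String[] observe() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            PLAIN.set("req-1");
            INHERITED.set("req-1");
            // The single worker thread is created by the first submit and inherits "req-1".
            String plainSeen = pool.submit(() -> String.valueOf(PLAIN.get())).get();
            String firstSeen = pool.submit(() -> String.valueOf(INHERITED.get())).get();

            // A second "request" arrives, but the worker already exists and is simply reused.
            INHERITED.set("req-2");
            String secondSeen = pool.submit(() -> String.valueOf(INHERITED.get())).get();
            return new String[] {plainSeen, firstSeen, secondSeen};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            PLAIN.remove();
            INHERITED.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", observe()));
    }
}
```

Running `main` prints `null, req-1, req-1`: the plain ThreadLocal is invisible to the worker, and the inheritable one is frozen at the value it had when the worker thread was created. Closing that gap for reused threads is what TTL is brought in for.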
Core implementation
1. Dependencies
First add the required dependencies to the project:
<dependencies>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>transmittable-thread-local</artifactId>
        <version>2.14.2</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-log4j2</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
        <exclusions>
            <!-- Exclude the default Logback starter -->
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-logging</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
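2. Trace context management
Create a class that manages the trace context. Two details deserve attention. By default TTL hands the captured value to the worker thread by reference, so copy() and childValue() are overridden to give every task its own map. The MDC also lives in a ThreadLocal of its own that TTL knows nothing about, so the beforeExecute() and afterExecute() callbacks mirror the map into the MDC on the worker thread:
import com.alibaba.ttl.TransmittableThreadLocal;
import org.slf4j.MDC;

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class TraceContext {
    private static final String TRACE_ID = "traceId";
    private static final String SPAN_ID = "spanId";
    private static final String PARENT_SPAN_ID = "parentSpanId";

    // TTL-backed holder so that the values follow tasks submitted to TTL-wrapped pools
    private static final TransmittableThreadLocal<Map<String, String>> context =
            new TransmittableThreadLocal<Map<String, String>>() {
                @Override
                protected Map<String, String> initialValue() {
                    return new HashMap<>();
                }

                // Each task gets its own copy instead of sharing the submitter's map
                @Override
                public Map<String, String> copy(Map<String, String> parentValue) {
                    return parentValue == null ? null : new HashMap<>(parentValue);
                }

                // Same for threads created directly with new Thread(...)
                @Override
                protected Map<String, String> childValue(Map<String, String> parentValue) {
                    return copy(parentValue);
                }

                // Runs on the worker thread after TTL has replayed the captured value
                @Override
                protected void beforeExecute() {
                    MDC.clear();
                    get().forEach(MDC::put);
                }

                // Runs on the worker thread before TTL restores its previous value
                @Override
                protected void afterExecute() {
                    MDC.clear();
                }
            };

    /** Sets the trace ID. */
    public static void setTraceId(String traceId) {
        context.get().put(TRACE_ID, traceId);
        // Mirror the value into the MDC so the logging framework can print it
        MDC.put(TRACE_ID, traceId);
    }

    /** Returns the trace ID. */
    public static String getTraceId() {
        return context.get().get(TRACE_ID);
    }

    /** Sets the span ID. */
    public static void setSpanId(String spanId) {
        context.get().put(SPAN_ID, spanId);
        MDC.put(SPAN_ID, spanId);
    }

    /** Returns the span ID. */
    public static String getSpanId() {
        return context.get().get(SPAN_ID);
    }

    /** Sets the parent span ID. */
    public static void setParentSpanId(String parentSpanId) {
        context.get().put(PARENT_SPAN_ID, parentSpanId);
        MDC.put(PARENT_SPAN_ID, parentSpanId);
    }

    /** Returns the parent span ID. */
    public static String getParentSpanId() {
        return context.get().get(PARENT_SPAN_ID);
    }

    /** Returns a copy of the whole context. */
    public static Map<String, String> getAllContext() {
        return new HashMap<>(context.get());
    }

    /** Clears the context of the current thread. */
    public static void clear() {
        context.remove();
        MDC.clear();
    }

    /** Generates a new trace ID (32 hex characters). */
    public static String generateTraceId() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    /**
     * Generates a new span ID. System.nanoTime() is not globally unique;
     * it is only meant to tell spans apart within a single trace.
     */
    public static String generateSpanId() {
        return String.valueOf(System.nanoTime());
    }
}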
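TraceContext stores a parentSpanId, but the article never shows how it would be filled in. One way to read the three fields is sketched below with a small immutable value class. `SpanSketch` and its methods are hypothetical names, and the random hex IDs are a substitute for the article's nanoTime-based span IDs, chosen only to keep the sketch self-contained.

```java
import java.util.UUID;

final class SpanSketch {
    final String traceId;      // shared by every span of one request
    final String spanId;       // identifies this unit of work
    final String parentSpanId; // null for the root span

    private SpanSketch(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    private static String newId(int hexChars) {
        return UUID.randomUUID().toString().replace("-", "").substring(0, hexChars);
    }

    /** Starts a new trace: fresh trace ID, fresh span ID, no parent. */
    static SpanSketch root() {
        return new SpanSketch(newId(32), newId(16), null);
    }

    /** Starts a nested unit of work: same trace, new span, parent = current span. */
    SpanSketch child() {
        return new SpanSketch(traceId, newId(16), spanId);
    }

    public static void main(String[] args) {
        SpanSketch root = root();
        SpanSketch child = root.child();
        System.out.println(child.traceId.equals(root.traceId));
        System.out.println(child.parentSpanId.equals(root.spanId));
    }
}
```

With the TraceContext class above, the equivalent would presumably be to call setParentSpanId(getSpanId()) followed by setSpanId(generateSpanId()) at the point where a nested unit of work begins.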
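3. Request interceptor
Create an interceptor that initializes the trace context at the start of every request. Note that @Order on the interceptor bean does not influence its position in the interceptor chain; that position is decided where the interceptor is registered:
import jakarta.servlet.http.HttpServletRequest;   // javax.servlet.http on Spring Boot 2.x
import jakarta.servlet.http.HttpServletResponse;  // javax.servlet.http on Spring Boot 2.x
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;
import org.springframework.web.servlet.HandlerInterceptor;

@Component
public class TraceInterceptor implements HandlerInterceptor {
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // Reuse the trace ID from the request header if an upstream service sent one
        String traceId = request.getHeader("X-Trace-Id");
        if (!StringUtils.hasText(traceId)) {
            traceId = TraceContext.generateTraceId();
        }
        String spanId = TraceContext.generateSpanId();
        TraceContext.setTraceId(traceId);
        TraceContext.setSpanId(spanId);
        // Echo the trace ID in the response header to help with front-end debugging
        response.setHeader("X-Trace-Id", traceId);
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        // Clear the context once the request completes so that nothing leaks into the next request on this thread
        TraceContext.clear();
    }
}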
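Because the incoming X-Trace-Id header is supplied by the caller and ends up both in the logs and in the response header, it may be worth checking its shape before reusing it. The sketch below is one possible "reuse or generate" helper. `TraceIdResolver` is a hypothetical name, and the accepted character set and the 64-character limit are assumptions, not something the article prescribes.

```java
import java.util.UUID;
import java.util.regex.Pattern;

final class TraceIdResolver {
    // Assumption: a usable ID is 1 to 64 characters of letters, digits, '-' or '_'.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    /** Returns the header value if it looks like a trace ID, otherwise a freshly generated one. */
    static String resolve(String headerValue) {
        if (headerValue != null && SAFE_ID.matcher(headerValue).matches()) {
            return headerValue;
        }
        // Same generation scheme as TraceContext.generateTraceId() in the article
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(resolve("abc-123"));   // reused as-is
        System.out.println(resolve("bad value")); // replaced by a generated ID
    }
}
```

In the interceptor, the header lookup and the emptiness check could then collapse into a single call such as TraceIdResolver.resolve(request.getHeader("X-Trace-Id")).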
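4. Registering the interceptor
Register the interceptor and give it the highest precedence so that it runs before any other interceptor:
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.Ordered;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {
    @Autowired
    private TraceInterceptor traceInterceptor;

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(traceInterceptor)
                .addPathPatterns("/**") // intercept every request
                .excludePathPatterns("/health", "/actuator/**") // skip health checks and similar endpoints
                .order(Ordered.HIGHEST_PRECEDENCE); // run before other interceptors
    }
}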
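5. TTL-aware thread pools
Wrap the thread pools with TTL so that the context is carried into asynchronous tasks:
import com.alibaba.ttl.threadpool.TtlExecutors;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

import java.util.concurrent.Executor;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;

@Configuration
@EnableAsync // required for @Async, unless it is already enabled elsewhere
public class ThreadPoolConfig {
    @Bean("traceableExecutor")
    public Executor traceableExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(200);
        executor.setThreadNamePrefix("traceable-");
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        // Wrap the pool so that every submitted task carries the TTL context
        return TtlExecutors.getTtlExecutor(executor);
    }

    @Bean("traceableScheduledExecutor")
    public ScheduledExecutorService traceableScheduledExecutor() {
        ScheduledThreadPoolExecutor scheduledExecutor = new ScheduledThreadPoolExecutor(5);
        // Wrap the scheduled pool in the same way
        return TtlExecutors.getTtlScheduledExecutor(scheduledExecutor);
    }
}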
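Conceptually, a TTL-wrapped executor captures the context when a task is submitted, replays it on the worker thread, and restores the worker's previous state afterwards. The stdlib-only sketch below illustrates that capture/replay/restore cycle for a single ThreadLocal. It is a simplified illustration, not TTL's actual implementation, and `ContextPropagationSketch`, `wrapTask` and `wrapExecutor` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ContextPropagationSketch {
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    private static void setOrRemove(String value) {
        if (value == null) {
            TRACE_ID.remove();
        } else {
            TRACE_ID.set(value);
        }
    }

    /** Capture on the submitting thread; replay and restore on the worker thread. */
    static Runnable wrapTask(Runnable task) {
        final String captured = TRACE_ID.get();       // capture
        return () -> {
            final String backup = TRACE_ID.get();     // what the worker held before
            setOrRemove(captured);                    // replay
            try {
                task.run();
            } finally {
                setOrRemove(backup);                  // restore
            }
        };
    }

    /** Decorates an executor so that every task is wrapped at submission time. */
    static Executor wrapExecutor(Executor delegate) {
        return task -> delegate.execute(wrapTask(task));
    }

    /** Submits two tasks under different trace IDs and returns what each one saw. */
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Executor traced = wrapExecutor(pool);
            String[] seen = new String[2];
            CountDownLatch done = new CountDownLatch(2);
            TRACE_ID.set("trace-A");
            traced.execute(() -> { seen[0] = TRACE_ID.get(); done.countDown(); });
            TRACE_ID.set("trace-B");
            traced.execute(() -> { seen[1] = TRACE_ID.get(); done.countDown(); });
            done.await();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            TRACE_ID.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", demo()));
    }
}
```

Even though both tasks run on the same reused worker thread, each one sees the trace ID that was current when it was submitted, which is the behaviour the TTL-wrapped pools above are meant to provide.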
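6. Asynchronous methods
Methods annotated with @Async must name the TTL-wrapped executor explicitly; otherwise they run on Spring's default executor and the context is lost:
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

@Service
@Slf4j
public class BusinessService {
    @Autowired
    @Qualifier("traceableExecutor")
    private Executor traceableExecutor;

    /**
     * Asynchronous method that runs on the custom, TTL-wrapped pool.
     */
    @Async("traceableExecutor")
    public CompletableFuture<String> processAsync(String data) {
        // The trace information is available directly through TraceContext
        String traceId = TraceContext.getTraceId();
        String spanId = TraceContext.getSpanId();
        log.info("Async processing started, traceId: {}, spanId: {}, data: {}", traceId, spanId, data);
        // Business logic
        String result = processData(data);
        log.info("Async processing finished, traceId: {}, spanId: {}, result: {}", traceId, spanId, result);
        return CompletableFuture.completedFuture(result);
    }

    /**
     * Submits a task manually to the trace-aware pool.
     */
    public void submitTask(String data) {
        traceableExecutor.execute(() -> {
            // TTL has already replayed the submitter's context on this thread; start a new span for the task
            String traceId = TraceContext.getTraceId();
            String spanId = TraceContext.generateSpanId();
            TraceContext.setSpanId(spanId);
            try {
                log.info("Task started, traceId: {}, spanId: {}", traceId, spanId);
                // Business logic
                processTask(data);
                log.info("Task finished, traceId: {}, spanId: {}", traceId, spanId);
            } finally {
                // Clean up the context of the current thread
                TraceContext.clear();
            }
        });
    }

    private String processData(String data) {
        // Simulated business processing
        return "processed_" + data;
    }

    private void processTask(String data) {
        // Simulated task processing
        log.info("Processing task: {}", data);
    }
}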
7. Logging configuration
Configure Log4j2 in log4j2-spring.xml so that the trace fields are printed on every line. The pattern uses the plain %X{key} form because Log4j2's %X converter treats everything inside the braces as the key name; a missing key simply prints as an empty string. The XML declaration must also be the very first thing in the file:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %-5level [%X{traceId}] [%X{spanId}] %logger{36} - %msg%n"/>
        </Console>
        <RollingFile name="RollingFile" fileName="logs/app.log"
                     filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %-5level [%X{traceId}] [%X{spanId}] %logger{36} - %msg%n"/>
            <Policies>
                <TimeBasedTriggeringPolicy/>
                <SizeBasedTriggeringPolicy size="100MB"/>
            </Policies>
            <DefaultRolloverStrategy max="10"/>
        </RollingFile>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="Console"/>
            <AppenderRef ref="RollingFile"/>
        </Root>
    </Loggers>
</Configuration>
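Once every line carries the trace ID in a fixed position, collecting the log lines of one request becomes a simple filter. The sketch below parses lines shaped like the pattern above (timestamp, [thread], level, [traceId], [spanId], ...) and keeps those that belong to a given trace. `TraceLogFilter` is a hypothetical helper, and the sample lines in `main` are invented for illustration, not real output.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TraceLogFilter {
    // date time [thread] LEVEL [traceId] [spanId] rest-of-line
    private static final Pattern LINE = Pattern.compile(
            "^\\S+ \\S+ \\[[^\\]]*\\] \\S+\\s+\\[([^\\]]*)\\] \\[([^\\]]*)\\] .*$");

    /** Returns the trace ID of a log line, or null if the line does not match the layout. */
    static String traceIdOf(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    /** Keeps only the lines that belong to the given trace, in their original order. */
    static List<String> linesForTrace(List<String> lines, String traceId) {
        return lines.stream()
                .filter(line -> traceId.equals(traceIdOf(line)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2026-01-12 10:00:00.001 [exec-1] INFO  [aaa111] [1] c.e.UserController - start",
                "2026-01-12 10:00:00.002 [exec-2] INFO  [bbb222] [2] c.e.UserController - start",
                "2026-01-12 10:00:00.003 [traceable-1] ERROR [aaa111] [3] c.e.BusinessService - failed");
        linesForTrace(lines, "aaa111").forEach(System.out::println);
    }
}
```

In practice the same filtering is usually done with grep or a log platform query; the point is only that the bracketed trace ID makes the lines of one request trivially selectable.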
8. Global exception handler
Include the trace information in exception handling as well, so that the ID returned to the client can be looked up in the logs:
import lombok.extern.slf4j.Slf4j;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
@Slf4j
public class GlobalExceptionHandler {
    @ExceptionHandler(Exception.class)
    public ResponseEntity<ErrorResponse> handleException(Exception e) {
        String traceId = TraceContext.getTraceId();
        String spanId = TraceContext.getSpanId();
        log.error("System error, traceId: {}, spanId: {}, error: {}", traceId, spanId, e.getMessage(), e);
        ErrorResponse errorResponse = new ErrorResponse();
        errorResponse.setCode("SYSTEM_ERROR");
        errorResponse.setMessage("System error, please contact the administrator");
        errorResponse.setTraceId(traceId);
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(errorResponse);
    }

    @ExceptionHandler(BusinessException.class)
    public ResponseEntity<ErrorResponse> handleBusinessException(BusinessException e) {
        String traceId = TraceContext.getTraceId();
        String spanId = TraceContext.getSpanId();
        log.warn("Business error, traceId: {}, spanId: {}, code: {}, message: {}",
                traceId, spanId, e.getCode(), e.getMessage());
        ErrorResponse errorResponse = new ErrorResponse();
        errorResponse.setCode(e.getCode());
        errorResponse.setMessage(e.getMessage());
        errorResponse.setTraceId(traceId);
        return ResponseEntity.badRequest().body(errorResponse);
    }
}
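The handlers above rely on ErrorResponse and BusinessException, which the article does not show. A minimal shape that is consistent with the calls used above (setCode, setMessage, setTraceId, getCode) might look like the following; the field layout and the constructor are assumptions.

```java
// Hypothetical response body; only the setters used by the handlers are known from the article.
final class ErrorResponse {
    private String code;
    private String message;
    private String traceId;

    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }

    public String getMessage() { return message; }
    public void setMessage(String message) { this.message = message; }

    public String getTraceId() { return traceId; }
    public void setTraceId(String traceId) { this.traceId = traceId; }
}

// Hypothetical business exception that carries an error code next to the message.
final class BusinessException extends RuntimeException {
    private final String code;

    public BusinessException(String code, String message) {
        super(message);
        this.code = code;
    }

    public String getCode() { return code; }
}
```

Returning the traceId in the body gives the caller a value to quote when reporting a problem, which can then be searched for directly in the logs.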
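9. Feign client integration
If services call each other through Feign, the trace ID has to be forwarded in a request header:
import feign.RequestInterceptor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FeignConfig {
    @Bean
    public RequestInterceptor traceIdRequestInterceptor() {
        return requestTemplate -> {
            String traceId = TraceContext.getTraceId();
            if (traceId != null) {
                requestTemplate.header("X-Trace-Id", traceId);
            }
        };
    }
}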
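The same idea applies to any other HTTP client: copy the current trace ID into the X-Trace-Id header of the outgoing request. As an illustration outside of Feign, the sketch below does this with the JDK's java.net.http API (Java 11+). `TraceHeaderPropagation` is a hypothetical helper and the URL in `main` is a placeholder; no request is actually sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class TraceHeaderPropagation {
    static final String TRACE_HEADER = "X-Trace-Id";

    /** Builds a GET request that carries the trace ID, if one is present. */
    static HttpRequest withTraceId(URI uri, String traceId) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri).GET();
        if (traceId != null && !traceId.isEmpty()) {
            builder.header(TRACE_HEADER, traceId);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // In the article's setup the ID would come from TraceContext.getTraceId()
        HttpRequest request = withTraceId(URI.create("http://localhost:8080/api/user/1"), "abc123");
        System.out.println(request.headers().firstValue(TRACE_HEADER).orElse("<none>"));
    }
}
```

On the receiving side, the interceptor from step 3 picks the header up again, so both services log under the same trace ID.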
10. Usage example
Using the trace context in business code:
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/user")
@Slf4j
public class UserController {
    @Autowired
    private UserService userService;

    @GetMapping("/{id}")
    public ResponseEntity<User> getUser(@PathVariable Long id) {
        String traceId = TraceContext.getTraceId();
        String spanId = TraceContext.getSpanId();
        log.info("Looking up user, traceId: {}, spanId: {}, userId: {}", traceId, spanId, id);
        try {
            User user = userService.getUserById(id);
            log.info("User lookup succeeded, traceId: {}, spanId: {}, userId: {}, result: {}",
                    traceId, spanId, id, user != null ? "found" : "not found");
            return ResponseEntity.ok(user);
        } catch (Exception e) {
            log.error("User lookup failed, traceId: {}, spanId: {}, userId: {}, error: {}",
                    traceId, spanId, id, e.getMessage(), e);
            throw e;
        }
    }

    @PostMapping
    public ResponseEntity<User> createUser(@RequestBody User user) {
        String traceId = TraceContext.getTraceId();
        String spanId = TraceContext.getSpanId();
        log.info("Creating user, traceId: {}, spanId: {}, userData: {}", traceId, spanId, user);
        User createdUser = userService.createUser(user);
        log.info("User created, traceId: {}, spanId: {}, userId: {}",
                traceId, spanId, createdUser.getId());
        return ResponseEntity.ok(createdUser);
    }
}
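Advantages
Compared with conventional logging, this approach has clear advantages:
- Request tracing: every log line carries the trace ID, so all lines of one request are easy to correlate
- Cross-thread propagation: TTL keeps the trace information intact during asynchronous processing
- Lightweight: no full APM system is required, which keeps the cost low
- Low overhead: the added work is an ID generation and a few map operations per request, plus a context copy per submitted task
- Easy to adopt: it can be introduced into an existing system step by step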
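Things to watch out for
- Memory management: clear the ThreadLocal promptly to avoid leaks and stale values on reused threads
- Performance: the overhead is small, but it still deserves attention under high concurrency
- Log format: use one consistent log format to make later analysis easier
- Security: do not write sensitive information to the logs
- Testing: test the cross-thread propagation scenarios thoroughly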
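The memory-management point is easy to underestimate, because pooled threads outlive the requests they serve. The stdlib-only sketch below shows what the next task on a reused thread sees when the previous one does or does not clean up. `ThreadLocalCleanupDemo` and the sample value are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanupDemo {
    static final ThreadLocal<String> CTX = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String seenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First "request": sets the context and optionally removes it again.
            pool.submit(() -> {
                CTX.set("request-1");
                try {
                    // ... handle the request ...
                } finally {
                    if (cleanUp) {
                        CTX.remove();
                    }
                }
            }).get();
            // Second "request" on the same thread: it never set anything itself.
            return pool.submit(() -> String.valueOf(CTX.get())).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + seenByNextTask(false));
        System.out.println("with cleanup:    " + seenByNextTask(true));
    }
}
```

Without the cleanup, the second task reads `request-1`, a value that belongs to a different request. This is why the interceptor calls TraceContext.clear() in afterCompletion.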
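Summary
Combining Alibaba TTL, Log4j2 and MDC yields a log tracing setup that is lightweight yet capable. It speeds up troubleshooting considerably and also makes the runtime behaviour of the system easier to understand.
In a real project, adapt it to your own requirements and establish matching logging conventions and analysis workflows.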
Author: jiangyi
Source: http://jiangyi.space/articles/2026/01/12/1768217231337.html