文章 555
评论 5
浏览 199998
SpringBoot + 数据库连接池假死排查:HikariCP 获取连接超时?连接泄漏检测 + 线程 Dump 分析

SpringBoot + 数据库连接池假死排查:HikariCP 获取连接超时?连接泄漏检测 + 线程 Dump 分析

一、问题背景:线上故障的惊魂时刻

某日下午,线上系统突然出现大量接口超时,错误日志显示:

Could not get JDBC Connection; nested exception is 
java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, 
request timed out after 30000ms.

更诡异的是:数据库服务器正常,网络连接正常,应用日志中却没有任何 SQL 执行超时。

这正是数据库连接池"假死"现象的典型特征。


二、核心概念:HikariCP 连接池机制

2.1 连接池工作原理

┌────────────────────────────────────────────────────────────────┐
│                  HikariCP 连接池架构                           │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│   应用线程                                                      │
│       │                                                        │
│       ▼                                                        │
│   ┌────────────────────────────────────────────────────────┐   │
│   │              Connection 请求                           │   │
│   └────────────────────────────────────────────────────────┘   │
│                           │                                    │
│                           ▼                                    │
│   ┌────────────────────────────────────────────────────────┐   │
│   │                   连接池管理器                           │   │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐             │   │
│   │  │ Conn-1   │  │ Conn-2   │  │ Conn-3   │  ...        │   │
│   │  │ ACTIVE   │  │ IDLE     │  │ ACTIVE   │             │   │
│   │  └──────────┘  └──────────┘  └──────────┘             │   │
│   │                                                       │   │
│   │  totalConnections: 20                                │   │
│   │  activeConnections: 18                               │   │
│   │  idleConnections: 2                                  │   │
│   │  threadsAwaitingConnection: 5                        │   │
│   └────────────────────────────────────────────────────────┘   │
│                           │                                    │
│                           ▼                                    │
│   ┌────────────────────────────────────────────────────────┐   │
│   │                    MySQL 数据库                          │   │
│   └────────────────────────────────────────────────────────┘   │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

2.2 连接泄漏的成因分析

┌────────────────────────────────────────────────────────────────┐
│                   连接泄漏成因分析                               │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. 未关闭连接                                                  │
│     try {                                                       │
│         Connection conn = ds.getConnection();                    │
│         // 业务逻辑                                             │
│         // 忘记 conn.close()                                    │
│     } catch (...) {}                                            │
│                                                                 │
│  2. 连接未正确归还                                              │
│     try {                                                       │
│         Connection conn = ds.getConnection();                    │
│         if (condition) {                                        │
│             return;  // 早期返回,连接未关闭                     │
│         }                                                       │
│         conn.close();                                           │
│     } catch (...) {}                                            │
│                                                                 │
│  3. 长事务占用连接                                              │
│     BEGIN TRANSACTION;                                          │
│     // 处理大量数据,耗时过长                                    │
│     // 其他请求等待连接                                          │
│     COMMIT;                                                     │
│                                                                 │
│  4. 异常处理不当                                                │
│     try {                                                       │
│         Connection conn = ds.getConnection();                    │
│         throw new RuntimeException();                           │
│         conn.close();                                           │
│     } catch (Exception e) {                                     │
│         // 连接未关闭                                           │
│     }                                                          │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

2.3 HikariCP 配置参数解析

参数默认值说明
maximumPoolSize10连接池最大连接数
minimumIdle10空闲时保留的最小连接数
connectionTimeout30000ms获取连接超时时间
idleTimeout600000ms空闲连接超时时间
maxLifetime1800000ms连接最大生命周期
connectionTestQuery-连接测试 SQL
leakDetectionThreshold0 (禁用)泄漏检测阈值
poolNameHikariPool连接池名称

三、实现方案:连接泄漏检测 + 线程 Dump 分析

3.1 方案架构设计

┌────────────────────────────────────────────────────────────────┐
│                  连接池健康监控架构                             │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────┐    ┌──────────────────┐                │
│  │   泄漏检测       │───▶│   告警通知        │                │
│  └──────────────────┘    └──────────────────┘                │
│           │                                                     │
│           ▼                                                     │
│  ┌──────────────────┐    ┌──────────────────┐                │
│  │   线程 Dump      │───▶│   堆栈分析        │                │
│  └──────────────────┘    └──────────────────┘                │
│           │                                                     │
│           ▼                                                     │
│  ┌──────────────────┐    ┌──────────────────┐                │
│  │   连接池监控     │───▶│   性能指标        │                │
│  └──────────────────┘    └──────────────────┘                │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

3.2 HikariCP 连接池配置

@Configuration
public class HikariConfig {

    @Bean
    @ConfigurationProperties(prefix = "spring.datasource.hikari")
    public DataSource dataSource() {
        HikariConfigSource config = new HikariConfigSource();
        
        // 基础配置
        config.setPoolName("OrderPool");
        config.setMaximumPoolSize(20);
        config.setMinimumIdle(5);
        
        // 超时配置
        config.setConnectionTimeout(30000);
        config.setIdleTimeout(600000);
        config.setMaxLifetime(1800000);
        
        // 泄漏检测配置(关键!)
        config.setLeakDetectionThreshold(60000);  // 60秒未归还视为泄漏
        
        // 连接测试
        config.setConnectionTestQuery("SELECT 1");
        
        // 其他优化配置
        config.setAutoCommit(true);
        config.setConnectionInitSql("SELECT 1");
        
        return new HikariDataSource(config);
    }
}

3.3 连接泄漏检测工具

@Component
@Slf4j
public class ConnectionLeakDetector {

    @Autowired
    private DataSource dataSource;

    private ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    /**
     * 启动连接泄漏检测
     */
    @PostConstruct
    public void start() {
        // 每分钟检查一次连接池状态
        scheduler.scheduleAtFixedRate(this::checkForLeaks, 1, 1, TimeUnit.MINUTES);
    }

    @PreDestroy
    public void stop() {
        scheduler.shutdown();
    }

    private void checkForLeaks() {
        if (!(dataSource instanceof HikariDataSource)) {
            return;
        }

        HikariDataSource hikariDataSource = (HikariDataSource) dataSource;
        HikariPool pool = hikariDataSource.getHikariPoolMXBean();

        if (pool == null) {
            return;
        }

        try {
            PoolStats stats = pool.getPoolStats();
            
            log.info("HikariCP Pool Stats: active={}, idle={}, total={}, waiting={}", 
                stats.getActiveConnections(),
                stats.getIdleConnections(),
                stats.getTotalConnections(),
                stats.getThreadsAwaitingConnection());

            // 检测潜在泄漏
            if (stats.getThreadsAwaitingConnection() > 0) {
                log.warn("Connection pool is under pressure! {} threads waiting for connection", 
                    stats.getThreadsAwaitingConnection());
                
                // 触发线程 Dump 分析
                triggerThreadDump();
            }

            // 监控连接泄漏
            if (stats.getActiveConnections() > hikariDataSource.getMaximumPoolSize() * 0.8) {
                log.error("Connection pool usage exceeds 80%! Active: {}, Max: {}",
                    stats.getActiveConnections(), 
                    hikariDataSource.getMaximumPoolSize());
                
                // 发送告警
                sendAlert(stats);
            }

        } catch (Exception e) {
            log.error("Failed to check connection pool", e);
        }
    }

    /**
     * 触发线程 Dump
     */
    private void triggerThreadDump() {
        StringBuilder dump = new StringBuilder();
        
        Map<Thread, StackTraceElement[]> allStacks = Thread.getAllStackTraces();
        
        dump.append("\n========== Connection-Related Thread Dump ==========\n");
        
        for (Map.Entry<Thread, StackTraceElement[]> entry : allStacks.entrySet()) {
            Thread thread = entry.getKey();
            StackTraceElement[] stack = entry.getValue();
            
            // 只关注等待数据库连接的线程
            String threadInfo = thread.getName() + " (state=" + thread.getState() + 
                ", daemon=" + thread.isDaemon() + ")";
            
            // 检查是否在等待连接
            boolean waitingForConnection = Arrays.stream(stack)
                .anyMatch(s -> s.toString().contains("Hikari") || 
                              s.toString().contains("getConnection"));
            
            if (waitingForConnection || thread.getName().contains("Hikari")) {
                dump.append("\n").append(threadInfo).append("\n");
                for (StackTraceElement element : stack) {
                    dump.append("    at ").append(element).append("\n");
                }
            }
        }
        
        log.error(dump.toString());
    }

    private void sendAlert(PoolStats stats) {
        // 发送告警(可对接钉钉、企业微信等)
        log.error("ALERT: Connection pool exhaustion detected! Active: {}, Waiting: {}",
            stats.getActiveConnections(),
            stats.getThreadsAwaitingConnection());
    }
}

3.4 线程 Dump 分析工具

@Component
@Slf4j
public class ThreadDumpAnalyzer {

    /**
     * 获取完整线程 Dump
     */
    public String generateFullDump() {
        StringBuilder dump = new StringBuilder();
        dump.append("\n");
        dump.append("================================================================================\n");
        dump.append("                        FULL THREAD DUMP\n");
        dump.append("================================================================================\n");
        dump.append("Generated at: ").append(LocalDateTime.now()).append("\n");
        dump.append("Total threads: ").append(Thread.activeCount()).append("\n\n");

        Map<Thread, StackTraceElement[]> allStacks = Thread.getAllStackTraces();
        
        // 按线程状态分组统计
        Map<Thread.State, List<Thread>> threadsByState = allStacks.keySet().stream()
            .collect(Collectors.groupingBy(Thread::getState));

        dump.append("Thread State Summary:\n");
        for (Map.Entry<Thread.State, List<Thread>> entry : threadsByState.entrySet()) {
            dump.append("  ").append(entry.getKey()).append(": ").append(entry.getValue().size()).append("\n");
        }
        dump.append("\n");

        // 详细线程信息
        for (Map.Entry<Thread, StackTraceElement[]> entry : allStacks.entrySet()) {
            Thread thread = entry.getKey();
            StackTraceElement[] stack = entry.getValue();

            dump.append("--------------------------------------------------------------------------------\n");
            dump.append('"').append(thread.getName()).append('"');
            dump.append(" nid=").append(thread.getId());
            dump.append(" state=").append(thread.getState());
            dump.append(" daemon=").append(thread.isDaemon());
            dump.append(" priority=").append(thread.getPriority());
            dump.append("\n");

            // Thread group
            ThreadGroup group = thread.getThreadGroup();
            if (group != null) {
                dump.append("    java.lang.ThreadGroup: ").append(group.getName());
                dump.append(" [maximumActiveThreads=").append(group.activeCount()).append("]\n");
            }

            // Lock info
            dump.append("    java.lang.Thread@").append(Integer.toHexString(thread.hashCode())).append("\n");

            // Stack trace
            for (StackTraceElement element : stack) {
                dump.append("        at ");
                dump.append(element.getClassName()).append(".");
                dump.append(element.getMethodName());
                dump.append("(").append(element.getFileName());
                if (element.getLineNumber() >= 0) {
                    dump.append(":").append(element.getLineNumber());
                }
                dump.append(")\n");
            }

            dump.append("\n");
        }

        return dump.toString();
    }

    /**
     * 分析数据库连接相关线程
     */
    public String analyzeDatabaseThreads() {
        StringBuilder analysis = new StringBuilder();
        analysis.append("\n");
        analysis.append("================================================================================\n");
        analysis.append("                    DATABASE CONNECTION ANALYSIS\n");
        analysis.append("================================================================================\n\n");

        Map<Thread, StackTraceElement[]> allStacks = Thread.getAllStackTraces();

        // 1. 等待连接的线程
        List<Thread> waitingThreads = new ArrayList<>();
        // 2. 持有连接的线程
        List<Thread> holdingThreads = new ArrayList<>();
        // 3. HikariCP 内部线程
        List<Thread> hikariThreads = new ArrayList<>();

        for (Map.Entry<Thread, StackTraceElement[]> entry : allStacks.entrySet()) {
            Thread thread = entry.getKey();
            String threadName = thread.getName().toLowerCase();
            StackTraceElement[] stack = entry.getValue();

            if (threadName.contains("hikari")) {
                hikariThreads.add(thread);
            }

            boolean waitingForConn = Arrays.stream(stack)
                .anyMatch(s -> s.toString().contains("getConnection") ||
                              s.toString().contains("Hikari"));

            boolean holdingConn = Arrays.stream(stack)
                .anyMatch(s -> s.toString().contains("java.sql") ||
                              s.toString().contains("Connection") ||
                              s.toString().contains("PreparedStatement"));

            if (waitingForConn && thread.getState() == Thread.State.WAITING) {
                waitingThreads.add(thread);
            }

            if (holdingConn) {
                holdingThreads.add(thread);
            }
        }

        // 输出分析结果
        analysis.append("1. HikariCP Pool Threads:\n");
        analysis.append("   Count: ").append(hikariThreads.size()).append("\n");
        for (Thread t : hikariThreads) {
            analysis.append("   - ").append(t.getName())
                .append(" (state=").append(t.getState()).append(")\n");
        }

        analysis.append("\n2. Threads WAITING for connection:\n");
        analysis.append("   Count: ").append(waitingThreads.size()).append("\n");
        for (Thread t : waitingThreads) {
            analysis.append("   - ").append(t.getName())
                .append(" (state=").append(t.getState()).append(")\n");
            for (StackTraceElement s : t.getStackTrace()) {
                analysis.append("       at ").append(s).append("\n");
            }
        }

        analysis.append("\n3. Threads HOLDING database connections:\n");
        analysis.append("   Count: ").append(holdingThreads.size()).append("\n");
        for (Thread t : holdingThreads) {
            analysis.append("   - ").append(t.getName())
                .append(" (state=").append(t.getState()).append(")\n");
            // 只输出前5个堆栈
            StackTraceElement[] stack = t.getStackTrace();
            for (int i = 0; i < Math.min(5, stack.length); i++) {
                analysis.append("       at ").append(stack[i]).append("\n");
            }
        }

        return analysis.toString();
    }

    /**
     * 检测死锁
     */
    public boolean detectDeadlock() {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        long[] deadlockedThreads = threadMXBean.findDeadlockedThreads();
        
        if (deadlockedThreads != null && deadlockedThreads.length > 0) {
            log.error("DEADLOCK DETECTED! {} threads in deadlock", deadlockedThreads.length);
            return true;
        }
        
        return false;
    }
}

3.5 连接池监控指标

@Component
@Slf4j
public class HikariPoolMetrics {

    private final MeterRegistry meterRegistry;
    private final DataSource dataSource;

    public HikariPoolMetrics(MeterRegistry meterRegistry, DataSource dataSource) {
        this.meterRegistry = meterRegistry;
        this.dataSource = dataSource;
        
        registerMetrics();
        scheduleMetricsCollection();
    }

    private void registerMetrics() {
        // 连接池大小指标
        Gauge.builder("hikari.connections.active", this, HikariPoolMetrics::getActiveConnections)
            .register(meterRegistry);

        Gauge.builder("hikari.connections.idle", this, HikariPoolMetrics::getIdleConnections)
            .register(meterRegistry);

        Gauge.builder("hikari.connections.total", this, HikariPoolMetrics::getTotalConnections)
            .register(meterRegistry);

        Gauge.builder("hikari.connections.waiting", this, HikariPoolMetrics::getWaitingThreads)
            .register(meterRegistry);

        Gauge.builder("hikari.connections.max", this, HikariPoolMetrics::getMaxConnections)
            .register(meterRegistry);
    }

    private void scheduleMetricsCollection() {
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            try {
                collectMetrics();
            } catch (Exception e) {
                log.error("Failed to collect pool metrics", e);
            }
        }, 0, 30, TimeUnit.SECONDS);
    }

    private void collectMetrics() {
        if (dataSource instanceof HikariDataSource) {
            HikariDataSource hds = (HikariDataSource) dataSource;
            HikariPoolMXBean poolBean = hds.getHikariPoolMXBean();
            
            if (poolBean != null) {
                log.debug("Pool stats - active: {}, idle: {}, waiting: {}",
                    poolBean.getActiveConnections(),
                    poolBean.getIdleConnections(),
                    poolBean.getThreadsAwaitingConnection());
            }
        }
    }

    private int getActiveConnections() {
        if (dataSource instanceof HikariDataSource) {
            HikariPoolMXBean bean = ((HikariDataSource) dataSource).getHikariPoolMXBean();
            return bean != null ? bean.getActiveConnections() : 0;
        }
        return 0;
    }

    private int getIdleConnections() {
        if (dataSource instanceof HikariDataSource) {
            HikariPoolMXBean bean = ((HikariDataSource) dataSource).getHikariPoolMXBean();
            return bean != null ? bean.getIdleConnections() : 0;
        }
        return 0;
    }

    private int getTotalConnections() {
        if (dataSource instanceof HikariDataSource) {
            HikariPoolMXBean bean = ((HikariDataSource) dataSource).getHikariPoolMXBean();
            return bean != null ? bean.getTotalConnections() : 0;
        }
        return 0;
    }

    private int getWaitingThreads() {
        if (dataSource instanceof HikariDataSource) {
            HikariPoolMXBean bean = ((HikariDataSource) dataSource).getHikariPoolMXBean();
            return bean != null ? bean.getThreadsAwaitingConnection() : 0;
        }
        return 0;
    }

    private int getMaxConnections() {
        if (dataSource instanceof HikariDataSource) {
            return ((HikariDataSource) dataSource).getMaximumPoolSize();
        }
        return 0;
    }
}

3.6 连接泄漏追踪拦截器

@Component
@Aspect
@Slf4j
public class ConnectionLeakTrackingAspect {

    @Autowired
    private DataSource dataSource;

    private final ConcurrentHashMap<String, ConnectionTrace> connectionTraces = new ConcurrentHashMap<>();

    @Around("execution(* javax.sql.DataSource.getConnection(..))")
    public Object trackConnectionAcquisition(ProceedingJoinPoint joinPoint) throws Throwable {
        String traceId = UUID.randomUUID().toString();
        long startTime = System.currentTimeMillis();
        
        // 获取调用堆栈
        StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
        String caller = formatStackTrace(stackTrace);
        
        ConnectionTrace trace = ConnectionTrace.builder()
            .traceId(traceId)
            .acquireTime(startTime)
            .caller(caller)
            .threadName(Thread.currentThread().getName())
            .build();
        
        connectionTraces.put(traceId, trace);
        
        try {
            Connection connection = (Connection) joinPoint.proceed();
            
            // 包装连接以跟踪关闭
            return new TrackedConnection(connection, traceId, () -> {
                connectionTraces.remove(traceId);
            });
            
        } catch (Exception e) {
            connectionTraces.remove(traceId);
            throw e;
        }
    }

    /**
     * 报告未关闭的连接
     */
    @Scheduled(fixedDelay = 60000)
    public void reportUnclosedConnections() {
        long now = System.currentTimeMillis();
        long threshold = 30000; // 30秒未关闭

        List<ConnectionTrace> leaks = connectionTraces.values().stream()
            .filter(t -> (now - t.getAcquireTime()) > threshold)
            .collect(Collectors.toList());

        if (!leaks.isEmpty()) {
            log.error("=== Connection Leak Detected! ===");
            for (ConnectionTrace trace : leaks) {
                log.error("Leaked connection: traceId={}, duration={}ms, caller={}", 
                    trace.getTraceId(),
                    now - trace.getAcquireTime(),
                    trace.getCaller());
            }
        }
    }

    private String formatStackTrace(StackTraceElement[] stackTrace) {
        // 获取调用者的堆栈信息
        for (int i = 2; i < Math.min(stackTrace.length, 10); i++) {
            StackTraceElement element = stackTrace[i];
            if (!element.getClassName().contains("Hikari")) {
                return element.getClassName() + "." + element.getMethodName() + 
                       "(" + element.getFileName() + ":" + element.getLineNumber() + ")";
            }
        }
        return "Unknown";
    }

    @Builder
    @Data
    private static class ConnectionTrace {
        private String traceId;
        private long acquireTime;
        private String caller;
        private String threadName;
    }

    /**
     * 包装连接以跟踪关闭
     */
    private static class TrackedConnection implements InvocationHandler {
        private final Connection delegate;
        private final String traceId;
        private final Runnable onClose;

        public TrackedConnection(Connection delegate, String traceId, Runnable onClose) {
            this.delegate = delegate;
            this.traceId = traceId;
            this.onClose = onClose;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (method.getName().equals("close")) {
                onClose.run();
            }
            return method.invoke(delegate, args);
        }
    }
}

四、配置文件示例

server:
  port: 8080

spring:
  application:
    name: hikari-connection-pool-demo
  datasource:
    url: jdbc:mysql://localhost:3306/example_db?useSSL=false&serverTimezone=UTC
    username: root
    password: password
    driver-class-name: com.mysql.cj.jdbc.Driver
    type: com.zaxxer.hikari.HikariDataSource
    hikari:
      pool-name: OrderHikariPool
      maximum-pool-size: 20
      minimum-idle: 5
      connection-timeout: 30000
      idle-timeout: 600000
      max-lifetime: 1800000
      connection-test-query: SELECT 1
      # 泄漏检测(生产环境建议开启)
      leak-detection-threshold: 60000

management:
  endpoints:
    web:
      exposure:
        include: health, info, prometheus, metrics
  metrics:
    tags:
      application: ${spring.application.name}

logging:
  level:
    com.zaxxer.hikari: DEBUG
    com.zaxxer.hikari.pool: TRACE

五、线程 Dump 分析实战

5.1 常见连接池假死场景

场景一:连接未关闭导致泄漏

"HikariPool-1 housekeeper" #50 daemon prio=5 os_prio=31 tid=0x00007f8a9c01a800
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()

"http-nio-8080-exec-1" #30 daemon prio=5 os_prio=31 tid=0x00007f8a9c03c800
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait()
    at java.lang.Object.wait(Object.java:502)
    at com.zaxxer.hikari.pool.PoolBase.tryEndTask()

"application-thread" #25 prio=5 os_prio=31 tid=0x00007f8a9b80f800
   java.lang.Thread.State: RUNNABLE
    at com.mysql.cj.jdbc.ConnectionImpl.prepareStatement()
    at sun.reflect.NativeMethodAccessorImpl.invoke0()
    
分析:应用线程持有连接但未释放

场景二:长事务占用连接

"batch-processing-thread" #42 prio=5 os_prio=31 tid=0x00007f8a9c04a800
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.zaxxer.hikari.pool.HikariPool.createTimeoutException()
    
    java.sql.SQLException: Connection is not available, request timed out after 30000ms
    
分析:批量处理线程占用连接时间过长

5.2 快速诊断命令

# 生成线程 Dump
jstack -l <pid> > thread_dump.txt

# 查看 HikariCP 线程
grep -A 20 "HikariPool" thread_dump.txt

# 统计线程状态
grep "java.lang.Thread.State" thread_dump.txt | sort | uniq -c

# 查看 WAITING 线程的完整堆栈
grep -B 5 "WAITING" thread_dump.txt | grep -A 10 "HTTP"

# 检测死锁
jstack -ld <pid> | grep -A 10 "deadlock"

六、监控告警配置

6.1 Prometheus 告警规则

groups:
- name: hikari_connection_pool_alerts
  rules:
  - alert: HikariPoolConnectionTimeout
    expr: hikari_connections_waiting > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "HikariCP 连接获取超时"
      description: "有 {{ $value }} 个线程在等待获取数据库连接"
  
  - alert: HikariPoolExhaustion
    expr: hikari_connections_active / hikari_connections_max > 0.9
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "HikariCP 连接池耗尽"
      description: "连接池使用率超过 90%,活跃: {{ $value }}"
  
  - alert: HikariPoolLeakDetected
    expr: increase(hikari_connections_leak[5m]) > 0
    for: 1m
    labels:
      severity: error
    annotations:
      summary: "检测到数据库连接泄漏"
      description: "HikariCP 检测到连接泄漏"

七、最佳实践建议

7.1 连接使用规范

规范说明
使用 try-with-resources自动关闭连接,推荐方式
及时关闭业务完成后立即关闭连接
避免长事务尽量缩短事务持有时间
异常处理确保在 finally 中关闭连接
连接复用不要在循环中反复获取连接

7.2 连接池配置建议

配置项推荐值说明
maximumPoolSizeCPU核心数×2根据数据库并发能力调整
minimumIdlemaximumPoolSize的50%保持适量预热连接
connectionTimeout30秒足够等待时间
idleTimeout10分钟非高峰时段回收
maxLifetime30分钟防止连接老化
leakDetectionThreshold60秒生产环境建议开启

7.3 监控检查清单

  1. 定期检查连接池状态
  2. 配置连接泄漏检测
  3. 设置超时告警
  4. 定期生成线程 Dump 分析
  5. 监控慢查询
  6. 检查数据库连接数限制

互动话题

您在排查数据库连接池问题时有哪些经验?是否遇到过连接泄漏导致的故障?欢迎在评论区分享!


标题:SpringBoot + 数据库连接池假死排查:HikariCP 获取连接超时?连接泄漏检测 + 线程 Dump 分析
作者:jiangyi
地址:http://jiangyi.space/articles/2026/07/02/1782639122068.html
公众号:服务端技术精选

服务端开发博客:后端架构、高并发、性能优化与微服务实战教程

取消