SpringBoot + 位图(Bitmap)+ HyperLogLog:亿级用户签到、UV 统计极致优化

引言

最近在做用户签到系统重构时遇到了一个棘手问题:每天要处理上亿用户的签到数据,传统的MySQL方案已经扛不住了。后来发现Redis的位图和HyperLogLog简直是为这种场景量身定制的神器。

很多同学可能还不知道,Redis除了基本的String、Hash这些数据结构,还有专门针对大数据统计的高级数据结构。今天就来聊聊如何用位图和HyperLogLog,让亿级用户的统计变得轻而易举。

为什么传统方案会崩?

MySQL的局限性

面对亿级用户签到,传统MySQL方案的问题:

存储空间爆炸

  • 每个用户每天一条记录,1亿用户就是1亿条数据
  • 按照每条记录1KB计算,一天就要100GB
  • 一个月下来就是3TB,成本惊人

查询性能瓶颈

  • SELECT COUNT(DISTINCT user_id) WHERE date = '2024-01-01' 这种查询会全表扫描
  • 即使加索引,大数据量下性能依然很差
  • 多维度统计(月活、年活)更加困难

并发写入压力

  • 高峰期大量用户同时签到
  • 数据库连接池被打满
  • 事务锁竞争严重

Redis高级数据结构登场

位图(Bitmap)的核心优势

位图是用二进制位来存储状态的数据结构:

用户ID:  1   2   3   4   5   6   7   8
签到状态: 1   0   1   1   0   1   0   1
         ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
内存存储: 1   0   1   1   0   1   0   1  (每个用户只占1位)

惊人的压缩效果

  • 1亿用户只需要12.5MB内存 (100,000,000 ÷ 8 ÷ 1024 ÷ 1024)
  • 相比MySQL节省了上千倍的存储空间
  • 查询速度从秒级降到毫秒级

HyperLogLog的统计魔法

HyperLogLog专门解决基数统计问题:

传统Set存储: {"user1", "user2", "user3", ...}  → 存储所有元素
HyperLogLog: 只存储哈希值的统计特征 → 固定12KB

核心原理

  • 通过哈希函数将用户ID映射到大空间
  • 统计哈希值前导零的数量来估算基数
  • 用概率算法实现超低内存占用

整体架构设计

我们的亿级用户统计架构:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   用户签到请求   │───▶│   Redis位图存储   │───▶│   实时统计查询   │
│                 │    │   (Bitmap)       │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│  日活统计    │      │  月活统计    │      │  留存分析    │
│ (HyperLogLog)│      │ (位图运算)   │      │ (位图交集)   │
└─────────────┘      └─────────────┘      └─────────────┘

核心设计要点

1. 位图存储策略

// 用户签到位图设计
public class SignInBitmapManager {
    
    // 按日期分key,避免单key过大
    private static final String DAILY_SIGNIN_KEY = "signin:daily:{date}";
    
    // 按用户ID作为位偏移量
    public void userSignIn(String date, Long userId) {
        String key = DAILY_SIGNIN_KEY.replace("{date}", date);
        // SETBIT key userId 1
        redisTemplate.opsForValue().setBit(key, userId, true);
    }
    
    // 统计当日签到人数
    public Long getDailySignInCount(String date) {
        String key = DAILY_SIGNIN_KEY.replace("{date}", date);
        // BITCOUNT key
        return redisTemplate.execute((RedisCallback<Long>) connection -> 
            connection.stringCommands().bitCount(key.getBytes()));
    }
}

2. HyperLogLog统计方案

// UV统计管理器
public class UvStatisticsManager {
    
    // 按天统计UV
    private static final String DAILY_UV_KEY = "uv:daily:{date}";
    
    // 按月统计UV
    private static final String MONTHLY_UV_KEY = "uv:monthly:{year}:{month}";
    
    public void recordUserVisit(String date, String userId) {
        String dailyKey = DAILY_UV_KEY.replace("{date}", date);
        // PFADD dailyKey userId
        redisTemplate.opsForHyperLogLog().add(dailyKey, userId);
    }
    
    public Long getDailyUv(String date) {
        String key = DAILY_UV_KEY.replace("{date}", date);
        // PFCOUNT key
        return redisTemplate.opsForHyperLogLog().size(key);
    }
    
    // 合并多天UV统计
    public Long getMultiDayUv(List<String> dates) {
        List<String> keys = dates.stream()
            .map(date -> DAILY_UV_KEY.replace("{date}", date))
            .collect(Collectors.toList());
        
        String tempKey = "uv:temp:" + System.currentTimeMillis();
        try {
            // PFMERGE tempKey key1 key2 key3...
            redisTemplate.opsForHyperLogLog().union(tempKey, keys.toArray(new String[0]));
            return redisTemplate.opsForHyperLogLog().size(tempKey);
        } finally {
            redisTemplate.delete(tempKey);
        }
    }
}

3. 多维度统计设计

// 综合统计服务
@Service
public class UserStatisticsService {
    
    // 日活统计 (DAU)
    public Long getDailyActiveUsers(String date) {
        return signInBitmapManager.getDailySignInCount(date);
    }
    
    // 月活统计 (MAU) - 使用位图OR运算
    public Long getMonthlyActiveUsers(String year, String month) {
        List<String> dateKeys = generateMonthDates(year, month);
        return bitmapOperations.unionCount(dateKeys);
    }
    
    // 留存率分析 - 使用位图AND运算
    public Double getRetentionRate(String baseDate, String targetDate) {
        Long baseCount = getDailyActiveUsers(baseDate);
        if (baseCount == 0) return 0.0;
        
        Long retainedCount = bitmapOperations.intersectionCount(
            Arrays.asList(baseDate, targetDate));
            
        return (double) retainedCount / baseCount * 100;
    }
    
    // 新用户统计 - 使用HyperLogLog差集估算
    public Long getNewUsers(String date) {
        Long todayUv = uvStatisticsManager.getDailyUv(date);
        Long yesterdayCumulative = getCumulativeUv(getPreviousDate(date));
        return Math.max(0, todayUv - yesterdayCumulative);
    }
}

关键实现细节

1. 位图操作工具类

@Component
public class BitmapOperations {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    // 位图交集统计
    public Long intersectionCount(List<String> keys) {
        if (keys.isEmpty()) return 0L;
        if (keys.size() == 1) {
            return redisTemplate.execute((RedisCallback<Long>) connection -> 
                connection.stringCommands().bitCount(keys.get(0).getBytes()));
        }
        
        String tempKey = "bitmap:temp:" + System.currentTimeMillis();
        try {
            // BITOP AND tempKey key1 key2 key3...
            byte[][] keyBytes = keys.stream()
                .map(String::getBytes)
                .toArray(byte[][]::new);
                
            redisTemplate.execute((RedisCallback<Long>) connection -> {
                connection.stringCommands().bitOp(BitOperation.AND, 
                    tempKey.getBytes(), keyBytes);
                return null;
            });
            
            return redisTemplate.execute((RedisCallback<Long>) connection -> 
                connection.stringCommands().bitCount(tempKey.getBytes()));
        } finally {
            redisTemplate.delete(tempKey);
        }
    }
    
    // 位图并集统计
    public Long unionCount(List<String> keys) {
        if (keys.isEmpty()) return 0L;
        if (keys.size() == 1) {
            return redisTemplate.execute((RedisCallback<Long>) connection -> 
                connection.stringCommands().bitCount(keys.get(0).getBytes()));
        }
        
        String tempKey = "bitmap:temp:" + System.currentTimeMillis();
        try {
            byte[][] keyBytes = keys.stream()
                .map(String::getBytes)
                .toArray(byte[][]::new);
                
            redisTemplate.execute((RedisCallback<Long>) connection -> {
                connection.stringCommands().bitOp(BitOperation.OR, 
                    tempKey.getBytes(), keyBytes);
                return null;
            });
            
            return redisTemplate.execute((RedisCallback<Long>) connection -> 
                connection.stringCommands().bitCount(tempKey.getBytes()));
        } finally {
            redisTemplate.delete(tempKey);
        }
    }
}

2. 签到服务实现

@Service
@Transactional
public class SignInService {
    
    @Autowired
    private SignInBitmapManager signInBitmapManager;
    
    @Autowired
    private UvStatisticsManager uvStatisticsManager;
    
    public SignInResult userSignIn(Long userId, String date) {
        try {
            // 1. 记录位图签到
            signInBitmapManager.userSignIn(date, userId);
            
            // 2. 记录UV统计
            uvStatisticsManager.recordUserVisit(date, String.valueOf(userId));
            
            // 3. 更新用户连续签到天数
            Integer continuousDays = calculateContinuousSignInDays(userId, date);
            
            // 4. 发放签到奖励(异步处理)
            rewardService.grantSignInReward(userId, continuousDays);
            
            return SignInResult.success(continuousDays);
            
        } catch (Exception e) {
            log.error("用户签到失败: userId={}, date={}", userId, date, e);
            throw new ServiceException("签到失败", e);
        }
    }
    
    private Integer calculateContinuousSignInDays(Long userId, String currentDate) {
        // 从位图中查询连续签到天数
        int continuousDays = 0;
        String current = currentDate;
        
        while (true) {
            if (signInBitmapManager.isUserSignedIn(current, userId)) {
                continuousDays++;
                current = DateUtils.getPreviousDay(current);
            } else {
                break;
            }
        }
        
        return continuousDays;
    }
}

3. 统计查询接口

@RestController
@RequestMapping("/api/statistics")
public class StatisticsController {
    
    @Autowired
    private UserStatisticsService userStatisticsService;
    
    @GetMapping("/dau")
    public ResponseEntity<Long> getDailyActiveUsers(
            @RequestParam String date) {
        Long dau = userStatisticsService.getDailyActiveUsers(date);
        return ResponseEntity.ok(dau);
    }
    
    @GetMapping("/mau")
    public ResponseEntity<Long> getMonthlyActiveUsers(
            @RequestParam String year,
            @RequestParam String month) {
        Long mau = userStatisticsService.getMonthlyActiveUsers(year, month);
        return ResponseEntity.ok(mau);
    }
    
    @GetMapping("/retention")
    public ResponseEntity<RetentionStats> getRetentionRate(
            @RequestParam String baseDate,
            @RequestParam String targetDate) {
        Double rate = userStatisticsService.getRetentionRate(baseDate, targetDate);
        RetentionStats stats = new RetentionStats(baseDate, targetDate, rate);
        return ResponseEntity.ok(stats);
    }
    
    @GetMapping("/new-users")
    public ResponseEntity<Long> getNewUsers(@RequestParam String date) {
        Long newUsers = userStatisticsService.getNewUsers(date);
        return ResponseEntity.ok(newUsers);
    }
}

业务场景应用

1. 用户签到系统

// 签到奖励规则
@Data
public class SignInRewardRule {
    private Integer minContinuousDays;  // 最低连续天数
    private Integer maxContinuousDays;  // 最高连续天数
    private List<RewardItem> rewards;   // 奖励物品
}

// 连续签到奖励
public class ContinuousSignInReward {
    private static final List<SignInRewardRule> REWARD_RULES = Arrays.asList(
        new SignInRewardRule(1, 6, Arrays.asList(new RewardItem("积分", 10))),
        new SignInRewardRule(7, 14, Arrays.asList(new RewardItem("积分", 20), new RewardItem("优惠券", 1))),
        new SignInRewardRule(15, 30, Arrays.asList(new RewardItem("积分", 50), new RewardItem("VIP体验卡", 1)))
    );
    
    public List<RewardItem> calculateRewards(Integer continuousDays) {
        return REWARD_RULES.stream()
            .filter(rule -> continuousDays >= rule.getMinContinuousDays() 
                         && continuousDays <= rule.getMaxContinuousDays())
            .findFirst()
            .map(SignInRewardRule::getRewards)
            .orElse(Collections.emptyList());
    }
}

2. 活跃度分析

// 用户活跃度等级
public enum UserActivityLevel {
    INACTIVE(0, 0),      // 不活跃 (0天)
    LOW(1, 6),          // 低活跃 (1-6天)
    MEDIUM(7, 14),      // 中活跃 (7-14天)
    HIGH(15, 29),       // 高活跃 (15-29天)
    VERY_HIGH(30, 365); // 超高活跃 (30天以上)
    
    private final int minDays;
    private final int maxDays;
    
    public static UserActivityLevel fromSignInDays(int days) {
        return Arrays.stream(values())
            .filter(level -> days >= level.minDays && days <= level.maxDays)
            .findFirst()
            .orElse(INACTIVE);
    }
}

3. 数据报表生成

@Service
public class ReportGenerationService {
    
    public UserActivityReport generateMonthlyReport(String year, String month) {
        UserActivityReport report = new UserActivityReport();
        
        // 基础数据
        report.setYear(year);
        report.setMonth(month);
        report.setTotalUsers(userService.getTotalUserCount());
        
        // 活跃数据
        report.setMonthlyActiveUsers(userStatisticsService.getMonthlyActiveUsers(year, month));
        report.setDailyAverages(calculateDailyAverages(year, month));
        
        // 留存数据
        report.setRetentionRates(calculateRetentionRates(year, month));
        
        // 趋势分析
        report.setTrendAnalysis(performTrendAnalysis(year, month));
        
        return report;
    }
    
    private Map<String, Double> calculateRetentionRates(String year, String month) {
        Map<String, Double> retentionRates = new HashMap<>();
        String baseDate = year + "-" + month + "-01";
        
        // 计算次日、7日、30日留存率
        retentionRates.put("day1", userStatisticsService.getRetentionRate(baseDate, 
            DateUtils.addDays(baseDate, 1)));
        retentionRates.put("day7", userStatisticsService.getRetentionRate(baseDate, 
            DateUtils.addDays(baseDate, 7)));
        retentionRates.put("day30", userStatisticsService.getRetentionRate(baseDate, 
            DateUtils.addDays(baseDate, 30)));
            
        return retentionRates;
    }
}

最佳实践建议

1. 内存优化策略

@Configuration
public class RedisBitmapConfig {
    
    // 位图数据过期策略
    @PostConstruct
    public void setupBitmapExpiration() {
        // 日统计数据保留90天
        scheduleExpirationTask("signin:daily:*", Duration.ofDays(90));
        
        // 月统计数据保留2年
        scheduleExpirationTask("signin:monthly:*", Duration.ofDays(730));
        
        // UV统计数据保留1年
        scheduleExpirationTask("uv:daily:*", Duration.ofDays(365));
    }
    
    // 冷数据归档到数据库
    @Scheduled(cron = "0 0 2 * * ?")  // 每天凌晨2点执行
    public void archiveColdData() {
        // 将30天前的统计数据归档到MySQL
        archiveService.archiveHistoricalStatistics(30);
    }
}

2. 性能监控

@Component
@Slf4j
public class BitmapMonitor {
    
    @EventListener
    public void handleBitmapOperation(BitmapOperationEvent event) {
        // 记录操作耗时
        log.info("位图操作: key={}, operation={}, duration={}ms, result={}", 
            event.getKey(), event.getOperation(), event.getDuration(), event.getResult());
        
        // 监控内存使用
        if (event.getKey().startsWith("signin:daily:")) {
            monitorDailyBitmapSize(event.getKey());
        }
        
        // 上报监控指标
        MeterRegistry registry = Metrics.globalRegistry;
        registry.timer("bitmap.operation", 
            "operation", event.getOperation(),
            "key_type", getKeyType(event.getKey()))
            .record(event.getDuration(), TimeUnit.MILLISECONDS);
    }
    
    private void monitorDailyBitmapSize(String key) {
        Long size = redisTemplate.execute((RedisCallback<Long>) connection -> 
            connection.stringCommands().bitCount(key.getBytes()));
        log.info("每日签到位图大小: key={}, users={}", key, size);
    }
}

3. 容错处理

@Service
public class BitmapFallbackService {
    
    // Redis故障时的降级方案
    public Long getDailyActiveUsersFallback(String date) {
        try {
            // 从数据库查询历史统计数据
            return statisticsRepository.getHistoricalDau(date);
        } catch (Exception e) {
            log.error("降级查询失败: date={}", date, e);
            return 0L;  // 返回默认值
        }
    }
    
    // 数据一致性检查
    @Scheduled(fixedRate = 3600000)  // 每小时检查一次
    public void checkDataConsistency() {
        String today = DateUtils.getCurrentDate();
        Long bitmapCount = userStatisticsService.getDailyActiveUsers(today);
        Long dbCount = statisticsRepository.getActualSignInCount(today);
        
        if (Math.abs(bitmapCount - dbCount) > bitmapCount * 0.01) {  // 1%误差阈值
            log.warn("数据不一致: bitmap={}, db={}, date={}", bitmapCount, dbCount, today);
            // 触发数据修复流程
            triggerDataRepair(today);
        }
    }
}

预期效果

通过位图和HyperLogLog方案,我们可以实现:

  • 存储优化:相比MySQL节省99%以上的存储空间
  • 性能提升:查询响应时间从秒级降到毫秒级
  • 成本降低:大幅减少服务器和数据库资源消耗
  • 扩展性强:轻松应对用户量增长,支持亿级用户规模
  • 功能丰富:支持复杂的多维度统计分析

这套方案不仅解决了当前的性能瓶颈,还为未来的业务扩展打下了坚实基础,是处理大数据统计场景的最优选择。


欢迎关注公众号"服务端技术精选",获取更多技术干货!


标题:SpringBoot + 位图(Bitmap)+ HyperLogLog:亿级用户签到、UV 统计极致优化
作者:jiangyi
地址:http://jiangyi.space/articles/2026/02/11/1770615453702.html
公众号:服务端技术精选
    评论
    0 评论
avatar

取消