Are your traffic statistics always off? These 6 Redis approaches help keep your counts accurate down to the single digit!

2025-12-21 / 2025-12-21 / Note

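As a backend developer, I have lived through too many traffic-statistics "disasters":

During a major e-commerce promotion, the Redis counter buckled under concurrency: there were 100,000 actual orders, but the statistics showed only 70,000, and my boss nearly had my head
A content platform computed UV in MySQL; 10,000 PV per second took the database down, and users complained that pages would not open
A social app counted DAU with a Redis increment, so every page refresh added 1; DAU was inflated by 300% and investors questioned whether the data had been faked

Traffic statistics look simple but are full of hidden traps. Drawing on the pitfalls I hit while growing from 10,000 to 5 million daily active users, this post walks through how to do statistics with Redis so that your data is accurate down to the single digit!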

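I. What exactly are traffic statistics, and why are they so hard?

The core of traffic statistics is this: record every visit from every user accurately, no more and no less.

Why is it so hard?

High concurrency: tens of thousands of visits per second during a promotion will take down a traditional database
High accuracy requirements: miss a single user and the boss wants a word with you
Strong real-time requirements: live data has to be visible the moment a campaign starts
Many dimensions: PV, UV, IP, DAU, MAU and retention all need to be tracked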

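II. 6 Redis statistics approaches: one of them will fit your case!

Just as a game has different heroes, Redis statistics has different approaches, and picking the right hero is how you carry the match.

Approach 1: String increment counter - the simplest PV statistics

Best for: page views and API call counts

Core idea: rely on Redis's atomic INCR operation to stay safe under concurrency

Code: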

@Service
public class PVCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Increment and return today's PV for a page
     */
    public long incrementPageView(String pageId) {
        String key = "pv:page:" + LocalDate.now() + ":" + pageId;
        return redisTemplate.opsForValue().increment(key);
    }
    
    /**
     * Get today's PV for a page
     */
    public long getPageView(String pageId) {
        String key = "pv:page:" + LocalDate.now() + ":" + pageId;
        String value = redisTemplate.opsForValue().get(key);
        return value == null ? 0 : Long.parseLong(value);
    }
    
    /**
     * Set a TTL so that historical data is cleaned up automatically
     */
    public void setExpire(String pageId, long seconds) {
        String key = "pv:page:" + LocalDate.now() + ":" + pageId;
        redisTemplate.expire(key, Duration.ofSeconds(seconds));
    }
}

// Usage example
@RestController
public class PageViewController {
    
    @Autowired
    private PVCounterService pvCounterService;
    
    @GetMapping("/api/page/{pageId}/view")
    public Map<String, Object> viewPage(@PathVariable String pageId) {
        // Increment the page view count
        long currentPV = pvCounterService.incrementPageView(pageId);
        
        return Map.of(
            "pageId", pageId,
            "pv", currentPV,
            "message", "Page view recorded"
        );
    }
}

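Approach 2: HyperLogLog - the go-to tool for UV statistics

Best for: UV and distinct-IP counts where a tiny memory footprint matters and a standard error of about 0.81% is acceptable

Core idea: a probabilistic algorithm that estimates the cardinality of up to 2^64 elements using at most 12 KB per key

Code: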

@Service
public class UVCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Record a user ID in the day's UV HyperLogLog
     */
    public void addUV(String date, String userId) {
        String key = "uv:daily:" + date;
        redisTemplate.opsForHyperLogLog().add(key, userId);
    }
    
    /**
     * Get the estimated UV count
     */
    public long getUV(String date) {
        String key = "uv:daily:" + date;
        return redisTemplate.opsForHyperLogLog().size(key);
    }
    
    /**
     * Merge several days of UV (for weekly or monthly UV)
     */
    public long mergeUV(List<String> dates, String targetKey) {
        String[] keys = dates.stream()
            .map(date -> "uv:daily:" + date)
            .toArray(String[]::new);
        
        redisTemplate.opsForHyperLogLog().union(targetKey, keys);
        return redisTemplate.opsForHyperLogLog().size(targetKey);
    }
}

// Usage example
@RestController
public class UVController {
    
    @Autowired
    private UVCounterService uvCounterService;
    
    @GetMapping("/api/uv/daily/{date}")
    public Map<String, Object> getDailyUV(@PathVariable String date) {
        long uv = uvCounterService.getUV(date);
        return Map.of(
            "date", date,
            "uv", uv,
            "accuracy", "standard error about 0.81%"
        );
    }
    
    @PostMapping("/api/uv/add")
    public Map<String, Object> addUV(@RequestParam String userId) {
        String today = LocalDate.now().toString();
        uvCounterService.addUV(today, userId);
        return Map.of("message", "UV recorded");
    }
}
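
Where do the 12 KB and the sub-1% error come from? Here is a small arithmetic sketch, assuming the layout Redis documents for its dense HyperLogLog encoding (2^14 registers of 6 bits each). The class and method names are made up for illustration and do not touch Redis at all.

```java
/** Back-of-the-envelope numbers for a dense HyperLogLog (illustrative names only). */
final class HllMath {
    static final int REGISTERS = 1 << 14;      // m = 16384 registers
    static final int BITS_PER_REGISTER = 6;    // each register stores a small counter

    /** Size of the register array in bytes: m * 6 bits / 8. */
    static int denseBytes() {
        return REGISTERS * BITS_PER_REGISTER / 8;
    }

    /** Standard error of the estimate: 1.04 / sqrt(m). */
    static double standardError() {
        return 1.04 / Math.sqrt(REGISTERS);
    }

    public static void main(String[] args) {
        System.out.println(denseBytes() + " bytes per key");
        System.out.printf("standard error = %.4f%%%n", standardError() * 100);
    }
}
```

The register array works out to 12288 bytes, which is the "12 KB" figure, and 1.04/128 is roughly 0.81%. Since this is a standard error, an individual estimate can occasionally be off by more than 1%.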

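Approach 3: Bitmap - one bit per user for visit state

Best for: exact UV where you also need to know which users visited

Core idea: each bit records whether one user has visited; this saves a great deal of memory when user IDs are dense integers, because the bitmap's size is determined by the largest ID rather than by the number of visitors

Code: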

@Service
public class BitmapCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Mark a user as visited (the user ID is the bit offset); returns the bit's previous value
     */
    public boolean markUserVisited(String date, long userId) {
        String key = "bitmap:uv:" + date;
        return redisTemplate.opsForValue().setBit(key, userId, true);
    }
    
    /**
     * Check whether a user has visited
     */
    public boolean isUserVisited(String date, long userId) {
        String key = "bitmap:uv:" + date;
        return redisTemplate.opsForValue().getBit(key, userId);
    }
    
    /**
     * Get the UV count (the number of 1 bits in the bitmap)
     */
    public long getUV(String date) {
        String key = "bitmap:uv:" + date;
        return redisTemplate.execute((RedisCallback<Long>) connection -> 
            connection.bitCount(key.getBytes()));
    }
    
    /**
     * Count users who visited on both days (intersection).
     * Note: in Redis Cluster, BITOP requires all keys to be in the same hash slot.
     */
    public long getCommonUV(String date1, String date2) {
        String key1 = "bitmap:uv:" + date1;
        String key2 = "bitmap:uv:" + date2;
        String destKey = "bitmap:intersect:" + date1 + "_" + date2;
        
        // Run BITOP AND and BITCOUNT in one callback and return that count directly
        // (the original ran BITCOUNT a second time in a separate callback)
        Long count = redisTemplate.execute((RedisCallback<Long>) connection -> {
            connection.bitOp(RedisStringCommands.BitOperation.AND, 
                destKey.getBytes(), key1.getBytes(), key2.getBytes());
            return connection.bitCount(destKey.getBytes());
        });
        
        return count == null ? 0 : count;
    }
}
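
To get a feel for the bit-level semantics without a Redis server, here is a minimal local sketch that uses java.util.BitSet as a stand-in for SETBIT, BITOP AND and BITCOUNT. The class name and the sizing helper are illustrative assumptions; the sizing formula reflects that a Redis bitmap grows to cover its highest offset.

```java
import java.util.BitSet;

/** Local stand-in for the Redis bitmap operations used above (illustrative only). */
final class BitmapUvDemo {

    /** Users present on both days: the BITOP AND + BITCOUNT idea. */
    static int commonVisitors(BitSet day1, BitSet day2) {
        BitSet both = (BitSet) day1.clone();   // copy so the inputs stay untouched
        both.and(day2);
        return both.cardinality();
    }

    /** Approximate bytes a bitmap needs when the highest offset is maxUserId (one bit per ID). */
    static long bytesFor(long maxUserId) {
        return maxUserId / 8 + 1;
    }

    public static void main(String[] args) {
        BitSet monday = new BitSet();
        monday.set(1); monday.set(5); monday.set(9);      // like SETBIT key <userId> 1
        BitSet tuesday = new BitSet();
        tuesday.set(5); tuesday.set(9); tuesday.set(12);

        System.out.println("Monday UV: " + monday.cardinality());
        System.out.println("Visited both days: " + commonVisitors(monday, tuesday));
        System.out.println("Bytes for IDs up to 100,000,000: " + bytesFor(100_000_000L));
    }
}
```

The sizing helper is the part worth remembering: a single visitor with ID 100,000,000 already costs about 12.5 MB for that day's key, which is why sparse or very large IDs are a poor fit for this approach.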

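Approach 4: Sorted Set - visit statistics with timestamps

Best for: visit statistics that need time ordering, such as recently active users

Core idea: store the timestamp as the score, so members are naturally ordered

Code: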

@Service
public class SortedSetCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Record the time of a user's visit
     */
    public boolean recordUserVisit(String userId) {
        String key = "zset:active_users:" + LocalDate.now();
        double score = System.currentTimeMillis();
        return redisTemplate.opsForZSet().add(key, userId, score);
    }
    
    /**
     * Get the users active within the last N minutes
     */
    public Set<String> getRecentActiveUsers(int minutes) {
        String key = "zset:active_users:" + LocalDate.now();
        long endTime = System.currentTimeMillis();
        long startTime = endTime - (minutes * 60 * 1000L);
        
        return redisTemplate.opsForZSet().rangeByScore(key, startTime, endTime);
    }
    
    /**
     * Get the user activity ranking (by visit count)
     */
    public Set<ZSetOperations.TypedTuple<String>> getTopActiveUsers(int topN) {
        String key = "zset:user_activity:" + LocalDate.now();
        return redisTemplate.opsForZSet().reverseRangeWithScores(key, 0, topN - 1);
    }
    
    /**
     * Increase a user's activity score
     */
    public Double incrementUserActivity(String userId, double score) {
        String key = "zset:user_activity:" + LocalDate.now();
        return redisTemplate.opsForZSet().incrementScore(key, userId, score);
    }
}

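Approach 5: Hash - multi-dimensional statistics

Best for: cases where several dimensions of visit data must be tracked

Core idea: store several statistic fields in one Hash to reduce the number of keys

Code: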

@Service
public class HashCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Record multi-dimensional visit data for a user
     */
    public void recordUserStats(String userId, String pageId) {
        String date = LocalDate.now().toString();
        String key = "stats:user:" + userId + ":" + date;
        
        HashOperations<String, String, String> hashOps = redisTemplate.opsForHash();
        
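        // Increment PV
        hashOps.increment(key, "pv", 1);
        
        // Record the set of pages visited (a Set de-duplicates them)
        String pagesKey = "stats:user_pages:" + userId + ":" + date;
        redisTemplate.opsForSet().add(pagesKey, pageId);
        
        // Store the number of distinct pages this user visited today. The field is called
        // "uv" here, but it is a per-user distinct-page count, not a unique-visitor count.
        long uv = redisTemplate.opsForSet().size(pagesKey);
        hashOps.put(key, "uv", String.valueOf(uv));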
    }
    
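    /**
     * Get a user's statistics for today
     */
    public Map<String, String> getUserStats(String userId) {
        String date = LocalDate.now().toString();
        String key = "stats:user:" + userId + ":" + date;
        
        // Explicit type arguments so that entries() returns Map<String, String>
        return redisTemplate.<String, String>opsForHash().entries(key);
    }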
    
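    /**
     * Fetch statistics for all users and rank them by one metric.
     * Caution: KEYS is O(N) and blocks Redis while it runs; on a large keyspace prefer SCAN,
     * or maintain a Sorted Set for the ranking instead. Requires java.util.HashMap.
     */
    public List<Map<String, Object>> getUserRanking(String metric, int topN) {
        String pattern = "stats:user:*:" + LocalDate.now();
        Set<String> keys = redisTemplate.keys(pattern);
        if (keys == null) {
            return List.of();
        }
        
        return keys.stream()
            .map(key -> {
                Map<String, String> stats = redisTemplate.<String, String>opsForHash().entries(key);
                // Key layout is stats:user:{userId}:{date}, so index 2 is the user ID
                String userId = key.split(":")[2];
                Map<String, Object> row = new HashMap<>();
                row.put("userId", userId);
                row.put(metric, Long.parseLong(stats.getOrDefault(metric, "0")));
                return row;
            })
            .sorted((a, b) -> Long.compare((Long) b.get(metric), (Long) a.get(metric)))
            .limit(topN)
            .collect(Collectors.toList());
    }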
}

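Approach 6: Lua script - atomic operations for data consistency

Best for: complex statistics logic that must run atomically

Core idea: use a Lua script to make several operations atomic

Code: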

@Service
public class LuaCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    /**
     * Atomic operation: record a visit and update several statistics in one script
     */
    public List<Long> recordUserVisitWithLua(String userId, String pageId) {
        String luaScript = """
            -- KEYS: pvKey, uvKey, userKey; ARGV: userId
            -- 1. Increment PV
            local pv = redis.call('INCR', KEYS[1])
            
            -- 2. Record UV (a Set de-duplicates user IDs)
            redis.call('SADD', KEYS[2], ARGV[1])
            local uv = redis.call('SCARD', KEYS[2])
            
            -- 3. Record user activity
            redis.call('ZINCRBY', KEYS[3], 1, ARGV[1])
            
            -- 4. Set a TTL of 7 days
            redis.call('EXPIRE', KEYS[1], 604800)
            redis.call('EXPIRE', KEYS[2], 604800)
            redis.call('EXPIRE', KEYS[3], 604800)
            
            return {pv, uv}
            """;
        
        DefaultRedisScript<List> script = new DefaultRedisScript<>();
        script.setScriptText(luaScript);
        script.setResultType(List.class);
        
        // Key names are passed through KEYS instead of being built inside the script, which is
        // what Redis expects; in Redis Cluster they must also hash to the same slot
        // (for example by sharing a hash tag).
        String date = LocalDate.now().toString();
        List<String> keys = List.of(
            "lua:pv:" + date + ":" + pageId,
            "lua:uv:" + date + ":" + pageId,
            "lua:user_activity:" + date
        );
        return (List<Long>) redisTemplate.execute(script, keys, userId);
    }
    
    /**
     * Get the top-N ranking in a single script call; each entry is a [member, score] pair
     */
    public List<List<String>> getRankingWithLua(String metric, int topN) {
        String luaScript = """
            local topN = tonumber(ARGV[1])
            local results = redis.call('ZREVRANGE', KEYS[1], 0, topN - 1, 'WITHSCORES')
            
            -- Return {member, score} pairs as arrays: a Lua table with string keys
            -- (such as {userId = ...}) comes back as an empty array in the Redis reply
            local ranking = {}
            for i = 1, #results, 2 do
                table.insert(ranking, {results[i], results[i+1]})
            end
            
            return ranking
            """;
        
        DefaultRedisScript<List> script = new DefaultRedisScript<>();
        script.setScriptText(luaScript);
        script.setResultType(List.class);
        
        String key = "lua:" + metric + ":" + LocalDate.now();
        return (List<List<String>>) redisTemplate.execute(
            script, Collections.singletonList(key), String.valueOf(topN)
        );
    }
}

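III. Case study: the painful history of traffic statistics during an e-commerce promotion

Below is a real example from an e-commerce promotion, covering the whole journey from collapsing statistics to a setup that runs smoothly.

Stage 1: a single node gets knocked over (10,000 DAU)

Problem: visits were counted in MySQL; at 1,000 writes per second the database CPU shot up to 100%

Solution:

Switch to a Redis String counter
Set a sensible expiration time
Enlarge the Redis connection pool

Code optimization: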

// Before: write straight to MySQL
@GetMapping("/api/count")
public void count() {
    jdbcTemplate.update("UPDATE page_stats SET pv = pv + 1 WHERE page_id = ?", pageId);
}

// After: Redis + asynchronous persistence
@GetMapping("/api/count")
public Map<String, Object> count() {
    String key = "pv:page:" + LocalDate.now() + ":homepage";
    long currentPV = redisTemplate.opsForValue().increment(key);
    
    // Persist to MySQL asynchronously (one write for every 1000 increments)
    if (currentPV % 1000 == 0) {
        asyncService.persistToDB(key, currentPV);
    }
    
    return Map.of("pv", currentPV);
}

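Stage 2: Redis Cluster removes the single-node bottleneck (100,000 DAU)

Problem: the single Redis node ran out of memory and started to stall at 20,000 QPS

Solution:

Deploy a Redis Cluster (3 masters, 3 replicas)
Let the cluster shard keys across its 16384 hash slots (Redis Cluster uses CRC16-based hash slots rather than consistent hashing)
Add a local cache

Cluster configuration: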

spring:
  redis:
    cluster:
      nodes:
        - redis-node1:6379
        - redis-node2:6379
        - redis-node3:6379
        - redis-node4:6379
        - redis-node5:6379
        - redis-node6:6379
      max-redirects: 3
    lettuce:
      pool:
        max-active: 100
        max-idle: 50
        min-idle: 10
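
For the curious, here is a rough sketch of how a Redis Cluster key ends up on a node: the slot is CRC16 of the key modulo 16384, and a non-empty {hash tag} restricts hashing to the part inside the braces. The class and method names are illustrative; this mirrors the rule described in the Redis Cluster specification and is not code taken from any client library.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative re-implementation of the Redis Cluster key-to-slot rule. */
final class ClusterSlot {

    /** CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0, no bit reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;   // keep the value within 16 bits
            }
        }
        return crc;
    }

    /** Slot for a key; if the key contains a non-empty {tag}, only the tag is hashed. */
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slot("pv:page:2025-12-21:homepage"));
        // Keys sharing a hash tag land in the same slot, so multi-key commands
        // and Lua scripts can operate on them together.
        System.out.println(slot("{page42}:pv") == slot("{page42}:uv"));
    }
}
```

This is also why a single hot key cannot be spread out by adding nodes: one key always maps to exactly one slot, and a slot lives on one master.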

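Stage 3: multi-level cache architecture (1 million DAU)

Problem: network latency to the Redis Cluster pushed the statistics API response time above 500 ms

Solution:

Local Caffeine cache + Redis Cluster
Use Lua scripts to cut network round trips
Make statistics recording asynchronous

Multi-level cache implementation: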

@Service
public class MultiLevelCounterService {
    
    private final Cache<String, Long> localCache = Caffeine.newBuilder()
        .maximumSize(10000)
        .expireAfterWrite(1, TimeUnit.MINUTES)
        .build();
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    public long getPageView(String pageId) {
        String key = "pv:page:" + LocalDate.now() + ":" + pageId;
        
        // 1. Check the local cache first
        Long localPV = localCache.getIfPresent(key);
        if (localPV != null) {
            return localPV;
        }
        
        // 2. Then check Redis
        String redisPV = redisTemplate.opsForValue().get(key);
        if (redisPV != null) {
            long pv = Long.parseLong(redisPV);
            localCache.put(key, pv);
            return pv;
        }
        
        return 0;
    }
}
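
The class above only covers the read path. For the write path, one possible shape of "asynchronous statistics" is to buffer increments in process memory and flush the accumulated deltas to Redis in batches. The sketch below is just an assumption about how that could look; LocalPvBuffer and its methods are hypothetical names, and the actual flush (for example one INCRBY per key on a timer) is left to the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** In-process buffer that batches PV increments before they are sent to Redis (hypothetical helper). */
final class LocalPvBuffer {

    // One counter per key; entries are kept after a drain, so the key set should be bounded.
    private final ConcurrentHashMap<String, AtomicLong> pending = new ConcurrentHashMap<>();

    /** Record one visit locally; no network call is made here. */
    void record(String key) {
        pending.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /**
     * Take the accumulated deltas and reset them to zero.
     * getAndSet(0) is atomic per key, so increments racing with a drain
     * are carried over to the next drain instead of being lost.
     */
    Map<String, Long> drain() {
        Map<String, Long> deltas = new HashMap<>();
        for (Map.Entry<String, AtomicLong> entry : pending.entrySet()) {
            long delta = entry.getValue().getAndSet(0);
            if (delta > 0) {
                deltas.put(entry.getKey(), delta);
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        LocalPvBuffer buffer = new LocalPvBuffer();
        buffer.record("pv:page:2025-12-21:homepage");
        buffer.record("pv:page:2025-12-21:homepage");
        buffer.record("pv:page:2025-12-21:detail");
        // A scheduled task would send each delta to Redis with INCRBY.
        System.out.println(buffer.drain());
    }
}
```

The trade-off is durability: whatever sits in the buffer is lost if the process dies before the next flush, so the flush interval bounds how much can go missing.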

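Stage 4: hundred-million-scale architecture (5 million DAU)

Problem: Redis memory exploded as the number of statistics dimensions kept growing

The ultimate solution:

Separate hot and cold data (the last 7 days in Redis, historical data in HBase)
Use HyperLogLog to optimize UV statistics
Combine real-time and offline statistics

Hot/cold separation architecture: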

@Service
public class UltimateCounterService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    @Autowired
    private HBaseService hbaseService;
    
    /**
     * Real-time statistics (Redis)
     */
    public Map<String, Object> getRealTimeStats(String pageId) {
        String date = LocalDate.now().toString();
        
        // Real-time PV
        String pvKey = "realtime:pv:" + date + ":" + pageId;
        long pv = redisTemplate.opsForValue().increment(pvKey);
        
        // Real-time UV (HyperLogLog)
        String uvKey = "realtime:uv:" + date + ":" + pageId;
        redisTemplate.opsForHyperLogLog().add(uvKey, getCurrentUserId());
        long uv = redisTemplate.opsForHyperLogLog().size(uvKey);
        
        return Map.of(
            "pageId", pageId,
            "pv", pv,
            "uv", uv,
            "timestamp", System.currentTimeMillis()
        );
    }
    
    /**
     * Historical statistics (HBase)
     */
    public Map<String, Object> getHistoryStats(String pageId, String date) {
        return hbaseService.getPageStats(pageId, date);
    }
    
    /**
     * Daily data migration (runs in the early morning)
     */
    @Scheduled(cron = "0 0 1 * * ?")
    public void migrateData() {
        String yesterday = LocalDate.now().minusDays(1).toString();
        
        // 1. Fetch data from Redis (KEYS blocks the server on a large keyspace; SCAN is safer in production)
        Set<String> keys = redisTemplate.keys("realtime:*:" + yesterday + "*");
        
        // 2. Write to HBase in bulk
        for (String key : keys) {
            Map<String, String> stats = extractStatsFromRedis(key);
            hbaseService.savePageStats(stats);
            
            // 3. Clean up Redis (keep 7 days)
            if (isExpired(key)) {
                redisTemplate.delete(key);
            }
        }
    }
}

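IV. 7 pitfalls to avoid: 90% of people have hit these!

Pitfall 1: Redis runs out of memory

Problem: statistics data grows without bound until Redis memory is exhausted

Solution:

// Increment first, then set a TTL only when the key has just been created;
// calling set(key, "0", ...) on an existing key would reset the counter to zero
String key = "pv:page:" + LocalDate.now() + ":" + pageId;
Long pv = redisTemplate.opsForValue().increment(key);
if (pv != null && pv == 1L) {
    redisTemplate.expire(key, Duration.ofDays(7));
}

// Configure an eviction policy as a safety net; note that eviction can also drop
// live counters, so alert on memory usage well before maxmemory is reached
// redis.conf: maxmemory-policy allkeys-lru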

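Pitfall 2: counter concurrency problems

Problem: several requests update the counter at the same time (for example with GET followed by SET) and updates are lost

Solution:

// INCR is already atomic on its own; a Lua script becomes useful when
// several commands must run atomically together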
String luaScript = "return redis.call('INCR', KEYS[1])";
DefaultRedisScript<Long> script = new DefaultRedisScript<>(luaScript, Long.class);
Long count = redisTemplate.execute(script, Collections.singletonList(key));
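
To see why read-then-write loses updates while an atomic increment does not, here is a small in-process analogy. It does not talk to Redis: AtomicLong.incrementAndGet plays the role of INCR, and the plain read-modify-write plays the role of GET followed by SET. All names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** In-process analogy for GET+SET versus INCR (illustrative only, no Redis involved). */
final class LostUpdateDemo {

    /** Runs the task perThread times on each of the given number of threads and waits for them. */
    static void runConcurrently(int threads, int perThread, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    task.run();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Read, then write: two threads can read the same value and overwrite each other. */
    static long countWithReadThenWrite(int threads, int perThread) {
        long[] counter = {0};
        runConcurrently(threads, perThread, () -> {
            long current = counter[0];   // like GET
            counter[0] = current + 1;    // like SET
        });
        return counter[0];
    }

    /** Atomic increment: every update is applied exactly once. */
    static long countAtomically(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        runConcurrently(threads, perThread, counter::incrementAndGet);
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("read-then-write: " + countWithReadThenWrite(8, 100_000));
        System.out.println("atomic:          " + countAtomically(8, 100_000));
    }
}
```

The atomic variant always ends at threads × perThread; the read-then-write variant usually ends lower, and by a different amount on every run, which is the same kind of shortfall as the missing orders in the opening story.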

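Pitfall 3: HyperLogLog error is too large

Problem: the UV error exceeds 1% and the boss is not happy

Solution:

// Standard error = 1.04/√m; Redis uses m = 2^14 registers, so the error is about 0.81% and cannot be tuned
// Merging does not reduce the error; if exact counts are required, use a Set or a Bitmap instead
// PFMERGE (union) is still the right way to combine several HyperLogLogs, e.g. daily keys into a weekly one
String key = "uv:merged:" + date;
redisTemplate.opsForHyperLogLog().union(key, uvKey1, uvKey2);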

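Pitfall 4: Bitmap user IDs are too large

Problem: user IDs are UUIDs, so they cannot be used as Bitmap offsets

Solution:

// Map each user ID to a dense, consecutive number. The in-memory HashMap below only illustrates the idea:
// in production the mapping has to live in shared storage so that every application instance sees the same index.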
Map<String, Long> userIdMapping = new HashMap<>();
Long userIndex = userIdMapping.computeIfAbsent(userId, k -> generateUserIndex());

// Use the mapped number as the bit offset
redisTemplate.opsForValue().setBit("bitmap:uv:" + date, userIndex, true);

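Pitfall 5: data skew in the Redis Cluster

Problem: a few keys receive far more traffic than the rest, so a single node is overloaded

Solution:

// Split one hot counter across 10 shard keys and sum the shards when reading.
// floorMod keeps the shard index non-negative even when hashCode() is negative.
int shard = Math.floorMod(userId.hashCode(), 10);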
String key = "pv:page:" + date + ":" + pageId + ":" + shard;
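
Here is a small self-contained sketch of the sharding arithmetic, assuming 10 shards as in the snippet. The helper names are made up, and the long[] passed to total stands in for the values that would be read back from the 10 shard keys.

```java
/** Shard-index arithmetic for a hot counter (illustrative names, no Redis calls). */
final class ShardedCounter {

    /** Maps any hash, including negative ones, to a shard index in [0, shards). */
    static int shardOf(int hash, int shards) {
        return Math.floorMod(hash, shards);
    }

    /** Builds the shard key for one visit, following the key layout used in this article. */
    static String shardKey(String date, String pageId, String userId, int shards) {
        return "pv:page:" + date + ":" + pageId + ":" + shardOf(userId.hashCode(), shards);
    }

    /** The real PV is the sum over all shard keys. */
    static long total(long[] shardValues) {
        long sum = 0;
        for (long value : shardValues) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // The % operator keeps the sign of the dividend, floorMod does not.
        System.out.println(-7 % 10);
        System.out.println(shardOf(-7, 10));
        System.out.println(shardKey("2025-12-21", "homepage", "user-42", 10));
    }
}
```

With the plain % operator a negative hashCode() produces keys such as ...:-7, which silently creates extra shards that a reader summing shards 0 to 9 would never see.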

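Pitfall 6: too many statistics dimensions

Problem: one key per dimension makes the number of keys explode

Solution:

// Store several dimensions in one Hash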
String key = "stats:page:" + date + ":" + pageId;
Map<String, String> stats = Map.of(
    "pv", "1000",
    "uv", "800", 
    "ip", "500",
    "bounce_rate", "0.3"
);
redisTemplate.opsForHash().putAll(key, stats);

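Pitfall 7: real-time statistics latency

Problem: the statistics API responds slowly and hurts the user experience

Solution:

// Asynchronous statistics + local cache
// @Async already runs this method on Spring's task executor (requires @EnableAsync),
// so the caller returns immediately and no extra CompletableFuture.runAsync wrapper is needed
@Async
public void asyncRecordVisit(String pageId, String userId) {
    recordVisit(pageId, userId);
}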

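V. 3 core monitoring metrics to keep your statistics rock solid!

1. Redis memory usage monitoring

@Component
public class RedisMetrics {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    @Autowired
    private AlertService alertService; // application-specific alerting bean (assumed, not shown here)
    
    @Scheduled(fixedRate = 60000)
    public void monitorMemory() {
        // INFO memory comes back as java.util.Properties
        Properties info = redisTemplate.execute(
            (RedisCallback<Properties>) connection -> connection.serverCommands().info("memory")
        );
        if (info == null) {
            return;
        }
        
        long usedMemory = Long.parseLong(info.getProperty("used_memory"));
        long maxMemory = Long.parseLong(info.getProperty("maxmemory"));
        
        // maxmemory is 0 when no limit is configured
        if (maxMemory > 0) {
            double usageRate = (double) usedMemory / maxMemory;
            if (usageRate > 0.8) {
                alertService.sendAlert("Redis memory usage is above 80%");
            }
        }
    }
}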

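2. Statistics latency monitoring

@Component
public class StatsLatencyMonitor {
    
    private final MeterRegistry meterRegistry;
    private final RedisTemplate<String, String> redisTemplate;
    
    // Constructor injection so that the final fields are initialized
    public StatsLatencyMonitor(MeterRegistry meterRegistry, RedisTemplate<String, String> redisTemplate) {
        this.meterRegistry = meterRegistry;
        this.redisTemplate = redisTemplate;
    }
    
    public void recordVisit(String pageId) {
        Timer.Sample sample = Timer.start(meterRegistry);
        try {
            // Statistics logic
            redisTemplate.opsForValue().increment("pv:" + pageId);
        } finally {
            // Tagging by pageId creates one time series per page, so keep the set of pages bounded
            sample.stop(meterRegistry.timer("stats.record.latency", "page", pageId));
        }
    }
}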

3. Data accuracy validation

@Service
public class DataAccuracyChecker {
    
    /**
     * Periodically check that Redis and the database agree
     */
    @Scheduled(cron = "0 0 */6 * * ?")
    public void checkAccuracy() {
        String date = LocalDate.now().minusDays(1).toString();
        
        // 1. Fetch data from Redis
        long redisPV = getRedisPV(date);
        long redisUV = getRedisUV(date);
        
        // 2. Fetch data from the database
        long dbPV = getDatabasePV(date);
        long dbUV = getDatabaseUV(date);
        
        // 3. Compute the relative error (guarding against a zero baseline)
        double pvError = Math.abs(redisPV - dbPV) / (double) Math.max(dbPV, 1);
        double uvError = Math.abs(redisUV - dbUV) / (double) Math.max(dbUV, 1);
        
        // 4. Alert
        if (pvError > 0.01 || uvError > 0.01) {
            alertService.sendAccuracyAlert(pvError, uvError);
        }
    }
}
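
The relative-error step is easy to get subtly wrong when the baseline is zero, so here is a tiny standalone sketch of that calculation. AccuracyCheck and its methods are hypothetical helpers, and the 1% threshold simply mirrors the check above.

```java
/** Relative-error helper for comparing a Redis count with a reference count (hypothetical). */
final class AccuracyCheck {

    /** |observed - reference| / reference, with explicit handling of a zero reference. */
    static double relativeError(long observed, long reference) {
        if (reference == 0) {
            return observed == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return Math.abs(observed - reference) / (double) reference;
    }

    /** True when the deviation is strictly larger than the allowed threshold. */
    static boolean exceeds(long observed, long reference, double threshold) {
        return relativeError(observed, reference) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(relativeError(70_000, 100_000));
        System.out.println(exceeds(99_500, 100_000, 0.01));
        System.out.println(exceeds(5, 0, 0.01));
    }
}
```

One thing to keep in mind: if the UV side comes from HyperLogLog, a 1% threshold sits close to its 0.81% standard error, so occasional alerts on UV may be noise rather than a real discrepancy.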

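VI. Performance reference (measured data)

Validated in production with 5 million daily active users:

Approach	Memory usage	Supported QPS	Error rate	Use case
String counter	Very low	100,000+	0%	PV statistics
HyperLogLog	12 KB per key	50,000+	<1%	UV statistics
Bitmap	Medium	30,000+	0%	Exact UV
Sorted Set	Relatively high	20,000+	0%	Rankings
Hash	Medium	50,000+	0%	Multi-dimensional
Lua script	Very low	80,000+	0%	Complex logic

VII. Summary: 3 golden rules for Redis statistics

Pick the right data structure: String for PV, HyperLogLog for UV, Sorted Set for rankings
Set sensible expirations: keep the last 7 days in Redis and move historical data to cold storage
Monitor properly: memory usage, statistics latency and data accuracy, none of them can be skipped

Traffic statistics with Redis is not just counting; it is a complete statistics system. Design it well and your data is accurate down to the single digit; design it badly and the boss will want a word with you every day.

Remember this: Redis statistics done well is a power tool, and done badly it is a time bomb!

So, will your traffic statistics still be off?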

Author: jiangyi
URL: http://jiangyi.space/articles/2025/12/21/1766304293518.html
