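Just started using InfluxDB and ran into a "data loss" problem
I copied about 4 million log rows from MySQL and inserted them into InfluxDB, but afterwards only about 700,000 remained.
The inserts were done in batches using the official Go client.
I am new to InfluxDB, so I can't tell whether this is expected InfluxDB behavior, a bug in my own code, or something that can be changed through configuration.
Some configuration and query output: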
> SHOW RETENTION POLICIES ON mydb;
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 true
> select count(*) from alg_read_log;
name: alg_read_log
time count_content_id count_id count_user_id
---- ---------------- -------- -------------
0 736999 736999 736999
1
kkfnui OP I just ran another test with a small data set: I batch-inserted 20 rows and only 7 remained.
|
2
psnail 2017-12-07 19:27:28 +08:00
Check which rows were lost, and what distinguishes them from the rows that survived.
|
3
kkfnui OP When inserting 20 rows:
ID userid content time
13 4706513 851759 2017-12-06 17:00:00 +0000 UTC
8 5143055 4813070 2017-12-06 17:00:00 +0000 UTC
6 2439692 4434993 2017-12-06 17:00:00 +0000 UTC
4 5261489 2098786 2017-12-06 17:00:00 +0000 UTC
10 6185307 11177143 2017-12-06 17:00:00 +0000 UTC
12 6173230 37959 2017-12-06 17:00:00 +0000 UTC
20 6647995 4964641 2017-12-06 17:00:00 +0000 UTC
9 5388686 9843194 2017-12-06 17:00:00 +0000 UTC
15 4975601 4978180 2017-12-06 17:00:00 +0000 UTC
1 5203741 4554768 2017-12-06 17:00:00 +0000 UTC
18 876162 11164468 2017-12-06 17:00:00 +0000 UTC
11 4239087 11164092 2017-12-06 17:00:00 +0000 UTC
5 5864698 4978165 2017-12-06 17:00:00 +0000 UTC
7 4461930 7139200 2017-12-06 17:00:00 +0000 UTC
2 6226807 168513 2017-12-06 17:00:00 +0000 UTC
14 3687226 460986 2017-12-06 17:00:00 +0000 UTC
16 1444384 11119087 2017-12-06 17:00:00 +0000 UTC
19 3244527 5674265 2017-12-06 17:00:00 +0000 UTC
17 6528511 11162885 2017-12-06 17:00:00 +0000 UTC
3 5810244 10774121 2017-12-06 17:00:00 +0000 UTC
> select * from alg_read_log;
name: alg_read_log
time alg_id alg_scene cat_id content_id id user_id
---- ------ --------- ------ ---------- -- -------
1512579600000000000 -1 sc_d 0 11164092 11423908711164092 4239087
1512579600000000000 1 sc_d 0 4434993 624396924434993 2439692
1512579600000000000 10 sc_d 0 37959 12617323037959 6173230
1512579600000000000 2 sc_d 2 11177143 10618530711177143 6185307
1512579600000000000 21 sc_d 0 11164468 1887616211164468 876162
1512579600000000000 23 sc_d 0 5674265 1932445275674265 3244527
1512579600000000000 25 sc_d 0 4978165 558646984978165 5864698
|
4
kkfnui OP When inserting 25 rows:
17 6528511 11162885 2017-12-06 17:00:00 +0000 UTC
24 5721905 11164092 2017-12-06 17:00:00 +0000 UTC
7 4461930 7139200 2017-12-06 17:00:00 +0000 UTC
11 4239087 11164092 2017-12-06 17:00:00 +0000 UTC
16 1444384 11119087 2017-12-06 17:00:00 +0000 UTC
13 4706513 851759 2017-12-06 17:00:00 +0000 UTC
6 2439692 4434993 2017-12-06 17:00:00 +0000 UTC
14 3687226 460986 2017-12-06 17:00:00 +0000 UTC
9 5388686 9843194 2017-12-06 17:00:00 +0000 UTC
4 5261489 2098786 2017-12-06 17:00:00 +0000 UTC
19 3244527 5674265 2017-12-06 17:00:00 +0000 UTC
2 6226807 168513 2017-12-06 17:00:00 +0000 UTC
23 841747 7230377 2017-12-06 17:00:00 +0000 UTC
5 5864698 4978165 2017-12-06 17:00:00 +0000 UTC
20 6647995 4964641 2017-12-06 17:00:00 +0000 UTC
1 5203741 4554768 2017-12-06 17:00:00 +0000 UTC
8 5143055 4813070 2017-12-06 17:00:00 +0000 UTC
22 4582157 11160521 2017-12-06 17:00:00 +0000 UTC
21 3495482 919711 2017-12-06 17:00:00 +0000 UTC
15 4975601 4978180 2017-12-06 17:00:00 +0000 UTC
3 5810244 10774121 2017-12-06 17:00:00 +0000 UTC
10 6185307 11177143 2017-12-06 17:00:00 +0000 UTC
25 5266737 11109417 2017-12-06 17:00:00 +0000 UTC
12 6173230 37959 2017-12-06 17:00:00 +0000 UTC
18 876162 11164468 2017-12-06 17:00:00 +0000 UTC
> select * from alg_read_log;
name: alg_read_log
time alg_id alg_scene cat_id content_id id user_id
---- ------ --------- ------ ---------- -- -------
1512579600000000000 -1 sc_d 0 10774121 3581024410774121 5810244
1512579600000000000 1 sc_d 0 11109417 25526673711109417 5266737
1512579600000000000 10 sc_d 0 37959 12617323037959 6173230
1512579600000000000 2 sc_d 2 11177143 10618530711177143 6185307
1512579600000000000 21 sc_d 0 11164468 1887616211164468 876162
1512579600000000000 23 sc_d 0 5674265 1932445275674265 3244527
1512579600000000000 25 sc_d 0 11160521 22458215711160521 4582157
|
6
psnail 2017-12-07 19:55:16 +08:00
The InfluxDB insert format is: insert measurement,tag=value field=value timestamp
Data is stored by time. For points with the same measurement, tag keys, tag values, and timestamp, the field values are replaced by the latest write.
|
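The overwrite rule described in the reply above can be sketched in Go without a running server. This is a simplified simulation, not the client library: `Point`, `SeriesKey`, and `Write` are hypothetical names, and treating `alg_id`, `alg_scene`, and `cat_id` as tags is an assumption inferred from the query output in this thread. The tag values attached to each user below are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Point is a simplified stand-in for an InfluxDB point
// (hypothetical type, not the client library's).
type Point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]int64
	TimeNs      int64
}

// SeriesKey builds the identity of a point:
// measurement + sorted tag set + timestamp.
func SeriesKey(p Point) string {
	keys := make([]string, 0, len(p.Tags))
	for k := range p.Tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(p.Measurement)
	for _, k := range keys {
		b.WriteString("," + k + "=" + p.Tags[k])
	}
	fmt.Fprintf(&b, " %d", p.TimeNs)
	return b.String()
}

// Write simulates a batch write: a later point with the same key
// overwrites the field values of an earlier one.
func Write(points []Point) map[string]map[string]int64 {
	store := make(map[string]map[string]int64)
	for _, p := range points {
		key := SeriesKey(p)
		if store[key] == nil {
			store[key] = make(map[string]int64)
		}
		for f, v := range p.Fields {
			store[key][f] = v
		}
	}
	return store
}

func main() {
	const ts = 1512579600000000000 // 2017-12-06 17:00:00 UTC, as in the thread
	tags := map[string]string{"alg_id": "10", "alg_scene": "sc_d", "cat_id": "0"}
	points := []Point{
		{"alg_read_log", tags, map[string]int64{"user_id": 6173230, "content_id": 37959}, ts},
		{"alg_read_log", tags, map[string]int64{"user_id": 5203741, "content_id": 4554768}, ts},
		{"alg_read_log", map[string]string{"alg_id": "2", "alg_scene": "sc_d", "cat_id": "2"},
			map[string]int64{"user_id": 6185307, "content_id": 11177143}, ts},
	}
	fmt.Println(len(points), "points written,", len(Write(points)), "stored")
}
```

The first two points share a measurement, tag set, and timestamp, so only two of the three survive. Counting distinct keys this way before a bulk import gives an estimate of how many rows InfluxDB will actually keep.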
7
kkfnui OP |
8
rrfeng 2017-12-07 20:53:49 +08:00 via Android
You can think of it this way: within a given measurement and tag set, the timestamp is effectively InfluxDB's primary key.
|
9
rswl 2017-12-07 21:53:32 +08:00
Points with the same timestamp overwrite each other.
|
11
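qyvlik 2019-10-11 14:08:48 +08:00
I ran into the same problem. I added special handling for time to make sure every timestamp is different, but after exporting the data from MySQL and inserting it into InfluxDB, half of the data was still lost.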
|