admin管理员组文章数量:1034019
TiDB 可观测性解读(二)丨算子执行信息性能诊断案例分享(下)
换个日期,查询慢了 160 倍还没跑出来
实际案例
这是另一个客户的实际案例:查询 12 月 20 日 数据的 SQL 语句仅耗时 25 分钟便完成,而查询 12 月 21 日 数据的 SQL 语句却运行了超过 40 小时仍未完成,最终只能手动终止。以下是这两条 SQL 语句及其对应的执行信息:
代码语言:javascript代码运行次数:0运行复制【12.20日】
explain analyze INSERT INTO t1 (col0, col1)
SELECT
str_to_date('20241219', '%Y%m%d') AS col0,
count(1)
FROM t0 WHERE
col0 = str_to_date('20241219', '%Y%m%d')
AND col2 = '1';
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| id | task | estRows | actRows | operator info | execution info | memory | disk |
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| Insert_1 | root | 0 | 0 | N/A | time:25m21.1s, loops:1, prepare: 25m21.1s, insert:62.2µs, lock_keys: {time:3.28ms, region:1, keys:1, slowest_rpc: {total: 0.003s, region_id: 615664606, store: 214.194.4.93:20160, tikv_wall_time: 230.1µs}, lock_rpc:2.811716ms, rpc_count:1} | 29.2 KB | N/A |
| └─Projection_7 | root | 1 | 1 | col0 -> Column#209, Column#208 | time:25m21.1s, loops:2, Concurrency:OFF | 380 Bytes| N/A |
| └─HashAgg_16 | root | 1 | 1 | funcs:count(Column#215) -> Column#208 | time:25m21.1s, loops:2, partial_worker:{wall_time:25m21.069799539s, concurrency:5, task_num:82364, tot_wait:2h6m36.573780845s, tot_exec:8.745753026s, tot_time:2h6m45.348679761s, max:25m21.069763913s, p95:25m21.069763913s}, final_worker:{wall_time:25m21.069835493s, concurrency:5, task_num:5, tot_wait:2h6m45.348923309s, tot_exec:18.591µs, tot_time:2h6m45.348944905s, max:25m21.069801614s, p95:25m21.069801614s} | 260.2 KB | N/A |
| └─IndexLookUp_17 | root | 1 | 84339969 | N/A | time:25m19.4s, loops:82365, index_task: {total_time: 25m20.7s, fetch_handle: 7m18.7s, build: 23.5ms, wait: 18m2s}, table_task: {total_time: 2h6m24.2s, num: 5813, concurrency: 5}, next: {wait_index: 7.63ms, wait_table_lookup_build: 5.74s, wait_table_lookup_resp: 25m10.7s} | 52.6 MB | N/A |
| ├─IndexRangeScan_13(Build) | cop[tikv] | 59335552.00 | 118921714 | table:t0, index:idx_index0(col0, col3, col4), range:[2024-12-19,2024-12-19], keep order:false, stats:pseudo | time:1m0.5s, loops:116574, cop_task: {num: 3457, max: 359.8ms, min: 1.13ms, avg: 51.7ms, p95: 121.5ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 1m54s, tot_wait: 373ms, rpc_num: 3458, rpc_time: 2m58.7s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 2ms}, scan_detail: {total_process_keys: 118921714, total_process_keys_size: 14309548264, total_keys: 119003503, get_snapshot_time: 635.1ms, rocksdb: {key_skipped_count: 120441534, block: {cache_hit_count: 212755, read_count: 23658, read_byte: 538.1 MB, read_time: 179.2ms}}} | N/A | N/A |
| └─HashAgg_9(Probe) | cop[tikv] | 1 | 84339969 | funcs:count(1) -> Column#215 | time:1h47m18.5s, loops:90847, cop_task: {num: 91439604, max: 4.05s, min: 450µs, avg: 2.5ms, p95: 9.62ms, max_proc_keys: 9094, p95_proc_keys: 2, tot_proc: 3m46.3s, tot_wait: 1h36m3.1s, rpc_num: 91439649, rpc_time: 61h20m32.2s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 86ms}, scan_detail: {total_process_keys: 118921714, total_process_keys_size: 272099421438, total_keys: 353720316, get_snapshot_time: 3h23m51s, rocksdb: {delete_skipped_count: 4, key_skipped_count: 234799057, block: {cache_hit_count: 2970751374, read_count: 15457191, read_byte: 268.8 GB, read_time: 4m11.8s}}} | N/A | N/A |
| └─Selection_15 | cop[tikv] | 59335.55 | 110035144 | eq(t0.col2, "1") | tikv_task:{proc max:321ms, min:0s, avg: 193.5µs, p80:0s, p95:0s, iters:91440577, tasks:91439604} | N/A | N/A |
| └─TableRowIDScan_14 | cop[tikv] | 59335552.00 | 118921714 | table:t0, keep order:false, stats:pseudo | tikv_task:{proc max:321ms, min:0s, avg: 183.3µs, p80:0s, p95:0s, iters:91440577, tasks:91439604} | N/A | N/A |
+------------------------+-----------+------------+--------------------------------------------------+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
【12.21日】
explain analyze INSERT INTO t1 (col0, col1)
SELECT
str_to_date('20241220', '%Y%m%d') AS col0,
count(1)
FROM t0 WHERE
col0 = str_to_date('20241220', '%Y%m%d')
AND col2 = '1';
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| id | task | estRows | actRows | operator info | execution info | memory | disk |
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| Insert_1 | root | 0 | 0 | N/A | time:40h30m50.9s, loops:1 | 0 Bytes | N/A |
| └─Projection_7 | root | 1 | 0 | col0 -> Column#209, Column#208 | time:40h30m50.9s, loops:1, Concurrency:OFF | 8.24 KB | N/A |
| └─HashAgg_16 | root | 1 | 0 | funcs:count(Column#215) -> Column#208 | time:40h30m50.9s, loops:1, partial_worker:{wall_time:40h30m50.867278633s, concurrency:5, task_num:141733, tot_wait:202h33m46.814913611s, tot_exec:27.470849055s, tot_time:202h34m14.336192096s, max:40h30m50.867251785s, p95:40h30m50.867251785s}, final_worker:{wall_time:40h30m50.867835193s, concurrency:5, task_num:5, tot_wait:202h34m14.336323864s, tot_exec:587.154µs, tot_time:202h34m14.336916359s, max:40h30m50.867823183s, p95:40h30m50.867823183s} | 253.9 KB | N/A |
| └─IndexLookUp_17 | root | 1 | 145134747 | N/A | time:40h30m44.5s, loops:141734, index_task: {total_time: 40h30m50.8s, fetch_handle: 1h26m6.1s, build: 53.4ms, wait: 39h4m44.7s}, table_task: {total_time: 202h30m18s, num: 10576, concurrency: 5}, next: {wait_index: 38.7ms, wait_table_lookup_build: 16m5.5s, wait_table_lookup_resp: 35h58m38.1s} | 54.1 MB | N/A |
| ├─IndexRangeScan_13(Build) | cop[tikv] | 59418221.51 | 216452387 | table:t0, index:idx_index0(col0, col3, col4), range:[2024-12-20,2024-12-20], keep order:false, stats:pseudo | time:1h10m20.7s, loops:212130, cop_task: {num: 6031, max: 4.3s, min: 1.25ms, avg: 68.2ms, p95: 157.5ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 4m2.6s, tot_wait: 1.72s, rpc_num: 6032, rpc_time: 6m51.3s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 4.01s}, tikv_task:{proc max:240ms, min:0s, avg: 35ms, p80:52ms, p95:60ms, iters:235427, tasks:6031}, scan_detail: {total_process_keys: 216452387, total_process_keys_size: 26106920846, total_keys: 216513446, get_snapshot_time: 1.82s, rocksdb: {key_skipped_count: 217250001, block: {cache_hit_count: 85194, read_count: 336650, read_byte: 5.71 GB, read_time: 2.59s}}} | N/A | N/A |
| └─HashAgg_9(Probe) | cop[tikv] | 1 | 145183684 | funcs:count(1) -> Column#215 | time:200h6m9.4s, loops:157469, cop_task: {num: 152652462, max: 5.48s, min: 428.9µs, avg: 3.46ms, p95: 9.78ms, max_proc_keys: 9041, p95_proc_keys: 3, tot_proc: 3h59m54.1s, tot_wait: 9h5m23.6s, rpc_num: 152657420, rpc_time: 138h15m0.2s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 5.94s}, tikv_task:{proc max:468ms, min:0s, avg: 494.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462}, scan_detail: {total_process_keys: 216364520, total_process_keys_size: 340485612010, total_keys: 490792535, get_snapshot_time: 12h12m26.1s, rocksdb: {delete_skipped_count: 1944, key_skipped_count: 274430096, block: {cache_hit_count: 5855418083, read_count: 285117868, read_byte: 4977.3 GB, read_time: 4h18m23.9s}}} | N/A | N/A |
| └─Selection_15 | cop[tikv] | 59418.22 | 207051411 | eq(t0.col2, "1") | tikv_task:{proc max:1.3s, min:0s, avg: 489.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462} | N/A | N/A |
| └─TableRowIDScan_14 | cop[tikv] | 59418221.51 | 232766720 | table:t0, keep order:false, stats:pseudo | tikv_task:{proc max:1.3s, min:0s, avg: 478.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462} | N/A | N/A |
+------------------------+-----------+------------+--------------------------------------------------+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
诊断分析
首先对比两次查询的执行计划,确认执行计划未发生跳变。接着,我们从上至下逐一分析,找出两次查询中主要执行时间差异的来源。可以看到,Insert_1、Projection_7 和 HashAgg_16 的执行时间均在秒级,因此可以排除这些算子的嫌疑。
重点聚焦于 IndexLookUp 算子。在 12 月 21 日的查询中,index_task 的总耗时为 40 小时,但实际获取 rowid 的时间仅为 1 个多小时,其余时间主要消耗在等待 table_task 读取表数据上,具体为 wait:39h4m44.7s。
进一步查看 table_task 的执行细节,以下两个信息值得关注:total_time:202h30m18s 和 concurrency:5。计算可得,平均每个并发任务的耗时为 202h / 5 = 40.4h,与 IndexLookUp 的总耗时基本一致。因此,基本可以确定问题出在 table_task 上,即对应的 HashAgg_9算子。
在 HashAgg_9 算子的 execution info 中,主要包含以下关键信息:
- time: 200h6m9.4s, loops: 157469:该算子的任务类型为 "cop[tikv]",表明其实际在 TiKV 上执行。这里的执行时间反映了 TiDB 等待和读取 Coprocessor 任务返回数据的总耗时。需要注意的是,这个时间并不是实际的 walltime,而是多个并发任务的执行时间累加而成。由于 table_task 的并发度为 5,因此实际算子的 walltime 大约为 200h / 5 = 40h,与总耗时一致。从 TiDB 9.0 版本开始,我们将这种并发执行时间的统计项名称改为 total_time,以便与真正的 walltime 区分开来。
- cop_task: 此部分信息描述了向 TiKV 发送 cop task 的执行情况。其中,rpc_time 和 distsql_concurrency 是两个关键指标:
- rpc_time 是 TiDB 端记录的从发送 cop task 到接收结果的总耗时。
- distsql_concurrency 是单个 table_task 发送 cop_task 的并发度。由于 table_task 的并发度为 5,因此整体的 cop task 并发度为 5 × 15 = 75。
根据这两个指标,我们可以推算出 cop task 的实际 walltime 大约为 138h / 75 ≈ 1.8h,与整体的 40h 相比,占比并不高。
- backoff:这部分描述了 cop task 发送失败时重试的时间。
- tikv_task:这部分是描述的 HashAgg_9 这个算子在 TiKV 上的执行信息,实际时间并不多。
- scan_detail:描述了 TiKV 从 RocksDB 读取数据的信息统计,由于 tikv_task 推算出算子总执行时间并不长,所以这部分信息不影响总体性能。
综合上述信息可以发现,HashAgg_9 的总执行时间并非主要消耗在 cop task 的执行上,而更可能是花费在从 cop task response 中读取数据的过程中。结合 CPU profiling 的结果,可以看到 Go runtime 的 GC(垃圾回收)占用了大约 80% 的 CPU 时间。
进一步观察 heap profiling 的结果,可以最终确认问题根源是 issue44047 ( ):此前,TiDB 为了精确统计 cop task 的 p90 等指标,会缓存所有 cop task 的执行信息。当 cop task 数量极为庞大时,这会导致 TiDB 内存紧张,进而频繁触发 Go 的 GC(垃圾回收),最终引发性能异常。该问题已经在 TiDB 6.5 版本中修复。
在 TiDB 9.0 版本中,我们将进一步丰富算子执行信息,提升系统的可观测性,具体改进包括:
1. 算子执行时间的细化
在 9.0 版本中,TiDB 算子的执行信息新增了 open 和 close 时间。此前,执行信息中的 time 并未涵盖算子初始化和结束执行的时间,这可能导致执行时间显著低于实际耗时(如 issue50377 ( ) 和 issue55957 ( ))。在 9.0 版本中,time 被修正为从进入算子到离开算子的完整 wall time,包括所有子算子的执行时间。其中:
- open:表示算子初始化所需的时间。
- close:表示从算子处理完所有数据到完全结束的时间。
- time 包含了 open 和 close 的时间。
2. 并发执行时间的区分
当算子存在多并发执行时,此前的 execution info 中显示的是并发任务的 wall time 累加值,这可能导致子算子的执行时间大于父算子的执行时间(如 issue56746 ( )),容易引起混淆。在 9.0 版本中,这些累加的时间信息(如 time、open 和 close)将被替换为 total_time、total_open 和 total_close,以更清晰地反映并发执行的真实耗时。
3. TiFlash 执行中的等待时间信息补充
在 TiFlash 的执行信息中,新增了以下等待时间的统计:
- minTSO_wait:记录 MPP Task 等待 TiFlash MinTSO 调度器调度的时间。
- pipeline_breaker_wait:在 TiFlash 的 Pipeline 执行模型中,记录包含 pipeline breaker 算子的 pipeline 等待上游 pipeline 数据的时间。目前主要用于展示包含 Join 算子的 pipeline 等待哈希表构建完成的时间。
- pipeline_queue_wait:在 TiFlash 的 Pipeline 执行模型中,记录 pipeline 在 CPU Task Thread Pool 和 IO Task Thread Pool 中的等待时间。
通过这些改进,TiDB 9.0 版本将为用户提供更全面、更准确的执行信息,帮助更好地诊断和优化查询性能。
TiDB 可观测性解读(二)丨算子执行信息性能诊断案例分享(下)
换个日期,查询慢了 160 倍还没跑出来
实际案例
这是另一个客户的实际案例:查询 12 月 20 日 数据的 SQL 语句仅耗时 25 分钟便完成,而查询 12 月 21 日 数据的 SQL 语句却运行了超过 40 小时仍未完成,最终只能手动终止。以下是这两条 SQL 语句及其对应的执行信息:
代码语言:javascript代码运行次数:0运行复制【12.20日】
explain analyze INSERT INTO t1 (col0, col1)
SELECT
str_to_date('20241219', '%Y%m%d') AS col0,
count(1)
FROM t0 WHERE
col0 = str_to_date('20241219', '%Y%m%d')
AND col2 = '1';
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| id | task | estRows | actRows | operator info | execution info | memory | disk |
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| Insert_1 | root | 0 | 0 | N/A | time:25m21.1s, loops:1, prepare: 25m21.1s, insert:62.2µs, lock_keys: {time:3.28ms, region:1, keys:1, slowest_rpc: {total: 0.003s, region_id: 615664606, store: 214.194.4.93:20160, tikv_wall_time: 230.1µs}, lock_rpc:2.811716ms, rpc_count:1} | 29.2 KB | N/A |
| └─Projection_7 | root | 1 | 1 | col0 -> Column#209, Column#208 | time:25m21.1s, loops:2, Concurrency:OFF | 380 Bytes| N/A |
| └─HashAgg_16 | root | 1 | 1 | funcs:count(Column#215) -> Column#208 | time:25m21.1s, loops:2, partial_worker:{wall_time:25m21.069799539s, concurrency:5, task_num:82364, tot_wait:2h6m36.573780845s, tot_exec:8.745753026s, tot_time:2h6m45.348679761s, max:25m21.069763913s, p95:25m21.069763913s}, final_worker:{wall_time:25m21.069835493s, concurrency:5, task_num:5, tot_wait:2h6m45.348923309s, tot_exec:18.591µs, tot_time:2h6m45.348944905s, max:25m21.069801614s, p95:25m21.069801614s} | 260.2 KB | N/A |
| └─IndexLookUp_17 | root | 1 | 84339969 | N/A | time:25m19.4s, loops:82365, index_task: {total_time: 25m20.7s, fetch_handle: 7m18.7s, build: 23.5ms, wait: 18m2s}, table_task: {total_time: 2h6m24.2s, num: 5813, concurrency: 5}, next: {wait_index: 7.63ms, wait_table_lookup_build: 5.74s, wait_table_lookup_resp: 25m10.7s} | 52.6 MB | N/A |
| ├─IndexRangeScan_13(Build) | cop[tikv] | 59335552.00 | 118921714 | table:t0, index:idx_index0(col0, col3, col4), range:[2024-12-19,2024-12-19], keep order:false, stats:pseudo | time:1m0.5s, loops:116574, cop_task: {num: 3457, max: 359.8ms, min: 1.13ms, avg: 51.7ms, p95: 121.5ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 1m54s, tot_wait: 373ms, rpc_num: 3458, rpc_time: 2m58.7s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 2ms}, scan_detail: {total_process_keys: 118921714, total_process_keys_size: 14309548264, total_keys: 119003503, get_snapshot_time: 635.1ms, rocksdb: {key_skipped_count: 120441534, block: {cache_hit_count: 212755, read_count: 23658, read_byte: 538.1 MB, read_time: 179.2ms}}} | N/A | N/A |
| └─HashAgg_9(Probe) | cop[tikv] | 1 | 84339969 | funcs:count(1) -> Column#215 | time:1h47m18.5s, loops:90847, cop_task: {num: 91439604, max: 4.05s, min: 450µs, avg: 2.5ms, p95: 9.62ms, max_proc_keys: 9094, p95_proc_keys: 2, tot_proc: 3m46.3s, tot_wait: 1h36m3.1s, rpc_num: 91439649, rpc_time: 61h20m32.2s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 86ms}, scan_detail: {total_process_keys: 118921714, total_process_keys_size: 272099421438, total_keys: 353720316, get_snapshot_time: 3h23m51s, rocksdb: {delete_skipped_count: 4, key_skipped_count: 234799057, block: {cache_hit_count: 2970751374, read_count: 15457191, read_byte: 268.8 GB, read_time: 4m11.8s}}} | N/A | N/A |
| └─Selection_15 | cop[tikv] | 59335.55 | 110035144 | eq(t0.col2, "1") | tikv_task:{proc max:321ms, min:0s, avg: 193.5µs, p80:0s, p95:0s, iters:91440577, tasks:91439604} | N/A | N/A |
| └─TableRowIDScan_14 | cop[tikv] | 59335552.00 | 118921714 | table:t0, keep order:false, stats:pseudo | tikv_task:{proc max:321ms, min:0s, avg: 183.3µs, p80:0s, p95:0s, iters:91440577, tasks:91439604} | N/A | N/A |
+------------------------+-----------+------------+--------------------------------------------------+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
【12.21日】
explain analyze INSERT INTO t1 (col0, col1)
SELECT
str_to_date('20241220', '%Y%m%d') AS col0,
count(1)
FROM t0 WHERE
col0 = str_to_date('20241220', '%Y%m%d')
AND col2 = '1';
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| id | task | estRows | actRows | operator info | execution info | memory | disk |
+-------------------------------------+-----------+-------------+-----------+-------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| Insert_1 | root | 0 | 0 | N/A | time:40h30m50.9s, loops:1 | 0 Bytes | N/A |
| └─Projection_7 | root | 1 | 0 | col0 -> Column#209, Column#208 | time:40h30m50.9s, loops:1, Concurrency:OFF | 8.24 KB | N/A |
| └─HashAgg_16 | root | 1 | 0 | funcs:count(Column#215) -> Column#208 | time:40h30m50.9s, loops:1, partial_worker:{wall_time:40h30m50.867278633s, concurrency:5, task_num:141733, tot_wait:202h33m46.814913611s, tot_exec:27.470849055s, tot_time:202h34m14.336192096s, max:40h30m50.867251785s, p95:40h30m50.867251785s}, final_worker:{wall_time:40h30m50.867835193s, concurrency:5, task_num:5, tot_wait:202h34m14.336323864s, tot_exec:587.154µs, tot_time:202h34m14.336916359s, max:40h30m50.867823183s, p95:40h30m50.867823183s} | 253.9 KB | N/A |
| └─IndexLookUp_17 | root | 1 | 145134747 | N/A | time:40h30m44.5s, loops:141734, index_task: {total_time: 40h30m50.8s, fetch_handle: 1h26m6.1s, build: 53.4ms, wait: 39h4m44.7s}, table_task: {total_time: 202h30m18s, num: 10576, concurrency: 5}, next: {wait_index: 38.7ms, wait_table_lookup_build: 16m5.5s, wait_table_lookup_resp: 35h58m38.1s} | 54.1 MB | N/A |
| ├─IndexRangeScan_13(Build) | cop[tikv] | 59418221.51 | 216452387 | table:t0, index:idx_index0(col0, col3, col4), range:[2024-12-20,2024-12-20], keep order:false, stats:pseudo | time:1h10m20.7s, loops:212130, cop_task: {num: 6031, max: 4.3s, min: 1.25ms, avg: 68.2ms, p95: 157.5ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 4m2.6s, tot_wait: 1.72s, rpc_num: 6032, rpc_time: 6m51.3s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 4.01s}, tikv_task:{proc max:240ms, min:0s, avg: 35ms, p80:52ms, p95:60ms, iters:235427, tasks:6031}, scan_detail: {total_process_keys: 216452387, total_process_keys_size: 26106920846, total_keys: 216513446, get_snapshot_time: 1.82s, rocksdb: {key_skipped_count: 217250001, block: {cache_hit_count: 85194, read_count: 336650, read_byte: 5.71 GB, read_time: 2.59s}}} | N/A | N/A |
| └─HashAgg_9(Probe) | cop[tikv] | 1 | 145183684 | funcs:count(1) -> Column#215 | time:200h6m9.4s, loops:157469, cop_task: {num: 152652462, max: 5.48s, min: 428.9µs, avg: 3.46ms, p95: 9.78ms, max_proc_keys: 9041, p95_proc_keys: 3, tot_proc: 3h59m54.1s, tot_wait: 9h5m23.6s, rpc_num: 152657420, rpc_time: 138h15m0.2s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, backoff{regionMiss: 5.94s}, tikv_task:{proc max:468ms, min:0s, avg: 494.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462}, scan_detail: {total_process_keys: 216364520, total_process_keys_size: 340485612010, total_keys: 490792535, get_snapshot_time: 12h12m26.1s, rocksdb: {delete_skipped_count: 1944, key_skipped_count: 274430096, block: {cache_hit_count: 5855418083, read_count: 285117868, read_byte: 4977.3 GB, read_time: 4h18m23.9s}}} | N/A | N/A |
| └─Selection_15 | cop[tikv] | 59418.22 | 207051411 | eq(t0.col2, "1") | tikv_task:{proc max:1.3s, min:0s, avg: 489.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462} | N/A | N/A |
| └─TableRowIDScan_14 | cop[tikv] | 59418221.51 | 232766720 | table:t0, keep order:false, stats:pseudo | tikv_task:{proc max:1.3s, min:0s, avg: 478.8µs, p80:0s, p95:4ms, iters:152653501, tasks:152652462} | N/A | N/A |
+------------------------+-----------+------------+--------------------------------------------------+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
诊断分析
首先对比两次查询的执行计划,确认执行计划未发生跳变。接着,我们从上至下逐一分析,找出两次查询中主要执行时间差异的来源。可以看到,Insert_1、Projection_7 和 HashAgg_16 的执行时间均在秒级,因此可以排除这些算子的嫌疑。
重点聚焦于 IndexLookUp 算子。在 12 月 21 日的查询中,index_task 的总耗时为 40 小时,但实际获取 rowid 的时间仅为 1 个多小时,其余时间主要消耗在等待 table_task 读取表数据上,具体为 wait:39h4m44.7s。
进一步查看 table_task 的执行细节,以下两个信息值得关注:total_time:202h30m18s 和 concurrency:5。计算可得,平均每个并发任务的耗时为 202h / 5 = 40.4h,与 IndexLookUp 的总耗时基本一致。因此,基本可以确定问题出在 table_task 上,即对应的 HashAgg_9算子。
在 HashAgg_9 算子的 execution info 中,主要包含以下关键信息:
- time: 200h6m9.4s, loops: 157469:该算子的任务类型为 "cop[tikv]",表明其实际在 TiKV 上执行。这里的执行时间反映了 TiDB 等待和读取 Coprocessor 任务返回数据的总耗时。需要注意的是,这个时间并不是实际的 walltime,而是多个并发任务的执行时间累加而成。由于 table_task 的并发度为 5,因此实际算子的 walltime 大约为 200h / 5 = 40h,与总耗时一致。从 TiDB 9.0 版本开始,我们将这种并发执行时间的统计项名称改为 total_time,以便与真正的 walltime 区分开来。
- cop_task: 此部分信息描述了向 TiKV 发送 cop task 的执行情况。其中,rpc_time 和 distsql_concurrency 是两个关键指标:
- rpc_time 是 TiDB 端记录的从发送 cop task 到接收结果的总耗时。
- distsql_concurrency 是单个 table_task 发送 cop_task 的并发度。由于 table_task 的并发度为 5,因此整体的 cop task 并发度为 5 × 15 = 75。
根据这两个指标,我们可以推算出 cop task 的实际 walltime 大约为 138h / 75 ≈ 1.8h,与整体的 40h 相比,占比并不高。
- backoff:这部分描述了 cop task 发送失败时重试的时间。
- tikv_task:这部分是描述的 HashAgg_9 这个算子在 TiKV 上的执行信息,实际时间并不多。
- scan_detail:描述了 TiKV 从 RocksDB 读取数据的信息统计,由于 tikv_task 推算出算子总执行时间并不长,所以这部分信息不影响总体性能。
综合上述信息可以发现,HashAgg_9 的总执行时间并非主要消耗在 cop task 的执行上,而更可能是花费在从 cop task response 中读取数据的过程中。结合 CPU profiling 的结果,可以看到 Go runtime 的 GC(垃圾回收)占用了大约 80% 的 CPU 时间。
进一步观察 heap profiling 的结果,可以最终确认问题根源是 issue44047 ( ):此前,TiDB 为了精确统计 cop task 的 p90 等指标,会缓存所有 cop task 的执行信息。当 cop task 数量极为庞大时,这会导致 TiDB 内存紧张,进而频繁触发 Go 的 GC(垃圾回收),最终引发性能异常。该问题已经在 TiDB 6.5 版本中修复。
在 TiDB 9.0 版本中,我们将进一步丰富算子执行信息,提升系统的可观测性,具体改进包括:
1. 算子执行时间的细化
在 9.0 版本中,TiDB 算子的执行信息新增了 open 和 close 时间。此前,执行信息中的 time 并未涵盖算子初始化和结束执行的时间,这可能导致执行时间显著低于实际耗时(如 issue50377 ( ) 和 issue55957 ( ))。在 9.0 版本中,time 被修正为从进入算子到离开算子的完整 wall time,包括所有子算子的执行时间。其中:
- open:表示算子初始化所需的时间。
- close:表示从算子处理完所有数据到完全结束的时间。
- time 包含了 open 和 close 的时间。
2. 并发执行时间的区分
当算子存在多并发执行时,此前的 execution info 中显示的是并发任务的 wall time 累加值,这可能导致子算子的执行时间大于父算子的执行时间(如 issue56746 ( )),容易引起混淆。在 9.0 版本中,这些累加的时间信息(如 time、open 和 close)将被替换为 total_time、total_open 和 total_close,以更清晰地反映并发执行的真实耗时。
3. TiFlash 执行中的等待时间信息补充
在 TiFlash 的执行信息中,新增了以下等待时间的统计:
- minTSO_wait:记录 MPP Task 等待 TiFlash MinTSO 调度器调度的时间。
- pipeline_breaker_wait:在 TiFlash 的 Pipeline 执行模型中,记录包含 pipeline breaker 算子的 pipeline 等待上游 pipeline 数据的时间。目前主要用于展示包含 Join 算子的 pipeline 等待哈希表构建完成的时间。
- pipeline_queue_wait:在 TiFlash 的 Pipeline 执行模型中,记录 pipeline 在 CPU Task Thread Pool 和 IO Task Thread Pool 中的等待时间。
通过这些改进,TiDB 9.0 版本将为用户提供更全面、更准确的执行信息,帮助更好地诊断和优化查询性能。
本文标签: TiDB 可观测性解读(二)丨算子执行信息性能诊断案例分享(下)
版权声明:本文标题:TiDB 可观测性解读(二)丨算子执行信息性能诊断案例分享(下) 内容由热心网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://it.en369.cn/jiaocheng/1748110133a2254881.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论