How the hash_scan option of the slave_rows_search_algorithms parameter is implemented

Shared by LinuxBoy on 2019-09-17 11:09:29

slave_rows_search_algorithms is set to a combination of three values: TABLE_SCAN, INDEX_SCAN, and HASH_SCAN.

TABLE_SCAN,INDEX_SCAN (the default setting: use an index if one exists, otherwise fall back to a full table scan)

HASH_SCAN can partially solve the replication lag caused by tables that have no primary key.

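When a table has no primary key or unique key and DML against it is replicated in ROW format, every row's before image may trigger a full table scan (or a secondary index scan) on the replica. In most cases this cost is unacceptable, and it produces heavy replication lag.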
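To make that cost concrete, here is a minimal Python sketch (a model, not MySQL code) that counts how many stored rows a replica reads when every before image in an event is located by its own full table scan. The function name `rows_read_with_table_scan` and the list-of-tuples table are assumptions made for illustration.

```python
def rows_read_with_table_scan(table, before_images):
    """Count rows read when each before image is found by its own scan.

    Hypothetical model: `table` is a list of row tuples with no key, and
    every before image is matched by comparing whole rows from the start.
    """
    rows_read = 0
    for image in before_images:
        for row in table:
            rows_read += 1
            if row == image:
                break  # found the row this change applies to
    return rows_read


# A 1,000-row table and an event that changes its last 100 rows.
table = [(i, "v%d" % i) for i in range(1000)]
before_images = table[-100:]
reads = rows_read_with_table_scan(table, before_images)
```

In this model the 100-row event costs 95,050 row reads against a table of only 1,000 rows, because each changed row restarts the scan from the beginning.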

How hash_scan is implemented

Put simply, when a rows_log_event is applied, the row changes carried by the log_event are cached in two structures: m_hash and m_distinct_key_list.

m_hash: a hash table, used mainly to cache the start positions of the changed row records.

m_distinct_key_list: if the table has an index, the index values are pushed onto m_distinct_key_list; if the table has no index, this List structure is not used.

The call sequence of the pre-scan is as follows:

Log_event::apply_event
  Rows_log_event::do_apply_event
    Rows_log_event::do_hash_scan_and_update
      Rows_log_event::do_hash_row (add entry info of changed records)
      if (m_key_index < MAX_KEY) (index used instead of table scan)
        Rows_log_event::add_key_to_distinct_keyset ()

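When one event contains changes to several rows, all of the changes are scanned first and the results are cached in m_hash. If the table has an index, the index values are also cached in the m_distinct_key_list List; if it does not, that structure is not used and a full table scan is performed directly.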
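The pre-scan described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the server code: a dict stands in for m_hash and stores pending after images (the real structure caches record positions inside the event), and the `prescan` function and its `key_columns` parameter are hypothetical names.

```python
def prescan(event_rows, key_columns=None):
    """Model the pre-scan: hash every before image, collect distinct keys.

    `event_rows` is a list of (before_image, after_image) tuples.
    `key_columns` is a tuple of column positions for the chosen index, or
    None when the table has no usable index (assumption of this sketch).
    """
    m_hash = {}               # before image -> list of pending after images
    m_distinct_key_list = []  # only filled when an index is available
    seen_keys = set()
    for before, after in event_rows:
        m_hash.setdefault(before, []).append(after)
        if key_columns is not None:
            key = tuple(before[c] for c in key_columns)
            if key not in seen_keys:
                seen_keys.add(key)
                m_distinct_key_list.append(key)
    return m_hash, m_distinct_key_list


event_rows = [
    ((1, "a", 10), (1, "a", 11)),
    ((2, "a", 20), (2, "a", 21)),
    ((3, "b", 30), (3, "b", 31)),
]
# Index on column 1: three changed rows share two distinct key values.
m_hash, keys = prescan(event_rows, key_columns=(1,))
# No index: the key list stays empty, so a table scan would be used.
_, no_keys = prescan(event_rows)
```

Only distinct key values are kept, so rows that share an index value cause a single index lookup later rather than one lookup per row.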

The execution stack is as follows:

#0 handler::ha_delete_row

#1 0x0000000000a4192b in Delete_rows_log_event::do_exec_row

#2 0x0000000000a3a9c8 in Rows_log_event::do_apply_row

#3 0x0000000000a3c1f4 in Rows_log_event::do_scan_and_update

#4 0x0000000000a3c5ef in Rows_log_event::do_hash_scan_and_update

#5 0x0000000000a3d7f7 in Rows_log_event::do_apply_event

#6 0x0000000000a28e3a in Log_event::apply_event

#7 0x0000000000a8365f in apply_event_and_update_pos

#8 0x0000000000a84764 in exec_relay_log_event

#9 0x0000000000a89e97 in handle_slave_sql

#10 0x0000000000e341c3 in pfs_spawn_thread

#11 0x0000003a00a07851 in start_thread ()

#12 0x0000003a006e767d in clone ()

Execution flow:

Rows_log_event::do_scan_and_update
  open_record_scan()
  do
    next_record_scan()
      if (m_key_index >= MAX_KEY)
        ha_rnd_next();
      else
        ha_index_read_map(m_key from m_distinct_key_list)
    entry= m_hash->get()
    m_hash->del(entry);
    do_apply_row()
  while (m_hash->size > 0);
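The index branch of the loop above can be illustrated with a small Python model. Everything here is an assumption made for the sketch: a dict of row positions stands in for the secondary index, `apply_with_index` is a hypothetical name, and the updates do not change the indexed column, so the stand-in index never goes stale.

```python
from collections import defaultdict


def apply_with_index(table, index_col, event_rows):
    """Apply an event by visiting only rows that share a changed key value.

    `table` is a list of row tuples, `index_col` the position of a
    non-unique indexed column, `event_rows` a list of (before, after)
    tuples. Returns the number of rows read through the index.
    """
    # Stand-in secondary index: key value -> positions of matching rows.
    index = defaultdict(list)
    for pos, row in enumerate(table):
        index[row[index_col]].append(pos)

    # Pre-scan: hash the before images and keep each key value once.
    m_hash = {}
    for before, after in event_rows:
        m_hash.setdefault(before, []).append(after)
    m_distinct_key_list = list(
        dict.fromkeys(before[index_col] for before, _ in event_rows))

    rows_read = 0
    for key in m_distinct_key_list:      # one index lookup per distinct key
        for pos in index[key]:
            rows_read += 1
            row = table[pos]
            pending = m_hash.get(row)    # average O(1) probe
            if pending is None:
                continue                 # same key value, but not changed
            table[pos] = pending.pop(0)  # stands in for do_apply_row()
            if not pending:
                del m_hash[row]
    return rows_read


# 1,000 rows, 100 rows per key value; the event changes 50 rows with key 3.
table = [(i, i % 10, "v") for i in range(1000)]
event_rows = [(r, (r[0], r[1], "changed"))
              for r in table if r[1] == 3 and r[0] < 500]
reads = apply_with_index(table, 1, event_rows)
changed = sum(1 for r in table if r[2] == "changed")
```

In this run the 50 changed rows share a single key value, so the model reads the 100 rows under that key once instead of repeating an index range scan for each changed row.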

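The execution flow shows that with hash_scan the table is scanned in full only once per event. The m_hash table is probed many times along the way, but each probe is an O(1) lookup, so its cost is small. The number of scans drops and the event is applied more efficiently.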
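The single-pass behaviour for a table without any index can be sketched in Python as well. This is a model under stated assumptions rather than the server's implementation: a dict plays the role of m_hash, a list of tuples plays the role of the table, and `apply_with_hash_scan` is a hypothetical name.

```python
def apply_with_hash_scan(table, event_rows):
    """Apply an event with one table pass and hash lookups (sketch only).

    `table` is a list of row tuples without a key; `event_rows` is a list
    of (before_image, after_image) tuples. Returns the rows read.
    """
    # Pre-scan: before image -> pending after images.
    m_hash = {}
    for before, after in event_rows:
        m_hash.setdefault(before, []).append(after)

    rows_read = 0
    for pos, row in enumerate(table):
        if not m_hash:
            break                        # every cached change is applied
        rows_read += 1
        pending = m_hash.get(row)        # average O(1) probe
        if pending is None:
            continue
        table[pos] = pending.pop(0)      # stands in for do_apply_row()
        if not pending:
            del m_hash[row]
    return rows_read


# A 1,000-row table and an event that changes its last 100 rows.
table = [(i, "v%d" % i) for i in range(1000)]
event_rows = [(row, (row[0], "changed")) for row in table[-100:]]
reads = apply_with_hash_scan(table, event_rows)
```

Here the 100-row event is applied with 1,000 row reads, one pass over the table, and the loop stops as soon as the hash is empty.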
