? ??在mysql的备库的监控中有一项很重要的指标:Seconds_Behind_Master,这个值是怎么得到的呢?下面从5.1.58的代码中分析一下:
? ? mysql的replication中有2个比较重要的class:Master_info(rpl_mi.h), Relay_log_info(rpl_rli.h),他们分别对应于master,info文件和slave.info文件;很显然,Master_info是io_thread需要的,Relay_log_info是sql_thread需要的。Master_info中有一个变量 clock_diff_with_master,这个值记录着mysql的主库和备库的时间差,可以理解为主备的主机时间差。clock_diff_with_master变量的定义如下:
/* The difference in seconds between the clock of the master and the clock of the slave (second - first). It must be signed as it may be <0 or >0. clock_diff_with_master is computed when the I/O thread starts; for this the I/O thread does a SELECT UNIX_TIMESTAMP() on the master. "how late the slave is compared to the master" is computed like this: clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master */ long clock_diff_with_master;
? ? ?这个变量的注释直接提到了Seconds_Behind_Master的计算方法:clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master。clock_of_slave是slave的当前时间--执行show slave status的当前时间。
? ? ?先看一下clock_diff_with_master的计算:(slave.cc)。执行”start slave;“/“start slave io_thread;”后,会执行start_slave_threads来启动io thread,io thread启动后首先做的就是获取主库的mysql版本和主库的当前时间(mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()"))),获取到主库的当前时间后,用备库的当前时间减去主库的时间,得到clock_diff_with_master。
int start_slave_threads(bool need_slave_mutex, bool wait_for_start, Master_info* mi, const char* master_info_fname, const char* slave_info_fname, int thread_mask) { ... if (thread_mask & SLAVE_IO) error=start_slave_thread(handle_slave_io,lock_io,lock_cond_io, cond_io, &mi->slave_running, &mi->slave_run_id, mi, 1); //high priority, to read the most possible ... } ... /** Slave IO thread entry point. @param arg Pointer to Master_info struct that holds information for the IO thread. @return Always 0. */ pthread_handler_t handle_slave_io(void *arg) { ... thd_proc_info(thd, "Checking master version"); ret= get_master_version_and_clock(mysql, mi); ... } ... static int get_master_version_and_clock(MYSQL* mysql, Master_info* mi) { ... master_res= NULL; if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) && (master_res= mysql_store_result(mysql)) && (master_row= mysql_fetch_row(master_res))) { mi->clock_diff_with_master= (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10)); } else if (is_network_error(mysql_errno(mysql))) { mi->report(WARNING_LEVEL, mysql_errno(mysql), "Get master clock failed with error: %s", mysql_error(mysql)); goto network_err; } else { mi->clock_diff_with_master= 0; /* The "most sensible" value */ sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, " "do not trust column Seconds_Behind_Master of SHOW " "SLAVE STATUS. Error: %s (%d)", mysql_error(mysql), mysql_errno(mysql)); } ... }
? ? ?last_timestamp_executed_by_SQL_thread则是sql slave thread最近一次执行的event的时间戳(rpl.rli.cc中的last_event_start_time变量)。而这个变量的更新则是在stmt_done()函数中完成的(rpl.rli.cc),相关定义如下:
/* Used by row-based replication to detect that it should not stop at this event, but give it a chance to send more events. The time where the last event inside a group started is stored here. If the variable is zero, we are not in a group (but may be in a transaction). */ time_t last_event_start_time; /** Helper function to do after statement completion. This function is called from an event to complete the group by either stepp