之前写过一篇文章MySQL如何获取binlog的开始时间和结束时间[1],文章里面介绍了如何获取MySQL数据库二进制日志(binlog)的开始时间与结束时间的一些方法。实际应用当中,我们可能还会遇到效率/性能方面的问题。最近对这个问题做了一些研究,这里就介绍一下如何快速获取MySQL二进制日志(b
之前写过一篇文章 MySQL如何获取binlog的开始时间和结束时间 [1] ,文章里面介绍了如何获取MySQL数据库二进制日志(binlog)的开始时间与结束时间的一些方法。实际应用当中,我们可能还会遇到效率/性能方面的问题。最近对这个问题做了一些研究,这里就介绍一下如何快速获取MySQL二进制日志(binlog)的开始时间和结束时间。
我们下来看看当MySQL二进制日志(binlog)的Size很大的时候,获取起开始时间和结束时间,如下测试所示
$?du?-sh?mysql_binlog.000105
1.1G????mysql_binlog.000105
$?time?mysqlbinlog??mysql_binlog.000105?|grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'
#240425??9:20:26?
real????0m34.136s
user????0m25.941s
sys?????0m11.985s
从上面实验可以看出,在MySQL二进制日志(binlog)变大的情况下,这种方法需要34秒,非常低效和耗时,那么我们怎么提升性能呢? 我们改写一下脚本,如下所示
$?time?mysqlbinlog??mysql_binlog.000105?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'
#240425??9:20:26?
real????0m0.010s
user????0m0.006s
sys?????0m0.005s
如上所示,这样改进脚本后,性能效率已经提升到0.01秒,已经相当的高效了。那么获取结束时间能否也可以这样提升呢? 很遗憾的是由于MySQL二进制日志(binlog)的结束时间/滚动时间(Rotate Time)位于文件的末尾,由于管道的一些基本特性,获取MySQL二进制日志(binlog)的结束时间无法通过上面方法来优化,这里不打算介绍Linux管道相关概念,所以我们只需知道这么一个事实。
如果你对Linux管道的一些原理不是很清楚,那么就用实验测试验证一下,如下所示:
$?time?mysqlbinlog??mysql_binlog.000105??|grep?Rotate?|?awk?-F?"server?id"?'{print?$1}'
#240426?11:11:37?
real????0m34.223s
user????0m27.202s
sys?????0m11.551s
$?time?mysqlbinlog??mysql_binlog.000105??|?tail?-10?|?grep?Rotate?|?awk?-F?"server?id"?'{print?$1}'
#240426?11:11:37?
real????0m33.917s
user????0m25.528s
sys?????0m11.395s
那么怎么来优化获取MySQL二进制日志(binlog)的结束时间呢?经过一番观察与实验,我发现一个MySQL二进制日志(binlog)的结束时间,就是下一个二进制日志(binlog)的开始时间。如下实验所示
[mysql@dbtest04?bin_logs]$?ls?-lrt
total?28
-rw-r-----?1?mysql?mysql?207?May??9?15:25?mysql_binlog.000055
-rw-r-----?1?mysql?mysql?207?May??9?15:27?mysql_binlog.000056
-rw-r-----?1?mysql?mysql?207?May?10?11:02?mysql_binlog.000057
-rw-r-----?1?mysql?mysql?207?May?10?11:34?mysql_binlog.000058
-rw-r-----?1?mysql?mysql?207?May?10?11:38?mysql_binlog.000059
-rw-r-----?1?mysql?mysql?157?May?10?11:38?mysql_binlog.000060
-rw-r-----?1?mysql?mysql?246?May?10?11:38?mysql_binlog.index
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000055?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'
#240509?14:48:10?
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000055??|grep?Rotate?|?awk?-F?"server?id"?'{print?$1}'
#240509?15:25:57?
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000056?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'
#240509?15:25:57?
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000056??|grep?Rotate?|?awk?-F?"server?id"?'{print?$1}'
#240509?15:27:37?
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000057?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'
#240509?15:27:37?
[mysql@dbtest04?bin_logs]$?mysqlbinlog??mysql_binlog.000057??|grep?Rotate?|?awk?-F?"server?id"?'{print?$1}'
#240510?11:02:00?
[mysql@dbtest04?bin_logs]$
如果全部符合这个规律的话,那么我们直接用下一个binlog的开始时间作为上一个binlog的结束时间即可,于是我写了一个脚本find_binlog_start_end_time.sh:
#!/bin/bash
#########################################################################################
#???????????????????????????????????????????????????????????????????????????????????????#
#?This?script?is?used?for?get?the?binlog?start?time?and?end?time????????????????????????#
#???????????????????????????????????????????????????????????????????????????????????????#
#########################################################################################
#???????????????????????????????????????????????????????????????????????????????????????#
#?ScriptName????????????:????find_binlog_start_end_time.sh??????????????????????????????#
#?Author????????????????:????Kerry??????????????????????????????????????????????????????#
#?CreateDate????????????:????2024-05-10?????????????????????????????????????????????????#
#?Email?????????????????:[email protected]???????????????????????????????????????#
#***************************************************************************************#
#?参数配置??????????????????????????????????????????????????????????????????????????????#
#---------------------------------------------------------------------------------------#
#?脚本参数???binlog文件存放的路径???????????????????????????????????????????????????????#
#---------------------------------------------------------------------------------------#
#?MYSQLBINLOG????????mysqlbinlog的位置,以防没有设置环境变量????????????????????????????#
#?BINLOG_BASENAME????binlog的前缀名?????????????????????????????????????????????????????#
#---------------------------------------------------------------------------------------#
#?注意事项:????????????????????????????????????????????????????????????????????????????#
#???1:如果维护的MySQL数据库都规范化安装、配置的化,下面很多参数都不需要修改????????????#
#***************************************************************************************#
#?Version????????Modified?Date????????????Description???????????????????????????????????#
#***************************************************************************************#
#?V.1.0??????????2024-05-10????????????创建此脚本???????????????????????????????????????#
#########################################################################################
#?mysqlbinlog的路径,一般无需设置,以防没有设置环境变量时
MYSQLBINLOG="/opt/mysql/mysql8.0/bin/mysqlbinlog"
BINLOG_BASENAME="mysql_binlog"
if?[?$#?=?0?]
then
???echo?"find_binlog_start_end_time.sh?Usage:"
???echo?"for?eg:?find_binlog_start_end_time.sh??/data/mysql/binlogs"
???exit
fi
BINLOG_FILE_PATH=$1
if?[?!?-d?$BINLOG_FILE_PATH?];then
????echo?"the?folder?$BINLOG_FILE_PATH?does?not?exist,?please?check?it!"
????exit?1
fi
index=1
start_time=""
end_time=""
last_binlog_name=""
BINLOG_FILE_NUM=`ls?-lrt?$BINLOG_FILE_PATH?|??grep?$BINLOG_BASENAME?|grep?-v?$BINLOG_BASENAME.index?|?wc?-l`
if?[?$BINLOG_FILE_NUM?-lt?1?];then
????echo?"pelase?check?the?binlog?or?the?parameter?of?this?script"
????exit?1;
fi
cd?$BINLOG_FILE_PATH
for?binlog_file?in?`ls?-rt?|??grep?$BINLOG_BASENAME?|grep?-v?$BINLOG_BASENAME.index`;
??do
??????if?[?$index?-eq?1?];then
?????????start_time=`$MYSQLBINLOG??$binlog_file?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'`
?????????last_binlog_name=$binlog_file
??????else
?????????end_time=`$MYSQLBINLOG??$binlog_file?|?head?-10?|?grep?"Start:?binlog"?|?awk?-F?"server?id"?'{print?$1}'`
?????????echo?"file?name:$last_binlog_name"?,?"start?time:$start_time",?"end?time:$end_time"
?????????if?[?$index?-eq?$BINLOG_FILE_NUM?];then
????????????last_end_time=`$MYSQLBINLOG??$binlog_file?|tail?-10?|?egrep?"Rotate|Stop"?|?awk?-F?"server?id"?'{print?$1}'`
????????????echo?"file?name:$binlog_file"?,?"start?time:$end_time",?"end?time:$last_end_time"
?????????else??
????????????start_time=$end_time
????????????last_binlog_name=$binlog_file
?????????fi
???????fi
??????
??????let?index++
??done
如下测试所示
[mysql@dbtest04?kerry]$?sh??find_binlog_start_end_time.sh?/data/mysql/bin_logs/
file?name:mysql_binlog.000055?,?start?time:#240509?14:48:10?,?end?time:#240509?15:25:57?
file?name:mysql_binlog.000056?,?start?time:#240509?15:25:57?,?end?time:#240509?15:27:37?
file?name:mysql_binlog.000057?,?start?time:#240509?15:27:37?,?end?time:#240510?11:02:00?
file?name:mysql_binlog.000058?,?start?time:#240510?11:02:00?,?end?time:#240510?11:34:01?
file?name:mysql_binlog.000059?,?start?time:#240510?11:34:01?,?end?time:#240510?11:38:24?
file?name:mysql_binlog.000060?,?start?time:#240510?11:38:24?,?end?time:#240510?16:45:34?
file?name:mysql_binlog.000061?,?start?time:#240510?16:45:34?,?end?time:
另外,还有一种比较高效的方法是解析二进制日志的头部信息(此篇文章统统指binlog v 4),因为binlog的头部由固定的4个字节组成,而头部信息的FORMAT_DESCRIPTION_EVENT部分包含了binlog的开始时间,我在搜索/学习相关资料时,结果发现有人已经总结过这方面的内容,而且已经有相关Python脚本或shell脚本了,这里就重复造轮子了,Python脚本来自 MySQL 查询binlog生成时间 [2] 我们来看看实验结果,如下所示
$?python3?check_bintime.py??/data/mysql/bin_logs/mysql_binlog.index
{'file_name':?'mysql_binlog.000055',?'binlog_size':?'207.0?B',?'start_time':?'2024-05-09?14:48:10',?'end_time':?'2024-05-09?15:25:57'}
{'file_name':?'mysql_binlog.000056',?'binlog_size':?'207.0?B',?'start_time':?'2024-05-09?15:25:57',?'end_time':?'2024-05-09?15:27:37'}
{'file_name':?'mysql_binlog.000057',?'binlog_size':?'207.0?B',?'start_time':?'2024-05-09?15:27:37',?'end_time':?'2024-05-10?11:02:00'}
{'file_name':?'mysql_binlog.000058',?'binlog_size':?'207.0?B',?'start_time':?'2024-05-10?11:02:00',?'end_time':?'2024-05-10?11:34:01'}
{'file_name':?'mysql_binlog.000059',?'binlog_size':?'207.0?B',?'start_time':?'2024-05-10?11:34:01',?'end_time':?'2024-05-10?11:38:24'}
{'file_name':?'mysql_binlog.000060',?'binlog_size':?'180.0?B',?'start_time':?'2024-05-10?11:38:24',?'end_time':?'2024-05-10?16:45:34'}
{'file_name':?'mysql_binlog.000061',?'binlog_size':?'157.0?B',?'start_time':?'2024-05-10?16:45:34',?'end_time':?'now'}
shell脚本来自 一种快速取得binlog开始时间的方法 [3] :
[mysql@dbtest04?kerry]$?function?getBinlogStartTime()
>?{
>?????????theFile="$1"
>?????????#取出文件头做分析
>?????????binlogHead=`hexdump?${theFile}?|?head?-1`
>?????????#binlog文件校验
>?????????binlogCrc=`echo?$binlogHead?|?awk?'{print?$1$2$3}'`
>?????????if?[?"${binlogCrc}"?!=?'000000062fe6e69'?];?then
>?????????????????echo?'${theFile}?is?damage.'
>?????????????????exit?1
>?????????fi
>?????????#计算binlog的开始时间
>?????????binlogBeginTimeInt=`echo?$binlogHead?|?awk?'{print?$5$4}'?|?gawk?'?{?printf?strtonum("0x"?$0)}'?`
>?????????binlogBeginTime=`date?-d?"1970-01-01?UTC?$binlogBeginTimeInt?seconds"?"+%F?%T"`
>?????????echo?$binlogBeginTime
>?}
[mysql@dbtest04?kerry]$?getBinlogStartTime?/data/mysql/bin_logs/mysql_binlog.000055
2024-05-09?14:48:10
不过你看其脚本也会发现,解析MySQL的binlog的头部文件,比较容易获取binlog的开始时间,而获取binlog的结束时间/轮转时间就比较麻烦了。所以python脚本中获取binlog结束时间的思路跟我的思路也是一样的。
这里介绍了两种快速获取binlog的开始时间和结束时间的两种方法,这两种方法都非常高效,至于我写的脚本find_binlog_start_end_time.sh目前还比较粗糙,后面有时间再完善补充。
2: https://blog.csdn.net/qq_42768234/article/details/126970988
[3]3: https://blog.csdn.net/shaochenshuo/article/details/120549377