MySQL Binlog: PURGE MASTER LOGS Fails

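Background:

Under our disk space maintenance policy, binlogs are kept for 7 days by default. When disk space runs low, however, an automated job purges binlogs beyond a certain file count, based on the disk usage rate.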
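The policy above can be sketched roughly as follows. This is only an illustration, not our actual job: the function names, the 85% usage threshold, and the number of files to keep are all hypothetical. The only MySQL-specific piece is the PURGE BINARY LOGS TO statement, which deletes every binlog file older than the named one.

```python
from typing import List, Optional


def purge_target(binlogs: List[str], disk_usage_pct: float,
                 threshold_pct: float = 85.0,
                 keep_newest: int = 20) -> Optional[str]:
    """Return the oldest binlog to KEEP, or None if nothing should be purged.

    `binlogs` must be ordered oldest to newest, as in SHOW BINARY LOGS.
    `threshold_pct` and `keep_newest` are hypothetical policy knobs.
    """
    if keep_newest < 1:
        raise ValueError("keep_newest must be at least 1")
    # Only purge when the disk is under pressure and there is a surplus.
    if disk_usage_pct < threshold_pct or len(binlogs) <= keep_newest:
        return None
    return binlogs[-keep_newest]


def purge_statement(target: str) -> str:
    """Build the statement that removes every binlog older than `target`."""
    return f"PURGE BINARY LOGS TO '{target}'"
```

For example, with five binlog files, 90% disk usage, and keep_newest=2, the sketch picks the fourth file as the purge target, so the three oldest files would be removed.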

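Cause:

A server reported low disk space. After logging in, we found that binlogs were taking up too much space and driving disk usage up. The automated binlog purge job was running normally, yet the binlogs were not being purged in time.

Running PURGE MASTER LOGS manually returned almost immediately, but with a warning: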

+---------+------+-----------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                         |
+---------+------+-----------------------------------------------------------------------------------------------------------------+
| Warning | 1867 | file ./mysql-bin.011530 was not purged because it was being read by 1 thread(s), purged only 0 out of 51 files. |
+---------+------+-----------------------------------------------------------------------------------------------------------------+

The official documentation explains:

 If you have an active slave that currently is reading one of the log files you are trying to delete, this statement does not delete the log file that is in use or any log files later than that one, but it deletes any earlier log files. A warning message is issued in this situation. However, if a slave is not connected and you happen to purge one of the log files it has yet to read, the slave will be unable to replicate after it reconnects.

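In other words, when PURGE MASTER LOGS is used to clean up binlogs, the file that an active replica is still reading, and every file newer than it, is not deleted. In this case the warning shows that 0 of the 51 targeted files were purged, so the purge had no effect.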
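Because the statement itself succeeds and only raises a warning, a cleanup job can appear to run normally while purging nothing. One possible safeguard, shown here as a minimal sketch, is to inspect the SHOW WARNINGS output after each purge and parse the message text. The function name and the returned keys are hypothetical, and the pattern is based only on the message shown above, so it may need adjusting for other MySQL versions.

```python
import re
from typing import Dict, Optional

# Pattern for the text of warning 1867 as it appears in the output above.
_NOT_PURGED = re.compile(
    r"file (?P<file>\S+) was not purged because it was being read by "
    r"(?P<threads>\d+) thread\(s\), purged only (?P<purged>\d+) out of "
    r"(?P<total>\d+) files"
)


def parse_purge_warning(message: str) -> Optional[Dict[str, object]]:
    """Extract the blocking file and the purge counts from a warning message.

    Returns None when the message is not a "was not purged" warning.
    """
    m = _NOT_PURGED.search(message)
    if m is None:
        return None
    return {
        "file": m.group("file"),           # binlog file still being read
        "threads": int(m.group("threads")),  # number of reader threads
        "purged": int(m.group("purged")),    # files actually removed
        "total": int(m.group("total")),      # files the statement targeted
    }
```

A job could raise an alert whenever the parsed "purged" count is lower than "total", instead of reporting success.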

SHOW PROCESSLIST reveals the following thread:

*************************** 11. row ***************************
     Id: 6629525
   User: magpie
   Host: xxx.xxx.xxx.xxx:62054
     db: NULL
Command: Binlog Dump
   Time: 283180
  State: Sending to client
   Info: NULL

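283180/60/60/24 ≈ 3.28, so this thread had been in the "Sending to client" state for more than 3 days. The oldest binlog was also about 3 days old, so we concluded that this thread was blocking the purge. After manually killing the thread with KILL and running PURGE MASTER LOGS again, the binlogs were purged normally.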
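The check above can be sketched as a small helper that converts the Time column to days and picks out long-running Binlog Dump threads from SHOW PROCESSLIST rows. The function names and the one-day threshold are hypothetical; rows are assumed to be dictionaries keyed by the column names shown in the output above. Whether such a thread is safe to kill still has to be decided case by case.

```python
from typing import Dict, List

SECONDS_PER_DAY = 60 * 60 * 24


def seconds_to_days(seconds: int) -> float:
    """Convert the processlist Time column (seconds) to days."""
    return seconds / SECONDS_PER_DAY


def stale_dump_threads(rows: List[Dict], max_days: float = 1.0) -> List[Dict]:
    """Return Binlog Dump threads whose Time exceeds `max_days`.

    Matches both "Binlog Dump" and "Binlog Dump GTID" commands.
    """
    return [
        row for row in rows
        if str(row["Command"]).startswith("Binlog Dump")
        and seconds_to_days(int(row["Time"])) > max_days
    ]
```

For the thread shown above, seconds_to_days(283180) is about 3.28, so it would be flagged; each Id returned is a candidate for a manual KILL <Id> after review.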