How-to fix Mysql slave after relay log corrupted
October 17, 2013 10 Comments
We run into issues like “Relay log read failure: Could not parse relay log event entry” many times on Mysql slave node. Most of time in our cases, it’s due to Linux host crashed unexpectedly.
The detail error is like this:
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: alexzeng.wordpress.com Master_User: mysql_rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: LBMS-bin.000012 Read_Master_Log_Pos: 239005305 Relay_Log_File: LBMS-relay-bin.000004 Relay_Log_Pos: 221245258 Relay_Master_Log_File: LBMS-bin.000012 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1594 Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Skip_Counter: 0 Exec_Master_Log_Pos: 221245113 Relay_Log_Space: 239007065 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1594 Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec)
In this case, we can check if the bin log at master node is still available.
Relay_Master_Log_File: LBMS-bin.000012 ... Exec_Master_Log_Pos: 221245113
If it’s still available at master node, we can reset slave to that point, and let it recover by itself as follows:
Be careful, using Relay_Master_Log_File value in “show slave status\G” for master_log_file, not Master_Log_File value. They’re the same in this case.
mysql> slave stop; Query OK, 0 rows affected (0.00 sec) mysql> change master to master_log_file='LBMS-bin.000012',master_log_pos=221245113; Query OK, 0 rows affected (0.04 sec) mysql> start slave; Query OK, 0 rows affected (0.00 sec)
Check slave status again and again, make sure Exec_Master_Log_Pos is increasing. Seconds_Behind_Master should be decreasing. If master is too busy, the delay may still increase, but it’ll catch up sooner or later.
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: alexzeng.wordpress.com Master_User: mysql_rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: LBMS-bin.000012 Read_Master_Log_Pos: 239014770 Relay_Log_File: LBMS-relay-bin.000002 Relay_Log_Pos: 1905303 Relay_Master_Log_File: LBMS-bin.000012 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 223150164 Relay_Log_Space: 17770064 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 159147 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec)
I’d like to end this blog with Peter Zaitsev‘s article How SHOW SLAVE STATUS relates to CHANGE MASTER TO words :
It is very important to understand the difference between IO Thread position in Master logs (Master_Log_File:Read_Master_Log_Pos) and SQL Thread position in Master logs (Relay_Master_Log_File:Exec_Master_Log_Pos).
It’s really help us understanding the differences of these values.
This was very helpful to me. Thanks for the info.
Thanks for sharing this tip. It saved my day.
Thank you Alex! You saved me an hour of digging on a day where I don’t have an hour to spare.
Thanks Alex, my partiton gets full and get into that error.
Thanks a lot! You save my day !
good guy thanks!
Worked perfectly for me as well. Thanks!
Thanks for this post – in the past I would always have to restart replication from a still-working slave but this will save a lot of trouble!
Pingback: How To Fix How To Repair Corrupt Master File Table Errors - Windows Vista, Windows 7 & 8
Really useful post. Thank you. Had my slave up and running after power outage in no-time!