background
When a production database executes SQL scripts, it will generally go through a formal approval process before it can be run. But some cases are exceptions. The business department intends to withdraw after making some demands to delete data, or accidentally deletes some data in the operational background, and then finds the DBA team to assist, hoping to recover the data.
After investigation, binlog2sql is a tool used to parse MySQL binlog by Dianping. According to different options, you can get original SQL, rollback SQL, INSERT SQL that removes primary keys, etc., which is suitable for repairing data quickly rollback (flashback) and new master lost data after master-slave switching.
Target
Verify that the binlog2sql tool can quickly recover data.
step
Preparation
Install the binlog2sql tool.
> git clone /danfengcao/ && cd binlog2sql # > yum install python3-pip # > whereis pip # > pip3.6 install -r > pip install -r
The MySQL server configures the following parameters. Please note that binlog2sql only supports row format.
[mysqld] server_id = 1 log_bin = /var/log/mysql/ max_binlog_size = 1G binlog_format = row binlog_row_image = full
Specifies the database user authorization to execute the script.
-- SELECT Permissions:Query information_schema.COLUMNS -- REPLICATION SLAVE:pass BINLOG_DUMP Agreement acquisition binlog content -- REPLICATION CLIENT:implement SHOW MASTER STATUS Get binlog information GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO user
Prepare a user table user and fill it with 1W pieces of data.
CREATE TABLE `user` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(10) DEFAULT NULL, `gmt_create` date DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 DELIMITER $$ CREATE PROCEDURE InsertRandomData() BEGIN DECLARE i INT DEFAULT 1; DECLARE randomName CHAR(10); DECLARE randomDate DATE; WHILE i <= 10000 DO -- Generate random name (Random string) SET randomName = CONCAT( CHAR(FLOOR(RAND() * 26) + 65), CHAR(FLOOR(RAND() * 26) + 65), CHAR(FLOOR(RAND() * 26) + 65), CHAR(FLOOR(RAND() * 26) + 65), CHAR(FLOOR(RAND() * 26) + 65) ); -- Generate random日期 (2013-11-11 Start,Random range is approximately one year) SET randomDate = DATE_ADD('2023-01-01', INTERVAL FLOOR(RAND() * 365) DAY); -- Insert data INSERT INTO `user` (`name`, `gmt_create`) VALUES (randomName, randomDate); SET i = i + 1; END WHILE; END$$ DELIMITER ; -- Call stored procedures CALL InsertRandomData();
Check out the total number of data larger than November, with a total of 363 items.
mysql > SELECT count(*) FROM user WHERE gmt_create > '2023-11-01 00:00:00'; +----------+ | count(*) | +----------+ | 363 | +----------+
Simulated by mistake, assuming that the data after November was deleted around 15:30.
mysql > DELETE FROM user WHERE gmt_create > '2023-11-01 00:00:00';
Recover data
Check the status of the main library binlog, the latest file is mysql-bin.000003.
-- Used in lower version SHOW MASTER STATUS; mysql > SHOW BINARY LOGS; +------------------+-----------+-----------+ | Log_name | File_size | Encrypted | +------------------+-----------+-----------+ | mysql-bin.000001 | 1871 | No | | mysql-bin.000002 | 181 | No | | mysql-bin.000003 | 917878 | No | +------------------+-----------+-----------+ 3 rows in set (0.04 sec)
Filter out SQL that needs to be rolled back. People who misoperate generally know the approximate misoperation time. We first filter it according to the time.
shell> python binlog2sql/ -h address -P port -u user -p'password' -d library -t table name --start-file='mysql-bin.000003' --start-datetime='2023-11-02 15:00:00' --stop-datetime='2023-11-02 16:00:00' > /tmp/
Output:
DELETE FROM `test`.`user` WHERE `gmt_create`='2023-11-01 00:00:00' AND `id`=1351 AND `name`='TPUDJ' LIMIT 1; #start 105311 end 262311 time 2023-11-02 15:31:10
DELETE FROM `test`.`user` WHERE `gmt_create`='2023-11-01 00:00:00' AND `id`=1352 AND `name`='YKIIS' LIMIT 1; #start 105311 end 262311 time 2023-11-02 15:31:10
...
DELETE FROM `test`.`user` WHERE `gmt_create`='2023-12-31 00:00:00' AND `id`=1714 AND `name`='SHKBC' LIMIT 1; #start 105311 end 265754 time 2023-11-02 15:31:10
According to the location information, it can be judged that the misoperated SQL comes from the same transaction, with the exact location between 105311-265754, filter according to the location, and use the -B option to generate rollback SQL.
shell> python binlog2sql/ -h address -P port -u user -p'password' -d library -t table name --start-file='mysql-bin.000003' --start-position=105311 --stop-position=265754 -B > /tmp/
Output:
INSERT INTO `test`.`user`(`gmt_create`, `id`, `name`) VALUES ('2023-11-01 00:00:00', 1351, 'TPUDJ'); #start 105311 end 262311 time 2023-11-02 15:31:10
INSERT INTO `test`.`user`(`gmt_create`, `id`, `name`) VALUES ('2023-11-01 00:00:00', 1352, 'YKIIS'); #start 105311 end 262311 time 2023-11-02 15:31:10
...
INSERT INTO `test`.`user`(`gmt_create`, `id`, `name`) VALUES ('2023-12-31 00:00:00', 1714, 'SHKBC'); #start 105311 end 265754 time 2023-11-02 15:31:10
Result verification
Confirm whether the total number of rollback SQL rows corresponds to the 363 mistakenly deleted.
shell> wc -l /tmp/
363 /tmp/
Confirm with the business party that the rollback SQL is no problem and execute the rollback statement. Log in to MySQL and confirm that the rollback is successful.
shell> mysql -haddress -Pport -uuser -p'password' < /tmp/ mysql> SELECT count(*) FROM user WHERE gmt_create > '2023-11-01 00:00:00'; +----------+ | count(*) | +----------+ | 363 | +----------+
in conclusion
binlog2sql is suitable for online recovery of misoperated data, but is not applicable to the following situations:
- Data recovery is recommended to be controlled within 50W, the larger the data volume, the more statements generated in reverse. If this value exceeds this value, the recovery time may exceed 15 minutes.
- DDL recovery operation is not supported. Because even in row mode, binlog does not record changes in each row of data for DDL operations. To implement DDL fast rollback, the MySQL source code must be modified so that the old data is backed up before executing DDL. Alibaba Lin Xiaobin’s team submitted patch to MySQL official, and the relevant implementation plans can be viewed.Discussion and implementation of MySQL flashback solution。
- According to the official statement, it is recommended to use the binlog2sql tool for online recall data, and the mysqlbinlog tool for offline resolution. The MySQL flashback feature was first developed by Alibaba Peng Lixun.
This is the article about MySQL using binlog2sql tool to realize online data recovery function. For more related content related to MySQL binlog2sql data recovery, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!