Having data loss incidents in a production database is nothing unheard of. The mishaps can happen due to one thing going wrong after another; as infamous incidents of GitHub or Gitlab
https://about.gitlab.com/blog/2017/02/01/gitlab-dot-com-database-incident/
https://github.blog/2018-10-30-oct21-post-incident-analysis
Or they can happen due to some simple misjudgment; Like
Running DELETE Query with incorrect WHERE clause;
Let's call this event MISHAP.
Fortunately, in most organizations, there is some provision for creating backups for important databases. But backups are done at particular intervals and often we need the database in a state just before the MISHAP took place and the latest backup available might be a week old.
To achieve this most modern databases provide Point In Time Recovery.
Point In Time Recovery (PITR)
Databases generally contain binlogs which help in master-slave replication and also to recover databases.
Consider binlogs as a serial log of changes that happen in each transaction to the database. Each log event has a position associated with it. Before any transaction is committed, the changes it makes are logged to these binlogs.
These logs are kept along with the actual data of the database. For actual querying purposes, these logs are not needed. Hence these are removed at certain intervals.
So to recover a database to any point in time
Get the last backup of the database and the binlog position when the backup was taken.
Load the backup and apply the events in log files from the position backup was taken to the state where you want your database to be. (Let’s say just before the point before the MISHAP took place)
PITR FOR MYSQL
We will try to recover a MYSQL Database to a particular point in time.
A few things about binlogs in MYSQL
MYSQL has binlogs enabled by default. Binlogs are kept for 30 days and the max size of a binlog file is kept at 1GB
They are stored in /var/lib/mysql/binlog.000X files and mysqlbinlog can be used for analyzing them.
Taking Backup
mysqldump is very frequently used for taking db backups. In our case we need to store the binlog position as well, at the time backup is being taken.
The current binlog position can be checked using “SHOW MASTER STATUS” command; at the moment binlog position is 914
mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
File: binlog.000002
Position: 914
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set, 1 warning (0.00 sec)
Taking Backup: Adding the “ — source-data=2” option adds a comment in the dump: the name and position of the log at the time. In our case, the current binlog file is “binlog.000002” and the current log position is “914”. Let's store the backup in “backup_2023–01–01.sql”.
NOTE: This option locks all the tables while the dump is being generated.
mysqldump -u <user> -p --source-data=2 <database> > backup_2023-01-01.sql
>>> backup_2023-01-01.sql
-- binary log coordinates are added by --source-data=2
--
-- Position to start replication or point-in-time recovery from
--
-- CHANGE REPLICATION SOURCE TO SOURCE_LOG_FILE='binlog.000002', SOURCE_LOG_POS=914;
...... REMAINING DUMP .....
Applying Binlog
Let's assume the MISHAP was a TRUNCATE command that was run on a table by mistake. So we need to find the log position for this command.
mysqlbinlog utility parses binary data in logs in human-readable format.
We can narrow our search space using “ — start-datetime” and “ — start-position” arguments. We are using the “-v” flag to generate human-friendly SQL statements from the log.
mysqlbinlog binlog.000002 -v --start-position 914 > logdump
In the log output “# at <number>” specifies the start position of a log event. Here we can see at “# at 1588” the MISHAP of “TRUNCATE table user” happened. So we need to apply logs from #914 (log position at the time of backup) to #1588 (log position before execution of MISHAP)
# at 1428
#240302 23:23:01 server id 1 end_log_pos 1480 CRC32 0xc79cb871 Write_rows: table id 118 flags: STMT_END_F
BINLOG '
fWfjZRMBAAAAPAAAAJQFAAAAAHYAAAAAAAEABnRlc3RkYgAEdXNlcgACAw8C/AMCAQEAAgP8/wDA
Ibzq
fWfjZR4BAAAANAAAAMgFAAAAAHYAAAAAAAEAAgAC/wADAAAACgBhZnRlcnRlc3Qxcbicxw==
'/*!*/;
### INSERT INTO `testdb`.`user`
### SET
### @1=3
### @2='aftertest1'
# at 1480
#240302 23:23:01 server id 1 end_log_pos 1511 CRC32 0x91e89a91 Xid = 55
COMMIT/*!*/;
# at 1511
#240302 23:25:07 server id 1 end_log_pos 1588 CRC32 0x35d54bcd Anonymous_GTID last_committed=5 sequence_number=6 rbr_only=no original_committed_timestamp=1709402107329034 immediate_commit_timestamp=1709402107329034 transaction_length=177
# original_commit_timestamp=1709402107329034 (2024-03-02 23:25:07.329034 IST)
# immediate_commit_timestamp=1709402107329034 (2024-03-02 23:25:07.329034 IST)
/*!80001 SET @@session.original_commit_timestamp=1709402107329034*//*!*/;
/*!80014 SET @@session.original_server_version=80300*//*!*/;
/*!80014 SET @@session.immediate_server_version=80300*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 1588
#240302 23:25:07 server id 1 end_log_pos 1688 CRC32 0x8d039859 Query thread_id=11 exec_time=0 error_code=0 Xid = 64
SET TIMESTAMP=1709402107/*!*/;
truncate table user
/*!*/;
Let's store these log events in a patch file which we will apply over the last restored backup
mysqlbinlog -d <DATABASE> binlog.000002 --start-position 914 --stop-position 1588 > patch.sql
Restore the full backup and apply the patch file over it. The DB should be restored without the MISHAP
mysql -u <user> -p <database> < backup_2023-01-01.sql
mysql -u <user> -p <database> < patch.sql
NOTE: mysqlbinlog can take multiple log files as arguments. In that case“ — start-position” applies to the position in the first log file provided and “ — end-position” applies to the last log file provided in the arguments.
So if the dump was taken at file binlog.003 position 350 and MISHAP happened at binlog.005 position 700, patch will look like
mqsqlbinlog -d <database> binlog.003 binlog.004 binlog.005 --start-position 350 --stop-position 700
MYSQL PITR: https://dev.mysql.com/doc/refman/8.0/en/point-in-time-recovery-positions.html