When the binlog file is larger than 4G, data loss occurs #1366

@dongwenpeng

Hi, when a binlog file is larger than 4G, gh-ost loses data.

MySQL allows a single binlog file to grow beyond 4G, but in the MySQL source code the end_log_pos field of a binlog event is defined as uint32, so it can hold at most 4294967295 (4G - 1). Once a single binlog file grows past 4G, end_log_pos overflows and wraps around to small values.

Parsing a binlog file larger than 4G with mysqlbinlog shows the wraparound:

#202409 20:06:42 server id 100  end_log_pos 4294954865 CRC32 0x5e0195d9         Update_rows: table id 266
# at 4294954865
#202409 20:06:42 server id 100  end_log_pos 4294962881 CRC32 0x2f0b79cc         Update_rows: table id 266
# at 4294962881
#202409 20:06:42 server id 100  end_log_pos 3601 CRC32 0x2f367261       Update_rows: table id 266
# at 4294970897
#202409 20:06:42 server id 100  end_log_pos 11617 CRC32 0xb1ac6949      Update_rows: table id 266

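After end_log_pos reaches 4294962881, the next event ends at file offset 4294970897 (see the following "# at" line), but its end_log_pos wraps around to 3601 and so appears smaller than the previous one.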
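The wraparound is plain modulo 2^32 arithmetic. The following minimal sketch reproduces the values from the mysqlbinlog output; `wrapEndLogPos` is a hypothetical helper name, and the last offset (4294978913) is derived here as 11617 + 2^32 rather than taken from the output.

```go
package main

import "fmt"

// wrapEndLogPos is a hypothetical helper: it mimics what happens when the
// true 64-bit file offset is stored in a uint32 end_log_pos field.
func wrapEndLogPos(trueOffset uint64) uint32 {
	return uint32(trueOffset) // keeps only the low 32 bits
}

func main() {
	// First two offsets appear in the mysqlbinlog output; the third is derived.
	for _, off := range []uint64{4294962881, 4294970897, 4294978913} {
		fmt.Printf("true offset %d -> end_log_pos %d\n", off, wrapEndLogPos(off))
	}
}
```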

Why data is lost

The handleRowsEvent method in gh-ost, which processes a DML event, contains this check: if the end_log_pos of the event just received is less than or equal to the end_log_pos of the last applied event, the current event is skipped. When the binlog file is larger than 4G, end_log_pos wraps around, so every event located past the 4G boundary compares as smaller than the last applied event and is discarded.

// handleRowsEvent processes a DML (rows) event
func (this *GoMySQLReader) handleRowsEvent(ev *replication.BinlogEvent, rowsEvent *replication.RowsEvent, entriesChannel chan<- *BinlogEntry) error {
	// Skip the event if its position is not past the last applied one
	if this.currentCoordinates.SmallerThanOrEquals(&this.LastAppliedRowsEventHint) {
		this.migrationContext.Log.Debugf("Skipping handled query at %+v", this.currentCoordinates)
		return nil
	}
	...
	// Execute event
	...
	// Record the position information of the last executed event
	this.LastAppliedRowsEventHint = this.currentCoordinates
	return nil
}
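To make the effect concrete, here is a minimal, self-contained sketch that replays the four end_log_pos values from the mysqlbinlog output through the same kind of check. The `coords` type, `countSkipped` function, and the file name `mysql-bin.000001` are hypothetical stand-ins for illustration, not gh-ost's actual types.

```go
package main

import "fmt"

// coords is a hypothetical, simplified stand-in for binlog coordinates.
type coords struct {
	LogFile string
	LogPos  int64
}

// smallerThanOrEquals orders coordinates by file name first, then position.
func (c coords) smallerThanOrEquals(o coords) bool {
	if c.LogFile != o.LogFile {
		return c.LogFile < o.LogFile
	}
	return c.LogPos <= o.LogPos
}

// countSkipped replays end_log_pos values through the skip check and
// returns how many events would be dropped.
func countSkipped(file string, endLogPositions []uint32) int {
	var lastApplied coords
	skipped := 0
	for _, p := range endLogPositions {
		cur := coords{LogFile: file, LogPos: int64(p)}
		if cur.smallerThanOrEquals(lastApplied) {
			skipped++ // treated as already handled, although it is new
			continue
		}
		lastApplied = cur
	}
	return skipped
}

func main() {
	positions := []uint32{4294954865, 4294962881, 3601, 11617}
	fmt.Println("skipped events:", countSkipped("mysql-bin.000001", positions))
}
```

The two events past the 4G boundary (3601 and 11617) are the ones skipped, and every later event in the same file would be skipped as well, because lastApplied stays at 4294962881.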

Why a binlog file can exceed 4G

When the table contains very large rows, copying too much data in a single chunk can produce a large transaction (assume it is larger than 4G). The business workload itself may also run large transactions. A transaction is always written to a single binlog file, so such a transaction pushes that file past 4G.

Group commit and binlog

Under the group commit mechanism, committing a large transaction (again assuming it is larger than 4G) flushes the binlog caches of the N transactions in the same group and writes them to the same binlog file. DML changes to the table currently undergoing the DDL change may also be part of this group and be written to that binlog file.

Fix

As I understand it, this check is only an optimization that avoids re-applying events when the binlog reader retries. Can I remove it to fix the problem above?

if this.currentCoordinates.SmallerThanOrEquals(&this.LastAppliedRowsEventHint) {
	this.migrationContext.Log.Debugf("Skipping handled query at %+v", this.currentCoordinates)
	return nil
}

Thank you!
