In MariaDB 10.0 and above, when innodb_flush_log_at_trx_commit=1 (the default) is set and the binary log is enabled, there is one fewer sync to disk inside InnoDB during commit (2 syncs shared between a group of transactions instead of 3).
Durability of commits is not decreased. Even if the server crashes before the commit is written to disk by InnoDB, the commit will be recovered from the binary log at the next server startup (it is guaranteed that sufficient information is synced to disk so that such a recovery is always possible).
The old behavior, with 3 syncs to disk per (group) commit (and consequently lower performance), can be selected with the new innodb_flush_log_at_trx_commit=3 option. There is normally no benefit to doing this; however, there are a couple of edge cases to be aware of.
If innodb_flush_log_at_trx_commit=1 is set and the binary log is enabled, but sync_binlog=0 is set, then commits are not guaranteed to be durable inside InnoDB after commit. This is because if sync_binlog=0 is set and the server crashes, transactions that were not flushed to the binary log prior to the crash will be missing from the binary log, so they cannot be recovered from it at the next server startup.
In this specific scenario, innodb_flush_log_at_trx_commit=3 can be set to ensure that transactions will be durable in InnoDB, even if they are not necessarily durable from the perspective of the binary log.
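As an illustration of this edge case, an option file might combine the settings as shown here. This is a minimal sketch rather than a recommended configuration; the `[mariadb]` option group and the bare `log_bin` line are assumptions about how the binary log happens to be enabled on a given server.

```ini
[mariadb]
# Binary log enabled (assumed here; a base name may also be given)
log_bin
# The server does not sync the binary log to disk on commit
sync_binlog = 0
# Old behavior: 3 syncs per (group) commit, so commits are durable in InnoDB
innodb_flush_log_at_trx_commit = 3
```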
One should be aware that if sync_binlog=0 is set, then a crash is nevertheless likely to cause transactions to be missing from the binary log. This will cause the binary log and InnoDB to be inconsistent with each other. This is also likely to cause any replication slaves to become inconsistent, since transactions are replicated through the binary log. Thus it is recommended to set sync_binlog=1. With the group commit improvements introduced in MariaDB 5.3, this setting carries a much smaller penalty in recent versions than in older versions of MariaDB and MySQL.
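The recommended combination could be written in an option file roughly as follows. This is a sketch under the same assumptions as before: the `[mariadb]` group name and the bare `log_bin` line are illustrative and may differ on a particular installation.

```ini
[mariadb]
# Binary log enabled (assumed here; a base name may also be given)
log_bin
# Sync the binary log to disk at every (group) commit
sync_binlog = 1
# Default value: 2 syncs per group commit, with recovery from the binary log
innodb_flush_log_at_trx_commit = 1
```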
Mariabackup and Percona XtraBackup only see transactions that have been flushed to the redo log. With the group commit improvements, there may be a small delay (defined by the innodb_flush_log_at_timeout system variable) between when a commit happens and when the commit will be included in a backup.
Note that the backup will still be fully consistent with itself and the binary log. This is normally not an issue in practice. A backup usually takes a long time to complete (relative to the 1 second or so that innodb_flush_log_at_timeout is normally set to), and a backup usually includes a lot of transactions that were committed during the backup. With this in mind, it is generally not noticeable if the backup does not include transactions that were committed during the last 1 second or so of the backup process. It is mentioned here only for completeness.
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.