One of databases running on Oracle 12.1 was hanging for some minutes. The top wait event was log file sync. This wait event is posted whenever a user session issues COMMIT or ROLLBACK. A message is then sent to LGWR to flush logs associated with the transaction from the log buffer to redo log files. Redo storage slowness can cause high number of this wait event too. Ultimate blocker in either case is LGWR.
In most cases (but not always), high commit frequency is the cause and should be resolved (decreased) on the application side. That was not the case this time. The reason was the fact that a session (just a single one) issued lots DML statements which generated huge amount of redo. Once the session issued commit, it took quite a long for LGWR to flush redo from the log buffer to the disk. That stalled all other sessions because LGWR was busy and committing sessions were stuck as well. The transaction logic must be changed in the application to issue intermittent commits and not to commit only once when it ends.
Troubleshooting Slowness Associated with Log Activity (again briefly)
Posted in Oracle.