Recently, I encountered an issue on a DBCS Data Guard standby database where the alert log reported the error ORA-29771. After seeing this error, I checked the Data Guard status and noticed that the standby had developed a significant apply lag, even though redo apply was still running.
In this blog, I’ll explain the symptoms I observed, how I investigated the cause of the ORA-29771 error, and the steps I took to restore normal standby performance.
I started by examining the standby alert log to confirm the ORA-29771 occurrence and identify any related background process failures:
The above image shows the LMON/LMHB messages, and it also indicates that a trace file was generated for this process. The related trace file is shown below:
This is the most direct workaround. We can increase the timeout for the process that is failing to communicate with LMON using a hidden parameter.
According to Doc ID 2862283.1, I verified that patch 31396214 had been applied to this DBCS database.
Since patch 31961214 has already been successfully applied to the standby database home directory, I need to configure the following hidden parameter on the database.
This parameter controls the maximum time (in centiseconds, or 1/100 of a second) an Active Data Guard query will wait for a parse lock before timing out. I changed this parameter to 100 seconds (10000 centiseconds). This accommodates our observed 70-second wait time and provides a 30-second buffer.
ALTER SYSTEM SET "_adg_parselock_timeout" = 10000 SCOPE=BOTH;
This parameter is a control introduced by the patch to address standby query parsing issues. Setting it to a non-zero value enables the fix. The recommended non-zero value is usually 10 or more seconds.
ALTER SYSTEM SET "_bug31961214_standby_parse_limit_seconds" = 10 SCOPE=BOTH;
Now, we can see that these parameters have been configured on the standby database using the following select statement.
SELECT NAME, VALUE FROM V$SPPARAMETER WHERE NAME IN ('_adg_parselock_timeout', '_bug31961214_standby_parse_limit_seconds');
If a single MRP worker is trying to apply a massive redo block, it can lock up CPU resources and trigger the error.
I confirmed that the apply_lag was about 5 hours, indicating that the standby was significantly behind the primary, as shown below.
I checked the application rate as well
I stopped the MRP and then restarted it with a parallel setting of 4, which has now dramatically increased the apply rate.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION PARALLEL 4;
Conclusion
ORA-29771 indicates that a critical Oracle background process unexpectedly terminated. In a Data Guard standby environment, this can lead to performance degradation and increased apply lag even when redo apply remains active. By reviewing the alert logs, checking Data Guard status, validating system resource health, and restarting MRP when necessary, you can effectively identify the cause and restore normal standby performance.