Performing an OS upgrade on an ExaCS environment is a critical maintenance task. We recently performed a DomU patch on our Exadata X9M-2 cluster to migrate from Oracle Linux 7.9 to Oracle Linux 8.10. While the process is largely automated, environment-specific factors can introduce challenges.
This post walks through the failure we encountered, the root cause analysis, and how we manually recovered the patching process.
Cluster Details
| Component | Current Version | Target Version |
|---|---|---|
| Cluster Image | 22.1.25.0.0.240710 | 24.1.16.0.0.250905 |
| OS | OL 7.9 | OL 8.10 |
| Infra Version | 25.1.4.0.0.250612 | 25.1.4.0.0.250612 |
| Storage Version | 25.1.4.0.0.250612 | 25.1.4.0.0.250612 |
1. Pre-Check Failures — Custom Packages
The initial patchmgr pre-check failed due to custom RPMs that were manually installed.
The system generated a cleanup script:
/var/log/cellos/remove_unknown_packages.201125214544.sh
The script contained multiple lines such as:
rpm -e --nodeps <package-name>
We reviewed the list carefully, removed all custom packages, and then the pre-check passed successfully.
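For reference, a minimal sketch of how that review and removal could look. The script path comes from the run above; whether to execute it as-is depends on which custom packages you intend to reinstall after the upgrade:
[root@exadevdb-01 ~]# cat /var/log/cellos/remove_unknown_packages.201125214544.sh    # inspect exactly what will be removed
[root@exadevdb-01 ~]# rpm -qa --last | head -20                                      # cross-check the most recently installed RPMs
[root@exadevdb-01 ~]# sh /var/log/cellos/remove_unknown_packages.201125214544.sh     # run only once the list is confirmed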
2. Patching Started — Node 1 Upgraded, but CRS Failed to Start
During patching, patchmgr started upgrading Node 1 from Node 2 (as expected).
Node 1 successfully upgraded to Oracle Linux 8.10, but the process failed before CRS could start.
To analyze the failure, we reviewed the logs on Node 2:
/u02/dbserver.patch.zip_exadata_ol8_24.1.16.0.0.250905_Linux-x86-64.zip/dbserver_patch_251020/patchmgr_log*/exadevdb-01.sub*.log
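To surface the failing step quickly in those logs, a sketch of the kind of filtering we used (the exact log file names depend on the run id):
[root@exadevdb-02 ~]# cd /u02/dbserver.patch.zip_exadata_ol8_24.1.16.0.0.250905_Linux-x86-64.zip/dbserver_patch_251020
[root@exadevdb-02 dbserver_patch_251020]# grep -iE 'error|fail' patchmgr_log*/exadevdb-01.sub*.log | tail -20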
Log Snippet — Insufficient Root Filesystem Space
[1764094796][2025-11-25 18:20:07 +0000][INFO][./dbnodeupdate.sh][CheckFreeSpace][] v_fs:/,v_free_space:2729,v_fs_size:3200
[1764094796][2025-11-25 18:20:07 +0000][INFO][./dbnodeupdate.sh][DiaryEntry][] Entering PrintGenError Insufficient free space in file system '/'. The minimum required free space is 3200M. But available free space is 2729M. Cleanup before proceeding the actual upgrade.
[1764094796][2025-11-25 18:20:07 +0000][ERROR][./dbnodeupdate.sh][PrintGenError][] Insufficient free space in file system '/'. The minimum required free space is 3200M. But available free space is 2729M. Cleanup before proceeding the actual upgrade.
[1764094796][2025-11-25 18:20:07 +0000][INFO][./dbnodeupdate.sh][DiaryEntry][] Entering UpdateDbnodeupdateStatFile failed;
[1764094796][2025-11-25 18:20:07 +0000][INFO][./dbnodeupdate.sh][UpdateDbnodeupdateStatFile][] /opt/oracle.SupportTools/.tmp.dbnodeupdate.state
[1764094796][2025-11-25 18:20:07 +0000][INFO][./dbnodeupdate.sh][UpdateDbnodeupdateStatFile][] Copying /opt/oracle.SupportTools/.dbnodeupdate.state to /opt/oracle.SupportTools/.tmp.dbnodeupdate.state
--------------------------------------------------------------------------------------------------------------------------------
Insufficient free space in file system '/'.
Required free space: 3200 MB
Available free space: 2729 MB
Although the pre-check had passed, the available space (~2.7 GB) was too close to the 3.2 GB minimum, suggesting that temporary files created during the upgrade consumed the remaining headroom.
We cleaned up space and retried the patch from the OCI Console, but it failed again, which led us to investigate the state of each node.
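As a sketch, the space check and the cleanup candidates we looked at before retrying; these are generic Linux commands rather than part of the patch tooling:
[root@exadevdb-01 ~]# df -h /                                           # confirm free space on the root filesystem
[root@exadevdb-01 ~]# du -xsh /var/log/* 2>/dev/null | sort -h | tail   # largest consumers under /var/log
[root@exadevdb-01 ~]# find /var/log -name "*.gz" -mtime +30 -ls         # old rotated logs, possible removal candidates (review first)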
3. Checking Node State After Failure
Node 1 — Partially Upgraded
Node 1 was upgraded, but CRS was not enabled.
We checked the current image and kernel:
[root@exadevdb-01 dbnu]# imageinfo
Kernel version: 5.4.17-2136.343.5.5.el8uek.x86_64
Uptrack kernel version: 5.4.17-2136.346.6.el8uek.x86_64
Image kernel version: 5.4.17-2136.343.5.5.el8uek
Image version: 24.1.16.0.0.250905
Image activated: 2025-11-25 18:19:53 +0000
Image status: success
Exadata software version: 24.1.16.0.0.250905
Node type: GUEST
System partition: /dev/mapper/VGExaDb-LVDbSys1
[root@exadevdb-01 cellos]# imagehistory
Version : 22.1.25.0.0.240710
Exadata Live Update Version : n/a
Image activation date : 2024-08-22 16:07:13 +0000
Imaging mode : fresh
Imaging status : success
Version : 24.1.16.0.0.250905
Exadata Live Update Version : n/a
Image activation date : 2025-11-25 18:19:53 +0000
Imaging mode : patch
Imaging status : success
We then checked the CRS/HA autostart configuration:
[root@exadevdb-01 ~]# . oraenv
ORACLE_SID = [root] ? +ASM1
[root@exadevdb-01 ~]# crsctl config has
CRS-4621: Oracle High Availability Services autostart is disabled.
This confirmed that Oracle High Availability Services (OHAS) autostart was disabled and the stack was down.
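A couple of extra checks that confirm this state, shown as a sketch; CRS-4639 is the response expected when OHAS is not running:
[root@exadevdb-01 ~]# crsctl check has                              # expect CRS-4639: Could not contact Oracle High Availability Services
[root@exadevdb-01 ~]# ps -ef | grep -E 'ohasd|crsd' | grep -v grep  # no Grid Infrastructure daemons should be listed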
Node 2 - Still on Old Version
Because the upgrade of Node 1 is driven from Node 2, we confirmed that Node 2 was still running the old OS and kernel, waiting for Node 1 to complete.
[root@exadevdb-02 ~]# imageinfo
Kernel version: 4.14.35-2047.528.2.4.el7uek.x86_64 #2 SMP Tue Feb 27 20:52:58 PST 2024 x86_64
Uptrack kernel version: 4.14.35-2047.537.4.el7uek.x86_64 #2 SMP Fri May 31 15:52:44 PDT 2024 x86_64
Image kernel version: 4.14.35-2047.528.2.4.el7uek
Image version: 22.1.25.0.0.240710
Image activated: 2024-08-22 16:07:10 +0000
Image status: success
Node type: GUEST
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
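For a quick side-by-side view, the kernels can also be compared directly from Node 2; the values below simply restate what imageinfo already reported on each node:
[root@exadevdb-02 ~]# uname -r
4.14.35-2047.528.2.4.el7uek.x86_64
[root@exadevdb-02 ~]# ssh exadevdb-01 uname -r
5.4.17-2136.343.5.5.el8uek.x86_64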
4. Resuming Patching Manually
Because Node 1 was already on the target OS version but the post-upgrade configuration was incomplete, we used the dbnodeupdate.sh utility to finish the remaining steps manually.
Step 1: Prepare the Tooling. On Node 1, we navigated to the patch directory, created a temporary folder, and extracted dbnodeupdate.zip.
[root@exadevdb-01 ~]# cd /u02/dbserver.patch.zip_exadata_ol8_24.1.16.0.0.250905_Linux-x86-64.zip/dbserver_patch_251020
[root@exadevdb-01 dbserver_patch_251020]# ls -alrt dbnodeupdate*
-rw-r--r-- 1 root root 8444358 Oct 24 01:13 dbnodeupdate.zip
[root@exadevdb-01 dbserver_patch_251020]# mkdir dbnu
[root@exadevdb-01 dbserver_patch_251020]# cp dbnodeupdate.zip dbnu
[root@exadevdb-01 dbserver_patch_251020]# cd dbnu
[root@exadevdb-01 dbnu]# unzip dbnodeupdate.zip
Archive: dbnodeupdate.zip
inflating: CheckHWnFWProfile
inflating: check_stack.sh
.
.
.
inflating: uek5_upgrade-roce.table
inflating: yq
[root@exadevdb-01 dbnu]#
Step 2: Execute Post-Patch Steps. We ran the script with the -c flag (complete the post-upgrade steps), -q (quiet mode), and the target version specified with -t:
[root@exadevdb-01 dbnu]# ./dbnodeupdate.sh -c -q -t 24.1.16.0.0.250905
(*) 2025-11-27 18:46:51: Initializing logfile /var/log/cellos/dbnodeupdate.log
##########################################################################################################################
# #
# Guidelines for using dbnodeupdate.sh (rel. 25.251020): #
# #
# - Prerequisites for usage: #
# 1. Refer to dbnodeupdate.sh options. See MOS 1553103.1 #
# 2. Always use the latest release of dbnodeupdate.sh. See patch 21634633 #
# 3. Run the prereq check using the '-v' flag. #
# #
# I.e.: ./dbnodeupdate.sh -u -l /u01/my-iso-repo.zip -v (may see rpm conflicts) #
# #
# - Prerequisite rpm dependency check failures can happen due to customization: #
# - The prereq check detects dependency issues that need to be addressed prior to running a successful update. #
# - Customized rpm packages may fail the built-in dependency check and system updates cannot proceed until resolved. #
# #
# - As part of the update, rpms shipped by Exadata may be removed. #
# #
# - In case of any problem when filing an SR, upload the following: #
# - /var/log/cellos/dbnodeupdate.log #
# - /var/log/cellos/dbnodeupdate.trc #
# - /var/log/cellos/dbnodeupdate.<runid>.diag #
# - where <runid> is the unique number of the failing run. #
# #
# #
##########################################################################################################################
(*) 2025-11-27 18:47:01: Analyzing system configuration.
Active Image version: 24.1.16.0.0.250905
Active Kernel version : 5.4.17-2136.343.5.5.el8uek
Active LVM Name : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version: 22.1.25.0.0.240710
Inactive LVM Name : /dev/mapper/VGExaDb-LVDbSys2
Current user id : root
Action : finish-post (cleanup and enable CRS to auto-start) - running in quiet mode
Shutdown EM agents : Yes
Shutdown stack : No (Currently stack is down)
Logfile : /var/log/cellos/dbnodeupdate.log (runid: 271125184652)
Diagfile : /var/log/cellos/dbnodeupdate.271125184652.diag
Server model : Exadata Virtual Machine
dbnodeupdate.sh rel. : 25.251020 (always check MOS 1553103.1 for the latest release of dbnodeupdate.sh)
Exadata Live Update: No
(*) 2025-11-27 18:48:06: Executing plugin /u02/dbserver.patch.zip_exadata_ol8_24.1.16.0.0.250905_Linux-x86-64.zip/dbserver_patch_251020/dbnu/dbnu-plugin.sh with arguments 271125184652 start-execfinish
(*) 2025-11-27 18:48:08: Running validations. Maximum wait time: 60 minutes.
(*) 2025-11-27 18:48:08: If the node reboots, re-run './dbnodeupdate.sh -c' after the node restarts..
(*) 2025-11-27 18:48:24: EM agent in /u02/app/oracle/product/agent13c/agent_13.5.0.0.0 stopped
(*) 2025-11-27 18:48:25: Service acpid enabled to autostart at boot
(*) 2025-11-27 18:48:26: Not Relinking Oracle homes
(*) 2025-11-27 18:48:26: Executing plugin /u02/dbserver.patch.zip_exadata_ol8_24.1.16.0.0.250905_Linux-x86-64.zip/dbserver_patch_251020/dbnu/dbnu-plugin.sh with arguments 271125184652 before-relink
(*) 2025-11-27 18:48:42: Starting Grid Infrastructure (/u01/app/19.0.0.0/grid)
(*) 2025-11-27 18:51:21: Stack started
(*) 2025-11-27 18:51:26: TFA Started
(*) 2025-11-27 18:51:27: Enabling stack to start at reboot. Disable this when the stack should not start on the next boot
(*) 2025-11-27 18:51:31: Removed obsolete kernel-transition: kernel-transition-3.10.0-0.0.0.2.el7
(*) 2025-11-27 18:51:31: Retained the required kernel-transition package:
(*) 2025-11-27 18:51:31: Disabling diagsnap for Grid Infrastructure versions older than 23c (24900613)
(*) 2025-11-27 18:52:34: All post steps are finished.
Step 3: Verify Success. The script analyzed the configuration, identified that the active image was 24.1.16.0.0.250905, and determined the required action as "finish-post (cleanup and enable CRS to auto-start)". The key lines from the run:
(*) Starting Grid Infrastructure (/u01/app/19.0.0.0/grid)
(*) Stack started
(*) TFA Started
(*) Enabling the stack to start at reboot.
(*) All post steps are finished.
We verified that the database services were running using ps -ef:
[root@exadevdb-01 dbnu]# ps -ef | grep pmon
grid ... asm_pmon_+ASM1
oracle ... ora_pmon_ERGDEVFS1
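Beyond the pmon processes, the overall stack state can be checked from the Grid home reported by dbnodeupdate.sh above; a sketch:
[root@exadevdb-01 dbnu]# /u01/app/19.0.0.0/grid/bin/crsctl check crs    # CRS-4638/4537/4529/4533 indicate all layers are online
[root@exadevdb-01 dbnu]# /u01/app/19.0.0.0/grid/bin/crsctl stat res -t  # full listing of cluster resources and their targets/states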
5. Completing the Cluster Upgrade
With Node 1 fully operational and the Grid Infrastructure stack running, we returned to the OCI Console. We selected the Retry Apply action for the VM Cluster.
Because Node 1 was now in a healthy, patched state, the automation recognized the completion and proceeded to patch Node 2.
The OCI Work Requests confirmed the successful completion of the "Apply Cloud VM Cluster OS Update" operation shortly thereafter.
Key Takeaways
- Strict Space Requirements: The dbnodeupdate process is strict about root filesystem space (3200 MB minimum on /). Even if the pre-check passes, temporary files generated during the upgrade can consume that buffer.
- Log Location: While a node is being patched, the logs for the operation are typically found on the peer node (the node driving the update).
- Manual Resume: If the OS update itself succeeds but the post-patch steps fail (leaving CRS down), you can often rescue the node with ./dbnodeupdate.sh -c rather than rolling back immediately.
This procedure saved us significant time by allowing us to move forward with the existing OS upgrade rather than attempting a complex rollback.