
Upgrading an Oracle Database Appliance (ODA) Bare Metal system to version 19.24 is a significant process that goes far beyond simple patching. It involves a massive operating system migration, specifically transitioning the appliance from Oracle Linux 7 (OL7) to Oracle Linux 8 (OL8). This mandates the Data Preserving Reprovisioning (DPR) process, which essentially rebuilds the server's OS while keeping the Grid Infrastructure (GI) and user data intact on the ASM disk groups.
During a recent production upgrade, we hit a hard-stop failure right at the beginning of the post-reprovisioning configuration, specifically during the node restore phase. The root cause was not a bug in the new software, but a subtle "configuration drift" involving DNS, magnified by separated team responsibilities.
The failure occurred immediately after the OS re-image when attempting to restore the system configuration using odacli restore-node -g.
Symptom Reported: "Network Plumbing Error"
The restoration job stalled or failed while configuring network interfaces.
Upon manual investigation, commands like nslookup hung, indicating a complete loss of name resolution. This confirmed that the configured DNS servers were unreachable or invalid.
The Root Cause: Configuration Drift via Team Silos
In ODA, system configuration is stored in two places: the ODA (DCS) metadata and the operating system's own configuration files. The DPR process relies entirely on the ODA Metadata to restore the system. This is where the issue lay: a mismatch between what the ODA software expected and what the network actually required.
| Component | File/Command | Status Before Fix |
|---|---|---|
| ODA Metadata (Stale) | cat /opt/oracle/oak/restore/metadata/provisionInstance.json | "dnsServers" : [ "10.5.1.6", "10.5.1.7" ] (OLD/Invalid IPs) |
| OS Backup (Correct) | cat /opt/oracle/oak/restore/bkp/sysfiles/etc/resolv.conf | nameserver 10.15.0.1, nameserver 10.15.0.2 (NEW/Valid IPs) |
The Human Factor (Siloed Teams):
The DNS/Network Team retired the old DNS servers (10.5.x.x).
The System Administration Team manually updated the OS file (/etc/resolv.conf) to restore immediate connectivity.
The critical mistake was bypassing the ODA CLI. The odacli command is mandatory to synchronize the DNS change with the internal DCS Metadata. Because this was skipped, the ODA attempted to restore the new OL8 OS with the stale, decommissioned DNS IPs, causing the network plumbing failure.
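The drift described above can be caught before a reprovisioning starts by comparing the DNS servers recorded in the metadata file with the nameservers in resolv.conf. The following is a minimal sketch, not an Oracle-supplied tool: the function names are hypothetical, and it assumes the "dnsServers" array holds plain IPv4 strings, as in the table above. It parses the JSON with grep rather than a real JSON parser, so treat it as a quick check only.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; none of these are part of odacli.

# Print the IPv4 addresses found in the "dnsServers" array of a JSON file.
metadata_dns() {
  # Flatten the file first so an array spread over several lines still matches.
  tr -d '\n' < "$1" \
    | grep -oE '"dnsServers"[[:space:]]*:[[:space:]]*\[[^]]*]' \
    | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
    | sort -u
}

# Print the nameserver addresses from a resolv.conf-style file.
resolv_dns() {
  awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# Return 0 when both files list the same servers, 1 otherwise.
check_dns_drift() {
  local meta os
  meta="$(metadata_dns "$1")"
  os="$(resolv_dns "$2")"
  if [ "$meta" = "$os" ]; then
    echo "OK: DNS servers match"
    return 0
  fi
  # Unquoted echo collapses the newline-separated lists onto one line.
  echo "DRIFT: metadata=[$(echo $meta)] resolv.conf=[$(echo $os)]"
  return 1
}

# Example call with the paths from the table above:
# check_dns_drift /opt/oracle/oak/restore/metadata/provisionInstance.json \
#                 /opt/oracle/oak/restore/bkp/sysfiles/etc/resolv.conf
```

A non-zero exit status means the two sources disagree and the metadata should be corrected before running the restore.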
Never manually edit network configuration files on an ODA.
To prevent this drift, always use the official procedure to update network settings. This ensures the DCS Metadata is updated simultaneously.
The initial GI restore failure was immediately resolved by manually correcting the DNS entries within the ODA's internal metadata. The logs below confirm the successful execution of the core DPR steps after the underlying network issue was fixed.
We first ran the cleanup script to revert the failed state and prepare the node for a fresh GI restore.
The new OL8 base system temporarily loses the required GI and DB software image registrations. Re-running odacli update-repository is a mandatory "bridge" step to relink the new OS to the pre-existing database software clones.
/opt/oracle/dcs/bin/odacli update-repository -f /cohesity_nfs01/upgrade19.24/oda-sm-19.24.0.0.0-240802-server.zip
/opt/oracle/dcs/bin/odacli update-repository -f /cohesity_nfs01/upgrade19.24/odacli-dcs-19.24.0.0.0-240724-GI-19.24.0.0.zip
/opt/oracle/dcs/bin/odacli update-repository -f /cohesity_nfs01/upgrade19.24/odacli-dcs-19.24.0.0.0-240724-DB-19.24.0.0.zip
/opt/oracle/dcs/bin/odacli update-repository -f /cohesity_nfs01/test02_bkp_20251104/serverarchive_test02/serverarchive_test02.zip
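Because the bundles above live on an NFS mount, it can help to confirm that every file is readable before submitting any of them. The wrapper below is a sketch under that assumption: register_bundles and the ODACLI override variable are hypothetical names introduced here for illustration, and the only odacli call used is the update-repository -f form shown above.

```shell
#!/usr/bin/env bash
# ODACLI can be overridden (for example with "echo") for a dry run; this
# variable is an assumption of this sketch, not an odacli feature.
ODACLI="${ODACLI:-/opt/oracle/dcs/bin/odacli}"

# Register each bundle in the order given, after checking that all exist.
register_bundles() {
  local f
  # First pass: fail early if any bundle is missing or unreadable.
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      return 1
    fi
  done
  # Second pass: submit the bundles one by one, stopping at the first error.
  for f in "$@"; do
    echo "registering $f"
    "$ODACLI" update-repository -f "$f" || return 1
  done
}

# Example call with two of the bundle paths from this upgrade:
# register_bundles \
#   /cohesity_nfs01/upgrade19.24/oda-sm-19.24.0.0.0-240802-server.zip \
#   /cohesity_nfs01/upgrade19.24/odacli-dcs-19.24.0.0.0-240724-GI-19.24.0.0.zip
```

Checking all files up front avoids a half-registered repository when, for example, the NFS share is not mounted on the freshly re-imaged OS.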
provisionInstance.json
Since the system was down and time was of the essence, the fastest way to get provisioning moving again was to bypass the outdated DCS Metadata and inject the correct DNS information directly into the configuration file used for provisioning.
The file that needed to be updated was located at:
/opt/oracle/oak/restore/metadata/provisionInstance.json
{
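The kind of edit described above can be sketched as a backup followed by an exact-string substitution. This is an illustration only: replace_dns_ips is a hypothetical helper, the old-to-new pairing of addresses is an assumption based on the order in the table earlier in this post, and the real file contents are not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: swap quoted IPv4 strings in a file, keeping a backup.
# Usage: replace_dns_ips FILE OLD=NEW [OLD=NEW ...]
replace_dns_ips() {
  local file="$1"
  shift
  [ "$#" -ge 1 ] || { echo "no OLD=NEW pairs given" >&2; return 1; }

  local backup pair old new
  local args=()
  backup="$file.bak.$(date +%Y%m%d%H%M%S)"

  for pair in "$@"; do
    old="${pair%%=*}"
    new="${pair#*=}"
    # Escape the dots in the old address so sed matches them literally,
    # and match the surrounding quotes so only whole values are replaced.
    args+=(-e "s/\"${old//./\\.}\"/\"$new\"/g")
  done

  # Keep a timestamped copy so the change can be reverted.
  cp -p "$file" "$backup" || return 1
  # Rewrite the file from the backup; this keeps the original file's mode.
  sed "${args[@]}" "$backup" > "$file"
}

# Example call (address pairing assumed from the table order):
# replace_dns_ips /opt/oracle/oak/restore/metadata/provisionInstance.json \
#   10.5.1.6=10.15.0.1 10.5.1.7=10.15.0.2
```

Passing all pairs in a single call means only one backup is taken, and it always holds the unmodified original.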
GI Restore (odacli restore-node -g)
With the DNS metadata corrected, the GI restore completed successfully. This step restores the necessary OS users, groups, and the Clusterware stack.
[root@oak ~]# odacli restore-node -g
...
After the Grid Infrastructure restore (-g) is complete, the final step is the database restore (-d).
DB Restore (odacli restore-node -d)
With both the network and repository links validated, the final database restore job completed without errors.
| Step | Action | Why it's Critical |
|---|---|---|
| 1. Prevention | Use odacli update-netinterface for all network changes. | Prevents DNS metadata drift and the initial restore-node -g failure. |
| 2. Bridge | Run odacli update-repository for all clones. | Critical: Re-establishes the link between the new OL8 OS and the database software images, as warned by cleanup.pl. |
| 3. GI Restore | Run odacli restore-node -g. | Restores the OS (now OL8) and Grid Infrastructure configuration. |
| 4. DB Restore | Run odacli restore-node -d. | Completes the DPR process by restoring the databases. |
The ODA's reliance on DCS Metadata to generate configuration files like provisionInstance.json can become a single point of failure when that metadata is out of sync with the actual operating system network settings.
When faced with a "System Unavailable" crisis during an ODA upgrade due to DNS resolution failure, a direct manual edit of /opt/oracle/oak/restore/metadata/provisionInstance.json proved to be the necessary emergency measure to restore provisioning capability.
While a manual edit can save the day, it only addresses the symptom in the provisioning file, not the root cause within the DCS Metadata. Moving forward, the critical lesson is to ensure all DNS changes are performed using Oracle's recommended procedure to update the underlying metadata correctly. This provides system consistency, prevents configuration drift, and keeps future patching and provisioning operations running smoothly, making ODA management predictable and stable.
