In Part 2 of this series, I described how I upgraded several Solaris Non-Global Zones using an out-of-place upgrade method. Because the zones did not have enough temporary working space for a direct IPS upgrade, I copied the zone filesystems to a larger storage area, performed the package update there, and then synchronized the updated files back to the original zones.
The approach worked, but it also introduced additional manual steps compared to a standard upgrade. As with many administrative tasks, the technical procedure itself was not the biggest challenge. The real challenge was ensuring each step was performed against the correct zone and filesystem.
During the upgrade process, I made two mistakes that led to some unexpected troubleshooting. First, I accidentally synchronized files from the wrong upgraded zone image, causing one Non-Global Zone to boot with another zone's hostname, IP address, and listener configuration. Later, while correcting the network settings, I mistakenly executed an IP configuration command in the Global Zone instead of the target Non-Global Zone, which immediately disconnected my SSH session and required recovery through the server's ILOM console.
Fortunately, neither issue resulted in data loss, and both were recoverable. This article documents what happened, how the problems were identified, the recovery steps I followed, and the lessons I learned from the experience.
After the test2 zone booted, it became obvious that its operating system configuration had been copied from the wrong zone image.
The database files, control files, and application data still belonged to test2, but several operating system settings belonged to test1. The hostname was incorrect, the IP address did not match the expected configuration, and the listener configuration was pointing to the wrong environment.
The first step was to verify the network configuration inside the affected zone.
zlogin test2
ipadm show-addr
dladm show-vnic
The output confirmed that the zone was not using the correct network configuration.
To correct the network settings, I removed the existing IP configuration and recreated it using the proper address assigned to test2.
A quick warning based on my own mistake: before running any ipadm commands, use the zonename command to confirm that you are inside the correct Non-Global Zone. While fixing test2, I accidentally ran similar commands in the Global Zone and immediately lost network connectivity to the server. I was able to recover the system through ILOM, and I cover that experience later in this post.
# double-check that you are inside the correct Non-Global Zone
zonename
ipadm delete-ip net0 2>/dev/null
ipadm create-ip net0
ipadm create-addr -T static -a 10.1.1.128/22 net0/v4
ipadm show-addr
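The manual zonename check above can also be turned into a guard that refuses to continue in the wrong zone. The following is a minimal sketch, not part of the original procedure: `require_zone` is a hypothetical helper name, and it relies only on the Solaris `zonename` command, which prints `global` when run in the Global Zone.

```shell
# require_zone EXPECTED: succeed only when the current zone is EXPECTED.
# Hypothetical helper; it depends on the Solaris zonename command.
require_zone() {
    expected="$1"
    current="$(zonename)" || {
        echo "could not determine the current zone" >&2
        return 2
    }
    if [ "$current" != "$expected" ]; then
        echo "refusing to continue: in zone '$current', expected '$expected'" >&2
        return 1
    fi
}
```

Chaining the guard in front of a change, for example `require_zone test2 && ipadm create-ip net0`, means the network command never runs if the shell happens to be in the Global Zone.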
I also found an old disabled address object that no longer belonged to the zone and removed it.
ipadm delete-addr znet0/v4
ipadm delete-ip znet0
ipadm show-addr
At this point, the zone was reachable using the correct IP address.
Although the network was fixed, the zone still identified itself as test1 because the hostname information had been copied from the wrong image.
To correct this, I updated the Solaris identity service configuration.
svccfg -s system/identity:node setprop config/nodename = astring: test2
svccfg -s system/identity:node refresh
svcadm restart system/identity:node
To immediately update the current session, I also changed the runtime hostname.
hostname test2
# Verification:
hostname
The hostname was now correctly reported as test2.
The final step was updating the local hostname resolution.
vi /etc/hosts
I replaced the incorrect hostname entries copied from test1 and verified that the correct IP address and hostname for test2 were present.
This ensured that local name resolution, listener configuration, and application services would reference the correct server identity.
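The manual check of /etc/hosts can be repeated with a small script after editing. This is a sketch under stated assumptions: `hosts_has_entry` is a hypothetical helper, and it only confirms that a given IP address maps to a given name (canonical name or alias) in a hosts-style file. It does not detect stale entries left over from test1.

```shell
# hosts_has_entry IP NAME [FILE]: succeed if FILE maps IP to NAME.
# Hypothetical helper; FILE defaults to /etc/hosts.
hosts_has_entry() {
    ip="$1"; name="$2"; file="${3:-/etc/hosts}"
    while read -r addr rest; do
        [ "$addr" = "$ip" ] || continue
        rest="${rest%%#*}"              # drop any trailing comment
        for n in $rest; do
            if [ "$n" = "$name" ]; then
                return 0
            fi
        done
    done < "$file"
    return 1
}
```

For the zone in this article, the call would be `hosts_has_entry 10.1.1.128 test2`, which returns a non-zero status if the entry is missing.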
While correcting the network configuration, I made another mistake that caused a much larger problem.
Instead of creating the IP address inside the Non-Global Zone, I accidentally executed the commands in the Global Zone:
ipadm create-ip net0
ipadm create-addr -T static -a 10.1.1.128/22 net0/v4
As soon as the command completed, my SSH session disconnected.
At that moment, I realized I had overwritten the Global Zone network configuration instead of modifying the Non-Global Zone. Since the public IP address of the server had changed, remote connectivity was immediately lost.
Fortunately, the server's ILOM interface was still accessible.
I connected to the server through the ILOM management interface and opened the system console.
After logging in as root, I inspected the network configuration.
ipadm show-addr
# The incorrect address was visible on the Global Zone interface.
# I removed the mistakenly created address.
ipadm delete-addr net0/v4
# Then I recreated the original address using the correct production IP.
ipadm create-addr -T static -a 10.1.1.126/22 net0/v4
# To verify the repair:
ipadm show-addr net0/v4
Finally, I tested connectivity by pinging the gateway and confirmed that network access had been restored.
Only after verifying connectivity did I disconnect from the ILOM console and reconnect through SSH.
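The "verify before disconnecting" step can be made explicit with a small check run from the console session. The sketch below assumes the Solaris form `ping host [timeout]`, which exits with status 0 when the host answers. `gateway_reachable` is a hypothetical helper name, and the gateway address is not given in this article, so it has to be supplied by the reader.

```shell
# gateway_reachable GATEWAY [TIMEOUT]: succeed if GATEWAY answers a ping.
# Assumes the Solaris syntax "ping host [timeout]" (timeout in seconds).
gateway_reachable() {
    gw="$1"; timeout="${2:-5}"
    if ping "$gw" "$timeout" >/dev/null 2>&1; then
        echo "gateway $gw is reachable"
    else
        echo "gateway $gw is NOT reachable; keep the console session open" >&2
        return 1
    fi
}
```

Running it with the real gateway address before closing the ILOM console gives a clear pass or fail signal instead of relying on reading the ping output by eye.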
In the end, the upgrade was successful, but during the process, I made a couple of mistakes that created additional work. First, I synchronized the files from the wrong upgraded zone image, which caused the zone to come up with the wrong hostname, IP address, and listener configuration. Then, while fixing the network settings, I accidentally ran the IP configuration commands in the Global Zone and lost my connection to the server.
Fortunately, both issues were recoverable, and no application data was lost. The biggest lesson I learned from this experience is to always double-check the source and destination paths before running any synchronization command, and make sure you are working in the correct zone before making network changes. A few seconds of verification can save a lot of troubleshooting time later.
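The source and destination check can be sketched as a pre-flight test that runs before any synchronization command. This is only an illustration: `has_component` and `check_sync_pair` are hypothetical helpers, and they assume that both the upgraded image path and the zone path contain the zone name as a full directory component, which may not match every layout.

```shell
# has_component NAME PATH: succeed if NAME is a complete component of PATH.
has_component() {
    case "/$2/" in
        */"$1"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_sync_pair ZONE SRC DST: succeed only if SRC and DST both belong to ZONE.
# Hypothetical helper; it inspects the path strings only and copies nothing.
check_sync_pair() {
    zone="$1"; src="$2"; dst="$3"
    for p in "$src" "$dst"; do
        if ! has_component "$zone" "$p"; then
            echo "path '$p' does not belong to zone '$zone'" >&2
            return 1
        fi
    done
}
```

With a check like this in front of the copy, a source path for test1 paired with a destination for test2 fails before any files are touched.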