Greetings friends, at VMware Explore Europe, vSphere 8 was finally announced as GA, with more than 18K downloads by then. Bear in mind that Initial Availability had been announced only a month earlier.
As usual, I waited patiently for confirmation that my backup software supported the release, or at least offered initial support, before I tried to install anything. The good news is that Veeam R&D and QA have performed extensive testing with version 11a P20220302 (build 11.0.1.1261 P20220302) of Veeam Backup & Replication, and they have announced that this version works with vSphere 8.0.
To be clear, the first thing I did was upgrade my VCSA to the most up-to-date version for my 7.x deployment, but that didn't help.
Exporting VMware Analytics Service data – stuck at 39%
I went to the download page and started the upgrade process as usual, and then it got stuck at this stage, 39%. It was not clear what was happening, and I went through all the logs I could think of:
- /var/log/vmware/applmgmt/applmgmt.log
- firstboot, and others
After about an hour of waiting, the process moves on to the next error, which I think is just a regular timeout. More importantly, if you click the LOGS button, it fails and tells you that the vCenter has been shut down, so you cannot retrieve anything from there.
The problem seems common and there are other people facing the same issue:
Well, after days of battling with this, and many, many deployments, some words from my friend Luciano came to mind.
Cleaning/reducing VMware Analytics Service data
It was that simple. There are a few articles on the Internet on how to do this, but I didn't want to make my life difficult, so I used an official KB that contains a simple script:
The article mentions that some customers had between 185K and 375K files under /var/log/vmware/analytics/prod. To take a quick look at our VCSA, we log in using SSH and run a quick check:
ls /var/log/vmware/analytics/prod | wc -l
Geez! To my surprise, in my very small environment with 3 hosts and fewer than 50 VMs, I had 24246 files there. Thinking about how long a migration process might take to query and move all of these files, I decided to download the script and run it.
The script is on that official KB, but in case you are already here and want to move ahead with a copy/paste:
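Before settling on a retention, it can help to see how those files are spread by age. The following is only a rough sketch: count_older_than and age_report are names I made up for illustration (they are not part of the VMware KB), and it assumes GNU find is available on the appliance. Unlike the plain ls check, it counts regular files only.

```shell
#!/bin/bash
# Sketch: count regular files in a directory older than N days.
# count_older_than and age_report are hypothetical helper names.

count_older_than() {
    local dir="$1" days="$2"
    # -mtime +N matches files last modified more than N whole days ago.
    # tr strips the padding that some wc implementations add.
    find "$dir" -maxdepth 1 -type f -mtime +"$days" -print | wc -l | tr -d ' '
}

# Print one line per retention candidate passed after the directory.
age_report() {
    local dir="$1"
    shift
    local days
    for days in "$@"; do
        printf '%s days: %s files older\n' "$days" "$(count_older_than "$dir" "$days")"
    done
}
```

On the VCSA shell, something like age_report /var/log/vmware/analytics/prod 7 30 90 180 would then print one count per threshold, which makes it easier to judge how aggressive a cleanup needs to be.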
#!/bin/bash
# Copyright (c) 2022 VMware, Inc. All rights reserved.
# The aim of the script is to delete every file in following pattern:
# *VDDK.*
#
# The default period is set to 180 days, but can be set as an argument.
# Ex: ./cleaning_vddk_script.sh 90 (Delete every file older than 90 days).
DEFAULT_PERIOD=90
TARGET_DIR="/storage/log/vmware/analytics/prod"
PERIOD=${1:-$DEFAULT_PERIOD}
if [[ -d "$TARGET_DIR" ]]; then
    echo "Files before the clean :"
    find $TARGET_DIR -name "*VDDK.*" -type f -exec wc -l {} + | wc -l
    echo "==="
    echo "Start cleaning VDDK files older than $PERIOD days..."
    find $TARGET_DIR -name "*VDDK.*" -type f -mtime +$PERIOD -delete
    if [ $? -eq 0 ]; then
        echo "Clean successful!"
    else
        echo "The clean was unsuccessful!"
    fi
    echo "==="
    echo "Files after the clean: "
    find $TARGET_DIR -name "*VDDK.*" -type f -exec wc -l {} + | wc -l
else
    echo "Directory not found!"
fi
Then you can run it with the retention you want. I went all in and kept only the files from the last week; I didn't mind losing everything else. So, this is what I did after downloading/creating the file:
chmod +x cleaning_vddk_script.sh
./cleaning_vddk_script.sh 7
And to my surprise, this was the output:
Files before the clean :
24246
===
Start cleaning VDDK files older than 7 days...
Clean successful!
===
Files after the clean:
110
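If deleting with such a short retention makes you nervous, a dry run is easy to sketch. The function below is my own illustration, not something from the KB: preview_vddk_cleanup is a made-up name, and it simply reuses the script's pattern and -mtime test while printing the matches instead of deleting them.

```shell
#!/bin/bash
# Dry-run sketch: list the files that the KB script's find expression would match.
# preview_vddk_cleanup is a hypothetical name; the directory default, the
# "*VDDK.*" pattern, and the -mtime test are copied from the script above.

preview_vddk_cleanup() {
    local dir="${1:-/storage/log/vmware/analytics/prod}"
    local days="${2:-90}"
    # -print instead of -delete: nothing is removed.
    find "$dir" -name "*VDDK.*" -type f -mtime +"$days" -print
}
```

Running preview_vddk_cleanup /storage/log/vmware/analytics/prod 7 | wc -l should give an idea of how many files a 7-day retention would remove before anything is touched.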
Thank you, VMware. Now, with just 110 files to migrate over, I thought it would be a good idea to trigger the wizard again, deploy a new VCSA, etc. The usual steps I had done 16 times over the last few days. And then it happened: it got past that stage in just a few minutes:
Not just that, but a few minutes later all the steps were completed. What a successful upgrade it was:
Some other Best Practices to know before upgrading
Thanks to Luciano, I wanted to share a few other best practices to bear in mind before you trigger an upgrade of your vCenter:
- Have an appliance backup, a snapshot, and a native VCSA backup.
- Check that all the DNS records are in place, including the PTR records. This means that you can ping all your VCSA and ESXi hostnames from everywhere in the VMware infrastructure. For example, a simple dig -x IP against any component should resolve its FQDN:
dig -x 192.168.1.165
; <<>> DiG 9.16.27 <<>> -x 192.168.1.165
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7883
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;165.1.168.192.in-addr.arpa. IN PTR
;; ANSWER SECTION:
165.1.168.192.in-addr.arpa. 3600 IN PTR esxi-zlon-001.jorgedelacruz.es.
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 16 16:54:09 UTC 2022
;; MSG SIZE rcvd: 99
- Expired certificates, or expired trusted root certificates. This is never good, and you should keep these in constant check. I am sure this KB will help you. Your VCSA should look like this:
- Have enough space for the upgrade, thanks Luciano. His blog with the steps here – https://www.provirtualzone.com/how-to-add-extra-space-to-vcenter-for-the-upgrade/
- Having a local NTP is always a good idea, here are some steps to achieve this in Microsoft Windows, in Spanish but it reads easy – https://www.jorgedelacruz.es/2020/11/09/microsoft-como-crear-un-servidor-ntp-en-microsoft-windows-server-dentro-de-nuestra-infraestructura/
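Some of the checks in the list above can be scripted. What follows is a minimal sketch under a few assumptions: every function name is hypothetical, it expects dig, openssl, and GNU date to be installed on the machine running it, and cert_days_left only looks at the certificate served on port 443, so it says nothing about trusted root certificates.

```shell
#!/bin/bash
# Pre-upgrade check sketch. All function names are hypothetical helpers.

# Lowercase a hostname and drop the trailing dot that dig prints.
normalize_fqdn() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Succeeds when the PTR record for IP ($1) resolves to the expected FQDN ($2).
ptr_matches() {
    local ip="$1" expected="$2" answer
    answer="$(dig +short -x "$ip" | head -n 1)"
    [ -n "$answer" ] && [ "$(normalize_fqdn "$answer")" = "$(normalize_fqdn "$expected")" ]
}

# Whole days from NOW ($2, epoch seconds, defaults to the current time)
# until the date string in $1. Relies on GNU date to parse the string.
days_until() {
    local end_epoch now_epoch
    end_epoch="$(date -d "$1" +%s)" || return 1
    now_epoch="${2:-$(date +%s)}"
    echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Days left on the certificate served at HOST ($1), PORT ($2, default 443).
cert_days_left() {
    local host="$1" port="${2:-443}" not_after
    not_after="$(openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate | cut -d= -f2)"
    [ -n "$not_after" ] && days_until "$not_after"
}

# Free space in MB on the filesystem that holds the path in $1.
free_mb() {
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024) }'
}
```

With the example from the dig output, ptr_matches 192.168.1.165 esxi-zlon-001.jorgedelacruz.es && echo OK should print OK when the reverse record is right. cert_days_left takes the FQDN of your VCSA, and free_mb takes whichever mount point you want to watch before the upgrade.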
That should be it. I truly hope if you have ended up here, you have been able to resolve your upgrade problem and you are on your way to vSphere 8. Thanks for reading.
bala says
had the exact same error , thanks for the fix , should i just restart the upgrade from stage 1 ?
Robbyrob says
FYI this can be run WHILE the upgrade is in progress and stuck at this step!
Thank you so much for your help
Levi says
Thank you so much for this article! I ran the script while the migration was stuck (for the 3rd time!) and as soon as it finished the progress bar started moving again and finished successfully! What a relief!
jorgeuk says
Great news!!! Awesome it worked.