Monday, December 02, 2024

What is the core dump size for ESXi 8.0 U3?

Nine years ago, I wrote the blog post "How large is my ESXi core dump partition?". Back then, it was about core dumps in ESXi 5.5. A lot has changed in ESXi over the years, and that is true for core dumps too.

Let's write a new blog post about the same topic, but this time for ESXi 8.0 U3. The behavior should be the same in ESXi 7.0. In this blog post, I will also use some data from ESXi 7.0 U3, because we are still running ESXi 7.0 U3 in production while I plan and design the upgrade to vSphere 8. That's why I have ESXi 8.0 U3 only in the lab, where some hardware configurations are unavailable. We use ESXi hosts with 1.5 TB RAM in production, but I don't have hosts with such memory capacity in my lab.

What is a core dump? It boils down to a PSOD. An ESXi host Purple Screen of Death (PSOD) happens when the VMkernel experiences a critical failure. This can be due to hardware issues, driver problems, a deadlock, etc. During the PSOD event, the ESXi hypervisor captures a core dump to help diagnose the cause of the failure. Here's what happens during this process:

After a PSOD, ESXi captures a core dump, which includes a snapshot of the hypervisor memory and the state of the virtual machines. The core dump is stored based on the host configuration (core dump partition, file, or network), and it helps diagnose the cause of the critical failure by providing insight into the state of the system at the time of the crash. A core dump is crucial for troubleshooting and resolving the issues leading to a PSOD. And here is the change: in ESXi 6.7, the core dump was stored in a disk partition, but since ESXi 7, it has been stored in a precreated file.

For a detailed vSphere design, I would like to know the typical core dump file size so I can allocate optimal storage space for core dumps potentially redirected to a shared datastore (by default, in ESXi 7 and later, core dumps are stored in the ESX-OSData partition, typically on the boot disk). Of course, the core dump size depends on multiple factors, but the main factor should be the memory used by the VMkernel.

ESXi host memory usage is split into three buckets:

  1. vmKernel memory usage (core hypervisor)
  2. Other memory usage
    • BusyBox console, including
      • core BusyBox utilities (e.g., ls, cp, mv, ps, top)
      • networking and storage tools (ifconfig, esxcfg-nics, esxcfg-vswitch, esxcli, etc.)
    • Direct Console User Interface (DCUI)
    • management agents and daemons (hostd, vpxa, and network daemons such as SSH, DNS, NTP, and network file copy aka NFC)
  3. Free memory

So let's go to the lab and test it. Here is data from four different ESXi host configurations I have access to.

ESXi, 8.0.3 (24022510) with 256 GB (262 034 MB) physical RAM

vSAN is disabled, NSX is installed

In Production mode running 10 Powered On VMs having 24 GB vRAM:

  • vmKernel memory usage:  1544 MB
  • Other memory usage: 21 498 MB
  • Free memory: 238 991 MB
In Maintenance mode (no VMs):
  • vmKernel memory usage:  1453 MB
  • Other memory usage: 4 207 MB
  • Free memory: 256 373 MB
Before triggering a PSOD, let's check the core dump configuration.

In ESXi 8.0.3 with 256 GB RAM, the core dump is set to be stored in a 3.6 GB file (3,882,876,928 bytes) in the ESX-OSData partition.
 [root@dp-esx02:~] esxcli system coredump file list
 Path                                                                                                     Active  Configured        Size
 -------------------------------------------------------------------------------------------------------  ------  ----------  ----------
 /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile  true    true  3882876928
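As a quick sanity check, the byte count reported by esxcli converts to the 3.6 GB shown later by ls (which reports binary gigabytes):

```shell
# Convert the coredump file size reported by esxcli from bytes to GiB.
# 3882876928 bytes / 1024^3 ~= 3.6 GiB, matching what `ls -lah` shows.
awk 'BEGIN { printf "%.2f GiB\n", 3882876928 / (1024 ^ 3) }'
# prints "3.62 GiB"
```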

It is configured and active. 

 [root@dp-esx02:~] esxcli system coredump file get  
   Active: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile  
   Configured: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile  

The core dump file is 3.6 GB:
 [root@dp-esx02:~] ls -lah /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile  
 -rw-------  1 root   root    3.6G Oct 29 13:07 /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile  

Now let's trigger the first PSOD on the ESXi host in maintenance mode and watch what happens. Below is the command to initiate the PSOD:
 vsish -e set /reliability/crashMe/Panic 1  

VMware Support will ask you for a zdump file (a VMware proprietary binary file), which can be generated with the esxcfg-dumppart command:
 [root@dp-esx02:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile --zdumpname /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1  
 Created file /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1  
 [root@dp-esx02:~] ls -lah /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1  
 -rw-r--r--  1 root   root   443.9M Oct 29 13:07 /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1  
The extracted VMkernel zdump file from the first PSOD is 443.9 MB.

Now let's try the second PSOD.
 vsish -e set /reliability/crashMe/Panic 1  

Let's extract the core dump.
 [root@dp-esx02:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile --zdumpname /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2  
 Created file /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1  
 [root@dp-esx02:~] ls -lah /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1  
 -rw-r--r--    1 root     root      311.2M Nov  4 09:33 /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1
The extracted VMkernel zdump file from the second PSOD is 311.2 MB.

The core dump file holds only a single core dump; the system does not keep multiple core dumps. The best practice is therefore to extract every core dump (esxcfg-dumppart --file --copy) from the core dump file to an external storage location so that core dumps from older PSODs remain available for analysis.
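Extraction can be scripted so it happens right after every PSOD. The sketch below is only an illustration: the destination directory is a hypothetical example, and since esxcfg-dumppart and esxcli exist only on ESXi, the script checks for the tool first.

```shell
#!/bin/sh
# Sketch: copy the active core dump out of the dump file before the next PSOD
# overwrites it. The destination path below is a hypothetical example.
DEST_DIR="/vmfs/volumes/SHARED-DS/zdump"   # example shared datastore path

if command -v esxcfg-dumppart >/dev/null 2>&1; then
  # Read the path of the active dump file from the esxcli output.
  DUMPFILE=$(esxcli system coredump file get | awk '/Active:/ { print $2 }')
  # Extract the zdump with a timestamped name so old extracts are kept.
  esxcfg-dumppart --file --copy \
    --devname "$DUMPFILE" \
    --zdumpname "$DEST_DIR/zdump-$(hostname)-$(date +%Y%m%d-%H%M%S)"
else
  echo "esxcfg-dumppart not found - run this on an ESXi host"
fi
```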
 
Let's continue our PSOD test with additional PSODs to find out whether the system can manage sequential core dumps with a total size bigger than 3.6 GB, the size of the core dump file. Assuming a single core dump is always around 300 MB, we would need about 13 PSODs, so let's do 14 in total.

 3rd PSOD:    304.9 MB
 4th PSOD:    285.4 MB
 5th PSOD:    303.2 MB
 6th PSOD:    316.6 MB
 7th PSOD:    322.9 MB
 8th PSOD:    288.3 MB
 9th PSOD:    283.0 MB
10th PSOD:    276.7 MB
11th PSOD:    292.5 MB
12th PSOD:    289.7 MB
13th PSOD:    281.8 MB
14th PSOD:    290.3 MB
TOTAL (all 14 PSODs): 4.2 GB

So even though the ESXi host with 256 GB RAM has a 3.6 GB core dump file, we collected more than 4 GB of core dumps. This proves that the core dump file is used for a single core dump and each new core dump overwrites the old one.
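The fourteen individual zdump sizes can be summed to verify the reported total:

```shell
# Sum all fourteen zdump sizes (in MB, from the measurements above)
# and convert the total to GB.
printf '%s\n' 443.9 311.2 304.9 285.4 303.2 316.6 322.9 288.3 283.0 \
              276.7 292.5 289.7 281.8 290.3 |
  awk '{ sum += $1 } END { printf "%.1f MB = %.1f GB\n", sum, sum / 1024 }'
# prints "4290.4 MB = 4.2 GB"
```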

ESXi, 8.0.3 (24022510) with 128 GB (131 008 MB) physical RAM

vSAN is disabled, NSX is not installed

In Maintenance mode (no VMs):
  • vmKernel memory usage:  694 MB
  • Other memory usage: 1 660 MB
  • Free memory: 128 653 MB
In ESXi 8.0.3 with 128 GB RAM, the core dump is set to be stored in a 2.27 GB file (2,441,084,928 bytes) in the ESX-OSData partition.
 [root@esx21:~] esxcli system coredump file list  
 Path                                                                                                     Active  Configured        Size
-------------------------------------------------------------------------------------------------------  ------  ----------  ----------
/vmfs/volumes/6727594d-c447be9c-5a0e-90b11c13fc14/vmkdump/4C4C4544-0054-5810-8033-B3C04F48354A.dumpfile    true        true  2441084928

Now let's trigger a PSOD on the ESXi host in maintenance mode.
 vsish -e set /reliability/crashMe/Panic 1  

Let's extract the core dump.
[root@esx21:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/6727594d-c447be9c-5a0e-90b11c13fc14/vmkdump/4C4C4544-0054-5810-8033-B3C04F48354A.dumpfile --zdumpname /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1
Created file /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
[root@esx21:~] ls -lah /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
-rw-r--r--    1 root     root      111.0M Dec  2 08:26 /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
The VMkernel zdump file extracted from the PSOD on an ESXi 8.0 U3 host with 128 GB of RAM is 111 MB in size, which is significantly smaller than the zdump file from an ESXi 8.0 U3 host with 256 GB of RAM.

Let's compare it to the ESXi host with 256 GB RAM:
  • 128 GB RAM is half of 256 GB RAM
  • vmKernel memory usage of 694 MB is ~half of 1 453 MB
  • the 2.27 GB coredump file is ~63% of the 3.6 GB file
  • the 111 MB zdump file is ~4x smaller than 443.9 MB
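The ratios in this comparison can be reproduced directly from the raw measurements:

```shell
# Ratios of the 128 GB host's measurements to the 256 GB host's.
awk 'BEGIN {
  printf "vmKernel memory: %.2f\n", 694 / 1453                # ~0.48, about half
  printf "coredump file:   %.2f\n", 2441084928 / 3882876928   # ~0.63
  printf "zdump file:      %.2f\n", 111.0 / 443.9             # ~0.25, about 4x smaller
}'
```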

ESXi, 7.0.3 (23794027) with 512 GB (524 178 MB) physical RAM

In Production mode running 38 Powered On VMs having 310.37 GB vRAM:
  • vmKernel memory usage:  3 227 MB
  • Other memory usage: 366 140 MB
  • Free memory: 154 810 MB
In Maintenance mode (no VMs):
  • vmKernel memory usage:  2 776 MB
  • Other memory usage: 25 402 MB
  • Free memory: 495 998 MB
In ESXi 7.0.3 with 512 GB RAM, the core dump is set to be stored in an 8.16 GB file (8,162,115,584 bytes) in the ESX-OSData partition.
 [root@prg03t0-esx05:~] esxcli system coredump file list
 Path                                                                                                                Active  Configured        Size
 ------------------------------------------------------------------------------------------------------------------  ------  ----------  ----------
 /vmfs/volumes/6233a3c2-58e4bf62-94e7-0025b5ea0e13/vmkdump/00000000-00E0-0000-0000-000000000006-8162115584.dumpfile  true    true  8162115584

Now let's trigger a PSOD on the ESXi host in maintenance mode.
 vsish -e set /reliability/crashMe/Panic 1  


Let's extract the core dump.
[root@prg03t0-esx05:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/6233a3c2-58e4bf62-94e7-0025b5ea0e13/vmkdump/00000000-00E0-0000-0000-000000000006-8162115584.dumpfile --zdumpname /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1
Created file /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1  
[root@prg03t0-esx05:~] ls -lah /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1
-rw-r--r--    1 root     root        4.6G Nov  5 18:11 /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1
The VMkernel zdump file extracted from the PSOD on an ESXi 7.0 U3 host with 512 GB of RAM is 4.6 GB in size, which is significantly larger than the zdump file from an ESXi 8.0 U3 host with 256 GB of RAM.

Let's compare it to ESXi 8.0 U3 with 256 GB RAM:
  • 512 GB RAM is 2x larger than 256 GB RAM
  • vmKernel memory usage of 2 776 MB is ~2x larger than 1 453 MB (that makes sense)
  • the 8.16 GB coredump file is ~2.1x larger than the 3.6 GB file (that makes sense)
  • the 4.6 GB (4 710 MB) zdump file is ~10x larger than 443.9 MB (hmm, interesting)
Why is the zdump file 10x bigger on ESXi 7.0 U3 with 512 GB RAM, and not just 2x bigger as I would expect? To be honest, I don't know. I have to retest it on ESXi 8.0 U3 with 512 GB RAM and 1.5 TB RAM when possible.
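Again, the ratios can be reproduced from the raw measurements:

```shell
# Ratios of the 512 GB host's measurements to the 256 GB host's.
awk 'BEGIN {
  printf "vmKernel memory: %.1fx\n", 2776 / 1453               # ~1.9x
  printf "coredump file:   %.1fx\n", 8162115584 / 3882876928   # ~2.1x
  printf "zdump file:      %.1fx\n", 4710 / 443.9              # ~10.6x
}'
```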

ESXi, 7.0.3 (23794027) with 1.5 TB (1 571 489 MB) physical RAM

This ESXi host (1.5 TB RAM) exists only in production, so it is managed by the operations team, and we want to avoid testing a PSOD in production. However, we checked the memory usage and the core dump file size.

In Maintenance mode (no VMs):
  • vmKernel memory usage:  2 705 MB
  • Other memory usage: 2 705 MB
  • Free memory: 1 561 570 MB
In ESXi 7.0.3 with 1.5 TB RAM, the core dump is set to be stored in a 16.1 GB file (16,106,127,360 bytes) in the ESX-OSData partition.
 [root@prg0301-esx36:~] esxcli system coredump file list
 Path                                                                                                     Active  Configured         Size
 -------------------------------------------------------------------------------------------------------  ------  ----------  -----------
 /vmfs/volumes/5dec0956-3d83cd8b-de10-0025b52ae000/vmkdump/00000000-0021-0000-0000-000000000024.dumpfile  true    true  16106127360

I cannot test PSOD in the production system by myself, so I have to wait until our operation team schedules the vSphere 8 upgrade and we can test it together. 

Conclusion

ESXi 6.7 and older store core dumps in a dedicated disk partition; ESXi 7 and newer store them in a core dump file. The core dump file holds only a single core dump, so the vSphere administrator should extract it (esxcfg-dumppart --file --copy) immediately after a PSOD; otherwise, it will be lost when another PSOD occurs.

In the current ESXi 8, the core dump file is located in the ESX-OSData partition, which can be on the boot disk or on an additional disk.

If the boot disk is 128 GB or larger, the standard ESXi 8 partition layout is:
  1. 101 MB   - Boot Loader partition
  2. 4 GB     - Boot Bank 1 partition
  3. 4 GB     - Boot Bank 2 partition
  4. 119.9 GB - ESX-OSData partition
This is the disk usage of ESXi 8.0 U3 with 256 GB RAM and a 128 GB boot disk (vSAN disabled, NSX installed).
 Filesystem   Size    Used  Available  Use%  Mounted on
 VMFSOS     119.8G    5.2G     114.6G    4%  /vmfs/volumes/OSDATA-66d98185-2bceed00-72c5-0025b5ea0e0d
 vfat         4.0G  274.1M       3.7G    7%  /vmfs/volumes/BOOTBANK1
 vfat         4.0G  338.9M       3.7G    8%  /vmfs/volumes/BOOTBANK2
As you can see, 5.2 GB is used in the ESX-OSData partition.

We use ESXi hosts with 1.5 TB RAM booting from SAN (Fibre Channel) in our production environment. The boot disk (a LUN on shared storage) is 32 GB. In that case, the partition layout looks like this:
  1. 101 MB  - Boot Loader partition
  2. 4 GB    - Boot Bank 1 partition
  3. 4 GB    - Boot Bank 2 partition
  4. 23.9 GB - ESX-OSData partition
The ESX-OSData volume takes on the role of the legacy /scratch partition, the locker partition for VMware Tools, and the core dump destination. With ESX-OSData at 23.9 GB, there is still space for the core dump file (16.1 GB), log files, and trace files. If we want to keep 20% free space on the ESX-OSData partition, we have 19.1 GB available; 16.1 GB of that is preallocated for the core dump file, leaving 3 GB for logs and traces. This should be enough.
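The space budget above is simple arithmetic and can be double-checked:

```shell
# ESX-OSData space budget on a 32 GB boot LUN, keeping 20% free.
awk 'BEGIN {
  osdata = 23.9                 # ESX-OSData partition size (GB)
  usable = osdata * 0.8         # keep 20% free space
  dump   = 16.1                 # preallocated core dump file (GB)
  printf "usable: %.1f GB, left for logs and traces: %.1f GB\n", usable, usable - dump
}'
# prints "usable: 19.1 GB, left for logs and traces: 3.0 GB"
```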

Note: Logs and traces are also configured to be sent to a remote syslog server (Aria Operations for Logs / aka LogInsight).

Even though core dumps can be redirected to a shared datastore (by configuring a core dump file on another datastore with esxcli system coredump file), keeping the core dump on the boot device is a relatively good design choice when using a durable boot device (HDD, NVMe, SATADOM, etc.) of 32 GB or larger capacity.

The VMware minimum recommended boot disk size is 32 GB, while 128 GB is considered ideal. VMware also recommends using durable boot devices, such as local disks or NVMe drives, instead of SD cards or USB sticks for ESXi 7.0 and later versions.

Note: In my home lab, I still boot from USB and have ESX-OSData on NVMe disk because my old equipment does not support booting from NVMe.


