Nine years ago, I wrote the blog post "How large is my ESXi core dump partition?". Back then, it was about core dumps in ESXi 5.5. A lot has changed in ESXi over the years, and core dumps are no exception.
This post revisits the same topic for ESXi 8.0 U3. The behavior should be the same in ESXi 7.0. I will also use some data from ESXi 7.0 U3, because we are still running ESXi 7.0 U3 in production while I plan and design the upgrade to vSphere 8. That is why I have ESXi 8.0 U3 only in the lab, where some hardware configurations are unavailable. We use ESXi hosts with 1.5 TB RAM in production, but I don't have hosts with that much memory in my lab.
What is a core dump? It boils down to PSOD. An ESXi host Purple Screen of Death (PSOD) happens when the VMkernel experiences a critical failure. This can be due to hardware issues, driver problems, deadlocks, etc. During the PSOD event, the ESXi hypervisor captures a core dump to help diagnose the cause of the failure. Here’s what happens during this process:
After a PSOD, ESXi captures a core dump, which includes a snapshot of the hypervisor memory and the state of the virtual machines. The core dump is stored based on the host configuration (core dump partition, file, or network), and it helps diagnose the cause of the critical failure by providing insights into the state of the system at the time of the crash. A core dump is crucial for troubleshooting and resolving the issues leading to PSOD. And here is the change: in ESXi 6.7, the core dump was stored in a disk partition, but since ESXi 7, it has been stored in a pre-created file.
For the detailed vSphere design, I would like to know the typical core dump file size so that I can allocate optimal storage space for core dumps potentially redirected to a shared datastore (by default, in ESXi 7 and later, core dumps are stored in the ESX-OSData partition, typically on the boot disk). Of course, the core dump size depends on multiple factors, but the main factor should be the memory used by the VMkernel.
ESXi host memory usage is split into three buckets:
- vmKernel memory usage (core hypervisor)
- Other memory usage
  - BusyBox Console including
    - Core BusyBox Utilities (e.g., ls, cp, mv, ps, top, etc.)
    - Networking and Storage Tools (ifconfig, esxcfg-nics, esxcfg-vswitch, esxcli, etc.)
    - Direct Console User Interface (DCUI)
  - Management Agents and Daemons (hostd, vpxa, network daemons like SSH, DNS, NTP, and network file copy aka NFC)
- Free memory
So let's go to the lab and test it. Here is data from the different ESXi host configurations I have access to.
ESXi, 8.0.3 (24022510) with 256 GB (262 034 MB) physical RAM
- vmKernel memory usage: 1544 MB
- Other memory usage: 21 498 MB
- Free memory: 238 991 MB
- vmKernel memory usage: 1453 MB
- Other memory usage: 4 207 MB
- Free memory: 256 373 MB
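As a quick sanity check on the three-bucket model, the figures above can be added up and compared with the physical RAM the host reports. This is only a minimal Python sketch; the dictionary keys and the function name are my own labels, not anything ESXi exposes.

```python
# Memory figures (in MB) copied from the two measurements of the 256 GB host.
PHYSICAL_MB = 262_034

measurements = [
    {"vmkernel": 1_544, "other": 21_498, "free": 238_991},
    {"vmkernel": 1_453, "other": 4_207, "free": 256_373},
]


def bucket_total(m):
    """Sum the three memory buckets of one measurement."""
    return m["vmkernel"] + m["other"] + m["free"]


for m in measurements:
    total = bucket_total(m)
    vmkernel_share = 100 * m["vmkernel"] / PHYSICAL_MB
    print(
        f"buckets={total} MB, gap to physical={PHYSICAL_MB - total} MB, "
        f"vmKernel={vmkernel_share:.2f}% of RAM"
    )
```

In both measurements the three buckets account for all but about 1 MB of the physical RAM, and the vmKernel bucket is well under 1% of it.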
[root@dp-esx02:~] esxcli system coredump file list
Path Active Configured Size
------------------------------------------------------------------------------------------------------- ------ ---------- ----------
/vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile true true 3882876928
It is configured and active.
[root@dp-esx02:~] esxcli system coredump file get
Active: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
Configured: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
[root@dp-esx02:~] ls -lah /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
-rw------- 1 root root 3.6G Oct 29 13:07 /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
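The size column of esxcli is in bytes, while ls -lah prints a rounded value in binary units, so the two numbers describe the same file. A small sketch of the conversion, with helper names chosen only for illustration:

```python
def bytes_to_gib(n):
    """Convert bytes to binary gigabytes (GiB, 1024**3 bytes)."""
    return n / 1024**3


def bytes_to_gb(n):
    """Convert bytes to decimal gigabytes (GB, 1000**3 bytes)."""
    return n / 1000**3


# Size reported by `esxcli system coredump file list` on the 256 GB host.
dumpfile_bytes = 3_882_876_928

print(f"{bytes_to_gib(dumpfile_bytes):.2f} GiB")  # what ls -lah rounds to 3.6G
print(f"{bytes_to_gb(dumpfile_bytes):.2f} GB")
```

Keeping the two units apart matters later, when dump files from different hosts are compared.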
vsish -e set /reliability/crashMe/Panic 1
[root@dp-esx02:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile --zdumpname /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1
Created file /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1
[root@dp-esx02:~] ls -lah /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1
-rw-r--r-- 1 root root 443.9M Oct 29 13:07 /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.1.1
vsish -e set /reliability/crashMe/Panic 1
[root@dp-esx02:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile --zdumpname /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2
Created file /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1
[root@dp-esx02:~] ls -lah /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1
-rw-r--r-- 1 root root 311.2M Nov 4 09:33 /vmfs/volumes/DP-STRG02-Datastore01/zdump/zdump-coredump.dp-esx02.2.1
ESXi, 8.0.3 (24022510) with 128 GB (131 008 MB) physical RAM
- vmKernel memory usage: 694 MB
- Other memory usage: 1 660 MB
- Free memory: 128 653 MB
[root@esx21:~] esxcli system coredump file list
Path Active Configured Size
------------------------------------------------------------------------------------------------------- ------ ---------- ----------
/vmfs/volumes/6727594d-c447be9c-5a0e-90b11c13fc14/vmkdump/4C4C4544-0054-5810-8033-B3C04F48354A.dumpfile true true 2441084928
vsish -e set /reliability/crashMe/Panic 1
[root@esx21:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/6727594d-c447be9c-5a0e-90b11c13fc14/vmkdump/4C4C4544-0054-5810-8033-B3C04F48354A.dumpfile --zdumpname /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1
Created file /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
[root@esx21:~] ls -lah /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
-rw-r--r-- 1 root root 111.0M Dec 2 08:26 /vmfs/volumes/ESX21-FLASH-01/coredump.esx21.1.1
- 128 GB RAM is half of 256 GB RAM
- vmKernel memory usage of 694 MB is ~half of 1453 MB
- the 2.27 GB coredump file is ~60% of the 3.6 GB file
- the 111 MB zdump file is ~4x smaller than the 443.9 MB one
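The comparison above can be reproduced from the raw figures. The following is a rough sketch rather than a measurement tool; the field names are hypothetical, and the 256 GB values are the second memory measurement and the first zdump of that host.

```python
# Figures copied from the two lab hosts running ESXi 8.0.3.
host_256 = {
    "ram_gb": 256,
    "vmkernel_mb": 1_453,
    "dumpfile_bytes": 3_882_876_928,
    "zdump_mb": 443.9,
}
host_128 = {
    "ram_gb": 128,
    "vmkernel_mb": 694,
    "dumpfile_bytes": 2_441_084_928,
    "zdump_mb": 111.0,
}


def ratios(small, big):
    """Return small/big for every metric the two hosts share."""
    return {key: small[key] / big[key] for key in small}


for metric, ratio in ratios(host_128, host_256).items():
    print(f"{metric}: {ratio:.2f}")
```

RAM and vmKernel memory roughly halve, the pre-created dump file shrinks less than that, and the zdump shrinks to about a quarter.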
ESXi, 7.0.3 (23794027) with 512 GB (524 178 MB) physical RAM
- vmKernel memory usage: 3 227 MB
- Other memory usage: 366 140 MB
- Free memory: 154 810 MB
- vmKernel memory usage: 2 776 MB
- Other memory usage: 25 402 MB
- Free memory: 495 998 MB
[root@prg03t0-esx05:~] esxcli system coredump file list
Path Active Configured Size
------------------------------------------------------------------------------------------------------------------ ------ ---------- ----------
/vmfs/volumes/6233a3c2-58e4bf62-94e7-0025b5ea0e13/vmkdump/00000000-00E0-0000-0000-000000000006-8162115584.dumpfile true true 8162115584
vsish -e set /reliability/crashMe/Panic 1
[root@prg03t0-esx05:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/6233a3c2-58e4bf62-94e7-0025b5ea0e13/vmkdump/00000000-00E0-0000-0000-000000000006-8162115584.dumpfile --zdumpname /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1
Created file /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1
[root@prg03t0-esx05:~] ls -lah /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1
-rw-r--r-- 1 root root 4.6G Nov 5 18:11 /vmfs/volumes/PRG03T0-HDD01/coredump.esx05.1.1
- 512 GB RAM is 2x bigger than 256 GB RAM
- vmKernel memory usage of 2 776 MB is ~2x larger than 1453 MB (it makes sense)
- the 8.16 GB (7.6 GiB) coredump file is ~2.1x larger than the 3.6 GiB file (it makes sense)
- the 4.6 GB (4 710 MB) zdump file is ~10x larger than 443.9 MB (hmm, interesting)
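One way to look at the zdump sizes is to ask how much of the pre-created dump file each extracted zdump would fill. The sketch below works from byte counts to avoid mixing decimal GB (8.16) with the binary units that ls -lah prints (3.6G). It assumes the ls sizes are MiB, and the function names are illustrative only.

```python
MIB = 1024**2


def fill_percent(zdump_mib, dumpfile_bytes):
    """Share of the pre-created dump file taken by the extracted zdump."""
    return 100 * zdump_mib / (dumpfile_bytes / MIB)


def size_ratio(bytes_a, bytes_b):
    """How many times larger file A is than file B."""
    return bytes_a / bytes_b


# (host, zdump size in MiB as shown by ls -lah, dump file size in bytes)
crashes = [
    ("256 GB, crash 1", 443.9, 3_882_876_928),
    ("256 GB, crash 2", 311.2, 3_882_876_928),
    ("128 GB", 111.0, 2_441_084_928),
    ("512 GB", 4_710.0, 8_162_115_584),
]

for host, zdump_mib, dump_bytes in crashes:
    print(f"{host}: zdump fills {fill_percent(zdump_mib, dump_bytes):.1f}% of the dump file")

print(f"512 GB vs 256 GB dump file: {size_ratio(8_162_115_584, 3_882_876_928):.2f}x")
```

On the lab hosts the zdump uses a small fraction of the dump file, while on the 512 GB production host it uses more than half of it, which is the interesting part of the ~10x jump.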
ESXi, 7.0.3 (23794027) with 1.5 TB (1 571 489 MB) physical RAM
- vmKernel memory usage: 2 705 MB
- Other memory usage: 2 705 MB
- Free memory: 1 561 570 MB
[root@prg0301-esx36:~] esxcli system coredump file list
Path Active Configured Size
------------------------------------------------------------------------------------------------------------------- ------ ---------- -----------
/vmfs/volumes/5dec0956-3d83cd8b-de10-0025b52ae000/vmkdump/00000000-0021-0000-0000-000000000024.dumpfile true true 16106127360
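With four hosts measured, the pre-created dump file size can be set against the physical RAM of each host. This is a sketch under assumptions: it treats the reported RAM "MB" as MiB, it mixes ESXi 8.0.3 and 7.0.3 hosts, and four data points are not a sizing rule.

```python
MIB = 1024**2
GIB = 1024**3

# (host, physical RAM in MB, dump file size in bytes from esxcli)
hosts = [
    ("ESXi 8.0.3, 128 GB", 131_008, 2_441_084_928),
    ("ESXi 8.0.3, 256 GB", 262_034, 3_882_876_928),
    ("ESXi 7.0.3, 512 GB", 524_178, 8_162_115_584),
    ("ESXi 7.0.3, 1.5 TB", 1_571_489, 16_106_127_360),
]


def dumpfile_gib(dump_bytes):
    """Dump file size in GiB."""
    return dump_bytes / GIB


def dumpfile_share(ram_mb, dump_bytes):
    """Dump file size as a percentage of physical RAM."""
    return 100 * (dump_bytes / MIB) / ram_mb


for name, ram_mb, dump_bytes in hosts:
    print(
        f"{name}: dump file {dumpfile_gib(dump_bytes):.2f} GiB, "
        f"{dumpfile_share(ram_mb, dump_bytes):.2f}% of RAM"
    )
```

In this data set the dump file stays between roughly 1% and 2% of physical RAM, and the 1.5 TB host has a dump file of exactly 15 GiB.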
Conclusion
- 101 MB - Boot Loader partition
- 4 GB - Boot Bank 1 partition
- 4 GB - Boot Bank 2 partition
- 119.9 GB - ESX-OSData partition
Filesystem Size Used Available Use% Mounted on
VMFSOS 119.8G 5.2G 114.6G 4% /vmfs/volumes/OSDATA-66d98185-2bceed00-72c5-0025b5ea0e0d
vfat 4.0G 274.1M 3.7G 7% /vmfs/volumes/BOOTBANK1
vfat 4.0G 338.9M 3.7G 8% /vmfs/volumes/BOOTBANK2
- 101 MB - Boot Loader partition
- 4 GB - Boot Bank 1 partition
- 4 GB - Boot Bank 2 partition
- 23.9 GB - ESX-OSData partition