                           SPECaccel(R)2023 Result

                  Supermicro Tesla H100 PCIe 80GB 120GQ-TNRT
                       Test Sponsor: NVIDIA Corporation

    accel2023 License: 9045                  Test date: Oct-2023
    Test sponsor: NVIDIA Corporation         Hardware availability: Mar-2023
    Tested by:    NVIDIA Corporation         Software availability: Nov-2023

                 Base   Base      Base      Base      Peak   Peak      Peak      Peak
Benchmarks      Model   Ref.  Run Time     Ratio     Model   Ref.  Run Time     Ratio
-------------- ------ ------ --------- ---------    ------ ------ --------- ---------
403.stencil       LOP    440       229      1.92 S     LOP    440       229      1.92 S
403.stencil       LOP    440       229      1.92 S     LOP    440       229      1.92 S
403.stencil       LOP    440       229      1.92 *     LOP    440       229      1.92 *
404.lbm           LOP    455       134      3.39 *     LOP    455       134      3.39 *
404.lbm           LOP    455       134      3.41 S     LOP    455       134      3.41 S
404.lbm           LOP    455       135      3.37 S     LOP    455       135      3.37 S
450.md            LOP    600       320      1.88 S     LOP    600       320      1.88 S
450.md            LOP    600       319      1.88 *     LOP    600       319      1.88 *
450.md            LOP    600       319      1.88 S     LOP    600       319      1.88 S
452.ep            LOP    415       143      2.89 S     LOP    415       143      2.89 S
452.ep            LOP    415       144      2.89 S     LOP    415       144      2.89 S
452.ep            LOP    415       144      2.89 *     LOP    415       144      2.89 *
453.clvrleaf      LOP   1000       170      5.89 S     LOP   1000       170      5.89 S
453.clvrleaf      LOP   1000       169      5.91 S     LOP   1000       169      5.91 S
453.clvrleaf      LOP   1000       170      5.89 *     LOP   1000       170      5.89 *
455.seismic       LOP    780       235      3.32 S     LOP    780       235      3.32 S
455.seismic       LOP    780       235      3.32 *     LOP    780       235      3.32 *
455.seismic       LOP    780       238      3.28 S     LOP    780       238      3.28 S
456.spF           LOP    475       129      3.67 S     LOP    475       129      3.67 S
456.spF           LOP    475       129      3.67 S     LOP    475       129      3.67 S
456.spF           LOP    475       129      3.67 *     LOP    475       129      3.67 *
457.spC           LOP    540       183      2.95 *     LOP    540       183      2.95 *
457.spC           LOP    540       183      2.95 S     LOP    540       183      2.95 S
457.spC           LOP    540       183      2.95 S     LOP    540       183      2.95 S
459.miniGhost     LOP    590       304      1.94 *     LOP    590       304      1.94 *
459.miniGhost     LOP    590       304      1.94 S     LOP    590       304      1.94 S
459.miniGhost     LOP    590       304      1.94 S     LOP    590       304      1.94 S
460.ilbdc         LOP    555       211      2.63 *     LOP    555       211      2.63 *
460.ilbdc         LOP    555       211      2.63 S     LOP    555       211      2.63 S
460.ilbdc         LOP    555       212      2.62 S     LOP    555       212      2.62 S
463.swim          LOP    440       200      2.20 *     LOP    440       200      2.20 *
463.swim          LOP    440       198      2.23 S     LOP    440       198      2.23 S
463.swim          LOP    440       209      2.10 S     LOP    440       209      2.10 S
470.bt            LOP   1055       178      5.91 *     LOP   1055       178      5.91 *
470.bt            LOP   1055       178      5.91 S     LOP   1055       178      5.91 S
470.bt            LOP   1055       178      5.91 S     LOP   1055       178      5.91 S
============================================================================================
403.stencil       LOP    440       229      1.92 *     LOP    440       229      1.92 *
404.lbm           LOP    455       134      3.39 *     LOP    455       134      3.39 *
450.md            LOP    600       319      1.88 *     LOP    600       319      1.88 *
452.ep            LOP    415       144      2.89 *     LOP    415       144      2.89 *
453.clvrleaf      LOP   1000       170      5.89 *     LOP   1000       170      5.89 *
455.seismic       LOP    780       235      3.32 *     LOP    780       235      3.32 *
456.spF           LOP    475       129      3.67 *     LOP    475       129      3.67 *
457.spC           LOP    540       183      2.95 *     LOP    540       183      2.95 *
459.miniGhost     LOP    590       304      1.94 *     LOP    590       304      1.94 *
460.ilbdc         LOP    555       211      2.63 *     LOP    555       211      2.63 *
463.swim          LOP    440       200      2.20 *     LOP    440       200      2.20 *
470.bt            LOP   1055       178      5.91 *     LOP   1055       178      5.91 *
 SPECaccel 2023_base        2.98
 SPECaccel 2023_peak        2.98


HARDWARE
--------
    CPU Name: Intel Xeon Gold 6338
    Max MHz.: 3400
    Nominal: 2000
    Enabled: 64 cores, 2 chips, 2 threads/core
    Orderable: 2 chips
    Cache L1: 32 KB I + 48 KB D on chip per core
          L2: 1280 KB I+D on chip per core
          L3: 48 MB I+D on chip per chip
       Other: None
    Memory: 512 GB (16x 16GB, PC3200 CL3 DDR4)
    Storage: 1TB SATA
    Other: None
    Base Threads Run: 1
    Min. Peak Threads: 1
    Max. Peak Threads: 1


ACCELERATOR
-----------
    Accel Model Name: H100 PCIe 80GB
    Accel Vendor: NVIDIA
    Accel Name: Tesla H100 PCIe 80GB
    Type of Accel: GPU
    Accel Connection: PCIe 4.0 16x
    Does Accel Use ECC: Yes
    Accel Description: See Notes
    Accel Driver: NVIDIA UNIX x86_64 Kernel Module 525.60.13


SOFTWARE
--------
    OS: Rocky Linux release 8.8 (Green Obsidian)
        4.18.0-477.15.1.el8_8.x86_64
    Compiler: C/Fortran: Version 23.11 of NVHPC SDK
    Firmware: 1.4a 10/11/2022
    File System: xfs
    System State: Run level 3 (multi-user)
    Other: None
    Base Parallel Model: LOP
    Base Threads Run: 1
    Peak Parallel Models: LOP
    Max. Peak Threads: 1
    Min. Peak Threads: 1


Operating System Notes
----------------------
    Shell stacksize set to unlimited via "limit stacksize unlimited"


Platform Notes
--------------
    Information from nvaccelinfo
       CUDA Driver Version:           12000
       NVRM version:                  NVIDIA UNIX x86_64 Kernel Module 525.60.13 Wed Nov 30 06:39:21 UTC 2022
       Device Number:                 0
       Device Name:                   NVIDIA H100 PCIe
       Device Revision Number:        9.0
       Global Memory Size:            85021163520
       Number of Multiprocessors:     114
       Concurrent Copy and Execution: Yes
       Total Constant Memory:         65536
       Total Shared Memory per Block: 49152
       Registers per Block:           65536
       Warp Size:                     32
       Maximum Threads per Block:     1024
       Maximum Block Dimensions:      1024, 1024, 64
       Maximum Grid Dimensions:       2147483647 x 65535 x 65535
       Maximum Memory Pitch:          2147483647B
       Texture Alignment:             512B
       Clock Rate:                    1755 MHz
       Execution Timeout:             No
       Integrated Device:             No
       Can Map Host Memory:           Yes
       Compute Mode:                  default
       Concurrent Kernels:            Yes
       ECC Enabled:                   Yes
       Memory Clock Rate:             1593 MHz
       Memory Bus Width:              5120 bits
       L2 Cache Size:                 52428800 bytes
       Max Threads Per SMP:           2048
       Async Engines:                 3
       Unified Addressing:            Yes
       Managed Memory:                Yes
       Concurrent Managed Memory:     Yes
       Preemption Supported:          Yes
       Cooperative Launch:            Yes
       Cluster Launch:                Yes
       Unified Function Pointers:     Yes
       Default Target:                cc90

    Sysinfo program /local/home/mcolgrove/ACCELV2/bin/sysinfo
    Rev: r6622 of 2021-04-07 b1a7d5f8f71be5aff70a755cad7211a0
    running on ice3 Fri Oct 20 17:48:42 2023

    SUT (System Under Test) info as seen by some common utilities.
    For more information on this section, see
       https://www.spec.org/cpu2017/Docs/config.html#sysinfo

    From /proc/cpuinfo
       model name : Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
          2 "physical id"s (chips)
          128 "processors"
       cores, siblings (Caution: counting these is hw and system dependent. The following
       excerpts from /proc/cpuinfo might not be reliable. Use with caution.)
          cpu cores : 32
          siblings : 64
          physical 0: cores 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
          physical 1: cores 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

    From lscpu from util-linux 2.32.1:
       Architecture:        x86_64
       CPU op-mode(s):      32-bit, 64-bit
       Byte Order:          Little Endian
       CPU(s):              128
       On-line CPU(s) list: 0-127
       Thread(s) per core:  2
       Core(s) per socket:  32
       Socket(s):           2
       NUMA node(s):        2
       Vendor ID:           GenuineIntel
       CPU family:          6
       Model:               106
       Model name:          Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
       Stepping:            6
       CPU MHz:             3200.000
       CPU max MHz:         3200.0000
       CPU min MHz:         800.0000
       BogoMIPS:            4000.00
       Virtualization:      VT-x
       L1d cache:           48K
       L1i cache:           32K
       L2 cache:            1280K
       L3 cache:            49152K
       NUMA node0 CPU(s):   0-31,64-95
       NUMA node1 CPU(s):   32-63,96-127
       Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
       pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp
       lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid
       aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16
       xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave
       avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single
       ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad
       fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq
       rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw
       avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
       cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts avx512vbmi umip
       pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme
       avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities

    /proc/cpuinfo cache data
       cache size : 49152 KB

    From numactl --hardware
    WARNING: a numactl 'node' might or might not correspond to a physical chip.
       available: 2 nodes (0-1)
       node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
       node 0 size: 257616 MB
       node 0 free: 126167 MB
       node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
       node 1 size: 257985 MB
       node 1 free: 227371 MB
       node distances:
       node   0   1
         0:  10  20
         1:  20  10

    From /proc/meminfo
       MemTotal:       527975808 kB
       HugePages_Total:       0
       Hugepagesize:       2048 kB

    /sbin/tuned-adm active
       Current active profile: throughput-performance

    /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor has
       performance

    /usr/bin/lsb_release -d
       Rocky Linux release 8.8 (Green Obsidian)

    From /etc/*release* /etc/*version*
       centos-release: Rocky Linux release 8.8 (Green Obsidian)
       os-release:
          NAME="Rocky Linux"
          VERSION="8.8 (Green Obsidian)"
          ID="rocky"
          ID_LIKE="rhel centos fedora"
          VERSION_ID="8.8"
          PLATFORM_ID="platform:el8"
          PRETTY_NAME="Rocky Linux 8.8 (Green Obsidian)"
          ANSI_COLOR="0;32"
       redhat-release: Rocky Linux release 8.8 (Green Obsidian)
       rocky-release: Rocky Linux release 8.8 (Green Obsidian)
       rocky-release-upstream: Derived from Red Hat Enterprise Linux 8.8
       system-release: Rocky Linux release 8.8 (Green Obsidian)
       system-release-cpe: cpe:/o:rocky:rocky:8:GA

    uname -a:
       Linux ice3 4.18.0-477.15.1.el8_8.x86_64 #1 SMP Wed Jun 28 15:04:18 UTC 2023 x86_64
       x86_64 x86_64 GNU/Linux

    Kernel self-reported vulnerability status:

    CVE-2018-12207 (iTLB Multihit):                        Not affected
    CVE-2018-3620 (L1 Terminal Fault):                     Not affected
    Microarchitectural Data Sampling:                      Not affected
    CVE-2017-5754 (Meltdown):                              Not affected
    mmio_stale_data:                                       Mitigation: Clear CPU buffers; SMT vulnerable
    retbleed:                                              Not affected
    CVE-2018-3639 (Speculative Store Bypass):              Mitigation: Speculative Store Bypass disabled via prctl
    CVE-2017-5753 (Spectre variant 1):                     Mitigation: usercopy/swapgs barriers and __user pointer sanitization
    CVE-2017-5715 (Spectre variant 2):                     Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling, PBRSB-eIBRS: SW sequence
    CVE-2020-0543 (Special Register Buffer Data Sampling): Not affected
    CVE-2019-11135 (TSX Asynchronous Abort):               Not affected

    run-level 3 Sep 19 12:23

    SPEC is set to: /local/home/mcolgrove/ACCELV2
       Filesystem                 Type  Size  Used Avail Use% Mounted on
       /dev/mapper/rl_ice33-local xfs   930G  201G  729G  22% /local

    From /sys/devices/virtual/dmi/id
       Vendor:         Supermicro
       Product:        SYS-120GQ-TNRT
       Product Family: SMC X12

    Cannot run dmidecode; consider saying (as root)
       chmod +s /usr/sbin/dmidecode

    BIOS:
       BIOS Vendor:  American Megatrends International, LLC.
       BIOS Version: 1.4a
       BIOS Date:    10/11/2022

    (End of data from sysinfo program)


Compiler Version Notes
----------------------
==============================================================================
C       | 457.spC(base)
------------------------------------------------------------------------------
/usr/lib64/crt1.o: In function `_start':
(.text+0x24): undefined reference to `main'
pgacclnk: child process exit status 1: /usr/bin/ld

nvc Rel Dev-r239283 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
------------------------------------------------------------------------------

==============================================================================
C       | 403.stencil(base) 404.lbm(base) 452.ep(base) 470.bt(base)
------------------------------------------------------------------------------
nvc Rel Dev-r239283 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
------------------------------------------------------------------------------

==============================================================================
Fortran | 450.md(base) 455.seismic(base) 456.spF(base) 460.ilbdc(base)
        | 463.swim(base)
------------------------------------------------------------------------------
nvfortran Rel Dev-r239283 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
------------------------------------------------------------------------------

==============================================================================
Fortran, C | 453.clvrleaf(base) 459.miniGhost(base)
------------------------------------------------------------------------------
nvfortran Rel Dev-r239283 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
nvc Rel Dev-r239283 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
------------------------------------------------------------------------------


Base Compiler Invocation
------------------------
C benchmarks:
     nvc

Fortran benchmarks:
     nvfortran

Benchmarks using both Fortran and C:
     nvfortran nvc


Base Portability Flags
----------------------
 403.stencil: -DSPEC_NO_NOTHING
     457.spC: -mcmodel=medium -Wl,--no-relax


Base Optimization Flags
-----------------------
C benchmarks:
     -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

Fortran benchmarks:
     -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

Benchmarks using both Fortran and C:

  453.clvrleaf: -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

 459.miniGhost: -Mnomain -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia


Peak Optimization Flags
-----------------------
C benchmarks:

   403.stencil: basepeak = yes
       404.lbm: basepeak = yes
        452.ep: basepeak = yes
       457.spC: basepeak = yes
        470.bt: basepeak = yes

Fortran benchmarks:

        450.md: basepeak = yes
   455.seismic: basepeak = yes
       456.spF: basepeak = yes
     460.ilbdc: basepeak = yes
      463.swim: basepeak = yes

Benchmarks using both Fortran and C:

  453.clvrleaf: basepeak = yes
 459.miniGhost: basepeak = yes


The flags file that was used to format this result can be browsed at
http://www.spec.org/accel2023/flags/nv2023_flags_v2.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/accel2023/flags/nv2023_flags_v2.xml


SPECaccel is a registered trademark of the Standard Performance Evaluation
Corporation. All other brand and product names appearing in this result are
trademarks or registered trademarks of their respective holders.
-------------------------------------------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact info@spec.org.
Copyright 2023 Standard Performance Evaluation Corporation
Tested with SPECaccel2023 v2.0.17 on 2023-10-20 20:48:41-0400.
Report generated on 2023-12-06 13:07:26 by accel2023 ASCII formatter v112.
Originally published on 2023-11-08.
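
Appendix (illustrative, not part of the formatter output): the overall SPECaccel 2023_base value of 2.98 can be cross-checked from the twelve selected ("*") ratios in the results table. The sketch below assumes the overall metric is the geometric mean of those per-benchmark ratios, as is usual for SPEC suites. It uses the two-decimal ratios as printed, so it does not reproduce the official tools' internal precision or rounding rules. The names `SELECTED_RATIOS` and `geometric_mean` are hypothetical and chosen only for this example.

```python
import math

# Selected ("*") base ratios, copied from the results table of this report.
SELECTED_RATIOS = {
    "403.stencil": 1.92,
    "404.lbm": 3.39,
    "450.md": 1.88,
    "452.ep": 2.89,
    "453.clvrleaf": 5.89,
    "455.seismic": 3.32,
    "456.spF": 3.67,
    "457.spC": 2.95,
    "459.miniGhost": 1.94,
    "460.ilbdc": 2.63,
    "463.swim": 2.20,
    "470.bt": 5.91,
}


def geometric_mean(values):
    """Return the geometric mean of positive numbers via the mean of their logs."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: the overall score is the geometric mean of the selected ratios.
score = geometric_mean(SELECTED_RATIOS.values())
print(f"Recomputed SPECaccel 2023_base: {score:.2f}")
```

Because every benchmark in this result uses basepeak = yes, the same calculation applies to the peak metric.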