SPEC(R) MPIL2007 Summary
                          Hewlett Packard Enterprise
                                   SGI 8600
                       (Intel Xeon Gold 6148, 2.40 GHz)
                           Tue Oct 10 20:46:29 2017

MPI2007 License: 1                                       Test date: Oct-2017
Test sponsor: HPE                            Hardware availability: Jul-2017
Tested by:    HPE                            Software availability: Nov-2017

                Base     Base       Base        Peak     Peak       Peak
Benchmarks      Ranks  Run Time     Ratio       Ranks  Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
121.pop2         4352       22.3      175   S    4096       21.0      186   S  
121.pop2         4352       20.9      186   S    4096       20.5      190   *  
121.pop2         4352       21.1      185   *    4096       20.3      192   S  
122.tachyon      4352       28.2       69.0 S    4352       28.2       69.0 S  
122.tachyon      4352       27.9       69.6 S    4352       27.9       69.6 S  
122.tachyon      4352       28.2       69.0 *    4352       28.2       69.0 *  
125.RAxML        4352       47.2       61.9 *    5120       42.1       69.3 *  
125.RAxML        4352       47.0       62.1 S    5120       42.0       69.5 S  
125.RAxML        4352       47.3       61.7 S    5120       42.3       68.9 S  
126.lammps       4352       16.6      148   S    5120       14.7      168   S  
126.lammps       4352       16.3      151   *    5120       15.1      163   S  
126.lammps       4352       16.0      154   S    5120       14.8      166   *  
128.GAPgeofem    4352       48.9      121   *    4352       48.9      121   *  
128.GAPgeofem    4352       48.9      121   S    4352       48.9      121   S  
128.GAPgeofem    4352       48.6      122   S    4352       48.6      122   S  
129.tera_tf      4352       20.5       53.7 S    4096       20.4       53.9 S  
129.tera_tf      4352       20.4       54.0 *    4096       20.0       54.9 *  
129.tera_tf      4352       20.2       54.5 S    4096       19.9       55.2 S  
132.zeusmp2      4352       23.4       90.5 S    2048       20.4      104   S  
132.zeusmp2      4352       23.8       89.0 S    2048       20.8      102   S  
132.zeusmp2      4352       23.7       89.4 *    2048       20.7      102   *  
137.lu           4352       19.0      221   S    2048       15.7      268   S  
137.lu           4352       19.2      219   S    2048       15.6      270   *  
137.lu           4352       19.1      220   *    2048       15.4      273   S  
142.dmilc        4352       13.0      283   *    4352       13.0      283   *  
142.dmilc        4352       13.2      280   S    4352       13.2      280   S  
142.dmilc        4352       13.0      283   S    4352       13.0      283   S  
143.dleslie      4352       12.6      247   S    4864       11.5      269   S  
143.dleslie      4352       12.6      246   S    4864       11.8      262   S  
143.dleslie      4352       12.6      247   *    4864       11.6      267   *  
145.lGemsFDTD    4352       37.0      119   *    2048       30.3      146   *  
145.lGemsFDTD    4352       37.2      119   S    2048       30.3      146   S  
145.lGemsFDTD    4352       36.9      119   S    2048       30.4      145   S  
147.l2wrf2       4352       33.3      246   *    5120       31.8      258   S  
147.l2wrf2       4352       33.7      244   S    5120       31.8      258   *  
147.l2wrf2       4352       33.2      247   S    5120       31.9      257   S  
==============================================================================
121.pop2         4352       21.1      185   *    4096       20.5      190   *  
122.tachyon      4352       28.2       69.0 *    4352       28.2       69.0 *  
125.RAxML        4352       47.2       61.9 *    5120       42.1       69.3 *  
126.lammps       4352       16.3      151   *    5120       14.8      166   *  
128.GAPgeofem    4352       48.9      121   *    4352       48.9      121   *  
129.tera_tf      4352       20.4       54.0 *    4096       20.0       54.9 *  
132.zeusmp2      4352       23.7       89.4 *    2048       20.7      102   *  
137.lu           4352       19.1      220   *    2048       15.6      270   *  
142.dmilc        4352       13.0      283   *    4352       13.0      283   *  
143.dleslie      4352       12.6      247   *    4864       11.6      267   *  
145.lGemsFDTD    4352       37.0      119   *    2048       30.3      146   *  
147.l2wrf2       4352       33.3      246   *    5120       31.8      258   *  
 SPECmpiL_base2007                    133  
 SPECmpiL_peak2007                                                    144  
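
The starred rows are the median of the three runs of each benchmark, and
the two overall metrics are the geometric mean of those twelve median
ratios. The following is a minimal Python sketch of that arithmetic, not
the SPEC tools' own code. It uses the rounded ratios printed in the table
above, so it can be expected to match the reported metrics only after
rounding to three significant digits. The names geometric_mean, base and
peak are illustrative.

```python
import math
import statistics

# Ratios of the three 121.pop2 runs, copied from the table above.
pop2_base_runs = [175, 186, 185]
pop2_peak_runs = [186, 190, 192]

# The row marked "*" is the median of the three runs.
pop2_base_median = statistics.median(pop2_base_runs)
pop2_peak_median = statistics.median(pop2_peak_runs)

# Median ratios (rows marked "*") as printed in the summary table.
base = {
    "121.pop2": 185, "122.tachyon": 69.0, "125.RAxML": 61.9,
    "126.lammps": 151, "128.GAPgeofem": 121, "129.tera_tf": 54.0,
    "132.zeusmp2": 89.4, "137.lu": 220, "142.dmilc": 283,
    "143.dleslie": 247, "145.lGemsFDTD": 119, "147.l2wrf2": 246,
}
peak = {
    "121.pop2": 190, "122.tachyon": 69.0, "125.RAxML": 69.3,
    "126.lammps": 166, "128.GAPgeofem": 121, "129.tera_tf": 54.9,
    "132.zeusmp2": 102, "137.lu": 270, "142.dmilc": 283,
    "143.dleslie": 267, "145.lGemsFDTD": 146, "147.l2wrf2": 258,
}


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    # Averaging logarithms avoids overflow from multiplying many ratios.
    return math.exp(sum(math.log(v) for v in values) / len(values))


base_score = geometric_mean(base.values())
peak_score = geometric_mean(peak.values())

print(f"121.pop2 median ratios: base {pop2_base_median}, peak {pop2_peak_median}")
print(f"SPECmpiL_base2007 (from rounded ratios): {base_score:.1f}")
print(f"SPECmpiL_peak2007 (from rounded ratios): {peak_score:.1f}")
```

Rounded to whole numbers, the two results agree with the reported 133
and 144.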


                              BENCHMARK DETAILS
                              -----------------
      Type of System: Homogeneous
 Total Compute Nodes: 128
         Total Chips: 256
         Total Cores: 5120
       Total Threads: 10240
        Total Memory: 24 TB
      Base Ranks Run: 4352
  Minimum Peak Ranks: 2048
  Maximum Peak Ranks: 5120
          C Compiler: Intel C Composer XE for Linux,
                      Version 18.0.0.128 Build 20170811
        C++ Compiler: Intel C++ Composer XE for Linux,
                      Version 18.0.0.128 Build 20170811
    Fortran Compiler: Intel Fortran Composer XE for Linux,
                      Version 18.0.0.128 Build 20170811
       Base Pointers: 64-bit
       Peak Pointers: 64-bit
         MPI Library: HPE Performance Software - Message Passing
                      Interface 2.17
      Other MPI Info: OFED 3.2.2
      Pre-processors: None
      Other Software: None

                Node Description: HPE XA730i Gen10 Server Node
                ==============================================


                                   HARDWARE
                                   --------
     Number of nodes: 128
    Uses of the node: compute
              Vendor: Hewlett Packard Enterprise
               Model: SGI 8600 (Intel Xeon Gold 6148, 2.40 GHz)
            CPU Name: Intel Xeon Gold 6148
    CPU(s) orderable: 1-2 chips
       Chips enabled: 2
       Cores enabled: 40
      Cores per chip: 20
    Threads per core: 2
 CPU Characteristics: Intel Turbo Boost Technology up to 3.70 GHz
             CPU MHz: 2400
       Primary Cache: 32 KB I + 32 KB D on chip per core
     Secondary Cache: 1 MB I+D on chip per core
            L3 Cache: 27.5 MB I+D on chip per chip
         Other Cache: None
              Memory: 192 GB (12 x 16 GB 2Rx4 PC4-2666V-R)
      Disk Subsystem: None
      Other Hardware: None
             Adapter: Mellanox MT27700 with ConnectX-4 ASIC
  Number of Adapters: 2
           Slot Type: PCIe x16 Gen3 8GT/s
           Data Rate: InfiniBand 4X EDR
          Ports Used: 1
   Interconnect Type: InfiniBand
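
The system totals listed under BENCHMARK DETAILS follow from the per-node
figures above. The short Python sketch below cross-checks them. It assumes
that "24 TB" uses binary units (1 TB = 1024 GB) and, for the last line,
that base ranks are spread evenly over all 128 nodes; the report does not
state the per-node rank placement, so that figure is an inference. All
variable names are illustrative.

```python
# Per-node figures from the compute node hardware description.
nodes = 128
chips_per_node = 2
cores_per_chip = 20
threads_per_core = 2
memory_per_node_gb = 192

# Base rank count from the BENCHMARK DETAILS block.
base_ranks = 4352

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core

# Assumption: binary units, 1 TB = 1024 GB.
total_memory_tb = nodes * memory_per_node_gb / 1024

# Assumption: ranks are distributed evenly across all compute nodes.
ranks_per_node, leftover_ranks = divmod(base_ranks, nodes)

print(f"Total chips:   {total_chips}")
print(f"Total cores:   {total_cores}")
print(f"Total threads: {total_threads}")
print(f"Total memory:  {total_memory_tb:g} TB")
print(f"Base ranks per node (if evenly spread): {ranks_per_node}")
```

Under the even-spread assumption, the base runs place 34 ranks on each
40-core node, leaving some cores without a rank.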


                                   SOFTWARE
                                   --------
             Adapter: Mellanox MT27700 with ConnectX-4 ASIC
      Adapter Driver: OFED-3.4-2.1.8.0
    Adapter Firmware: 12.18.1000
    Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo),
                      Kernel 3.10.0-514.2.2.el7.x86_64
   Local File System: LFS
  Shared File System: LFS
        System State: Multi-user, run level 3
      Other Software: SGI Management Center Compute Node 3.5.0,
                      Build 716r171.rhel73-1705051353


                         Node Description: Lustre FS
                         ===========================


                                   HARDWARE
                                   --------
     Number of nodes: 4
    Uses of the node: fileserver
              Vendor: Hewlett Packard Enterprise
               Model: Rackable C1104-GP2 (Intel Xeon E5-2690 v3, 2.60 GHz)
            CPU Name: Intel Xeon E5-2690 v3
    CPU(s) orderable: 1-2 chips
       Chips enabled: 2
       Cores enabled: 24
      Cores per chip: 12
    Threads per core: 1
 CPU Characteristics: Intel Turbo Boost Technology up to 3.50 GHz
                      Hyper-Threading Technology disabled
             CPU MHz: 2600
       Primary Cache: 32 KB I + 32 KB D on chip per core
     Secondary Cache: 256 KB I+D on chip per core
            L3 Cache: 30 MB I+D on chip per chip
         Other Cache: None
              Memory: 128 GB (8 x 16 GB 2Rx4 PC4-2133P-R)
      Disk Subsystem: 684 TB RAID 6
                      48 x 8+2 2TB 7200 RPM
      Other Hardware: None
             Adapter: Mellanox MT27700 with ConnectX-4 ASIC
  Number of Adapters: 2
           Slot Type: PCIe x16 Gen3
           Data Rate: InfiniBand 4X EDR
          Ports Used: 1
   Interconnect Type: InfiniBand


                                   SOFTWARE
                                   --------
             Adapter: Mellanox MT27700 with ConnectX-4 ASIC
      Adapter Driver: OFED-3.3-1.0.0.0
    Adapter Firmware: 12.14.2036
    Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo),
                      Kernel 3.10.0-514.2.2.el7.x86_64
   Local File System: ext3
  Shared File System: LFS
        System State: Multi-user, run level 3
      Other Software: None


              Interconnect Description: InfiniBand (MPI and I/O)
              ==================================================


                                   HARDWARE
                                   --------
              Vendor: Mellanox Technologies and SGI
               Model: SGI P0002145
        Switch Model: SGI P0002145
  Number of Switches: 30
     Number of Ports: 36
           Data Rate: InfiniBand 4X EDR
            Firmware: 11.0350.0394
            Topology: Enhanced Hypercube
         Primary Use: MPI and I/O traffic


                              Base Tuning Notes
                              -----------------
    src.alt used: 143.dleslie->integer_overflow

                                 Submit Notes
                                 ------------
    The config file option 'submit' was used.

                                General Notes
                                -------------
    
    
     Software environment:
       export MPI_REQUEST_MAX=65536
       export MPI_TYPE_MAX=32768
       export MPI_IB_RAILS=2
       export MPI_IB_IMM_UPGRADE=false
       export MPI_CONNECTIONS_THRESHOLD=0
       export MPI_IB_DCIS=2
       export MPI_IB_HYPER_LAZY=false
       ulimit -s unlimited
    
     BIOS settings:
       AMI BIOS version SAED7177, 07/17/2017
    
     Job Placement:
       Each MPI job was assigned to a topologically compact set
       of nodes.
    
     Additional notes regarding interconnect:
       The InfiniBand network consists of two independent planes,
       with half the switches in the system allocated to each plane.
       I/O traffic is restricted to one plane, while MPI traffic can
       use both planes.

                           Base Compiler Invocation
                           ------------------------
C benchmarks: 
     icc

C++ benchmarks:

    126.lammps: icpc

Fortran benchmarks: 
     ifort

Benchmarks using both Fortran and C: 
     icc ifort


                            Base Portability Flags
                            ----------------------
      121.pop2: -DSPEC_MPI_CASE_FLAG


                           Base Optimization Flags
                           -----------------------
C benchmarks: 
     -O3 -xCORE-AVX512 -no-prec-div -ipo

C++ benchmarks:

    126.lammps: -O3 -xCORE-AVX512 -no-prec-div -ansi-alias -ipo

Fortran benchmarks: 
     -O3 -xCORE-AVX512 -no-prec-div -ipo

Benchmarks using both Fortran and C: 
     -O3 -xCORE-AVX512 -no-prec-div -ipo


                               Base Other Flags
                               ----------------
C benchmarks: 
     -lmpi

C++ benchmarks:

    126.lammps: -lmpi

Fortran benchmarks: 
     -lmpi

Benchmarks using both Fortran and C: 
     -lmpi


                           Peak Compiler Invocation
                           ------------------------
C benchmarks (except as noted below): 
     icc

     125.RAxML: /sw/sdev/intel/parallel_studio_xe_2017_update4/compilers_and_libraries_2017.4.196/linux/bin/intel64/icc

C++ benchmarks:

    126.lammps: icpc

Fortran benchmarks (except as noted below): 
     ifort

   143.dleslie: /sw/sdev/intel/parallel_studio_xe_2017_update4/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort

Benchmarks using both Fortran and C: 
     icc ifort


                            Peak Portability Flags
                            ----------------------
Same as Base Portability Flags


                           Peak Optimization Flags
                           -----------------------
C benchmarks:

   122.tachyon: basepeak = yes

     125.RAxML: -O3 -xCORE-AVX512 -no-prec-div -ipo

     142.dmilc: basepeak = yes

C++ benchmarks:

    126.lammps: -O3 -xCORE-AVX512 -no-prec-div -ansi-alias -ipo

Fortran benchmarks: 
     -O3 -xCORE-AVX512 -no-prec-div -ipo

Benchmarks using both Fortran and C:

      121.pop2: -O3 -xCORE-AVX512 -no-prec-div -ipo

 128.GAPgeofem: basepeak = yes

   132.zeusmp2: Same as 121.pop2

    147.l2wrf2: Same as 121.pop2


                               Peak Other Flags
                               ----------------
Same as Base Other Flags


The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/HPE_x86_64_Intel18_flags.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/HPE_x86_64_Intel18_flags.xml

    SPEC and SPEC MPI are registered trademarks of the Standard
    Performance Evaluation Corporation.  All other brand and product names
    appearing in this result are trademarks or registered trademarks of
    their respective holders.
-----------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact webmaster@spec.org.
Copyright 2006-2010 Standard Performance Evaluation Corporation
Tested with SPEC MPI2007 v2.0.1.
Report generated on Wed Oct 25 17:12:10 2017 by MPI2007 ASCII formatter v1463.
Originally published on 25 October 2017.