SPEC® ACCEL_OCL Result

Copyright 2014-2015 Standard Performance Evaluation Corporation

ASUS (Test Sponsor: NVIDIA Corporation)

NVIDIA Tesla K40c

ASUS P9X79 Motherboard

SPECaccel_ocl_base = 1.98

SPECaccel_ocl_energy_base = 2.56

SPECaccel_ocl_peak = 2.29

SPECaccel_ocl_energy_peak = 2.99

ACCEL license: 019
Test date: Feb-2014
Test sponsor: NVIDIA Corporation
Hardware Availability: Nov-2013
Tested by: NVIDIA Corporation
Software Availability: Feb-2014
Hardware
CPU Name: Intel Core i7-3930K
CPU Characteristics:
CPU MHz: 3200
CPU MHz Maximum: 3800
FPU: Integrated
CPU(s) enabled: 6 cores, 1 chip, 6 cores/chip, 2 threads/core
CPU(s) orderable: 1 chip
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 8 GB (2 x 4 GB 2Rx4 PC3-14900R-9, running at 1600 MHz)
Disk Subsystem: 1000 GB Seagate ST1000DM003 7200 RPM SATA
Other Hardware: None
Accelerator
Accel Model Name: Tesla K40c
Accel Vendor: NVIDIA
Accel Name: NVIDIA Tesla K40c
Type of Accel: GPU
Accel Connection: PCIe 3.0 16x
Does Accel Use ECC: No
Accel Description: See Notes
Accel Driver: NVIDIA UNIX x86_64 Kernel Module 319.60
Software
Operating System: Red Hat Enterprise Linux Server release 6.4 (Santiago) 2.6.32-358.el6.x86_64
Compiler: PGI Accelerator Server Complete, Release 14.2
File System: ext4
System State: Run level 3 (multi-user)
Other Software: CUDA 5.5 SDK
Power
Power Supply: 1200 W
Power Supply Details: Thermaltake SMART M1200W
Max. Power (W): 314.79
Idle Power (W): 110.34
Min. Temperature (C): 25.94
Power Analyzer
Power Analyzer: Power Analyzer
Hardware Vendor: Xitron Technologies, Inc.
Model: 2801
Serial Number: 28011109005
Input Connection: RS232 via USB-adapter
Metrology Institute: NIST
Calibration By: Micro Precision Calibration, Inc.
Calibration Label: 220081222038459
Calibration Date: 02.20.2014
PTDaemon Version: 1.6.2 (372e138a; 2013-12-04)
Setup Description: connected to the single power supply that powers the system
Current Ranges Used: 2.0A
Voltage Range Used: 135V
Temperature Meter
Temperature Meter: Temperature Meter
Hardware Vendor: Digi
Model: DigiWATCHPORT_H
Serial Number: WS34682143
Input Connection: USB
PTDaemon Version: 1.6.2 (372e138a; 2013-12-04)
Setup Description: Position 5mm above intake fan

Base Results Table

Each row lists the benchmark name followed by the same six columns for each of the three runs:
Benchmark  Seconds  Ratio  Energy (kJ)  Maximum Power (W)  Average Power (W)  Energy Ratio
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
101.tpacf 65.9  1.62  14.6   237   222   2.23  65.8  1.63  14.9   241   227   2.19  66.2  1.62  15.0   240   226   2.18 
103.stencil 57.0  2.19  16.1   292   282   2.81  58.0  2.16  16.4   292   282   2.77  58.0  2.16  16.3   292   282   2.77 
104.lbm 39.5  2.84  11.0   287   278   3.51  39.5  2.84  11.2   289   284   3.43  39.5  2.84  11.0   289   279   3.50 
110.fft 63.6  1.75  19.4   314   305   2.11  63.3  1.75  19.4   315   306   2.12  63.2  1.76  19.6   315   310   2.09 
112.spmv 76.4  1.92  20.9   289   273   2.52  76.1  1.93  20.7   290   272   2.54  76.4  1.92  20.8   290   273   2.53 
114.mriq 31.4  3.47  8.18  270   260   4.41  31.5  3.46  8.19  270   260   4.41  31.5  3.47  8.35  270   265   4.32 
116.histo 78.0  1.46  15.4   215   198   2.02  72.2  1.58  14.4   216   199   2.17  77.8  1.47  15.4   216   198   2.02 
117.bfs 52.9  2.21  13.0   263   245   2.93  52.7  2.22  12.9   263   245   2.95  52.8  2.21  12.9   263   245   2.94 
118.cutcp 32.8  3.02  8.59  273   262   3.86  32.8  3.01  8.59  275   262   3.86  32.7  3.02  8.57  273   262   3.87 
120.kmeans 88.5  1.13  17.5   207   198   1.54  87.6  1.14  17.3   205   198   1.56  86.2  1.16  17.1   206   199   1.58 
121.lavamd 58.8  1.85  17.2   308   292   2.30  58.7  1.86  17.1   308   292   2.30  58.6  1.86  17.1   308   292   2.30 
122.cfd 70.7  1.78  18.3   268   259   2.35  70.5  1.79  18.3   267   260   2.35  70.4  1.79  18.5   268   262   2.33 
123.nw 65.0  1.77  14.8   235   227   2.44  65.0  1.77  14.8   235   228   2.44  65.0  1.77  14.8   235   228   2.44 
124.hotspot 37.1  3.07  10.5   300   284   3.58  37.1  3.08  10.6   301   285   3.57  37.1  3.07  10.6   301   287   3.55 
125.lud 79.4  1.50  22.1   290   278   1.99  79.3  1.50  22.3   289   281   1.97  79.3  1.50  22.2   289   279   1.98 
126.ge 52.0  2.98  13.7   276   263   3.92  52.0  2.98  13.8   276   265   3.89  52.0  2.98  13.8   277   266   3.88 
127.srad 53.4  2.14  15.2   295   285   2.62  53.4  2.14  15.2   295   284   2.63  53.4  2.14  15.2   295   286   2.62 
128.heartwall 87.6  1.21  21.6   256   247   1.67  86.0  1.23  21.7   273   252   1.66  86.0  1.23  21.4   258   249   1.68 
140.bplustree 67.9  1.59  16.8   262   247   2.11  68.0  1.59  16.9   256   249   2.09  67.9  1.59  16.8   257   247   2.11 
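
The overall SPECaccel_ocl_base figure can be cross-checked against the Ratio columns above. The sketch below assumes the usual SPEC convention that the overall metric is the geometric mean of each benchmark's median ratio; the names BASE_RATIOS and overall_ratio are illustrative and are not part of any SPEC tool. Because the table shows ratios rounded to three significant digits, the result is only an approximation of the official calculation.

```python
from statistics import geometric_mean, median

# Ratio column for the three base runs of each benchmark,
# copied from the Base Results Table.
BASE_RATIOS = {
    "101.tpacf": (1.62, 1.63, 1.62),
    "103.stencil": (2.19, 2.16, 2.16),
    "104.lbm": (2.84, 2.84, 2.84),
    "110.fft": (1.75, 1.75, 1.76),
    "112.spmv": (1.92, 1.93, 1.92),
    "114.mriq": (3.47, 3.46, 3.47),
    "116.histo": (1.46, 1.58, 1.47),
    "117.bfs": (2.21, 2.22, 2.21),
    "118.cutcp": (3.02, 3.01, 3.02),
    "120.kmeans": (1.13, 1.14, 1.16),
    "121.lavamd": (1.85, 1.86, 1.86),
    "122.cfd": (1.78, 1.79, 1.79),
    "123.nw": (1.77, 1.77, 1.77),
    "124.hotspot": (3.07, 3.08, 3.07),
    "125.lud": (1.50, 1.50, 1.50),
    "126.ge": (2.98, 2.98, 2.98),
    "127.srad": (2.14, 2.14, 2.14),
    "128.heartwall": (1.21, 1.23, 1.23),
    "140.bplustree": (1.59, 1.59, 1.59),
}


def overall_ratio(ratios_by_benchmark):
    """Geometric mean of per-benchmark median ratios (assumed convention)."""
    medians = [median(runs) for runs in ratios_by_benchmark.values()]
    return geometric_mean(medians)


base_score = overall_ratio(BASE_RATIOS)
print(f"{base_score:.2f}")  # prints 1.98, matching SPECaccel_ocl_base
```

The same function could be applied to the ratios in the Peak Results Table; no expected value is given here for that case.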

Peak Results Table

Each row lists the benchmark name followed by the same six columns for each of the three runs:
Benchmark  Seconds  Ratio  Energy (kJ)  Maximum Power (W)  Average Power (W)  Energy Ratio
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
101.tpacf 59.4   1.80  13.6   240   230   2.39  59.3   1.80  13.6   240   229   2.40  58.9   1.82  13.5   240   228   2.43 
103.stencil 57.0   2.19  16.1   292   282   2.81  58.0   2.16  16.4   292   282   2.77  58.0   2.16  16.3   292   282   2.77 
104.lbm 33.5   3.34  9.29  287   277   4.15  33.5   3.34  9.30  287   278   4.14  33.5   3.34  9.29  286   277   4.15 
110.fft 63.6   1.75  19.4   314   305   2.11  63.3   1.75  19.4   315   306   2.12  63.2   1.76  19.6   315   310   2.09 
112.spmv 73.0   2.01  20.0   293   274   2.63  73.0   2.02  20.0   298   274   2.63  73.1   2.01  20.0   292   274   2.63 
114.mriq 31.4   3.47  8.18  270   260   4.41  31.5   3.46  8.19  270   260   4.41  31.5   3.47  8.35  270   265   4.32 
116.histo 78.0   1.46  15.4   215   198   2.02  72.2   1.58  14.4   216   199   2.17  77.8   1.47  15.4   216   198   2.02 
117.bfs 38.5   3.04  9.29  255   241   4.09  38.5   3.04  9.27  258   241   4.10  38.4   3.05  9.23  258   241   4.12 
118.cutcp 32.8   3.02  8.59  273   262   3.86  32.8   3.01  8.59  275   262   3.86  32.7   3.02  8.57  273   262   3.87 
120.kmeans 84.4   1.18  16.8   206   199   1.61  82.2   1.22  16.4   206   199   1.65  83.1   1.20  16.5   206   199   1.63 
121.lavamd 58.8   1.85  17.2   308   292   2.30  58.7   1.86  17.1   308   292   2.30  58.6   1.86  17.1   308   292   2.30 
122.cfd 69.0   1.83  17.9   268   260   2.40  68.9   1.83  17.9   268   260   2.41  69.0   1.83  17.9   268   259   2.41 
123.nw 65.0   1.77  14.8   235   227   2.44  65.0   1.77  14.8   235   228   2.44  65.0   1.77  14.8   235   228   2.44 
124.hotspot 37.1   3.07  10.5   300   284   3.58  37.1   3.08  10.6   301   285   3.57  37.1   3.07  10.6   301   287   3.55 
125.lud 71.7   1.66  18.6   270   259   2.36  71.5   1.66  18.6   270   260   2.36  72.0   1.65  18.6   275   258   2.36 
126.ge 7.26  21.4   1.76  281   242   30.6   7.23  21.4   1.77  280   245   30.3   7.25  21.4   1.77  281   244   30.4  
127.srad 53.4   2.14  15.2   295   285   2.62  53.4   2.14  15.2   295   284   2.63  53.4   2.14  15.2   295   286   2.62 
128.heartwall 87.6   1.21  21.6   256   247   1.67  86.0   1.23  21.7   273   252   1.66  86.0   1.23  21.4   258   249   1.68 
140.bplustree 67.9   1.59  16.8   262   247   2.11  68.0   1.59  16.9   256   249   2.09  67.9   1.59  16.8   257   247   2.11 
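
The Energy (kJ) column in the tables above is consistent with average power multiplied by run time. The sketch below checks that relationship for a few first-run entries; the helper energy_kj and the SAMPLES selection are illustrative only, and small differences are expected because the reported seconds and power values are rounded.

```python
def energy_kj(seconds, average_power_w):
    """Energy in kilojoules from run time (s) and average power (W)."""
    return seconds * average_power_w / 1000.0


# (seconds, average power in W, reported energy in kJ),
# taken from the first run of each row in the results tables.
SAMPLES = {
    "101.tpacf (base)": (65.9, 222, 14.6),
    "126.ge (base)": (52.0, 263, 13.7),
    "126.ge (peak)": (7.26, 242, 1.76),
}

for name, (seconds, avg_w, reported_kj) in SAMPLES.items():
    computed = energy_kj(seconds, avg_w)
    print(f"{name}: computed {computed:.2f} kJ, reported {reported_kj} kJ")
```

For 126.ge, the peak run draws similar average power to the base run but finishes in roughly one seventh of the time, which is why its energy figure drops so sharply.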

Platform Notes

 Sysinfo program /local/home/SPECACCEL/Docs/sysinfo
 $Rev: 6874 $ $Date:: 2013-11-20 #$ 0953404ef7e75a5f9bbb534c6de3f831
 running on sbe02 Sat Feb 22 21:44:42 2014

 This section contains SUT (System Under Test) info as seen by
 some common utilities.  To remove or add to this section, see:
   http://www.spec.org/accel/Docs/config.html#sysinfo

 From /proc/cpuinfo
    model name : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
       1 "physical id"s (chips)
       12 "processors"
    cores, siblings (Caution: counting these is hw and system dependent.  The
    following excerpts from /proc/cpuinfo might not be reliable.  Use with
    caution.)
       cpu cores : 6
       siblings  : 12
       physical 0: cores 0 1 2 3 4 5
    cache size : 12288 KB

 From /proc/meminfo
    MemTotal:        8130700 kB
    HugePages_Total:       0
    Hugepagesize:       2048 kB

 /usr/bin/lsb_release -d
    Red Hat Enterprise Linux Server release 6.4 (Santiago)

 From /etc/*release* /etc/*version*
    redhat-release: Red Hat Enterprise Linux Server release 6.4 (Santiago)
    system-release: Red Hat Enterprise Linux Server release 6.4 (Santiago)
    system-release-cpe: cpe:/o:redhat:enterprise_linux:6server:ga:server

 uname -a:
    Linux sbe02 2.6.32-358.el6.x86_64 #1 SMP Tue Jan 29 11:47:41 EST 2013 x86_64
    x86_64 x86_64 GNU/Linux

 run-level 3 Feb 22 13:29

 SPEC is set to: /local/home/SPECACCEL
    Filesystem    Type    Size  Used Avail Use% Mounted on
    /dev/mapper/VolGroup-lv_home
                  ext4    860G   52G  765G   7% /local
 Additional information from dmidecode:

    Warning: Use caution when you interpret this section. The 'dmidecode' program
    reads system data which is "intended to allow hardware to be accurately
    determined", but the intent may not be met, as there are frequent changes to
    hardware, firmware, and the "DMTF SMBIOS" standard.


 (End of data from sysinfo program)
 Information from pgaccelinfo
 CUDA Driver Version:           5050
 NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  319.60
                                Wed Sep 25 14:28:26 PDT 2013
 Device Number:                 0
 Device Name:                   Tesla K40c
 Device Revision Number:        3.5
 Global Memory Size:            12079136768
 Number of Multiprocessors:     15
 Number of SP Cores:            2880
 Number of DP Cores:            960
 Concurrent Copy and Execution: Yes
 Total Constant Memory:         65536
 Total Shared Memory per Block: 49152
 Registers per Block:           65536
 Warp Size:                     32
 Maximum Threads per Block:     1024
 Maximum Block Dimensions:      1024, 1024, 64
 Maximum Grid Dimensions:       2147483647 x 65535 x 65535
 Maximum Memory Pitch:          2147483647B
 Texture Alignment:             512B
 Clock Rate:                    745 MHz
 Max. Clock Rate:               875 MHz
 Execution Timeout:             No
 Integrated Device:             No
 Can Map Host Memory:           Yes
 Compute Mode:                  default
 Concurrent Kernels:            Yes
 ECC Enabled:                   Yes
 Memory Clock Rate:             3004 MHz
 Memory Bus Width:              384 bits
 L2 Cache Size:                 1572864 bytes
 Max Threads Per SMP:           2048
 Async Engines:                 2
 Unified Addressing:            Yes

General Notes

 ECC disabled using the command "nvidia-smi -e 0"
 Kit-built system using a Cooler Master HAF X case

Base Runtime Environment

C benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.1 CUDA 4.2.1   OpenCL Device #0: Tesla K40c, v 319.60 

C++ benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.1 CUDA 4.2.1   OpenCL Device #0: Tesla K40c, v 319.60 

Base Compiler Invocation

C benchmarks:

 pgcc 

C++ benchmarks:

 pgc++ 

Base Portability Flags

118.cutcp:  -D__GNUC__ 

Base Optimization Flags

C benchmarks:

 -fast   -Mfprelaxed 

C++ benchmarks:

 -fast   -Mfprelaxed 

Base Other Flags

C benchmarks:

 -I/opt/cuda-5.5/include/    -lOpenCL 

C++ benchmarks:

 -I/opt/cuda-5.5/include/    -lOpenCL 

Peak Runtime Environment

C benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.1 CUDA 4.2.1   OpenCL Device #0: Tesla K40c, v 319.60 

C++ benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.1 CUDA 4.2.1   OpenCL Device #0: Tesla K40c, v 319.60 

Peak Compiler Invocation

C benchmarks:

 pgcc 

C++ benchmarks:

 pgc++ 

Peak Portability Flags

118.cutcp:  -D__GNUC__ 

Peak Optimization Flags

C benchmarks:

110.fft:  basepeak = yes 
114.mriq:  basepeak = yes 
116.histo:  basepeak = yes 
117.bfs:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=64   -DSPEC_ACCEL_WG_SIZE_1_0=64 
118.cutcp:  basepeak = yes 
121.lavamd:  basepeak = yes 
124.hotspot:  basepeak = yes 
127.srad:  basepeak = yes 
128.heartwall:  basepeak = yes 
140.bplustree:  basepeak = yes 

C++ benchmarks:

101.tpacf:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=1024 
103.stencil:  basepeak = yes 
104.lbm:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=32   -DSPEC_ACCEL_WG_SIZE_0_1=1   -DSPEC_ACCEL_WG_SIZE_0_2=1 
112.spmv:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=96 
120.kmeans:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=288 
122.cfd:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_3_0=288 
123.nw:  basepeak = yes 
125.lud:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=32 
126.ge:  -fast   -Mfprelaxed   -DSPEC_ACCEL_WG_SIZE_0_0=512   -DSPEC_ACCEL_WG_SIZE_1_0=1   -DSPEC_ACCEL_WG_SIZE_1_1=512 

Peak Other Flags

C benchmarks:

 -I/opt/cuda-5.5/include/    -lOpenCL 

C++ benchmarks:

 -I/opt/cuda-5.5/include/    -lOpenCL 

The flags file that was used to format this result can be browsed at
http://www.spec.org/accel/flags/pgi2014_flags.20150303.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/accel/flags/pgi2014_flags.20150303.xml.