SPEC® ACCEL™ OCL Result

Copyright 2015-2017 Standard Performance Evaluation Corporation

Supermicro (Test Sponsor: HZDR)

NVIDIA Tesla K20m

Supermicro X9DRG-HF

SPECaccel_ocl_base = 1.70

SPECaccel_ocl_peak = 1.89

ACCEL license: 65A
Test date: Aug-2017
Test sponsor: HZDR
Hardware Availability: Jan-2013
Tested by: HZDR
Software Availability: Aug-2016
Benchmark results graph
Hardware
CPU Name: Intel Xeon E5-2609
CPU Characteristics: No TURBO
CPU MHz: 2400
CPU MHz Maximum: 2400
FPU: Integrated
CPU(s) enabled: 8 cores, 2 chips, 4 cores/chip
CPU(s) orderable: 1,2 chips
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 10 MB I+D on chip per chip
Other Cache: None
Memory: 64 GB (8 x 8 GB 2Rx4 PC3-12800R-11, ECC, running at 1066 MHz)
Disk Subsystem: 60 GB INTEL SSDSC2CW060A3
Other Hardware: None
Accelerator
Accel Model Name: Tesla K20m
Accel Vendor: NVIDIA
Accel Name: NVIDIA Tesla K20m
Type of Accel: GPU
Accel Connection: PCIe 2.0 16x
Does Accel Use ECC: yes
Accel Description: NVIDIA Tesla K20m, 2688 CUDA cores, 732 MHz, 6 GB GDDR5 RAM (Kepler Generation)
Accel Driver: NVIDIA UNIX x86_64 Kernel Module 367.48
Software
Operating System: Ubuntu 14.04.5 LTS, kernel 4.4.0-38-generic
Compiler: GNU Compiler C/C++ Version 6.2.0
File System: ext3
System State: Run level 5 (user-level)
Other Software: NVIDIA Cuda SDK 7.0, driver version 367.48

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

               --------------------- Base ---------------------   --------------------- Peak ---------------------
Benchmark      Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio
101.tpacf         80.3   1.33      80.2   1.33      80.4   1.33      81.1   1.32      81.4   1.31      81.3   1.32
103.stencil       69.1   1.81      69.1   1.81      69.3   1.80      69.1   1.81      69.1   1.81      69.3   1.80
104.lbm           54.6   2.05      54.1   2.07      54.4   2.06      43.1   2.60      43.0   2.61      43.2   2.59
110.fft           42.2   2.63      41.9   2.65      42.3   2.62      42.2   2.63      41.9   2.65      42.3   2.62
112.spmv          89.6   1.64      89.5   1.64      89.5   1.64      89.5   1.64      89.5   1.64      89.1   1.65
114.mriq          26.7   4.08      22.5   4.84      26.8   4.07      26.7   4.08      22.5   4.84      26.8   4.07
116.histo          100   1.14       112   1.02      99.7   1.14       100   1.14       112   1.02      99.7   1.14
117.bfs           70.3   1.66      71.3   1.64      71.3   1.64      54.8   2.13      55.1   2.12      54.3   2.16
118.cutcp         46.1   2.15      46.3   2.14      44.5   2.23      46.1   2.15      46.3   2.14      44.5   2.23
120.kmeans        93.1   1.07      93.6   1.07      93.1   1.07      92.7   1.08      88.8   1.13      93.0   1.08
121.lavamd        22.7   4.80      23.4   4.65      23.0   4.75      22.7   4.80      23.4   4.65      23.0   4.75
122.cfd           72.4   1.74      73.2   1.72      72.4   1.74      72.2   1.75      73.8   1.71      73.7   1.71
123.nw            84.4   1.36      84.4   1.36      84.5   1.36      84.4   1.36      84.4   1.36      84.5   1.36
124.hotspot       51.2   2.23      51.0   2.24      51.0   2.23      51.2   2.23      51.0   2.24      51.0   2.23
125.lud            118   1.01       116   1.02       116   1.02       105   1.13       104   1.15       105   1.13
126.ge            56.8   2.73      56.4   2.75      56.4   2.75      12.9   12.0      12.9   12.1      12.9   12.0
127.srad          78.6   1.45      78.5   1.45      78.6   1.45      78.6   1.45      78.5   1.45      78.6   1.45
128.heartwall      158  0.670       158  0.670       158  0.671       158  0.670       158  0.670       158  0.671
140.bplustree      117  0.923       117  0.921       117  0.921       117  0.923       117  0.921       117  0.921
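
The two headline metrics can be cross-checked against the table. The sketch below assumes, following SPEC's usual convention, that each overall score is the geometric mean of the per-benchmark median ratios. The ratios are transcribed from the table above; `RATIOS` and `overall_score` are illustrative names, not part of any SPEC tool.

```python
from statistics import geometric_mean, median

# Per benchmark: (base ratios for the three runs, peak ratios for the three runs),
# transcribed from the results table.
RATIOS = {
    "101.tpacf":     ((1.33, 1.33, 1.33), (1.32, 1.31, 1.32)),
    "103.stencil":   ((1.81, 1.81, 1.80), (1.81, 1.81, 1.80)),
    "104.lbm":       ((2.05, 2.07, 2.06), (2.60, 2.61, 2.59)),
    "110.fft":       ((2.63, 2.65, 2.62), (2.63, 2.65, 2.62)),
    "112.spmv":      ((1.64, 1.64, 1.64), (1.64, 1.64, 1.65)),
    "114.mriq":      ((4.08, 4.84, 4.07), (4.08, 4.84, 4.07)),
    "116.histo":     ((1.14, 1.02, 1.14), (1.14, 1.02, 1.14)),
    "117.bfs":       ((1.66, 1.64, 1.64), (2.13, 2.12, 2.16)),
    "118.cutcp":     ((2.15, 2.14, 2.23), (2.15, 2.14, 2.23)),
    "120.kmeans":    ((1.07, 1.07, 1.07), (1.08, 1.13, 1.08)),
    "121.lavamd":    ((4.80, 4.65, 4.75), (4.80, 4.65, 4.75)),
    "122.cfd":       ((1.74, 1.72, 1.74), (1.75, 1.71, 1.71)),
    "123.nw":        ((1.36, 1.36, 1.36), (1.36, 1.36, 1.36)),
    "124.hotspot":   ((2.23, 2.24, 2.23), (2.23, 2.24, 2.23)),
    "125.lud":       ((1.01, 1.02, 1.02), (1.13, 1.15, 1.13)),
    "126.ge":        ((2.73, 2.75, 2.75), (12.0, 12.1, 12.0)),
    "127.srad":      ((1.45, 1.45, 1.45), (1.45, 1.45, 1.45)),
    "128.heartwall": ((0.670, 0.670, 0.671), (0.670, 0.670, 0.671)),
    "140.bplustree": ((0.923, 0.921, 0.921), (0.923, 0.921, 0.921)),
}


def overall_score(ratios, column):
    """Geometric mean of the median ratio per benchmark (column 0 = base, 1 = peak)."""
    medians = [median(runs[column]) for runs in ratios.values()]
    return geometric_mean(medians)


base_score = overall_score(RATIOS, 0)
peak_score = overall_score(RATIOS, 1)
print(f"base ~ {base_score:.2f}, peak ~ {peak_score:.2f}")
```

Because the inputs are the rounded ratios printed in the table, the results should land very close to the reported SPECaccel_ocl_base = 1.70 and SPECaccel_ocl_peak = 1.89, though not necessarily identical in later digits.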

Platform Notes

 Sysinfo program /tmp/spec/1.2/Docs/sysinfo
 $Rev: 6965 $ $Date:: 2015-04-21 #$ c05a7f14b1b1765e3fe1df68447e8a35
 running on kepler002 Thu Aug 24 13:13:30 2017

 This section contains SUT (System Under Test) info as seen by
 some common utilities.  To remove or add to this section, see:
   http://www.spec.org/accel/Docs/config.html#sysinfo

 From /proc/cpuinfo
    model name : Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
       2 "physical id"s (chips)
       8 "processors"
    cores, siblings (Caution: counting these is hw and system dependent.  The
    following excerpts from /proc/cpuinfo might not be reliable.  Use with
    caution.)
       cpu cores : 4
       siblings  : 4
       physical 0: cores 0 1 2 3
       physical 1: cores 0 1 2 3
    cache size : 10240 KB

 From /proc/meminfo
    MemTotal:       65949360 kB
    HugePages_Total:       0
    Hugepagesize:       2048 kB

 /usr/bin/lsb_release -d
    Ubuntu 14.04.5 LTS

 From /etc/*release* /etc/*version*
    debian_version: jessie/sid
    os-release:
       NAME="Ubuntu"
       VERSION="14.04.5 LTS, Trusty Tahr"
       ID=ubuntu
       ID_LIKE=debian
       PRETTY_NAME="Ubuntu 14.04.5 LTS"
       VERSION_ID="14.04"
       HOME_URL="http://www.ubuntu.com/"
       SUPPORT_URL="http://help.ubuntu.com/"
    redhat-release: Red Hat Enterprise Linux Server release 6.5 (Santiago)
    rh-release: Red Hat Enterprise Linux Server release 7.2 (Maipo)

 uname -a:
    Linux kepler002 4.4.0-38-generic #57~14.04.1-Ubuntu SMP Tue Sep 6 17:20:43
    UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

 run-level 5 Jan 23 15:07

 SPEC is set to: /tmp/spec/1.2
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext3   30G   14G   15G  47% /

 Cannot run dmidecode; consider saying 'chmod +s /usr/sbin/dmidecode'

 (End of data from sysinfo program)
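
The chip and processor counts in the /proc/cpuinfo summary above can be reproduced with a few lines of code. This is a minimal sketch, not the SPEC sysinfo program itself: `summarize_cpuinfo` and `SAMPLE` are hypothetical names, and the sample text is made up to mirror the reported layout of 2 chips with 4 cores each.

```python
def summarize_cpuinfo(text):
    """Return (chip_count, logical_cpu_count) from /proc/cpuinfo-style text."""
    chips = set()
    logical_cpus = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor stanzas
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical_cpus += 1      # one stanza per logical CPU
        elif key == "physical id":
            chips.add(value)       # distinct ids correspond to distinct chips
    return len(chips), logical_cpus


# Hypothetical input mirroring the reported system: 8 processors on 2 chips.
SAMPLE = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: {cpu // 4}\ncpu cores\t: 4"
    for cpu in range(8)
)

print(summarize_cpuinfo(SAMPLE))
```

On a Linux host the same function can be fed the real file, for example `summarize_cpuinfo(open("/proc/cpuinfo").read())`. As the sysinfo output cautions, such counts are hardware and system dependent.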

Base Runtime Environment

C benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.2 CUDA 8.0.44
 OpenCL Device #0: Tesla K20m, v 367.48

C++ benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.2 CUDA 8.0.44
 OpenCL Device #0: Tesla K20m, v 367.48

Base Compiler Invocation

C benchmarks:

 gcc 

C++ benchmarks:

 g++ 

Base Portability Flags

116.histo:  -DSPEC_LOCAL_MEMORY_HEADROOM=2 
122.cfd:  -std=gnu++98 

Base Optimization Flags

C benchmarks:

 -O2   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 

C++ benchmarks:

 -O2   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 

Peak Runtime Environment

C benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.2 CUDA 8.0.44
 OpenCL Device #0: Tesla K20m, v 367.48

C++ benchmarks:

 OpenCL Platform: NVIDIA CUDA, OpenCL 1.2 CUDA 8.0.44
 OpenCL Device #0: Tesla K20m, v 367.48

Peak Compiler Invocation

C benchmarks:

 gcc 

C++ benchmarks:

 g++ 

Peak Portability Flags

116.histo:  -DSPEC_LOCAL_MEMORY_HEADROOM=2 
122.cfd:  -std=gnu++98 

Peak Optimization Flags

C benchmarks:

110.fft:  basepeak = yes 
114.mriq:  basepeak = yes 
116.histo:  basepeak = yes 
117.bfs:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=64   -DSPEC_ACCEL_WG_SIZE_1_0=64   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
118.cutcp:  basepeak = yes 
121.lavamd:  basepeak = yes 
124.hotspot:  basepeak = yes 
127.srad:  basepeak = yes 
128.heartwall:  basepeak = yes 
140.bplustree:  basepeak = yes 

C++ benchmarks:

101.tpacf:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=1024   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
103.stencil:  basepeak = yes 
104.lbm:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=32   -DSPEC_ACCEL_WG_SIZE_0_1=1   -DSPEC_ACCEL_WG_SIZE_0_2=1   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
112.spmv:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=96   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
120.kmeans:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=288   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
122.cfd:  -O2   -DSPEC_ACCEL_WG_SIZE_3_0=288   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
123.nw:  basepeak = yes 
125.lud:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=32   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
126.ge:  -O2   -DSPEC_ACCEL_WG_SIZE_0_0=512   -DSPEC_ACCEL_WG_SIZE_1_0=1   -DSPEC_ACCEL_WG_SIZE_1_1=512   -I/opt/pkg/devel/cuda/7.0/include   -L/opt/pkg/devel/cuda/7.0/libb64   -lOpenCL 
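
The only difference between the base and peak flags is the set of `-DSPEC_ACCEL_WG_SIZE_*` definitions, which by their names appear to tune OpenCL work-group sizes. To gauge their effect, the sketch below compares median base and peak run times for the benchmarks that do not use `basepeak = yes`. The timings are transcribed from the results table; `SECONDS`, `peak_speedup`, and `SPEEDUPS` are illustrative names.

```python
from statistics import median

# Per benchmark: (base seconds for the three runs, peak seconds for the three runs),
# transcribed from the results table for benchmarks with tuned peak flags.
SECONDS = {
    "101.tpacf":  ((80.3, 80.2, 80.4), (81.1, 81.4, 81.3)),
    "104.lbm":    ((54.6, 54.1, 54.4), (43.1, 43.0, 43.2)),
    "112.spmv":   ((89.6, 89.5, 89.5), (89.5, 89.5, 89.1)),
    "117.bfs":    ((70.3, 71.3, 71.3), (54.8, 55.1, 54.3)),
    "120.kmeans": ((93.1, 93.6, 93.1), (92.7, 88.8, 93.0)),
    "122.cfd":    ((72.4, 73.2, 72.4), (72.2, 73.8, 73.7)),
    "125.lud":    ((118, 116, 116), (105, 104, 105)),
    "126.ge":     ((56.8, 56.4, 56.4), (12.9, 12.9, 12.9)),
}


def peak_speedup(base_runs, peak_runs):
    """Median base time divided by median peak time; values above 1 mean peak is faster."""
    return median(base_runs) / median(peak_runs)


SPEEDUPS = {name: peak_speedup(base, peak) for name, (base, peak) in SECONDS.items()}

# Print the benchmarks from largest to smallest gain.
for name, speedup in sorted(SPEEDUPS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {speedup:.2f}x")
```

On these numbers 126.ge gains the most (roughly 4.4x), 104.lbm and 117.bfs improve by about a quarter to a third, and 101.tpacf and 122.cfd come out marginally slower at peak than at base.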

The flags file that was used to format this result can be browsed at
https://www.spec.org/accel/flags/flags-advanced.20170929.html.

You can also download the XML flags source by saving the following link:
https://www.spec.org/accel/flags/flags-advanced.20170929.xml.