
SPECweb96 Release 1.0 Run and Reporting Rules

Version 2.7. Last modified: Fri May 30 10:52:05 PDT 1997

The most recent updates add reporting requirements to sections 3.2.2.6, 3.3.1.4.1, and 3.3.1.4.2.

Table of Contents

  1. Introduction
  2. Running the SPECweb96 Release 1.0 Benchmark
  3. Reporting Results for the SPECweb96 Release 1.0 Benchmark
  4. Building the SPECweb96 Release 1.0 Benchmark

1.0 Introduction

This document specifies how the benchmarks in the SPECweb96 Release 1.0 suite are to be run for measuring and publicly reporting performance results. These rules follow the norms laid down by the SPEC Web Subcommittee and approved by the SPEC Open Systems Steering Committee. They ensure that results generated with this suite are meaningful, comparable to other results generated with the suite, and repeatable (with documentation covering the factors pertinent to duplicating the results).

Per the SPEC license agreement, all results publicly disclosed must adhere to these Run and Reporting Rules.

1.1 Philosophy

The general philosophy behind the rules for running the SPECweb96 Release 1.0 benchmark is to ensure that an independent party can reproduce the reported results.

The following attributes are expected:

Furthermore, SPEC expects that any public use of results from this benchmark suite shall be for servers and configurations that are appropriate for public consumption and comparison. Thus, it is also expected that:

1.2 Caveat

SPEC reserves the right to adapt the benchmark codes, workloads, and rules of SPECweb96 Release 1.0 as deemed necessary to preserve the goal of fair benchmarking. SPEC will notify members and licensees whenever it makes changes to the suite and will rename the metrics (e.g., from SPECweb96 to SPECweb97a). In the event that a workload is removed, SPEC reserves the right to republish in summary form "adapted" results for previously published systems, converted to the new metric. In the case of other changes, republication may necessitate retesting and may require support from the original test sponsor.

Relevant standards are cited in these run rules as URL references and are current as of the date of publication. Changes or updates to these referenced documents or URLs may necessitate repairs to the links and/or amendment of the run rules. The current run rules will be available at the SPEC web site at http://www.spec.org.


2.0 Running the SPECweb96 Release 1.0 Benchmark

2.1 Environment

2.1.1 Protocols

Because the WWW is defined by its interoperable protocols, SPECweb requires adherence to the related protocol standards. The benchmark environment shall be governed by the following standards:

HTTP/1.0
Basic WWW protocol, as defined in http://www.w3.org/pub/WWW/Protocols/HTTP1.0/draft-ietf-http-spec.html.
RFC 761
DoD standard Transmission Control Protocol, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc761.txt
RFC 791
Internet Protocol, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc791.txt
RFC 792
Internet Control Message Protocol, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc792.txt and updated by RFC 950.
RFC 793
Transmission Control Protocol, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc793.txt
RFC 950
Internet Standard Subnetting Procedure, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc950.txt
RFC 1122
Requirements for Internet hosts - communication layers, as defined in http://info.internet.isi.edu/in-notes/rfc/files/rfc1122.txt

For further explanation of these protocols, the following might be helpful:

RFC 1180
TCP/IP tutorial [http://info.internet.isi.edu/in-notes/rfc/files/rfc1180.txt]
RFC 1739
A Primer On Internet and TCP/IP Tools [http://info.internet.isi.edu/in-notes/rfc/files/rfc1739.txt]
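
For orientation only, the following is a minimal sketch of the kind of HTTP/1.0 exchange these standards govern: one TCP connection per request, a single GET, and a close. It is not part of the benchmark driver; the server address and file path are hypothetical, and POSIX sockets are assumed.

    /* Minimal HTTP/1.0 GET sketch (illustration only, not part of the
     * benchmark driver).  The server address and file path below are
     * hypothetical. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        struct sockaddr_in srv;
        char buf[4096];
        ssize_t n;
        int s;
        const char *req = "GET /file_set/dir00000/class0_0 HTTP/1.0\r\n\r\n";

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }

        memset(&srv, 0, sizeof(srv));
        srv.sin_family = AF_INET;
        srv.sin_port = htons(80);                    /* standard HTTP port */
        srv.sin_addr.s_addr = inet_addr("10.0.0.1"); /* hypothetical server */

        if (connect(s, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
            perror("connect"); return 1;
        }
        write(s, req, strlen(req));                  /* one request per connection */
        while ((n = read(s, buf, sizeof(buf))) > 0)  /* status line, headers, body */
            fwrite(buf, 1, (size_t)n, stdout);
        close(s);                                    /* HTTP/1.0 closes after reply */
        return 0;
    }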

2.1.2 Server

For a run to be valid, the following attributes must hold true:

Any deviations from the standard, default configuration of the server must be documented so that an independent party can reproduce the result without further assistance.

2.2 Measurement

2.2.1 File Set

The benchmark makes references to files located on the server. The range of files accessed is determined by the particular level of requested load for each measurement. The particular files referenced are determined by the benchmark's own random workload generation.

The benchmark suite provides tools for the creation of the files to be used. It is the responsibility of the benchmarker to ensure that these files are placed on the server so that they can be accessed properly by the benchmark. These files, and only these files, shall be used as the target file set. The benchmark performs internal validations to verify the expected file(s); no modification or bypassing of this validation is allowed.
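
To illustrate what the random workload generation does, here is a minimal sketch of mix-driven file-class selection. The 0.35 weight for the first class echoes the target-mix example in section 3.1; the number of classes and the remaining weights are assumptions chosen for illustration, and the benchmark's actual generator and its validation must not be modified or bypassed.

    /* Sketch of weighted file-class selection (illustration only).
     * Only the 0.35 weight comes from this document (section 3.1);
     * the class count and the other weights are assumed. */
    #include <stdlib.h>

    static const double class_mix[4] = { 0.35, 0.50, 0.14, 0.01 };

    int pick_class(void)
    {
        double r = (double)rand() / ((double)RAND_MAX + 1.0);
        double cum = 0.0;
        int i;

        for (i = 0; i < 3; i++) {
            cum += class_mix[i];
            if (r < cum)
                return i;    /* class chosen per target mix */
        }
        return 3;            /* remaining probability: last class */
    }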

2.2.2 Load Levels

Each benchmark run consists of a set of requested load levels for which an actual measurement is made. The benchmark measures the actual level achieved and the associated average response time for each of the requested levels.

The measurement of all data points defining a performance curve is made within a single benchmark run, starting with the lowest requested load level and proceeding to the highest. The requested load levels are specified in the parameter file as a list, ordered from lowest to highest (left to right).

If any requested load level must be rerun for any reason, the entire benchmark run must be restarted and the series of requested load levels repeated. No server or testbed configuration changes, server reboots, or file system initializations (e.g., "newfs") are allowed between requested load levels.

The performance curve must consist of a minimum of 10 data points of requested load, uniformly distributed across the range from zero to the maximum requested load. Points beyond these 10 uniformly distributed points may also be reported.
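
As an illustration of such a spacing (not part of the benchmark tools), the following sketch computes 10 evenly spaced requested load levels up to a hypothetical maximum of 1000 operations/second:

    /* Sketch: 10 uniformly distributed requested load levels from
     * zero to a chosen maximum (the 1000 ops/sec maximum is
     * hypothetical). */
    #include <stdio.h>

    int main(void)
    {
        int max_load = 1000;  /* hypothetical maximum requested load */
        int points = 10;      /* minimum required by the run rules */
        int i;

        for (i = 1; i <= points; i++)
            printf("%d ", max_load * i / points);  /* 100 200 ... 1000 */
        printf("\n");
        return 0;
    }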

2.2.3 Benchmark Parameters

All benchmark parameter values must be left at their default values when generating reportable SPECweb96 results, except as noted in the following list:

Server
The means of accessing the desired server shall be defined. This includes the name or address(es) of the server, as well as the proper port number.
Load
A collection of clients called load generators is used to generate an aggregate load on the server being tested.

In particular, there are several settings that cannot be changed without invalidating the result.

Server Fileset
The size of the fileset generated on the server by the benchmark is established as a function of requested throughput; thus, fileset size depends on throughput across the entire results curve. This provides a more realistic server load, since more files are manipulated on the server as the load increases, reflecting typical real-world server use. The default parameters of the benchmark allow the automatic creation of valid total and working filesets on the server being measured.
Time parameters
RUNTIME, the time of measurement for which results are reported, must be the default 600 seconds for reportable results. The WARMUP_TIME must be set to the default of 300 seconds for reportable results.
Workload parameters
The workload specifics are fixed by the benchmark specification. The given name of a workload file may specify any workload file properly built by the fileset generation step of the benchmark.
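
For illustration, a hypothetical parameter-file excerpt consistent with the rules above is shown below. Only RUNTIME and WARMUP_TIME (and their default values of 600 and 300 seconds) are taken from this document; every other keyword and value is an assumption, not the benchmark's actual syntax.

    # Hypothetical parameter-file excerpt (illustration only).
    SERVER      testserver.example.com   # name or address of server under test
    PORT        80                       # HTTP port
    LOAD_LEVELS 100 200 300 400 500 600 700 800 900 1000   # lowest to highest
    RUNTIME     600                      # seconds; default, required for reporting
    WARMUP_TIME 300                      # seconds; default, required for reporting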

3.0 Reporting Results for the SPECweb96 Release 1.0 Benchmark

3.1 Metrics And Reference Format

The report of results for the SPECweb96 benchmark is generated in ASCII and HTML format by the provided SPEC tools. These tools may not be changed, except for portability reasons and only with prior SPEC approval. This section describes the report generated by those tools. The tools perform error checking and will flag many error conditions as resulting in an "invalid run". However, these automatic checks are provided only for convenience; they do not relieve the tester of the responsibility to check the results and follow the run and reporting rules.

While SPEC believes that a full performance curve best describes a server's performance, the need for a single figure of merit is recognized. The benchmark's single figure of merit, SPECweb96, is the peak throughput measured during the run (reported in operations per second). For a result to be valid, the peak throughput must be within 5% of the corresponding requested load. The results of a benchmark run, comprising several load levels, are plotted on a performance curve on the results reporting page. The data values for the points on the curve are also enumerated in a table.

No data point within 25% of the maximum reported throughput may be reported for which the number of failed requests for any file class exceeds 1% of the total requests for that file class, plus one. Likewise, no such data point may be reported whose "Actual Mix Pcnt" differs from the "Target Mix Pcnt" by more than 10% of the "Target Mix Pcnt" for any workload class. For example, if the target mix percent is 0.35, then valid actual mix percents are 0.35 +/- 0.035.
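
These tolerances reduce to simple predicates. The following sketch (illustration only; the function names and calling structure are hypothetical) encodes the 5% peak tolerance, the 1%-plus-one error limit, and the 10% relative mix tolerance described above:

    /* Validity predicates for reported data points (illustration only). */
    #include <math.h>

    /* Peak throughput must be within 5% of the corresponding requested load. */
    int peak_valid(double requested, double measured)
    {
        return fabs(measured - requested) <= 0.05 * requested;
    }

    /* Within 25% of maximum reported throughput: failed requests per file
     * class must not exceed 1% of that class's total requests, plus one. */
    int errors_valid(long failed, long total)
    {
        return failed <= (long)(0.01 * (double)total) + 1;
    }

    /* Actual mix percent must be within 10% (relative) of the target. */
    int mix_valid(double actual_pct, double target_pct)
    {
        return fabs(actual_pct - target_pct) <= 0.10 * target_pct;
    }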

3.1.1 Table Format

The server performance graph is constructed from a table containing the data points from a single run of the benchmark. The table consists of two columns:

3.1.2 Graphical Format

Server performance is depicted in a plot with the following format:

All data points of the plot must be enumerated in the table described in section 3.1.1.

3.1.3 Detailed Results

The SPEC tools optionally allow verbose output to be selected, in which case additional data are reported in a table:

3.2 Server Configuration

The system configuration information required to duplicate published performance results must be reported. This list is not intended to be all-inclusive, nor is each feature in the list required to be described. The rule of thumb is: if it affects performance or is required to duplicate the results, describe it. All components must be generally available within 6 months of the original publication of a performance result.

3.2.1 Server Hardware

The following server hardware components must be reported:

3.2.2 Server Software

The following server software components must be reported:

3.3 Testbed Configuration

3.3.1 Network Configuration

A brief description of the network configuration used to achieve the benchmark results is required. The minimum information to be supplied is:

3.3.2 Load Generators

The following load generator hardware components must be reported:

3.4 General Availability Dates

The dates (month and year) of general customer availability must be listed for the major components: hardware, HTTP server, and operating system. All system, hardware, and software features are required to be available within 6 months of the date of test.

3.5 Test Sponsor

The reporting page must list the date (month and year) the test was performed, the organization that performed the test and is reporting the results, and the SPEC license number of that organization.

3.6 Notes/Summary of Tuning Parameters

This section is used to document:

3.7 Other Required Information

The following additional information is also required to appear on the results reporting page for SPECweb96 Release 1.0 results:

The following additional information may be required to be provided for SPEC's results review:


4.0 Building the SPECweb96 Release 1.0 Benchmark

SPEC provides client driver software, which includes tools for running the benchmark and reporting its results. This software implements various checks for conformance with these run and reporting rules. Therefore, the SPEC software must be used; substitution of equivalent functionality (e.g., file set generation) may be done only with prior approval from SPEC. Any such substitution must be reviewed and deemed "performance-neutral" by the OSSC.

You may not change this software without prior approval from SPEC. SPEC permits minimal portability changes, but all changes must be reviewed and deemed "performance-neutral" by the OSSC. Source code changes required for standards compliance must be reported to SPEC, citing the appropriate standards documents; SPEC will consider incorporating such changes in future releases. Whenever possible, SPEC will strive to develop and enhance the benchmark to be standards-compliant. A portability change will be allowed if, without the change, the:

Special libraries may be used in conjunction with the benchmark code as long as they do not replace routines in the benchmark source code, and they are not "benchmark-specific".

The driver software includes C code (ANSI C) and perl scripts (perl5). SPEC provides prebuilt versions of perl and the driver code, or these may be recompiled from the provided source. SPEC requires the user to provide an OS and server software that support HTTP/1.0 as described in section 2.