In 1993, the X Performance Characterization (XPC) project group developed Xmark93, a standardized benchmarking tool for measuring the performance of computer systems running the X Window System. A year later, the group set its sights higher, beginning work to develop a picture-level benchmark that will be placed in the public domain. The goal is for the benchmark to be available before the end of 1996.
Xmark93 allows systems evaluators and vendors to compare the performance of X server/hardware systems for a broad set of X primitives covering a wide range of applications. The benchmark provides a standardized method for summarizing X11perf results as a single-number measure of overall X11 server/hardware performance.
Xmark93 is derived by calculating the ratio between the weighted geometric mean of the 447 individual X11perf tests for the server/hardware being evaluated and the corresponding results from a Sun Microsystems SparcStation 1. Weightings for X11perf tests were obtained by balloting X11 technical experts. The weightings reflect the experts' ratings of the relative importance of individual X11perf operations within a wide mix of applications.
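The calculation described above can be sketched in a few lines of Python. This is only an illustration of a weighted geometric mean and the ratio against a reference system; it is not the Xmark93 script. The test names, weights, and rates below are hypothetical, and the sketch assumes each X11perf result is a rate (operations per second) where higher is better.

```python
import math


def weighted_geometric_mean(rates, weights):
    """Return exp(sum(w_i * ln(r_i)) / sum(w_i)) over the tests in `rates`."""
    total_weight = sum(weights[name] for name in rates)
    log_sum = sum(weights[name] * math.log(rates[name]) for name in rates)
    return math.exp(log_sum / total_weight)


def xmark_ratio(system_rates, reference_rates, weights):
    """Ratio of the system's weighted geometric mean to the reference's."""
    return (weighted_geometric_mean(system_rates, weights)
            / weighted_geometric_mean(reference_rates, weights))


# Hypothetical data: three made-up tests instead of the full 447.
weights = {"dot": 1.0, "rect10": 4.0, "copy100": 2.0}
reference = {"dot": 100000.0, "rect10": 20000.0, "copy100": 500.0}

# A system that is exactly twice as fast as the reference on every test.
system = {name: 2.0 * rate for name, rate in reference.items()}

score = xmark_ratio(system, reference, weights)
print(f"score = {score:.3f}")
```

Because the score is a ratio of geometric means, a system that is uniformly twice as fast as the reference scores 2 regardless of the weights chosen.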
The Xmark93 shell script can be run on any computer that supports the "sh" shell; it need not be run on the same machine that executed X11perf or on the system under test. The Xmark93 shell script takes as input the output file produced by a complete run of X11perf Rev 1.3 (including the xor set). X11perf version 1.3 was shipped with X11R5.
There has been a great deal of debate within the X technical community about what can be measured with an Xmark93 number. This spectrum of technical and political opinions is to be expected with any benchmark. Since no single number can accurately describe all aspects of X performance, there will always be some applications for which any particular benchmark does not accurately predict performance. In general, however, Xmark93 measures the performance of a broad set of X primitives covering a wide range of applications.
A system with an Xmark93 of 2 has X primitive performance that is, on average, twice as fast as that of a system with an Xmark93 of 1. Because it is an average, some tests on the 1 Xmark93 system may actually run faster than on the 2 Xmark93 system. The XPC group monitors the Xmark93 results it publishes for any contradictions regarding the relative X primitive performance of different systems.
Due to the nature of the geometric mean and the number of tests being used, Xmark93 is relatively insensitive to a vendor tuning a particular test for peak performance. Still, when vendors optimize their systems for Xmark93, tuning the tests with higher weights will result in greater improvements in the Xmark93 number. Since the higher weights correlate with the primitives' importance in a wide mix of applications, however, the end user will ultimately benefit from vendors tuning for higher Xmark93 numbers.
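A small numerical experiment illustrates both points. The sketch below is not derived from the Xmark93 script: it assumes 447 tests that all start at the same rate, uses equal weights except where noted, and picks a weight of 5 purely as a hypothetical example of a heavily weighted test.

```python
import math


def weighted_geometric_mean(values, weights):
    """Return exp(sum(w_i * ln(v_i)) / sum(w_i))."""
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / sum(weights))


n_tests = 447
baseline = [1.0] * n_tests

# A vendor tunes a single test to run ten times faster.
tuned = list(baseline)
tuned[0] = 10.0

# Case 1: all tests weighted equally (an assumption for illustration).
equal_weights = [1.0] * n_tests
geo_gain = (weighted_geometric_mean(tuned, equal_weights)
            / weighted_geometric_mean(baseline, equal_weights))

# For comparison: the same change under a plain arithmetic mean.
arith_gain = (sum(tuned) / n_tests) / (sum(baseline) / n_tests)

# Case 2: the tuned test carries a higher (hypothetical) weight of 5.
heavy_weights = [5.0] + [1.0] * (n_tests - 1)
heavy_gain = (weighted_geometric_mean(tuned, heavy_weights)
              / weighted_geometric_mean(baseline, heavy_weights))

print(f"geometric mean, equal weights: {geo_gain:.4f}")
print(f"arithmetic mean:               {arith_gain:.4f}")
print(f"geometric mean, weight 5:      {heavy_gain:.4f}")
```

Under these assumptions a tenfold speedup on one equally weighted test moves the geometric mean by roughly half a percent, less than it would move an arithmetic mean, while the same speedup on a more heavily weighted test moves the score noticeably more.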
As with any benchmark, Xmark93 should not be used as the sole buying criterion. Running the end-user applications is the ultimate performance criterion.
Because X11perf output consists of hundreds of numbers, reviewing all of them is not practical when reporting or evaluating overall performance of X servers. For some users, a small subset of the X11perf numbers is enough to characterize the applications they run. Other users, however, might not have the application-specific data necessary to identify which X operations most impact performance in their environment.
Xmark93 uses a weighted average of all of the X11perf numbers, not just a subset of them, since the frequency with which primitives are used varies widely from application to application. Motif applications, for example, perform lots of tiny rectangle fills to get a three-dimensional shadowing effect, whereas OpenLook applications make wide use of ellipses to get rounded buttons. Moreover, a graphics-intensive application has very different characteristics from a simple graphical user interface, which in turn has different characteristics from a text-oriented application. The weighting used by Xmark93 reduces the impact of some rarely used X operations, such as 1x1 rectangles, on the overall performance number.
The first step in obtaining Xmark93 numbers is to run X11perf Rev 1.3 with the options -rop GXcopy GXxor and -all. (See the Xmark93 script itself for syntax suggestions.) Then the Xmark93 script is run with the resulting X11perf output file as input.
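As a rough sketch, the first step could be driven from Python as follows. Only the options named above (-rop GXcopy GXxor and -all) are used; the helper names and the output path are hypothetical, and the Xmark93 script itself remains the reference for the exact syntax.

```python
import shlex
import subprocess


def build_x11perf_command(rops=("GXcopy", "GXxor")):
    """Assemble the x11perf argument list using the options named in the text."""
    return ["x11perf", "-rop", *rops, "-all"]


def run_x11perf(output_path):
    """Run x11perf and capture its standard output in `output_path`.

    Requires x11perf to be installed and a reachable X display;
    a complete run can take a long time.
    """
    with open(output_path, "w") as out:
        subprocess.run(build_x11perf_command(), stdout=out, check=True)


command = build_x11perf_command()
print(shlex.join(command))
```

Calling `run_x11perf` with a file name of your choice would produce the output file that is then passed to the Xmark93 script.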
The resulting Xmark93 file contains three numbers:
The SparcStation 1 number was collected on this configuration:
For further details, consult the Xmark93 script itself. The Xmark93 script may be obtained by anonymous ftp.
The goal of the XPC group is to create standardized ways of measuring the performance of X server/hardware systems that are meaningful for user comparisons. The development of a picture-level benchmark will allow for performance measurements that better approximate actual activity during a user session. The XPC group's Xlib Playback Benchmark (XPB), scheduled for release in 1996, will record and play back activities found in popular X Window System applications. When completed, the new benchmark will give vendors and users a more robust tool for comparing X Window System performance across different hardware platforms.