SPEC SIP Infrastructure 2011 Support FAQ

Revision Date: February 7th, 2011


Q. How many machines do I need to run the benchmark?

A. Two: The Faban harness master and a load generator can run on one machine, and another must be used as the SUT. This could also be done on one machine with two concurrently running VMs with separate IP addresses. However, this would just be to try out the benchmark. In practice, you would want several machines so that the load generators are not the performance bottleneck and the SUT capacity is tested correctly.

Q. Can master and load driver run on the same machine?

A. Yes, they can.

Q. Can SUT and load driver run on the same machine?

A. No.

Q. Can SUT and the Master run on the same machine?

A. No.

Q: Is there a limit on the number of clients?

A: There is no preset limit from Faban on the number of clients that it can manage, but we are limited by network bandwidth and the processing capability of the Faban master. SPEC SIP subcommittee members have tested up to 20 clients.

Q: What Hardware platforms has the benchmark been tested on?

A: 32 bit and 64 bit Intel Xeon, 64 bit AMD Opteron, UltraSPARC T2

Q: What Operating Systems has the benchmark been tested on?

A: SunOS: Solaris10, OpenSolaris; Linux: Red Hat Enterprise Linux 4 and 5, Ubuntu 8.04, Fedora 10

Q: What about unsupported platforms (e.g., Windows, Mac-OS, AIX, HP-UX, z/OS etc.)?

A: We will try to help you as much as possible but cannot guarantee the benchmark will work on these platforms. If you wish to contribute developer resources to make this happen we will be glad to accept the help. Source code is available to SPECsip_Infrastructure2011 licensees to aid in porting efforts.

Q. What JDK version is required to run the benchmark?

A. Java 1.6 is required. Some 1.6 classes are required for the user authentication.

Q: How long will it take to complete a fully compliant run for the benchmark?

A: The minimum required time is about 2 hours: 10 minutes of pre-registration, 45 minutes of warm-up, a 60-minute steady-state run, and 1 minute of cool-down. However, you can do a shorter, non-compliant run to test that the system is working properly. A reasonable short run would be 1 minute of pre-registration, 5 minutes of warm-up, and a 10-minute steady-state run.
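
As a rough illustration of the arithmetic above, the following Python sketch sums the stage lengths quoted in the answer and works out where the steady-state stage falls within the run. The dictionary and function names are invented for this example and are not part of the benchmark; the short run has no cool-down entry because the answer does not give one.

```python
# Stage lengths in minutes, copied from the answer above.
COMPLIANT_RUN = {
    "pre_registration": 10,
    "warm_up": 45,
    "steady_state": 60,
    "cool_down": 1,
}

# The answer gives no cool-down figure for the short run, so none is listed.
SHORT_RUN = {
    "pre_registration": 1,
    "warm_up": 5,
    "steady_state": 10,
}


def total_minutes(stages):
    """Return the summed length of all stages, in minutes."""
    return sum(stages.values())


def steady_state_window(stages):
    """Return (start, end) of the steady-state stage, in minutes from run start."""
    start = stages["pre_registration"] + stages["warm_up"]
    return start, start + stages["steady_state"]
```

With these figures, the compliant run adds up to 116 minutes, which is where the "about 2 hours" comes from.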


Q: What are the differences from other benchmarks?

A. There has been no known standards-based SIP benchmark. There is a well-known de facto benchmark called SIPstone (www.sipstone.org), which is a micro-benchmark. The main difference in the SPEC SIP Infrastructure 2011 benchmark is that it is a full system benchmark: it includes a user behavior model, authentication (on a per-transaction basis), and device registration. Under the workload, device registration contributes the bulk of the SIP workload, as is found in VoIP environments. In addition, because of the authentication requirement, the benchmark strongly exercises the back-end interface from the SIP server to a subscriber database or an authentication server.

Q: What does Initial Registration Time mean?

A: The amount of time allowed to populate the user location database before the first call is made. The recommended minimum is 10 minutes, since this is the interval between successful device registrations during the steady-state run, but a longer time may be used to ensure that the user locations are populated successfully.

Q. How long should the Initial Registration be set for my system?

A. It should be long enough for every device to register successfully. If the time is set too short, the registration arrival rate at the SUT may be high enough to cause SIP messages to be dropped. You can monitor the interface and socket drop counters (for example with netstat) to see whether packets are dropped during registration, or use snoop or tcpdump to observe whether the SUT is keeping up with the registration rate.

Q. Does Initial Registration happen before the test (i.e., does it happen during the warm-up, or is the initial registration actually part of the load for the test)?

A: There are four stages of a "completed" test run: 1) Initial Registration, 2) Warm Up period, 3) Steady State, and 4) Cool Down. The statistics are collected only during Steady State. However, if during Initial Registration there are devices failing to register, the test can also fail during Steady State.

Q: What does Warm Up Time mean?

A: The amount of time for the test rig to reach steady state. The load drivers send the same load during the warm-up as during the steady-state run, but statistics are not collected during the warm-up. A conforming run requires a minimum Warm Up time; refer to the Run Rules for the requirement.

Q: What does Steady State Run mean?

A: The interval over which the transaction and call statistics are collected.

Q: Does the periodic (T=10min) registration for a device happen while the device is in use (i.e., does my device register while I am actually in a call)?

A: Device registrations are generated by a separate SIPp process, so registrations still happen while you are in a call.

Q: What does Cool Down Time mean?

A: The additional time the load drivers maintain their load after the end of the steady state. This ensures that the load remains constant until the last SIPp instance finishes collecting run statistics.

Q: What does a "completed" run mean?

A: The benchmark run completes all four stages of a run without being interrupted by errors.

Q: How can the percentage of completed calls be over 100 percent? Is something broken?

A: No. Think of the 1 hour steady state run period as a time window. During this window, there are three potential scenarios for calls. First is simple: calls can start and end during the window. Second is more subtle: calls can start during the window but complete after the window. From the point of view of the monitoring software, these look like calls that didn't finish. Third and finally, calls start before the window (during the warm-up phase) and end during the window. These look like mystery calls that completed but never started. The software tracks both started calls and completed calls. A completed call percentage over 100 percent simply means there were more calls of the 3rd case than of the 2nd case during the run.
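
The three cases can be made concrete with a small Python sketch. It is only an illustration of the counting argument, not the benchmark's actual statistics code; the function name, the window boundaries, and the call times are invented for the example.

```python
def completed_call_percentage(calls, window_start, window_end):
    """Return completed calls as a percentage of started calls.

    calls is a list of (start_time, end_time) pairs. A call counts as
    started if its start falls inside the window, and as completed if
    its end falls inside the window, independently of each other.
    """
    started = sum(1 for start, _ in calls if window_start <= start < window_end)
    completed = sum(1 for _, end in calls if window_start <= end < window_end)
    return 100.0 * completed / started


# Times in minutes from the start of the run; the steady-state window
# is assumed to cover minutes 55 to 115.
calls = [
    (60, 62),    # case 1: starts and ends inside the window
    (70, 75),    # case 1
    (114, 116),  # case 2: starts inside the window, ends after it
    (54, 56),    # case 3: starts during warm-up, ends inside the window
    (53, 57),    # case 3
]

percentage = completed_call_percentage(calls, 55, 115)
```

Here three calls start inside the window and four calls complete inside it, so the percentage comes out above 100 because case 3 occurs twice and case 2 only once.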


Q: How do I test that the control subnet between the Master and the Clients is working correctly?

A: From the Master, ping the host name or IP address of each client. If ping works then there is connectivity between the master and the client.
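
With many clients, the same check can be scripted. The following Python sketch is one possible way to do it and makes some assumptions: it uses the Linux-style "ping -c 1" form (other platforms use different arguments), and the function names and host names are invented for the example.

```python
import subprocess


def ping_command(host):
    """Build a single-packet ping command (assumes Linux-style 'ping -c 1')."""
    return ["ping", "-c", "1", host]


def unreachable_clients(hosts, run=subprocess.run):
    """Return the hosts whose ping command exited with a non-zero status.

    The run parameter defaults to subprocess.run and exists only so the
    function can be exercised without a real network.
    """
    failed = []
    for host in hosts:
        result = run(
            ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed.append(host)
    return failed


# Example use on the Master, with placeholder host names:
#   unreachable_clients(["client1", "client2"])
```

An empty list means every client answered the ping.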

Q: How do I test the data subnet between the SUT and the Clients?

A: From the SUT, ping the IP address (or host name) of each client that is on the same subnet as the SUT. Then run a network performance micro-benchmark such as uperf, iperf, or netperf to make sure there is adequate bandwidth between the SUT and the clients.

Q: What is the default master-client communication in a Linux environment?


Q: I am using Solaris for the Master, how do I change the master-client communication from rsh to ssh?

A: By default, for SunOS the Faban master talks to its clients using rsh. Since the master is SunOS, you'll need to edit the cmdmap.xml in the SunOS directory to map rsh to ssh.

Q. The installer fails with "java.awt.HeadlessException: No X11 DISPLAY variable was set, but this program performed an operation which requires it."

A. The installer cannot connect to the X server on your desktop. Make sure you have your X display settings set correctly. Test them with a simple application such as xclock.


Q: It's not working, what should I do?

A: Reread the User's Guide and make sure that you've followed all the steps and have run through the checklists in section 3. If you still haven't resolved your problem, review this FAQ to see if there is anything similar to the condition you are seeing.

Q: What events would cause a run to not complete?

A. Before launch, the benchmark validates the consistency of the input parameters. If there is an inconsistency, for example if the numbers of UAC control interfaces, UAC data interfaces, and UAC data ports are not identical, the validation check fails and the run stops. The benchmark also checks connectivity by sending a ping message to each client. If any such ping is unsuccessful, the run also stops. During the four stages of a run, if a SIPp process exits at any time (because of an unrecoverable error), the benchmark stops.

Q: How do we detect if the load-generation clients are the performance bottleneck?

A: To detect whether the clients are the bottleneck, monitor network interface statistics and CPU load on the clients. Packet loss or high CPU load indicates client overload.

Q: What do I do if load-generation clients are the performance bottleneck?

A: Increase the number of clients, or use clients with more powerful CPUs and better network capabilities.

Q. What does dead call (e.g., 2009-07-24 09:50:14:117 1248454214.117313: Dead call 4-9445@alt (successful), received 'SIP/2.0 200 OK ) mean? And why does it happen?

A. The recipient of the call has closed down the state regarding this call. The message has arrived too late.

Q. What is this message - 2009-09-07 16:18:07:344 1252365487.344767: Aborting call on UDP retransmission timeout for Call-ID '2684744-11995@'.?

A. A SIP UDP retransmission timed out after multiple retries. Perhaps the SUT was over capacity (subjected to too much load). One end of the call has terminated the call. This will not terminate the benchmark run. In other words, these messages are normal.

Q. Why do I see this message on the Faban run log and the test failed - PING yumi07: 64 data bytes ----yumi07 PING Statistics---- 1 packets transmitted, 0 packets received, 100% packet loss?

A. The yumi07 interface has gone dormant (perhaps due to a power-saving mode) and needs to be woken up by the first ping packet. Rerunning the benchmark solves the issue.

Q. Is it all right to see "Resolving remote host ''... Done."?

A. Yes, the message is for information only. SIPp logs such messages to stderr.

Q. How about this message "Ringtime is True"?

A. This is the first message in the SIPp error log. By itself it does not indicate an issue, but Faban will display this message when it disrupts the benchmark run for other reasons. Scroll back through the Faban Run Log to look for other signs of trouble, such as the message discussed next.

Q. Why do I see this message "Unable to bind main socket, errno = 125 (Address already in use)"?

A. An earlier instance of the SIPp load driver is still present and holds the IP address and UDP port, so the new SIPp instance cannot be launched. To debug, log into the client machine, look for lingering sipp processes, and kill any that you find.
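
If you want to confirm from the client that an address and port are still held before starting a new run, one option is to try binding a UDP socket to them yourself. The Python sketch below is only an illustration and is not part of the benchmark; the function name is invented, and the address and port in the usage comment are placeholders for the values configured for that client.

```python
import errno
import socket


def udp_port_in_use(ip, port):
    """Return True if binding a UDP socket to (ip, port) fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # some other problem, e.g. the address is not local
    finally:
        sock.close()
    return False


# Example use on the client, with a placeholder address and port:
#   udp_port_in_use("192.0.2.10", 5060)
```

The check must run on the client machine itself, since a socket can only be bound to a local address.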

Q. What does "Aborting call on an unexpected BYE for call" mean?

A. A BYE message arrived that did not conform to the scenarios defined in the Design Document.

Q. What does "Aborting call on an unexpected CANCEL for call" mean?

A. A CANCEL message arrived when the UAS was expecting a different message or call scenario.

Q. First there was "I am not expecting a call from user06874193, but I got one..". Then, " exits with status 255" and the benchmark run failed.

A. A message was misrouted to the wrong client, which did not expect to receive this message from the sender IP address. Due to the SIPp UAC-UAS control protocol, SIPp will exit immediately upon receiving such a message. This error commonly happens if the SIP server on the SUT was not restarted before a run and not all registration transactions during the Initial Registration were successful (so there were stale location entries in the SIP server).

Q. Why do I see "The watchdog timer has tripped the minor threshold of 500ms too many times (121 out of 120 allowed) (3000 out of 0 major 10ms timeouts tripped)" on the Faban Run Log?

A. The load driver client has run out of CPU, memory, or networking resources and cannot sustain that amount of load generation. You need to switch to more powerful clients, or use more clients to share the load generation.

Q. I am seeing "Benchmark validation failed." What causes it?

A. Check that the Benchmark Configuration page is filled in properly. The benchmark performs a few validation checks before it starts a run. First, the load distribution must sum to 100.0. Second, the benchmark checks that the numbers of UAC, UAS, and UDE clients are the same. For example, if 3 UAC clients are specified, then the number of UAS clients and the number of UDE clients should also be 3. This applies to the Control Interfaces, Data Interfaces, and Data Ports. Make sure a single space is used to separate the fields, not multiple spaces.
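
To pre-check a configuration by hand, the rules in this answer can be sketched in Python as follows. This is not the benchmark's own validation code: the function name, the data layout, and the host names and ports in the example are assumptions made for illustration.

```python
import math

ROLES = ("UAC", "UAS", "UDE")


def validation_problems(load_distribution, settings):
    """Return a list of human-readable problems; an empty list means none found.

    load_distribution is a list of percentages. settings maps a setting
    name (e.g. "Control Interfaces") to a dict that gives, for each role,
    the space-separated field string entered for that role.
    """
    problems = []

    total = sum(load_distribution)
    if not math.isclose(total, 100.0):
        problems.append(f"load distribution sums to {total}, not 100.0")

    for name, per_role in settings.items():
        for role in ROLES:
            if "  " in per_role[role]:
                problems.append(f"{name} for {role} has multiple spaces between fields")
        # Count entries per role; every role must list the same number.
        counts = {role: len(per_role[role].split()) for role in ROLES}
        if len(set(counts.values())) != 1:
            problems.append(f"{name} counts differ between roles: {counts}")

    return problems


# Placeholder values for three clients per role.
example_settings = {
    "Control Interfaces": {"UAC": "c1 c2 c3", "UAS": "c4 c5 c6", "UDE": "c7 c8 c9"},
    "Data Ports": {"UAC": "5060 5061 5062", "UAS": "5070 5071 5072", "UDE": "5080 5081 5082"},
}
```

Calling validation_problems([50.0, 30.0, 20.0], example_settings) returns an empty list, while dropping one UAS entry or using a doubled space produces a message for that setting.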

Q. What could have caused "Aborting call on UDP retransmission timeout for Call-ID '2136065-28696@'"?

A. The SUT was likely overloaded and could not respond to SIP transactions.

Q: I get the warning "JAVA_HOME /path/to/java does not exist. Using /other/path/to/java instead."

A. The Java being used on the clients is not the same as on the master. Syncing them may solve some problems.

Q: I see a warning message in the run log that says "Resolving remote host 'mysuthostname.com'... Done". Is that a problem?

A. No. SIPp sends this message to stderr for some reason and FABAN picks it up as a warning. It can be ignored.

Q: I see a warning message in the run log that says "CmdService: Could not copy myclienthostname:/root/specsip_infrastructure2011/working/uas_errors.log". Is this a problem?

A: No. You have error logging turned on and there were no errors. The master could not find an error logging file since it doesn't exist.

Copyright © 2009-2011 Standard Performance Evaluation Corporation. All rights reserved. Java® is a registered trademark of Sun Microsystems.