Siemens Nixdorf pyrC compiler flags (as of November 1997)
=========================================================

The following is a list of short explanations of the compiler and
linker flags used for SPEC CINT95 result submissions for Siemens
Nixdorf / Pyramid RM systems, using the pyrC 6.0 compiler. This flag
description supersedes the earlier description given for version 5.0
of the same compiler (some flags are new, and the new description
covers more flags). Future result submissions that use new compilers
or new compiler versions will likely use different flags; this flag
description will then be superseded by a new one.

---------------------------------------------------------------------

1. Compiler Flags

[Syntax note: For most flags that have a numeric parameter (e.g.,
inlining control), this parameter can be separated from the flag by
either a comma "," or a colon ":".]

-qfeedback
    Standard (1-pass) feedback optimization: Produce code that
    collects call graph and flow graph information suitable for
    feedback directed optimization.

-WM,-profdir
    Takes a directory argument: profiling information is written to
    and read from that directory. Default is ./PROF.

-qfeedback2
    Additional (2-pass) feedback optimization: Produce code that
    collects information from an executable optimized in a first pass
    of feedback optimization (i.e. one compiled with -qfeedback /
    -WM,-U, or -WM,-O4, or -WM,-O5, or -WM,-Omips4).

-WM,-profdir2
    Takes a directory argument: profiling information from 2-pass
    feedback compilation is written to and read from that directory.
    Default is ./PROF2.

-WM,-use_fb2
    Specifies that profiling information from 2-pass feedback
    compilation should be used in the generation of the (final)
    executable. Must be used together with -WM,-O4, or -WM,-O5, or
    -WM,-Omips4.

-WM,-Omips4
    Performs all safe and generally applicable optimizations,
    including interprocedural optimizations, register allocation
    across function calls, and feedback directed optimizations
    (function inlining, procedure positioning, branch elimination,
    procedure splitting, register allocation, and cross basic block
    scheduling). This flag also directs the compiler to produce
    non-position-independent code, to generate code using the
    instruction set of the MIPS4 ISA, to inline alloca, printf,
    memcpy, memset, memcmp, and memmove, and to use U-code system
    libraries. These libraries represent the same system services as
    their regular counterparts, but in a form more suitable for
    interprocedural optimization. The flag also includes
    -Wb,-fast_int_mul (see below).

-WM,-G
    Takes a size in bytes: data items smaller than this size are
    placed in the global data area and accessed using a faster
    addressing mode. Default is 0.

-WM,-pre_opt
    Adds an additional optimization phase that may find additional
    optimization possibilities at the expense of more compile time.

-WM,-no_positioning
    Disables the procedure positioning feedback optimization.

-Wb,-br_likely_cntl,<num1>,<num2>
    Controls the branch likely optimization, which sets the likely
    bit in a conditional branch. If feedback indicates that a
    conditional branch is probably taken and the branch cannot be
    reversed, the branch's likely bit is set if both of the following
    criteria are met:
    1) the branch is taken at least <num1> percent of the time, and
    2) <num2> equals 0, or the branch is taken at least <num2> times
       more often than the branch's function is called.
    Both <num1> and <num2> are expressed as percentages; the defaults
    are 90 and 0, respectively.

-Wb,-prefetch,<num1>,<num2>
    Inserts prefetch instructions in loops that appear to access
    memory in a serial fashion. Only loops that have at least <num1>
    iterations are considered. <num2> is the expected latency for
    fetches from memory, in units of machine instruction cycle times.
    Off by default; -WM,-Omips4 turns it on and sets the values to 40
    and 400, respectively.

-Wb,-fast_int_mul
    Directs the optimizer to use the floating-point unit to perform
    32-bit integer multiplications wherever doing so would result in
    correct, faster code. Because this flag changes the behavior of
    multiplications that overflow, programs that depend on the
    truncation to 32 bits of two's-complement multiplication (the
    default behavior) should not use this flag. Because the
    difference from the default behavior appears in overflow cases
    only (not in legal C programs), and because rule 2.2.5 of the
    CPU95 Run Rules exempts numerical accuracy flags from baseline
    restrictions anyway, this flag is not an assertion flag in the
    sense of the CPU95 Run Rules.

-Wb,-no_self_copy
    Asserts that the optimizer may assume that the difference between
    any two pointers referencing the same data item is greater than
    seven bytes.

-Wc,-xjp_mh_opt,<num1>,<num2>
    Controls the hot switch optimization, which uses conditional
    branches instead of indirect jumps at C switch statements. For a
    switch label to be considered for this optimization, the label's
    relative frequency of execution must be greater than <num1>
    percent. The parameter <num2> limits the maximum number of
    conditional branches. -WM,-Omips4 sets the values to 3 and 10,
    respectively.

-WG,-xxx / -Wg,-xxx / -Wn,-xxx
    Flags that have one of these forms control either the "inliner"
    pass of the compiler (-Wg,-xxx), or the "cloner" pass of the
    compiler (-Wn,-xxx), or both (-WG,-xxx). A setting with a more
    specific value (lowercase letter g or n) overrides the more
    general setting (uppercase letter G). Although the following
    description uses the "-WG,-xxx" form, it holds for the other
    forms also.

    Some flags exist for the "cloner" only (the pass that optimizes
    for specific call locations of subroutines); they provide finer
    control over the cloning process. They can be written in the form
    -WG,-xxx or -Wn,-xxx; the following description uses the form
    -Wn,-xxx.

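The override rule for the -WG / -Wg / -Wn forms can be illustrated
with a small model. This is only a sketch of the precedence described
above, not of the compiler's actual option handling: the function and
table names are hypothetical, and the treatment of a flag that is
repeated with the same prefix (the last one wins) is an assumption.

```python
import re

# Hypothetical lookup table: the prefixes each pass honors, most
# specific first (lowercase g or n overrides uppercase G).
PREFIXES_BY_PASS = {
    "inliner": ("-Wg", "-WG"),
    "cloner": ("-Wn", "-WG"),
}


def effective_setting(flags, pass_name, option, default):
    """Return the value of `option` as seen by the given pass."""
    seen = {}
    for flag in flags:
        prefix, _, rest = flag.partition(",")
        # Per the syntax note, "," or ":" may separate the number.
        parts = re.split(r"[,:]", rest, maxsplit=1)
        if len(parts) == 2 and parts[0] == option:
            # Assumption: a repeated flag with the same prefix
            # replaces the earlier value.
            seen[prefix] = float(parts[1])
    for prefix in PREFIXES_BY_PASS[pass_name]:
        if prefix in seen:
            return seen[prefix]
    return default


flags = ["-WG,-inline_limit:300", "-Wn,-inline_limit,800"]
print(effective_setting(flags, "inliner", "-inline_limit", 500))
print(effective_setting(flags, "cloner", "-inline_limit", 500))
```

With these two flags the inliner sees the general -WG value, while
the cloner sees the more specific -Wn value. If neither form is
given, the documented default (500 for -inline_limit) applies.
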
-WG,-inline_limit:<num>
    Sets a size threshold for inlining/cloning. A call will not be
    inlined/cloned if the resulting function (after inlining/cloning)
    exceeds <num> basic blocks. Default is 500.

-WG,-space_time:<n>
    Tells the "umerge" phase to consider only those functions for
    inlining/cloning whose estimated ratio of code expansion to time
    savings is less than <n>. Default is 3.0.

-WG,-boc:<n>
    Tells the "umerge" phase to consider only those functions for
    inlining/cloning whose estimated ratio of runtime cycles saved to
    I-cache cost of inlining/cloning is greater than or equal to <n>.
    Default is 1.0.

-Wg,-dont_prune_zero_edges
    Directs the inliner to inline function calls with zero execution
    count in the feedback information. The default is not to inline
    these edges.

-Wn,-clone_expansion:<num>
    Directs the cloner to limit the maximum relative growth of the
    program to <num>. The default for <num> is 1.3.

-Wn,-recursion_depth:<num>
    Sets the maximum number of function calls through which the
    cloner will search to identify recursive functions. For example,
    -Wn,-recursion_depth:1 means that functions that call themselves
    will be considered recursive functions.

-Wn,-only_clone_recursion
    Directs the cloner to clone only recursive functions.

-Wn,-recursion_limit:<num>
    Directs the cloner to limit the maximum number of basic blocks in
    a recursive function to <num>. If -Wn,-recursion_limit isn't
    given, then this limit is set by the -WG,-inline_limit flag. If
    neither of these flags is given, the default is 500.

-Wo,-loopunroll:<num>
    Tells the optimizer to unroll loops <num> times. Default is 4;
    -WM,-Omips4 sets it to 8.

-Wo,-unrolllimit:<num>
    <num> is the limit on the number of instructions within a loop
    unrolled by the optimizer. Default is 500; -WM,-Omips4 sets it to
    2000.

-Wo,-no_const_in_reg
    Tells the optimizer not to put constants in registers.

-Wo,splitedges,<num>
    Controls the edge splitting algorithm in "uopt", which inserts an
    empty basic block on infrequently executed control flow edges to
    increase optimization opportunities.
    This optimization uses feedback information to limit the number
    of split edges and avoid excessive compilation time. "uopt" will
    split an edge if its execution frequency multiplied by num is
    less than the smaller of the execution frequencies of the edge's
    head and tail basic blocks. Setting num to zero disables edge
    splitting.

-Wo,-recursive_calls
    Directs uopt to use different heuristics that result in better
    performance if there are recursive function calls in the source
    code. Only effective in -WM,-O3 and -WM,-O5 modes.

-KOlimit:<num>
    Changes the threshold size for optimizing very large programs.
    The argument specifies the maximum size, in basic blocks, of a
    function that will be optimized by the global optimizer. The
    default value of the argument is 1000. The optimization phase of
    the compiler warns you if this flag is needed to optimize a
    particular program. There can be no space around the colon (:).
    -WM,-Omips4 sets <num> to 3000.

2. Linker Flags

-dn
    This option is passed to ld. It specifies static linking in the
    link editor.

3. Portability Flags

-DI_TIME -DI_SYS_TIME
    Enables certain (SPEC-approved) source code parts via conditional
    compilation.

Questions?

More details can be found in the compiler documentation.
SPEC-specific questions should be sent to the SPEC OSG representative
Reinhold Weicker, weicker.pad@sni.de