This is the flag description file for AMD compiled binaries using the gcc compiler version 3.3 from SuSE Linux Enterprise Server 8 Service Pack 3. It also includes flag descriptions for the PGI 5.1 compiler.

----------------------------------------------------------------------------
Flags for gcc 3.3 (from SLES8 SP3)
----------------------------------------------------------------------------

O0
`-O0'
Do not optimize. This is the default.

O3
`-O'
`-O1'
Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function. With `-O', the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.

`-O' turns on the following optimization flags:
-fdefer-pop -fmerge-constants -fthread-jumps -floop-optimize -fcrossjumping -fif-conversion -fif-conversion2 -fdelayed-branch -fguess-branch-probability -fcprop-registers

`-O' also turns on `-fomit-frame-pointer' on machines where doing so does not interfere with debugging.

`-O2'
Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. The compiler does not perform loop unrolling or function inlining when you specify `-O2'. As compared to `-O', this option increases both compilation time and the performance of the generated code.

`-O2' turns on all optimization flags specified by `-O'. It also turns on the following optimization flags:
-fforce-mem -foptimize-sibling-calls -fstrength-reduce -fcse-follow-jumps -fcse-skip-blocks -frerun-cse-after-loop -frerun-loop-opt -fgcse -fgcse-lm -fgcse-sm -fdelete-null-pointer-checks -fexpensive-optimizations -fregmove -fschedule-insns -fschedule-insns2 -fsched-interblock -fsched-spec -fcaller-saves -fpeephole2 -freorder-blocks -freorder-functions -fstrict-aliasing -falign-functions -falign-jumps -falign-loops -falign-labels

Please note the warning under `-fgcse' about invoking `-O2' on programs that use computed gotos.
`-O3'
Optimize yet more. `-O3' turns on all optimizations specified by `-O2' and also turns on the `-finline-functions', `-fweb', `-funit-at-time', `-ftracer', `-funswitch-loops' and `-frename-registers' options.

-funroll-all-loops
`-funroll-all-loops'
Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes programs run more slowly. `-funroll-all-loops' implies the same options as `-funroll-loops'.

-fprofile-arcs/ -fbranch-probabilities
`-fprofile-arcs'
Instrument "arcs" during compilation to generate coverage data or for profile-directed block ordering. During execution the program records how many times each branch is executed and how many times it is taken. When the compiled program exits it saves this data to a file called `AUXNAME.da' for each source file. AUXNAME is generated from the name of the output file, if explicitly specified and it is not the final executable; otherwise it is the basename of the source file. In both cases any suffix is removed (e.g. `foo.da' for input file `dir/foo.c', or `dir/foo.da' for output file specified as `-o dir/foo.o').

For profile-directed block ordering, compile the program with `-fprofile-arcs' plus optimization and code generation options, generate the arc profile information by running the program on a selected workload, and then compile the program again with the same optimization and code generation options plus `-fbranch-probabilities' (*note Options that Control Optimization: Optimize Options.).

With `-fprofile-arcs', for each function of your program GCC creates a program flow graph, then finds a spanning tree for the graph. Only arcs that are not on the spanning tree have to be instrumented: the compiler adds code to count the number of times that these arcs are executed. When an arc is the only exit or only entrance to a block, the instrumentation code can be added to the block; otherwise, a new basic block must be created to hold the instrumentation code.
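The spanning-tree idea described for `-fprofile-arcs' can be illustrated with a small sketch. This is not GCC's implementation: the function name, the block numbering and the example graph are hypothetical, and the extra arc from the exit block back to the entry block is an assumption of this sketch, added so that every uncounted arc can be derived by flow conservation.

```python
def arcs_to_instrument(num_blocks, arcs):
    """Return the arcs left off a spanning tree of the flow graph.

    Blocks are numbered 0..num_blocks-1 and arcs are (src, dst) pairs.
    The tree is grown with union-find, ignoring arc direction.
    """
    parent = list(range(num_blocks))

    def find(x):
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    instrumented = []
    for src, dst in arcs:
        a, b = find(src), find(dst)
        if a == b:
            # The arc would close a cycle, so it needs its own counter.
            instrumented.append((src, dst))
        else:
            # The arc joins the spanning tree; its count can be derived.
            parent[a] = b
    return instrumented


# Hypothetical if/else diamond: entry(0) -> then(1) / else(2) -> exit(3),
# plus an assumed closing arc from exit back to entry.
flow_graph = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)]
print(arcs_to_instrument(4, flow_graph))  # [(2, 3), (3, 0)]
```

With 4 blocks and 5 arcs, a spanning tree holds 3 arcs, so 2 arcs carry counters. In this example those are the else-path arc and the closing arc (the number of times the function ran); the then-path count follows by subtraction.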
-ffast-math
`-ffast-math'
Sets `-fno-math-errno', `-funsafe-math-optimizations', `-fno-trapping-math', `-ffinite-math-only' and `-fno-signaling-nans'. This option causes the preprocessor macro `__FAST_MATH__' to be defined. This option should never be turned on by any `-O' option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions.

`-fno-math-errno'
Do not set ERRNO after calling math functions that are executed with a single instruction, e.g., sqrt. A program that relies on IEEE exceptions for math error handling may want to use this flag for speed while maintaining IEEE arithmetic compatibility. This option should never be turned on by any `-O' option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions. The default is `-fmath-errno'.

`-funsafe-math-optimizations'
Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid and (b) may violate IEEE or ANSI standards. When used at link-time, it may include libraries or startup files that change the default FPU control word or other similar optimizations. This option should never be turned on by any `-O' option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions. The default is `-fno-unsafe-math-optimizations'.

`-ffinite-math-only'
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs. This option should never be turned on by any `-O' option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications. The default is `-fno-finite-math-only'.

`-fno-trapping-math'
Compile code assuming that floating-point operations cannot generate user-visible traps.
These traps include division by zero, overflow, underflow, inexact result and invalid operation. This option implies `-fno-signaling-nans'. Setting this option may allow faster code if one relies on "non-stop" IEEE arithmetic, for example. This option should never be turned on by any `-O' option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions. The default is `-ftrapping-math'.

`-fsignaling-nans'
Compile code assuming that IEEE signaling NaNs may generate user-visible traps during floating-point operations. Setting this option disables optimizations that may change the number of exceptions visible with signaling NaNs. This option implies `-ftrapping-math'. This option causes the preprocessor macro `__SUPPORT_SNAN__' to be defined. The default is `-fno-signaling-nans'. This option is experimental and does not currently guarantee to disable all GCC optimizations that affect signaling NaN behavior.

-m32
These `-m' switches are supported in addition to the above on AMD x86-64 processors in 64-bit environments.

`-m32'
`-m64'
Generate code for a 32-bit or 64-bit environment. The 32-bit environment sets int, long and pointer to 32 bits and generates code that runs on any i386 system. The 64-bit environment sets int to 32 bits and long and pointer to 64 bits and generates code for AMD's x86-64 architecture.

-fdefer-pop
`-fno-defer-pop'
Always pop the arguments to each function call as soon as that function returns. For machines which must pop arguments after a function call, the compiler normally lets arguments accumulate on the stack for several function calls and pops them all at once. Disabled at levels `-O', `-O2', `-O3', `-Os'.

-fmerge-constants
`-fmerge-constants'
Attempt to merge identical constants (string constants and floating point constants) across compilation units. This option is the default for optimized compilation if the assembler and linker support it.
Use `-fno-merge-constants' to inhibit this behavior. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fthread-jumps
`-fthread-jumps'
Perform optimizations where we check to see if a jump branches to a location where another comparison subsumed by the first is found. If so, the first branch is redirected to either the destination of the second branch or a point immediately following it, depending on whether the condition is known to be true or false. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-floop-optimize
`-floop-optimize'
Perform loop optimizations: move constant expressions out of loops, simplify exit test conditions and optionally do strength-reduction and loop unrolling as well. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fcrossjumping
`-fcrossjumping'
Perform cross-jumping transformation. This transformation unifies equivalent code and saves code size. The resulting code may or may not perform better than without cross-jumping. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fif-conversion
`-fif-conversion'
Attempt to transform conditional jumps into branch-less equivalents. This includes use of conditional moves, min, max, set flags and abs instructions, and some tricks doable by standard arithmetic. The use of conditional execution on chips where it is available is controlled by `-fif-conversion2'. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fif-conversion2
`-fif-conversion2'
Use conditional execution (where available) to transform conditional jumps into branch-less equivalents. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fdelayed-branch
`-fdelayed-branch'
If supported for the target machine, attempt to reorder instructions to exploit instruction slots available after delayed branch instructions. Enabled at levels `-O', `-O2', `-O3', `-Os'.

-fguess-branch-probability
`-fno-guess-branch-probability'
Do not guess branch probabilities using a randomized model.
Sometimes gcc will opt to use a randomized model to guess branch probabilities, when none are available from either profiling feedback (`-fprofile-arcs') or `__builtin_expect'. This means that different runs of the compiler on the same program may produce different object code. In a hard real-time system, people don't want different runs of the compiler to produce code that has different behavior; minimizing non-determinism is of paramount import. This switch allows users to reduce non-determinism, possibly at the expense of inferior optimization. The default is `-fguess-branch-probability' at levels `-O', `-O2', `-O3', `-Os'.

-fcprop-registers
`-fno-cprop-registers'
After register allocation and post-register allocation instruction splitting, we perform a copy-propagation pass to try to reduce scheduling dependencies and occasionally eliminate the copy. Disabled at levels `-O', `-O2', `-O3', `-Os'.

-fforce-mem
`-fforce-mem'
Force memory operands to be copied into registers before doing arithmetic on them. This produces better code by making all memory references potential common subexpressions. When they are not common subexpressions, instruction combination should eliminate the separate register-load. Enabled at levels `-O2', `-O3', `-Os'.

-foptimize-sibling-calls
`-foptimize-sibling-calls'
Optimize sibling and tail recursive calls. Enabled at levels `-O2', `-O3', `-Os'.

-fstrength-reduce
`-fstrength-reduce'
Perform the optimizations of loop strength reduction and elimination of iteration variables. Enabled at levels `-O2', `-O3', `-Os'.

-fcse-follow-jumps
`-fcse-follow-jumps'
In common subexpression elimination, scan through jump instructions when the target of the jump is not reached by any other path. For example, when CSE encounters an `if' statement with an `else' clause, CSE will follow the jump when the condition tested is false. Enabled at levels `-O2', `-O3', `-Os'.
-fcse-skip-blocks
`-fcse-skip-blocks'
This is similar to `-fcse-follow-jumps', but causes CSE to follow jumps which conditionally skip over blocks. When CSE encounters a simple `if' statement with no else clause, `-fcse-skip-blocks' causes CSE to follow the jump around the body of the `if'. Enabled at levels `-O2', `-O3', `-Os'.

-frerun-cse-after-loop
`-frerun-cse-after-loop'
Re-run common subexpression elimination after loop optimizations have been performed. Enabled at levels `-O2', `-O3', `-Os'.

-frerun-loop-opt
`-frerun-loop-opt'
Run the loop optimizer twice. Enabled at levels `-O2', `-O3', `-Os'.

-fgcse
`-fgcse'
Perform a global common subexpression elimination pass. This pass also performs global constant and copy propagation.

_Note:_ When compiling a program using computed gotos, a GCC extension, you may get better runtime performance if you disable the global common subexpression elimination pass by adding `-fno-gcse' to the command line. Enabled at levels `-O2', `-O3', `-Os'.

-fgcse-lm
`-fgcse-lm'
When `-fgcse-lm' is enabled, global common subexpression elimination will attempt to move loads which are only killed by stores into themselves. This allows a loop containing a load/store sequence to be changed to a load outside the loop, and a copy/store within the loop. Enabled by default when gcse is enabled.

-fgcse-sm
`-fgcse-sm'
When `-fgcse-sm' is enabled, a store motion pass is run after global common subexpression elimination. This pass will attempt to move stores out of loops. When used in conjunction with `-fgcse-lm', loops containing a load/store sequence can be changed to a load before the loop and a store after the loop. Enabled by default when gcse is enabled.

-fdelete-null-pointer-checks
`-fdelete-null-pointer-checks'
Use global dataflow analysis to identify and eliminate useless checks for null pointers. The compiler assumes that dereferencing a null pointer would have halted the program.
If a pointer is checked after it has already been dereferenced, it cannot be null. In some environments, this assumption is not true, and programs can safely dereference null pointers. Use `-fno-delete-null-pointer-checks' to disable this optimization for programs which depend on that behavior. Enabled at levels `-O2', `-O3', `-Os'.

-fexpensive-optimizations
`-fexpensive-optimizations'
Perform a number of minor optimizations that are relatively expensive. Enabled at levels `-O2', `-O3', `-Os'.

-fregmove
`-fregmove'
Attempt to reassign register numbers in move instructions and as operands of other simple instructions in order to maximize the amount of register tying. This is especially helpful on machines with two-operand instructions. Note `-fregmove' and `-foptimize-register-move' are the same optimization. Enabled at levels `-O2', `-O3', `-Os'.

-fschedule-insns
`-fschedule-insns'
If supported for the target machine, attempt to reorder instructions to eliminate execution stalls due to required data being unavailable. This helps machines that have slow floating point or memory load instructions by allowing other instructions to be issued until the result of the load or floating point instruction is required. Enabled at levels `-O2', `-O3', `-Os'.

-fschedule-insns2
`-fschedule-insns2'
Similar to `-fschedule-insns', but requests an additional pass of instruction scheduling after register allocation has been done. This is especially useful on machines with a relatively small number of registers and where memory load instructions take more than one cycle. Enabled at levels `-O2', `-O3', `-Os'.

-fsched-interblock
`-fno-sched-interblock'
Don't schedule instructions across basic blocks. This is normally enabled by default when scheduling before register allocation, i.e. with `-fschedule-insns' or at `-O2' or higher.

-fsched-spec
`-fno-sched-spec'
Don't allow speculative motion of non-load instructions.
This is normally enabled by default when scheduling before register allocation, i.e. with `-fschedule-insns' or at `-O2' or higher.

-fcaller-saves
`-fcaller-saves'
Enable values to be allocated in registers that will be clobbered by function calls, by emitting extra instructions to save and restore the registers around such calls. Such allocation is done only when it seems to result in better code than would otherwise be produced. This option is always enabled by default on certain machines, usually those which have no call-preserved registers to use instead. Enabled at levels `-O2', `-O3', `-Os'.

-fpeephole2
`-fno-peephole'
`-fno-peephole2'
Disable any machine-specific peephole optimizations. The difference between `-fno-peephole' and `-fno-peephole2' is in how they are implemented in the compiler; some targets use one, some use the other, a few use both. `-fpeephole' is enabled by default. `-fpeephole2' is enabled at levels `-O2', `-O3', `-Os'.

-freorder-blocks
`-freorder-blocks'
Reorder basic blocks in the compiled function in order to reduce the number of taken branches and improve code locality. Enabled at levels `-O2', `-O3', `-Os'.

-freorder-functions
`-freorder-functions'
Reorder functions in the object file in order to improve code locality. This is implemented by using special subsections `text.hot' for most frequently executed functions and `text.unlikely' for unlikely executed functions. Reordering is done by the linker, so the object file format must support named sections and the linker must place them in a reasonable way. Also, profile feedback must be available to make this option effective. See `-fprofile-arcs' for details. Enabled at levels `-O2', `-O3', `-Os'.

-fstrict-aliasing
`-fstrict-aliasing'
Allows the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions.
In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. For example, an `unsigned int' can alias an `int', but not a `void*' or a `double'. A character type may alias any other type. Pay special attention to code like this:

     union a_union {
       int i;
       double d;
     };

     int f() {
       a_union t;
       t.d = 3.0;
       return t.i;
     }

The practice of reading from a different union member than the one most recently written to (called "type-punning") is common. Even with `-fstrict-aliasing', type-punning is allowed, provided the memory is accessed through the union type. So, the code above will work as expected. However, this code might not:

     int f() {
       a_union t;
       int* ip;
       t.d = 3.0;
       ip = &t.i;
       return *ip;
     }

Every language that wishes to perform language-specific alias analysis should define a function that computes, given a `tree' node, an alias set for the node. Nodes in different alias sets are not allowed to alias. For an example, see the C front-end function `c_get_alias_set'. Enabled at levels `-O2', `-O3', `-Os'.

-falign-functions
`-falign-functions'
`-falign-functions=N'
Align the start of functions to the next power-of-two greater than N, skipping up to N bytes. For instance, `-falign-functions=32' aligns functions to the next 32-byte boundary, but `-falign-functions=24' would align to the next 32-byte boundary only if this can be done by skipping 23 bytes or less.

`-fno-align-functions' and `-falign-functions=1' are equivalent and mean that functions will not be aligned.

Some assemblers only support this flag when N is a power of two; in that case, it is rounded up.

If N is not specified, use a machine-dependent default. Enabled at levels `-O2', `-O3'.

-falign-jumps
`-falign-jumps'
`-falign-jumps=N'
Align branch targets to a power-of-two boundary, for branch targets where the targets can only be reached by jumping, skipping up to N bytes like `-falign-functions'.
In this case, no dummy operations need be executed. If N is not specified, use a machine-dependent default. Enabled at levels `-O2', `-O3'.

-falign-loops
`-falign-loops'
`-falign-loops=N'
Align loops to a power-of-two boundary, skipping up to N bytes like `-falign-functions'. The hope is that the loop will be executed many times, which will make up for any execution of the dummy operations. If N is not specified, use a machine-dependent default. Enabled at levels `-O2', `-O3'.

-falign-labels
`-falign-labels'
`-falign-labels=N'
Align all branch targets to a power-of-two boundary, skipping up to N bytes like `-falign-functions'. This option can easily make code slower, because it must insert dummy operations for when the branch target is reached in the usual flow of the code. If `-falign-loops' or `-falign-jumps' are applicable and are greater than this value, then their values are used instead. If N is not specified, use a machine-dependent default which is very likely to be `1', meaning no alignment. Enabled at levels `-O2', `-O3'.

`-finline-limit=N'
By default, gcc limits the size of functions that can be inlined. This flag allows the control of this limit for functions that are explicitly marked as inline (i.e., marked with the inline keyword or defined within the class definition in C++). N is the size of functions that can be inlined in number of pseudo instructions (not counting parameter handling). The default value of N is 600. Increasing this value can result in more inlined code at the cost of compilation time and memory consumption. Decreasing it usually makes the compilation faster, and less code will be inlined (which presumably means slower programs). This option is particularly useful for programs that use inlining heavily, such as those based on recursive templates with C++.

Inlining is actually controlled by a number of parameters, which may be specified individually by using `--param NAME=VALUE'.
The `-finline-limit=N' option sets some of these parameters as follows:

`max-inline-insns' is set to N.
`max-inline-insns-single' is set to N/2.
`max-inline-insns-single-auto' is set to N/2.
`min-inline-insns' is set to 130 or N/4, whichever is smaller.
`max-inline-insns-rtl' is set to N.

Using `-finline-limit=600' thus results in the default settings for these parameters. See below for documentation of the individual parameters controlling inlining.

_Note:_ A pseudo instruction represents, in this particular context, an abstract measurement of a function's size. It in no way represents a count of assembly instructions, and as such its exact meaning might change from one release to another.

`-freduce-all-givs'
Forces all general-induction variables in loops to be strength-reduced.

_Note:_ When compiling programs written in Fortran, `-fmove-all-movables' and `-freduce-all-givs' are enabled by default when you use the optimizer. These options may generate better or worse code; results are highly dependent on the structure of loops within the source code.

`-fprefetch-loop-arrays'
If supported by the target machine, generate instructions to prefetch memory to improve the performance of loops that access large arrays. Disabled at level `-Os'.

rm -f *.da *.life analyz_prbrob.out
Remove any profile feedback information from previous runs.

----------------------------------------------------------------------------
Portability flags used with gcc 3.3 compiler
----------------------------------------------------------------------------

-DFMAX_IS_DOUBLE
Denotes the availability of "double fmax(double, double)" in the system library. Used in 252.eon.

-DHAS_ERRLIST
Indicates that the system provides the "sys_nerr" and "sys_errlist[]" variables. Used in 252.eon.

-DLINUX_i386
Used to enable LINUX specific defines in 186.crafty.

-DPSEC_CPU2000_GLIBC22
Compatibility with 2.2 & later versions of glibc (253.perlbmk).

-DSPEC_CPU2000_LINUX_I386
Specifies to compile for LINUX system (253.perlbmk).
-DSPEC_CPU2000_LP64
(Portability) Used to make longs and pointers 64 bit (Used in all benchmarks, except peak runs of 181.mcf, 197.parser and 300.twolf).

-DSPEC_CPU2000_NEED_BOOL
Use SPEC provided definition of the boolean type (253.perlbmk).

-DSYS_IS_USG
Specifies that the operating system is USG compliant. Used in 254.gap.

-DSYS_HAS_TIME_PROTO
Do not explicitly declare time(). Used in 254.gap.

-DSYS_HAS_SIGNAL_PROTO
Do not explicitly include the contents of . Used in 254.gap.

-DSYS_HAS_IOCTL_PROTO
Do not explicitly declare ioctl(). Used in 254.gap.

-DSYS_HAS_ANSI
System is ANSI compliant. Used in 254.gap.

-DSYS_HAS_CALLOC_PROTO
Do not explicitly declare calloc(). Used in 254.gap.

----------------------------------------------------------------------------
PGI (Portland Group International) compiler 5.1 flags
----------------------------------------------------------------------------

+ACML
Linking with AMD Core Math Library (version 1.5). Supplied with the PGI compiler 5.1.

RM_SOURCES=lapak.f90
Remove the source file 'lapak.f90' in 178.galgel.

-DSPEC_CPU2000_LP64
(Portability) Used to make longs and pointers 64 bit.

The optimization levels and their meanings are as follows:

-O0
A basic block is generated for each Fortran statement. No scheduling is done between statements. No global optimizations are performed.

-O1
Scheduling within extended basic blocks is performed. Some register allocation is performed. No global optimizations are performed.

-O2
All level 1 optimizations are performed. In addition, scalar optimizations such as induction recognition and loop invariant motion are performed by the global optimizer.

-O3
This level performs all level-one and level-two optimizations and enables more aggressive hoisting and scalar replacement optimizations.
-fast
Equivalent to "-O2 -Munroll -Mnoframe -Mlre"

-fastsse
Equivalent to "-fast -Mscalarsse -Mvect=sse -Mcache_align -Mflushz"

-Mcache_align
Align unconstrained objects of length greater than or equal to 16 bytes on cache-line boundaries. An unconstrained object is a data object that is not a member of an aggregate structure or common block. This option does not affect the alignment of allocatable or automatic arrays.
Note: To effect cache-line alignment of stack-based local variables, the main program or function must be compiled with -Mcache_align.

-Mfixed
Process source using Fortran fixed-form specifications.

-Mflushz
Set SSE MXCSR register to flush-to-zero mode.

-Mipa=[option]
Enables interprocedural analysis with the specified option. The valid options are:

-Mipa=align
Instructs the IPA to recognize when pointer targets are all cache-line aligned, allowing better SSE code generation.

-Mipa=arg
Instructs the IPA to remove arguments replaced by -Mipa=ptr,const

-Mipa=const
Enable propagation of constants across procedure calls.

-Mipa=fast
Equivalent to: -Mipa=const,globals,localarg,ptr,vestigial

-Mipa=globals
Instructs the IPA to optimize references to globals when not used in procedure calls.

-Mipa=localarg
Externalizes local variables for use with -Mipa=arg

-Mipa=ptr
Instructs the IPA to perform pointer disambiguation across procedure calls.

-Mipa=vestigial
Instructs the IPA to eliminate functions that are not called.

-mp
Enable OpenMP.

-Mnoframe
Eliminate operations that set up a true stack frame pointer for functions.

-Mnosmart
Don't run the Smart assembly re-write tool to enable post-compilation linear assembly scheduling and optimization.

-Mscalarsse
Utilize the SSE (Streaming SIMD (Single Instruction Multiple Data) Extensions) and SSE2 instructions to perform the operations coded. This assumes the user has an assembler capable of interpreting SSE/SSE2 instructions, as in later versions of Linux. This implies -Mflushz.

-Munroll
Invokes the loop unroller.
This also sets the optimization level to 2 if the level is set to less than 2.

c:m
Instructs the compiler to completely unroll loops with a constant loop count less than or equal to m, a supplied constant. If this value is not supplied, the m count is set to 4.

n:u
Instructs the compiler to unroll u times a loop which is not completely unrolled, or which has a non-constant loop count. If u is not supplied, the unroller computes the number of times a candidate loop is unrolled.

-Mvect=sse
Instructs the vectorizer to search for loops and, where possible, use the SSE or SSE2 and prefetch instructions (depending on which processor is targeted).

----------------------------------------------------------------------------
Other Notes
----------------------------------------------------------------------------

taskset [options] [mask] [pid | command [arg] ... ]
taskset is used to set or retrieve the CPU affinity of a running process given its PID, or to launch a new COMMAND with a given CPU affinity. The CPU affinity is represented as a bitmask, with the lowest order bit corresponding to the first logical CPU and the highest order bit corresponding to the last logical CPU. When taskset returns, it is guaranteed that the given program has been scheduled to a legal CPU.

The default behaviour of taskset is to run a new command with a given affinity mask:
taskset [mask] [command] [arguments]

The taskset command is used in the following form in the config file:
submit= "MYNUM=$SPECUSERNUM" ; MYMASK=\$((1<<\$SPECUSERNUM)); /usr/bin/taskset \$MYMASK $command

$MYMASK is the bitmask corresponding to a specific SPECUSERNUM. For example, the $MYMASK value for the first copy of a rate run will be 0x00000001, for the second copy it will be 0x00000002, etc. Thus, the first copy of the rate run will have a CPU affinity of CPU0, the second copy will have the affinity CPU1, etc.
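The mask arithmetic in the submit line (1 shifted left by SPECUSERNUM) can be checked with a short sketch. The function names below are hypothetical and are not part of taskset or the SPEC tools; the sketch only assumes, as the text states, that copy numbers start at 0 and that bit i of the mask selects logical CPU i.

```python
def affinity_mask(copy_number):
    """Bitmask with only the bit for logical CPU `copy_number` set."""
    return 1 << copy_number


def cpus_in_mask(mask):
    """List the logical CPU indices whose bits are set in `mask`."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


# Masks for the first four copies of a rate run.
for copy in range(4):
    mask = affinity_mask(copy)
    print(f"copy {copy}: mask 0x{mask:08x} -> CPUs {cpus_in_mask(mask)}")
```

Because each copy gets exactly one bit, no two copies share a CPU as long as the number of copies does not exceed the number of logical CPUs.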
BIOS Setting Definitions

DRAM Interleave defines whether data will be interleaved among the four data banks within individual DRAMs.

Node Interleave defines whether or not data addresses will be alternating between both processors in 4KB blocks.

ACPI SRAT defines whether the Static Resource Allocation Table is exported by the BIOS to a location where the operating system can see it. The SRAT may only be exported when Node Interleave is disabled.
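As a rough illustration of what alternating addresses between both processors in 4KB blocks means, the sketch below maps a physical address to a node. The simple modulo mapping, the two-node count and the function name are assumptions made for illustration; the actual mapping is determined by the memory controller and BIOS.

```python
BLOCK_SIZE = 4 * 1024  # 4KB interleave granularity, as stated in the text
NUM_NODES = 2          # "both processors": assumes a two-node system


def node_for_address(addr):
    """Node that would own `addr` under simple round-robin interleaving."""
    return (addr // BLOCK_SIZE) % NUM_NODES


# Consecutive 4KB blocks alternate between node 0 and node 1.
for addr in (0x0000, 0x0FFF, 0x1000, 0x2000, 0x3000):
    print(f"address 0x{addr:04x} -> node {node_for_address(addr)}")
```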