Description of compiler flags for Intel C/C++ compiler for Linux:
-----------------------------------------------------------------

-O2
    Optimizes for speed. -O2 turns ON intrinsic inlining.
    Enables the following capabilities for performance gain:
        Constant propagation
        Copy propagation
        Dead-code elimination
        Global register allocation
        Global instruction scheduling and control speculation
        Loop unrolling
        Optimized code selection
        Partial redundancy elimination
        Strength reduction/induction variable simplification
        Variable renaming
        Exception handling optimizations
        Tail recursions
        Peephole optimizations
        Structure assignment lowering and optimizations
        Dead store elimination
    ON by default.

-O3
    Enables -O2 option with more aggressive optimization, for example,
    prefetching, scalar replacement, and loop transformations. Optimizes
    for maximum speed, but does not guarantee higher performance unless
    loop and memory access transformations take place.

-ax[i|M|K|W]
    Generates, in a single binary, code specialized to the extensions
    specified by the codes:
        i    Pentium Pro, Pentium II processors
        M    Pentium with MMX(TM) technology processor
        K    Pentium III processor
        W    Pentium 4 processor
    In addition, -ax generates generic IA-32 code. The generic code is
    usually slower.

-x[i|M|K|W]
    Generates specialized code to run exclusively on the processors
    supporting the extensions indicated by the codes:
        i    Pentium Pro, Pentium II processors
        M    Pentium with MMX(TM) technology processor
        K    Pentium III processor
        W    Pentium 4 processor

-nolib_inline
    Disables inline expansion of standard library functions.
    OFF by default.

-ipo
    Enables interprocedural optimizations across files. OFF by default.
    Enables the following optimizations:
        Inline function expansion
        Interprocedural constant propagation
        Monitoring module-level static variables
        Dead code elimination
        Propagation of function characteristics
        Multifile optimization
        Passing arguments in registers
        Loop-invariant code motion

-prof_gen[x]
    Instructs the compiler to produce instrumented code in your object
    files in preparation for instrumented execution.
    NOTE: The dynamic information files are produced in phase 2 when you
    run the instrumented executable.
    OFF by default.

-prof_use
    Instructs the compiler to produce a profile-optimized executable and
    merges available dynamic information (.dyn) files into a pgopti.dpi
    file. If you perform multiple executions of the instrumented program,
    -prof_use merges the dynamic information files again and overwrites
    the previous pgopti.dpi file.
    OFF by default.

-ansi
    Select strict ANSI C/C++ conformance dialect. OFF by default.

-fno-alias
    Assume no aliasing in program. OFF by default.

-ansi_alias[-]
    -ansi_alias directs the compiler to assume the following:
        - Arrays are not accessed out of bounds.
        - Pointers are not cast to non-pointer types, and vice-versa.
        - References to objects of two different scalar types cannot
          alias. For example, an object of type int cannot alias with an
          object of type float, or an object of type float cannot alias
          with an object of type double.
    If your program satisfies the above conditions, setting the
    -ansi_alias flag will help the compiler better optimize the program.
    However, if your program does not satisfy one of the above
    conditions, the -ansi_alias flag may lead the compiler to generate
    incorrect code.
    OFF by default.

-ipo
    Enables interprocedural optimizations across files. OFF by default.
    Enables the following optimizations:
        Inline function expansion
        Interprocedural constant propagation
        Monitoring module-level static variables
        Dead code elimination
        Propagation of function characteristics
        Multifile optimization
        Passing arguments in registers
        Loop-invariant code motion

-unroll[n]
    Set maximum number (n) of times to unroll loops. Omit n to use
    default heuristics. Use n = 0 to disable loop unrolling.
    OFF by default.

-prefetch[-]
    Accepted with a warning and ignored by the Intel C/C++ compiler.

-fshort-enums
    Allocate as many bytes as needed for enumerated types.
    OFF by default.

-Qoption,tool,opts
    Pass the options opts to the tool specified by tool. The option must
    be entered with a -ip or -ipo specification.

    opts specifiers:
        -ip_ninl_max_stats=n
            Sets the valid max number of intermediate language statements
            for a function that is expanded in line. The number n is a
            positive integer. The number of intermediate language
            statements usually exceeds the actual number of source
            language statements. The default value for n is 230. The
            compiler uses a larger limit for user inline functions.
        -ip_ninl_min_stats=n
            Sets the valid min number of intermediate language statements
            for a function that is expanded in line. The number n is a
            positive integer. The default value for the IA-32 compiler
            is 7.
        -ip_ninl_max_total_stats=n
            Sets the maximum increase in size of a function, measured in
            intermediate language statements, due to inlining. n is a
            positive integer whose default value is 2000.

    tool options:
        c    C/C++ compiler.
        f    Fortran compiler. Compiler gives no warnings when f is
             specified with a C compiler.
    OFF by default.

-static
    Prevents linking with shared libraries. Default is OFF.


Description of compiler flags for Intel Fortran Compiler 7.0 for Linux:
-----------------------------------------------------------------------

-O2
    Optimize to favor code speed. Disables option -fp. The -O2 option is
    ON by default. Inlines intrinsics.
    Enables the following capabilities for performance gain:
        constant propagation
        copy propagation
        dead-code elimination
        global register allocation
        global instruction scheduling and control speculation
        loop unrolling
        optimized code selection
        partial redundancy elimination
        strength reduction/induction variable simplification
        variable renaming
        predication
        software pipelining

-fp
    Disables the use of the ebp register in optimizations. Directs the
    compiler to use the ebp-based stack frame for all functions.
    Default is OFF.

-O3
    Enables -O2 option with more aggressive optimization. Optimizes for
    maximum speed, but does not guarantee higher performance unless loop
    and memory access transformations take place. In conjunction with
    the -axK and -xK options, this option causes the compiler to perform
    more aggressive data dependency analysis than for -O2. This may
    result in longer compilation times.
    Default is OFF.

-ax{i|M|K|W}
    Generates processor-specific code corresponding to one of the codes
    i, M, K, and W, while also generating generic IA-32 code. The
    compiler generates multiple versions of some routines and chooses,
    at run time, the best version for the host processor. The
    processor-specific codes are:
        i    Pentium Pro, Pentium II processors
        M    Pentium with MMX(TM) technology processor
        K    Pentium III processor
        W    Pentium 4 processor

-x{i|M|K|W}
    Generates code that is optimized for the specific processor
    corresponding to one of the codes i, M, K, and W. Unlike -ax, no
    generic IA-32 code path is generated, so the resulting program may
    not run on processors older than the target specified.
        i    Pentium Pro, Pentium II processors
        M    Pentium with MMX(TM) technology processor
        K    Pentium III processor
        W    Pentium 4 processor

-nolib_inline
    Disables inline expansion of standard library functions.
    Default is ON.

-ipo
    Enables interprocedural optimizations across files. Default is OFF.

-ansi_alias[-]
    Enables (default) or disables the assumption of the program's ANSI
    conformance. Default is ON.
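
The aliasing assumptions behind -ansi_alias and -fno-alias are easiest to
see in C. The following is only a minimal sketch: the function names
add_scalar and scale_and_count are hypothetical and do not come from any
Intel documentation. It shows one case where two pointers may legally
alias, and one case where the -ansi_alias rules say they cannot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds *src to each of the n elements of dst.
 * dst and src point to the same type, so they may legally alias.
 * The compiler must therefore reload *src on every iteration unless
 * it is told (for example with -fno-alias) that no aliasing occurs. */
void add_scalar(float *dst, const float *src, size_t n)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; ++i)
        dst[i] += *src;
}

/* Hypothetical helper: scales data by factor and counts the elements.
 * count points to int and data points to float. Under the -ansi_alias
 * rules, objects of two different scalar types cannot alias, so the
 * compiler may keep *count in a register for the whole loop. */
void scale_and_count(float *data, float factor, int *count, size_t n)
{
    size_t i;
    assert(data != NULL && count != NULL);
    *count = 0;
    for (i = 0; i < n; ++i) {
        data[i] *= factor;
        *count += 1;
    }
}
```

Calling add_scalar(a, &a[0], 3) on a = {1, 2, 3} is legal C and yields
{2, 4, 5}, because a[0] changes after the first iteration. A compiler that
assumed no aliasing could instead produce {2, 3, 4}, which is why
-fno-alias is only safe for programs that never pass overlapping pointers.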

-fno-alias
    Assume no aliasing in program. Default is OFF.

-scalar_rep[-]
    Enables (default) or disables scalar replacement performed during
    loop transformations (requires -O3).

-unroll[n]
    Set maximum number of times to unroll a loop.
        n omitted:  compiler decides whether to perform unrolling or
                    not (default)
        n = 0:      disables unroller
    Eliminates some code; hides latencies; can increase code size.

-prof_gen
    Instruments the program for profiling, to collect the execution
    count of each basic block. Default is OFF.

-prof_use
    Enables the use of profiling dynamic feedback information during
    optimization. Profiles the most frequently executed areas and
    increases effectiveness of IPO. Default is OFF.

-prefetch[-]
    Enables or disables prefetch insertion (requires -O3). Reduces the
    wait time; optimum use is determined empirically. ON by default.

-auto
    Makes all local variables AUTOMATIC. Default is OFF.

-align
    Analyzes and reorders memory layout for variables and arrays.
    Default is ON.

-Zp{n}
    Specifies alignment constraint for structures on 1-, 2-, 4-, 8-, or
    16-byte boundary. Default is -Zp4.

-Qoption,tool,opts
    Passes the options specified by opts to tool, where opts is a
    comma-separated list of options. The syntax for this option is:
        -Qoption,tool,opts
    tool designates one or more of these tools:
        fpp     Intel Fortran preprocessor
        f       Fortran compiler (f90com)
        link    Linker (ld(1))
    opts indicates one or more valid argument strings for the designated
    program.

    opts specifiers:
        -ip_ninl_max_stats=n
            Sets the valid maximum number of intermediate language
            statements for a function that is expanded in line. The
            number n is a positive integer. The number of intermediate
            language statements usually exceeds the actual number of
            source language statements. The default value for n is 230.
        -ip_ninl_max_total_stats=n
            Sets the maximum increase in size of a function, measured in
            intermediate language statements, due to inlining. The
            number n is a positive integer. The default value for n is
            2000.
    OFF by default.

-static
    Prevents linking with shared libraries. Default is OFF.
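
The transformation controlled by the -unroll[n] option in both sections
above can be pictured with a hand-written example. This is a sketch of the
general technique only, with hypothetical function names; it does not
describe how the Intel compilers implement unrolling internally. The second
function repeats the loop body four times per iteration, so fewer loop
tests and branches are executed at the cost of more code.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one addition and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    size_t i;
    assert(v != NULL || n == 0);
    for (i = 0; i < n; ++i)
        total += v[i];
    return total;
}

/* The same loop unrolled by hand by a factor of 4: one loop test per
 * four elements, followed by a remainder loop for the last n % 4. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;
    assert(v != NULL || n == 0);
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; ++i)      /* remainder iterations */
        total += v[i];
    return total;
}
```

Both functions return the same result for any input. Whether the unrolled
form is actually faster depends on the processor and the loop body, which
is why the flag leaves the decision to the compiler's heuristics when n is
omitted.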