Floating-point Options for Multiple Architectures

The options described in this topic provide optimizations with varying degrees of precision in floating-point calculations for the IA-32, Intel® EM64T, and Itanium® architectures. Where an option is not supported on all architectures, its description lists the supported architectures.

Using the options listed below can reduce application performance. In general, to achieve greater application performance you might need to sacrifice some degree of floating-point accuracy.

Specifying -O0 (Linux*) or /Od (Windows*) disables the optimizations provided by the floating-point options listed in this topic.

Windows*

Linux*

Description

/fp

-fp-model

Specifies semantics used in floating-point calculations.

The default model is fast, which enables aggressive optimizations when implementing floating-point calculations. The optimizations increase speed, but might affect floating-point computation accuracy.
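To see why aggressive optimization can affect accuracy, recall that floating-point addition is not associative, so regrouping a sum may produce a different value. The following C sketch is illustrative only: the function names are hypothetical, and it shows the arithmetic effect of regrouping rather than what the compiler actually does under any particular model.

```c
#include <stdio.h>

/* Sums three values left to right: (a + b) + c. */
double sum_left_to_right(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile keeps the grouping as written */
    return t + c;
}

/* Sums the same values regrouped: a + (b + c). */
double sum_regrouped(double a, double b, double c)
{
    volatile double t = b + c;   /* volatile keeps the grouping as written */
    return a + t;
}

/* Prints both groupings so the difference is visible. */
void show_regrouping(void)
{
    double a = 1.0e20, b = -1.0e20, c = 1.0;
    printf("(a + b) + c = %g\n", sum_left_to_right(a, b, c));
    printf("a + (b + c) = %g\n", sum_regrouped(a, b, c));
}
```

With the values 1.0e20, -1.0e20, and 1.0, the first grouping yields 1 and the second yields 0, because 1.0 is smaller than the spacing between adjacent double-precision values near 1.0e20.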

See the following topic for detailed descriptions of the different models and examples:

/fltconsistency

-fltconsistency

Enables improved floating-point consistency and may slightly reduce execution speed. It limits floating-point optimizations and maintains declared precision.

The option also disables inlining of math library functions.

IA-32 and Intel EM64T:

  • In general, the option maintains maximum precision, not the declared precision.
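The difference between declared precision and a wider intermediate precision can be sketched in C as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the wider precision is modeled with an explicit double rather than with whatever intermediate format the compiler might choose.

```c
#include <stdio.h>

/* Adds two floats and rounds the result to the declared type (float). */
float add_declared(float a, float b)
{
    volatile float r = a + b;    /* the store forces rounding to float */
    return r;
}

/* Adds the same values while carrying the intermediate in double. */
double add_wider(float a, float b)
{
    return (double)a + (double)b;
}

/* Prints both results for a case where they differ. */
void show_precision(void)
{
    float a = 16777216.0f;       /* 2^24: above this, not every integer is a float */
    float b = 1.0f;
    printf("declared (float): %.1f\n", add_declared(a, b));
    printf("wider (double):   %.1f\n", add_wider(a, b));
}
```

Here the float result is 16777216.0 because 16777217 is not representable in single precision, while the double result is 16777217.0. Results that depend on which precision is kept are the kind of inconsistency the option is meant to limit.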

For more information, see the following topic:

/Qprec

-mp1

Improves floating-point precision; this option has less impact on performance than the -fltconsistency (Linux) or /fltconsistency (Windows) option.

This option prevents the compiler from performing optimizations that change NaN comparison semantics. It also causes all values used in comparisons to be truncated to declared precision before the comparison. Furthermore, the option ensures the use of library routines, which provide more accurate results than the x87 transcendental instructions. Finally, the option causes the Intel® Compiler to use precise divide and square root operations.

For more information, see the following topic:

  • -mp1 compiler option
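The NaN comparison semantics mentioned for this option can be illustrated with a short C sketch. The function names are hypothetical; the point is only that a comparison involving NaN is always false (except for !=), so rewriting !(x < y) as x >= y is not a safe transformation.

```c
#include <math.h>
#include <stdio.h>

/* True only for NaN: NaN compares unequal to everything, itself included. */
int is_nan_by_compare(double x)
{
    volatile double v = x;       /* volatile keeps the comparison from being folded */
    return v != v;
}

/* Logical negation of "less than". */
int not_less_than(double x, double y)
{
    return !(x < y);
}

/* "Greater than or equal"; equals not_less_than only when neither input is NaN. */
int greater_or_equal(double x, double y)
{
    return x >= y;
}

/* Prints the two comparisons for a NaN operand. */
void show_nan_compare(void)
{
    printf("!(NaN < 1.0) = %d\n", not_less_than(NAN, 1.0));
    printf("NaN >= 1.0   = %d\n", greater_or_equal(NAN, 1.0));
}
```

For a NaN operand, not_less_than returns 1 while greater_or_equal returns 0, so an optimization that treats the two forms as interchangeable changes program behavior.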

/Qftz

-ftz

Flushes denormal results to zero when the application is in gradual underflow mode. Flushing the denormal values to zero with this option may improve overall application performance.

Use this option if the denormal values are not critical to application behavior.
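As a rough illustration of what a denormal result is, the following C sketch produces one under gradual underflow and classifies it. The function names are hypothetical, and the sketch assumes the program runs in the default gradual underflow mode; in flush-to-zero mode the same division would be expected to return zero instead.

```c
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Halves a value; volatile keeps the division from being folded away. */
double halve(double x)
{
    volatile double v = x;
    return v / 2.0;
}

/* Returns 1 if x is a denormal (subnormal) number, 0 otherwise. */
int is_denormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Prints a value just below the smallest normal double. */
void show_underflow(void)
{
    double tiny = halve(DBL_MIN);   /* below the smallest normal double */
    printf("DBL_MIN / 2 = %g, denormal: %d\n", tiny, is_denormal(tiny));
}
```

If the application does not depend on values in this range, flushing them to zero gives up little accuracy.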

Architectures:

  • Linux: IA-32, Intel® EM64T, Itanium®

  • Windows: IA-32, Itanium®

  • Mac OS*: IA-32

This option affects the result of abrupt underflow by setting the floating underflow to zero and allowing the execution to continue.

IA-32 and Intel® EM64T:

  • The compiler automatically sets the flush-to-zero mode in the SSE Control Register (MXCSR) when SSE instructions are enabled.

  • Use this option to flush x87 floating-point values to zero (0). This option can significantly degrade performance in x87 code since the generated code stream must be synchronized after each floating-point instruction to allow the operating system to do the necessary abrupt underflow corrections. There may be performance gains in code targeting SSE and SSE2. There are other considerations when using SSE instructions. Refer to the compiler option topic below for other SSE-specific behaviors.

  • Use this option on any source where flushing x87 denormal values to zero is desired.

Itanium®:

  • Specifying the -O3 (Linux) or /O3 (Windows) option enables this option, so underflow is abrupt and results are flushed to zero (0). At lower optimization levels, gradual underflow is the default behavior.

  • Use this option only on source files containing main; this enables the FTZ mode. The initial thread, and any threads subsequently created by that process, will operate in FTZ mode.

  • Gradual underflow can degrade performance. Using a higher optimization level to get abrupt underflow by default, or setting this option explicitly, improves performance. The option may improve performance on the Itanium® 2 processor even in the absence of actual underflow, most frequently for single-precision code.

  • This option instructs the compiler to treat denormal values used in a computation as zero (0) so no floating invalid exception occurs.

If this option produces undesirable results in the numerical behavior of your program, you can turn the FTZ mode off by using -no-ftz (Linux) or /Qftz- (Windows) on the command line while still benefiting from the -O3 (Linux) or /O3 (Windows) optimizations.

For more information, see the following topic:

  • -ftz compiler option

/fpe

-fpe

Provides some control over the results of floating-point exception handling at run time for the main program.

For more information, see the following topic:

  • -fpe compiler option