These ‘-m’ options are defined for the SH implementations:
-isa=sh4-nofpu to the assembler.
-m4a-nofpu, except that it implicitly passes
-dsp to the assembler. GCC doesn't generate any DSP instructions at the moment.
switch tables. The default is to use 16-bit offsets.
-mdalign for alignment constraints.
MAC register as call-clobbered, even if
-mieee is implicitly enabled. If
-mno-ieee is implicitly set, which results in faster floating-point greater-equal and less-equal comparisons. The implicit settings can be overridden by specifying either
-musermode is in effect and the selected code generation option (e.g.
-m4) does not allow the use of the
icbi instruction. If the selected code generation option does not allow the use of the
-musermode is not in effect, the inlined code manipulates the instruction cache address array directly with an associative write. This not only requires privileged mode at run time, but it also fails if the cache line had been mapped via the TLB and has become unmapped.
sh*-*-linux* and SH3* or SH4*. When the target is SH4A, this option also partially utilizes the hardware atomic instructions
movco.l to create more efficient code, unless ‘
strict’ is specified.
gbr-offset=’ parameter has to be specified as well.
SR.IMASK = 1111. This model works only when the program runs in privileged mode and is only suitable for single-core systems. Additional support from the interrupt/exception handling code of the system is not required. This model is enabled by default when the target is
sh*-*-linux* and SH1* or SH2*.
movco.l instructions only. This is only available on SH4A and is suitable for multi-core systems. Since the hardware instructions support only 32-bit atomic variables, access to 8- or 16-bit variables is emulated with 32-bit accesses. Code compiled with this option is also compatible with other software atomic model interrupt/exception handling systems if executed on an SH4A system. Additional support from the interrupt/exception handling code of the system is not required for this model.
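The emulation of narrow atomic accesses with 32-bit accesses can be illustrated in portable C. The following is only a sketch of the general technique, not the instruction sequence GCC actually emits for SH4A: the helper name `atomic_fetch_add_u8_in_word` and its byte-numbering convention are assumptions made for this example, and the 32-bit compare-and-swap is expressed through GCC's generic `__atomic` built-ins rather than through the hardware instructions directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC or libgcc): atomically add `value`
   to byte number `byte_index` of the 32-bit word at `word`, where byte 0
   is the least significant byte of the word's value. Only 32-bit atomic
   operations are used. Returns the previous value of that byte. */
uint8_t atomic_fetch_add_u8_in_word(uint32_t *word, unsigned byte_index,
                                    uint8_t value)
{
    assert(byte_index < 4);

    const unsigned shift = byte_index * 8;
    const uint32_t mask = (uint32_t)0xFF << shift;

    uint32_t old_word = __atomic_load_n(word, __ATOMIC_RELAXED);
    uint32_t new_word;

    do {
        /* Extract the target byte, update it, and splice it back in
           without touching the three neighbouring bytes. */
        uint8_t old_byte = (uint8_t)((old_word & mask) >> shift);
        uint8_t new_byte = (uint8_t)(old_byte + value);
        new_word = (old_word & ~mask) | ((uint32_t)new_byte << shift);
        /* If the exchange fails, old_word is refreshed with the current
           contents of *word and the loop retries. */
    } while (!__atomic_compare_exchange_n(word, &old_word, new_word, 0,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_RELAXED));

    return (uint8_t)((old_word & mask) >> shift);
}
```

Because the whole word is reloaded and rewritten on every retry, a concurrent update to a neighbouring byte makes the exchange fail and retry rather than being overwritten.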
soft-tcb’ model has been selected. For other models this parameter is ignored. The specified value must be an integer multiple of four and in the range 0-1020.
__atomic_test_and_set. Notice that depending on the particular hardware and software configuration this can degrade overall performance due to the operand cache line flushes that are implied by the
tas.b instruction. On multi-core SH4A processors the
tas.b instruction must be used with caution since it can result in data corruption for certain cache configurations.
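For reference, this is the kind of source code that reaches `__atomic_test_and_set`. The sketch below is a minimal try-lock built on the GCC built-ins `__atomic_test_and_set` and `__atomic_clear`; the type `spin_flag` and the functions `spin_try_lock` and `spin_unlock` are hypothetical names introduced for this example. Whether the built-in is compiled to a single `tas.b` instruction depends on the options in effect, so the code shows usage only and makes no claim about the generated instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-byte lock type for this example. */
typedef struct {
    bool flag;
} spin_flag;

/* Try to take the lock once. __atomic_test_and_set sets the byte and
   returns its previous contents, so a previous value of false means
   this caller acquired the lock. */
bool spin_try_lock(spin_flag *f)
{
    assert(f != NULL);
    return !__atomic_test_and_set(&f->flag, __ATOMIC_ACQUIRE);
}

/* Release the lock by clearing the byte. */
void spin_unlock(spin_flag *f)
{
    assert(f != NULL);
    __atomic_clear(&f->flag, __ATOMIC_RELEASE);
}
```

A caller would typically loop on `spin_try_lock` until it returns true, which is the pattern most exposed to the cache line flush cost described above.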
-mno-inline-ic_invalidate if the inlined code would not work in user mode.
-musermode is the default when the target is
sh*-*-linux*. If the target is SH1* or SH2*
-musermode has no effect, since there is no user mode.
inv’ where, if no CSE or hoisting opportunities have been found, or if the entire operation has been hoisted to the same place, the last stages of the inverse calculation are intertwined with the final multiply to reduce the overall latency, at the expense of using a few more instructions, and thus offering fewer scheduling opportunities with other code.
inv:minlat’ strategy. This gives high code density for
inv’ algorithm for initial code generation, but if the code stays unoptimized, revert to the ‘
call2’, or ‘
fp’ strategies, respectively. Note that the potentially-trapping side effect of division by zero is carried by a separate instruction, so it is possible that all the integer instructions are hoisted out, but the marker for the side effect stays where it is. A recombination to floating-point operations or a call is not possible in that case.
inv:minlat’ strategy. In the case that the inverse calculation is not separated from the multiply, they speed up division where the dividend fits into 20 bits (plus sign where applicable) by inserting a test to skip a number of operations in this case; this test slows down the case of larger dividends. ‘
inv20u’ assumes the case of such a small dividend to be unlikely, and ‘
inv20l’ assumes it to be likely.
For targets other than SHmedia, strategy can be one of:
div1 to perform the operation. Division by zero calculates an unspecified result and does not trap. This is the default except for SH4, SH2A and SHcompact.
div1 instruction with case distinction for larger divisors. Division by zero calculates an unspecified result and does not trap. This is the default for SH4. Specifying this for targets that do not have dynamic shift instructions defaults to
When a division strategy has not been specified the default strategy is selected based on the current target. For SH2A the default strategy is to use the
divu instructions instead of library function calls.
call’ and ‘
inv:call’ division strategies, and the compiler still expects the same sets of input/output/clobbered registers as if this option were not present.
gettr instruction to number. The default is 2 if
-mpt-fixed is in effect, 100 otherwise.
pt* instructions won't trap. This generally generates better-scheduled code, but is unsafe on current hardware. The current architecture definition says that
ptrel trap when the target anded with 3 is 3. This has the unintentional effect of making it unsafe to schedule these instructions before a branch, or hoist them out of a loop. For example,
__do_global_ctors, a part of
libgcc that runs constructors at program startup, calls functions in a list which is delimited by −1. With the
ptabs is done before testing against −1. That means that all the constructors run a bit more quickly, but when the loop comes to the end of the list, the program crashes because
ptabs loads −1 into a target register.
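The shape of the loop in question can be sketched in C. This is an illustration modelled on the description above, not the actual libgcc source: `run_ctor_list`, `CTOR_LIST_END`, and the two sample constructors are hypothetical names. The point is the ordering in the source code, where each entry is compared against the −1 sentinel before it is used as a call target; the hazard described above arises when the target-register load for the call is scheduled ahead of that comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ctor_fn)(void);

/* Hypothetical sentinel: the list is delimited by the value -1. */
#define CTOR_LIST_END ((ctor_fn)(intptr_t)-1)

/* Sample constructors, present only so the sketch can be exercised. */
int sample_ctor_calls;
void sample_ctor_a(void) { sample_ctor_calls += 1; }
void sample_ctor_b(void) { sample_ctor_calls += 10; }

/* Call every function in `list` up to, but not including, the sentinel.
   Returns the number of functions called. */
int run_ctor_list(const ctor_fn *list)
{
    assert(list != NULL);

    int called = 0;
    for (const ctor_fn *p = list; *p != CTOR_LIST_END; ++p) {
        /* The sentinel test in the loop condition guards this call:
           the entry becomes a branch target only after it is known
           not to be -1. */
        (*p)();
        ++called;
    }
    return called;
}
```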
Since this option is unsafe for any hardware implementing the current architecture specification, the default is
-mno-pt-fixed. Unless specified explicitly with
-mno-pt-fixed also implies
-mgettrcost=100; this deters register allocation from using target registers for storing ordinary integers.
ptrel, but with assembler and/or linker tricks it is possible to generate symbols that cause
ptrel to trap. This option is only meaningful when
-mno-pt-fixed is in effect. It prevents cross-basic-block CSE, hoisting and most scheduling of symbol loads. The default is
bf are fast. If
-mzdcbranch is specified, the compiler prefers zero displacement branch code sequences. This is enabled by default when generating code for SH4 and SH4A. It can be explicitly disabled by specifying
nop if a suitable instruction can't be found. By default this option is disabled. It can be enabled to work around hardware bugs as found in the original SH7055.
-mfused-madd option is now mapped to the machine-independent
-mno-fused-madd is mapped to
fsca instruction for sine and cosine approximations. The option
-mfsca must be used in combination with
-funsafe-math-optimizations. It is enabled by default when generating code for SH4A. Using
-mno-fsca disables sine and cosine approximations even if
-funsafe-math-optimizations is in effect.
fsrra instruction for reciprocal square root approximations. The option
-mfsrra must be used in combination with
-ffinite-math-only. It is enabled by default when generating code for SH4A. Using
-mno-fsrra disables reciprocal square root approximations even if
-ffinite-math-only are in effect.
© Free Software Foundation
Licensed under the GNU Free Documentation License, Version 1.3.