FORCE_EVAL when expressions must be evaluated for their side-effect, other
usage of volatile is not justified, hacks around long double constants are
not justified eventhough gcc can miscompile those with non-default FPU setting)
+<li>When excessive precision is not harmful, temporary variables
+should be float_t or double_t (so on i386 no superfluous store is
+generated)
<li>Whenever fenv is accessed the FENV_ACCESS pragma of c99 should be used
(eventhough gcc does not yet support it), and all usage of optional FE_
macros should be protected by #ifdef
and then round to 64bit when storing it, this can
give different result than a single 64bit rounding.
(on x86-linux the default fpu setting is to round the
-results in extended precision, this only affects x87 instructions, not see2 etc)
+results in extended precision, this only affects x87 instructions, not sse2 etc)
(freebsd and openbsd use double precision by default)
<p>
So x = a+b may give different results depending on
<p>
<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#F.7.3">C99 annex F</a>
prohibits double rounding, but that's non-normative.
+<p>
+Note that the value of the result can only be ruined by
+double rounding in nearest rounding mode, but the double
+rounding issue haunts directed rounding modes as well:
+raising the underflow flag might be omitted.
+On x86 with downward rounding
+<pre>
+(double)(0x1p-1070 + 0x1p-2000L)
+</pre>
+does not raise underflow (only inexact) eventhough the
+final result is an inexact subnormal.
+
<li>Wider exponent range (x87 issue):
<p>
may be false when the two sides
are kept in different precision.
(This is not an x87 specific problem, it matters whenever there
-is a higher precision fp type than the currently used one.
+is a higher precision fp type than the currently used one and
+FLT_EVAL_METHOD!=0.
It goes away if the highest precision (long double) is used
everywhere, but that can have a huge penalty).
+(clang uses sse by default on i386 with FLT_EVAL_METHOD==0,
+while gcc uses the 80bit x87 fp registers and FLT_EVAL_METHOD==2)
<p>
C99 has a way to control this (see
<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#5.1.2.3">5.1.2.3 example 4</a>,
gcc 4.5 fixed it with '-fexcess-precision=standard'
(it is enabled by '-std=c99', but the default is
'-fexcess-precision=fast')
+(An alternative solution would be if gcc spilled the
+registers with temporary results without rounding,
+storing the 80 bit registers entirely in memory
+which would make the behaviour under FLT_EVAL_METHOD==2
+mode more predictable)
<p>
The workaround for older gcc is to force the
compiler to store the intermediate results:
what we want and don't depend on the implicit
excess precision).
+<li>Float literals
+<p>
+The standard allows 1 ulp errors in the conversion
+of decimal floating-point literals into floating-point
+values (it only requires the same result for the same
+literal thus <tt>1.1 - 1.1</tt> is always 0,
+but <tt>1.1 - 11e-1</tt> maybe +-0x1p-52 or 0).
+<p>
+A reasonable compiler always use correctly rounded
+conversion according to the default (nearest) rounding
+mode, but there are exceptions:
+the x87 has builtin constants which are faster to load
+from hardware than from memory
+(and the hw has sticky bit set correctly for rounding).
+gcc can recognize these constants so an operation on
+<pre>
+3.141592653589793238462643383L
+</pre>
+can turn into code that uses the <tt>fldpi</tt>
+instruction instead of memory loads.
+The only issue is that fldpi depends on
+the current rounding mode at runtime
+so the result can indeed be 1 ulp off compared
+to the compile-time rounded value.
+<p>
+According to the freebsd libm code gcc truncates long double
+const literals on i386.
+I assume this happens because freebsd uses 64bit long doubles by default
+(double precision) and gcc incorrectly uses the precision setting of the
+host platform instead of the target one, but i did not observe this on linux.
+(as a workaround sometimes double-double arithmetics was used
+to initialize long doubles on i386, but most of these should be
+fixed in musl's math code now)
+
<li>Compiler optimizations:
<p>
Runtime and compile time semantics may be different
C99 actually allows most of these optimizations
but they can be turned off with STDC pragmas (see
<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.10.6">6.10.6</a>).
-Unfortunately <a href="http://gcc.gnu.org/c99status.html">gcc does not support these pragmas</a>.
+Unfortunately <a href="http://gcc.gnu.org/c99status.html">gcc does not support these pragmas</a>
+nor clang (<a href="http://llvm.org/bugs/show_bug.cgi?id=8100">clang bug 8100</a>).
<p>
FENV_ACCESS ON tells the compiler that the code wants
to access the floating point environment (eg. set different rounding mode)
static const volatile two52 = 0x1p52;
</pre>
and using the '-frounding-math' gcc flag.
-<p>
-According to the freebsd libm code gcc truncates long double
-const literals on i386.
-I assume this happens because freebsd uses 64bit long doubles by default
-(double precision) and gcc incorrectly uses the precision setting of the
-host platform instead of the target one, but i did not observe this on linux.
-(as a workaround sometimes double-double arithmetics was used
-to initialize long doubles on i386, but most of these should be
-fixed in musl's math code now)
<li>ld80 vs ld128
<p>