X-Git-Url: http://nsz.repo.hu/git/?a=blobdiff_plain;f=libm%2Findex.html;h=adad680e096db0dd1e72c4e8473163b5bddbe588;hb=b008d9b4602b213d59f0e735f4844151da00fac5;hp=e177d6c972a686f49e95d89bec27e4c920ade1db;hpb=6a433528075d5cc62d4d4c3fefb59d42ec78355a;p=www diff --git a/libm/index.html b/libm/index.html index e177d6c..adad680 100644 --- a/libm/index.html +++ b/libm/index.html @@ -35,7 +35,7 @@ cvs -d $CVSROOT get src/li
|error| < 1.5 ulp-should hold. +should hold for most functions. (error is the difference between the exact result and the calculated floating-point value) (in theory correct rounding can be achieved but with big implementation cost, @@ -69,14 +69,16 @@ sqrt, trunc.
Binary representation of floating point numbers matter because bit hacks are often needed in the math code. +(in particular bit hacks are used instead of relational operations for nan +and sign checks because relational operators raise invalid fp exception on nan +and they treat -0.0 and +0.0 equally and more often than not these are not desired)
float and double bit manipulation can be handled in a portable way in c using union types: @@ -173,6 +178,8 @@ on the endianness and write different code for different architectures.
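A minimal sketch of the union approach mentioned above, for illustration only. The names `dbits`, `sign_bit` and `is_nan_bits` are hypothetical and not taken from the libm sources. It assumes double is an IEEE 754 binary64 value and that uint64_t shares its size and byte order, which avoids the word-order problem of splitting the value into two 32bit halves.

```c
#include <stdint.h>

/* Hypothetical helpers, assuming IEEE 754 binary64 doubles and a
   uint64_t with the same size and byte order as double. */
union dbits {
    double f;
    uint64_t i;
};

/* Sign check through the top bit: tells -0.0 from +0.0 and never
   performs a floating-point comparison, so no invalid exception
   is raised for nan. */
static int sign_bit(double x)
{
    union dbits u = { x };
    return (int)(u.i >> 63);
}

/* nan check: with the sign bit masked off, a nan is any pattern
   strictly above the +inf pattern (exponent all ones, mantissa 0). */
static int is_nan_bits(double x)
{
    union dbits u = { x };
    return (u.i & (UINT64_MAX >> 1)) > (UINT64_C(0x7ff) << 52);
}
```

With these, `sign_bit(-0.0)` and `sign_bit(0.0)` differ, while the relational test `-0.0 < 0.0` is false.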
The ugly parts of libm hacking.
Some notes are from: http://www.vinc17.org/research/extended.en.html +
Useful info about floating-point in gcc: +http://gcc.gnu.org/wiki/FloatingPointMath
-(According the freebsd libm code gcc truncates -long double const literals on i386. -I haven't yet verified if this still the case, -but as a workaround double-double arithmetics is used: -initializing the long double constant from two doubles) +According to the freebsd libm code gcc truncates long double +const literals on i386. +I assume this happens because freebsd uses 64bit long doubles by default +(double precision) and gcc incorrectly uses the precision setting of the +host platform instead of the target one, but i did not observe this on linux. +(as a workaround sometimes double-double arithmetics was used +to initialize long doubles on i386, but most of these should be +fixed in musl's math code now)
@@ -358,27 +368,27 @@ but feraiseexcept is not available for some reason, then simple arithmetics can be be used just for their exception raising side effect (eg. 1/0.0 to raise divbyzero), however beaware -of compiler optimizations (dead code elimination,..). +of compiler optimizations (constant folding and dead code elimination,..).
Unfortunately gcc does not always take fp exceptions into account: a simple x = 1e300*1e300; may not raise overflow exception at runtime, but get optimized into x = +inf. see compiler optimizations above.
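One common workaround for the constant folding problem is to route the operands through volatile objects. The sketch below is an assumption about how this could be done, not the method of any particular libm, and `overflow_at_runtime` is a hypothetical name.

```c
/* Hypothetical helper, assuming IEEE 754 doubles. */
static double overflow_at_runtime(void)
{
    /* volatile forces a real load, so the product below cannot be
       folded into +inf at compile time */
    volatile double big = 1e300;
    /* the volatile store keeps the multiplication from being
       removed as dead code */
    volatile double result = big * big;
    return result;
}
```

On targets with fp exception flags, calling fetestexcept(FE_OVERFLOW) from fenv.h after this function should report the overflow flag as set, since the multiplication is evaluated at runtime.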
-Another x87 gcc bug related to fp exceptions is that +Another x87 gcc bug related to fp exceptions is that in some cases comparision operators (==, <, etc) don't raise invalid when an operand is nan (eventhough this is required by ieee + c99 annex F). (see gcc bug52451).
-The ieee standard defines signaling and quite nan +The ieee standard defines signaling and quiet nan floating-point numbers as well. The c99 standard only considers quiet nan, but it allows signaling nans to be supported as well. Without signaling nans x * 1 is equivalent to x, but if signaling nan is supported then the former raises an invalid exception. -This may complicates things further if one wants to write +This may complicate things further if one wants to write portable fp math code.
A further libm design issue is the math_errhandling macro: @@ -389,8 +399,23 @@ but errno is hard to support: certain library functions are implemented as a single asm instruction (eg sqrt), the only way to set errno is to query the fp exception flags and then set the errno variable based on that. -So eventhough errno may be convenient in libm it is +So even though errno may be convenient, in libm it is not the right thing to do. +
+For soft-float targets however errno seems to be the only option +(which means annex F cannot be fully supported, as it requires +the support of exception flags). +The problem is that at context switches the fpu status should +be saved and restored which is done by the kernel on hard-fp +architectures when the state is in an fpu status word. +In case of soft-fp emulation this must be done by the c runtime: +context switches between threads can be supported with thread local +storage of the exception state, but signal handlers may do floating-point +arithmetics which should not alter the fenv state. +Wrapping signal handlers is not possible/difficult for various +reasons and the compiler cannot know which functions will be used +as signal handlers, so the c runtime has no way to guarantee that +signal handlers do not alter the fenv.
@@ -429,16 +454,26 @@ in gcc it can be 1.0fi).
The freebsd libm code has many inconsistencies (naming conventions, 0x1p0 notation vs decimal notation,..), -one of them is the integer type used for bitmanipulations: +one of them is the integer type used for bit manipulations: The bits of a double are unpacked into one of -int32_t, uint32_t and u_int32_t +int, int32_t, uint32_t and u_int32_t integer types.
-int32_t is used most often which is wrong because of -implementation defined signed int representation. +int32_t is used the most often which is not wrong in itself +but it is used incorrectly in many places. +
+int is a bit worse because unlike int32_t it is not guaranteed +to be 32bit two's complement representation. (but of course in +practice they are the same) +
+The issues found so far are left shift of negative integers +(undefined behaviour), right shift of negative integers +(implementation defined behaviour), signed overflow +(undefined behaviour), unsigned to signed conversion +(implementation defined behaviour).
-In general signed int is not handled carefully -in the libm code: scalbn even depends on signed int overflow. +It is easy to avoid these issues without performance impact, +but a bit of care should be taken around bit manipulations.
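As an illustration of the care mentioned above, the sketch below does all shifting and masking on unsigned types and converts to a signed int only at the end, once the value is known to be small. The names `dwords` and `exponent_of` are hypothetical and the code assumes IEEE 754 binary64 doubles. It is not taken from the freebsd or musl sources.

```c
#include <stdint.h>

/* Hypothetical helper, assuming IEEE 754 binary64 doubles and a
   uint64_t with the same size and byte order as double. */
union dwords {
    double f;
    uint64_t i;
};

/* Unbiased exponent of a normal, finite, nonzero double.
   Every shift and mask operates on an unsigned value, so there is
   no shift of a negative integer and no signed overflow. */
static int exponent_of(double x)
{
    union dwords u = { x };
    uint32_t hi = (uint32_t)(u.i >> 32);  /* sign, exponent, top of mantissa */
    uint32_t biased = (hi >> 20) & 0x7ff; /* drops the sign bit as well */
    return (int)biased - 0x3ff;           /* biased is below 2048, conversion is safe */
}
```

For example `exponent_of(8.0)` and `exponent_of(-8.0)` both give 3, because the sign bit is masked off before the conversion to int.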