X-Git-Url: http://nsz.repo.hu/git/?a=blobdiff_plain;ds=inline;f=libm%2Findex.html;h=360146e81229af8948d124fa0d41308d9fe3378b;hb=23940a119a2f4bc56937741c9c47845a96998df3;hp=313773edbf94fd1a0afc7b1b15716269ce6f39a1;hpb=f6652941c9cc38e2b0dc41d632738c2a29c4c232;p=www
diff --git a/libm/index.html b/libm/index.html
index 313773e..360146e 100644
--- a/libm/index.html
+++ b/libm/index.html
@@ -1,42 +1,26 @@
libm
libm
-This page is about designing libm for the
-musl libc.
-
Writing the math code from scratch is a huge work
-so already existing code is used.
+
This page is about libm for the
+musl libc.
-
-musl branch for math hacks:
-
-- browse with gitweb
-
- over git (recommended)
-
-git clone git://nsz.repo.hu:45100/repo/musl
-
- - over http (slower)
-
-git clone http://nsz.repo.hu/repo/musl
-
- - over ssh (passwd is anon)
-
-git clone ssh://anon@nsz.repo.hu:45022/repo/musl
-
-
-
-The math code is mostly from freebsd which in turn
-is based on fdlibm.
+
Writing math code from scratch is a huge work so already existing code is
+used. Several math functions are taken from the
+freebsd libm and a few from the
+openbsd libm implementations.
+Both of them are based on fdlibm.
+The freebsd libm seems to be the most well maintained and most correct version
+of fdlibm.
sources:
- freebsd /lib/msun (browse)
@@ -49,59 +33,100 @@ cvs -d $CVSROOT get src/li
-
-
-Workarounds
-The bsd libm code has many workarounds for various
-compiler issues. It's not clear what's the best way
-to handle the uglyness.
-
In long double code fpu is assumed to be set to extended precision
-
Pending questions:
+
-- Use STDC pragmas (eventhough gcc does not support them)?
-
- Where should we use volatile (to avoid evaluation precision and const folding issues)?
-
- What tricks to use for proper exception raising? (huge*huge, etc)
-
- Should signaling nans be considered?
-
- Which compiler flags should be used (-frounding-math, -ffloat-store)?
-
- TODO: inline/macro optimization for small functions (isnan, isinf, signbit, creal, cimag,..)
-
-
-Code cleanups
-The fdlibm code has code style and correctness issues
-which might worth addressing.
-
Pending questions:
+
Assumption about floating-point representation and arithmetics
+(see c99 annex F.2):
-- Can we use a polyeval(coeffs, x) function instead of writing out the poly?
-
- TODO: prefer 0x1p0 format over decimal one
-
- TODO: use 1.0f instead of (float)1 and const float one = 1;
-
- TODO: use uint32_t instead of int32_t (to avoid signed int overflow and right shifts)
-
- TODO: prefer isnan, signbit,.. over inplace bithacks
+
- float is ieee binary32
+
- double is ieee binary64
+
- long double is either ieee binary64 or little-endian 80bit extended precision (x87 fpu)
+(other long double representations may be supported in the future, until then
+long double math functions will be missing on non-supported platforms)
+On soft-float architectures fenv should not be expected to work according to
+the c and ieee standards (ie. rounding modes and exceptions are not supported,
+the fenv functions are dummy ones)
+Floating-point exception flags should be properly set in math functions
+according to c99 annex F, but without using fenv.h
+(eg. overflow flag can be raised by 0x1p900*0x1p900, because this works even
+without fenv support)
+Most functions need to be precise only in nearest rounding mode.
+Returned floating-point values should be correctly rounded in most cases,
+but the last bit may be wrong:
+
+ |error| < 1.5 ulp
+
+should hold.
+(error is the difference between the exact result and the calculated
+floating-point value)
+(in theory correct rounding can be achieved but with big implementation cost,
+see crlibm)
+At least the following functions must be correctly rounded:
+ceil, copysign, fabs, fdim, floor, fma, fmax, fmin, frexp, ldexp,
+modf, nearbyint, nextafter, nexttoward, rint, round, scalbln, scalbn,
+sqrt, trunc.
+Mathematical properties of functions should be as expected
+(monotonicity, range, symmetries).
+If the FPU precision is altered then nothing is guaranteed to work.
+(ie. when long double does not have full 80bit precision on i386 then things may break)
+Signaling NaN is not supported
+Quiet NaN is supported but all NaNs are treated equally without special
+attention to the internal representation of a NaN
+(eg. the sign of NaN may not be preserved).
+Most gcc bug workarounds should be removed from the code
+(STRICT_ASSIGN macro is used when excessive precision is harmful and
+FORCE_EVAL when expressions must be evaluated for their sideeffect, other
+usage of volatile is not justified, hacks around long double constants are
+not justified eventhough gcc can miscompile those with non-default FPU setting)
+Whenever fenv is accessed the FENV_ACCESS pragma of c99 should be used
+(eventhough gcc does not yet support it), and all usage of optional FE_
+macros should be protected by #ifdef
+For bit manipulation of floating-point values an union should be used
+(eg. union {float f; uint32_t i;})
+uint32_t and uint64_t should be used for bit manipulations.
+(eg signed int must not be used in bit shifts etc when it might invoke
+undefined or implementation defined behaviour).
+POSIX namespace rules must be respected.
+c99 hexfloat syntax (0x1.0p0) should be used when it makes the
+code clearer, but not in public header files
+(those should be c++ and ansi c compatible)
+The 'f' suffix should be used for single precision values (0.1f) when the
+value cannot be exactly represented ((float)0.1 is not ok, that style may lead
+to double rounding issues, but eg. 1.0 or 0.5 can be used instead of 1.0f or
+0.5f)
+Prefer classification macros (eg. isnan) over inplace bit hacks.
+For every math function there should be a c implementation.
+(a notable exception now is sqrtl, since most fpu has instruction for it
+and on soft-float architectures long double == double)
+The c implementation of a long double function should use ifdefs with the
+LDBL_MANT_DIG etc constants from float.h for architecture specific
+implementations.
+In the musl source tree math.h functions go to src/math, complex.h functions
+to src/complex and fenv.h functions to src/fenv. And files are named after the
+functions they implement.
Binary representation of floating point numbers matter
-because lots of bithacks are needed in the math code.
+because bit hacks are often needed in the math code.
-float and double bit manipulation can be handled
-in a portable way in c:
+float and double bit manipulation can be handled in a portable way in c using
+union types:
-- float is ieee binary32
-
- double is ieee binary64
+
- union {float f; uint32_t i;};
+
- union {double f; uint64_t i;};
-(C99 annex F
-makes these mandatory)
-
-The endianness may still vary, but that can be worked
-around by using a union with a single large enough
-unsigned int. (The only exception is probably arm/oabi fpa
-where the word order of a double is not consistent with the
-endianness [debian wiki on arm],
-but we probably won't support that)
-
-long double bit manipulation is harder as there are
-various representations:
+(assuming the bits in the object representation of 32bit and 64bit unsigned ints
+map to the floating-point representation according to ieee-754, this is not
+always the case, eg. old
+arm floating-point accelerator
+(FPA) used mixed endian double representation, but musl does not support the old
+arm ABI)
+
+long double bit manipulation is harder as there are various representations
+and some of them don't map to any unsigned integer type:
- ld64: long double is the same as double (ieee binary64)
- ld80: 80bit extended precision format [wikipedia]
@@ -109,24 +134,39 @@ various representations:
(and other non-standard formats are not supported)
-ld64 is easy to handle: all long double functions
-are just wrappers around the corresponding double ones
-(aliasing is non-conformant)
+In case of ld64 the bit manipulation is the same as with double
+and all long double math functions can be just wrappers around the
+corresponding double ones.
+(using symbol aliasing on the linker level is non-conformant
+since functions would not have unique address then)
-ld80 is the most common (i386, x86_64), it means
-64bit significand with explicit msb (inconsistent with other ieee formats),
-15bit exp, 1 sign bit.
+ld80 is the most common long double on linux (i386 and x86_64 abi),
+it means 64bit significand with explicit msb
+(inconsistent with other ieee formats), 15bit exp, 1 sign bit.
+The m68k (and m88k) architecture uses the same format, but different endianness:
+
+- union {long double f; struct{uint64_t m; uint16_t se; uint16_t pad;} i;}; // x86
+
- union {long double f; struct{uint16_t se; uint16_t pad; uint64_t m;} i;}; // m68k
+
+where m is the significand and se is the sign and exponent.
-ld128 is rare (sparc64 with software emulation), it means
-113bit significand with implicit msb, 15bit exp, 1 sign bit.
+ld128 is rare (eg. sparc64 with software emulation), it means
+113bit significand with implicit msb, 15bit exp, 1 sign bit:
+
+- union {long double f; struct{uint16_t se; uint16_t hi; uint32_t mid; uint64_t lo;} i;};
+
-Endianness can vary (although on the supported i386 and x86_64 it is the same)
-and there is no large enough unsigned int to handle it.
-(In case of ld80 internal padding can vary as well, eg
-m68k and m88k cpus use different ld80 than the intel ones.)
-So there can be further variations in the binary representation
-than just ld80 and ld128.
-
+There are other non-conformant long double types: eg. the old SVR4 abi for ppc
+uses 128 bit long doubles, but it's software emulated using
+two doubles.
+(the newer ppc eabi uses ld64).
+The ibm s390 supports the ieee 754-2008 compliant binary128 floating-point
+format, but previous ibm machines (S/370, S/360) used slightly different
+representation.
+
+This variation shows the difficulty to consistently handle
+long double: the solution is to use ifdefs based on float.h and
+on the endianness and write different code for different architectures.
The ugly parts of libm hacking.