htmlify libm.txt, add more content

author nsz <nsz@port70.net>

Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)

committer nsz <nsz@port70.net>

Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)
author nsz <nsz@port70.net>
Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)
committer nsz <nsz@port70.net>
Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)
diff --git a/libm.txt b/libm.txt

index 7d2d44d..226243b 100644 (file)
--- a/libm.txt
+++ b/libm.txt
@@ -1,399 +1 @@
-sources
--------
-
-fdlibm sources
-(freebsd is most up to date probably, but some code is
-only available from openbsd)
-
-freebsd /lib/msun
-svn checkout svn://svn.freebsd.org/base/head/lib/msun
-
-openbsd /src/lib/libm
-cvs -d anoncvs@anoncvs.eu.openbsd.org:/cvs get src/lib/libm
-
-
-code organization
------------------
-
-math code needs some special code organization
-or build system tweaks
-
-* long double:
-(representation issues are below)
-code organization constraints:
-1.) internal .h to bit manipulate float and double
-2.) long double representation specific .h
-3.) long double representation specific .c
-4.) common long double code for all representations
-5.) no long double code when long double == double
-6.) arch specific .s
-
-1. can be put in src/internal
-2. and 3. can be handled like arch specific code
-(separate directories for different long double
-representations and then somehow hook them up
-in the build system)
-4. can be regular source files in math/ but needs
-ifdefs due to 5.
-all double functions need an extra wrapper/alias
-to solve 5.
-6. can be done the usual way
-
-dir tree:
-/include/math.h
-/include/complex.h
-/arch/ld80/ldouble.h - duplicate per arch ?
-/arch/ld128/ldouble.h
-/src/internal/libm.h
-/src/math/ld80/*l.c
-/src/math/ld128/*l.c
-/src/math/arm/*.s
-/src/math/i386/*.s
-/src/math/x86_64/*.s
-/src/math/*.c
-
-in freebsd most of the long double code is common
-to the different representations but there are some
-representation specific files in ld80/ and ld128/
-
-it is possible to have all the long double code
-duplicated and maintained separately in ld80/ and
-ld128/ (or just ld80/ as ld128/ is rarely needed),
-the implementations always have a few differences
-(eventhough most of the logic is the same and the
-constants only differ in precision etc)
-so 4. can be mostly eliminated
-or have no separate ld80/ and ld128/ but use ifdefs
-and macros in the *l.c codes
-os 3. can be eliminated
-
-* complex:
-it makes sense to make complex code optional
-(compiler might not support it)
-this needs some hacks: ifdefed sources or
-some buildsystem hack
-
-* code style and naming:
-fdlibm has many ugly badly indented code
-it makes sense to clean them up a bit if we
-touch them anyway, but it's also good if the
-code can be easily compared to other fdlibms..
-
-fdlibm uses e_ s_ k_ w_ source code prefixes
-(i guess they stand for ieee, standard c, kernel and
-wrapper functions, but i'm not sure, i'd rather drop
-these if fdlibm diffability was not important)
-
-
-representation
---------------
-
-float and double bit manipulation can be handled
-in a portable way in c
-
-float is ieee binary32
-double is ieee binary64
-(assuming old non-standard representations are not supported)
-
-the endianness may still vary, but that can be worked
-around by using a union with a single large enough
-unsigned int
-
-long double bit manipulation is harder as there are
-various representations:
-1.) same as double: ieee binary64
-2.) ld80: 80bit extended precision format
-3.) ld128: quadruple precision, ieee binary128
-(assuming other non-standard formats are not supported)
-
-case 1. is easy to handle: all long double functions
-are aliased to the corresponding double one
-(long double is treated equivalently to double)
-
-case 2. is the most common (i386, x86_64, emulation)
-case 3. is rare and means software emulation (sparc64)
-
-ld80 means 64bit mant, explicit msb, 15bit exp, 1 sign bit
-ld128 means 113bit mant, implicit msb, 15bit exp, 1 sign bit
-
-endianness can vary (and there is no large enough unsigned
-int to handle it), in case of ld80 padding can vary as well
-(eg the 80bit float of m68k and m88k cpus has different
-padding than the intel one)
-
-supporting ld80 and ld128 with the same code can be done,
-but requres ifdefs and macro hacks, treating them separately
-means many code duplications
-
-
-ugly
-----
-
-some notes are from:
-http://www.vinc17.org/research/extended.en.html
-
-1.) double rounding
-the c language allows arithmetic to be evaluated in
-higher precision and on x86 linux the default fpu
-setting is to round the results in extended precision
-(only x87 instructions are affected, sse2 is not)
-(freebsd and openbsd use double precision by default)
-
-rounding a result in extended precision first then
-rounding it to double when storing it may give
-different results than a single rounding to double
-so x = a+b may give different results depending on
-the fpu setting
-(only happens in round to nearest rounding mode,
-but that's the most common)
-
-2.) wider exponent range
-even if the fpu is set to double precision
-(which is not) the x87 registers use wider exponent
-range so underflows (subnormals) may not be treated
-according to the ieee standard (unless every
-computation result is stored and rounded to double)
-
-3.) precision of intermediate results
-c does not require consistent evaluation
-precision: the compiler may store intermediate
-results and round them to double while keep
-other parts in extended precision
-so (a+b)==(a+b) may be false when the two sides
-are kept in different precision
-
-c99 has a way to control this:
-when a result is stored in a variable or a
-type cast is used, then it is guaranteed that
-the precision is appropriate to that type
-so in (double)(a+b)==(double)(a+b) both
-sides are guaranteed to be in double precision
-when the comparision is done
-(this still does not solve 1. though: eg if
-the left side is evaluated with sse2 and the
-right side with x87 then the result may be false)
-
-unfortunately gcc does not respect the standard
-(infamous gcc 323 bug)
-gcc 4.5 fixed it with '-fexcess-precision=standard'
-(it is enabled by '-std=c99', but the default is
-'-fexcess-precision=fast')
-
-the workaround for older gcc is to force the
-compiler to store the intermediate results:
-by using volatile double temporary variables
-or by '-ffloat-store' (slow)
-
-(sometimes the excess precision is good
-but it's hard to rely on it as it is optional
-
-there is a way to check for it though using
-FLT_EVAL_METHOD, float_t and double_t
-
-but even then it's probably easier to
-unconditionally cast the variables of an
-expression to a higher precision when
-that's what we want and don't depend
-on the implicit excess precision)
-
-4.) runtime vs compile time
-runtime and compile time semantics may be different
-gcc often does unwanted or even invalid compile
-time optimizations
-(constant folding where floating point exception
-flags should be raised at runtime, or the result
-depends on runtime floating point environment,
-or constant expressions are just evaluated with
-different precision than at runtime)
-
-the workaround is again using named volatile
-variables like
-static const volatile two52 = 0x1p52;
-
-(?) old gcc truncated long double const expressions
-on x86, the workaround is to put the constant
-together from doubles at runtime
-
-5.) ld80 vs ld128 bit representations (see above)
-the two representations are sufficiently different
-that treating them together is awkward
-
-ld80 does not have implicit msb and the mantissa is
-best handled by a single uint64_t
-ld128 has implicit msb and the mantissa cannot be
-handled in a single unsigned int
-
-(typical operations:
-get/set exp and sign bit (same for ld80 and ld128)
-check x==0, mant==0 (ld80: explicit bit should be masked)
-check |x| < 1, 0.5 (exp + mant manipulation)
-check if subnormal
-check if integral)
-
-in the freebsd fdlibm code a few architecture specific
-macros and a union handle these issues
-
-6.) signed int
-the fdlibm code has many inconsistencies
-naming conventions, 0x1p0 notation vs decimal notation,
-..and integer type used for bitmanipulation
-
-when the bits of a double is manipulated it is unpacked
-into two 32bit words, which can be
-int32_t, uint32_t, u_int32_t
-
-int32_t is used most often which is wrong because of
-implementation defined signed int representation
-(in general signed int is not handled carefully
-scalbn even depends on signed int overflow)
-
-7.) complex is not implemented in gcc properly
-
-x*I turns into (x+0*I)*(0+I) = 0*x + x*I
-so if x=inf then the result is nan+inf*I instead of inf*I
-(an fmul instruction is generated for 0*x)
-
-so a+b*I cannot be used to create complex numbers
-instead some double[2] manipulation should be used to
-make sure there is no fmul in the generated code
-(freebsd uses a static inline cpack function)
-
-it's not clear which is better: using union hacks
-everywhere in the complex code or creal, cimag, cpack
-
-union dcomplex {
-       double complex z;
-       double a[2];
-};
-// for efficiency these should be macros or inline functions
-#define creal(z) ((double)(z))
-#define cimag(z) ((union dcomplex){(z)}.a[1])
-#define cpack(x,y) ((union dcomplex){.a={(x),(y)}}.z)
-
-eg. various implementations for conj:
-
-// proper implementation, but gcc does not do the right thing
-double complex conj(double complex z) {
-       return creal(z) - cimag(z)*I;
-}
-
-// using cpack
-double complex conj(double complex z) {
-       return cpack(creal(z), -cimag(z));
-}
-
-// explicit work around
-double complex conj(double complex z) {
-       union dcomplex u = {z};
-
-       u.a[1] = -u.a[1];
-       return u.z;
-}
-
-8.) complex code is hard to make portable across compilers
-
-eg it's not clear how I (or _Complex_I) should be defined
-in complex.h
-(literal of the float imaginary unit is compiler specific,
-in gcc it can be 1.0fi)
-
-complex support is not mandatory in c99 conformant
-implementations and even if complex is supported
-there might or might not be separate imaginary type
-
-
-libm implementations
---------------------
-
-unix v7 libm by Robert Morris (rhm) around 1979
-see: "Computer Approximations" by Hart & Cheney, 1968
-used in: plan9, go
-
-cephes by Stephen L. Moshier, 1984-1992
-http://www.netlib.org/cephes/
-see: "Methods and Programs for Mathematical Functions" by Moshier 1989
-
-fdlibm by K. C. Ng around 1993, updated in 1995?
-(1993: copyright notice: "Developed at SunPro ..")
-(1995: copyright notice: "Developed at SunSoft ..")
-see: "Argument Reduction for Huge Arguments: Good to the Last Bit" by Ng 1992
-http://www.netlib.org/fdlibm
-used in: netbsd, openbsd, freebsd, bionic, musl
-
-libultim by Abraham Ziv around 2001 (gpl)
-(also known as the ibm accurate portable math lib)
-see: "Fast evaluation of elementary mathematical functions with correctly rounded last bit" by Ziv 1991
-http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/
-used in: glibc
-
-libmcr by K. C. Ng, 2004
-(sun's correctly rounded libm)
-
-mathcw by Nelson H. F. Beebe, 2010?
-http://ftp.math.utah.edu/pub/mathcw/
-(no sources available?)
-
-sleef by Naoki Shibata, 2010
-(simd lib for evaluating elementary functions)
-http://shibatch.sourceforge.net/
-
-crlibm by ens-lyon, 2004-2010
-http://lipforge.ens-lyon.fr/www/crlibm/
-see: crlibm.pdf at http://lipforge.ens-lyon.fr/frs/?group_id=8&release_id=123
-see: "Elementary Functions" by Jean-Michel Muller 2005
-see: "Handbook of Floating-Point Arithmetic" by (many authors from Lyon) 2009
-
-(closed: intel libm, hp libm for itanium, ..)
-
-
-libm tests
-----------
-
-http://www.netlib.org/fp/ucbtest.tgz
-http://www.jhauser.us/arithmetic/TestFloat.html
-http://cant.ua.ac.be/old/ieeecc754.html
-http://www.loria.fr/~zimmerma/mpcheck/
-http://gforge.inria.fr/projects/mpcheck/
-http://people.inf.ethz.ch/gonnet/FPAccuracy/Analysis.html
-http://www.math.utah.edu/~beebe/software/ieee/
-http://www.vinc17.org/research/testlibm/index.en.html
-http://www.vinc17.org/research/fptest.en.html
-tests in crlibm
-
-multiprecision libs (useful for tests)
-http://mpfr.org/
-http://www.multiprecision.org/index.php?prog=mpc
-http://pari.math.u-bordeaux.fr/
-http://www.apfloat.org/apfloat/
-http://code.google.com/p/fastfunlib/
-scs_lib in crlibm
-
-
-other
------
-
-ieee standard: http://754r.ucbtest.org/
-extended precision issues: http://www.vinc17.org/research/extended.en.html
-correctly rounded mult: http://perso.ens-lyon.fr/jean-michel.muller/MultConstant.html
-table based sincos: http://perso.ens-lyon.fr/damien.stehle/IMPROVEDGAL.html
-finding worst cases: http://perso.ens-lyon.fr/damien.stehle/WCLR.html
-finding worst cases: http://perso.ens-lyon.fr/damien.stehle/DECIMALEXP.html
-finding worst cases: http://www.loria.fr/equipes/spaces/slz.en.html
-finding worst cases: http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm
-generating libm functions: http://lipforge.ens-lyon.fr/www/metalibm/
-fast conversion to fixedpoint: http://stereopsis.com/sree/fpu2006.html
-double-double, quad-double arithmetics: http://crd-legacy.lbl.gov/~dhbailey/mpdist/
-papers by kahan: http://www.cs.berkeley.edu/~wkahan/
-fp paper by goldberg: http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html
-math functions in general: http://dlmf.nist.gov/
-
-
-
-
-
-
-
-
-
-
-
+see http://nsz.repo.hu/libm
diff --git a/libm/index.html b/libm/index.html

new file mode 100644 (file)

index 0000000..abdd7f8
--- /dev/null
+++ b/libm/index.html
@@ -0,0 +1,415 @@
+<html><head><title>libm</title></head><body>
+<h2>libm</h2>
+
+<p>This page is about designing libm for the
+<a href="http://www.etalabs.net/musl/">musl</a> libc.
+<p>Writing the math code from scratch is a huge work
+so already existing code is used.
+
+<ul>
+<li><a href="#sources">Sources</a>
+<li><a href="#issues">Design issues</a>
+<li><a href="#representation">Representation</a>
+<li><a href="#ugly">Ugly</a>
+<li><a href="#implementations">libm implementations</a>
+<li><a href="#tests">libm tests</a>
+</ul>
+
+<h3><a name="sources" href="#sources">Sources</a></h3>
+<p>The math code is mostly from freebsd which in turn
+is based on <a href="http://www.netlib.org/fdlibm/">fdlibm</a>.
+<p>sources:
+<ul>
+<li>freebsd /lib/msun (<a href="http://svnweb.FreeBSD.org/base/head/lib/msun/">browse</a>)
+<pre>
+svn checkout svn://svn.freebsd.org/base/head/lib/msun
+</pre>
+<li>openbsd /src/lib/libm (<a href="http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libm/">browse</a>)
+<pre>
+cvs -d <a href="http://openbsd.org/anoncvs.html#CVSROOT">$CVSROOT</a> get src/lib/libm
+</pre>
+</ul>
+
+<h3><a name="issues" href="#issues">Design issues</a></h3>
+<p>The math code is available but there are still many design questions.
+<ul>
+<li>Code organization, build system tweaks:
+<p>There are architecture specific code,
+long double representation specific code
+and complex code.
+<p>With respect to long double, bsd libm code can handle
+ld64, ld80 and ld128 (where ld64 means long double == double),
+for musl it's enough to support ld64 and ld80, but
+keeping ld128 shouldn't hurt much either.
+<p>So far musl had arch specific code and common code,
+it seems a new category is needed:
+long double specific code, which can be ld64, ld80 or ld128.
+These should be tied to the architecture in the build system
+somehow. In freebsd most of the long double code is
+common (using arch specific macros, ifdefs) and only a
+small part of the ld code is maintained separately for ld80 and ld128,
+but it's also possible to keep all ld code separately (with duplications)
+in ld80 and ld128.
+<p>Complex support is optional in c99 so maybe it should
+be optional in the musl build as well. It also can be
+kept separately in cmath/ or with the rest of the code in math/.
+<p>Pending questions:
+<ul>
+<li>Do we want ld128?
+<li>Should we try to use common code for ld80 and ld128?
+<li>How to do ld64: wrap double functions or alias them?
+<li>How to tie the ld* code to the arch in the build system?
+<li>Make complex optional?
+<li>Keep complex in math/ or cmath/?
+</ul>
+
+<li>Workarounds
+<p>The bsd libm code has many workarounds for various
+compiler issues. It's not clear what's the best way
+to handle the uglyness.
+<p>Pending questions:
+<ul>
+<li>Use STDC pragmas (eventhough gcc does not support them)?
+<li>Use volatile consistently to avoid evaluation precision and const folding issues?
+<li>Use special compiler flags against unwanted optimization (-frounding-math, -ffloat-store)?
+<li>Do inline/macro optimization for small functions? (isnan, isinf, signbit, creal, cimag,..)
+<li>In complex code prefer creal(), cimag() or a union to (un)pack re,im?
+</ul>
+
+<li>Code cleanups
+<p>The fdlibm code style is inconsistent with musl (and ugly)
+so it makes sense to clean it up, but keeping easy diffability
+can be useful. There are various inconsistencies and
+correctness issues in the code which might worth addressing
+(eg. reliance on signed overflow should be eliminated).
+<p>Pending questions:
+<ul>
+<li>Keep diffability with freebsd/openbsd code or reformat the code for clarity?
+<li>Keep e_, s_, k_  fdlibm source file name prefixes?
+<li>Should 0x1p0 float format be preferred over decimal format?
+<li>Should isnan, signbit,.. be preferred over inplace bithacks?
+<li>Is unpacking a double into 2 int32_t ok (signed int representation)?
+<li>Is unpacking the mantissa of ld80 into an int64_t ok?
+</ul>
+</ul>
+
+<h3><a name="representation" href="#representation">Representation</a></h3>
+<p>
+Binary representation of floating point numbers matter
+because lots of bithacks are needed in the math code.
+<p>
+float and double bit manipulation can be handled
+in a portable way in c:
+<ul>
+<li>float is ieee binary32
+<li>double is ieee binary64
+</ul>
+(and old non-standard representations are not supported)
+<p>
+The endianness may still vary, but that can be worked
+around by using a union with a single large enough
+unsigned int. (The only exception is probably arm/oabi fpa
+where the word order of a double is not consistent with the
+endianness [<a href="http://wiki.debian.org/ArmEabiPort#ARM_floating_points">debian wiki on arm</a>],
+but we probably won't support that)
+<p>
+long double bit manipulation is harder as there are
+various representations:
+<ul>
+<li>ld64: long double is the same as double (ieee binary64)
+<li>ld80: 80bit extended precision format [<a href="https://en.wikipedia.org/wiki/Extended_precision">wikipedia</a>]
+<li>ld128: quadruple precision, ieee binary128 [<a href="https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format">wikipedia</a>]
+</ul>
+(and other non-standard formats are not supported)
+<p>
+ld64 is easy to handle: all long double functions
+are aliased to the corresponding double one
+(long double is treated equivalently to double)
+<p>
+ld80 is the most common (i386, x86_64), it means
+64bit significand with explicit msb (inconsistent with other ieee formats),
+15bit exp, 1 sign bit.
+<p>
+ld128 is rare (sparc64 with software emulation), it means
+113bit significand with implicit msb, 15bit exp, 1 sign bit.
+<p>
+Endianness can vary (although on the supported i386 and x86_64 it is the same)
+and there is no large enough unsigned int to handle it.
+(In case of ld80 internal padding can vary as well, eg
+m68k and m88k cpus use different ld80 than the intel ones)
+
+
+<h3><a name="ugly" href="#ugly">Ugly</a></h3>
+<p>The ugly parts of libm hacking.
+<p>Some notes are from:
+<a href="http://www.vinc17.org/research/extended.en.html">http://www.vinc17.org/research/extended.en.html</a>
+
+
+<ul>
+<li>Double rounding (x87 issue):
+<p>
+If a value rounded twice the result can be different
+than rounding just once.
+<p>
+The C language allows arithmetic to be evaluated in
+higher precision than the operands have. If we
+use x87 fpu in extended precision mode it will round
+the results twice: round to 80bit when calculating
+and then round to 64bit when storing it, this can
+give different result than a single 64bit rounding.
+(on x86-linux the default fpu setting is to round the
+results in extended precision, this only affects x87 instructions, not see2 etc)
+(afaik freebsd and openbsd use double precision by default)
+<p>
+So x = a+b may give different results depending on
+the fpu setting.
+(only happens in round to nearest rounding mode,
+but that's the most common one)
+<p>
+C99 annex F prohibits double rounding,
+but that's non-normative.
+
+<li>Wider exponent range (x87 issue):
+<p>
+Even if the fpu is set to double precision
+(which is not) the x87 registers use wider exponent
+range (mant:exp is 53:15 instead of 53:11 bits)
+so underflows (subnormals) may not be treated
+as expected. Rounding to double only occurs
+when a value is stored into memory.
+<p>
+Actually this beahviour is allowed by the ieee 754
+standard, but it can cause problems (see
+<a href="http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/">infamous php bug</a>)
+
+<li>Evaluation precision:
+<p>
+C does not require consistent evaluation
+precision: the compiler may store intermediate
+results and round them to double while keep
+other parts in higher precision.
+So (a+b)==(a+b) may be false when the two sides
+are kept in different precision.
+(This is not an x87 specific problem, it matters whenever there
+is a higher precision fp type than the currently used one.
+It goes away if the highest precision (long double) is used
+everywhere, but that can have a huge penalty).
+<p>
+C99 has a way to control this (see
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#5.1.2.3">5.1.2.3 example 4</a>,
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.5">6.3.1.5</a> and
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.8">6.3.1.8</a>):
+when a result is stored in a variable or a
+type cast is used, then it is guaranteed that
+the precision is appropriate to that type.
+So in (double)(a+b)==(double)(a+b) both
+sides are guaranteed to be in double precision
+when the comparision is done.
+<p>
+(This still does not solve the x87 double rounding
+issue though: eg if the left side is evaluated with
+sse2 and the right side with x87 extended precision setting
+and double rounding then the result may be false
+and still conformant to C99)
+<p>
+Unfortunately gcc does not respect the standard
+and even if assingment or cast is used the result
+may be kept in higher precision
+(infamous <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323">gcc bug323</a>
+also see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36578">gcc bug36578</a>).
+gcc 4.5 fixed it with '-fexcess-precision=standard'
+(it is enabled by '-std=c99', but the default is
+'-fexcess-precision=fast')
+<p>
+The workaround for older gcc is to force the
+compiler to store the intermediate results:
+by using volatile double temporary variables
+or by '-ffloat-store' (slow, and of course
+all intermediate results should be assigned
+to some variable, casting is not enough).
+<p>
+(Sometimes the excess precision is good
+but it's hard to rely on it as it is optional
+<p>
+There is a way to check for it though using
+FLT_EVAL_METHOD, float_t and double_t.
+But it's probably easier to unconditionally
+use higher precision variables when that's
+what we want and don't depend on the implicit
+excess precision).
+
+<li>Compiler optimizations:
+<p>
+Runtime and compile time semantics may be different
+gcc often does unwanted or even invalid compile
+time optimizations.
+(constant folding where floating point exception
+flags should be raised at runtime, or the result
+depends on runtime floating point environment,
+or constant expressions are just evaluated with
+different precision than at runtime).
+<p>
+C99 actually allows most of these optimizations
+but they can be turned off with STDC pragmas (see
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.10.6">6.10.6</a>).
+Unfortunately <a href="http://gcc.gnu.org/c99status.html">gcc does not support these pragmas</s>.
+<p>
+FENV_ACCESS ON tells the compiler that the code wants
+to access the floating point environment (eg. set different rounding mode)
+(see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678">gcc bug34678</a>).
+<p>
+(see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37845">gcc bug37845</a> for FP_CONTRACT pragma).
+<p>
+The workaround is again using named volatile
+variables for constants like
+<pre>
+static const volatile two52 = 0x1p52;
+</pre>
+and using the '-frounding-math' gcc flag.
+<p>
+(According the freebsd libm code gcc truncates
+long double const literals on i386.
+I haven't yet verified if this still the case,
+but as a workaround sometimes double-double arithmetics is used:
+initializing the long double constant from two doubles)
+
+<li>ld80 vs ld128
+<p>
+The two representations are sufficiently different
+that treating them together is awkward.
+<p>
+In the freebsd libm code a few architecture specific
+macros and a union handle these issues, but the
+result is often less clear than treating ld80
+and ld128 separately.
+
+<li>Signed int
+<p>
+The freebsd libm code has many inconsistencies
+(naming conventions, 0x1p0 notation vs decimal notation,..),
+one of them is the integer type used for bitmanipulations:
+The bits of a double are unpacked into one of
+int32_t, uint32_t and u_int32_t
+integer types.
+<p>
+int32_t is used most often which is wrong because of
+implementation defined signed int representation.
+<p>
+In general signed int is not handled carefully
+in the libm code: scalbn even depends on signed int overflow.
+
+<li>Complex arithmetics
+<p>
+With gcc x*I turns into (x+0*I)*(0+I) = 0*x + x*I,
+so if x=inf then the result is nan+inf*I instead of inf*I
+(an fmul instruction is generated for 0*x)
+<p>
+(There were various complex constant folding issues as well).
+<p>
+So a+b*I cannot be used to create complex numbers,
+instead some double[2] manipulation should be used to
+make sure there is no fmul in the generated code
+(freebsd uses a static inline cpack function)
+(It's not clear which is better: using union hacks
+everywhere in the complex code or creal, cimag, cpack)
+<p>
+Complex code is hard to make portable across compilers
+as there are no good compiler support, it is rarely used
+and there are various options available in the standard:
+complex support is optional, imaginary type
+is optional.
+<p>
+It's not clear how I (or _Complex_I) should be defined
+in complex.h
+(literal of the float imaginary unit is compiler specific,
+in gcc it can be 1.0fi).
+</ul>
+
+<h3><a name="implementations" href="#implementations">libm implementations</a></h3>
+<ul>
+<li>
+unix v7 libm by Robert Morris (rhm) around 1979<br>
+see: "Computer Approximations" by Hart & Cheney, 1968<br>
+used in: plan9, go
+<li>
+cephes by Stephen L. Moshier, 1984-1992<br>
+<a href="http://www.netlib.org/cephes/">http://www.netlib.org/cephes/</a><br>
+see: "Methods and Programs for Mathematical Functions" by Moshier 1989
+<li>
+fdlibm by K. C. Ng around 1993, updated in 1995?<br>
+(1993: copyright notice: "Developed at SunPro ..")<br>
+(1995: copyright notice: "Developed at SunSoft ..")<br>
+eg see: "Argument Reduction for Huge Arguments: Good to the Last Bit" by Ng 1992<br>
+<a href="http://www.netlib.org/fdlibm">http://www.netlib.org/fdlibm</a><br>
+used in: netbsd, openbsd, freebsd, bionic, musl
+<li>
+libultim by Abraham Ziv around 2001 (lgpl)<br>
+(also known as the ibm accurate portable math lib)<br>
+see: "Fast evaluation of elementary mathematical functions with correctly rounded last bit" by Ziv 1991<br>
+<a href="http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/">http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/</a><br>
+used in: glibc
+<li>
+libmcr by K. C. Ng, 2004<br>
+(sun's correctly rounded libm)
+<li>
+mathcw by Nelson H. F. Beebe, 2010?<br>
+<a href="http://ftp.math.utah.edu/pub/mathcw/">http://ftp.math.utah.edu/pub/mathcw/</a><br>
+(no sources available?)
+<li>
+sleef by Naoki Shibata, 2010<br>
+(simd lib for evaluating elementary functions)<br>
+<a href="http://shibatch.sourceforge.net/">http://shibatch.sourceforge.net/</a>
+<li>
+crlibm by ens-lyon, 2004-2010<br>
+<a href="http://lipforge.ens-lyon.fr/www/crlibm/">http://lipforge.ens-lyon.fr/www/crlibm/</a><br>
+see: <a href="http://lipforge.ens-lyon.fr/frs/?group_id=8&amp;release_id=123">crlibm.pdf</a><br>
+see: "Elementary Functions" by Jean-Michel Muller 2005<br>
+see: "Handbook of Floating-Point Arithmetic" by (many authors from Lyon) 2009<br>
+<li>
+various other closed ones: intel libm, hp libm for itanium,..
+</ul>
+
+<h3><a name="tests" href="#tests">libm tests</a></h3>
+<ul>
+<li><a href="http://www.netlib.org/fp/ucbtest.tgz">ucbtest.tgz</a>
+<li><a href="http://www.jhauser.us/arithmetic/TestFloat.html">TestFloat</a>
+<li><a href="http://cant.ua.ac.be/old/ieeecc754.html">ieeecc754</a>
+<li><a href="http://www.loria.fr/~zimmerma/mpcheck/">mpcheck</a>,
+<a href="http://gforge.inria.fr/projects/mpcheck/">mpcheck</a>
+<li><a href="http://people.inf.ethz.ch/gonnet/FPAccuracy/Analysis.html">FPAccuracy</a>
+<li><a href="http://www.math.utah.edu/~beebe/software/ieee/">beebe's ieee tests</a>
+<li><a href="http://www.vinc17.org/research/testlibm/index.en.html">testlibm</a>
+<li><a href="http://www.vinc17.org/research/fptest.en.html">fptest</a>
+<li>tests in crlibm
+</ul>
+<p>
+multiprecision libs (useful for tests)
+<ul>
+<li><a href="http://mpfr.org/">mpfr</a>
+<li><a href="http://www.multiprecision.org/index.php?prog=mpc">mpc</a>
+<li><a href="http://pari.math.u-bordeaux.fr/">pari</a>
+<li><a href="http://www.apfloat.org/apfloat/">apfloat</a>
+<li><a href="http://code.google.com/p/fastfunlib/">fastfunlib</a>
+<li>scs_lib in crlibm
+</ul>
+
+<p>other links
+<ul>
+<li>ieee standard: http://754r.ucbtest.org/
+<li>extended precision issues: http://www.vinc17.org/research/extended.en.html
+<li>correctly rounded mult: http://perso.ens-lyon.fr/jean-michel.muller/MultConstant.html
+<li>table based sincos: http://perso.ens-lyon.fr/damien.stehle/IMPROVEDGAL.html
+<li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/WCLR.html
+<li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/DECIMALEXP.html
+<li>finding worst cases: http://www.loria.fr/equipes/spaces/slz.en.html
+<li>finding worst cases: http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm
+<li>generating libm functions: http://lipforge.ens-lyon.fr/www/metalibm/
+<li>fast conversion to fixedpoint: http://stereopsis.com/sree/fpu2006.html
+<li>double-double, quad-double arithmetics: http://crd-legacy.lbl.gov/~dhbailey/mpdist/
+<li>papers by kahan: http://www.cs.berkeley.edu/~wkahan/
+<li>fp paper by goldberg: http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html
+<li>math functions in general: http://dlmf.nist.gov/
+</ul>
+
+<p><small>(<a href="/git/?p=www">page history</a>, <a href="/">home</a>)</small>
+</body></html>
author	nsz <nsz@port70.net>
	Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)
committer	nsz <nsz@port70.net>
	Mon, 27 Feb 2012 20:35:56 +0000 (21:35 +0100)
libm.txt		patch \| blob \| history
libm/index.html	[new file with mode: 0644]	patch \| blob