+<html><head><title>libm</title></head><body>
+<h2>libm</h2>
+
+<p>This page is about designing libm for the
+<a href="http://www.etalabs.net/musl/">musl</a> libc.
+<p>Writing the math code from scratch is a huge work
+so already existing code is used.
+
+<ul>
+<li><a href="#sources">Sources</a>
+<li><a href="#issues">Design issues</a>
+<li><a href="#representation">Representation</a>
+<li><a href="#ugly">Ugly</a>
+<li><a href="#implementations">libm implementations</a>
+<li><a href="#tests">libm tests</a>
+</ul>
+
+<h3><a name="sources" href="#sources">Sources</a></h3>
+<p>The math code is mostly from freebsd which in turn
+is based on <a href="http://www.netlib.org/fdlibm/">fdlibm</a>.
+<p>sources:
+<ul>
+<li>freebsd /lib/msun (<a href="http://svnweb.FreeBSD.org/base/head/lib/msun/">browse</a>)
+<pre>
+svn checkout svn://svn.freebsd.org/base/head/lib/msun
+</pre>
+<li>openbsd /src/lib/libm (<a href="http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libm/">browse</a>)
+<pre>
+cvs -d <a href="http://openbsd.org/anoncvs.html#CVSROOT">$CVSROOT</a> get src/lib/libm
+</pre>
+</ul>
+
+<h3><a name="issues" href="#issues">Design issues</a></h3>
+<p>The math code is available but there are still many design questions.
+<ul>
+<li>Code organization, build system tweaks:
+<p>There are architecture specific code,
+long double representation specific code
+and complex code.
+<p>With respect to long double, bsd libm code can handle
+ld64, ld80 and ld128 (where ld64 means long double == double),
+for musl it's enough to support ld64 and ld80, but
+keeping ld128 shouldn't hurt much either.
+<p>So far musl had arch specific code and common code,
+it seems a new category is needed:
+long double specific code, which can be ld64, ld80 or ld128.
+These should be tied to the architecture in the build system
+somehow. In freebsd most of the long double code is
+common (using arch specific macros, ifdefs) and only a
+small part of the ld code is maintained separately for ld80 and ld128,
+but it's also possible to keep all ld code separately (with duplications)
+in ld80 and ld128.
+<p>Complex support is optional in c99 so maybe it should
+be optional in the musl build as well. It also can be
+kept separately in cmath/ or with the rest of the code in math/.
+<p>Pending questions:
+<ul>
+<li>Do we want ld128?
+<li>Should we try to use common code for ld80 and ld128?
+<li>How to do ld64: wrap double functions or alias them?
+<li>How to tie the ld* code to the arch in the build system?
+<li>Make complex optional?
+<li>Keep complex in math/ or cmath/?
+</ul>
+
+<li>Workarounds
+<p>The bsd libm code has many workarounds for various
+compiler issues. It's not clear what's the best way
+to handle the uglyness.
+<p>Pending questions:
+<ul>
+<li>Use STDC pragmas (eventhough gcc does not support them)?
+<li>Use volatile consistently to avoid evaluation precision and const folding issues?
+<li>Use special compiler flags against unwanted optimization (-frounding-math, -ffloat-store)?
+<li>Do inline/macro optimization for small functions? (isnan, isinf, signbit, creal, cimag,..)
+<li>In complex code prefer creal(), cimag() or a union to (un)pack re,im?
+</ul>
+
+<li>Code cleanups
+<p>The fdlibm code style is inconsistent with musl (and ugly)
+so it makes sense to clean it up, but keeping easy diffability
+can be useful. There are various inconsistencies and
+correctness issues in the code which might worth addressing
+(eg. reliance on signed overflow should be eliminated).
+<p>Pending questions:
+<ul>
+<li>Keep diffability with freebsd/openbsd code or reformat the code for clarity?
+<li>Keep e_, s_, k_ fdlibm source file name prefixes?
+<li>Should 0x1p0 float format be preferred over decimal format?
+<li>Should isnan, signbit,.. be preferred over inplace bithacks?
+<li>Is unpacking a double into 2 int32_t ok (signed int representation)?
+<li>Is unpacking the mantissa of ld80 into an int64_t ok?
+</ul>
+</ul>
+
+<h3><a name="representation" href="#representation">Representation</a></h3>
+<p>
+Binary representation of floating point numbers matter
+because lots of bithacks are needed in the math code.
+<p>
+float and double bit manipulation can be handled
+in a portable way in c:
+<ul>
+<li>float is ieee binary32
+<li>double is ieee binary64
+</ul>
+(and old non-standard representations are not supported)
+<p>
+The endianness may still vary, but that can be worked
+around by using a union with a single large enough
+unsigned int. (The only exception is probably arm/oabi fpa
+where the word order of a double is not consistent with the
+endianness [<a href="http://wiki.debian.org/ArmEabiPort#ARM_floating_points">debian wiki on arm</a>],
+but we probably won't support that)
+<p>
+long double bit manipulation is harder as there are
+various representations:
+<ul>
+<li>ld64: long double is the same as double (ieee binary64)
+<li>ld80: 80bit extended precision format [<a href="https://en.wikipedia.org/wiki/Extended_precision">wikipedia</a>]
+<li>ld128: quadruple precision, ieee binary128 [<a href="https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format">wikipedia</a>]
+</ul>
+(and other non-standard formats are not supported)
+<p>
+ld64 is easy to handle: all long double functions
+are aliased to the corresponding double one
+(long double is treated equivalently to double)
+<p>
+ld80 is the most common (i386, x86_64), it means
+64bit significand with explicit msb (inconsistent with other ieee formats),
+15bit exp, 1 sign bit.
+<p>
+ld128 is rare (sparc64 with software emulation), it means
+113bit significand with implicit msb, 15bit exp, 1 sign bit.
+<p>
+Endianness can vary (although on the supported i386 and x86_64 it is the same)
+and there is no large enough unsigned int to handle it.
+(In case of ld80 internal padding can vary as well, eg
+m68k and m88k cpus use different ld80 than the intel ones)
+
+
+<h3><a name="ugly" href="#ugly">Ugly</a></h3>
+<p>The ugly parts of libm hacking.
+<p>Some notes are from:
+<a href="http://www.vinc17.org/research/extended.en.html">http://www.vinc17.org/research/extended.en.html</a>
+
+
+<ul>
+<li>Double rounding (x87 issue):
+<p>
+If a value rounded twice the result can be different
+than rounding just once.
+<p>
+The C language allows arithmetic to be evaluated in
+higher precision than the operands have. If we
+use x87 fpu in extended precision mode it will round
+the results twice: round to 80bit when calculating
+and then round to 64bit when storing it, this can
+give different result than a single 64bit rounding.
+(on x86-linux the default fpu setting is to round the
+results in extended precision, this only affects x87 instructions, not see2 etc)
+(afaik freebsd and openbsd use double precision by default)
+<p>
+So x = a+b may give different results depending on
+the fpu setting.
+(only happens in round to nearest rounding mode,
+but that's the most common one)
+<p>
+C99 annex F prohibits double rounding,
+but that's non-normative.
+
+<li>Wider exponent range (x87 issue):
+<p>
+Even if the fpu is set to double precision
+(which is not) the x87 registers use wider exponent
+range (mant:exp is 53:15 instead of 53:11 bits)
+so underflows (subnormals) may not be treated
+as expected. Rounding to double only occurs
+when a value is stored into memory.
+<p>
+Actually this beahviour is allowed by the ieee 754
+standard, but it can cause problems (see
+<a href="http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/">infamous php bug</a>)
+
+<li>Evaluation precision:
+<p>
+C does not require consistent evaluation
+precision: the compiler may store intermediate
+results and round them to double while keep
+other parts in higher precision.
+So (a+b)==(a+b) may be false when the two sides
+are kept in different precision.
+(This is not an x87 specific problem, it matters whenever there
+is a higher precision fp type than the currently used one.
+It goes away if the highest precision (long double) is used
+everywhere, but that can have a huge penalty).
+<p>
+C99 has a way to control this (see
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#5.1.2.3">5.1.2.3 example 4</a>,
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.5">6.3.1.5</a> and
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.8">6.3.1.8</a>):
+when a result is stored in a variable or a
+type cast is used, then it is guaranteed that
+the precision is appropriate to that type.
+So in (double)(a+b)==(double)(a+b) both
+sides are guaranteed to be in double precision
+when the comparision is done.
+<p>
+(This still does not solve the x87 double rounding
+issue though: eg if the left side is evaluated with
+sse2 and the right side with x87 extended precision setting
+and double rounding then the result may be false
+and still conformant to C99)
+<p>
+Unfortunately gcc does not respect the standard
+and even if assingment or cast is used the result
+may be kept in higher precision
+(infamous <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323">gcc bug323</a>
+also see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36578">gcc bug36578</a>).
+gcc 4.5 fixed it with '-fexcess-precision=standard'
+(it is enabled by '-std=c99', but the default is
+'-fexcess-precision=fast')
+<p>
+The workaround for older gcc is to force the
+compiler to store the intermediate results:
+by using volatile double temporary variables
+or by '-ffloat-store' (slow, and of course
+all intermediate results should be assigned
+to some variable, casting is not enough).
+<p>
+(Sometimes the excess precision is good
+but it's hard to rely on it as it is optional
+<p>
+There is a way to check for it though using
+FLT_EVAL_METHOD, float_t and double_t.
+But it's probably easier to unconditionally
+use higher precision variables when that's
+what we want and don't depend on the implicit
+excess precision).
+
+<li>Compiler optimizations:
+<p>
+Runtime and compile time semantics may be different
+gcc often does unwanted or even invalid compile
+time optimizations.
+(constant folding where floating point exception
+flags should be raised at runtime, or the result
+depends on runtime floating point environment,
+or constant expressions are just evaluated with
+different precision than at runtime).
+<p>
+C99 actually allows most of these optimizations
+but they can be turned off with STDC pragmas (see
+<a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.10.6">6.10.6</a>).
+Unfortunately <a href="http://gcc.gnu.org/c99status.html">gcc does not support these pragmas</s>.
+<p>
+FENV_ACCESS ON tells the compiler that the code wants
+to access the floating point environment (eg. set different rounding mode)
+(see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678">gcc bug34678</a>).
+<p>
+(see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37845">gcc bug37845</a> for FP_CONTRACT pragma).
+<p>
+The workaround is again using named volatile
+variables for constants like
+<pre>
+static const volatile two52 = 0x1p52;
+</pre>
+and using the '-frounding-math' gcc flag.
+<p>
+(According the freebsd libm code gcc truncates
+long double const literals on i386.
+I haven't yet verified if this still the case,
+but as a workaround sometimes double-double arithmetics is used:
+initializing the long double constant from two doubles)
+
+<li>ld80 vs ld128
+<p>
+The two representations are sufficiently different
+that treating them together is awkward.
+<p>
+In the freebsd libm code a few architecture specific
+macros and a union handle these issues, but the
+result is often less clear than treating ld80
+and ld128 separately.
+
+<li>Signed int
+<p>
+The freebsd libm code has many inconsistencies
+(naming conventions, 0x1p0 notation vs decimal notation,..),
+one of them is the integer type used for bitmanipulations:
+The bits of a double are unpacked into one of
+int32_t, uint32_t and u_int32_t
+integer types.
+<p>
+int32_t is used most often which is wrong because of
+implementation defined signed int representation.
+<p>
+In general signed int is not handled carefully
+in the libm code: scalbn even depends on signed int overflow.
+
+<li>Complex arithmetics
+<p>
+With gcc x*I turns into (x+0*I)*(0+I) = 0*x + x*I,
+so if x=inf then the result is nan+inf*I instead of inf*I
+(an fmul instruction is generated for 0*x)
+<p>
+(There were various complex constant folding issues as well).
+<p>
+So a+b*I cannot be used to create complex numbers,
+instead some double[2] manipulation should be used to
+make sure there is no fmul in the generated code
+(freebsd uses a static inline cpack function)
+(It's not clear which is better: using union hacks
+everywhere in the complex code or creal, cimag, cpack)
+<p>
+Complex code is hard to make portable across compilers
+as there are no good compiler support, it is rarely used
+and there are various options available in the standard:
+complex support is optional, imaginary type
+is optional.
+<p>
+It's not clear how I (or _Complex_I) should be defined
+in complex.h
+(literal of the float imaginary unit is compiler specific,
+in gcc it can be 1.0fi).
+</ul>
+
+<h3><a name="implementations" href="#implementations">libm implementations</a></h3>
+<ul>
+<li>
+unix v7 libm by Robert Morris (rhm) around 1979<br>
+see: "Computer Approximations" by Hart & Cheney, 1968<br>
+used in: plan9, go
+<li>
+cephes by Stephen L. Moshier, 1984-1992<br>
+<a href="http://www.netlib.org/cephes/">http://www.netlib.org/cephes/</a><br>
+see: "Methods and Programs for Mathematical Functions" by Moshier 1989
+<li>
+fdlibm by K. C. Ng around 1993, updated in 1995?<br>
+(1993: copyright notice: "Developed at SunPro ..")<br>
+(1995: copyright notice: "Developed at SunSoft ..")<br>
+eg see: "Argument Reduction for Huge Arguments: Good to the Last Bit" by Ng 1992<br>
+<a href="http://www.netlib.org/fdlibm">http://www.netlib.org/fdlibm</a><br>
+used in: netbsd, openbsd, freebsd, bionic, musl
+<li>
+libultim by Abraham Ziv around 2001 (lgpl)<br>
+(also known as the ibm accurate portable math lib)<br>
+see: "Fast evaluation of elementary mathematical functions with correctly rounded last bit" by Ziv 1991<br>
+<a href="http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/">http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/</a><br>
+used in: glibc
+<li>
+libmcr by K. C. Ng, 2004<br>
+(sun's correctly rounded libm)
+<li>
+mathcw by Nelson H. F. Beebe, 2010?<br>
+<a href="http://ftp.math.utah.edu/pub/mathcw/">http://ftp.math.utah.edu/pub/mathcw/</a><br>
+(no sources available?)
+<li>
+sleef by Naoki Shibata, 2010<br>
+(simd lib for evaluating elementary functions)<br>
+<a href="http://shibatch.sourceforge.net/">http://shibatch.sourceforge.net/</a>
+<li>
+crlibm by ens-lyon, 2004-2010<br>
+<a href="http://lipforge.ens-lyon.fr/www/crlibm/">http://lipforge.ens-lyon.fr/www/crlibm/</a><br>
+see: <a href="http://lipforge.ens-lyon.fr/frs/?group_id=8&release_id=123">crlibm.pdf</a><br>
+see: "Elementary Functions" by Jean-Michel Muller 2005<br>
+see: "Handbook of Floating-Point Arithmetic" by (many authors from Lyon) 2009<br>
+<li>
+various other closed ones: intel libm, hp libm for itanium,..
+</ul>
+
+<h3><a name="tests" href="#tests">libm tests</a></h3>
+<ul>
+<li><a href="http://www.netlib.org/fp/ucbtest.tgz">ucbtest.tgz</a>
+<li><a href="http://www.jhauser.us/arithmetic/TestFloat.html">TestFloat</a>
+<li><a href="http://cant.ua.ac.be/old/ieeecc754.html">ieeecc754</a>
+<li><a href="http://www.loria.fr/~zimmerma/mpcheck/">mpcheck</a>,
+<a href="http://gforge.inria.fr/projects/mpcheck/">mpcheck</a>
+<li><a href="http://people.inf.ethz.ch/gonnet/FPAccuracy/Analysis.html">FPAccuracy</a>
+<li><a href="http://www.math.utah.edu/~beebe/software/ieee/">beebe's ieee tests</a>
+<li><a href="http://www.vinc17.org/research/testlibm/index.en.html">testlibm</a>
+<li><a href="http://www.vinc17.org/research/fptest.en.html">fptest</a>
+<li>tests in crlibm
+</ul>
+<p>
+multiprecision libs (useful for tests)
+<ul>
+<li><a href="http://mpfr.org/">mpfr</a>
+<li><a href="http://www.multiprecision.org/index.php?prog=mpc">mpc</a>
+<li><a href="http://pari.math.u-bordeaux.fr/">pari</a>
+<li><a href="http://www.apfloat.org/apfloat/">apfloat</a>
+<li><a href="http://code.google.com/p/fastfunlib/">fastfunlib</a>
+<li>scs_lib in crlibm
+</ul>
+
+<p>other links
+<ul>
+<li>ieee standard: http://754r.ucbtest.org/
+<li>extended precision issues: http://www.vinc17.org/research/extended.en.html
+<li>correctly rounded mult: http://perso.ens-lyon.fr/jean-michel.muller/MultConstant.html
+<li>table based sincos: http://perso.ens-lyon.fr/damien.stehle/IMPROVEDGAL.html
+<li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/WCLR.html
+<li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/DECIMALEXP.html
+<li>finding worst cases: http://www.loria.fr/equipes/spaces/slz.en.html
+<li>finding worst cases: http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm
+<li>generating libm functions: http://lipforge.ens-lyon.fr/www/metalibm/
+<li>fast conversion to fixedpoint: http://stereopsis.com/sree/fpu2006.html
+<li>double-double, quad-double arithmetics: http://crd-legacy.lbl.gov/~dhbailey/mpdist/
+<li>papers by kahan: http://www.cs.berkeley.edu/~wkahan/
+<li>fp paper by goldberg: http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html
+<li>math functions in general: http://dlmf.nist.gov/
+</ul>
+
+<p><small>(<a href="/git/?p=www">page history</a>, <a href="/">home</a>)</small>
+</body></html>