nsz Git - www/blob - libm/index.html

   1 <html><head><title>libm</title></head><body>
   2 <h2>libm</h2>
   3
   4 <p>This page is about libm for the
   5 <a href="http://www.musl-libc.org/">musl</a> libc.
   6
   7 <ul>
   8 <li><a href="#sources">Sources</a>
   9 <li><a href="#rules">General rules</a>
  10 <li><a href="#representation">Representation</a>
  11 <li><a href="#ugly">Ugly</a>
  12 <li><a href="#implementations">libm implementations</a>
  13 <li><a href="#tests">libm tests</a>
  14 </ul>
  15
  16 <h3><a name="sources" href="#sources">Sources</a></h3>
  17 <p>Writing math code from scratch is a huge work so already existing code is
  18 used. Several math functions are taken from the
  19 <a href="http://freebsd.org/">freebsd</a> libm and a few from the
  20 <a href="http://openbsd.org/">openbsd</a> libm implementations.
  21 Both of them are based on <a href="http://www.netlib.org/fdlibm/">fdlibm</a>.
  22 The freebsd libm seems to be the most well maintained and most correct version
  23 of fdlibm.
  24 <p>sources:
  25 <ul>
  26 <li>freebsd /lib/msun (<a href="http://svnweb.FreeBSD.org/base/head/lib/msun/">browse</a>)
  27 <pre>
  28 svn checkout svn://svn.freebsd.org/base/head/lib/msun
  29 </pre>
  30 <li>openbsd /src/lib/libm (<a href="http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libm/">browse</a>)
  31 <pre>
  32 cvs -d <a href="http://openbsd.org/anoncvs.html#CVSROOT">$CVSROOT</a> get src/lib/libm
  33 </pre>
  34 </ul>
  35
  36 <h3><a name="rules" href="#rules">General rules</a></h3>
  37 <ul>
  38 <li>Assumptions about floating-point representation and arithmetics
  39 (see <a href="http://port70.net/~nsz/c/c99/n1256.html#F.2">c99 annex F.2</a>):
  40 <ul>
  41 <li>float is ieee binary32
  42 <li>double is ieee binary64
  43 <li>long double is either ieee binary64 or little-endian 80bit extended precision (x87 fpu)
  44 </ul>
  45 (other long double representations may be supported in the future, until then
  46 long double math functions will be missing on non-supported platforms)
  47 <li>On soft-float architectures fenv should not be expected to work according to
  48 the c and ieee standards (ie. rounding modes and exceptions are not supported,
  49 the fenv functions are dummy ones)
  50 <li>Floating-point exception flags should be properly set in math functions
  51 according to c99 annex F, but without using fenv.h
  52 (eg. overflow flag can be raised by 0x1p900*0x1p900, because this works even
  53 without fenv support)
  54 <li>Most functions need to be precise only in nearest rounding mode.
  55 <li>Returned floating-point values should be correctly rounded in most cases,
  56 but the last bit may be wrong:
  57 <pre>
  58     |error| &lt; 1.5 ulp
  59 </pre>
  60 should hold for most functions.
  61 (error is the difference between the exact result and the calculated
  62 floating-point value)
  63 (in theory correct rounding can be achieved but with big implementation cost,
  64 see <a href="http://lipforge.ens-lyon.fr/www/crlibm/">crlibm</a>)
  65 <li>At least the following functions must be correctly rounded:
  66 ceil, copysign, fabs, fdim, floor, fma, fmax, fmin, fmod, frexp, ldexp, logb,
  67 modf, nearbyint, nextafter, nexttoward, rint, remainder, remquo, round, scalbln,
  68 scalbn, sqrt, trunc.
  69 <li>Mathematical properties of functions should be as expected
  70 (monotonicity, range, symmetries).
  71 <li>If the FPU precision is altered then nothing is guaranteed to work.
  72 (ie. when long double does not have full 80bit precision on i386 then things
  73 may break, this also means that compiler must spill fpu registers at 80bits
  74 precision)
  75 <li>Signaling NaN is not supported
  76 <li>Quiet NaN is supported but all NaNs are treated equally without special
  77 attention to the internal representation of a NaN
  78 (eg. the sign of NaN may not be preserved).
  79 <li>Most gcc bug workarounds should be removed from the code
  80 (FORCE_EVAL is used when expressions must be evaluated for their side-effect, other
  81 usage of volatile is not justified, hacks around long double constants are
  82 not justified eventhough gcc can miscompile those with non-default FPU setting)
  83 <li>When excessive precision is not harmful, temporary variables
  84 should be float_t or double_t (so on i386 no superfluous store is
  85 generated)
  86 <li>Whenever fenv is accessed the FENV_ACCESS pragma of c99 should be used
  87 (eventhough gcc does not yet support it), and all usage of optional FE_
  88 macros should be protected by #ifdef
  89 <li>For bit manipulation of floating-point values an union should be used
  90 (eg. union {float f; uint32_t i;})
  91 <li>uint32_t and uint64_t should be used for bit manipulations.
  92 (eg signed int must not be used in bit shifts etc when it might invoke
  93 undefined or implementation defined behaviour).
  94 <li>POSIX namespace rules must be respected.
  95 <li>c99 hexfloat syntax (0x1.0p0) should be used when it makes the code
  96 clearer, but not in public header files
  97 (those should be c++ and ansi c compatible)
  98 <li>The 'f' suffix should be used for single precision values (0.1f) when the
  99 value cannot be exactly represented or the type of the arithmetics is important
 100 ((float)0.1 is not ok, that style may lead to double rounding issues, but eg.
 101 1.0 or 0.5 may be used instead of 1.0f or 0.5f in some cases)
 102 <li>Prefer classification macros (eg. isnan) over inplace bit hacks.
 103 <li>For every math function there should be a c implementation.
 104 (a notable exception now is sqrtl, since most fpu has instruction for it
 105 and on soft-float architectures long double == double)
 106 <li>The c implementation of a long double function should use ifdefs with the
 107 LDBL_MANT_DIG etc constants from float.h for architecture specific
 108 implementations.
 109 <li>In the musl source tree math.h functions go to src/math, complex.h functions
 110 to src/complex and fenv.h functions to src/fenv. And files are named after the
 111 functions they implement.
 112 </ul>
 113
 114 <h3><a name="representation" href="#representation">Representation</a></h3>
 115 <p>
 116 Binary representation of floating point numbers matter
 117 because bit hacks are often needed in the math code.
 118 (in particular bit hacks are used instead of relational operations for nan
 119 and sign checks becuase relational operators raise invalid fp exception on nan
 120 and they treat -0.0 and +0.0 equally and more often than not these are not desired)
 121 <p>
 122 float and double bit manipulation can be handled in a portable way in c using
 123 union types:
 124 <ul>
 125 <li>union {float f; uint32_t i;};
 126 <li>union {double f; uint64_t i;};
 127 </ul>
 128 (assuming the bits in the object representation of 32bit and 64bit unsigned ints
 129 map to the floating-point representation according to ieee-754, this is not
 130 always the case, eg. old
 131 <a href="http://wiki.debian.org/ArmEabiPort#ARM_floating_points">arm floating-point accelerator</a>
 132 (FPA) used mixed endian double representation, but musl does not support the old
 133 arm ABI)
 134 <p>
 135 long double bit manipulation is harder as there are various representations
 136 and some of them don't map to any unsigned integer type:
 137 <ul>
 138 <li>ld64: long double is the same as double (ieee binary64)
 139 <li>ld80: 80bit extended precision format [<a href="https://en.wikipedia.org/wiki/Extended_precision">wikipedia</a>]
 140 <li>ld128: quadruple precision, ieee binary128 [<a href="https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format">wikipedia</a>]
 141 </ul>
 142 (and other non-standard formats are not supported)
 143 <p>
 144 In case of ld64 the bit manipulation is the same as with double
 145 and all long double math functions can be just wrappers around the
 146 corresponding double ones.
 147 (using symbol aliasing on the linker level is non-conformant
 148 since functions would not have unique address then)
 149 <p>
 150 ld80 is the most common long double on linux (i386 and x86_64 abi),
 151 it means 64bit significand with explicit msb
 152 (inconsistent with other ieee formats), 15bit exp, 1 sign bit.
 153 The m68k (and m88k) architecture uses the same format, but different endianness:
 154 <ul>
 155 <li>union {long double f; struct{uint64_t m; uint16_t se; uint16_t pad;} i;}; // x86
 156 <li>union {long double f; struct{uint16_t se; uint16_t pad; uint64_t m;} i;}; // m68k
 157 </ul>
 158 where m is the significand and se is the sign and exponent.
 159 <p>
 160 hardware ld128 is rare (eg. sparc64 and aarch64 with software emulation), it means
 161 113bit significand with implicit msb, 15bit exp, 1 sign bit:
 162 <ul>
 163 <li>union {long double f; struct{uint16_t se; uint16_t hi; uint32_t mid; uint64_t lo;} i;};
 164 </ul>
 165 <p>
 166 There are other non-conformant long double types: eg. the old SVR4 abi for ppc
 167 uses 128 bit long doubles, but it's software emulated and traditionally
 168 implemented using
 169 <a href="https://en.wikipedia.org/wiki/Quadruple_precision#Double-double_arithmetic">two doubles</a>
 170 (also called ibm long double as this is what ibm aix used on ppc).
 171 The ibm s390 supports the ieee 754-2008 compliant binary128 floating-point
 172 format, but previous ibm machines (S/370, S/360) used slightly different
 173 representation.
 174 <p>
 175 This variation shows the difficulty to consistently handle
 176 long double: the solution is to use ifdefs based on float.h and
 177 on the endianness and write different code for different architectures.
 178
 179 <h3><a name="ugly" href="#ugly">Ugly</a></h3>
 180 <p>The ugly parts of libm hacking.
 181 <p>Some notes are from:
 182 <a href="http://www.vinc17.org/research/extended.en.html">http://www.vinc17.org/research/extended.en.html</a>
 183 <p>Useful info about floating-point in gcc:
 184 <a href="http://gcc.gnu.org/wiki/FloatingPointMath">http://gcc.gnu.org/wiki/FloatingPointMath</a>
 185
 186 <ul>
 187 <li>Double rounding:
 188 <p>
 189 If a value rounded twice the result can be different
 190 than rounding just once.
 191 <p>
 192 The C language allows arithmetic to be evaluated in
 193 higher precision than the operands have. If we
 194 use x87 fpu in extended precision mode it will round
 195 the results twice: round to 80bit when calculating
 196 and then round to 64bit when storing it, this can
 197 give different result than a single 64bit rounding.
 198 (on x86-linux the default fpu setting is to round the
 199 results in extended precision, this only affects x87 instructions, not sse2 etc)
 200 (freebsd and openbsd use double precision by default)
 201 <p>
 202 So x = a+b may give different results depending on
 203 the x87 fpu precision setting.
 204 (only happens in round to nearest rounding mode,
 205 but that's the default)
 206 <p>
 207 (double rounding can happen with float vs double as well)
 208 <p>
 209 Note that the value of the result can only be ruined by
 210 double rounding in nearest rounding mode, but the double
 211 rounding issue haunts directed rounding modes as well:
 212 raising the underflow flag might be omitted.
 213 On x86 with downward rounding
 214 <pre>
 215 (double)(0x1p-1070 + 0x1p-2000L)
 216 </pre>
 217 does not raise underflow (only inexact) eventhough the
 218 final result is an inexact subnormal.
 219
 220
 221 <li>Wider exponent range (x87 issue):
 222 <p>
 223 Even if the fpu is set to double precision
 224 (which is not) the x87 registers use wider exponent
 225 range (mant:exp is 53:15 instead of 53:11 bits)
 226 so underflows (subnormals) may not be treated
 227 as expected. Rounding to double only occurs
 228 when a value is stored into memory.
 229 <p>
 230 Actually this beahviour is allowed by the ieee 754
 231 standard, but it can cause problems (see
 232 <a href="http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/">infamous php bug</a>)
 233
 234 <li>Evaluation precision:
 235 <p>
 236 C with FLT_EVAL_METHOD>=0 requires consistent evaluation
 237 precision, but gcc does not follow this on i386 and may
 238 store intermediate results and round them to double while keep
 239 other parts in extended precision.
 240 (And the precision gcc uses can be inconsistent:
 241 adding a printf to the code may spill 80bit float registers
 242 into memory changing the result of a nearby calculation).
 243 So
 244 <pre>
 245 (a+b)==(a+b)
 246 </pre>
 247 may be false with gcc and FLT_EVAL_METHOD!=0, when the two sides
 248 are kept in different precision.
 249 If the highest precision (long double) is used
 250 everywhere then there is no such issue, but that can have a huge penalty.
 251 (clang uses sse by default on i386 with FLT_EVAL_METHOD==0,
 252 while gcc uses the 80bit x87 fp registers and FLT_EVAL_METHOD==2)
 253 <p>
 254 C99 has a way to control this (see
 255 <a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#5.1.2.3">5.1.2.3 example 4</a>,
 256 <a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.5">6.3.1.5</a> and
 257 <a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.3.1.8">6.3.1.8</a>):
 258 when a result is stored in a variable or a
 259 type cast is used, then it is guaranteed that
 260 the precision is appropriate to that type.
 261 So in
 262 <pre>
 263 (double)(a+b)==(double)(a+b)
 264 </pre>
 265 both sides are guaranteed to be in double precision
 266 when the comparision is done.
 267 <p>
 268 Unfortunately gcc does not respect the standard
 269 and even if assingment or cast is used the result
 270 may be kept in higher precision
 271 (infamous <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323">gcc bug323</a>
 272 also see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36578">gcc bug36578</a>).
 273 gcc 4.5 fixed it with '-fexcess-precision=standard'
 274 (it is enabled by '-std=c99', but the default is
 275 '-fexcess-precision=fast')
 276 (An alternative solution would be if gcc spilled the
 277 registers with temporary results without rounding,
 278 storing the 80 bit registers entirely in memory
 279 which would make the behaviour under FLT_EVAL_METHOD==2
 280 predictable)
 281 <p>
 282 The workaround for older gcc is to force the
 283 compiler to store the intermediate results:
 284 by using volatile double temporary variables
 285 or by '-ffloat-store' (which affects all
 286 intermediate results, but is not guaranteed
 287 by the gcc docs to always work).
 288 <p>
 289 (Sometimes the excess precision is good
 290 then float_t and double_t may be used explicitly)
 291
 292 <li>Float literals
 293 <p>
 294 The standard allows 1 ulp errors in the conversion
 295 of decimal floating-point literals into floating-point
 296 values (it only requires the same result for the same
 297 literal thus <tt>1.1 - 1.1</tt> is always 0,
 298 but <tt>1.1 - 11e-1</tt> maybe +-0x1p-52 or 0).
 299 <p>
 300 A reasonable compiler always use correctly rounded
 301 conversion according to the default (nearest) rounding
 302 mode, but there are exceptions:
 303 the x87 has builtin constants which are faster to load
 304 from hardware than from memory
 305 (and the hw has sticky bit set correctly for rounding).
 306 gcc can recognize these constants so an operation on
 307 <pre>
 308 3.141592653589793238462643383L
 309 </pre>
 310 can turn into code that uses the <tt>fldpi</tt>
 311 instruction instead of memory loads.
 312 The only issue is that fldpi depends on
 313 the current rounding mode at runtime
 314 so the result can indeed be 1 ulp off compared
 315 to the compile-time rounded value.
 316 <p>
 317 According to the freebsd libm code gcc truncates long double
 318 const literals on i386.
 319 I assume this happens because freebsd sets x87 precision to double precision by default
 320 where gcc is hosted and gcc cannot get the constants right for the target platform,
 321 but i did not observe this on linux.
 322 (as a workaround sometimes double-double arithmetics was used
 323 to initialize long doubles on i386, but most of these should be
 324 fixed in musl's math code now)
 325
 326 <li>Compiler optimizations:
 327 <p>
 328 Runtime and compile time semantics may be different
 329 gcc often does unwanted or even invalid compile
 330 time optimizations.
 331 (constant folding where floating point exception
 332 flags should be raised at runtime, or the result
 333 depends on runtime floating point environment,
 334 or constant expressions are just evaluated with
 335 different precision than at runtime).
 336 <p>
 337 C99 actually allows most of these optimizations
 338 but they can be turned off with STDC pragmas (see
 339 <a href="http://repo.or.cz/w/c-standard.git/blob_plain/HEAD:/n1256.html#6.10.6">6.10.6</a>).
 340 Unfortunately <a href="http://gcc.gnu.org/c99status.html">gcc does not support these pragmas</a>
 341 nor clang (<a href="http://llvm.org/bugs/show_bug.cgi?id=8100">clang bug 8100</a>).
 342 <p>
 343 FENV_ACCESS ON tells the compiler that the code wants
 344 to access the floating point environment (eg. set different rounding mode)
 345 (see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678">gcc bug34678</a>).
 346 <p>
 347 (see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37845">gcc bug37845</a> for FP_CONTRACT pragma
 348 which is relevant when the architecture has fma instruction, x86_64, i386, arm do not have it).
 349 <p>
 350 The workaround is again using named volatile
 351 variables for constants like
 352 <pre>
 353 static const volatile two52 = 0x1p52;
 354 </pre>
 355 and using the '-frounding-math' gcc flag.
 356
 357 <li>ld80 vs ld128
 358 <p>
 359 The two representations are sufficiently different
 360 that treating them together is awkward.
 361 (Especially the explicit msb bit in ld80 can cause
 362 different behaviour).
 363 <p>
 364 In the freebsd libm code a few architecture specific
 365 macros and a union handle these issues, but the
 366 result is often less clear than treating ld80
 367 and ld128 separately.
 368
 369 <li>Signed zeros
 370 <p>Signed zeros can be tricky. They cannot be checked
 371 using the usual comparision operators (+0.0 == -0.0 is true),
 372 but they give different results in some cases
 373 (1/-0.0 == -inf) and can make similarly looking
 374 expressions different eg.
 375 (x + 0) is not the same as x, the former is +0.0 when x is -0.0
 376 <p>
 377 (To check for -0, the signbit macro can be used
 378 or the copysign function or bit manipulation.)
 379
 380 <li>Error handling
 381 <p>
 382 Arithmetics may set various floating point exception flags as a side effect.
 383 These can be queried and manipulated (fetestexcept, feraiseexcept,..).
 384 <p>
 385 So special care is needed
 386 when a library function wants to avoid changing
 387 the floating point status flags.
 388 eg. if one wants to check for -0 silently then
 389 <pre>
 390 if (x == 0.0 &amp;&amp; 1/x &lt; 0) { /* x is a -0 */ }
 391 </pre>
 392 is not ok: 1/x raises the divbyzero flag.
 393 <p>
 394 When a library wants to raise a flag deliberately
 395 but feraiseexcept is not available for some reason,
 396 then simple arithmetics can be be used just for their
 397 exception raising side effect
 398 (eg. 1/0.0 to raise divbyzero), however beaware
 399 of compiler optimizations (constant folding and dead code elimination,..).
 400 <p>
 401 Unfortunately gcc does not always take fp exceptions into
 402 account: a simple x = 1e300*1e300; may not raise overflow
 403 exception at runtime, but get optimized into x = +inf.
 404 see compiler optimizations above.
 405 <p>
 406 Another x87 gcc bug related to fp exceptions is that in some cases
 407 relation operators (&lt;, etc) don't raise invalid when an operand is nan
 408 (eventhough this is required by ieee + c99 annex F).
 409 (see <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52451">gcc bug52451</a>).
 410 <p>
 411 The ieee standard defines signaling and quiet nan
 412 floating-point numbers as well.
 413 The c99 standard only considers quiet nan, but it allows
 414 signaling nans to be supported as well.
 415 Without signaling nans x * 1 is equivalent to x,
 416 but if signaling nan is supported then the former
 417 raises an invalid exception.
 418 This may complicate things further if one wants to write
 419 portable fp math code.
 420 <p>
 421 A further libm design issue is the math_errhandling macro:
 422 it specifies the way math function errors can be checked
 423 which is either fp exceptions or the errno variable or both.
 424 The fp exception approach can be supported with enough care
 425 but errno is hard to support: certain library functions
 426 are implemented as a single asm instruction (eg sqrt),
 427 the only way to set errno is to query the fp exception flags
 428 and then set the errno variable based on that.
 429 So eventhough errno may be convenient, in libm it is
 430 not the right thing to do.
 431 <p>
 432 For soft-float targets exceptions are problematic in c99.
 433 The problem is that at context switches the fpu status should
 434 be saved and restored which is done by the kernel on hard-fp
 435 architectures when the state is in an fpu status word.
 436 In case of soft-fp emulation this must be done by the c runtime:
 437 context switches between threads can be supported with thread local
 438 storage of the exception state, but signal handlers may do floating-point
 439 arithmetics which should not alter the fenv state.
 440 Wrapping signal handlers is not possible/difficult for various
 441 reasons and the compiler cannot know which functions will be used
 442 as signal handlers, so the c runtime has no way to guarantee that
 443 signal handlers do not alter the fenv.
 444 C11 fixed this by changing signal handler semantics: fenv is in unspecified state.
 445
 446 <li>Complex arithmetics
 447 <p>
 448 gcc turns
 449 <pre>
 450 x*I
 451 </pre>
 452 into
 453 <pre>
 454 (x+0*I)*(0+I) = 0*x + x*I
 455 </pre>
 456 So if x=inf then the result is nan+inf*I instead of inf*I
 457 (an fmul instruction is generated for 0*x)
 458 <p>
 459 (There were various complex constant folding issues as well in gcc).
 460 <p>
 461 So a+b*I cannot be used to create complex numbers,
 462 instead some double[2] manipulation should be used to
 463 make sure there is no fmul in the generated code
 464 (freebsd uses a static inline cpack function)
 465 (It's not clear which is better: using union hacks
 466 everywhere in the complex code or creal, cimag, cpack)
 467 <p>
 468 Complex code is hard to make portable across compilers
 469 as there are no good compiler support, it is rarely used
 470 and there are various options available in the standard:
 471 complex support is optional, imaginary type
 472 is optional.
 473 <p>
 474 It's not clear how I (or _Complex_I) should be defined
 475 in complex.h
 476 (literal of the float imaginary unit is compiler specific,
 477 in gcc it can be 1.0fi).
 478
 479 <li>Signed int
 480 <p>
 481 The freebsd libm code has many inconsistencies
 482 (naming conventions, 0x1p0 notation vs decimal notation,..),
 483 one of them is the integer type used for bit manipulations:
 484 The bits of a double are unpacked into one of
 485 int, int32_t, uint32_t and u_int32_t
 486 integer types.
 487 <p>
 488 int32_t is used the most often which is not wrong in itself
 489 but it is used incorrectly in many places.
 490 <p>
 491 int is a bit worse because unlike int32_t it is not guaranteed
 492 to be 32bit two's complement representation. (but of course in
 493 practice they are the same)
 494 <p>
 495 The issues found so far are left shift of negative integers
 496 (undefined behaviour), right shift of negative integers
 497 (implementation defined behaviour), signed overflow
 498 (implementation defined behaviour), unsigned to signed conversion
 499 (implementation defined behaviour).
 500 <p>
 501 It is easy to avoid these issues without performance impact,
 502 but a bit of care should be taken around bit manipulations.
 503 </ul>
 504
 505 <h3><a name="implementations" href="#implementations">libm implementations</a></h3>
 506 <ul>
 507 <li>
 508 unix v7 libm by Robert Morris (rhm) around 1979<br>
 509 see: "Computer Approximations" by Hart & Cheney, 1968<br>
 510 used in: plan9, go
 511 <li>
 512 cephes by Stephen L. Moshier, 1984-1992<br>
 513 <a href="http://www.netlib.org/cephes/">http://www.netlib.org/cephes/</a><br>
 514 see: "Methods and Programs for Mathematical Functions" by Moshier 1989
 515 <li>
 516 fdlibm by K. C. Ng around 1993, updated in 1995?<br>
 517 (1993: copyright notice: "Developed at SunPro ..")<br>
 518 (1995: copyright notice: "Developed at SunSoft ..")<br>
 519 eg see: "Argument Reduction for Huge Arguments: Good to the Last Bit" by Ng 1992<br>
 520 <a href="http://www.netlib.org/fdlibm">http://www.netlib.org/fdlibm</a><br>
 521 used in: netbsd, openbsd, freebsd, bionic, musl
 522 <li>
 523 libultim by Abraham Ziv around 2001 (lgpl)<br>
 524 (also known as the ibm accurate portable math lib)<br>
 525 see: "Fast evaluation of elementary mathematical functions with correctly rounded last bit" by Ziv 1991<br>
 526 <a href="http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/">http://web.archive.org/web/20050207110205/http://oss.software.ibm.com/mathlib/</a><br>
 527 used in: glibc
 528 <li>
 529 libmcr by K. C. Ng, 2004<br>
 530 (sun's correctly rounded libm)
 531 <li>
 532 mathcw by Nelson H. F. Beebe, 2010?<br>
 533 <a href="http://ftp.math.utah.edu/pub/mathcw/">http://ftp.math.utah.edu/pub/mathcw/</a><br>
 534 (no sources available?)
 535 <li>
 536 sleef by Naoki Shibata, 2010<br>
 537 (simd lib for evaluating elementary functions)<br>
 538 <a href="http://shibatch.sourceforge.net/">http://shibatch.sourceforge.net/</a>
 539 <li>
 540 crlibm by ens-lyon, 2004-2010<br>
 541 <a href="http://lipforge.ens-lyon.fr/www/crlibm/">http://lipforge.ens-lyon.fr/www/crlibm/</a><br>
 542 see: <a href="http://lipforge.ens-lyon.fr/frs/?group_id=8&amp;release_id=123">crlibm.pdf</a><br>
 543 see: "Elementary Functions" by Jean-Michel Muller 2005<br>
 544 see: "Handbook of Floating-Point Arithmetic" by (many authors from Lyon) 2009<br>
 545 <li>
 546 various other closed ones: intel libm, hp libm for itanium,..
 547 </ul>
 548
 549 <h3><a name="tests" href="#tests">libm tests</a></h3>
 550 <ul>
 551 <li><a href="http://www.netlib.org/fp/ucbtest.tgz">ucbtest.tgz</a>
 552 <li><a href="http://www.jhauser.us/arithmetic/TestFloat.html">TestFloat</a>
 553 <li><a href="http://cant.ua.ac.be/old/ieeecc754.html">ieeecc754</a>
 554 <li><a href="http://www.loria.fr/~zimmerma/mpcheck/">mpcheck</a>,
 555 <a href="http://gforge.inria.fr/projects/mpcheck/">mpcheck</a>
 556 <li><a href="http://people.inf.ethz.ch/gonnet/FPAccuracy/Analysis.html">FPAccuracy</a>
 557 <li><a href="http://www.math.utah.edu/~beebe/software/ieee/">beebe's ieee tests</a>
 558 <li><a href="http://www.math.utah.edu/pub/elefunt/">elefunt</a>
 559 <li><a href="http://www.vinc17.org/research/testlibm/index.en.html">testlibm</a>
 560 <li><a href="http://www.vinc17.org/research/fptest.en.html">fptest</a>
 561 <li>tests in crlibm
 562 </ul>
 563 <p>
 564 multiprecision libs (useful for tests)
 565 <ul>
 566 <li><a href="http://mpfr.org/">mpfr</a>
 567 <li><a href="http://www.multiprecision.org/index.php?prog=mpc">mpc</a>
 568 <li><a href="http://pari.math.u-bordeaux.fr/">pari</a>
 569 <li><a href="http://www.apfloat.org/apfloat/">apfloat</a>
 570 <li><a href="http://code.google.com/p/fastfunlib/">fastfunlib</a>
 571 <li>scs_lib in crlibm
 572 </ul>
 573
 574 <p>other links
 575 <ul>
 576 <li>ieee standard: http://754r.ucbtest.org/
 577 <li>extended precision issues: http://www.vinc17.org/research/extended.en.html
 578 <li>correctly rounded mult: http://perso.ens-lyon.fr/jean-michel.muller/MultConstant.html
 579 <li>table based sincos: http://perso.ens-lyon.fr/damien.stehle/IMPROVEDGAL.html
 580 <li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/WCLR.html
 581 <li>finding worst cases: http://perso.ens-lyon.fr/damien.stehle/DECIMALEXP.html
 582 <li>finding worst cases: http://www.loria.fr/equipes/spaces/slz.en.html
 583 <li>finding worst cases: http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm
 584 <li>generating libm functions: http://lipforge.ens-lyon.fr/www/metalibm/
 585 <li>fast conversion to fixedpoint: http://stereopsis.com/sree/fpu2006.html
 586 <li>double-double, quad-double arithmetics: http://crd-legacy.lbl.gov/~dhbailey/mpdist/
 587 <li>papers by kahan: http://www.cs.berkeley.edu/~wkahan/
 588 <li>fp paper by goldberg: http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html
 589 <li>math functions in general: http://dlmf.nist.gov/
 590 </ul>
 591
 592 <p><small>(<a href="/git/?p=www">page history</a>, <a href="/">home</a>)</small>
 593 </body></html>