GCC 支持 __float128 类型,最大可表示约 104932 的浮点数,精度可达 33 位有效数字。但是,GCC Quad-Precision Math Library 中的 roundq 函数有 BUG:

roundq: round-to-nearest integral value, return __float128

下面就是测试程序(roundq-bug.c):

#include <stdio.h>
#include <quadmath.h>

void test(__float128 x)
{
  printf("     x: %f\n", (double)x);
  printf("roundq: %f\n", (double)roundq(x));
  printf("\n");
}

int main(void)
{
  test(2147483647.6);
  test(2147483648.6);
}

这个程序的运行结果如下所示:

$ gcc roundq-bug.c -lquadmath && ./a.out

     x: 2147483647.600000
roundq: 2147483648.000000

     x: 2147483648.600000
roundq: 2147483648.600000

使用的 gcc 版本如下所示:

$ gcc --version
gcc (GCC) 7.1.1 20170621
Copyright © 2017 Free Software Foundation, Inc.
本程序是自由软件;请参看源代码的版权声明。本软件没有任何担保;
包括没有适销性和某一专用目的下的适用性担保。

从 roundq.c 编译

参考资料6参考资料7中的 roundq.c 和 quadmath-imp.h 两个文件放到当前目录,然后注释掉 quadmath-imp.h 的第 27 行,再使用以下命令编译和运行(即把 -lquadmath 替换为 roundq.c):

$ gcc roundq-bug.c roundq.c && ./a.out

得到的结果也是一样的。然后就可以着手修改 roundq.c,试图修复这个 Bug。

修复 Bug

根据参考资料4第 18 楼 Jakub Jelinekattachment 41744,已经修复了这个 Bug,但是没有进行充分的测试。这个补丁可能要在 gcc 8.0 中发布。

Jakub Jelinek 2017-07-13 12:19:33 UTC           Comment 18

Created attachment 41744 [details]
gcc8-pr65757.patch

Here is a full version, it compiles, no further testing so far.
I guess I can bootstrap/regtest it next, but don't have time for further
testing. Guess it would be nice to tweak glibc math testsuite to test
__float128 with it, I vaguely remember Tobias doing that years ago, but not sure
if he had any patches for that.
...

下面是经过整理后的正确的 roundq.c

#include "quadmath-imp.h"

__float128 roundq(__float128 x)
{
  uint64_t i1, i0;
  GET_FLT128_WORDS64(i0, i1, x);
  int32_t j0 = ((i0 >> 48) & 0x7fff) - 0x3fff;
  if (j0 < 48) {
    if (j0 < 0) {
      i0 &= 0x8000000000000000ULL;
      if (j0 == -1) i0 |= 0x3fff000000000000LL;
      i1 = 0;
    } else {
      uint64_t i = 0x0000ffffffffffffLL >> j0;
      if (((i0 & i) | i1) == 0) return x; // x is integral.
      i0 += 0x0000800000000000LL >> j0;
      i0 &= ~i;
      i1 = 0;
    }
  } else if (j0 > 111) {
    if (j0 == 0x4000) return x + x; // Inf or NaN.
    else return x;
  } else {
    uint64_t i = -1ULL >> (j0 - 48);
    if ((i1 & i) == 0) return x; // x is integral.
    uint64_t j = i1 + (1LL << (111 - j0));
    if (j < i1) i0 += 1;
    i1 = j;
    i1 &= ~i;
  }
  SET_FLT128_WORDS64(x, i0, i1);
  return x;
}

下面是 quadmath-imp.h 中相关的(经过整理后的)内容:

// The Intel x86 and x86-64 series of processors use little-endian.
// Main union type we use to manipulate the floating-point type.
typedef union
{
  __float128 value;

  struct
  {
    uint64_t mant_low:64;
    uint64_t mant_high:48;
    unsigned exponent:15;
    unsigned negative:1;
  } ieee;

  struct
  {
    uint64_t low;
    uint64_t high;
  } words64;

  struct
  {
    uint32_t w3;
    uint32_t w2;
    uint32_t w1;
    uint32_t w0;
  } words32;

  struct
  {
    uint64_t mant_low:64;
    uint64_t mant_high:47;
    unsigned quiet_nan:1;
    unsigned exponent:15;
    unsigned negative:1;
  } nan;

} ieee854_float128;

// Get two 64 bit ints from a long double.
#define GET_FLT128_WORDS64(ix0,ix1,d)  \
do {                                   \
  ieee854_float128 u;                  \
  u.value = (d);                       \
  (ix0) = u.words64.high;              \
  (ix1) = u.words64.low;               \
} while (0)

// Set a long double from two 64 bit ints.
#define SET_FLT128_WORDS64(d,ix0,ix1)  \
do {                                   \
  ieee854_float128 u;                  \
  u.words64.high = (ix0);              \
  u.words64.low = (ix1);               \
  (d) = u.value;                       \
} while (0)

编译和运行:

$ gcc roundq-bug.c roundq.c && ./a.out

     x: 2147483647.600000
roundq: 2147483648.000000

     x: 2147483648.600000
roundq: 2147483649.000000

运行结果是正确的。

参考资料

  1. GCC Quad-Precision Math Library
  2. GCC libquadmath: Math Library Routines
  3. GCC: Additional Floating Types
  4. GCC Bugzilla: Bug 65757: gfortran gives incorrect result for anint with real*16 argument
  5. GCC Bugzilla: Bug 81412: roundq
  6. GCC: roundq.c
  7. GCC: quadmath-imp.h