How would you implement the function pilloried on The Daily WTF?

The Daily WTF for 2008-11-28 pillories the following code:

static char *nice_num(long n)
{
    int neg = 0, d = 3;
    char *buffer = prtbuf;
    int bufsize = 20;

    if (n < 0)
    {
        neg = 1;
        n = -n;
    }
    buffer += bufsize;
    *--buffer = '\0';

    do
    {
        *--buffer = '0' + (n % 10);
        n /= 10;
        if (--d == 0)
        {
            d = 3;
            *--buffer = ',';
        }
    }
    while (n);

    if (*buffer == ',') ++buffer;
    if (neg) *--buffer = '-';
    return buffer;
}

How would you write it?

7 Solutions

#1

If you're a seasoned C programmer, you'll realize this code isn't actually that bad. It's relatively straightforward (for C), and it's blazingly fast. It has three problems:

- It fails on the edge case of LONG_MIN (-2,147,483,648 for a 32-bit long): negating that value overflows, which is undefined behavior in C and in practice yields LONG_MIN again on two's-complement machines
- It assumes 32-bit integers - for 64-bit longs, a 20-byte buffer is not big enough
- It's not thread-safe - it uses a global static buffer, so multiple threads calling it at the same time will result in a race condition
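
The shared-buffer problem in the last point shows up even without threads, because every call returns a pointer into the same storage. The sketch below reproduces the function from the question; the definition of `prtbuf` and the helper `second_call_clobbers_first` are assumptions added for illustration, since the original never shows how `prtbuf` is declared.

```c
#include <string.h>

/* Assumed definition: the original snippet uses prtbuf but never shows it. */
static char prtbuf[20];

/* The function from the question, unchanged apart from layout. */
static char *nice_num(long n)
{
    int neg = 0, d = 3;
    char *buffer = prtbuf;
    int bufsize = 20;

    if (n < 0) { neg = 1; n = -n; }
    buffer += bufsize;
    *--buffer = '\0';
    do {
        *--buffer = '0' + (n % 10);
        n /= 10;
        if (--d == 0) { d = 3; *--buffer = ','; }
    } while (n);
    if (*buffer == ',') ++buffer;
    if (neg) *--buffer = '-';
    return buffer;
}

/* Hypothetical helper: returns 1 when a second call has overwritten the
   string returned by the first call. */
int second_call_clobbers_first(void)
{
    char saved[20];
    char *first = nice_num(1234);

    strcpy(saved, first);   /* keep a private copy of the first result */
    nice_num(5678);         /* writes into the same prtbuf */
    return strcmp(first, saved) != 0;
}
```

With two threads the same overwrite can happen halfway through a call, which is why a caller-supplied buffer is the usual remedy.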

Problem #1 is easily solved with a special case. To address #2, I'd separate the code into two functions, one for 32-bit integers and one for 64-bit integers. #3 is a little harder - we have to change the interface to make it completely thread-safe.

Here is my solution, based on this code but modified to address these problems:

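#include <stdint.h>
#include <string.h>

/* Formats n with thousands separators into buffer (capacity len bytes).
   Returns 0 on success, or 1 if the result had to be truncated. */
static int nice_num(char *buffer, size_t len, int32_t n)
{
  int neg = 0, d = 3;
  char buf[16];  /* fits "-2,147,483,647" plus the terminating NUL */
  size_t bufsize = sizeof(buf);
  char *pbuf = buf + bufsize;

  if(n < 0)
  {
    if(n == INT32_MIN)
    {
      /* -INT32_MIN is not representable, so emit the text directly. */
      strncpy(buffer, "-2,147,483,648", len);
      if(len > 0) buffer[len - 1] = '\0';  /* strncpy may not terminate */
      return len <= 14;
    }

    neg = 1;
    n = -n;
  }

  *--pbuf = '\0';

  /* Build the digits right to left, adding a comma after every third one. */
  do
  {
    *--pbuf = '0' + (n % 10);
    n /= 10;
    if(--d == 0)
    {
      d = 3;
      *--pbuf = ',';
    }
  }
  while(n > 0);

  if(*pbuf == ',') ++pbuf;  /* drop a comma with no digit in front of it */
  if(neg) *--pbuf = '-';

  strncpy(buffer, pbuf, len);
  if(len > 0) buffer[len - 1] = '\0';  /* strncpy may not terminate */
  return len <= strlen(pbuf);
}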

Explanation: it creates a local buffer on the stack and fills it in the same way as the original code. Then it copies the result into a buffer passed in by the caller, making sure not to overflow that buffer. It also has a special case for INT32_MIN. The return value is 0 if the caller's buffer was large enough, or 1 if the buffer was too small and the resulting string was truncated.

#2

Hmm... I guess I shouldn't admit this, but my int-to-string routine for an embedded system works in pretty much exactly the same way (but without putting in the commas).

It's not particularly straightforward, but I wouldn't call it a WTF if you're working on a system that you can't use snprintf() on.

The guy who wrote the above probably noted that the printf() family of routines can't do comma grouping, so he came up with his own.

Footnote: there are some libraries where the printf() style formatting does support grouping, but they are not standard. And I know that the posted code doesn't support other locales that group using '.'. But that's hardly a WTF, just possibly a bug.

#3

That's probably pretty close to the way I would write it, actually. The only thing I can immediately see that is wrong with the solution is that it doesn't work for LONG_MIN on machines where LONG_MIN is -(LONG_MAX + 1), which is most machines nowadays. I might use localeconv to get the thousands separator instead of assuming a comma, and I might calculate the buffer size more carefully, but the algorithm and implementation seem pretty straightforward to me, not really much of a WTF for C (there are much better solutions for C++).
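
The localeconv idea could look roughly like the sketch below. Both function names are hypothetical, the `grouping` field of `struct lconv` is ignored in favor of fixed groups of three, and sign handling is left out to keep it short. In the default "C" locale the separator is an empty string, so the sketch falls back to a comma.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the current locale's thousands separator, or ","
   when the locale defines none (as the default "C" locale does). */
const char *thousands_sep_or_comma(void)
{
    const struct lconv *lc = localeconv();
    if (lc->thousands_sep != NULL && lc->thousands_sep[0] != '\0')
        return lc->thousands_sep;
    return ",";
}

/* Hypothetical helper: writes u into out with sep between groups of three
   digits. Returns the string length, or 0 if cap is too small. */
size_t group_with_sep(char *out, size_t cap, unsigned long u, const char *sep)
{
    char digits[sizeof(unsigned long) * 3 + 1];  /* least significant first */
    size_t nd = 0;

    do {
        digits[nd++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    /* The separator may be longer than one byte in some locales. */
    size_t seplen = strlen(sep);
    size_t need = nd + (nd - 1) / 3 * seplen;
    if (need + 1 > cap) return 0;

    size_t w = 0;
    for (size_t i = nd; i-- > 0; ) {
        out[w++] = digits[i];
        if (i != 0 && i % 3 == 0) {   /* a full group of three follows */
            memcpy(out + w, sep, seplen);
            w += seplen;
        }
    }
    out[w] = '\0';
    return w;
}
```

A caller would typically run `setlocale(LC_NUMERIC, "")` once at startup and then pass `thousands_sep_or_comma()` as the `sep` argument.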

#4

Lisp:

(defun pretty-number (x) (format t "~:D" x))

I'm surprised how easily I could do this. I'm not even past the first chapter in my Lisp book. xD (Or should I say, ~:D)

#5

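#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

/* Writes n into s with ',' between groups of three digits, recursing on the
   higher-order groups first. snprintf returns the length it wanted to write,
   so s must be large enough for the full result. */
size_t
signed_as_text_grouped_on_powers_of_1000(char *s, ssize_t max, int n)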
{
    if (max <= 0)
        return 0;

    size_t r=0;
    bool more_groups = n/1000 != 0;
    if (more_groups)
    {
       r = signed_as_text_grouped_on_powers_of_1000(s, max, n/1000);
       r += snprintf(s+r, max-r, ",");
       n = abs(n%1000);
       r += snprintf(s+r, max-r, "%03d",n);
    } else
       r += snprintf(s+r, max-r, "% 3d", n);

    return r;
}

Unfortunately, this is about 10x slower than the original.

#6

In pure C:

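#include <stdio.h>
#include <limits.h>

/* TODO Make this a user setting. */
#ifndef IM_AMERICAN
#       define IM_AMERICAN_BOOL 0
#else
#       define IM_AMERICAN_BOOL 1
#endif

/* Formats num in the given base (2 to 10), inserting separator between groups
   of three digits. Returns a pointer into a static buffer, so the result is
   overwritten by the next call. */
static char *prettyNumber(long num, int base, char separator)
{
/* Worst case is base 2: one digit per bit, one separator per three digits,
   and a sign. */
#define bufferSize      (sizeof(long) * CHAR_BIT * 4 / 3 + 2)
        static char buffer[bufferSize + 1];
        unsigned int pos = 0;

        /* We're walking backwards because numbers are right to left. */
        char *p = buffer + bufferSize;
        *p = '\0';

        int negative = num < 0;

        do
        {
                /* num % base is negative for negative num; take its magnitude
                   rather than negating num, which overflows for LONG_MIN. */
                int digit = (int)(num % base);
                if(digit < 0)
                        digit = -digit;

                *(--p) = (char)('0' + digit);
                ++pos;

                num /= base;

                /* This the last of a digit group, with more digits to come? */
                if(pos % 3 == 0 && num != 0)
                {
                        /* Handle special thousands case (no separator when
                           only one more digit remains). */
                        if(!IM_AMERICAN_BOOL && pos == 3 && num < base && num > -base)
                        {
                                /* DO NOTHING */
                        }
                        else
                        {
                                *(--p) = separator;
                        }
                }
        } while(num);

        if(negative)
                *(--p) = '-';

        return p;
#undef bufferSize
}

int main(int argc, char **argv)
{
        for(int i = 1; i < argc; ++i)
        {
                long num = 0;

                /* Skip arguments that do not parse as a long. */
                if(sscanf(argv[i], "%ld", &num) != 1)
                        continue;

                printf("%ld = %s\n", num, prettyNumber(num, 10, ' '));
        }

        return 0;
}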

Normally I'd return an alloc'd buffer, which would need to be free'd by the user. This addition is trivial.
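
A heap-allocating variant along those lines might look like the sketch below. The name `pretty_number_alloc` is hypothetical, the base is fixed at 10, and the four-digit special case from the code above is left out for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: returns a malloc'd string that the caller must
   free(), or NULL if the allocation fails. */
char *pretty_number_alloc(long num, char separator)
{
    char tmp[sizeof(long) * 4 + 2];   /* room for digits, separators, sign, NUL */
    char *p = tmp + sizeof tmp;
    int d = 0;

    /* Work on the unsigned magnitude so that LONG_MIN needs no special case. */
    unsigned long u = num < 0 ? 0UL - (unsigned long)num : (unsigned long)num;

    *--p = '\0';
    do {
        if (d == 3) { *--p = separator; d = 0; }
        *--p = (char)('0' + u % 10);
        u /= 10;
        ++d;
    } while (u != 0);
    if (num < 0) *--p = '-';

    size_t size = (size_t)(tmp + sizeof tmp - p);   /* includes the NUL */
    char *result = malloc(size);
    if (result != NULL)
        memcpy(result, p, size);
    return result;
}
```

Returning owned memory removes the shared static buffer, at the cost of an allocation per call and a `free` obligation for the caller.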

#7

I got bored and made this naive implementation in Perl. Works.

sub pretify {
    my $num = $_[0];
    my $numstring = sprintf( "%f", $num );

    # Split into whole/decimal
    my ( $whole, $decimal ) = ( $numstring =~ /(^\d*)(\.\d+)?/ );
    my @chunks;
    my $output = '';

    # Pad whole into multiples of 3
    $whole = q{ } x ( 3 - ( length $whole ) % 3 ) . $whole;

    # Create an array of all 3 parts.
    @chunks = $whole =~ /(.{3})/g;

    # Reassemble with commas
    $output = join ',', @chunks;
    if ($decimal) {
        $output .= $decimal;
    }

    # Strip Padding ( and spurious commas )
    $output =~ s/^[ ,]+//;

    # Strip excess trailing zeros
    $output =~ s/0+$//;

    # Ending with . is ugly
    $output =~ s/\.$//;
    return $output;
}

print "\n", pretify 100000000000000000000000000.0000;
print "\n", pretify 10_202_030.45;
print "\n", pretify 10_101;
print "\n", pretify 0;
print "\n", pretify 0.1;
print "\n", pretify 0.0001;
print "\n";
