Why is the maximum size of an array "too large"?

Date: 2022-07-06 09:05:41

I'm under the same impression as this answer, that size_t is always guaranteed by the standard to be large enough to hold the largest possible type of a given system.

However, this code fails to compile on gcc/Mingw:

#include <stdint.h>
#include <stddef.h>

typedef uint8_t array_t [SIZE_MAX];

error: size of array 'array_t' is too large

Am I misunderstanding something in the standard here? Is size_t allowed to be too large for a given implementation? Or is this another bug in Mingw?

EDIT: further research shows that

typedef uint8_t array_t [SIZE_MAX/2];   // does compile
typedef uint8_t array_t [SIZE_MAX/2+1]; // does not compile

Which happens to be the same as

#include <limits.h>

typedef uint8_t array_t [LLONG_MAX];           // does compile
typedef uint8_t array_t [LLONG_MAX+(size_t)1]; // does not compile

So I'm now inclined to believe that this is a bug in Mingw, because setting the maximum allowed size based on a signed integer type doesn't make any sense.

5 Answers

#1


58  

The limit SIZE_MAX / 2 comes from the definitions of size_t and ptrdiff_t on your implementation, which gives the types ptrdiff_t and size_t the same width.

The C Standard mandates [1] that type size_t is unsigned and type ptrdiff_t is signed.

The result of the difference between two pointers will always [2] have the type ptrdiff_t. This means that, on your implementation, the size of an object must be limited to PTRDIFF_MAX; otherwise a valid difference of two pointers could not be represented in type ptrdiff_t, leading to undefined behavior.
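
As a small illustration of the point about pointer subtraction, here is a minimal sketch in C. The helper names (element_distance, forward_distance, backward_distance) are made up for this example and do not come from the standard; the only thing taken from the standard is that subtracting two pointers into the same array yields a signed ptrdiff_t.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements from `from` to `to`.
   Both pointers must point into (or one past the end of) the same array.
   The subtraction itself produces a ptrdiff_t. */
ptrdiff_t element_distance(const unsigned char *from, const unsigned char *to)
{
    return to - from;
}

/* Distance from the first element to one past the last element
   of a 16-byte buffer. */
ptrdiff_t forward_distance(void)
{
    static unsigned char buffer[16];
    return element_distance(buffer, buffer + 16);
}

/* The same distance measured backwards, which is why the result
   type has to be signed. */
ptrdiff_t backward_distance(void)
{
    static unsigned char buffer[16];
    return element_distance(buffer + 16, buffer);
}
```

Because the result can be negative, ptrdiff_t spends one bit on the sign, and that is the bit that is missing when its width equals that of size_t.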

Thus the value SIZE_MAX / 2 equals the value PTRDIFF_MAX. If the implementation chose to have the maximum object size be SIZE_MAX, then the width of the type ptrdiff_t would have to be increased. But it is much easier to limit the maximum size of an object to SIZE_MAX / 2 than it is to give the type ptrdiff_t a positive range greater than or equal to that of type size_t.
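
The relationship between the two limits can be checked directly. The following is only a sketch: the function names are invented for this example, and the second function returns 1 only on implementations where ptrdiff_t and size_t have the same width (which is an assumption about the platform, not a guarantee of the standard).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if every pointer difference inside an
   object of n bytes is representable in ptrdiff_t, 0 otherwise. */
int size_fits_in_ptrdiff(size_t n)
{
    return (uintmax_t)n <= (uintmax_t)PTRDIFF_MAX;
}

/* Returns 1 when SIZE_MAX / 2 and PTRDIFF_MAX coincide, as they do when
   ptrdiff_t and size_t have the same width. */
int half_size_max_is_ptrdiff_max(void)
{
    return (uintmax_t)(SIZE_MAX / 2) == (uintmax_t)PTRDIFF_MAX;
}
```

On such an implementation size_fits_in_ptrdiff(SIZE_MAX / 2) is 1 while size_fits_in_ptrdiff(SIZE_MAX / 2 + 1) is 0, which matches the boundary observed in the question.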

The Standard offers these comments [3] [4] on the topic.

(Quoted from ISO/IEC 9899:201x)

1 (7.19 Common definitions 2)
The types are
ptrdiff_t
which is the signed integer type of the result of subtracting two pointers;
size_t
which is the unsigned integer type of the result of the sizeof operator;

2 (6.5.6 Additive operators 9)
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. If the result is not representable in an object of that type, the behavior is undefined.

3 (K.3.4 Integer types 3)
Extremely large object sizes are frequently a sign that an object’s size was calculated incorrectly. For example, negative numbers appear as very large positive numbers when converted to an unsigned type like size_t. Also, some implementations do not support objects as large as the maximum value that can be represented by type size_t.

4 (K.3.4 Integer types 4)
For those reasons, it is sometimes beneficial to restrict the range of object sizes to detect programming errors. For implementations targeting machines with large address spaces, it is recommended that RSIZE_MAX be defined as the smaller of the size of the largest object supported or (SIZE_MAX >> 1), even if this limit is smaller than the size of some legitimate, but very large, objects. Implementations targeting machines with small address spaces may wish to define RSIZE_MAX as SIZE_MAX, which means that there is no object size that is considered a runtime-constraint violation.
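
The idea behind that recommendation can be sketched as follows. PLAUSIBLE_SIZE_MAX is a hypothetical stand-in for RSIZE_MAX (many C libraries do not implement Annex K), and both function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RSIZE_MAX, using the (SIZE_MAX >> 1)
   value suggested for machines with large address spaces. */
#define PLAUSIBLE_SIZE_MAX (SIZE_MAX >> 1)

/* Returns 1 if n looks like a legitimate object size, 0 if it is so
   large that it was probably computed incorrectly. */
int size_is_plausible(size_t n)
{
    return n <= PLAUSIBLE_SIZE_MAX;
}

/* A typical bug: a length computed from two offsets where end < start.
   The negative result turns into a huge value when converted to size_t. */
size_t length_from_offsets(long start, long end)
{
    return (size_t)(end - start);
}
```

With these definitions, size_is_plausible(length_from_offsets(10, 5)) is 0, because the converted value lies in the upper half of the size_t range.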

#2


10  

The range of size_t is guaranteed to be sufficient to store the size of the largest object supported by the implementation. The reverse is not true: you are not guaranteed to be able to create an object whose size fills the entire range of size_t.

Under such circumstances the question is: what does SIZE_MAX stand for? The largest supported object size? Or the largest value representable in size_t? The answer is: it is the latter, i.e. SIZE_MAX is (size_t) -1. You are not guaranteed to be able to create objects SIZE_MAX bytes large.
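
A short sketch of what "SIZE_MAX is (size_t) -1" means in practice. The function names are invented for this example; the equality itself follows from the rules for converting to an unsigned type.

```c
#include <stddef.h>
#include <stdint.h>

/* Converting -1 to an unsigned type yields that type's maximum value,
   so this returns 1 on a conforming implementation. */
int size_max_is_minus_one(void)
{
    return (size_t)-1 == SIZE_MAX;
}

/* Counts the value bits of size_t by shifting SIZE_MAX down to zero. */
unsigned size_t_value_bits(void)
{
    unsigned bits = 0;
    for (size_t v = SIZE_MAX; v != 0; v >>= 1)
        bits++;
    return bits;
}
```

Neither function says anything about how large an object may be; they only describe the range of the type.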

The reason behind that is that in addition to size_t, implementations must also provide ptrdiff_t, which is intended (but not guaranteed) to store the difference between two pointers pointing into the same array object. Since type ptrdiff_t is signed, the implementations are faced with the following choices:

  1. Allow array objects of size SIZE_MAX and make ptrdiff_t wider than size_t. It has to be wider by at least one bit. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX or smaller.

  2. Allow array objects of size SIZE_MAX and use ptrdiff_t of the same width as size_t. Accept the fact that pointer subtraction can overflow and cause undefined behavior, if the pointers are farther than SIZE_MAX / 2 elements apart. The language specification does not prohibit this approach.

  3. Use ptrdiff_t of the same width as size_t and restrict the maximum array object size to SIZE_MAX / 2. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX / 2 or smaller.

You are simply dealing with an implementation that decided to follow the third approach.
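
To make the hazard of the second approach concrete, here is a sketch of the check code would need before subtracting two far-apart pointers. The function name index_distance_fits is invented for this example, and the check is deliberately conservative (it does not count the one extra negative value that ptrdiff_t usually has).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the distance between element
   indices i and j of a char array can be represented in ptrdiff_t,
   0 if subtracting the corresponding pointers could overflow. */
int index_distance_fits(size_t i, size_t j)
{
    uintmax_t distance = (i > j) ? (uintmax_t)(i - j) : (uintmax_t)(j - i);
    return distance <= (uintmax_t)PTRDIFF_MAX;
}
```

Under the third approach no array can be large enough for this function to return 0, so pointer subtraction within an array never needs such a guard.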

#3


5  

It looks very much like implementation-specific behaviour.

I'm running Mac OS here, and with gcc 6.3.0 the biggest size I can compile your definition with is SIZE_MAX/2; with SIZE_MAX/2 + 1 it does not compile anymore.

On the other hand, with clang 4.0.0 the biggest one is SIZE_MAX/8, and SIZE_MAX/8 + 1 breaks.

#4


1  

Just reasoning from scratch, size_t is a type that can hold the size of any object. The size of any object is limited by the width of the address bus (ignoring multiplexing and systems that can handle e.g. both 32- and 64-bit code; call that "code width"). Analogous to INT_MAX, which is the largest value of int, SIZE_MAX is the largest value of size_t. Thus, an object of size SIZE_MAX would occupy all addressable memory. It is reasonable for an implementation to flag that as an error; however, I agree that it is an error only in a case where an actual object is allocated, be it on the stack or in global memory. (A call to malloc for that amount will fail anyway.)
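
The remark about malloc can be tried out with a sketch like the one below. The function name can_allocate is invented for this example, and the claim that a SIZE_MAX request fails is an expectation about typical implementations rather than something the standard spells out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc can provide n bytes,
   0 if it reports failure by returning NULL. */
int can_allocate(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        return 0;
    /* Touch the block through a volatile pointer so that an optimizer
       cannot discard the allocation. Skipped for n == 0, where the
       returned pointer must not be dereferenced. */
    if (n > 0)
        *(volatile unsigned char *)p = 0;
    free(p);
    return 1;
}
```

A small request such as can_allocate(64) is expected to succeed, while can_allocate((size_t)-1) is expected to return 0.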

#5


1  

First of all, size_t is used to hold the result of the sizeof operator. So, it is guaranteed to have a size that can hold the "value" of SIZE_MAX. Theoretically, that should allow you to define any object with size SIZE_MAX.

Then, if I remember correctly, you are limited by the upper limit of system-wide resources. These bounds are not imposed by the C standard, but rather by the OS/environment.

Check the output of ulimit -a. You can also modify the limit using ulimit -s <size> for the stack, as long as the array you're defining is stored on the stack. Otherwise, for global arrays, you probably need to check the allowed size of .DATA or .BSS for your OS. So, this is environment dependent (or implementation dependent).
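
The stack limit can also be read from inside a program. This is a sketch that assumes a POSIX system, where getrlimit and RLIMIT_STACK are available; the function name stack_soft_limit is invented for this example.

```c
#include <sys/resource.h>

/* Hypothetical helper (POSIX only): stores the soft stack size limit in
   bytes in *limit. Returns 0 on success, -1 if getrlimit fails.
   The stored value may be RLIM_INFINITY when no limit is set. */
int stack_soft_limit(rlim_t *limit)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *limit = rl.rlim_cur;
    return 0;
}
```

This is the same soft limit that ulimit -s deals with, though shells commonly display it in units of 1024 bytes rather than bytes.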
