What is the difference between literals and variables in C (signed vs. unsigned short ints)?

Date: 2022-06-24 11:37:11

I came across the following code in the book Computer Systems: A Programmer's Perspective, 2/E. It works and produces the expected output, which can be explained by the difference between signed and unsigned representations.

#include<stdio.h>
int main() {
    if (-1 < 0u) {
        printf("-1 < 0u\n");
    }
    else {
        printf("-1 >= 0u\n");
    }
    return 0;
}

The code above yields -1 >= 0u. However, the following code, which should be equivalent, does not. In other words,

#include <stdio.h>

int main() {

    unsigned short u = 0u;
    short x = -1;
    if (x < u)
        printf("-1 < 0u\n");
    else
        printf("-1 >= 0u\n");
    return 0;
}

yields -1 < 0u. Why does this happen? I cannot explain it.

Note that I have seen similar questions, but they did not help.

P.S. As @Abhineet said, the dilemma can be resolved by changing short to int. However, how can one explain this phenomenon? In other words, -1 is 0xff ff ff ff in 4 bytes and 0xff ff in 2 bytes. Interpreting these two's-complement bit patterns as unsigned gives 4294967295 and 65535, respectively. Neither is less than 0, so I would expect the output to be -1 >= 0u, i.e. x >= u, in both cases.

Sample output on a little-endian Intel system:

For short:

-1 < 0u
u =
 00 00
x =
 ff ff

For int:

-1 >= 0u
u =
 00 00 00 00
x =
 ff ff ff ff

4 Answers

#1


10  

The code above yields -1 >= 0u

All integer literals (numeric constants) have a type and therefore also a signedness. By default, they are of type int, which is signed. When you append the u suffix, you turn the literal into an unsigned int.

For any C expression where one operand is signed and the other is unsigned, and both have the same conversion rank (as with int and unsigned int), the rule of balancing (formally: the usual arithmetic conversions) implicitly converts the signed operand to the unsigned type.

Conversion from signed to unsigned is well-defined (6.3.1.3):

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

For example, for 32-bit integers on a standard two's complement system, the maximum value of an unsigned int is 2^32 - 1 (4294967295, UINT_MAX in limits.h). One more than the maximum value is 2^32, and -1 + 2^32 = 4294967295, so the value -1 is converted to an unsigned int with the value 4294967295, which is larger than 0.


When you switch the types to short, however, you end up with a small integer type. This is the difference between the two examples. Whenever a small integer type is part of an expression, the integer promotion rule implicitly converts it to a larger int (6.3.1.1):

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

If short is narrower than int on the given platform (as is the case on typical 32- and 64-bit systems), any short or unsigned short is therefore always converted to int, because int can represent all of their values.

So for the expression if (x < u), you actually end up with if ((int)x < (int)u), which behaves as expected (-1 is less than 0).

#2


3  

You're running into C's integer promotion rules.

Operators on types smaller than int automatically promote their operands to int or unsigned int. See comments for more detailed explanations. There is a further step for binary (two-operand) operators if the types still don't match after that (e.g. unsigned int vs. int). I won't try to summarize the rules in more detail than that. See Lundin's answer.

This blog post covers this in more detail, with a similar example to yours: signed and unsigned char. It quotes the C99 spec:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

You can play around with this more easily on something like godbolt, with a function that returns one or zero. Just look at the compiler output to see what ends up happening.

#define mytype short

int main() {
    unsigned mytype u = 0u;
    mytype x = -1;
    return (x < u);
}

#3


2  

Contrary to what you seem to assume, this is not a property of the particular width of the types (here 2 bytes versus 4 bytes) but a question of which rules apply. The integer promotion rules state that short and unsigned short are converted to int on all platforms where the corresponding range of values fits into int. Since this is the case here, both values are preserved and obtain the type int. -1 is perfectly representable in int, as is 0. So the test finds that -1 is smaller than 0.

In the case of testing -1 against 0u, the common conversion chooses the unsigned type as the common type to which both operands are converted. -1 converted to unsigned is the value UINT_MAX, which is larger than 0u.

This is a good example of why you should never use "narrow" types for arithmetic or comparison. Only use them if you have a severe size constraint. This will rarely be the case for simple variables; it mostly applies to large arrays, where you can really gain from storing in a narrow type.

#4


0  

0u is not unsigned short, it's unsigned int.

Edit: An explanation of the behavior: how is the comparison performed?

As answered by Jens Gustedt,

This is called "usual arithmetic conversions" by the standard and applies whenever two different integer types occur as operands of the same operator.

In essence, what it does:

if the types have different widths (more precisely, what the standard calls conversion rank), then it converts to the wider type;
if both types are of the same width, then, except on really weird architectures, the unsigned one wins.

Signed-to-unsigned conversion of the value -1, whatever the type, always results in the highest representable value of the unsigned type.

A more detailed blog post by him can be found here.
