How to convert a struct to a char array in C

Time: 2022-09-06 10:53:28

I'm trying to convert a struct to a char array to send over the network. However, I get some weird output from the char array when I do.

#include <stdio.h>

struct x
{
   int x;
} __attribute__((packed));


int main()
{
   struct x a;
   a.x=127;
   char *b = (char *)&a;
   int i;
   for (i=0; i<4; i++)
      printf("%02x ", b[i]);
   printf("\n");
   for (i=0; i<4; i++)
      printf("%d ", b[i]);
   printf("\n");
   return 0;
}

Here is the output for various values of a.x (on an X86 using gcc):
127:
7f 00 00 00
127 0 0 0

128:
ffffff80 00 00 00
-128 0 0 0

255:
ffffffff 00 00 00
-1 0 0 0

256:
00 01 00 00
0 1 0 0

I understand the values for 127 and 256, but why do the numbers change when going to 128? Why wouldn't it just be:
80 00 00 00
128 0 0 0

Am I forgetting to do something in the conversion process or am I forgetting something about integer representation?

*Note: This is just a small test program. In a real program I have more in the struct, better variable names, and I convert to little-endian.
*Edit: formatting

10 answers

#1


8  

The x format specifier by itself says that the argument is an unsigned int, and since the promoted value is negative, printf requires eight characters to show all four non-zero bytes of the int-sized value. The 0 flag tells printf to pad the output with zeros, and the 2 says that the minimum field width is two characters. As far as I can tell, printf doesn't provide a way to specify a maximum width, except for strings.

Now then, you're only passing a char, but default argument promotion for "..." parameters turns it into a full int, and bare x tells the function to print that whole int. Try the hh modifier to tell the function to treat the argument as just a char instead:

printf("%02hhx", b[i]);

#2


11  

What you see is the sign-preserving conversion from char to int. The behavior results from the fact that on your system, char is signed (note: char is not signed on all systems). That leads to negative values whenever a bit pattern represents a negative value for a char. Promoting such a char to an int preserves the sign, so the int is negative too. Note that even if you don't write an explicit (int) cast, the compiler automatically promotes the char to an int when passing it to printf. The solution is to convert your value to unsigned char first:

for (i=0; i<4; i++)
   printf("%02x ", (unsigned char)b[i]);

Alternatively, you can use unsigned char* from the start on:

unsigned char *b = (unsigned char *)&a;

And then you don't need any cast at the time you print it with printf.

#3


8  

char is a signed type on your platform (the C standard leaves that choice to the implementation), so with two's complement, 0x80 is -128 for an 8-bit integer (i.e. a byte).

#4


5  

Treating your struct as if it were a char array is undefined behavior. To send it over the network, use proper serialization instead. It's a pain in C++ and even more so in C, but it's the only way your app will work independently of the machines reading and writing.

http://en.wikipedia.org/wiki/Serialization#C

#5


2  

Converting your structure to characters or bytes the way you're doing it is going to lead to issues when you try to make it network neutral. Why not address that problem now? There are a variety of different techniques you can use, all of which are likely to be more "portable" than what you're trying to do. For instance:

  • Sending numeric data across the network in a machine-neutral fashion has long been dealt with, in the POSIX/Unix world, via the functions htonl, htons, ntohl and ntohs. See, for example, the byteorder(3) manual page on a FreeBSD or Linux system.

  • Converting data to and from a completely neutral representation like JSON is also perfectly acceptable. The amount of time your programs spend converting the data between JSON and native forms is likely to pale in comparison to the network transmission latencies.

#6


1  

char is a signed type here, so what you are seeing is the two's complement representation; casting to (unsigned char *) will fix that (Rowland just beat me to it).

On a side note you may want to change

for (i=0; i<4; i++) {
//...
}

to

for (i=0; i<sizeof(a); i++) {
//...
}

#7


1  

The signedness of char array is not the root of the problem! (It is -a- problem, but not the only problem.)

Alignment! That's the key word here. That's why you should NEVER try to treat structs like raw memory. Compilers (and various optimization flags), operating systems, and phases of the moon all do strange and exciting things to the actual location in memory of "adjacent" fields in a structure. For example, if you have a struct with a char followed by an int, the whole struct will typically be EIGHT bytes in memory -- the char, 3 blank, useless bytes, and then 4 bytes for the int. The machine likes to do things like this so that each field sits at an address it can access efficiently.

Take an introductory course to machine architecture at your local college. Meanwhile, serialize properly. Never treat structs like char arrays.

#8


1  

When you go to send it, just use:

(char*)&CustomPacket

to convert. Works for me.

#9


0  

You may want to convert to an unsigned char array.

#10


-1  

Unless you have very convincing measurements showing that every octet is precious, don't do this. Use a readable ASCII protocol like SMTP, NNTP, or one of the many other fine Internet protocols codified by the IETF.

If you really must have a binary format, it's still not safe just to shove out the bytes in a struct, because the byte order, basic sizes, or alignment constraints may differ from host to host. You must design your wire protocol to use well-defined sizes and a well-defined byte order. For your implementation, either use macros like ntohl(3) or use shifting and masking to put bytes into your stream. Whatever you do, make sure your code produces the same results on both big-endian and little-endian hosts.
