Representing wchar_t and char in WinDbg

Date: 2022-09-01 08:26:54

Note:

/*
* Trivial code
*/
wchar_t *greeting = L"Hello World!";
char *greeting_ = "Hello World!";

WinDbg:

0:000> ?? greeting
wchar_t * 0x00415810
"Hello World!"
0:000> ?? greeting_
char * 0x00415800
"Hello World!"

0:000> db 0x00415800
00415800  48 65 6c 6c 6f 20 57 6f-72 6c 64 21 00 00 00 00  Hello World!....
00415810  48 00 65 00 6c 00 6c 00-6f 00 20 00 57 00 6f 00  H.e.l.l.o. .W.o.
00415820  72 00 6c 00 64 00 21 00-00 00 00 00 00 00 00 00  r.l.d.!.........

Question:

  • What is the purpose of the NUL (00) bytes between the ASCII characters in the wchar_t string on Win32?

3 answers

#1


wchar_t is a wide-character type, and on Windows each wchar_t takes 2 bytes of storage. 'H' as a wchar_t is 0x0048. Since x86 is little-endian, the low byte is stored first, so you see the bytes in memory in the order 48 00.
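
A minimal sketch of that byte layout, in case it helps to see it outside the debugger. It uses uint16_t rather than wchar_t, because wchar_t is only 16 bits wide on Windows; the helper name utf16le_bytes is made up for this illustration and is not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `count` 16-bit code units to `out` in
 * little-endian order (low byte first), which is how x86 stores them.
 * `out` must have room for 2 * count bytes. Returns the bytes written. */
size_t utf16le_bytes(const uint16_t *units, size_t count, unsigned char *out)
{
    assert(units != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        out[2 * i]     = (unsigned char)(units[i] & 0xFF); /* low byte  */
        out[2 * i + 1] = (unsigned char)(units[i] >> 8);   /* high byte */
    }
    return 2 * count;
}
```

Passing the code units 0x0048 and 0x0065 ('H' and 'e') fills the buffer with 48 00 65 00, the same pattern as the start of the dump at 00415810.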

db in WinDbg dumps the raw bytes and, next to them, shows how each byte looks when viewed as an ASCII character, with non-printable bytes such as 00 shown as '.'; hence the H.e.l. ... output you see. You can use 'du' to dump the memory as a Unicode string instead.
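
The right-hand column of the dump can be approximated with a few lines of C. This is only a sketch of the idea, not WinDbg's actual code: it assumes "printable" means the ASCII range 0x20..0x7E, and ascii_column is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: build the ASCII column for `len` bytes.
 * Printable ASCII (assumed here to be 0x20..0x7E) is copied as-is,
 * every other byte, including 00, becomes '.'.
 * `out` must have room for len + 1 chars. */
void ascii_column(const unsigned char *bytes, size_t len, char *out)
{
    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < len; ++i) {
        unsigned char b = bytes[i];
        out[i] = (b >= 0x20 && b <= 0x7E) ? (char)b : '.';
    }
    out[len] = '\0';
}
```

Feeding it the eight bytes 48 00 65 00 6c 00 6c 00 gives "H.e.l.l.", matching what db printed for the wide string.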

#2


The answer is that wchar_t characters are 16-bit quantities on Windows, thus requiring two bytes each. Each one holds a UTF-16 code unit. Since the letters you're using are within the ASCII range, they have values < 128, so the high byte is zero for each 2-byte pair.
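
To make the "high byte is zero" point concrete, here is a small sketch that widens an ASCII string into 16-bit code units. It assumes ASCII-only input and uses uint16_t to stand in for a 16-bit wchar_t; widen_ascii is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy an ASCII string into 16-bit code units.
 * For ASCII, the UTF-16 code unit has the same numeric value as the
 * char, so the upper 8 bits of every unit are zero.
 * `out` must have room for strlen(s) + 1 units. Returns the length. */
size_t widen_ascii(const char *s, uint16_t *out)
{
    size_t n = 0;
    assert(s != NULL && out != NULL);
    for (; s[n] != '\0'; ++n) {
        unsigned char c = (unsigned char)s[n];
        assert(c < 0x80);       /* ASCII only; other input needs a real converter */
        out[n] = (uint16_t)c;   /* e.g. 'H' (0x48) becomes 0x0048 */
    }
    out[n] = 0;                 /* 16-bit terminator: two 00 bytes */
    return n;
}
```

The terminator is also 16 bits wide, which is why the wide string in the dump ends with 00 00 rather than a single 00.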

#3


wchar_t is for Unicode text, while char is for 8-bit characters such as standard ASCII.

In wchar_t, every character is represented in 16 bits, but the "standard" characters sit at the very bottom of the chart, so their upper byte is 00. Traditional Chinese, for example, would have values other than 00 in those bytes.
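
A quick way to check this claim is to count how many code units have a zero high byte. The sketch below again uses uint16_t as a stand-in for a 16-bit wchar_t, and count_zero_high_bytes is a name invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the code units whose high byte is 00,
 * i.e. the ones that show up as "xx 00" in a little-endian dump. */
size_t count_zero_high_bytes(const uint16_t *units, size_t count)
{
    size_t zeros = 0;
    assert(units != NULL);
    for (size_t i = 0; i < count; ++i) {
        if ((units[i] >> 8) == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

For "Hello" (0x0048, 0x0065, 0x006C, 0x006C, 0x006F) all five units qualify. For the two Chinese characters U+4E2D and U+6587 none do, and a little-endian dump would show 2d 4e 87 65 with no 00 bytes in between.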
