C++: semantics of dereferencing with [x]: C-style arrays vs. arrays of pointers

Time: 2022-09-30 22:28:28

I know that a C-style array is stored as a contiguous block of memory. That is why the following code:

#include <iostream>

int main (int argc, char *argv[]) {
    int arr[3][3];
    *(*arr + 5) = 5;  // *arr decays to an int* at arr[0][0]; +5 lands on arr[1][2]
    std::cout << arr[1][2] << std::endl;
    return 0;
}

prints 5. I assume that for C-style arrays, *(*arr + 5) = 5; is roughly equal to the code the compiler produces for arr[1][2] = 5;, isn't it? (Q1)

If so, then the semantics of arr[1][2] (i.e. computing an offset into one block of memory) is totally different from doing the same on a multidimensional pointer array, where every level of nesting results in a pointer being dereferenced. Is that right? (Q2)

Is there any situation where I need to pay attention to that myself, i.e. where the compiler does not itself know what kind of array it is dealing with? (Q3)

(Qx marks my questions)

Thank you in advance and Regards

5 answers

#1


7  

On one hand, you're talking about a two dimensional array, which you can imagine looks something like this in memory:

 0,0 0,1 0,2 1,0 1,1 1,2 2,0 2,1 2,2
┌───┬───┬───┬───┬───┬───┬───┬───┬───┐
│int│int│int│int│int│int│int│int│int│
└───┴───┴───┴───┴───┴───┴───┴───┴───┘

On the other hand, when you have an array of pointers to arrays, it looks like this:

┌────┬────┬────┐
│int*│int*│int*┿━━━━━━━━━━━━━━┓
└─╂──┴─╂──┴────┘              ┃
  ┃    ┗━━━━━━━━┓             ┃
  ▼             ▼             ▼
┌───┬───┬───┐ ┌───┬───┬───┐ ┌───┬───┬───┐
│int│int│int│ │int│int│int│ │int│int│int│
└───┴───┴───┘ └───┴───┴───┘ └───┴───┴───┘ 
 0,0 0,1 0,2   1,0 1,1 1,2   2,0 2,1 2,2

Both of these can be indexed using the same [i][j] syntax. The [] operator is defined so that x[i] is equivalent to *((x) + (i)). If we index x[1][1], we have *((*((x) + (1))) + (1)).

In the first case above, the array name x first undergoes array-to-pointer conversion to get a pointer to its first element (which is itself an array, so we have an int (*)[3]). Then we add 1 to it to move along to the next subarray and dereference it. This subarray then also undergoes array-to-pointer conversion to get a pointer to its first element, to which we again add 1 and dereference. So what we end up with is the 2nd element in the 2nd subarray.

In the second case, we are dealing with an array of pointers. First the array name x undergoes array-to-pointer conversion to get a pointer to its first element (which is an int*). Then we add 1 to it to move to the next pointer in the array, and it is dereferenced to get the int* itself. Then we add 1 to this int* to move along to the next int after the one it is currently pointing at, and we dereference that. That again gives us the 2nd element in the 2nd array of ints.

So, given all that:

  1. Yes. Since the elements of the 2D array are contiguous, pointer arithmetic that treats it as a 1D array of 9 ints reaches the element you expect in practice. (Strictly speaking, the C++ standard does not sanction stepping past the end of the first row through a pointer into that row, but mainstream compilers lay the array out exactly as drawn above.)

  2. Yes, the memory layout of the data in each case is different, so the operations that occur when indexing into them are different.

  3. The compiler always knows which type it is dealing with. It won't let you, for example, implicitly convert a 2D array to an int**; the two types are simply incompatible. However, you still need to make sure you know what the memory layout of your data is.


Sometimes you have the following kind of layout where you have an int**, particularly common when you dynamically allocate an array of pointers that point to dynamically allocated arrays:

┌─────┐
│int**│
└─╂───┘
  ▼
┌────┬────┬────┐
│int*│int*│int*┿━━━━━━━━━━━━━━┓
└─╂──┴─╂──┴────┘              ┃
  ┃    ┗━━━━━━━━┓             ┃
  ▼             ▼             ▼
┌───┬───┬───┐ ┌───┬───┬───┐ ┌───┬───┬───┐
│int│int│int│ │int│int│int│ │int│int│int│
└───┴───┴───┘ └───┴───┴───┘ └───┴───┴───┘ 
 0,0 0,1 0,2   1,0 1,1 1,2   2,0 2,1 2,2

Indexing this layout works almost exactly as in the second case above. The only difference is that the first array-to-pointer conversion is unnecessary, because we already have a pointer.

#2


0  

Q1: This is called pointer arithmetic. In most expressions an array of integers decays to an int* pointing at its first element, and adding 1 to a pointer makes it point to the next item in the array. Address-wise, that is equivalent to increasing the in-memory address by sizeof(int).

Q2: While the two are syntactically equivalent, the compiler treats them differently because the declaration tells it which type it has. If arr is an int**, the indexing is compiled as a chain of pointer dereferences; if it's an int[x][y], the compiler just does the offset maths for you and keeps the storage flat, which is why the pointer arithmetic still works.

Q3: No. If the compiler didn't know what it was dealing with, that would be a fundamental flaw in the language. This is actually pretty well defined.

#3


0  

A1: Static multidimensional arrays (2D in this case) can be accessed using either 2D or 1D indexes. The elements are stored one after the other without gaps.

A2: For a static array the compiler knows the relevant offsets at compile time, so a[x][y] is actually a single pointer dereference.

A3: The compiler always knows the type and will produce code accordingly.

#4


0  

Answer 1:

You are correct. A multidimensional array is just a contiguous block of arrays, where those arrays are contiguous blocks of elements too. So when you create an array like this:

    int array[n1][n2]...[nm];

What the compiler does is allocate a single contiguous block of n1 * n2 * ... * nm elements. That's why your *(*array + 5) works.
In fact, when you index an element of a multidimensional array, as in array[index_1][index_2]...[index_m] = 0;, what the compiler really does is something like this:

    array[index_1 * ( n2 * n3 * ... * nm ) + index_2 * ( n3 * ... * nm ) + ... + index_m] = 0;   // pseudocode: array viewed as one flat sequence of elements

Answer 2:

Multidimensional arrays are not the same as pointers to pointers to pointers to pointers... Multidimensional arrays are, as I said before, contiguous regions of memory, whereas dynamic arrays of pointers to dynamic arrays of pointers to... are arrays whose elements are pointers, that is, things that are designed to be dereferenced. That's why treating a **array as an array[][] may fail, and vice versa. For more info about this topic, see this thread: casting char[][] to char** causes segfault?

#5


-1  

if you define

int arr[A][B];

then

arr[x][y]

is actually

*(*arr + x*B + y) // *arr decays to an int* to the first element; then step x*B + y ints

I originally concluded that the double dereference in your code is wrong; that conclusion was itself wrong. Sorry.
