Seg fault with a large 2D array

Time: 2022-09-06 15:19:39

I am writing a program to do some analysis on DNA sequences. Everything works fine except for one thing. I want to declare a 2D array of size m*n, where m and n are read from an input file. The problem arises when m and n get too large: for example, with m = 200 and n = 50000 I get a seg fault at the line where I declare my array.

array[m][n];

Any ideas on how to overcome this? I do need such an array, as my entire logic depends on processing it.

5 solutions

#1


4  

You are probably running out of stack space.
Can you not allocate the array dynamically on the heap using malloc?

You may want to have a look at this answer if you do not know how to do that.

#2


1  

Not sure what type you're using but for the following code I've assumed int.

Rather than doing this:

int array[200][50000];

Try doing this:

/* malloc takes a size in bytes: 200 row pointers, then 50000 ints per row. */
int** array = malloc(200 * sizeof(int*));
for (int i = 0; i < 200; i++)
{
    array[i] = malloc(50000 * sizeof(int));
}

This will allocate "heap" memory rather than "stack" memory. You are asking for about 40MB (200 * 50000 elements at 4 bytes each, if you're using a 32-bit type), so you probably don't have that much "stack" memory.

Make sure to clean up once you're done with the array:

for (int i = 0; i < 200; i++)
{
    free(array[i]);
}
free(array);

Feel free to use m and n instead of the constants I used above!
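
As a rough sketch of the same row-by-row approach with m and n taken at run time and every malloc result checked, something like the following could work. The helper names alloc_rows and free_rows are made up for illustration, and the code assumes int elements and m, n > 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate m rows of n ints each on the heap.
   Returns NULL if any allocation fails, after releasing what was
   already allocated. Assumes m > 0 and n > 0. */
int **alloc_rows(size_t m, size_t n)
{
    assert(m > 0 && n > 0);

    int **rows = malloc(m * sizeof *rows);
    if (rows == NULL)
        return NULL;

    for (size_t i = 0; i < m; i++) {
        rows[i] = malloc(n * sizeof *rows[i]);
        if (rows[i] == NULL) {
            /* Roll back the rows allocated so far. */
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
    }
    return rows;
}

/* Hypothetical helper: release an array obtained from alloc_rows. */
void free_rows(int **rows, size_t m)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < m; i++)
        free(rows[i]);
    free(rows);
}
```

After int **array = alloc_rows(m, n); the elements are still addressed as array[i][j], and free_rows(array, m) releases everything.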

Edit: I originally wrote this in C++, and converted to C. I am a little more rusty with C memory allocation/deallocation, but I believe I got it right.

#3


1  

As others have said, it is not a good idea to allocate a large VLA (variable length array) on the stack. Allocate it with malloc:

double (*array)[n] = malloc(sizeof(double[m][n]));

and you have an object just as before: the compiler knows exactly how to address individual elements array[i][j], and the allocation still gives you one contiguous block of memory.

Just don't forget to do

free(array);

at the end of your scope.
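
To see the pieces together, here is a minimal sketch that allocates the block, checks the result, uses array[i][j] and frees it. The function name fill_and_sum is invented for this example. It relies on variably modified types, which C99 requires but C11 makes optional, so it assumes a compiler that supports them (GCC and Clang do; MSVC does not).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: allocate an m-by-n matrix of doubles as one
   contiguous block, set every element to 1.0 and return the sum.
   Returns -1.0 if the allocation fails. Assumes m > 0 and n > 0. */
double fill_and_sum(size_t m, size_t n)
{
    assert(m > 0 && n > 0);

    /* array points to rows of n doubles, so array[i][j] works as usual. */
    double (*array)[n] = malloc(sizeof(double[m][n]));
    if (array == NULL)
        return -1.0;

    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            array[i][j] = 1.0;

    double sum = 0.0;
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            sum += array[i][j];

    free(array); /* a single free releases the whole block */
    return sum;
}
```

Only the pointer itself lives on the stack here; the m * n doubles are on the heap.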

#4


0  

You are likely running out of stack space.

Windows, for instance, gives each thread a 1MB stack by default. Assuming the array contains 4-byte integers, declaring it on the stack creates a 40MB stack variable.

You should instead dynamically allocate it on the heap.
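
The arithmetic behind the 40MB figure can be checked with a small sketch like the one below. The names array_bytes, fits_on_assumed_stack and ASSUMED_STACK_BYTES are invented for illustration, and the 1MB limit is only the default mentioned above; the real limit depends on the platform and on linker or runtime settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration only: a 1MB per-thread stack. */
#define ASSUMED_STACK_BYTES ((size_t)1024 * 1024)

/* Hypothetical helper: bytes needed for an m-by-n array whose elements
   are elem_size bytes each. Returns 0 if the array is empty or the
   size does not fit in size_t. */
size_t array_bytes(size_t m, size_t n, size_t elem_size)
{
    assert(elem_size > 0);

    if (m != 0 && n > SIZE_MAX / m)
        return 0;
    size_t cells = m * n;
    if (cells > SIZE_MAX / elem_size)
        return 0;
    return cells * elem_size;
}

/* Nonzero if the array would fit under the assumed stack limit. */
int fits_on_assumed_stack(size_t m, size_t n, size_t elem_size)
{
    size_t bytes = array_bytes(m, n, elem_size);
    return bytes != 0 && bytes < ASSUMED_STACK_BYTES;
}
```

For m = 200, n = 50000 and 4-byte elements, array_bytes gives 40000000 bytes, far above the assumed limit, which is consistent with the seg fault at the declaration.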

#5


0  

The array (if local) is allocated on the stack. There are limits on the stack size for a process/thread, and exceeding them causes a stack overflow, which typically shows up as a seg fault.

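But you can allocate the array on the heap using malloc. The available heap is typically far larger than the stack (a 32-bit process is limited to at most 4GB of address space, a 64-bit one has much more), though the exact amount depends on the OS/architecture. Check the return value of malloc to make sure that the memory for the array was actually allocated.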
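
One way to apply that check, sketched here with a single flat block and manual row-major indexing rather than a true 2D type. The helper names alloc_flat and cell are made up for this example, and int elements with m, n > 0 are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an m-by-n int matrix as one flat block.
   Returns NULL if the byte count would overflow size_t or if malloc
   fails, so the caller must test the result before using it. */
int *alloc_flat(size_t m, size_t n)
{
    assert(m > 0 && n > 0);

    /* Reject sizes where m * n * sizeof(int) would wrap around. */
    if (n > SIZE_MAX / sizeof(int) / m)
        return NULL;
    return malloc(m * n * sizeof(int));
}

/* Element (i, j) lives at offset i * n + j in row-major order. */
int *cell(int *flat, size_t n, size_t i, size_t j)
{
    return &flat[i * n + j];
}
```

A caller would write int *a = alloc_flat(m, n); and handle a == NULL (for example by printing an error and exiting) before touching *cell(a, n, i, j), then call free(a) when done.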
