Allocating array of size 10^5 * 10^5 in c using malloc

问题

I asked the same problem two times (see here Getting segmentation fault while using malloc ) and improved my code. But I am unable to allocate memory for larger value of m and n . The heart of my code is :

#include<stdio.h>
#include<stdlib.h>

int main()
{
    int i,j,n,m,p = 0 ;
    int sum[100000] = {0};

    scanf("%d%d%d",&n,&m,&p);

    /*for ( i = 0 ; i < n ; i++ )
    {
        for( j = 0 ; j < m ; j++ )
        {
            a[i][j] = j + 1 ;
        }
    }*/
    int **a = malloc( n * sizeof *a );
    for(i=0; i<n; i++)
    {
        a[i] = malloc( m * sizeof **a); 

           for(j=0; j<m; j++){
            a[i][j] = j + 1 ;

           }
       }



    while ( p-- )
    {
        //printf("I am in while loop\n");
        scanf("%d%d",&i,&j);
        {
            a[i-1][j-1] += 1 ;  
        }
    }

    /*for ( i = 0 ; i < n ; i++ )
    {
        for( j = 0 ; j < m ; j++ )
        {
            printf("%d ",a[i][j]);
        }
        printf("\n");
    }*/
    for ( i = 0 ; i < n ; i++ )
    {
        for( j = 0 ; j < m - 1; j++ )
        {
            p = a[i][j+1] - a[i][j];
            if ( p < 0 )
            {
                sum[i] = -1 ;
                break;
            }
            else
            {
                sum[i]+=p;
            }
            //printf("%d ",p);

        }
    }
    for ( i = 0 ; i < n ; i++ )
    {
        printf("%d\n",sum[i] );
    }

    return 0;
}

The constraints are: 1 ≤ n, m, p ≤ 10 ^ 5

I am still getting run time error for larger value. I know I am allocating 10 GB of memory at a time, but it is the requirement of the problem. My question is whether it is possible in C to allocate this much memory at a time or not ? If it is possible then please tell me. If not then should I study C++.

回答1:

Unless I misunderstood the question, you actually need much more than 10GB, don't you? You want to store 10^10 array elements, and each element is the size of a pointer. On a 64-bit system, won't each element be 8 bytes and you will need 80GB of memory rather than 10GB?

回答2:

The main part of your problem is dealing with an array of (potentially) 100.000×100.000 items. Such big sizes of problems are often found in programming contests. They serve as traps for those who stick to simplistic solutions. When such number appears it usually means one needs to put some effort in optimizing data structures OR algorithm OR ...the problem itself.

You haven't provided the original problem and you try to make others to solve an allocation issue. That seems a kind of X-Y problem to me (see Meta Stack Exchange – What is the XY problem?) — you are struggling to allocate ten billion integer variables, but you didn't check if you actually need them.

Let's see what your program tries to do. First it reads the size of problem: values m and n, which define the array size, and value p which is a number of additional input lines to read. Then it tries to allocate an array and fill its rows with consecutive natural numbers from 1 to m, corresponding to the column number (by mathematical convention those values start form 1, unlike indices in C language, which start from 0).

Next it loads p pairs of indices and for each pair it increments the indicated item of the array.

Then it scans each row of the array, calculates a difference between adjacent items and verifies if the difference is positive or at least zero: if so, it accumulates the calculated difference in a sum[] array, otherwise (that is if the difference is negative) it abandons the row and sets the corresponding sum[i] to -1 to indicate some kind of 'failure'.

Finally it prints the calculated sums.

Now, let's start from the end: do you really need the sum[] array? Well, no, you don't. Each sum is calculated in a single pass along the array's row and then not used anymore. Those sums aren't read nor overwritten, they are not used in any further calculations—they just sit there and wait to be printed. And they eventually get printed in the same order in which the were calculated. So...? So you don't need to store them! Just print each sum as soon as it is calculated and forget it:

    for ( i = 0 ; i < n ; i++ )
    {
        int *row = a[i];
        int sum = 0;

        for( j = 0 ; j < m - 1; j++ )
        {
            // calculate and test diff, then...

            diff = row[j+1] - row[j];
            if ( diff < 0 )
                ....
            else
                sum += diff;
        }

        printf("%d\n", sum);    // sum is done
    }

Hey, wait... Do you see what your sum is? It's a sum of all diffs, and the sum of diffs is...? It is a difference between the last and the first item! So you don't need to iterate it, just calculate it at once and then decide if the value ought to be cancelled:

    for ( i = 0 ; i < n ; i++ )
    {
        int *row = a[i];
        int sum = row[m-1] - row[0];

        for( j = 0 ; j < m - 1; j++ )
            if ( row[j+1] - row[j] < 0 ) {
                sum = -1;
                break;
            }

        printf("%d\n", sum);
    }

Okay, that was a minor optimization. Now let's step backwards in the code and ask the next question: — Do you really need the a array? What do you need it for?
— BLAH, what a damn DUMB STUPID QUESTION! you yell, of course I need it! I need it to store my data!

— Oh, really?... And why do you need to store your data, my friend?
— Sigh... I need my data to calculate my results. Can't you see?

Well, see, I can't. Instead I can see you don't need to store your data. At least not all your data.

How do you think, could you calculate your results without the a array in a special case p==0? Of course you could! The a array is initialized in such a way that for every item a[i][j]==j+1, so you know that the difference between every two adjacent items is 1 hence is positive, so the sum in each row is (m-1)*1. And we have calculated that, right now, without using a, right?

And how about p==1? In this case we added 1 to some item in the array. Suppose it was a[3][5]. How did that change the result? First, we know that only the row number 3 was affected. Next, incrementing the item on position 5 increased the difference a[3][5]-a[3][4] and decreased a[3][6]-a[3][5]. So during the last phase we don't need a – if we stored those p data items of increments, we can immediately restore any a[i][j]==j+1 and apply the increments. Again, without actually using the stored values of a, right?

And similary for bigger values p – all you need to store are pairs of indices loaded in while(p--) loop. And that will be at most one hundred thousand items, not ten billion items to store, because p is guaranteed to not exceed 1e5.

Of course there are some additional difficulties in this solution. But they certainly can be solved.

First, you should not scan all (possibly up to 1e5) items on each virtual a[][] access, as that would make the whole program incredibly slow, far beyond what is acceptable. So those '+1' patches should be organized in some fast-access data structure, for example a sorted array for binary search, or a Binary Search Tree, or possibly some hash-map.
Next, when processing the i-th row, you need only patches for that row. So you can keep those increments grouped by i so that you can easily find a bigger group which contains just the portion you need.
Additionally, you might aggregate those data: when you read the same pair of indices, don't store another +1 patch, instead modify the previous patch to +2. That complicates the loading routine a bit, as you need to seek for data and either update what was stored before, or add a new item if not found. However, it saves you unnecessary scanning of the tree or the array for duplicating indices.

Yet another minor optimization: when you scan a row, you use every item twice, once as a the minuend, then as a subtrahend (except the first and last item, which are used only once). So the code might look something like this:

    for ( i = 0 ; i < n ; i++ )
    {
        // int *row = a[i];  // no a[][], no row anymore

        int item_last  =  m + sumOfPatches( i, m-1 );
        int item_first =  1 + sumOfPatches( i,  0  );
        int sum = item_last - item_first;

        int item_prev = item_first;

        for( j = 0 ; j < m - 1; j++ )
        {
            // a[i][j+1] was originally initialized to (j+2)

            int item_next = j+2 + sumOfPatches( i, j+1 )
            if ( item_next - item_prev < 0 )
            {
                sum = -1;
                break;
            }

            // the item which is 'next' in current iteration
            // will be a 'prev' in the next iteration

            item_prev = item_next;
        }

        printf("%d\n", sum);
    }

As a final optimisation you could remove j+2 term from item_next calculation, because this part increases by one for each iteration, so when subtracting item_next - item_prev it reduces just to a constant diffference (j+2)-(j+1) == 1.

The relevant part of source code would then look like:

        int patch_last = sumOfPatches( i, m-1 );
        int patch_first = sumOfPatches( i,  0  );
        int sum = (m + patch_last) - (1 + patch_first);

        int patch_prev = patch_first;

        for( j = 0 ; j < m - 1; j++ )
        {
            int patch_next = sumOfPatches( i, j+1 );
            if ( 1 + patch_next - patch_prev < 0 )
            {
                sum = -1;
                break;
            }

            patch_prev = patch_next;
        }

HTH (although it's rather 'a very, very late answer'...)

回答3:

You should test if malloc returns null, it does when it can't alloc.

With a 64 bits operating system and a 64 bits compiler, allocating 10 GB should be possible. Though as you fill-in the memory as you allocate it, you must have at least 10 GB or virtual memory (swap space counts as memory, though this will be slow).

来源：https://stackoverflow.com/questions/23584664/allocating-array-of-size-105-105-in-c-using-malloc

标签

arrays

memory

memory-management