Saturday, January 7, 2012

Pointer Arrays & Multidimensional Arrays or Dynamic Arrays vs Static Arrays

1.What is the use of array of Pointers.How are they initialised?
2.How Multidimensional arrays are used ?
3.Array of Pointers[Dynamic Array] vs Multidimensional arrays[Static Array] ( a [10][20] vs *b[10] )

STATIC ARRAY

int x[3][4]={0,1,2,3,10,11,12,13,20,21,22,23};
Two dimansional array is stored as a row major format in memory--A contiguous statically allocated memory segment and the rows of tha array are stored there one after another;And just as in one dimensional case,a single pointer suffices to hold the base address.

Thus, as in the one-dimensional case, in the multi-dimensional case the pointer representing the array is passed by value; this allows direct access to the actual array locally from within the function, as if the array were passed by reference. In order to facilitate access to array items through indexing, the compiler must have additional information (in the two-dimensional case, the row size) in order to translate the indexing to a proper indirection.
Row Major access formula--x[i][j]is *(x+i*n+j) where n is the row size
i.e.
If you have a conventional two dimensional array, and you want to compute the address for arr[ row ][ col ] and you have ROWS rows (where ROWS is some constant) and COLS columns, then the formula for the address in memory is:
addr( & arr[ row ][ col ] ) = addr( arr ) + 
[ sizeof( int ) * COLS * row ]+ [ sizeof( int ) * col ]
 
For example: x[0][0] is translated to *(x+0*4+0)=*x, hence the item with the value 0 is being accessed; and x[1][2] is translated to *(x+1*4+2)=*(x+6), hence the item with the value 12 is being accessed.

NOTE:It is essential for the compiler to know the row size(means the number of elemnt in each row which is j);While parsing the array's definition, the compiler extracts the row size - for instance, in int x[3][4], the last index limit (4) is the size it needs to know

Difference between one and two dimensional memory:
One Dimensional diagram:


There is a subtle difference between the indexing mechanism of one-dimensional arrays and that used for multi-dimensional arrays. For one-dimensional arrays, the indexing really is the indirection and so x[i]=*(x+i), whereas for multi-dimensional arrays the indexing is only based on the indirection with some additional information needed and so x[i][j]=*(x+i*n+j) (where n is the additional information in this example). Hence, we cannot simply view a two-dimensional array as a pointer with a "body" as it is for the one-dimensional case, The compiler represents the two-dimensional array x as depicted in above and "knows" the row size of x (which happens to be 4) from the definition of x; thus it can translate any expression x[i][j] to *(x+i*4+j). Throughout the compilation, the compiler simply must treat the pointer int* x differently from "regular" int* pointers - unlike the one-dimensional case, for which the compiler treats int* x like any other pointer.

We have just illustrated that a multi-dimensional array is accessed at run time as a one-dimensional array, for the stored version of the array really is one-dimensional. The multiple indexes are transformed to proper single indexes via the row-major access formula. Thus, as in the one-dimensional case, in the multi-dimensional case the pointer representing the array is passed by value; this allows direct access to the actual array locally from within the function, as if the array were passed by reference. In order to facilitate access to array items through indexing, the compiler must have additional information (in the two-dimensional case, the row size) in order to translate the indexing to a proper indirection.

It is essential for the compiler to know the row size, but how does it obtain this value? While parsing the array's definition, the compiler extracts the row size - for instance, in int x[3][4], the last index limit (4) is the size it needs to know. However when declaring a one-dimensional array (most often as a function argument) we need not bother to specify the size. Can we do the same for multi-dimensional arrays? Clearly not, for the compiler must have this additional information in order to do its job. Thus, we may leave out the size of the first index but cannot leave out the size of any other index. We may therefore declare x either as x[3][4] or as x[][4].

Three-dimensional array int x[3][4][2]. The row-major format will store the items in the following order (we list only the indexes):

   000,001,010,011,020,021,030,031,100,101,110,111,120,121,130,
      131,200,201,210,211,220,221,230,231

The first eight entries constitute a two-dimensional array x0[4][2] with the first index fixed at 0 (i.e., x0[i][j]=x[0][i][j]), consisting of four rows of size 2: {000,001}, {010,011}, {020,021}, {030,031}. The middle eight entries constitute a two-dimensional array x1[4][2] with the first index fixed at 1, consisting of four rows of size 2: {100,101}, {110,111}, {120,121}, {130,131}. Finally, the last eight entries constitute a two-dimensional array x2[4][2] with the first index fixed at 2, also consisting of four rows of size 2: {200,201}, {210,211}, {220,221}, {230,231}. The row-major formula comes to x[i][j][k]=*(x+i*8+j*2+k).

DYNAMIC ARRAY

The treatment of multi-dimensional arrays as if they were one-dimensional poses some problems for the dynamic version. In the one-dimensional case we need only provide a "body" for a pointer and, voilĂ , we have an array. With multi-dimensional arrays we cannot do this, for the compiler cannot treat a pointer p as an array and translate p[i][j] to *(p+i*n+j). Not only is the compiler unaware of what n can possibly be - even the programmer might not know. We must therefore rely on the AUTOMATIC INDEXING as indirection for the one-dimensional case.

Let us illustrate this on int** p. We have p[i][j]=*(p[i]+j)=*(*(p+i)+j) and this does make sense, since int** p, so *(p+i) is of the type int* and hence *(*(p+i)+j) is of the type int. We thus require a two-stage solution, for p+i must point somewhere, and application of the indirection operator *(p+i) gets the address stored there; hence *(p+i)+j points somewhere, and another application of the indirection operator *(*(p+i)+j) retrieves the number stored there. Here is a code to create a dynamic two-dimensional int array p[3][4]:

int** p;

p = malloc(3*sizeof(int*));
if (p == NULL) error();
for(i = 0; i < 3; i++)
 p[i]= malloc(4*sizeof(int));



Note that we have actually created four one-dimensional arrays. One is an array of pointers; let us call this the row array, as each item of this array is a pointer to a one-dimensional int array representing a row. The pointer p that represents the whole array points to the row array. In a sense we have created a dynamic one-dimensional array of dynamic one-dimensional arrays. For a three-dimensional dynamic array we would do exactly the same and create a one-dimensional dynamic array of two-dimensional dynamic arrays, and similarly for higher dimensions.

Advantage & Disadvantage---

Despite the high cost of their creation - dynamic multi-dimensional arrays may facilitate FASTER EXECUTION in comparison to static arrays. Access to a static array must always be calculated using the row-major formula and hence employs MULTIPLICATION, an expensive operation. Yet access to a dynamic array is just a simple series of dereferencing, which is much less expensive. Thus, for a program with a large number of accesses to the array, the dynamic version may improve over-all performance. For a program running in a system with constrained memory size and needing a large multi-dimensional array, it may be advantageous for the dynamic version of the array to consist of many unrelated smaller segments rather than a single large one as the static version requires. However, there are also potential drawbacks: a dynamic multi-dimensional array that spans a wide portion of the address space may complicate and slow down caching and/or paging and thus degrade performance.

Incase of Memory constaint e should avoid dynamic memory allocation as it takes EXTRA MEMORY for pointers along with the normal one dimensional memory storage

1 comment:

  1. 3.Differences:--

    int a[10][20];

    1)a[10][20] is a true 2 dimensional array:200 int sized locations have been set aside
    2)rectangular subscript calculation 20*row+col is used to find the element a[row][col].

    int *b[10];

    1)Only 10 pointers are allocated and doesnt initialize them;initialization must be done either statically or with code
    2)If we assume each element of b does point to a twenty element array,then there will be 200 ints set aside PLUS TEN CELLS for the pointers
    3)The only ADVANTAGE of Pointer Array is that the rows of the array may be of different lengths i.e each element of b need not point to a twenty element vector;

    Most frequent use of arrays of pointers is to store chracater strings of diverse lengths as
    shown below:

    char *name[]={"Illegal Month","Jan","Feb"};
    Only 22 bytes(14+4+4) are allocated

    char aname[][15]={"Illegal Month","Jan","Feb"};
    45(15+15+15) bytes are allocated

    ReplyDelete