Is a conversion from a pointer to type to a pointer to array of type safe?
A few days ago I stumbled on some code that made extensive use of conversions from pointer to type to pointer to array of type, in order to give a two-dimensional view of a linear vector in memory. A simple example of the technique is shown below for clarity:
#include <stdio.h>
#include <stdlib.h>

void print_matrix(const unsigned int nrows, const unsigned int ncols, double (*A)[ncols]) {
    // Here I can access memory using A[ii][jj]
    // instead of A[ii*ncols + jj]
    for(int ii = 0; ii < nrows; ii++) {
        for(int jj = 0; jj < ncols; jj++)
            printf("%4.4g",A[ii][jj]);
        printf("\n");
    }
}

int main() {
    const unsigned int nrows = 10;
    const unsigned int ncols = 20;
    // Here I allocate a portion of memory to which I could access
    // using linear indexing, i.e. A[ii]
    double * A = NULL;
    A = malloc(sizeof(double)*nrows*ncols);
    for (int ii = 0; ii < ncols*nrows; ii++)
        A[ii] = ii;
    print_matrix(nrows,ncols,A);
    printf("\n");
    print_matrix(ncols,nrows,A);
    free(A);
    return 0;
}
Given that a pointer to type is not compatible with a pointer to array of type, I would like to ask whether there are risks associated with this conversion, or whether I can assume that it will work as intended on any platform.
It is guaranteed that a multidimensional array T arr[M][N] has the same memory layout as a single-dimensional array with the same total number of elements, T arr[M * N]. The layout is the same because arrays are contiguous (6.2.5p20), and because sizeof array / sizeof array[0] is guaranteed to return the number of elements in the array (6.5.3.4p7).
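As a minimal sketch of that layout guarantee, the following compares the sizes of the two array types and copies one into the other with memcpy, which is well defined because it works on bytes. The names ROWS, COLS and flat_matches_grid are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 3, COLS = 4 };  /* illustrative sizes, not from the original post */

/* Returns 1 if copying a 2-D array into a 1-D array of the same total
   size puts element [i][j] at flat index i * COLS + j, otherwise 0. */
int flat_matches_grid(void) {
    int grid[ROWS][COLS];
    int flat[ROWS * COLS];

    /* Both array types occupy the same number of bytes. */
    assert(sizeof grid == sizeof flat);

    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            grid[i][j] = 100 * i + j;

    /* memcpy copies bytes, so no pointer-to-array cast is involved. */
    memcpy(flat, grid, sizeof flat);

    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            if (flat[i * COLS + j] != grid[i][j])
                return 0;
    return 1;
}
```

Note that this only demonstrates the layout; it says nothing yet about whether a pointer cast between the two types is allowed.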
However, it does not follow that it is safe to cast a pointer to type to a pointer to array of type, or vice versa. Firstly, alignment is an issue: although an array of a type with fundamental alignment must also have fundamental alignment (by 6.2.8p2), it is not guaranteed that the two alignments are the same. Because an array contains objects of the base type, the alignment of the array type must be at least as strict as the alignment of the base object type, but it can be stricter (not that I've ever seen such a case). This is not relevant for allocated memory, as malloc is guaranteed to return a pointer suitably aligned for any fundamental alignment (7.22.3p1). It does mean, however, that you cannot safely cast a pointer to automatic or static memory to an array pointer, although the reverse is allowed:
#include <stdlib.h>

int a[100];
void f() {
    int b[100];
    static int c[100];
    int *d = malloc(sizeof(int[100]));
    int (*p)[10] = (int (*)[10]) a; // possibly incorrectly aligned
    int (*q)[10] = (int (*)[10]) b; // possibly incorrectly aligned
    int (*r)[10] = (int (*)[10]) c; // possibly incorrectly aligned
    int (*s)[10] = (int (*)[10]) d; // OK
}

int A[10][10];
void g() {
    int B[10][10];
    static int C[10][10];
    int (*D)[10] = (int (*)[10]) malloc(sizeof(int[10][10]));
    int *p = (int *) A; // OK
    int *q = (int *) B; // OK
    int *r = (int *) C; // OK
    int *s = (int *) D; // OK
}
Next, it is not guaranteed that casting between array and non-array types actually results in a pointer to the correct location, as the casting rules (6.3.2.3p7) do not cover this usage. It is highly unlikely that this would result in anything other than a pointer to the correct location, though, and a cast via char * does have guaranteed semantics. When going from a pointer to array type to a pointer to base type, it's better to just indirect the pointer:
#include <assert.h>

void f(int (*p)[10]) {
    int *q = *p; // OK
    assert((int (*)[10]) q == p); // not guaranteed
    assert((int (*)[10]) (char *) q == p); // OK
}
What are the semantics of array subscripting? As is well known, the [] operation is just syntactic sugar for addition and indirection, so the semantics are those of the + operator; as 6.5.6p8 describes, the pointer operand must point to a member of an array that is large enough that the result falls within the array or just past the end. This is a problem for casts in both directions: when casting to a pointer to array type, the addition is invalid because no multidimensional array exists at that location; and when casting to a pointer to base type, the array at that location only has the size of the inner array bound:
int a[100];
((int (*)[10]) a) + 3; // invalid - no int[10][N] array
int b[10][10];
(*b) + 3; // OK
(*b) + 23; // invalid - out of bounds of int[10] array
This is where we start to see actual issues on common implementations, not just in theory. Because an optimiser is entitled to assume that undefined behavior does not occur, an access to a multidimensional array through a pointer to the base type can be assumed not to alias any elements outside the first inner array:
int a[10][10];
void f(int n) {
    for (int i = 0; i < n; ++i)
        (*a)[i] = 2 * a[2][3];
}

The optimiser can assume access to a[2][3] does not alias (*a)[i] and hoist it outside the loop:

int a[10][10];
void f_optimised(int n) {
    int intermediate_result = 2 * a[2][3];
    for (int i = 0; i < n; ++i)
        (*a)[i] = intermediate_result;
}

This will of course give unexpected results if f is called with n = 50.
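One way to sidestep the problem is to index both dimensions explicitly, so that every access stays inside its own row. The following is only a sketch, assuming the intent was to fill the first n elements of the whole matrix in row-major order; the names N and f_in_bounds are hypothetical.

```c
#include <assert.h>

enum { N = 10 };  /* row and column count, matching int a[10][10] above */
int a[N][N];

/* Writes 2 * a[2][3] (re-read on every iteration) to the first n elements
   of a in row-major order, never indexing past the end of a row. */
void f_in_bounds(int n) {
    assert(n >= 0 && n <= N * N);
    for (int i = 0; i < n; ++i)
        a[i / N][i % N] = 2 * a[2][3];
}
```

Here a[2][3] is itself overwritten when i reaches 23, so later iterations store a different value. Because every access is in bounds, the compiler has to honour that dependency rather than hoist the read out of the loop.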
Finally, it's worth asking whether this applies to allocated memory. 7.22.3p1 specifies that the pointer returned by malloc "may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated"; there's nothing about further casting the returned pointer to another object type, so the conclusion is that the type of the allocated memory is fixed by the first pointer type the returned void pointer is cast to. If you cast to double * then you can't further cast to double (*)[n], and if you cast to double (*)[n] you can only use double * to access the first n elements.
As such, I'd say that if you want to be absolutely safe you should not cast between pointer and pointer to array types, even with the same base type. The fact that the layout is the same is irrelevant except for memcpy and other accesses via a char pointer.
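If two-dimensional indexing on heap memory is the goal, one option that stays within the reading of 7.22.3p1 given above is to assign the malloc result to a pointer to array type from the start, so that no cast between pointer kinds is ever needed. This is a minimal sketch, assuming a compiler that supports variably modified types (C99, or C11 without __STDC_NO_VLA__) and nonzero dimensions; sum_matrix and make_and_sum are hypothetical names.

```c
#include <stdlib.h>

/* Sums an nrows x ncols matrix using two-dimensional indexing. */
double sum_matrix(size_t nrows, size_t ncols, double (*m)[ncols]) {
    double total = 0.0;
    for (size_t i = 0; i < nrows; ++i)
        for (size_t j = 0; j < ncols; ++j)
            total += m[i][j];
    return total;
}

/* Allocates an nrows x ncols matrix, fills it with 0, 1, 2, ... in
   row-major order, and returns the sum of its elements.
   Returns -1.0 if the allocation fails. */
double make_and_sum(size_t nrows, size_t ncols) {
    /* The void * from malloc goes straight to a pointer-to-array type. */
    double (*m)[ncols] = malloc(sizeof(double[nrows][ncols]));
    if (m == NULL)
        return -1.0;

    for (size_t i = 0; i < nrows; ++i)
        for (size_t j = 0; j < ncols; ++j)
            m[i][j] = (double)(i * ncols + j);

    double total = sum_matrix(nrows, ncols, m);
    free(m);
    return total;
}
```

The block is only ever accessed as double[ncols] rows, so the question of mixing double * and double (*)[n] views does not arise.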
UPDATE: the strikethrough part is true, but irrelevant.
As I posted in the comment, the question is really whether, in a two-dimensional array, the subarrays (rows) contain internal padding. There can be no padding inside each row, as the standard defines arrays to be contiguous, and the outer array introduces no padding either. In fact, scanning through the C standard, I find no mention of padding in the context of arrays, so I interpret "contiguous" to mean that there is never any padding at the end of a subarray inside a multidimensional array. Since sizeof(array) / sizeof(array[0]) is guaranteed to return the number of elements in an array, there can be no such padding.
That means that the layout of a multidimensional array of nrows rows and ncols columns must be the same as that of a 1-D array of nrows * ncols elements. So, to avoid the incompatible type error, you could do:
void *A = malloc(sizeof(double[nrows][ncols]));
// check for NULL
double *T = A;
for (size_t i=0; i<nrows*ncols; i++)
    T[i] = 0;
and then pass A to print_matrix. This should avoid the potential pitfall of pointer aliasing; pointers of different types are not permitted to point into the same array unless at least one of them has type void*, char* or unsigned char*.
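The suggestion above can be sketched as a complete function. This is an illustration only: whether the second view is strictly conforming is disputed elsewhere in this thread, and the code assumes a compiler with variably modified types and nonzero dimensions. The helper name fill_flat_read_2d and the pointer name M are hypothetical.

```c
#include <stdlib.h>

/* Fills an nrows x ncols allocation linearly through a double *, then
   reads element (i, j) back through a pointer-to-array view of the same
   block. Returns -1.0 if the allocation fails. */
double fill_flat_read_2d(size_t nrows, size_t ncols, size_t i, size_t j) {
    void *A = malloc(sizeof(double[nrows][ncols]));
    if (A == NULL)
        return -1.0;

    double *T = A;              /* linear view; void * converts implicitly */
    for (size_t k = 0; k < nrows * ncols; k++)
        T[k] = (double)k;

    double (*M)[ncols] = A;     /* two-dimensional view, again without a cast */
    double value = M[i][j];

    free(A);
    return value;
}
```

Because both views are obtained from the void * rather than from each other, no explicit cast appears, which is what removes the incompatible type diagnostic.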
The C standard allows conversion of a pointer to an object (or incomplete) type to a pointer to a different object (or incomplete) type.
There are a few caveats though:
If the resulting pointer is not correctly aligned, the behavior is undefined. The standard does not guarantee correct alignment in this case, although in reality a misaligned result is unlikely.
The standard only states one valid use of the resulting pointer, and that is to convert it back to the original pointer type. In that case, the standard guarantees that the result (the resulting pointer converted back to the original pointer type) will compare equal to the original pointer. Using the resulting pointer for anything else is not covered by the standard.
The standard requires an explicit cast when performing such conversions, which is missing from the print_matrix function calls in the code you posted.
So, according to the letter of the standard, the usage in the code sample is outside of its scope. In practice though, this will probably work fine on most platforms - assuming the compiler allows it.
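The one use the standard does spell out, converting back to the original type, can be sketched as follows. The allocation comes from malloc so that alignment is not a concern; COLS and round_trip_ok are hypothetical names introduced for this example.

```c
#include <stdlib.h>

enum { COLS = 20 };  /* illustrative column count */

/* Converts a double * to double (*)[COLS] and back with explicit casts.
   Returns 1 if the round trip compares equal to the original pointer,
   0 if it does not, and -1 if the allocation fails. */
int round_trip_ok(void) {
    double *p = malloc(sizeof(double[10][COLS]));
    if (p == NULL)
        return -1;

    double (*view)[COLS] = (double (*)[COLS])p;  /* explicit cast required */
    double *back = (double *)view;               /* back to the original type */

    int same = (back == p);
    free(p);
    return same;
}
```

Nothing in this sketch dereferences view; per the caveats above, only the round trip itself is covered by the letter of the standard.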