Why isn't sizeof for a struct equal to the sum of sizeof of each member?

Why does the sizeof operator return a size for a structure that is larger than the total size of its members?


This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:

  • Mis-aligned access might be a hard error (often SIGBUS).
  • Mis-aligned access might be a soft error.
      • Either corrected in hardware, for a modest performance degradation.
      • Or corrected by emulation in software, for a severe performance degradation.
      • In addition, atomicity and other concurrency guarantees might be broken, leading to subtle errors.

Here's an example using typical settings for an x86 processor (in both 32-bit and 64-bit modes):

    struct X
    {
        short s; /* 2 bytes */
                 /* 2 padding bytes */
        int   i; /* 4 bytes */
        char  c; /* 1 byte */
                 /* 3 padding bytes */
    };
    
    struct Y
    {
        int   i; /* 4 bytes */
        char  c; /* 1 byte */
                 /* 1 padding byte */
        short s; /* 2 bytes */
    };
    
    struct Z
    {
        int   i; /* 4 bytes */
        short s; /* 2 bytes */
        char  c; /* 1 byte */
                 /* 1 padding byte */
    };
    
    const int sizeX = sizeof(struct X); /* = 12 */
    const int sizeY = sizeof(struct Y); /* = 8 */
    const int sizeZ = sizeof(struct Z); /* = 8 */
    

    One can minimize the size of a structure by sorting its members by alignment, as in struct Z in the example above. For basic types, sorting by size is enough to achieve this.

    IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Each compiler may therefore choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how each compiler aligns data. Some compilers have command-line settings and/or special #pragma directives to change the structure alignment settings.
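
One way to guard against a compiler choosing a different layout is to state the expected layout as compile-time checks, so a mismatch stops the build instead of corrupting data. This is only a sketch under assumptions: struct wire_header and its fields are hypothetical, and it relies on C11's static_assert and the fixed-width types from stdint.h.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uint8_t, uint16_t, uint32_t */

/* Hypothetical record shared between separately compiled modules. */
struct wire_header {
    uint32_t length;
    uint16_t type;
    uint8_t  flags;
    uint8_t  reserved;  /* explicit filler, so no hidden padding is needed */
};

/* Each check fails at compile time if the layout is not the expected one. */
static_assert(offsetof(struct wire_header, length) == 0, "length moved");
static_assert(offsetof(struct wire_header, type)   == 4, "type moved");
static_assert(offsetof(struct wire_header, flags)  == 6, "flags moved");
static_assert(sizeof(struct wire_header) == 8, "unexpected padding");
```

Making the filler byte an explicit member, rather than leaving it to the compiler, is a design choice that keeps every byte of the structure accounted for in the source.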


    Packing and byte alignment, as described in the C FAQ:

    It's for alignment. Many processors can't access 2- and 4-byte quantities (eg ints and long ints) if they're crammed in every-which-way.

    Suppose you have this structure:

    struct {
        char a[3];
        short int b;
        long int c;
        char d[3];
    };
    

    Now, you might think that it ought to be possible to pack this structure into memory like this:

    +-------+-------+-------+-------+
    |           a           |   b   |
    +-------+-------+-------+-------+
    |   b   |           c           |
    +-------+-------+-------+-------+
    |   c   |           d           |
    +-------+-------+-------+-------+
    

    But it's much, much easier on the processor if the compiler arranges it like this:

    +-------+-------+-------+
    |           a           |
    +-------+-------+-------+
    |       b       |
    +-------+-------+-------+-------+
    |               c               |
    +-------+-------+-------+-------+
    |           d           |
    +-------+-------+-------+
    

    In the packed version, notice how it's at least a little bit hard for you and me to see how the b and c fields wrap around? In a nutshell, it's hard for the processor, too. Therefore, most compilers will pad the structure (as if with extra, invisible fields) like this:

    +-------+-------+-------+-------+
    |           a           | pad1  |
    +-------+-------+-------+-------+
    |       b       |     pad2      |
    +-------+-------+-------+-------+
    |               c               |
    +-------+-------+-------+-------+
    |           d           | pad3  |
    +-------+-------+-------+-------+
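
The three pads in the diagram can be measured rather than guessed. The sketch below is an illustration, not the FAQ's code: it swaps short int and long int for int16_t and int32_t so the sizes match the diagram (which assumes a 4-byte long), and the names struct faq, struct padding and measure_padding are invented for the example.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* int16_t, int32_t */

struct faq {
    char    a[3];
    int16_t b;   /* stands in for short int */
    int32_t c;   /* stands in for long int (assumed 4 bytes, as drawn) */
    char    d[3];
};

struct padding { size_t pad1, pad2, pad3; };

/* Derive the invisible padding from the gaps between member offsets. */
struct padding measure_padding(void)
{
    struct padding p;
    p.pad1 = offsetof(struct faq, b) - (offsetof(struct faq, a) + 3);
    p.pad2 = offsetof(struct faq, c)
           - (offsetof(struct faq, b) + sizeof(int16_t));
    p.pad3 = sizeof(struct faq) - (offsetof(struct faq, d) + 3);
    /* All the gaps together must fit inside the struct. */
    assert(p.pad1 + p.pad2 + p.pad3 < sizeof(struct faq));
    return p;
}
```

On common compilers this yields 1, 2 and 1 bytes of padding, matching pad1, pad2 and pad3 in the diagram. With a plain long int on a platform where long is 8 bytes, the numbers would differ.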
    

    If you want the structure to have a certain size, with GCC for example, use __attribute__((packed)) to remove the padding.
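
As a rough illustration of the attribute, here is a sketch that assumes GCC or Clang. The struct names and the helpers read_i and bytes_saved are made up for the example. It also shows one cautious way to read a member that packing may have left misaligned.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <string.h>  /* memcpy */

struct padded_x { short s; int i; char c; };

/* GCC/Clang extension: members are laid out back to back, no padding. */
struct __attribute__((packed)) packed_x { short s; int i; char c; };

/* Copy the int out byte-wise instead of dereferencing an int pointer into
   the packed struct, since the member may not be int-aligned. */
int read_i(const struct packed_x *p)
{
    int value;
    memcpy(&value, (const unsigned char *)p + offsetof(struct packed_x, i),
           sizeof value);
    return value;
}

/* How many bytes packing saves for this particular struct. */
size_t bytes_saved(void)
{
    assert(sizeof(struct packed_x) <= sizeof(struct padded_x));
    return sizeof(struct padded_x) - sizeof(struct packed_x);
}
```

Accessing p->i directly is also handled correctly by the compiler. The risk lies in taking the address of a packed member and dereferencing it elsewhere as an ordinary int pointer, which can fault on hardware with strict alignment rules.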

    On Windows you can set the alignment to one byte when using the cl.exe compiler with the /Zp option (for example, /Zp1).

    Usually it is easier for the CPU to access data located at an address that is a multiple of 4 (or 8), depending on the platform and also on the compiler.
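
To see which multiples your own platform prefers, C11's alignof reports the required alignment of each type. The sketch below is illustrative only: is_aligned and print_alignments are invented names, and testing an address with the % operator assumes a flat address space where converting a pointer to uintptr_t preserves its numeric value.

```c
#include <assert.h>
#include <stdalign.h>  /* alignof (C11) */
#include <stddef.h>    /* size_t */
#include <stdint.h>    /* uintptr_t */
#include <stdio.h>

/* Nonzero if the address p is a multiple of `alignment`. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Print the alignment the implementation requires for a few basic types. */
void print_alignments(void)
{
    int    i = 0;
    double d = 0.0;

    printf("alignof(char)   = %zu\n", alignof(char));
    printf("alignof(short)  = %zu\n", alignof(short));
    printf("alignof(int)    = %zu\n", alignof(int));
    printf("alignof(double) = %zu\n", alignof(double));

    /* Ordinary variables are placed at suitably aligned addresses. */
    assert(is_aligned(&i, alignof(int)));
    assert(is_aligned(&d, alignof(double)));
}
```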

    So it is a matter of alignment basically.

    You need to have good reasons to change it.
