Array memory allocation and usage

2018-06-04 15:51:25

This is from memory, so I could be misusing a few words, but the meaning should be understandable.

I'm currently at university doing a BIT, majoring in programming. We started C++, and when we started using arrays, our C++ teacher (a teacher with strange ideas and programming rules, such as no comments whatsoever allowed) told us that we should make our array sizes multiples of 4 to be more efficient:

char exampleArrayChar[ 4 ]; //Allowed
float exampleArrayFloat[ 6 ]; //NOT Allowed
int exampleArrayInt[ 8 ]; //Allowed

He said the reason behind this was because of the way the computer does its memory allocation.

The computer allocated the memory addresses / locations for each array element in groups of four bytes, so an 8-element array was done in 2 memory groups.

So the issue was that if you made an array of size 6, it would assign 2 groups of 4, but then mark 2 of those bytes (out of the 8) as invalid / void, rendering them unusable until the whole array was released from memory.

While this sounds plausible to me given other computer math (such as 1 GB = 1024 MB instead of exactly 1000), I am interested in knowing:

Just how true this is, and what the gain would be, if any, by doing this

Is this just C++? (I would assume not, but still worth asking)

Just some more info on this all round - for example, why 4? Isn't computer math normally binary, and thus 2's?

Looking around on the web I've been unable to find anything of major use or relevance.

float exampleArrayFloat[ 6 ]; //NOT Allowed

Considering that float is 4 bytes (almost universally, due to the widespread adoption of IEEE-754 encoding for floating-point numbers), any number of floats is already going to be a multiple of four bytes. Your example would be 24 bytes, and not problematic. On x86 (and x64), SSE instructions really prefer data to be 16-byte aligned, but alignment is a property of the array's starting address, not of its element count, so a float array of 6 elements = 24 bytes won't interfere with this.

The only larger multiple that really matters for alignment is the size of the cache line, which varies widely with implementation (code compiled for x86 may find itself running on CPUs with 32-byte, 128-byte, or other cache line sizes, all from the same machine code). And yes, cache alignment can make a big performance difference, but aligning to cache lines is not necessarily better. In fact, it is often much worse, because it provokes collisions in the cache mapping, which hurt performance in much the same way false sharing does.

See "What is 'cache-friendly' code?" and "Why is my program slow when looping over exactly 8192 elements?" and other questions linked from those.

But as soon as your professor got to "no comments whatsoever allowed", you should have been in the Dean's office demanding a refund of your tuition.

Assuming your teacher really told you what you are saying above, your teacher would be wrong (which wouldn't surprise me at all). What is true, however, is that when you allocate memory from the heap, the size of the chunks being allocated is probably rounded up to a multiple of some power of 2, because memory would otherwise end up fragmented in unfortunate ways, at least with a general-purpose memory allocator. Thus, you may be wasting a couple of bytes. I wouldn't bother with these details, however, unless you have many objects: get the program right first using the semantically correct approach.

The best way to deal with these issues is not to use arrays in the first place, but rather to use std::vector<T> or std::deque<T>.

That teacher has some really funky (and wrong) ideas. To explain why, let's check each statement.

... our C++ teacher ... told us that we should make our array sizes in multiples of 4 to be more efficient... He said the reason behind this was because of the way the computer allocated memory spaces (spaces might not be the correct word; basically the memory address for each element in the array)

C (and C++) guarantee that if you allocate memory, regardless of type, it will be there; otherwise a runtime error occurs. What he might have said is that it would be a good practice (it isn't) to allocate more space to, say, accommodate some overflow error. However, C and C++ guarantee that all memory in an array is contiguous, and this holds for arrays allocated dynamically with new[] as well as for statically declared ones. When declaring something, use only the amount of memory (resources) that you think you will need.

The computer allocated the memory address / locations for each array element in groups of four - so an 8 element array was done in 2 memory groups (again, groups isn't the correct word)

An int is typically 4 bytes on current platforms (the standard only guarantees at least 16 bits), which makes this the only statement where the preceding quote could make some sense. Since you can refer to any byte in memory through a pointer, there is no explicit need to divide memory into groups of four, unless some environmental issue arises.

While this to me sounds plausible in regard to other computer math (such as 1 GB = 1024 MB instead of exactly 1000)...

Although this is best discussed in other places, GB and GiB are separate things; look it up on Wikipedia. However, where memory is concerned, there's an informal convention that the multiples of the byte are based on 2^10, 2^20, 2^30 and so forth.

Finally, as the previous answer said, the Right Way To Do Things in C++ is through the container classes, std::vector<T> being the most array-like of them. Leave this kind of array declaration for something really simple or for legacy C code.
