GCC C++11 using a lot of RAM with STL bitset<UINT_MAX>
While compiling my program today, I noticed something very strange (which I'm sure can be explained somehow) in the memory consumption pattern of GCC during the compilation step. The process called "cc1plus" was using approximately 10 GB of RAM for a program with fewer than 10,000 lines of code. After commenting and uncommenting lines of code, I finally found the culprit:
std::bitset<UINT_MAX>
If you want to test yourselves, please try this very short program:
#include <iostream>
#include <bitset>
#include <climits>

std::bitset<UINT_MAX> justAVar;

int main()
{
    std::cout << UINT_MAX << std::endl;
    return 0;
}
Compile it using -std=c++11 or -std=c++0x, and please be aware that even this little example will use a lot of RAM during compilation (in my case 7 GB on both boxes for gcc and 2.6 GB for clang), so if you have to push the reset button, don't whine and curse the gods like you haven't been warned about it.
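For example (the file name here is just a placeholder):

g++ -std=c++11 bitset_test.cpp -o bitset_test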
My test machines:
Setup 1: Debian 7.0 64, gcc 4.8.1
Setup 2: Ubuntu 12.04 64, gcc 4.7.3, gcc 4.8.1, clang 3.0.6 (the last one using -std=c++0x)
One hint about the implementation of the bitset constructor (guarded by an #ifdef for C++11 and C++0x, obviously), as one of my kind colleagues showed me: it is declared constexpr.
And now the question: can someone please explain to me (us) what is going on with all these compilers in this case?
Thank you
Looking at the bitset implementation, which begins like this:
template<size_t _Nw>
struct _Base_bitset
{
    typedef unsigned long _WordT;

    _WordT _M_w[_Nw];

    constexpr _Base_bitset()
    : _M_w() { }
we can create a minimal testcase like this:
template<unsigned N>
struct bset
{
    unsigned int v[N/32];
    constexpr bset() : v() {}
};
bset<1000000000> x;
The bitset must be initialized by constant initialization:
3.6.2 Initialization of non-local variables [basic.start.init]
...
Constant initialization is performed:
...
— if an object with static or thread storage duration is initialized by a constructor call, if the constructor is a constexpr constructor ...
In practical terms and in the general case, it means evaluating the memory image of the constructed object at compile time and allocating it in the .data section.
Well, it turns out that if the memory image is just a lot of zeroes, gcc is smart enough to figure that out and allocate the object in .bss instead, but it looks like it first has to create the image and examine it.
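You can observe this difference directly with a small sketch (the struct names here are made up for illustration). Compile the following and inspect the object file with nm or objdump -t: the all-zero object should land in .bss, the non-zero one in .data:

// All-zero memory image: gcc can place this object in .bss,
// so no bytes for it need to be stored in the binary.
struct Zero
{
    int a[1024];
    constexpr Zero() : a() {}      // value-initialization => all zeroes
};

// Non-zero memory image: the full 4 KiB image must be emitted into .data.
struct NonZero
{
    int a[1024];
    constexpr NonZero() : a{1} {}  // first element 1, the rest zero
};

Zero z;      // expected in .bss
NonZero nz;  // expected in .data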
Of course, a better approach would be to infer that if the only member of the bitset is value-initialized, and that member is an array whose elements have no constructors (so their value-initialization is zero-initialization), then the array is zero-initialized, hence the whole object is zero-initialized, and be done with it.
A std::bitset does not allocate its storage dynamically; it is contained in the object itself. This means that the compiler is most likely trying to track its state to allow constant folding and other optimizations. The fact that constexpr is used in some places also contributes to this. All of this, including some overhead to track the value of individual parts of the bitset, will cause it to allocate tons of internal structures.
I'm not sure in which cases this is triggered and it might be different on different compilers/versions or depend on certain settings.
UINT_MAX is about 4 billion on most modern machines. Because a std::bitset<N> stores all N bits inside the object itself, this translates into 0.5 gigabytes of memory (4 billion bits / 8) required for such a std::bitset. The overhead you see is probably due to internal compiler optimizations.
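The size itself is cheap to verify: taking sizeof of the type does not create a global object, so it does not trigger the expensive constant initialization (a minimal sketch, assuming only that the storage is inline):

#include <bitset>
#include <climits>

// All UINT_MAX bits live inside the object, so its size is roughly
// N/CHAR_BIT bytes, rounded up to a whole number of words (~512 MB).
static_assert(sizeof(std::bitset<UINT_MAX>) >= UINT_MAX / CHAR_BIT,
              "a bitset stores all of its bits inline");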
Some small experiments on http://coliru.stacked-crooked.com/ show that, at least for Clang, the lack of constexpr in C++98 will allow you to compile this program without problems on reasonable resources (I'm not sure how much memory Coliru allows its clients).
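If you just need the program to build, one workaround (my own suggestion, not something from the experiments above) is to avoid constant initialization entirely by allocating the bitset dynamically, so the constexpr constructor only runs at runtime and the compiler never has to materialize the half-gigabyte image:

#include <bitset>
#include <climits>
#include <memory>

// The ~512 MB object is now constructed on the heap at program startup
// instead of being constant-initialized at compile time.
std::unique_ptr<std::bitset<UINT_MAX>> bits(new std::bitset<UINT_MAX>);

int main()
{
    bits->set(42);
    return bits->test(42) ? 0 : 1;
}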