Why would one replace default new and delete operators?

Why should would one replace the default operator new and delete with a custom new and delete operators?

This is in continuation of Overloading new and delete in the immensely illuminating C++ FAQ:
Operator overloading.

An followup entry to this FAQ is:
How should I write ISO C++ standard conformant custom new and delete operators?

Note: The answer is based on lessons from Scott Meyers' More Effective C++.
(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)



First of all, there are really a number of different new and delete operators (an arbitrary number, really).

First, there are ::operator new , ::operator new[] , ::operator delete and ::operator delete[] . Second, for any class X , there are X::operator new , X::operator new[] , X::operator delete and X::operator delete[] .

Between these, it's much more common to overload the class-specific operators than the global operators -- it's fairly common for the memory usage of a particular class to follow a specific enough pattern that you can write operators that provide substantial improvements over the defaults. It's generally much more difficult to predict memory usage nearly that accurately or specifically on a global basis.

It's probably also worth mentioning that although operator new and operator new[] are separate from each other (likewise for any X::operator new and X::operator new[] ), there is no difference between the requirements for the two. One will be invoked to allocate a single object, and the other to allocate an array of objects, but each still just receives an amount of memory that's needed, and needs to return the address of a block of memory (at least) that large.

Speaking of requirements, it's probably worthwhile to review the other requirements1: the global operators must be truly global -- you may not put one inside a namespace or make one static in a particular translation unit. In other words, there are only two levels at which overloads can take place: a class-specific overload or a global overload. In-between points such as "all the classes in namespace X" or "all allocations in translation unit Y" are not allowed. The class-specific operators are required to be static -- but you're not actually required to declare them as static -- they will be static whether you explicitly declare them static or not. Officially, the global operators much return memory aligned so that it can be used for an object of any type. Unofficially, there's a little wiggle-room in one regard: if you get a request for a small block (eg, 2 bytes) you only really need to provide memory aligned for an object up to that size, since attempting to store anything larger there would lead to undefined behavior anyway.

Having covered those preliminaries, let's get back to the original question about why you'd want to overload these operators. First, I should point out that the reasons for overloading the global operators tend to be substantially different from the reasons for overloading the class-specific operators.

Since it's more common, I'll talk about the class-specific operators first. The primary reason for class-specific memory management is performance. This commonly comes in either (or both) of two forms: either improving speed, or reducing fragmentation. Speed is improved by the fact that the memory manager will only deal with blocks of a particular size, so it can return the address of any free block rather than spending any time checking whether a block is large enough, splitting a block in two if it's too large, etc. Fragmentation is reduced in (mostly) the same way -- for example, pre-allocating a block large enough for N objects gives exactly the space necessary for N objects; allocating one object's worth of memory will allocate exactly the space for one object, and not a single byte more.

There's a much greater variety of reasons for overloading the global memory management operators. Many of these are oriented toward debugging or instrumentation, such as tracking the total memory needed by an application (eg, in preparation for porting to an embedded system), or debugging memory problems by showing mismatches between allocating and freeing memory. Another common strategy is to allocate extra memory before and after the boundaries of each requested block, and writing unique patterns into those areas. At the end of execution (and possibly other times as well), those areas are examined to see if code has written outside the allocated boundaries. Yet another is to attempt to improve ease of use by automating at least some aspects of memory allocation or deletion, such as with an automated garbage collector.

A non-default global allocator can be used to improve performance as well. A typical case would be replacing a default allocator that was just slow in general (eg, at least some versions of MS VC++ around 4.x would call the system HeapAlloc and HeapFree functions for every allocation/deletion operation). Another possibility I've seen in practice was occurred on Intel processors when using the SSE operations. These operate on 128-bit data. While the operations will work regardless of alignment, speed is improved when the data is aligned to 128-bit boundaries. Some compilers (eg, MS VC++ again2) haven't necessarily enforced alignment to that larger boundary, so even though code using the default allocator would work, replacing the allocating could provide a substantial speed improvement for those operations.


  • Most of the requirements are covered in §3.7.3 and §18.4 of the C++ standard (or §3.7.4 and §18.6 in C++0x, at least as of N3291).
  • I feel obliged to point out that I don't intend to pick on Microsoft's compiler -- I doubt it has an unusual number of such problems, but I happen to use it a lot, so I tend to be quite aware of its problems.

  • Many computer architectures require that data of particular types be placed in memory at particular kinds of addresses. For example, an architecture might require that pointers occur at addresses that are a multiple of four (ie, be four-byte aligned) or that doubles must occur at addresses that are a multiple of eight (ie, be eight-byte aligned). Failure to follow such constraints can lead to hardware exceptions at run-time. Other architectures are more forgiving, and may allow it to work though reducing the performance.

    To clarify: if an architecture requires for instance that double data be eight-byte aligned, then there is nothing to optimize. Any kind of dynamic allocation of the appropriate size (eg malloc(size) , operator new(size) , operator new[](size) , new char[size] where size >= sizeof(double) ) is guaranteed to be properly aligned. If an implementation doesn't make this guarantee, it is not conforming. Changing operator new to do 'the right thing' in that case would be an attempt at 'fixing' the implementation, not an optimization.

    On the other hand, some architectures allow different (or all) kinds of alignment for one or more data types, but provide different performance guarantees depending on alignment for those same types. An implementation may then return memory (again, assuming a request of appropriate size) that is sub-optimally aligned, and still be conforming. This is what the example is about.

    链接地址: http://www.djcxy.com/p/2006.html

    上一篇: 什么是三项规则?

    下一篇: 为什么要替换默认的新运算符和删除运算符?