Optimizing for space instead of speed in C++

When you say "optimization", people tend to think "speed". But what about embedded systems where speed isn't all that critical, but memory is a major constraint? What are some guidelines, techniques, and tricks that can be used for shaving off those extra kilobytes in ROM and RAM? How does one "profile" code to see where the memory bloat is?

PS One could argue that "prematurely" optimizing for space in embedded systems isn't all that evil, because you leave yourself more room for data storage and feature creep. It also allows you to cut hardware production costs because your code can run on smaller ROM/RAM.

PPS References to articles and books are welcome too!

PPPS These questions are closely related: 404615, 1561629


My experience from an extremely constrained embedded memory environment:

  • Use fixed-size buffers. Avoid pointers and dynamic allocation because they carry too much overhead.
  • Use the smallest int data type that works.
  • Don't ever use recursion. Always use looping.
  • Don't pass lots of function parameters. Use globals instead. :)
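A minimal sketch of the first two points, assuming a byte queue that never needs more than 255 entries. `ByteQueue` and its members are hypothetical names for illustration, not from any library:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity byte queue. The storage lives inside the
// object, so there is no heap use and no pointer member to pay for.
template <std::uint8_t Capacity>
class ByteQueue {
    static_assert(Capacity > 0, "queue needs at least one slot");

public:
    // Returns false when the queue is full; the caller decides what to drop.
    bool push(std::uint8_t value) {
        if (count_ == Capacity) return false;
        data_[(head_ + count_) % Capacity] = value;
        ++count_;
        return true;
    }

    // Returns false when the queue is empty and leaves `out` untouched.
    bool pop(std::uint8_t& out) {
        if (count_ == 0) return false;
        out = data_[head_];
        head_ = static_cast<std::uint8_t>((head_ + 1) % Capacity);
        --count_;
        return true;
    }

    std::uint8_t size() const { return count_; }

private:
    std::uint8_t data_[Capacity] = {};
    std::uint8_t head_ = 0;   // 8-bit indices suffice for Capacity <= 255
    std::uint8_t count_ = 0;
};
```

With 8-bit elements and indices, a `ByteQueue<16>` should occupy 18 bytes on common toolchains: 16 bytes of payload plus two bytes of bookkeeping.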

There are many things you can do to reduce your memory footprint, and I'm sure people have written books on the subject. A few of the major ones are:

  • Compiler options to reduce code size (including -Os and packing/alignment options)

  • Linker options to strip dead code

  • If you're loading from flash (or ROM) to RAM to execute (rather than executing from flash), then use a compressed flash image, and decompress it with your bootloader.

  • Use static allocation: a heap is an inefficient way to allocate limited memory, and when the heap is small, allocations can fail because of fragmentation.

  • Tools to find the stack high-watermark (typically they fill the stack with a pattern, execute the program, then see where the pattern remains), so you can set the stack size(s) optimally

  • And of course, optimising the algorithms you use for memory footprint (often at the expense of speed)
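The stack high-watermark technique can be sketched roughly as follows. This assumes a stack that grows downward, so the untouched words sit at the lowest addresses. The fill value and function names are made up for illustration, and on real hardware the region bounds would come from the linker script or the RTOS rather than an ordinary array:

```cpp
#include <cstddef>
#include <cstdint>

// Arbitrary fill pattern (assumption); any value unlikely to be written
// by normal code will do.
constexpr std::uint32_t kStackFill = 0xA5A5A5A5u;

// Fill the whole stack region with the pattern before the task starts.
void paint_stack(std::uint32_t* lowest, std::size_t words) {
    for (std::size_t i = 0; i < words; ++i) lowest[i] = kStackFill;
}

// Count how many words at the low end still hold the pattern. With a
// downward-growing stack, this is the headroom that was never used.
std::size_t untouched_words(const std::uint32_t* lowest, std::size_t words) {
    std::size_t n = 0;
    while (n < words && lowest[n] == kStackFill) ++n;
    return n;
}
```

After a long soak test, `words - untouched_words(...)` gives the deepest stack use observed, which is a lower bound on what the task really needs, so leave some margin when trimming.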


A few obvious ones:

  • If speed isn't critical, execute the code directly from flash.
  • Declare constant data tables using const. This avoids the data being copied from flash to RAM.
  • Pack large data tables tightly using the smallest data types, and in the correct order to avoid padding.
  • Use compression for large sets of data (as long as the compression code doesn't outweigh the data)
  • Turn off exception handling and RTTI.
  • Did anybody mention using -Os? ;-)
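To illustrate the points about const tables and member ordering, here is a small sketch. The struct and table names are invented, the sizes in the comments are what a typical 32-bit ABI produces rather than a guarantee, and whether a const table really stays in flash depends on the toolchain and linker script:

```cpp
#include <cstdint>

// Members in "natural" order: the compiler usually inserts padding so
// that each member is aligned (typically 12 bytes on a 32-bit ABI).
struct Loose {
    std::uint8_t  flags;
    std::uint32_t timestamp;
    std::uint8_t  channel;
    std::uint16_t reading;
};

// Same members sorted from largest to smallest alignment
// (typically 8 bytes on the same ABI).
struct Tight {
    std::uint32_t timestamp;
    std::uint16_t reading;
    std::uint8_t  flags;
    std::uint8_t  channel;
};

static_assert(sizeof(Tight) <= sizeof(Loose),
              "sorting members by alignment should not grow the struct");

// A const lookup table with the smallest element type that fits.
// Being const, it is a candidate for read-only storage.
const std::uint8_t kNibbleBits[16] = {0, 1, 1, 2, 1, 2, 2, 3,
                                      1, 2, 2, 3, 2, 3, 3, 4};

// Count the set bits in a byte using two table lookups.
std::uint8_t count_bits(std::uint8_t value) {
    return static_cast<std::uint8_t>(kNibbleBits[value & 0x0F] +
                                     kNibbleBits[value >> 4]);
}
```

Checking `sizeof` for the structs you store in large arrays is a cheap way to spot padding before it multiplies.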

    Folding knowledge into data

    One of the rules of Unix philosophy can help make code more compact:

    Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

    I can't count how many times I've seen elaborate branching logic, spanning many pages, that could've been folded into a nice compact table of rules, constants, and function pointers. State machines can often be represented this way (State Pattern). The Command Pattern also applies. It's all about the declarative vs imperative styles of programming.
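As a rough sketch of what such a table can look like, here is a made-up turnstile state machine. The states, events, and actions are all hypothetical, and the logic that drives them is a single lookup:

```cpp
#include <cstdint>

enum State : std::uint8_t { Locked, Unlocked, StateCount };
enum Event : std::uint8_t { Coin, Push, EventCount };

struct Machine {
    State state;
    std::uint8_t alarms;  // pushes while locked
    std::uint8_t passes;  // people let through
};

using Action = void (*)(Machine&);

void no_action(Machine&) {}
void sound_alarm(Machine& m) { ++m.alarms; }
void count_pass(Machine& m) { ++m.passes; }

struct Rule {
    State next;
    Action action;
};

// The whole behaviour lives in this const table: one row per state,
// one column per event.
const Rule kRules[StateCount][EventCount] = {
    /* Locked   */ {{Unlocked, no_action}, {Locked, sound_alarm}},
    /* Unlocked */ {{Unlocked, no_action}, {Locked, count_pass}},
};

// The "stupid and robust" program logic: look up the rule, run it.
void dispatch(Machine& m, Event e) {
    const Rule& rule = kRules[m.state][e];
    rule.action(m);
    m.state = rule.next;
}
```

Adding a state or an event means adding a row or a column rather than another layer of branching, and the table itself can usually live in read-only memory.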

    Log codes + binary data instead of text

    Instead of logging plain text, log event codes and binary data. Then use a "phrasebook" to reconstitute the event messages. The messages in the phrasebook can even contain printf-style format specifiers, so that the event data values are displayed neatly within the text.
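A minimal sketch of the idea, with invented event codes and messages. Only `LogRecord` would live on the device. The phrasebook and `render` are meant for the host-side tool that decodes the log, so the strings cost nothing in ROM:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical event codes shared by the device and the host tool.
enum EventCode : std::uint16_t { EvBoot, EvTempHigh, EvLinkDown, EvCount };

// What the device actually stores: a code plus one binary argument.
struct LogRecord {
    std::uint16_t code;
    std::int32_t value;
};

// Host-side phrasebook, indexed by event code. Each entry takes
// exactly one %ld argument.
const char* const kPhrasebook[EvCount] = {
    "System booted, reset cause %ld",
    "Temperature high: %ld C",
    "Link down on port %ld",
};

// Host-side decoding: turn a binary record back into readable text.
std::string render(const LogRecord& record) {
    if (record.code >= EvCount) return "Unknown event";
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, kPhrasebook[record.code],
                  static_cast<long>(record.value));
    return buffer;
}
```

For example, `render(LogRecord{EvTempHigh, 85})` yields "Temperature high: 85 C", while the device only ever stored a handful of bytes for that event.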

    Minimize the number of threads

    Each thread needs its own memory block for a stack and TSS. Where you don't need preemption, consider making your tasks execute co-operatively within the same thread (cooperative multi-tasking).
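One simple form of cooperative multitasking is a round-robin loop over short "step" functions that all share a single stack. The sketch below uses invented task names, and it assumes every step returns promptly, since nothing can preempt a task that doesn't:

```cpp
#include <cstddef>
#include <cstdint>

// A task is a function that does a small amount of work and returns.
using TaskStep = void (*)(void* context);

struct Task {
    TaskStep step;
    void* context;  // per-task state, kept in static storage by the caller
};

// One pass of the scheduler; a real main loop would call this forever.
void run_all_once(const Task* tasks, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) tasks[i].step(tasks[i].context);
}

// Hypothetical task: toggle an LED flag on every second pass.
struct Blinker {
    std::uint8_t ticks = 0;
    bool led_on = false;
};

void blinker_step(void* context) {
    auto* blinker = static_cast<Blinker*>(context);
    if (++blinker->ticks == 2) {
        blinker->ticks = 0;
        blinker->led_on = !blinker->led_on;
    }
}

// Hypothetical task: count how many samples have been taken.
struct Sampler {
    std::uint16_t samples = 0;
};

void sampler_step(void* context) {
    ++static_cast<Sampler*>(context)->samples;
}
```

Each task here costs a few bytes of state instead of a dedicated stack, at the price of having to split long-running work into small steps by hand.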

    Use memory pools instead of hoarding

    To avoid heap fragmentation, I've often seen separate modules hoard large static memory buffers for their own use, even when the memory is only occasionally required. A memory pool could be used instead so that the memory is only used "on demand". However, this approach may require careful analysis and instrumentation to make sure pools are not depleted at runtime.
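A fixed-block pool along these lines might look like the sketch below. `BlockPool` is a hypothetical name, the pool is not thread-safe, and the peak counter stands in for the instrumentation mentioned above:

```cpp
#include <cstddef>

// Hypothetical fixed-block pool. All blocks have the same size, so the
// pool cannot fragment, and the storage is part of the object itself.
template <std::size_t BlockSize, std::size_t BlockCount>
class BlockPool {
    static_assert(BlockSize > 0 && BlockCount > 0, "pool must not be empty");

public:
    BlockPool() {
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < BlockCount; ++i) {
            blocks_[i].next = free_;
            free_ = &blocks_[i];
        }
    }

    // Returns nullptr when the pool is depleted.
    void* acquire() {
        if (free_ == nullptr) return nullptr;
        Block* block = free_;
        free_ = block->next;
        if (++in_use_ > peak_) peak_ = in_use_;
        return block->bytes;
    }

    // `p` must be a pointer previously returned by acquire().
    void release(void* p) {
        if (p == nullptr) return;
        Block* block = reinterpret_cast<Block*>(p);
        block->next = free_;
        free_ = block;
        --in_use_;
    }

    // Highest number of blocks ever in use at once.
    std::size_t peak() const { return peak_; }

private:
    // A free block stores the free-list link in its own payload bytes.
    union Block {
        Block* next;
        alignas(std::max_align_t) unsigned char bytes[BlockSize];
    };

    Block blocks_[BlockCount];
    Block* free_ = nullptr;
    std::size_t in_use_ = 0;
    std::size_t peak_ = 0;
};
```

Comparing `peak()` against the block count after a stress run gives a rough idea of whether the pool is oversized or close to running dry.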

    Dynamic allocation only at initialization

    In embedded systems where only one application runs indefinitely, you can use dynamic allocation in a sensible way that doesn't lead to fragmentation: just allocate dynamically once in your various initialization routines, and never free the memory. reserve() your containers to the correct capacity and don't let them auto-grow. If you need to frequently allocate/free buffers of data (say, for communication packets), then use memory pools. I once even extended the C/C++ runtime so that it would abort my program if anything tried to dynamically allocate memory after the initialization sequence.
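I can't reproduce that runtime extension here, but one way to get a similar effect in standard C++ is to replace the global `operator new` and have it abort once a flag is set. This is only a sketch under that assumption: `set_heap_locked`, `g_samples`, and `init_application` are hypothetical names, and it does not catch direct `malloc` calls or the over-aligned `new` overloads:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

namespace {
bool g_heap_locked = false;  // set once initialization has finished
}

void set_heap_locked(bool locked) { g_heap_locked = locked; }

// Replacement global allocation function: any `new` after the lock
// aborts the program instead of quietly fragmenting the heap.
void* operator new(std::size_t size) {
    if (g_heap_locked) std::abort();
    if (void* p = std::malloc(size != 0 ? size : 1)) return p;
    throw std::bad_alloc();  // with exceptions disabled, abort here instead
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical application buffer, sized once during initialization.
std::vector<int> g_samples;

void init_application() {
    g_samples.reserve(64);   // the only allocation this buffer ever makes
    set_heap_locked(true);   // from here on, growing a container aborts
}
```

After `init_application()`, pushing up to 64 values into `g_samples` stays within the reserved capacity and is fine. The 65th push would try to reallocate and hit `std::abort()`, which makes the bug loud during testing.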
