Sorting 1 million 8-digit numbers in 1 MB of RAM

I have a computer with 1 MB of RAM and no other local storage. I must use it to accept 1 million 8-digit decimal numbers over a TCP connection, sort them, and then send the sorted list out over another TCP connection.

The list of numbers may contain duplicates, which I must not discard. The code will be placed in ROM, so I need not subtract the size of my code from the 1 MB. I already have code to drive the Ethernet port and handle TCP/IP connections, and it requires 2 KB for its state data, including a 1 KB buffer via which the code will read and write data. Is there a solution to this problem?

Sources of question and answer: slashdot.org, cleaton.net


Here's some working C++ code which solves the problem.

Proof that the memory constraints are satisfied:

Editor: there is no proof of the maximum memory requirement offered by the author, either in this post or in his blog posts. Since the number of bits necessary to encode a value depends on the values previously encoded, such a proof is likely non-trivial. The author notes that the largest encoded size he could stumble upon empirically was 1011732 bytes, and chose the buffer size of 1013000 somewhat arbitrarily.
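
For intuition on why a buffer of roughly that size is plausible at all, note that a sorted multiset of 10^6 values drawn from 10^8 possibilities carries about log2 C(10^8 + 10^6 - 1, 10^6) bits of information. A quick back-of-envelope check of that figure (an editorial addition in Java, not part of the original answer):

// Computes log2(C(n + k - 1, k)), the information content in bits of a
// sorted multiset of k values drawn from n possibilities.
public class EntropyBound {
    public static void main(String[] args) {
        long n = 100_000_000L, k = 1_000_000L;
        double bits = 0;
        for (long i = 0; i < k; i++) {
            bits += Math.log((n + i) / (double) (i + 1)) / Math.log(2);
        }
        System.out.printf("%.0f bits = %.0f bytes%n", bits, bits / 8);
        // Prints roughly 8.09 million bits, i.e. just over 1011000 bytes,
        // consistent with the empirical 1011732 mentioned above.
    }
}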

typedef unsigned int u32;

namespace WorkArea
{
    static const u32 circularSize = 253250;
    u32 circular[circularSize] = { 0 };         // consumes 1013000 bytes

    static const u32 stageSize = 8000;
    u32 stage[stageSize];                       // consumes 32000 bytes

    ...

Together, these two arrays take 1045000 bytes of storage. That leaves 1048576 - 1045000 - 2×1024 = 1528 bytes for remaining variables and stack space.

It runs in about 23 seconds on my Xeon W3520. You can verify that the program works using the following Python script, assuming a program name of sort1mb.exe.

from subprocess import *
import random

# Note: this script is Python 2 (xrange, text-mode pipes).
sequence = [random.randint(0, 99999999) for i in xrange(1000000)]

sorter = Popen('sort1mb.exe', stdin=PIPE, stdout=PIPE)
for value in sequence:
    sorter.stdin.write('%08d\n' % value)  # eight digits, newline-terminated
sorter.stdin.close()

result = [int(line) for line in sorter.stdout]
print('OK!' if result == sorted(sequence) else 'Error!')

A detailed explanation of the algorithm can be found in the following series of posts:

  • 1MB Sorting Explained
  • Arithmetic Coding and the 1MB Sorting Problem
  • Arithmetic Encoding Using Fixed-Point Math

  • There is one rather sneaky trick not mentioned here so far. We assume that you have no extra way to store data, but that is not strictly true.

    One way around your problem is to do the following horrible thing, which should not be attempted by anyone under any circumstances: Use the network traffic to store data. And no, I don't mean NAS.

    You can sort the numbers with only a few bytes of RAM in the following way:

  • First, take two variables: COUNTER and VALUE .
  • Set both to 0 ;
  • Every time you receive an integer I , increment COUNTER and set VALUE to max(VALUE, I) ;
  • Then send an ICMP echo request packet with its data set to I to the router. Erase I and repeat.
  • Every time you receive a returned ICMP packet, you simply extract the integer and send it back out again in another echo request. This produces a huge number of ICMP requests scuttling backward and forward containing the integers.
  • Once COUNTER reaches 1000000 , you have all of the values stored in the incessant stream of ICMP requests, and VALUE now contains the maximum integer. Pick some threshold T >> 1000000 . Set COUNTER to zero. Every time you receive an ICMP packet, increment COUNTER and send the contained integer I back out in another echo request, unless I = VALUE , in which case transmit it to the destination for the sorted integers. Once COUNTER = T , decrement VALUE by 1 , reset COUNTER to zero and repeat. Once VALUE reaches zero you should have transmitted all integers in order from largest to smallest to the destination, and have only used about 47 bits of RAM for the two persistent variables (and whatever small amount you need for the temporary values).

    I know this is horrible, and I know there can be all sorts of practical issues, but I thought it might give some of you a laugh or at least horrify you.
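
    Since the scheme is easier to see in code than in prose, here is a minimal Java simulation (an editorial sketch, not from the original answer): a queue stands in for the stream of in-flight echo requests, so the only "real" memory is COUNTER and VALUE. Small input values keep the simulation fast; the real scheme walks VALUE down from the observed maximum, one unit per threshold period.

    import java.util.ArrayDeque;
    import java.util.Queue;

    public class IcmpSortSimulation {
        public static void main(String[] args) {
            int[] input = {3, 1, 4, 1, 5, 9, 2, 6};
            Queue<Integer> inFlight = new ArrayDeque<>(); // stands in for the network

            // Receive phase: remember only the count and the running maximum.
            long counter = 0;
            int value = 0;
            for (int i : input) {
                counter++;
                value = Math.max(value, i);
                inFlight.add(i); // "send an echo request carrying i"
            }

            // Drain phase: keep bouncing packets; emit those equal to VALUE,
            // and decrement VALUE every T bounces (T >> input size, so every
            // in-flight copy of VALUE is seen before the decrement).
            long t = 10 * counter;
            counter = 0;
            while (!inFlight.isEmpty()) {
                int i = inFlight.remove(); // "an echo reply arrives"
                if (i == value) {
                    System.out.println(i); // transmit to the destination
                } else {
                    inFlight.add(i); // bounce it back out
                }
                if (++counter == t) {
                    value--;
                    counter = 0;
                }
            }
        }
    }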


    Please see the first correct answer or the later answer with arithmetic encoding. Below you will find some fun, but not a 100% bullet-proof, solution.

    This is quite an interesting task, and here is another solution. I hope somebody finds the result useful (or at least interesting).

    Stage 1: Initial data structure, rough compression approach, basic results

    Let's do some simple math: we have 1M (1048576 bytes) of RAM available to store 10^6 8-digit decimal numbers, i.e. values in the range [0; 99999999]. Storing one such number takes 27 bits (assuming unsigned values are used), so a raw stream would need ~3.5M of RAM. Somebody already said that doesn't seem feasible, but I would say the task can be solved if the input is "good enough". Basically, the idea is to compress the input data to a compression factor (compressed size over original) of about 0.29 or lower and to do the sorting in a proper manner.
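
    A quick sanity check of that sizing arithmetic (an editorial sketch; the constants mirror the paragraph above, whose "~3.5M" and "0.29" are looser roundings):

    // Editorial sanity check of the sizing arithmetic above.
    public class SizingCheck {
        public static void main(String[] args) {
            int bitsPerNumber = (int) Math.ceil(Math.log(1e8) / Math.log(2)); // 27
            long rawBytes = 1_000_000L * bitsPerNumber / 8;   // 3,375,000 bytes (~3.4M)
            double requiredFactor = 1_048_576.0 / rawBytes;   // ~0.31 compressed/original
            System.out.printf("bits/number=%d, raw=%d bytes, required factor<=%.2f%n",
                    bitsPerNumber, rawBytes, requiredFactor);
        }
    }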

    Let's solve the compression issue first. There are some relevant tests already available:

    http://www.theeggeadventure.com/wikimedia/index.php/Java_Data_Compression

    "I ran a test to compress one million consecutive integers using various forms of compression. The results are as follows:"

    None     4000027
    Deflate  2006803
    Filtered 1391833
    BZip2    427067
    Lzma     255040
    

    It looks like LZMA (Lempel–Ziv–Markov chain algorithm) is a good choice to continue with. I've prepared a simple PoC, but there are still some details to be highlighted:

  • Memory is limited, so the idea is to presort the numbers and use compressed buckets (of dynamic size) as temporary storage
  • It is easier to achieve a better compression factor with presorted data, so there is a static buffer for each bucket (numbers from the buffer are sorted before the LZMA pass)
  • Each bucket holds a specific range, so the final sort can be done for each bucket separately
  • Each bucket's size can be tuned so there is enough memory to decompress the stored data and do the final sort for each bucket separately
    In-memory sorting (diagram in the original post)

    Please note, the attached code is a PoC; it can't be used as a final solution, it just demonstrates the idea of using several smaller buffers to store presorted numbers in some optimal way (possibly compressed). LZMA is not proposed as a final solution. It is used as the fastest way to introduce compression into this PoC.

    See the PoC code below (please note it is just a demo; to compile it, LZMA-Java will be needed):

    import java.io.*;
    import java.util.*;
    // Encoder and Decoder come from LZMA-Java; the exact import depends on
    // the build used (e.g. SevenZip.Compression.LZMA.Encoder / Decoder).

    public class MemorySortDemo {
    
    static final int NUM_COUNT = 1000000;
    static final int NUM_MAX   = 100000000;
    
    static final int BUCKETS      = 5;
    static final int DICT_SIZE    = 16 * 1024; // LZMA dictionary size
    static final int BUCKET_SIZE  = 1024;
    static final int BUFFER_SIZE  = 10 * 1024;
    static final int BUCKET_RANGE = NUM_MAX / BUCKETS;
    
    static class Producer {
        private Random random = new Random();
        public int produce() { return random.nextInt(NUM_MAX); }
    }
    
    static class Bucket {
        public int size, pointer;
        public int[] buffer = new int[BUFFER_SIZE];
    
        public ByteArrayOutputStream tempOut = new ByteArrayOutputStream();
        public DataOutputStream tempDataOut = new DataOutputStream(tempOut);
        public ByteArrayOutputStream compressedOut = new ByteArrayOutputStream();
    
        public void submitBuffer() throws IOException {
            Arrays.sort(buffer, 0, pointer);
    
            for (int j = 0; j < pointer; j++) {
                tempDataOut.writeInt(buffer[j]);
                size++;
            }            
            pointer = 0;
        }
    
        public void write(int value) throws IOException {
            if (isBufferFull()) {
                submitBuffer();
            }
            buffer[pointer++] = value;
        }
    
        public boolean isBufferFull() {
            return pointer == BUFFER_SIZE;
        }
    
        public byte[] compressData() throws IOException {
            tempDataOut.close();
            return compress(tempOut.toByteArray());
        }        
    
        private byte[] compress(byte[] input) throws IOException {
            final BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(input));
            final DataOutputStream out = new DataOutputStream(new BufferedOutputStream(compressedOut));
    
            final Encoder encoder = new Encoder();
            encoder.setEndMarkerMode(true);
            encoder.setNumFastBytes(0x20);
            encoder.setDictionarySize(DICT_SIZE);
            encoder.setMatchFinder(Encoder.EMatchFinderTypeBT4);
    
            ByteArrayOutputStream encoderProperties = new ByteArrayOutputStream();
            encoder.writeCoderProperties(encoderProperties);
            encoderProperties.flush();
            encoderProperties.close();
    
            encoder.code(in, out, -1, -1, null);
            out.flush();
            out.close();
            in.close();
    
            return encoderProperties.toByteArray(); // compressed bytes land in compressedOut
        }
    
        public int[] decompress(byte[] properties) throws IOException {
            InputStream in = new ByteArrayInputStream(compressedOut.toByteArray());
            ByteArrayOutputStream data = new ByteArrayOutputStream(10 * 1024);
            BufferedOutputStream out = new BufferedOutputStream(data);
    
            Decoder decoder = new Decoder();
            decoder.setDecoderProperties(properties);
            decoder.code(in, out, 4 * size);
    
            out.flush();
            out.close();
            in.close();
    
            DataInputStream input = new DataInputStream(new ByteArrayInputStream(data.toByteArray()));
            int[] array = new int[size];
            for (int k = 0; k < size; k++) {
                array[k] = input.readInt();
            }
    
            return array;
        }
    }
    
    static class Sorter {
        private Bucket[] bucket = new Bucket[BUCKETS];
    
        public void doSort(Producer p, Consumer c) throws IOException {
    
            for (int i = 0; i < bucket.length; i++) {  // allocate buckets
                bucket[i] = new Bucket();
            }
    
            for(int i=0; i< NUM_COUNT; i++) {         // produce some data
                int value = p.produce();
                int bucketId = value/BUCKET_RANGE;
                bucket[bucketId].write(value);
                c.register(value);
            }
    
            for (int i = 0; i < bucket.length; i++) { // submit non-empty buffers
                bucket[i].submitBuffer();
            }
    
            byte[] compressProperties = null;
            for (int i = 0; i < bucket.length; i++) { // compress the data
                compressProperties = bucket[i].compressData();
            }
    
            printStatistics();
    
            for (int i = 0; i < bucket.length; i++) { // decode & sort buckets one by one
                int[] array = bucket[i].decompress(compressProperties);
                Arrays.sort(array);
    
                for(int v : array) {
                    c.consume(v);
                }
            }
            c.finalCheck();
        }
    
        public void printStatistics() {
            int size = 0;
            int sizeCompressed = 0;
    
            for (int i = 0; i < BUCKETS; i++) {
                int bucketSize = 4*bucket[i].size;
                size += bucketSize;
                sizeCompressed += bucket[i].compressedOut.size();
    
                System.out.println("  bucket[" + i
                        + "] contains: " + bucket[i].size
                        + " numbers, compressed size: " + bucket[i].compressedOut.size()
                        + String.format(" compression factor: %.2f", ((double)bucket[i].compressedOut.size())/bucketSize));
            }
    
            System.out.println(String.format("Data size: %.2fM",(double)size/(1014*1024))
                    + String.format(" compressed %.2fM",(double)sizeCompressed/(1014*1024))
                    + String.format(" compression factor %.2f",(double)sizeCompressed/size));
        }
    }
    
    static class Consumer {
        private Set<Integer> values = new HashSet<>();
    
        int v = -1;
        public void consume(int value) {
            if(v < 0) v = value;
    
            if(v > value) {
                throw new IllegalArgumentException("Previous value is greater than current: " + v + " > " + value);
            }else{
                v = value;
                values.remove(value);
            }
        }
    
        public void register(int value) {
            values.add(value);
        }
    
        public void finalCheck() {
            System.out.println(values.size() > 0 ? "NOT OK: " + values.size() : "OK!");
        }
    }
    
    public static void main(String[] args) throws IOException {
        Producer p = new Producer();
        Consumer c = new Consumer();
        Sorter sorter = new Sorter();
    
        sorter.doSort(p, c);
    }
    }
    

    With random numbers it produces the following:

    bucket[0] contains: 200357 numbers, compressed size: 353679 compression factor: 0.44
    bucket[1] contains: 199465 numbers, compressed size: 352127 compression factor: 0.44
    bucket[2] contains: 199682 numbers, compressed size: 352464 compression factor: 0.44
    bucket[3] contains: 199949 numbers, compressed size: 352947 compression factor: 0.44
    bucket[4] contains: 200547 numbers, compressed size: 353914 compression factor: 0.44
    Data size: 3.85M compressed 1.70M compression factor 0.44
    

    For a simple ascending sequence (one bucket is used) it produces:

    bucket[0] contains: 1000000 numbers, compressed size: 256700 compression factor: 0.06
    Data size: 3.85M compressed 0.25M compression factor 0.06
    

    EDIT

    Conclusion:

  • Don't try to fool Nature
  • Use simpler compression with a lower memory footprint
  • Some additional clues are really needed; a general bullet-proof solution does not seem feasible

    Stage 2: Enhanced compression, final conclusion

    As was already mentioned in the previous section, any suitable compression technique can be used. So let's get rid of LZMA in favor of a simpler and (if possible) better approach. There are a lot of good options, including arithmetic coding, radix trees, etc.

    Anyway, a simple but useful encoding scheme will be more illustrative than yet another external library providing some nifty algorithm. The actual solution is pretty straightforward: since there are buckets with partially sorted data, deltas can be used instead of the numbers themselves.

    Encoding scheme (diagram in the original post)
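
    Reading the scheme off the sample code further below: each delta is written as a 2-bit class selector (choosing a base width of 0, 8, 16 or 24 bits), a 2-bit width refinement, and then the delta bits themselves, with odd widths rounded up by one padding bit. For example, a delta of 300 needs 9 significant bits: class 1 (base 8, remainder 1), refinement 0 plus one padding bit, then the 9 value bits, i.e. 2 + 2 + 1 + 9 = 14 bits in total instead of 32.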

    Random input test shows slightly better results:

    bucket[0] contains: 10103 numbers, compressed size: 13683 compression factor: 0.34
    bucket[1] contains: 9885 numbers, compressed size: 13479 compression factor: 0.34
    ...
    bucket[98] contains: 10026 numbers, compressed size: 13612 compression factor: 0.34
    bucket[99] contains: 10058 numbers, compressed size: 13701 compression factor: 0.34
    Data size: 3.85M compressed 1.31M compression factor 0.34
    

    Sample code

    public static void encode(int[] buffer, int length, BinaryOut output) {
        short size = (short)(length & 0x7FFF);

        output.write(size);        // number of entries (15 bits used)
        output.write(buffer[0]);   // first value is stored verbatim

        for(int i = 1; i < size; i++) {
            int next = buffer[i] - buffer[i-1];  // delta to the previous value
            int bits = getBinarySize(next);

            int len = bits;

            // 2-bit class selector: base width of 24 / 16 / 8 / 0 bits
            if(bits > 24) {
              output.write(3, 2);
              len = bits - 24;
            }else if(bits > 16) {
              output.write(2, 2);
              len = bits - 16;
            }else if(bits > 8) {
              output.write(1, 2);
              len = bits - 8;
            }else{
              output.write(0, 2);
            }

            if (len > 0) {
                // 2-bit width refinement; an odd width is rounded up with
                // one padding bit so the decoder sees an even total width
                if ((len % 2) > 0) {
                    len = len / 2;
                    output.write(len, 2);
                    output.write(false);
                } else {
                    len = len / 2 - 1;
                    output.write(len, 2);
                }

                output.write(next, bits);
            }
        }
    }
    
    public static short decode(BinaryIn input, int[] buffer, int offset) {
        short length = input.readShort();
        int value = input.readInt();     // first value was stored verbatim
        buffer[offset] = value;

        for (int i = 1; i < length; i++) {
            int flag = input.readInt(2); // class selector written by encode()

            int bits;
            int next = 0;
            switch (flag) {
                case 0:
                    bits = 2 * input.readInt(2) + 2;
                    next = input.readInt(bits);
                    break;
                case 1:
                    bits = 8 + 2 * input.readInt(2) + 2;
                    next = input.readInt(bits);
                    break;
                case 2:
                    bits = 16 + 2 * input.readInt(2) + 2;
                    next = input.readInt(bits);
                    break;
                case 3:
                    bits = 24 + 2 * input.readInt(2) + 2;
                    next = input.readInt(bits);
                    break;
            }

            buffer[offset + i] = buffer[offset + i - 1] + next; // undo the delta
        }

        return length;
    }
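
    The helper getBinarySize is not shown in the answer; a minimal implementation consistent with how the code above uses it (the number of significant bits in a delta) could be the following sketch. Returning at least 1 keeps zero deltas, i.e. duplicate numbers, round-trippable:

    // Assumed helper (not part of the original answer): minimum bit width
    // needed to represent the non-negative delta n; at least 1 so that a
    // zero delta still emits one value bit.
    public static int getBinarySize(int n) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(n));
    }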
    

    Please note, this approach:

  • does not consume a lot of memory
  • works with streams
  • provides not-so-bad results

    The full code can be found here; the BinaryInput and BinaryOutput implementations can be found here.

    Final conclusion

    No final conclusion :) Sometimes it is a really good idea to move one level up and review the task from a meta-level point of view.

    It was fun to spend some time on this task. BTW, there are a lot of interesting answers below. Thank you for your attention and happy coding.
