digit numbers in 1 MB of RAM
I have a computer with 1 MB of RAM and no other local storage. I must use it to accept 1 million 8-digit decimal numbers over a TCP connection, sort them, and then send the sorted list out over another TCP connection.
The list of numbers may contain duplicates, which I must not discard. The code will be placed in ROM, so I need not subtract the size of my code from the 1 MB. I already have code to drive the Ethernet port and handle TCP/IP connections, and it requires 2 KB for its state data, including a 1 KB buffer via which the code will read and write data. Is there a solution to this problem?
Sources Of Question And Answer:
slashdot.org
cleaton.net
Here's some working C++ code which solves the problem.
Proof that the memory constraints are satisfied:
Editor: There is no proof of the maximum memory requirements offered by the author either in this post or in his blogs. Since the number of bits necessary to encode a value depends on the values previously encoded, such a proof is likely non-trivial. The author notes that the largest encoded size he could stumble upon empirically was 1011732
, and chose the buffer size 1013000
arbitrarily.
typedef unsigned int u32;
namespace WorkArea
{
static const u32 circularSize = 253250;
u32 circular[circularSize] = { 0 }; // consumes 1013000 bytes
static const u32 stageSize = 8000;
u32 stage[stageSize]; // consumes 32000 bytes
...
Together, these two arrays take 1045000 bytes of storage. That leaves 1048576 - 1045000 - 2×1024 = 1528 bytes for remaining variables and stack space.
It runs in about 23 seconds on my Xeon W3520. You can verify that the program works using the following Python script, assuming a program name of sort1mb.exe
.
from subprocess import *
import random
sequence = [random.randint(0, 99999999) for i in xrange(1000000)]
sorter = Popen('sort1mb.exe', stdin=PIPE, stdout=PIPE)
for value in sequence:
sorter.stdin.write('%08dn' % value)
sorter.stdin.close()
result = [int(line) for line in sorter.stdout]
print('OK!' if result == sorted(sequence) else 'Error!')
A detailed explanation of the algorithm can be found in the following series of posts:
There is one rather sneaky trick not mentioned here so far. We assume that you have no extra way to store data, but that is not strictly true.
One way around your problem is to do the following horrible thing, which should not be attempted by anyone under any circumstances: Use the network traffic to store data. And no, I don't mean NAS.
You can sort the numbers with only a few bytes of RAM in the following way:
COUNTER
and VALUE
. 0
; I
, increment COUNTER
and set VALUE
to max(VALUE, I)
; Once COUNTER
reaches 1000000
, you have all of the values stored in the incessant stream of ICMP requests, and VALUE
now contains the maximum integer. Pick some threshold T >> 1000000
. Set COUNTER
to zero. Every time you receive an ICMP packet, increment COUNTER
and send the contained integer I back out in another echo request, unless I=VALUE
, in which case transmit it to the destination for the sorted integers. Once COUNTER=T
, decrement VALUE
by 1
, reset COUNTER
to zero and repeat. Once VALUE
reaches zero you should have transmitted all integers in order from largest to smallest to the destination, and have only used about 47 bits of RAM for the two persistent variables (and whatever small amount you need for the temporary values).
I know this is horrible, and I know there can be all sorts of practical issues, but I thought it might give some of you a laugh or at least horrify you.
Please see the first correct answer or the later answer with arithmetic encoding. Below you may find some fun, but not a 100% bullet-proof solution.
This is quite an interesting task and here is an another solution. I hope somebody would find the result useful (or at least interesting).
Stage 1: Initial data structure, rough compression approach, basic results
Let's do some simple math: we have 1M (1048576 bytes) of RAM initially available to store 10^6 8 digit decimal numbers. [0;99999999]. So to store one number 27 bits are needed (taking the assumption that unsigned numbers will be used). Thus, to store a raw stream ~3.5M of RAM will be needed. Somebody already said it doesn't seem to be feasible, but I would say the task can be solved if the input is "good enough". Basically, the idea is to compress the input data with compression factor 0.29 or higher and do sorting in a proper manner.
Let's solve the compression issue first. There are some relevant tests already available:
http://www.theeggeadventure.com/wikimedia/index.php/Java_Data_Compression
"I ran a test to compress one million consecutive integers using various forms of compression. The results are as follows:"
None 4000027
Deflate 2006803
Filtered 1391833
BZip2 427067
Lzma 255040
It looks like LZMA (Lempel–Ziv–Markov chain algorithm) is a good choice to continue with. I've prepared a simple PoC, but there are still some details to be highlighted:
Please note, attached code is a POC, it can't be used as a final solution, it just demonstrates the idea of using several smaller buffers to store presorted numbers in some optimal way (possibly compressed). LZMA is not proposed as a final solution. It is used as a fastest possible way to introduce a compression to this PoC.
See the PoC code below (please note it just a demo, to compile it LZMA-Java will be needed):
public class MemorySortDemo {
static final int NUM_COUNT = 1000000;
static final int NUM_MAX = 100000000;
static final int BUCKETS = 5;
static final int DICT_SIZE = 16 * 1024; // LZMA dictionary size
static final int BUCKET_SIZE = 1024;
static final int BUFFER_SIZE = 10 * 1024;
static final int BUCKET_RANGE = NUM_MAX / BUCKETS;
static class Producer {
private Random random = new Random();
public int produce() { return random.nextInt(NUM_MAX); }
}
static class Bucket {
public int size, pointer;
public int[] buffer = new int[BUFFER_SIZE];
public ByteArrayOutputStream tempOut = new ByteArrayOutputStream();
public DataOutputStream tempDataOut = new DataOutputStream(tempOut);
public ByteArrayOutputStream compressedOut = new ByteArrayOutputStream();
public void submitBuffer() throws IOException {
Arrays.sort(buffer, 0, pointer);
for (int j = 0; j < pointer; j++) {
tempDataOut.writeInt(buffer[j]);
size++;
}
pointer = 0;
}
public void write(int value) throws IOException {
if (isBufferFull()) {
submitBuffer();
}
buffer[pointer++] = value;
}
public boolean isBufferFull() {
return pointer == BUFFER_SIZE;
}
public byte[] compressData() throws IOException {
tempDataOut.close();
return compress(tempOut.toByteArray());
}
private byte[] compress(byte[] input) throws IOException {
final BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(input));
final DataOutputStream out = new DataOutputStream(new BufferedOutputStream(compressedOut));
final Encoder encoder = new Encoder();
encoder.setEndMarkerMode(true);
encoder.setNumFastBytes(0x20);
encoder.setDictionarySize(DICT_SIZE);
encoder.setMatchFinder(Encoder.EMatchFinderTypeBT4);
ByteArrayOutputStream encoderPrperties = new ByteArrayOutputStream();
encoder.writeCoderProperties(encoderPrperties);
encoderPrperties.flush();
encoderPrperties.close();
encoder.code(in, out, -1, -1, null);
out.flush();
out.close();
in.close();
return encoderPrperties.toByteArray();
}
public int[] decompress(byte[] properties) throws IOException {
InputStream in = new ByteArrayInputStream(compressedOut.toByteArray());
ByteArrayOutputStream data = new ByteArrayOutputStream(10 * 1024);
BufferedOutputStream out = new BufferedOutputStream(data);
Decoder decoder = new Decoder();
decoder.setDecoderProperties(properties);
decoder.code(in, out, 4 * size);
out.flush();
out.close();
in.close();
DataInputStream input = new DataInputStream(new ByteArrayInputStream(data.toByteArray()));
int[] array = new int[size];
for (int k = 0; k < size; k++) {
array[k] = input.readInt();
}
return array;
}
}
static class Sorter {
private Bucket[] bucket = new Bucket[BUCKETS];
public void doSort(Producer p, Consumer c) throws IOException {
for (int i = 0; i < bucket.length; i++) { // allocate buckets
bucket[i] = new Bucket();
}
for(int i=0; i< NUM_COUNT; i++) { // produce some data
int value = p.produce();
int bucketId = value/BUCKET_RANGE;
bucket[bucketId].write(value);
c.register(value);
}
for (int i = 0; i < bucket.length; i++) { // submit non-empty buffers
bucket[i].submitBuffer();
}
byte[] compressProperties = null;
for (int i = 0; i < bucket.length; i++) { // compress the data
compressProperties = bucket[i].compressData();
}
printStatistics();
for (int i = 0; i < bucket.length; i++) { // decode & sort buckets one by one
int[] array = bucket[i].decompress(compressProperties);
Arrays.sort(array);
for(int v : array) {
c.consume(v);
}
}
c.finalCheck();
}
public void printStatistics() {
int size = 0;
int sizeCompressed = 0;
for (int i = 0; i < BUCKETS; i++) {
int bucketSize = 4*bucket[i].size;
size += bucketSize;
sizeCompressed += bucket[i].compressedOut.size();
System.out.println(" bucket[" + i
+ "] contains: " + bucket[i].size
+ " numbers, compressed size: " + bucket[i].compressedOut.size()
+ String.format(" compression factor: %.2f", ((double)bucket[i].compressedOut.size())/bucketSize));
}
System.out.println(String.format("Data size: %.2fM",(double)size/(1014*1024))
+ String.format(" compressed %.2fM",(double)sizeCompressed/(1014*1024))
+ String.format(" compression factor %.2f",(double)sizeCompressed/size));
}
}
static class Consumer {
private Set<Integer> values = new HashSet<>();
int v = -1;
public void consume(int value) {
if(v < 0) v = value;
if(v > value) {
throw new IllegalArgumentException("Current value is greater than previous: " + v + " > " + value);
}else{
v = value;
values.remove(value);
}
}
public void register(int value) {
values.add(value);
}
public void finalCheck() {
System.out.println(values.size() > 0 ? "NOT OK: " + values.size() : "OK!");
}
}
public static void main(String[] args) throws IOException {
Producer p = new Producer();
Consumer c = new Consumer();
Sorter sorter = new Sorter();
sorter.doSort(p, c);
}
}
With random numbers it produces the following:
bucket[0] contains: 200357 numbers, compressed size: 353679 compression factor: 0.44
bucket[1] contains: 199465 numbers, compressed size: 352127 compression factor: 0.44
bucket[2] contains: 199682 numbers, compressed size: 352464 compression factor: 0.44
bucket[3] contains: 199949 numbers, compressed size: 352947 compression factor: 0.44
bucket[4] contains: 200547 numbers, compressed size: 353914 compression factor: 0.44
Data size: 3.85M compressed 1.70M compression factor 0.44
For a simple ascending sequence (one bucket is used) it produces:
bucket[0] contains: 1000000 numbers, compressed size: 256700 compression factor: 0.06
Data size: 3.85M compressed 0.25M compression factor 0.06
EDIT
Conclusion:
Stage 2: Enhanced compression, final conclusion
As was already mentioned in the previous section, any suitable compression technique can be used. So let's get rid of LZMA in favor of simpler and better (if possible) approach. There are a lot of good solutions including Arithmetic coding, Radix tree etc.
Anyway, simple but useful encoding scheme will be more illustrative than yet another external library, providing some nifty algorithm. The actual solution is pretty straightforward: since there are buckets with partially sorted data, deltas can be used instead of numbers.
Random input test shows slightly better results:
bucket[0] contains: 10103 numbers, compressed size: 13683 compression factor: 0.34
bucket[1] contains: 9885 numbers, compressed size: 13479 compression factor: 0.34
...
bucket[98] contains: 10026 numbers, compressed size: 13612 compression factor: 0.34
bucket[99] contains: 10058 numbers, compressed size: 13701 compression factor: 0.34
Data size: 3.85M compressed 1.31M compression factor 0.34
Sample code
public static void encode(int[] buffer, int length, BinaryOut output) {
short size = (short)(length & 0x7FFF);
output.write(size);
output.write(buffer[0]);
for(int i=1; i< size; i++) {
int next = buffer[i] - buffer[i-1];
int bits = getBinarySize(next);
int len = bits;
if(bits > 24) {
output.write(3, 2);
len = bits - 24;
}else if(bits > 16) {
output.write(2, 2);
len = bits-16;
}else if(bits > 8) {
output.write(1, 2);
len = bits - 8;
}else{
output.write(0, 2);
}
if (len > 0) {
if ((len % 2) > 0) {
len = len / 2;
output.write(len, 2);
output.write(false);
} else {
len = len / 2 - 1;
output.write(len, 2);
}
output.write(next, bits);
}
}
}
public static short decode(BinaryIn input, int[] buffer, int offset) {
short length = input.readShort();
int value = input.readInt();
buffer[offset] = value;
for (int i = 1; i < length; i++) {
int flag = input.readInt(2);
int bits;
int next = 0;
switch (flag) {
case 0:
bits = 2 * input.readInt(2) + 2;
next = input.readInt(bits);
break;
case 1:
bits = 8 + 2 * input.readInt(2) +2;
next = input.readInt(bits);
break;
case 2:
bits = 16 + 2 * input.readInt(2) +2;
next = input.readInt(bits);
break;
case 3:
bits = 24 + 2 * input.readInt(2) +2;
next = input.readInt(bits);
break;
}
buffer[offset + i] = buffer[offset + i - 1] + next;
}
return length;
}
Please note, this approach:
Full code can be found here, BinaryInput and BinaryOutput implementations can be found here
Final conclusion
No final conclusion :) Sometimes it is really good idea to move one level up and review the task from a meta-level point of view.
It was fun to spend some time with this task. BTW, there are a lot of interesting answers below. Thank you for your attention and happy codding.
链接地址: http://www.djcxy.com/p/61376.html上一篇: 数据挖掘项目数据集
下一篇: 在1 MB的RAM中的数字数字