NSDictionary, NSArray, NSSet and efficiency

I've got a text file with about 200,000 lines. Each line represents an object with multiple properties. I only ever search on one of the properties (the unique ID) of the objects. If the unique ID I'm looking for matches the current object's unique ID, I read the rest of that object's values.

Right now, each time I search for an object, I just read the whole text file line by line, create an object for each line and see if it's the object I'm looking for - which is basically the most inefficient way to do the search. I would like to read all those objects into memory, so I can later search through them more efficiently.

The question is, what's the most efficient way to perform such a search? Is a 200,000-entry NSArray a good way to do this (I doubt it)? How about an NSSet? With an NSSet, is it possible to search on only one property of the objects?

Thanks for any help!

-- Ry


@yngvedh is correct that an NSDictionary has O(1) lookup time (as is expected for a hash-based map structure). However, after doing some testing, you can see that NSSet also has O(1) lookup time. Here's the basic test I did to come up with that: http://pastie.org/933070

Basically, I create 1,000,000 strings, then time how long it takes me to retrieve 100,000 random ones from both the dictionary and the set. When I run this a few times, the set actually appears to be faster...

dict lookup: 0.174897
set lookup: 0.166058
---------------------
dict lookup: 0.171486
set lookup: 0.165325
---------------------
dict lookup: 0.170934
set lookup: 0.164638
---------------------
dict lookup: 0.172619
set lookup: 0.172966
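
In case the linked test is unavailable, here is a minimal sketch of that kind of comparison. It is not the original test code: the key format (`object-N`), the variable names, and the use of NSDate for timing are all assumptions made for illustration, and it assumes ARC and a reasonably modern compiler.

```objc
#import <Foundation/Foundation.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    @autoreleasepool {
        const NSUInteger count = 1000000;   // strings stored in each collection
        const NSUInteger lookups = 100000;  // random lookups to time

        NSMutableArray *keys = [NSMutableArray arrayWithCapacity:count];
        NSMutableDictionary *dict = [NSMutableDictionary dictionaryWithCapacity:count];
        NSMutableSet *set = [NSMutableSet setWithCapacity:count];

        // Hypothetical key format; any unique strings would do.
        for (NSUInteger i = 0; i < count; i++) {
            NSString *key = [NSString stringWithFormat:@"object-%lu", (unsigned long)i];
            [keys addObject:key];
            dict[key] = key;
            [set addObject:key];
        }

        // Choose the probe keys once so both collections see the same workload.
        NSMutableArray *probes = [NSMutableArray arrayWithCapacity:lookups];
        for (NSUInteger i = 0; i < lookups; i++) {
            [probes addObject:keys[arc4random_uniform((uint32_t)count)]];
        }

        unsigned long hits = 0;

        NSDate *start = [NSDate date];
        for (NSString *probe in probes) {
            if (dict[probe] != nil) hits++;
        }
        printf("dict lookup: %f\n", -[start timeIntervalSinceNow]);

        start = [NSDate date];
        for (NSString *probe in probes) {
            if ([set member:probe] != nil) hits++;
        }
        printf("set lookup: %f\n", -[start timeIntervalSinceNow]);

        // Printing the hit count keeps the loops from being optimized away.
        printf("hits: %lu\n", hits);
    }
    return 0;
}
```

Because the probes are the very same string instances that were inserted, equality checks can short-circuit on pointer identity; probing with freshly built strings would be a slightly stricter test.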

In your particular case, I'm not sure either of these will be what you want. You say that you want all of these objects in memory, but do you really need them all, or do you just need a few of them? If it's the latter, then I would probably read through the file once and create an object ID to file offset mapping (i.e., remember where each object ID is in the file). Then you could look up which ones you want and use the file offset to jump to the right spot in the file, parse that line, and move on. This is a job for NSFileHandle.
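
A rough sketch of that offset index follows. The file layout is an assumption, since the question doesn't specify it: one object per newline-terminated UTF-8 line, with the unique ID as the first comma-separated field. The function names `BuildLineIndex` and `ReadLine`, the file name `objects.txt`, and the ID `12345` are hypothetical. ARC is assumed.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Maps each object ID to the byte range (NSRange wrapped in NSValue) of its line.
// Assumed format: "<id>,<other fields...>\n" per line.
static NSDictionary *BuildLineIndex(NSString *path) {
    NSData *data = [NSData dataWithContentsOfFile:path
                                          options:NSDataReadingMappedIfSafe
                                            error:NULL];
    if (data == nil) return nil;

    NSMutableDictionary *index = [NSMutableDictionary dictionary];
    const char *bytes = data.bytes;
    NSUInteger length = data.length;
    NSUInteger lineStart = 0;

    while (lineStart < length) {
        const char *newline = memchr(bytes + lineStart, '\n', length - lineStart);
        NSUInteger lineEnd = newline ? (NSUInteger)(newline - bytes) : length;

        // The ID runs up to the first comma, or to the end of the line if there is none.
        const char *comma = memchr(bytes + lineStart, ',', lineEnd - lineStart);
        NSUInteger idLength = comma ? (NSUInteger)(comma - (bytes + lineStart))
                                    : lineEnd - lineStart;
        NSString *objectID = [[NSString alloc] initWithBytes:bytes + lineStart
                                                      length:idLength
                                                    encoding:NSUTF8StringEncoding];
        if (objectID.length > 0) {
            index[objectID] = [NSValue valueWithRange:NSMakeRange(lineStart, lineEnd - lineStart)];
        }
        lineStart = lineEnd + 1;  // skip past the newline
    }
    return index;
}

// Seeks to the recorded offset and reads just that one line.
static NSString *ReadLine(NSFileHandle *handle, NSDictionary *index, NSString *objectID) {
    NSValue *value = index[objectID];
    if (value == nil) return nil;
    NSRange range = value.rangeValue;
    [handle seekToFileOffset:range.location];
    NSData *lineData = [handle readDataOfLength:range.length];
    return [[NSString alloc] initWithData:lineData encoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        NSString *path = @"objects.txt";  // hypothetical file name
        NSDictionary *index = BuildLineIndex(path);
        NSFileHandle *handle = [NSFileHandle fileHandleForReadingAtPath:path];
        NSString *line = ReadLine(handle, index, @"12345");  // hypothetical ID
        NSLog(@"%@", line ?: @"(not found)");
        [handle closeFile];
    }
    return 0;
}
```

Only the IDs and byte ranges stay in memory; each lookup costs one dictionary access plus one small read. If the file uses CRLF line endings, the returned line will still carry a trailing carriage return that needs trimming.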


Use NSDictionary to map from IDs to objects. That is: use the ID as the key and the object as the value. Of these three, NSDictionary is the only collection class which supports efficient key lookup (or key lookup at all).

Dictionaries are a different kind of collection from the other collection classes. A dictionary is an associative collection (it maps IDs to objects in your case), whereas the others are simply containers for multiple objects. NSSet holds unordered, unique objects, and NSArray holds ordered objects (and may hold duplicates).

UPDATE:

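To avoid reallocations as you read the entries, create the dictionary with NSMutableDictionary's dictionaryWithCapacity: method. If you know the (approximate) number of entries before reading them, you can pass it as the capacity to preallocate a big enough dictionary. The capacity is only an initial size; the dictionary still grows if you add more entries.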
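
As a minimal sketch of this approach: the line format below (comma-separated fields with the ID first), the file name `objects.txt`, and the ID `12345` are assumptions for illustration, and each "object" is stored simply as its array of fields rather than as a custom class.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hypothetical file name; reads the whole file into memory once.
        NSString *contents = [NSString stringWithContentsOfFile:@"objects.txt"
                                                       encoding:NSUTF8StringEncoding
                                                          error:NULL];

        // Preallocate for roughly the number of lines mentioned in the question.
        NSMutableDictionary *objectsByID = [NSMutableDictionary dictionaryWithCapacity:200000];

        [contents enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
            // Assumed format: the unique ID is the first comma-separated field.
            NSArray *fields = [line componentsSeparatedByString:@","];
            NSString *objectID = fields.firstObject;
            if (objectID.length > 0) {
                objectsByID[objectID] = fields;  // ID as key, parsed object as value
            }
        }];

        // Each later search is a single keyed lookup instead of a scan of the file.
        NSArray *match = objectsByID[@"12345"];  // hypothetical ID
        NSLog(@"%@", match ?: @"(not found)");
    }
    return 0;
}
```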


With 200,000 objects you might run into memory constraints, depending on the size of the objects and your target environment. One other thing you may want to consider is converting the data into an SQLite database and then indexing the columns you want to do lookups on. This would provide a good compromise between efficiency and resource consumption, as you would not have to load the full set into memory.
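
A sketch of what the lookup side could look like with the SQLite C API. The database file name `objects.sqlite`, the table `objects`, and its columns `id` and `payload` are hypothetical, and the import step that fills the table from the text file is left out. Link against libsqlite3 (for example with `-lsqlite3`).

```objc
#import <Foundation/Foundation.h>
#import <sqlite3.h>

int main(void) {
    @autoreleasepool {
        sqlite3 *db = NULL;
        if (sqlite3_open("objects.sqlite", &db) != SQLITE_OK) {
            sqlite3_close(db);
            return 1;
        }

        // Declaring id as PRIMARY KEY makes SQLite maintain an index on it,
        // so lookups by id do not scan the whole table.
        sqlite3_exec(db,
                     "CREATE TABLE IF NOT EXISTS objects (id TEXT PRIMARY KEY, payload TEXT)",
                     NULL, NULL, NULL);

        sqlite3_stmt *stmt = NULL;
        if (sqlite3_prepare_v2(db, "SELECT payload FROM objects WHERE id = ?1",
                               -1, &stmt, NULL) == SQLITE_OK) {
            NSString *wantedID = @"12345";  // hypothetical ID
            sqlite3_bind_text(stmt, 1, wantedID.UTF8String, -1, SQLITE_TRANSIENT);

            if (sqlite3_step(stmt) == SQLITE_ROW) {
                const unsigned char *text = sqlite3_column_text(stmt, 0);
                if (text != NULL) {
                    NSLog(@"payload: %@", [NSString stringWithUTF8String:(const char *)text]);
                }
            } else {
                NSLog(@"no row with id %@", wantedID);
            }
        }
        sqlite3_finalize(stmt);  // harmless no-op if stmt is NULL
        sqlite3_close(db);
    }
    return 0;
}
```

The prepared statement can be reset with sqlite3_reset and reused for every lookup, so only the rows actually requested are ever read from disk.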
