C++: Data structure for efficient insertion and retrieval of custom data

I'm having a situation in C++ (on Windows) where I need to keep a set of pair of int: pair where start values are unique (we need not be concerned with this). The operations required are:

  • insert pair
  • retrieve pair X: this should return the pair Y where Y's start < X's start < X's end < Y's end. If Y doesn't exist, return false.
  • The basic solution is to simply keep a set of pairs. For retrieval, we'll iterate sequentially through the set to check. This is O(n).

    I'm looking for a better solution. I currently see 2 candidate data structures:

  • Sorted vector
  • STL's set (internally implemented as binary search tree?)
  • Sorted Vector: Pros: can customize the binary search to support the retrieval operation. This is O(logn) Cons: how to efficiently insert a new pair to maintain the sorted order. How to avoid a re-sorting cost of O(nlogn)?

    Set: Pros: Easy insertion using the standard insert method. This is O(1)? Cons: How to avoid the sequential search? How to do better than O(n)?

    Thanks for your advice.

    I'm also open to any other structures that can efficiently (1st criterion is speed; 2nd is memory) support the 2 operations mentioned above.


    It isn't clear whether the ranges can overlap, but if they can't, then this should work. I've included a complete example with tests.

    #include <stdlib.h>
    #include <assert.h>
    #include <limits.h>
    #include <map>
    
    struct RangeContainer {
      typedef std::map<int,int> RangeMap;
      typedef std::pair<int,int> Range;
    
      void insert(const Range &range)
      {
        range_map.insert(range);
      }
    
      Range find(const Range &x) const
      {
        RangeMap::const_iterator iter = range_map.upper_bound(x.second);
        if (iter==range_map.begin()) {
          return invalidRange();
        }
        --iter;
        Range y = *iter;
        if (y.first<x.first && x.second<y.second) {
          return y;
        }
    
        return invalidRange();
      }
    
      static Range invalidRange()
      {
        return Range(INT_MAX,INT_MIN);
      }
    
      RangeMap range_map;
    };
    
    
    static void test1()
    {
      RangeContainer c;
      typedef RangeContainer::Range Range;
      c.insert(Range(1,10));
      c.insert(Range(20,30));
      assert(c.find(Range(-5,-4))==c.invalidRange());
      assert(c.find(Range(1,10))==c.invalidRange());
      assert(c.find(Range(2,9))==Range(1,10));
      assert(c.find(Range(2,10))==c.invalidRange());
      assert(c.find(Range(11,19))==c.invalidRange());
      assert(c.find(Range(21,29))==Range(20,30));
      assert(c.find(Range(20,29))==c.invalidRange());
      assert(c.find(Range(21,30))==c.invalidRange());
      assert(c.find(Range(35,40))==c.invalidRange());
    }
    
    int main(int argc,char **argv)
    {
      test1();
      return EXIT_SUCCESS;
    }
    
    链接地址: http://www.djcxy.com/p/87900.html

    上一篇: 将大数据文件读入数据结构c ++

    下一篇: C ++:用于有效插入和检索自定义数据的数据结构