How to speed up a query that uses several string filters?

2018-06-20 14:24:38

I have a collection in MongoDB 3.4 that stores the contacts of all users of an application. Each contact has a large number of string fields (100+). I use MongoDB, but the question is valid for any other engine (MySQL, Elasticsearch, etc.).

Almost all the queries that retrieve contacts share the same four base conditions (for example user_id, base_field1, base_field2, base_field3), so I created a compound index on those fields to speed them up. The base query looks like this:

db.contacts.find({
    user_id: 1434,
    base_field1: {$in: [0, 10]},
    base_field2: true,
    base_field3: "some value"
}).limit(10)
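
For reference, the compound index described above could be declared roughly as follows. This is only a sketch: the post does not show the real index, so the key order and the ascending direction are assumptions, and baseIndexSpec is a name made up for illustration.

```javascript
// Assumed key order: the same order in which the question lists the base fields.
// The direction (1 = ascending) is also an assumption.
var baseIndexSpec = {
    user_id: 1,
    base_field1: 1,
    base_field2: 1,
    base_field3: 1
};

// In the mongo shell the index would then be created with:
//   db.contacts.createIndex(baseIndexSpec)
```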

The execution time of the base query is good (less than 2 seconds), but keep in mind that 25K contacts match the base conditions.

However, the application lets the user filter contacts by any other field and add any number of filters. All the filters use a "contains" operator, so the query looks like this:

db.contacts.find({
    user_id: 1434,
    base_field1: {$in: [0, 10]},
    base_field2: true,
    base_field3: "some value",
    field4: {$regex: "foobar", $options: "i"},
    field5: {$regex: "foobar", $options: "i"},
    field6: {$regex: "foobar", $options: "i"},
      .
      .
      .
}).limit(10)
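
A filter document like the one above is typically assembled from whatever fields the user picks. The following is a minimal sketch of that step, not the application's actual code: escapeRegex and buildContactFilter are hypothetical helpers, and escaping the user's text is an added assumption so that a search for "foo.bar" is treated as a literal "contains" rather than as a pattern.

```javascript
// Escape regex metacharacters so the user's text is matched literally.
function escapeRegex(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Combine the fixed base conditions with one case-insensitive
// "contains" condition per user-selected field (all ANDed together).
function buildContactFilter(base, containsFilters) {
    var filter = Object.assign({}, base);
    Object.keys(containsFilters).forEach(function (field) {
        filter[field] = { $regex: escapeRegex(containsFilters[field]), $options: "i" };
    });
    return filter;
}

var filter = buildContactFilter(
    { user_id: 1434, base_field1: { $in: [0, 10] }, base_field2: true, base_field3: "some value" },
    { field4: "foobar", field5: "foo.bar" }
);

// In the mongo shell: db.contacts.find(filter).limit(10)
```

Each extra entry in the second argument adds one more unanchored regex condition, which is exactly the part of the query that the compound index cannot help with.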

With these filters the execution time (9-10 seconds) is not good enough for our requirements. Also, as you would expect, increasing the number of filters increases the execution time too. So:

Is there any way to speed up the query from a design and query point of view?

Is there any other DB engine better suited than MongoDB to this kind of query?

Please take into account the following comments and restrictions before replying:

A text index is useless here: if I create a compound text index over all the possible fields and the user filters only by field4 contains "foobar", the result might include contacts that contain "foobar" in field5 rather than in field4.

Simply creating a compound index with more than 31 fields is not possible in MongoDB.

Creating a single-field index for each field doesn't make sense because, when the user filters by several fields, MongoDB will use only one index. Also, you can create only 64 indexes per collection.

I actually use a MongoDB sharded cluster with a hashed shard key (user_id), but to keep things simple I reduced the problem to the scope of a single shard; that is, the problem exists even if I add a shard per user.

Edit: I replaced the OR conditions (field4 OR field5 ...) with AND conditions, which is the real case.
