MongoDB: what is the most efficient way to query a single random document?

2018-06-19 22:25:11

I need to pick a document from a collection at random (or, alternatively, a small number of successive documents from a randomly positioned "window"). I've found two solutions: 1 and 2. The first is unacceptable because I anticipate a large collection and want to keep the document size to a minimum. The second seems inefficient (I'm not sure about the complexity of the skip operation). And here one can find a mention of querying a document by a specified index, but I don't know how to do that (I'm using the C++ driver).

Are there other solutions to the problem? Which is the most efficient?

I had a similar issue once. In my case, the documents had a date property, and I knew the earliest possible date in the dataset. In my application code I would generate a random date between EARLIEST_DATE_IN_SET and NOW, then query MongoDB with a GTE ($gte) condition on the date property and limit the result to 1.

There was a small chance that the random date would be later than the latest date in the dataset, in which case the query returns nothing, so I accounted for that in the application code.

With an index on the date property, this was a super fast query.
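The logic of this approach can be sketched without a database. The following is only an illustration: a `std::multimap` stands in for the indexed date field, `lower_bound` plays the role of the indexed `$gte` query limited to one result (sorted ascending by date), and the names `DateIndex`, `first_at_or_after` and `pick_random_by_date` are made up for this example. The answer does not say how the "random date past the newest document" case was handled; wrapping around to the earliest document is an assumption here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <random>
#include <string>

// Stand-in for an indexed date field: key = timestamp (seconds since epoch),
// value = document id. In a real deployment this is a MongoDB index.
using DateIndex = std::multimap<std::int64_t, std::string>;

// Returns the first document whose date is >= pivot, mimicking a $gte query
// sorted ascending by date and limited to 1 result.
// Assumption: if pivot is later than every document, wrap to the earliest one.
std::optional<std::string> first_at_or_after(const DateIndex& index,
                                             std::int64_t pivot) {
    if (index.empty()) return std::nullopt;
    auto it = index.lower_bound(pivot);
    if (it == index.end()) it = index.begin();  // pivot is past the newest document
    return it->second;
}

// Draws a random timestamp in [earliest, now] and looks up the document
// at or after it.
template <class Rng>
std::optional<std::string> pick_random_by_date(const DateIndex& index,
                                               std::int64_t earliest,
                                               std::int64_t now,
                                               Rng& rng) {
    assert(earliest <= now);
    std::uniform_int_distribution<std::int64_t> dist(earliest, now);
    return first_at_or_after(index, dist(rng));
}
```

With a seeded engine such as `std::mt19937_64 rng(std::random_device{}());`, a call like `pick_random_by_date(index, earliest, now, rng)` returns one document id. Note that the pick is only as uniform as the dates are evenly spread: a document that follows a long gap in the dates is chosen more often than one in a dense cluster.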

It seems like you could adapt solution 1 (assuming your _id key is an auto-incrementing value): do a count on your records, use that as the upper limit for a random int in C++, and then fetch the document with that _id.

Likewise, if you don't have an auto-incrementing _id key, just add one to your documents. An additional field holding an INT shouldn't add much to your document size.

If you don't have an auto-increment field, the MongoDB documentation describes how to quickly add one here:

Auto Inc Field.
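
For the "count, then random int" step, a minimal sketch of the application-side part might look like this. It assumes the sequence numbers run from 1 to the document count with no gaps (deletions would break that), and the function name `random_window_start` is made up for this example. It also covers the "window" of successive documents from the question by choosing a start position that leaves room for the whole window.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <random>

// Picks the starting sequence number for a window of `window` successive
// documents. Assumption: sequence numbers are 1..count with no gaps.
// With window == 1 this is simply a random document number.
template <class Rng>
std::optional<std::int64_t> random_window_start(std::int64_t count,
                                                std::int64_t window,
                                                Rng& rng) {
    if (count <= 0 || window <= 0) return std::nullopt;
    window = std::min(window, count);  // cannot return more than exists
    std::uniform_int_distribution<std::int64_t> dist(1, count - window + 1);
    return dist(rng);
}
```

The returned value would then be used in a query on the auto-increment field: an equality match for a single document, or a GTE condition sorted ascending and limited to the window size for several successive documents. Because the field is indexed, neither query needs skip.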
