MongoDB: what is the most efficient way to query a single random document?

I need to pick a document from a collection at random (or, alternatively, a small number of successive documents from a randomly positioned "window"). I've found two solutions: 1 and 2. The first is unacceptable because I anticipate a large collection and want to keep each document as small as possible. The second seems inefficient (I'm not sure about the complexity of the skip operation). There is also a mention here of querying a document by a specified index, but I don't know how to do that (I'm using the C++ driver).

Are there other solutions to the problem? Which is the most efficient?


I had a similar issue once. In my case, I had a date property on my documents. I knew the earliest possible date in the dataset, so in my application code I would generate a random date between EARLIEST_DATE_IN_SET and NOW, then query MongoDB with a GTE query on the date property and limit it to 1 result.

There was a small chance that the random date would be greater than the highest date in the dataset (in which case the query returns nothing), so I accounted for that in the application code.

With an index on the date property, this was a super fast query.
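
To make the idea concrete, here is a minimal C++ sketch of the random-pivot lookup. It does not talk to MongoDB: a `std::map` keyed by timestamp stands in for the collection and its date index, and `lower_bound` plays the role of the GTE query limited to 1 result. The names `DateIndex`, `first_at_or_after` and `random_document` are made up for illustration, and falling back to the newest document when the pivot overshoots is just one possible way to handle that case (the answer above does not say which it used).

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <random>
#include <string>

// Hypothetical stand-in for a collection with an ascending index on a date
// field: key = timestamp in seconds, value = document payload.
using DateIndex = std::map<std::int64_t, std::string>;

// In-memory equivalent of "date >= pivot, ascending order, limit 1".
// Returns nullptr only when the collection is empty.
const std::string* first_at_or_after(const DateIndex& index, std::int64_t pivot) {
    if (index.empty()) return nullptr;
    auto it = index.lower_bound(pivot);   // first entry with key >= pivot
    if (it == index.end()) {
        // Pivot is later than every document: fall back to the newest one
        // (an assumption; retrying with a new pivot would also work).
        it = std::prev(index.end());
    }
    return &it->second;
}

// Draw a random timestamp in [earliest, now] and look up the first document
// at or after it.
const std::string* random_document(const DateIndex& index,
                                   std::int64_t earliest,
                                   std::int64_t now,
                                   std::mt19937_64& rng) {
    assert(earliest <= now);
    std::uniform_int_distribution<std::int64_t> pick(earliest, now);
    return first_at_or_after(index, pick(rng));
}
```

Note that the choice is only uniform if the dates are evenly spread: a document that follows a long gap in the data is picked more often than one that follows closely after another.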


It seems like you could adapt solution 1 there (assuming your _id key is an auto-incrementing value): do a count of your records, use that as the upper limit for a random int in C++, then fetch the document with that _id.

Likewise, if you don't have an auto-incrementing _id key, just add one to your documents. An additional INT field shouldn't add much to your document size.

If you don't have an auto-inc field, Mongo describes how to quickly add one here:

Auto Inc Field.
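
As a rough C++ sketch of the counter approach, including the "window" of successive documents from the question: the code below again uses a `std::map` as an in-memory stand-in for the collection rather than the real driver, and it assumes the counter values run from 1 to count with no gaps. `SeqIndex`, `window_from` and `random_window` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for a collection indexed by an auto-incrementing
// integer field: key = counter value, value = document payload.
using SeqIndex = std::map<std::int64_t, std::string>;

// Return up to `window` successive documents, starting at the first one
// whose counter is >= start.
std::vector<std::string> window_from(const SeqIndex& index,
                                     std::int64_t start,
                                     std::size_t window) {
    std::vector<std::string> out;
    for (auto it = index.lower_bound(start);
         it != index.end() && out.size() < window; ++it) {
        out.push_back(it->second);
    }
    return out;
}

// Pick a random start position so that the whole window fits, then read it.
// Assumes counters are exactly 1..count (no gaps), so the document count
// can serve as the upper limit for the random integer.
std::vector<std::string> random_window(const SeqIndex& index,
                                       std::size_t window,
                                       std::mt19937_64& rng) {
    if (index.empty() || window == 0) return {};
    const auto count = static_cast<std::int64_t>(index.size());
    const std::int64_t last_start =
        std::max<std::int64_t>(1, count - static_cast<std::int64_t>(window) + 1);
    assert(last_start >= 1 && last_start <= count);
    std::uniform_int_distribution<std::int64_t> pick(1, last_start);
    return window_from(index, pick(rng), window);
}
```

With `window` set to 1 this reduces to picking a single random document. If documents are ever deleted, the counters develop gaps and the count no longer matches the highest counter value, so the upper limit would have to come from the largest counter instead.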
