Mongoid Random Document

The best solution is going to depend on the expected size of the collection.

For tiny collections, just load all the documents into an array and call .shuffle and .slice! on it.

For small sizes of n, you can get away with something like this:

# pick n distinct random offsets, then fetch one document per offset
result = (0...User.count).to_a.sample(n).map { |i| User.skip(i).first }
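
The same idea can be sketched in Python, independent of any driver. This is only an illustration under stated assumptions: `pick_random_offsets` and `fetch_random` are hypothetical helper names, and `fetch_at` is a stand-in callable for whatever fetches the document at a given offset (the equivalent of User.skip(i).first above).

```python
import random


def pick_random_offsets(count, n, rng=random):
    """Return up to n distinct offsets in [0, count), in random order."""
    n = min(n, count)
    return rng.sample(range(count), n)


def fetch_random(fetch_at, count, n, rng=random):
    """Fetch one document per random offset.

    fetch_at(i) is a hypothetical callable standing in for a
    skip(i)-then-first query against the real collection.
    """
    return [fetch_at(i) for i in pick_random_offsets(count, n, rng)]


if __name__ == "__main__":
    # In-memory stand-in for the collection (assumption for the demo).
    users = [f"user-{i}" for i in range(20)]
    print(fetch_random(lambda i: users[i], len(users), 3))
```

Each offset costs one query, which is why this only suits small values of n.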

For large sizes of n, I would recommend adding a "random" field to each document and sorting by it. See here for details: http://cookbook.mongodb.org/patterns/random-attribute/ (source: https://github.com/mongodb/cookbook/blob/master/content/patterns/random-attribute.txt)

How to select random documents from the MongoDB collection but not the previous ones?

I had the same issue while displaying stories to users.

So I simply store 5 to 10 ids from the first query result on the frontend and add a $match filter on them to the next query. After each query, I replace the stored ids with the ids from the new result.

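// ids already shown to the user, sent by the frontend
const ids = req.body.alreadyShowedVideos;
// $nin (not $ne) is needed to exclude every id in the array
const videos = await Video.aggregate([{$match: {_id: {$nin: ids}}}, {$sample: {size: Number(num)}}]);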
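
As a rough illustration of the bookkeeping described above, here is a Python sketch that builds the pipeline and keeps a bounded list of recently shown ids. The function names and the window size of 10 are assumptions made for the example; the rolling `deque` is a small variation on replacing the ids wholesale.

```python
from collections import deque


def build_unseen_sample_pipeline(seen_ids, size):
    """Build a pipeline that excludes seen_ids, then samples `size` docs."""
    return [
        {"$match": {"_id": {"$nin": list(seen_ids)}}},
        {"$sample": {"size": int(size)}},
    ]


def remember(seen_ids, new_ids):
    """Record newly shown ids; a bounded deque drops the oldest when full."""
    seen_ids.extend(new_ids)
    return seen_ids


if __name__ == "__main__":
    seen = deque(maxlen=10)  # assumed window, following "5 to 10 ids"
    remember(seen, ["a", "b", "c"])
    print(build_unseen_sample_pipeline(seen, 5))
```

The resulting list could be passed to a driver's aggregate call; keeping the window small also keeps the $nin array small.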

Mongo - Find a random document

Looking at the stages of your aggregation pipeline:

{ $match: { _id: { $nin: myID } } } 

This will use the built-in index on _id (see "Pipeline Sequence Optimization" in the MongoDB documentation).

{ $sample: { size: 1 } }

This will select one random document from the output of the $match stage.

This is expected to be efficient, provided the myID array is not large.

Note: The inequality operator $nin is not very selective, since it often matches a large portion of the index. As a result, in many cases, a $nin query with an index may perform no better than a $nin query that must scan all documents in a collection.

Sample a random document from a mongoDB database in C#

You should import this namespace:

using MongoDB.Driver.Linq;

Then try this:

users_coll.AsQueryable().Sample(20);

MongoDB: how to find 10 random document in a collection of 100?

This was answered a long time ago and, since then, MongoDB has evolved a great deal.

As posted in another answer, MongoDB has supported sampling within the Aggregation Framework since version 3.2. You could do it like this:

db.products.aggregate([{$sample: {size: 5}}]); // You want to get 5 docs

Or:

db.products.aggregate([
{$match: {category:"Electronic Devices"}}, // filter the results
{$sample: {size: 5}} // You want to get 5 docs
]);

However, there are some caveats about the $sample operator (as of Nov 6th, 2017, when the latest version was 3.4). $sample avoids scanning the whole collection only if all of the following conditions are met:

  • $sample is the first stage of the pipeline
  • N is less than 5% of the total documents in the collection
  • The collection contains more than 100 documents

If any of the above conditions is NOT met, $sample performs a
collection scan followed by a random sort to select N documents.

This is the case in the last example, where $match comes before $sample.
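
The three conditions can be written down as a small predicate. This Python sketch only encodes the rule as quoted above for version 3.4; the function name is hypothetical, and it does not query the server or reflect later versions.

```python
def sample_needs_collection_scan(is_first_stage, n, total_docs):
    """Return True if, per the rule quoted above, $sample would fall back
    to a collection scan followed by a random sort."""
    all_conditions_met = (
        is_first_stage               # $sample is the first pipeline stage
        and n < 0.05 * total_docs    # N is less than 5% of the documents
        and total_docs > 100         # more than 100 documents
    )
    return not all_conditions_met


if __name__ == "__main__":
    # 5 documents out of 1000, with $sample first
    print(sample_needs_collection_scan(True, 5, 1000))
    # Same sample, but preceded by a $match stage
    print(sample_needs_collection_scan(False, 5, 1000))
```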

OLD ANSWER

You could always run:

db.products.find({category:"Electronic Devices"}).skip(Math.floor(Math.random()*YOUR_COLLECTION_SIZE))

But the order won't be random, and you will need two queries (one count to get YOUR_COLLECTION_SIZE) or an estimate of how big the collection is (about 100 records, about 1000, about 10000...).

You could also add a field holding a random number to every document and query by that number. The drawback here is that you will get the same results every time you run the same query. To work around that, you can play with limit and skip, or even with sort. You could also update those random numbers every time you fetch a record (which implies more queries).

I don't know whether you are using Mongoose, Mongoid, or the Mongo driver for a specific language directly, so I'll write everything for the mongo shell.

So your product record would, let's say, look like this:

{
_id: ObjectId("..."),
name: "Awesome Product",
category: "Electronic Devices",
}

and I would suggest using:

{
_id: ObjectId("..."),
name: "Awesome Product",
category: "Electronic Devices",
_random_sample: Math.random()
}

Then you could do:

db.products.find({category:"Electronic Devices",_random_sample:{$gte:Math.random()}})
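
To make the behaviour of this query concrete, here is an in-memory Python sketch that picks a single document the same way. It is a simulation, not driver code: `pick_by_random_field` is a hypothetical helper, and both the "smallest value at or above the threshold" choice (the equivalent of adding a sort and a limit of 1) and the fallback to lower values are assumptions added for the example.

```python
import random


def pick_by_random_field(docs, query_filter, r=None, field="_random_sample"):
    """Return the matching doc with the smallest field value >= r.

    Falls back to the largest value below r when nothing is at or above
    the threshold, so a high r does not come back empty.
    """
    if r is None:
        r = random.random()
    matching = [
        d for d in docs
        if all(d.get(k) == v for k, v in query_filter.items())
    ]
    above = [d for d in matching if d[field] >= r]
    if above:
        return min(above, key=lambda d: d[field])
    below = [d for d in matching if d[field] < r]
    return max(below, key=lambda d: d[field]) if below else None


if __name__ == "__main__":
    products = [
        {"name": "A", "category": "Electronic Devices", "_random_sample": 0.2},
        {"name": "B", "category": "Electronic Devices", "_random_sample": 0.7},
        {"name": "C", "category": "Books", "_random_sample": 0.5},
    ]
    print(pick_by_random_field(products, {"category": "Electronic Devices"}))
```

Note that a document whose stored value sits just above a large gap is picked more often than its neighbours, which is one reason to refresh the stored numbers.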

Then you could periodically run an update to refresh the documents' _random_sample field:

var your_query = {}; // matching everything will hurt performance if there are a lot of records
your_query = {category: "Electronic Devices"}; // restrict the update to one category
// upsert = false, multi = true
db.products.update(your_query, {$set: {_random_sample: Math.random()}}, false, true)

Or, whenever you retrieve some records, you could update all of them or just a few (depending on how many records you've retrieved):

for (var i = 0; i < records.length; i++) {
  var query = {_id: records[i]._id};
  // upsert = false, multi = false
  db.products.update(query, {$set: {_random_sample: Math.random()}}, false, false);
}

EDIT

Be aware that

db.products.update(your_query, {$set: {_random_sample: Math.random()}}, false, true)

won't work very well, because Math.random() is evaluated once and every product that matches your query is updated with the same random number. The last approach (updating some documents as you retrieve them) works better.
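
The difference between the two update styles can be reproduced in memory. This Python sketch is only a simulation with hypothetical function names: one function mimics the multi-document update, where the random number is computed once, and the other mimics the per-document loop.

```python
import random


def reseed_all_with_one_value(docs, rng=random):
    """Mimic the multi-update: the random number is computed once."""
    value = rng.random()
    for doc in docs:
        doc["_random_sample"] = value


def reseed_each(docs, rng=random):
    """Mimic the per-document loop: a fresh number for every document."""
    for doc in docs:
        doc["_random_sample"] = rng.random()


if __name__ == "__main__":
    shared = [{"_id": i} for i in range(3)]
    reseed_all_with_one_value(shared)
    print(shared)  # every document carries the same value

    separate = [{"_id": i} for i in range(3)]
    reseed_each(separate)
    print(separate)  # each document carries its own value
```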

Select random document with filter with pymongo?

Like any other aggregation stage, $sample takes its input from the previous stage.

Put a $match stage before $sample to filter the documents. E.g.:

db.hyperparameters_collection.aggregate([
{ "$match": { "start_time": { "$exists": False } } },
{ "$sample": { "size": 1 } }
])
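
A small wrapper around that pipeline might look like the sketch below. The helper name `sample_one` is hypothetical, and the code assumes only that `collection` exposes an `aggregate(pipeline)` method returning an iterable, as a pymongo collection does; returning None when nothing matches is a design choice made for the example.

```python
def sample_one(collection, match_filter):
    """Return one random document matching match_filter, or None."""
    pipeline = [
        {"$match": match_filter},    # filter first
        {"$sample": {"size": 1}},    # then pick one at random
    ]
    # aggregate() yields documents; take the first one if there is any
    return next(iter(collection.aggregate(pipeline)), None)
```

With pymongo this could be called as sample_one(db.hyperparameters_collection, {"start_time": {"$exists": False}}).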

