Performing a Where - in Query in Couchdb

Performing a WHERE - IN query in CouchDB

You need to use views keys query parameter to get records with keys in specified set.

function(doc){
emit(doc.table.id, null);
}

And then

GET /db/_design/ddoc_name/_view/by_table_id?keys=[2,4,56]

To retrieve document content in same time just add include_docs=True query parameter to your request.

UPD: Probably, you might be interested to retrieve documents by this reference ids (2,4,56). By default CouchDB views "maps" emitted keys with documents they belongs to. To tweak this behaviour you could use linked documents trick:

function(doc){
emit(doc.table.id, {'_id': doc.table.id});
}

And now request

GET /db/_design/ddoc_name/_view/by_table_id?keys=[2,4,56]&include_docs=True

will return rows with id field that points to document that holds 2,4 and 56 keys and doc one that contains referenced document content.

SELECT ... WHERE ... query in CouchDB + Python

First, try creating a view.

function(doc) {
if(doc.type) {
emit(doc.type);
}
}

See here for more information on views: http://guide.couchdb.org/draft/views.html

Next, write some python. I only have experience with the cloudant-python library which would look something like this:

import cloudant

account = cloudant.Account('http://localhost:5984')
db = account.database('yourdb')

view = db.view('theview')
options = {
'key': 'post',
'include_docs': True
}
for row in view.iter(params=options):
# emits only rows with the key 'post'
# with each row's emitting document

Note: this should also work for CouchDB

CouchDB - Mango Query to select records based on complex composite key

You can use $regex operator described in chapter Condition Operators of CouchDB API Reference. In below example, I assumed _id to be the key you want to search by.

"selector": {
"_id": {
"$regex": "^001.*"
}
}

Here's an example using CURL (replace <db> with the name of your database).

curl -H 'Content-Type: application/json' -X POST http://localhost:5984/<db>/_find -d '{"selector":{"_id":{"$regex": "^001.*"}}}'

Can $in / $or queries use indexes?

Answering my question: no, $in/or queries cannot use indices. I asked the question in this user@couchdb mailing list thread, where Garren Smith answered and pointed to Understanding Mango View-Based Indexes vs. Search-Based Indexes and A look under the covers of PouchDB-find. To quote Garren,

The reason that adding "_id": {"$gt": 0} works is because
pouchdb-find/mango fetches all the docs using the _all_docs index and then
processes the $in operator in memory.

If you have a large database this will hurt. But you can use a better value
than 0 to reduce the number of documents that need to be sorted in memory,
which is a good thing.

So, careful, "_id": {"$gt": 0}is by no means a way to use an index (that's the impression I got when @markwatsonatx suggested it), it's only a way to suppress pouchdb-find's warning, by telling it "I know this won't fit a map/reduce, I'll be running in-memory operations on allDocs, and I'm aware of the perf. consequences". Also,

The warning is just to help anyone new to using Mango that what they are
doing isn't the best way on a large database but will be fine on a small
database. It's a fine way to experiment but once you start noticing performance issues, creating an index is the way forward.

I'll complement with a little benchmark I made, comparing different approaches to fetch {10, 100, 1000, 10000} "cases" (using selectors on an indexed field) from a db containing [10000 cases, 100000 noise documents]


|number of cases fetched|10 |100 |1000 |10000 |
|-----------------------|------|-------|--------|------|
|$in |2452ms|2539ms |2474ms |5032ms|
|$in + $gt |905ms |784ms |1014ms |3805ms|
|$in + $gt + $lt |5ms |13ms |100ms |3854ms|
|$or |2638ms|11763ms|101279ms|- |

How to perform multi-dimensional queries in CouchDB/Cloudant

You should be able to use the N1QL query language for queries like this with no problems. N1QL is only available for Couchbase, not the CouchDB project that Couchbase grew out of.

For example, if I understand your first query there, you could write it like this in N1QL:

SELECT * 
FROM datapoints
WHERE lat BETWEEN -72 AND -70 AND
lng BETWEEN 40 AND 50 AND
time BETWEEN 1531500769 AND 1530500000

To run such a query efficiently, you'll need an index, like this:

CREATE INDEX lat_long_time_idx ON datapoints(lat, lng, time)

You can find out more about N1QL here:
https://query-tutorial.couchbase.com/tutorial/#1

CouchDB query performance

The view runs on every document, but only once. After that, the document's view value(s) are stored forever. Fetching a customer by name will be very fast because you would normally have only a few new documents to process in the view at query time.

Query time will not increase noticeably if you have more documents. Technically, access times grow logarithmically with the number of documents. However, in practice fetching documents is basically constant time and very unlikely to be a problem.



Related Topics



Leave a reply



Submit