Performing a WHERE - IN query in CouchDB
You need to use views keys
query parameter to get records with keys in specified set.
function(doc){
emit(doc.table.id, null);
}
And then
GET /db/_design/ddoc_name/_view/by_table_id?keys=[2,4,56]
To retrieve document content in same time just add include_docs=True
query parameter to your request.
UPD: Probably, you might be interested to retrieve documents by this reference ids (2,4,56). By default CouchDB views "maps" emitted keys with documents they belongs to. To tweak this behaviour you could use linked documents trick:
function(doc){
emit(doc.table.id, {'_id': doc.table.id});
}
And now request
GET /db/_design/ddoc_name/_view/by_table_id?keys=[2,4,56]&include_docs=True
will return rows with id
field that points to document that holds 2
,4
and 56
keys and doc
one that contains referenced document content.
SELECT ... WHERE ... query in CouchDB + Python
First, try creating a view.
function(doc) {
if(doc.type) {
emit(doc.type);
}
}
See here for more information on views: http://guide.couchdb.org/draft/views.html
Next, write some python. I only have experience with the cloudant-python library which would look something like this:
import cloudant
account = cloudant.Account('http://localhost:5984')
db = account.database('yourdb')
view = db.view('theview')
options = {
'key': 'post',
'include_docs': True
}
for row in view.iter(params=options):
# emits only rows with the key 'post'
# with each row's emitting document
Note: this should also work for CouchDB
CouchDB - Mango Query to select records based on complex composite key
You can use $regex
operator described in chapter Condition Operators of CouchDB API Reference. In below example, I assumed _id
to be the key you want to search by.
"selector": {
"_id": {
"$regex": "^001.*"
}
}
Here's an example using CURL (replace <db>
with the name of your database).
curl -H 'Content-Type: application/json' -X POST http://localhost:5984/<db>/_find -d '{"selector":{"_id":{"$regex": "^001.*"}}}'
Can $in / $or queries use indexes?
Answering my question: no, $in
/or
queries cannot use indices. I asked the question in this user@couchdb mailing list thread, where Garren Smith answered and pointed to Understanding Mango View-Based Indexes vs. Search-Based Indexes and A look under the covers of PouchDB-find. To quote Garren,
The reason that adding
"_id": {"$gt": 0}
works is because
pouchdb-find/mango fetches all the docs using the_all_docs
index and then
processes the$in
operator in memory.If you have a large database this will hurt. But you can use a better value
than0
to reduce the number of documents that need to be sorted in memory,
which is a good thing.
So, careful, "_id": {"$gt": 0}
is by no means a way to use an index (that's the impression I got when @markwatsonatx suggested it), it's only a way to suppress pouchdb-find's warning, by telling it "I know this won't fit a map/reduce, I'll be running in-memory operations on allDocs, and I'm aware of the perf. consequences". Also,
The warning is just to help anyone new to using Mango that what they are
doing isn't the best way on a large database but will be fine on a small
database. It's a fine way to experiment but once you start noticing performance issues, creating an index is the way forward.
I'll complement with a little benchmark I made, comparing different approaches to fetch {10, 100, 1000, 10000} "cases" (using selectors on an indexed field) from a db containing [10000 cases, 100000 noise documents]
|number of cases fetched|10 |100 |1000 |10000 |
|-----------------------|------|-------|--------|------|
|$in |2452ms|2539ms |2474ms |5032ms|
|$in + $gt |905ms |784ms |1014ms |3805ms|
|$in + $gt + $lt |5ms |13ms |100ms |3854ms|
|$or |2638ms|11763ms|101279ms|- |
How to perform multi-dimensional queries in CouchDB/Cloudant
You should be able to use the N1QL query language for queries like this with no problems. N1QL is only available for Couchbase, not the CouchDB project that Couchbase grew out of.
For example, if I understand your first query there, you could write it like this in N1QL:
SELECT *
FROM datapoints
WHERE lat BETWEEN -72 AND -70 AND
lng BETWEEN 40 AND 50 AND
time BETWEEN 1531500769 AND 1530500000
To run such a query efficiently, you'll need an index, like this:
CREATE INDEX lat_long_time_idx ON datapoints(lat, lng, time)
You can find out more about N1QL here:
https://query-tutorial.couchbase.com/tutorial/#1
CouchDB query performance
The view runs on every document, but only once. After that, the document's view value(s) are stored forever. Fetching a customer by name will be very fast because you would normally have only a few new documents to process in the view at query time.
Query time will not increase noticeably if you have more documents. Technically, access times grow logarithmically with the number of documents. However, in practice fetching documents is basically constant time and very unlikely to be a problem.
Related Topics
SQL Query to Join Two Tables Based Off Closest Timestamp
Blank Values in Date Column Returning as 1900/01/01 on Running Select Statement
How to Select Rows Where a Column Value Starts with a Certain String
SQL Do Inner Join If Condition Met
What Would Be the Best Way to Store Records Order in SQL
SQL Server Group by Query Select First Row Each Group
Using Variables for Asc and Desc in Order By
Prepend Table Name to Each Column in a Result Set in SQL? (Postgres Specifically)
Set Empty Strings ('') to Null in the Whole Database
Why Is Variable Declared Inside If Statement Created Even When Condition Evaluates to False
How to Execute a Stored Procedure Over a Set Without Using a Cursor
How to Subtract 2 Dates in Oracle to Get the Result in Hour and Minute
Change Datatype Varchar to Nvarchar in Existing SQL Server 2005 Database. Any Issues
SQL Insert into with Subquery and Value
Enforce a Foreign-Key Constraint to Columns of Same Table
Sql: Retrieve Only the Records Whose Value Has Changed
Insert Manually into a Table by SQL Statement, But Key Is Autoincremented