Map/reduce queries, also known as secondary indexes, are one of the most powerful features in PouchDB. However, they can be quite tricky to use, so this guide is designed to dispel some of the mysteries around them.
A common mistake is to use the query() API when the more performant allDocs() API would be a better fit.
Before you solve a problem with secondary indexes, you should ask yourself: can I solve this with the primary index (_id) instead?
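As a rough illustration of that question, the sketch below fetches documents by an _id prefix with allDocs() instead of building a secondary index. It assumes a hypothetical ID scheme such as 'pokemon_pikachu' (nothing in this guide prescribes it), and findByIdPrefix is an illustrative helper name rather than part of PouchDB; db stands for any PouchDB instance.

```javascript
// Hypothetical helper: fetch every document whose _id starts with `prefix`,
// using only the primary index via allDocs().
// Assumes IDs were designed with a meaningful prefix, e.g. 'pokemon_pikachu'.
function findByIdPrefix(db, prefix) {
  return db.allDocs({
    startkey: prefix,
    endkey: prefix + '\uffff', // high code point, so the range covers the whole prefix
    include_docs: true
  });
}

// usage (db is a PouchDB instance):
// findByIdPrefix(db, 'pokemon_').then(function (result) { /* result.rows */ });
```

If your IDs already encode the field you want to look up, this kind of range query may be all you need.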
Mappin' and reducin'
The PouchDB query() API (which corresponds to the _view API in CouchDB) has two modes: temporary queries and persistent queries.
Temporary queries
Temporary queries are very slow, and we only recommend them for quick debugging during development. To use a temporary query, you simply pass in a map function:
db.query(function (doc, emit) {
  emit(doc.name);
}, {key: 'foo'}).then(function (result) {
  // found docs with name === 'foo'
}).catch(function (err) {
  // handle any errors
});
In the above example, the result object will contain all documents where the name attribute is equal to 'foo'.
The emit pattern is part of the standard CouchDB map/reduce API. What the function basically says is, "for each document, emit doc.name as a key."
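To make the emit pattern a little more concrete, here is a simplified in-memory model of what a temporary query does: run the map function over every document, collect whatever it emits, sort by key, then filter by the requested key. This is only a sketch for intuition, not PouchDB's actual implementation, and modelQuery is a made-up name.

```javascript
// Simplified in-memory model of a temporary query (NOT PouchDB's implementation).
function modelQuery(docs, mapFn, options) {
  var rows = [];
  docs.forEach(function (doc) {
    // each emitted row remembers which document it came from
    function emit(key, value) {
      rows.push({ id: doc._id, key: key, value: value === undefined ? null : value });
    }
    mapFn(doc, emit);
  });
  // keep the rows ordered by key, like an index would
  rows.sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
  // only the `key` option is modeled here
  if (options && 'key' in options) {
    rows = rows.filter(function (row) { return row.key === options.key; });
  }
  return { rows: rows };
}

var docs = [
  { _id: 'a', name: 'foo' },
  { _id: 'b', name: 'bar' },
  { _id: 'c', name: 'foo' }
];
var result = modelQuery(docs, function (doc, emit) {
  emit(doc.name);
}, { key: 'foo' });
// result.rows holds two rows, one for 'a' and one for 'c'
```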
Persistent queries
Persistent queries are much faster, and are the intended way to use the query() API in your production apps. To use persistent queries, there are two steps.
First, you create a design document, which describes the map function you would like to use:
// document that tells PouchDB/CouchDB
// to build up an index on doc.name
var ddoc = {
  _id: '_design/my_index',
  views: {
    by_name: {
      map: function (doc) { emit(doc.name); }.toString()
    }
  }
};
// save it
pouch.put(ddoc).then(function () {
  // success!
}).catch(function (err) {
  // some error (maybe a 409, because it already exists?)
});
The .toString() at the end of the map function is necessary to prep the object for becoming valid JSON.
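The reason is ordinary JSON.stringify() behavior rather than anything PouchDB-specific: function-valued properties are silently dropped during serialization, while strings survive. A small sketch (the variable names are just for illustration):

```javascript
// A function-valued property is silently dropped by JSON.stringify...
var withFunction = { map: function (doc) { emit(doc.name); } };
var droppedJson = JSON.stringify(withFunction); // '{}'

// ...whereas the function's source text survives as an ordinary string.
var withString = { map: function (doc) { emit(doc.name); }.toString() };
var keptJson = JSON.stringify(withString);
var roundTripped = JSON.parse(keptJson).map; // source text of the map function
```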
Then you actually query it, by using the name you gave the design document when you saved it:
db.query('my_index/by_name').then(function (res) {
  // got the query results
}).catch(function (err) {
  // some error
});
Note that, the first time you query, it will be quite slow because the index isn't built until you query it. To get around this, you can do an empty query to kick off a new build:
db.query('my_index/by_name', {
  limit: 0 // don't return any results
}).then(function (res) {
  // index was built!
}).catch(function (err) {
  // some error
});
After this, your queries will be much faster.
More about map/reduce
That was a fairly whirlwind tour of the query() API, so let's get into more detail about how to write your map/reduce functions.
Indexes in SQL databases
Quick refresher on how indexes work: in relational databases like MySQL and PostgreSQL, you can usually query whatever field you want:
SELECT * FROM pokemon WHERE name = 'Pikachu';
But if you don't want your performance to be terrible, you first add an index:
CREATE INDEX myIndex ON pokemon (name);
The job of the index is to ensure the field is stored in a B-tree within the database, so your queries run in O(log(n)) time instead of O(n) time.
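To get a feel for the difference, the sketch below counts how many keys each approach has to inspect. A sorted array with binary search stands in for the B-tree here; that is a simplification, but the lookup cost has the same logarithmic shape. The function names are illustrative.

```javascript
// Unindexed lookup: inspect keys one by one until we find the target.
function linearSearch(keys, target) {
  var steps = 0;
  for (var i = 0; i < keys.length; i++) {
    steps++;
    if (keys[i] === target) { return { index: i, steps: steps }; }
  }
  return { index: -1, steps: steps };
}

// "Indexed" lookup: a sorted array stands in for the B-tree.
function binarySearch(sortedKeys, target) {
  var lo = 0, hi = sortedKeys.length - 1, steps = 0;
  while (lo <= hi) {
    steps++;
    var mid = (lo + hi) >> 1;
    if (sortedKeys[mid] === target) { return { index: mid, steps: steps }; }
    if (sortedKeys[mid] < target) { lo = mid + 1; } else { hi = mid - 1; }
  }
  return { index: -1, steps: steps };
}

// 1024 sorted numeric keys: 0, 1, 2, ..., 1023
var keys = [];
for (var k = 0; k < 1024; k++) { keys.push(k); }

var scan = linearSearch(keys, 1023);   // inspects all 1024 keys
var lookup = binarySearch(keys, 1023); // inspects on the order of log2(1024) keys
```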
Indexes in NoSQL databases
All of the above is also true in document stores like CouchDB and MongoDB, but conceptually it's a little different. By default, documents are assumed to be schemaless blobs with one primary key (called _id in both Mongo and Couch), and any other keys need to be specified separately. The concepts are largely the same; it's mostly just the vocabulary that's different.
In CouchDB, queries are called map/reduce functions. This is because, like most NoSQL databases, CouchDB is designed to scale well across multiple computers, and to perform efficient query operations in parallel. Basically, the idea is that you divide your query into a map function and a reduce function, each of which may be executed in parallel in a multi-node cluster.
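As a toy model of that split, the sketch below treats two arrays as the shards held by two nodes. Each shard is mapped and reduced on its own, and the partial results are then merged. Everything here (function names, the type field, the shard contents) is made up for illustration; it is not how CouchDB is implemented.

```javascript
// Map step: run the map function over one shard and collect the emitted keys.
function mapShard(docs, mapFn) {
  var keys = [];
  docs.forEach(function (doc) {
    mapFn(doc, function emit(key) { keys.push(key); });
  });
  return keys;
}

// Reduce step: count how often each key was emitted within one shard.
function countByKey(keys) {
  var counts = {};
  keys.forEach(function (key) { counts[key] = (counts[key] || 0) + 1; });
  return counts;
}

// Combine step: merge the per-shard partial counts into one result.
function mergeCounts(partials) {
  var total = {};
  partials.forEach(function (partial) {
    Object.keys(partial).forEach(function (key) {
      total[key] = (total[key] || 0) + partial[key];
    });
  });
  return total;
}

var shardA = [{ type: 'electric' }, { type: 'fire' }];
var shardB = [{ type: 'electric' }, { type: 'water' }];
var byType = function (doc, emit) { emit(doc.type); };

// each shard could be processed independently before merging
var merged = mergeCounts([
  countByKey(mapShard(shardA, byType)),
  countByKey(mapShard(shardB, byType))
]);
// merged: { electric: 2, fire: 1, water: 1 }
```

Because neither step needs to see the whole dataset at once, the work for each shard can happen in parallel.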
Map functions
It may sound daunting at first, but in the simplest (and most common) case, you only need the map function. A basic map function might look like this:
function myMapFunction(doc) {
  emit(doc.name);
}
This is functionally equivalent to the SQL index given above. What it essentially says is: "for each document in the database, emit its name as a key."
And since it's just JavaScript, you're allowed to get as fancy as you want here:
function myMapFunction(doc) {
  if (doc.type === 'pokemon') {
    if (doc.name === 'Pikachu') {
      emit('Pika pi!');
    } else {
      emit(doc.name);
    }
  }
}
Then you can query it:
// find pokemon with name === 'Pika pi!'
pouch.query(myMapFunction, {
  key          : 'Pika pi!',
  include_docs : true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});

// find the first 5 pokemon whose name starts with 'P'
pouch.query(myMapFunction, {
  startkey     : 'P',
  endkey       : 'P\uffff',
  limit        : 5,
  include_docs : true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});
The pagination options for query() – i.e., startkey/endkey/key/keys/skip/limit/descending – are exactly the same as with allDocs(). For a guide to pagination, read the Bulk operations guide or Pagination strategies with PouchDB.
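If the 'P' to 'P\uffff' range looks mysterious, the sketch below shows why it matches every key that starts with 'P'. Plain JavaScript string comparison stands in for the database's key ordering here, which is only an approximation (real view collation follows its own rules), and inPrefixRange is an illustrative name.

```javascript
// A key falls in the range when it sorts at or after the prefix itself
// and at or before the prefix followed by a very high code point.
function inPrefixRange(key, prefix) {
  var startkey = prefix;
  var endkey = prefix + '\uffff';
  return key >= startkey && key <= endkey;
}

var names = ['Bulbasaur', 'Pidgey', 'Pikachu', 'Psyduck', 'Raichu'];
var startsWithP = names.filter(function (name) {
  return inPrefixRange(name, 'P');
});
// startsWithP: ['Pidgey', 'Pikachu', 'Psyduck']
```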
Reduce functions
As for reduce functions, there are a few handy built-ins that do aggregate operations ('_sum', '_count', and '_stats'), and you can typically steer clear of trying to write your own:
// emit the first letter of each pokemon's name
var myMapReduceFun = {
  map: function (doc) {
    emit(doc.name.charAt(0));
  },
  reduce: '_count'
};
// count the pokemon whose names start with 'P'
pouch.query(myMapReduceFun, {
  key: 'P', reduce: true, group: true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});
If you're adventurous, though, you should check out the CouchDB documentation or the PouchDB documentation for details on reduce functions.
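For intuition about what the built-ins compute, here is an in-memory approximation of '_count', '_sum', and '_stats' applied per key, which is roughly what group: true asks for. It is a sketch under simplifying assumptions (numeric values, string keys), not the real implementation, and the row values are arbitrary numbers chosen for the example.

```javascript
// Group emitted values by key.
function groupValues(rows) {
  var groups = {};
  rows.forEach(function (row) {
    (groups[row.key] = groups[row.key] || []).push(row.value);
  });
  return groups;
}

// Approximations of the built-in reduce functions (illustrative only).
var builtins = {
  _count: function (values) { return values.length; },
  _sum: function (values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  },
  _stats: function (values) {
    return {
      sum: builtins._sum(values),
      count: values.length,
      min: Math.min.apply(null, values),
      max: Math.max.apply(null, values),
      sumsqr: values.reduce(function (a, b) { return a + b * b; }, 0)
    };
  }
};

// Apply one built-in to each group and return one row per key.
function modelReduce(rows, name) {
  var groups = groupValues(rows);
  return Object.keys(groups).sort().map(function (key) {
    return { key: key, value: builtins[name](groups[key]) };
  });
}

// rows as a map function might emit them: first letter as key, a number as value
var rows = [
  { key: 'P', value: 25 },
  { key: 'P', value: 16 },
  { key: 'R', value: 26 }
];
var counts = modelReduce(rows, '_count'); // P -> 2, R -> 1
var sums = modelReduce(rows, '_sum');     // P -> 41, R -> 26
var stats = modelReduce(rows, '_stats');
```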
Avoiding map/reduce
The map/reduce API is complex. Part of this problem will be resolved when the more developer-friendly Cloudant query language is released in CouchDB 2.0, and the equivalent pouchdb-find plugin is finished.
pouchdb-find is in beta, but you may find it is already sufficient for simple queries. Eventually it will replace map/reduce as PouchDB’s “flagship” query engine.
In the meantime, there are a few tricks you can use to avoid unnecessarily complicating your codebase:
- Avoid the query() API altogether if you can. You'd be amazed how much you can do with just allDocs(). (In fact, under the hood, the query() API is simply implemented on top of allDocs()!)
- If your data is highly relational, try the relational-pouch plugin.
- Read the 12 tips for better code with PouchDB.
Related API documentation
Next
Now that we've learned how to map reduce, map reuse, and map recycle, let's move on to destroy() and compact().