mapreduce - SELECT COUNT(DISTINCT *) in CouchDB
I have documents in the following format:
{ "_id": "some_doc_id", "user": "some_user_id", "date": "2015-09-15", … }
It's possible to have multiple documents for the same user. I'd like to count how many distinct users there are between two dates, e.g. between '2015-09-15' and '2015-09-25' there are 745 different users.
In SQL, I would write this query:
SELECT COUNT(DISTINCT user) FROM documents WHERE date BETWEEN '2015-09-15' AND '2015-09-25'
Thank you.
I would use a map function like this:
function (doc) { emit(doc.date, doc.user); }
This emits the documents sorted by date, with the user as the value. The reduce function looks like this:
function (keys, values, rereduce) {
  if (rereduce) {
    // values is a list of already-reduced { user: count } objects: merge them
    return values.reduce(function (acc, index) {
      return Object.keys(index).reduce(function (acc, user) {
        acc[user] = (acc[user] || 0) + index[user];
        return acc;
      }, acc);
    }, {});
  } else {
    // values is a list of user ids emitted by the map function: count each one
    return values.reduce(function (acc, user) {
      if (!(user in acc)) acc[user] = 0;
      acc[user]++;
      return acc;
    }, {});
  }
}
It's a custom reduce function, so bear with me for a brief explanation. The normal case (the second branch, not rereduce) counts the values it finds. The result is an object like { some_user_id: 1 }.
The rereduce branch takes several of those reduced objects and merges them (summing their counts) into one reduced result. (You can read more about reduce and rereduce here.)
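To make the two branches concrete, here is a minimal sketch that runs the same reduce logic outside CouchDB (written with `Object.keys`). The function name `reduceUsers`, the user ids, and the way the values are split into two chunks are all made up for illustration; in CouchDB the server decides how values are grouped before rereduce.

```javascript
// Same logic as the reduce function above, runnable in plain Node.js.
function reduceUsers(keys, values, rereduce) {
  if (rereduce) {
    // Merge several { user: count } objects into one.
    return values.reduce(function (acc, index) {
      return Object.keys(index).reduce(function (acc, user) {
        acc[user] = (acc[user] || 0) + index[user];
        return acc;
      }, acc);
    }, {});
  }
  // Count how often each emitted user id appears.
  return values.reduce(function (acc, user) {
    acc[user] = (acc[user] || 0) + 1;
    return acc;
  }, {});
}

// Hypothetical emitted values, split into two chunks.
const chunkA = ["alice", "bob", "alice"];
const chunkB = ["bob", "carol"];

const partialA = reduceUsers(null, chunkA, false); // { alice: 2, bob: 1 }
const partialB = reduceUsers(null, chunkB, false); // { bob: 1, carol: 1 }

// Rereduce merges the partial results.
const merged = reduceUsers(null, [partialA, partialB], true);
console.log(merged); // { alice: 2, bob: 2, carol: 1 }
```

The keys of the merged object are the distinct users; the counts are how many documents each user has in the reduced range.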
From there, you can query the view with the following query parameters:
start_key="2015-09-15" end_key="2015-09-25"
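As a rough sketch, these parameters could be encoded into a request URL as follows. The database name `mydb`, design document `users`, and view name `by_date` are placeholders, not names from the question; the keys are JSON-encoded because CouchDB expects view keys as JSON values.

```javascript
// Build a view query URL for a date range.
// baseUrl is a hypothetical view endpoint; adjust it to your own setup.
function buildViewUrl(baseUrl, startDate, endDate) {
  const params = new URLSearchParams({
    start_key: JSON.stringify(startDate), // becomes "2015-09-15" with quotes
    end_key: JSON.stringify(endDate),
  });
  return baseUrl + "?" + params.toString();
}

const url = buildViewUrl(
  "http://localhost:5984/mydb/_design/users/_view/by_date",
  "2015-09-15",
  "2015-09-25"
);
console.log(url);
```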
The query returns the same kind of result shown earlier (i.e. { some_user_id: 1 }). On the client, you can count the keys in the resulting object to find out how many unique users there are in the given date range.
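The client-side count might look like the sketch below. It assumes the usual shape of a reduced view response, a `rows` array with a single row whose `value` is the reduced object; the helper name `countDistinctUsers` and the sample data are hypothetical.

```javascript
// Count the distinct users in a reduced view response.
function countDistinctUsers(viewResponse) {
  // No rows means no documents fell inside the requested range.
  if (!viewResponse.rows || viewResponse.rows.length === 0) {
    return 0;
  }
  return Object.keys(viewResponse.rows[0].value).length;
}

// Hypothetical response body, already parsed from JSON.
const sampleResponse = {
  rows: [{ key: null, value: { alice: 2, bob: 2, carol: 1 } }],
};

console.log(countDistinctUsers(sampleResponse)); // 3
console.log(countDistinctUsers({ rows: [] })); // 0
```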