Using MapReduce in MongoDB
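MongoDB can use MapReduce to run more complex aggregation queries.
The map and reduce functions are written in JavaScript.
A MapReduce operation can be run either through db.runCommand or through the mapReduce helper: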
- db.runCommand(
- { mapreduce : <collection>,
- map : <mapfunction>,
- reduce : <reducefunction>
- [, query : <query filter object>]
- [, sort : <sort the query. useful for optimization>]
- [, limit : <number of objects to return from collection>]
- [, out : <output-collection name>]
- [, keeptemp: <true|false>]
- [, finalize : <finalizefunction>]
- [, scope : <object where fields go into javascript global scope >]
- [, verbose : true]
- }
- );
- // Or use the wrapper helper:
- db.collection.mapReduce(mapfunction,reducefunction[,options]);
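If out is not specified, the result is written to a temporary collection by default, and that collection is removed automatically once the client connection closes.
A simple example: a collection named user_addr whose documents have the following structure: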
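To make the optional parameters more concrete, here is a minimal sketch in plain JavaScript that models the order in which they apply: query, sort and limit narrow the input, map and reduce do the work, and finalize post-processes each result. It is only a model, not MongoDB's implementation. The simulateMapReduce helper and the sample documents are hypothetical, and query and sort are modeled as plain functions rather than the filter and sort objects MongoDB expects.

```javascript
// Plain-JavaScript model of the option order; this is not MongoDB code.
var emitted;                          // key -> array of emitted values
function emit(key, value) {           // stand-in for the emit() the server provides
  (emitted[key] = emitted[key] || []).push(value);
}

// simulateMapReduce is a hypothetical helper written for this sketch.
function simulateMapReduce(docs, map, reduce, options) {
  options = options || {};
  var input = docs.slice();
  if (options.query) input = input.filter(options.query);  // predicate here, filter object in MongoDB
  if (options.sort) input.sort(options.sort);              // comparator here, sort spec in MongoDB
  if (options.limit) input = input.slice(0, options.limit);

  emitted = {};
  input.forEach(function (doc) { map.call(doc); });         // map runs with the document as `this`

  var out = {};
  Object.keys(emitted).forEach(function (key) {
    var value = reduce(key, emitted[key]);
    out[key] = options.finalize ? options.finalize(key, value) : value;
  });
  return out;
}

// Hypothetical documents shaped like user_addr.
var docs = [
  { Uid: "u1@example.com", Al: [{ Em: "a@example.com" }, { Em: "b@example.com" }] },
  { Uid: "u2@example.com", Al: [{ Em: "a@example.com" }] },
  { Uid: "u3@example.com", Al: [{ Em: "a@example.com" }, { Em: "c@example.com" }] }
];

var counts = simulateMapReduce(
  docs,
  function () { this.Al.forEach(function (c) { emit(c.Em, 1); }); },
  function (key, vals) { return vals.reduce(function (s, v) { return s + v; }, 0); },
  {
    limit: 2,                                                      // only u1 and u2 are mapped
    finalize: function (key, value) { return { count: value }; }  // post-process each result
  }
);
console.log(JSON.stringify(counts));
```

Because limit is 2, the third document never reaches the map function, so c@example.com does not appear in the result.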
- db.user_addr.find({'Uid':'test@sohu.com'})
- { "_id" : ObjectId("4bbde0bf600ac3c3cc7245e3"), "Uid" : "test@sohu.com", "Al" : [
- {
- "Nn" : "test-1",
- "Em" : "test-1@sohu.com"
- },
- {
- "Nn" : "test-2",
- "Em" : "test-2@sohu.com"
- },
- {
- "Nn" : "test-3",
- "Em" : "test-3@sohu.com",
- }
- ] }
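Each document stores the contact list (Al) of one user (Uid). To count how many times each contact email (Em) occurs, define the following MapReduce:
- // Map: emit (email, 1) for every contact in the document's Al array
- m=function () {
- for (var index in this.Al) {
- emit(this.Al[index].Em, 1);
- }
- }
- // Reduce: add up the counts emitted for one email
- r=function (k, vals) {
- var sum = 0;
- for (var index in vals) {
- sum += vals[index];
- }
- return sum;
- }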
- res=db.user_addr.mapReduce(m,r)
- {
- "result" : "tmp.mr.mapreduce_1272267853_1",
- "timeMillis" : 29,
- "counts" : {
- "input" : 5,
- "emit" : 26,
- "output" : 26
- },
- "ok" : 1
- }
- db[res.result].find()
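The job above can be traced without a server. The following sketch runs the same map and reduce logic in plain JavaScript over two hypothetical documents (the u1/u2 addresses are made up for illustration) and builds results in the { _id, value } shape that documents in a MapReduce output collection have. The emit function here is a local stand-in for the one MongoDB provides.

```javascript
// Plain-JavaScript walk-through of the job above; runs without a MongoDB server.
var buckets = {};                       // key -> emitted values
function emit(key, value) {             // stand-in for the server-provided emit()
  (buckets[key] = buckets[key] || []).push(value);
}

// Map and reduce functions as defined in the article.
var m = function () {
  for (var index in this.Al) {
    emit(this.Al[index].Em, 1);
  }
};
var r = function (k, vals) {
  var sum = 0;
  for (var index in vals) {
    sum += vals[index];
  }
  return sum;
};

// Hypothetical sample documents with the same shape as user_addr.
var docs = [
  { Uid: "u1@example.com", Al: [{ Nn: "test-1", Em: "test-1@sohu.com" },
                                { Nn: "test-2", Em: "test-2@sohu.com" }] },
  { Uid: "u2@example.com", Al: [{ Nn: "test-1", Em: "test-1@sohu.com" }] }
];

// Map phase: the function runs once per document, with the document as `this`.
docs.forEach(function (doc) { m.call(doc); });

// Reduce phase: one output document per distinct key.
var output = Object.keys(buckets).sort().map(function (key) {
  return { _id: key, value: r(key, buckets[key]) };
});
console.log(JSON.stringify(output));
```

Here test-1@sohu.com is emitted twice and test-2@sohu.com once, so the two output documents carry the values 2 and 1.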
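MongoDB's group function is in effect a MapReduce-style operation as well, driven by a JavaScript reduce function.
For example, to group by Uid and count how many contacts each Uid has: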
- r=function (obj,prev) {
- prev.sum += obj.Al.length;
- }
- db.user_addr.group({key:{'Uid':1},reduce:r,initial:{sum:0}})
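What the group call does with key, reduce and initial can be modeled in a few lines of plain JavaScript. This is a sketch under assumptions, not MongoDB's implementation: simulateGroup is a hypothetical helper, it handles a single key field only, and the sample documents are made up.

```javascript
// Plain-JavaScript model of group({key, reduce, initial}); this is not MongoDB code.
// simulateGroup is a hypothetical helper that supports a single key field.
function simulateGroup(docs, keyField, reduce, initial) {
  var groups = {};
  docs.forEach(function (doc) {
    var k = doc[keyField];
    if (groups[k] === undefined) {
      // Each group starts from its own shallow copy of `initial`.
      var acc = {};
      acc[keyField] = k;
      for (var f in initial) acc[f] = initial[f];
      groups[k] = acc;
    }
    reduce(doc, groups[k]);             // reduce updates the accumulator in place
  });
  return Object.keys(groups).map(function (k) { return groups[k]; });
}

// Reduce function from the article: add the number of contacts in each document.
var r = function (obj, prev) {
  prev.sum += obj.Al.length;
};

// Hypothetical documents shaped like user_addr.
var docs = [
  { Uid: "u1@example.com", Al: [{ Em: "a@example.com" }, { Em: "b@example.com" }] },
  { Uid: "u1@example.com", Al: [{ Em: "c@example.com" }] },
  { Uid: "u2@example.com", Al: [{ Em: "a@example.com" }] }
];

var grouped = simulateGroup(docs, "Uid", r, { sum: 0 });
console.log(JSON.stringify(grouped));
```

Unlike the reduce function passed to mapReduce, this one returns nothing; it accumulates into prev, which is why initial has to supply the starting value of sum.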