MongoDB - GridFS Source Code Analysis

Shared by LinuxBoy on 2019-03-30 12:03:45

The database supports native storage of binary data within BSON objects. However, BSON objects in MongoDB are limited in size (4MB older versions, 16MB in v1.7/1.8, higher limits in the future). The GridFS spec provides a mechanism for transparently dividing a large file among multiple documents. This allows us to efficiently store large objects, and in the case of especially large files, such as videos, permits range operations (e.g., fetching only the first N bytes of a file).

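The passage above is the introduction to GridFS from the official site. GridFS provides a way to split one large file into multiple documents for storage. When a file is inserted into GridFS, two collections, fs.files and fs.chunks, are used by default to hold it: fs.files stores the file's metadata, and fs.chunks stores the file's data.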

> db.fs.files.findOne()
{
    "_id" : ObjectId("4fbfae0ad417a4f2bc5bc6d1"),
    "filename" : "./Makefile", // file name
    "chunkSize" : 262144, // chunk size; fixed for the file, defaults to 256*1024
    "uploadDate" : ISODate("2012-05-25T16:06:34.794Z"), // upload date
    "md5" : "f9eae9d5987644a537862ca3707ff59d", // MD5 of the file
    "length" : 130 // file length in bytes
}
> db.fs.chunks.findOne()
{
    "_id" : ObjectId("4fbfae0a4d460742f1aa76dc"),
    "files_id" : ObjectId("4fbfae0ad417a4f2bc5bc6d1"), // the _id of the matching document in fs.files
    "n" : 0, // index of this chunk within the file, counted from 0; a file larger than chunkSize in fs.files is split into several chunks
    "data" : BinData(0,"QWxsOgoJZysrIG1haW4uY3BwIC1ML3Vzci9sb2NhbC9saWIvIC1JL3Vzci9sb2NhbC9pbmNsdWRlIC1sbW9uZ29jbGllbnQgLWxib29zdF90aHJlYWQgLWxib29zdF9maWxlc3lzdGVtIC1sYm9vc3RfcHJvZ3JhbV9vcHRpb25zCg==") // binary content of the file
}
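
The relationship between length, chunkSize and n in the documents above can be sketched in plain C++, without the driver. This is only an illustration under stated assumptions, not the driver's own implementation: Chunk, chunkCount, splitIntoChunks and chunksForRange are hypothetical names introduced here, and the sketch does not cover how a real driver treats an empty file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one fs.chunks document (not a driver type).
struct Chunk {
    std::size_t n;     // zero-based chunk index, like the "n" field
    std::string data;  // payload bytes, like the "data" field
};

// Number of chunks needed for a file of `length` bytes.
std::size_t chunkCount(std::size_t length, std::size_t chunkSize) {
    assert(chunkSize > 0);
    return (length + chunkSize - 1) / chunkSize;  // ceiling division
}

// Split a buffer into consecutive chunks of at most `chunkSize` bytes.
// Only the last chunk may be shorter than `chunkSize`.
std::vector<Chunk> splitIntoChunks(const std::string& bytes, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<Chunk> chunks;
    std::size_t n = 0;
    for (std::size_t offset = 0; offset < bytes.size(); offset += chunkSize, ++n) {
        chunks.push_back(Chunk{n, bytes.substr(offset, chunkSize)});
    }
    return chunks;
}

// Inclusive range [first, last] of chunk indexes covering the bytes
// [offset, offset + count). The caller must pass count > 0.
std::pair<std::size_t, std::size_t> chunksForRange(std::size_t offset,
                                                   std::size_t count,
                                                   std::size_t chunkSize) {
    assert(chunkSize > 0 && count > 0);
    return std::make_pair(offset / chunkSize, (offset + count - 1) / chunkSize);
}
```

For the sample file above, chunkCount(130, 262144) is 1, which matches the single fs.chunks document with n equal to 0. chunksForRange shows why range reads are cheap: reading the first N bytes only needs the chunks whose n falls inside the returned range, not the whole file.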

The test code is shown below:

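#include <iostream>
#include <memory>
// Legacy MongoDB C++ driver headers; the exact paths are an assumption
// and may differ between driver versions.
#include "mongo/client/dbclient.h"
#include "mongo/client/gridfs.h"

using namespace std;
using namespace mongo;

int main(int argc, const char** argv)
{
    try {
        DBClientConnection conn;
        conn.connect("10.15.107.154:20000");  // throws if the connection fails
        // Stack object instead of `new`, so nothing leaks on return.
        // Files are stored in TestGF.fs.files and TestGF.fs.chunks.
        GridFS gridFs(conn, "TestGF");
#if 1
        ///< Store a file
        gridFs.storeFile("./Makefile");
#endif
#if 0
        ///< List files (loop over every result, not just the first)
        auto_ptr<DBClientCursor> cursor = gridFs.list();
        while (cursor->more()) {
            BSONObj obj = cursor->next();
            cout << obj.toString() << endl;
        }
#endif
#if 0
        ///< Read a file back and write it to local disk
        GridFile file = gridFs.findFile("./Makefile");
        if (file.exists()) {
            file.write("./hello");
        }
#endif
#if 0
        ///< Remove a file
        gridFs.removeFile("./main.cpp");
#endif
    } catch (const DBException& e) {
        cerr << "MongoDB error: " << e.what() << endl;
        return 1;
    }
    return 0;
}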
