Node.js MongoDB driver doesn't split bulk inserts?

Posted: 2023-01-03 02:57:47

I'm trying to insert about 100k documents with a single collection.insert call, using the standard MongoDB driver for Node.js:

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost/testdb', function(err, db) {
    var collection = db.collection('testcollection');
    var docs = [];

    var doc = {
        str: 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam sit amet urna consequat quam pharetra sagittis vitae at nulla. Suspendisse non felis sollicitudin, condimentum urna eu, congue massa. Nam arcu dui, sodales eget auctor nec, ullamcorper in turpis. Praesent sit amet purus mi. Mauris egestas sapien magna, a mattis tellus luctus et. Suspendisse potenti. Nam posuere neque at vulputate ornare. Nunc mollis lorem est, at porttitor augue sodales sed. Ut dui sapien, fermentum eu laoreet sed, sodales et augue. Aliquam erat volutpat.'
    };

    for (var i = 0; i < 100000; i++) {
        docs[i] = doc;
    }

    collection.insert(docs, function(err) {
        if (err) throw err;
        db.close();
    });
});

However, I get the following error:

/var/node/testproject/node_modules/mongodb/lib/mongodb/connection/base.js:242
        throw message;
              ^
Error: Document exceeds maximum allowed bson size of 16777216 bytes
    at InsertCommand.toBinary (/var/node/testproject/node_modules/mongodb/lib/mongodb/commands/insert_command.js:86:11)
    at Connection.write (/var/node/testproject/node_modules/mongodb/lib/mongodb/connection/connection.js:230:42)
    at __executeInsertCommand (/var/node/testproject/node_modules/mongodb/lib/mongodb/db.js:1857:14)
    at Db._executeInsertCommand (/var/node/testproject/node_modules/mongodb/lib/mongodb/db.js:1930:5)
    at insertAll (/var/node/testproject/node_modules/mongodb/lib/mongodb/collection/core.js:205:13)
    at Collection.insert (/var/node/testproject/node_modules/mongodb/lib/mongodb/collection/core.js:35:3)
    at /var/node/testproject/dbtest.js:15:16
    at /var/node/testproject/node_modules/mongodb/lib/mongodb/mongo_client.js:431:11
    at process._tickCallback (node.js:664:11)

Since the individual documents are clearly smaller than 16 MB, and given the stack trace, it seems the driver doesn't split up the command automatically. How do I fix this, preferably without coding it myself?

1 Answer

#1 (score: 5)
I was asking the questions to clarify what you were doing, and as suspected the documents array does exceed 16 MB: 100,000 documents of roughly 550 bytes each comes to well over 50 MB.

You seemed to expect that 16 MB was the limit per document, but what you were not expecting is that your whole request is itself a BSON document.

This is part of the MongoDB wire protocol, so batch requests are subject to the same limitation: your entire submission cannot exceed 16 MB in size.
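To see why the whole request trips the limit, here is a rough sanity check. This is my own sketch, not a driver feature, and it uses JSON string length as a crude proxy for BSON size; the real BSON encoding differs slightly, but the order of magnitude is the same:

```javascript
// Estimate the size of a batch before sending it. JSON length is only
// an approximation of BSON size, but it is enough to show the problem.
function approxBatchBytes(docs) {
    var total = 0;
    for (var i = 0; i < docs.length; i++) {
        total += JSON.stringify(docs[i]).length;
    }
    return total;
}

// 100k copies of a ~550-byte document, as in the question:
var doc = { str: new Array(551).join('x') };
var docs = [];
for (var i = 0; i < 100000; i++) docs.push(doc);

// approxBatchBytes(docs) comes to tens of megabytes, well above the
// 16777216-byte (16 MB) limit reported in the error.
```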

If you look at the runCommand pages for the insert and update operations in the MongoDB 2.6 documentation, this is clearly stated.

So in essence, this is not a bug. You need to break up large batch requests and keep each one under the 16 MB limit.
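One way to break the request up, assuming you stay on the legacy callback-style driver from the question, is to split the array into fixed-size batches and insert them one after another. The chunk helper and the batch size of 1000 are my own choices, not part of the driver; 1000 documents of a few hundred bytes each stays comfortably under 16 MB:

```javascript
// Split an array into batches of at most `size` elements.
function chunk(array, size) {
    var batches = [];
    for (var i = 0; i < array.length; i += size) {
        batches.push(array.slice(i, i + size));
    }
    return batches;
}

// Usage with the driver from the question (requires a running mongod):
//
// var batches = chunk(docs, 1000);
// (function insertNext(n) {
//     if (n >= batches.length) return db.close();
//     collection.insert(batches[n], function(err) {
//         if (err) throw err;
//         insertNext(n + 1);
//     });
// })(0);
```

Inserting the batches sequentially, rather than firing them all at once, also avoids queueing dozens of multi-megabyte requests in memory at the same time.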
