Cassandra .csv import error: Batch too large

Time: 2022-09-15 14:03:12

I'm trying to import data from a .csv file into Cassandra 3.2.1 via the COPY command. The file contains only 299 rows with 14 columns. I get the error:


Failed to import 299 rows: InvalidRequest - code=2200 [Invalid query] message="Batch too large"


I used the following COPY command and tried to increase the batch size:


copy table (Col1, Col2, ...) from 'file.csv' with delimiter = ';' and header = true and MAXBATCHSIZE = 5000;


I would think 299 rows is not too many to import into Cassandra, or am I wrong?


1 Answer

#1



The error you're encountering is a server-side error message saying that the size (in terms of byte count) of your batch insert is too large.


This batch size is defined in the cassandra.yaml file:


# Log WARN on any batch size exceeding this value. 5kb per batch by default.
# Caution should be taken on increasing the size of this threshold as it can lead to node instability.
batch_size_warn_threshold_in_kb: 5

# Fail any batch exceeding this value. 50kb (10x warn threshold) by default.
batch_size_fail_threshold_in_kb: 50
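
As a rough back-of-the-envelope check against the default 50 KB fail threshold above, you can estimate how many rows fit in one batch from the average serialized row size. The byte counts below are illustrative assumptions, not measured values:

```python
# Rough estimate: how many rows fit under batch_size_fail_threshold_in_kb.
# AVG_ROW_BYTES is a hypothetical average serialized row size; measure your
# own data to get a realistic figure.

FAIL_THRESHOLD_KB = 50   # default batch_size_fail_threshold_in_kb
AVG_ROW_BYTES = 300      # assumed average row size in bytes

max_rows_per_batch = (FAIL_THRESHOLD_KB * 1024) // AVG_ROW_BYTES
print(max_rows_per_batch)  # → 170
```

With wide rows (14 columns of large text values, say), the per-row size can easily be much higher, which is why even 299 rows can trip the threshold.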

If you insert a lot of large columns (in terms of size), you may quickly reach this threshold. Try reducing MAXBATCHSIZE to 200.
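
For example, the original command could be retried with a smaller MAXBATCHSIZE. A sketch, where `table`, `Col1`, `Col2` and `file.csv` stand in for the asker's actual names:

```sql
-- cqlsh COPY with a reduced batch size so each batch stays under the
-- batch_size_fail_threshold_in_kb limit.
COPY table (Col1, Col2) FROM 'file.csv'
  WITH DELIMITER = ';' AND HEADER = true AND MAXBATCHSIZE = 200;
```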


More info on COPY options can be found in the cqlsh COPY documentation.

