Does CQLSSTableWriter support writing to multiple column families concurrently in one JVM instance?

Date: 2022-12-22 04:47:26

I'm running Cassandra 2.1.0 on the client side because 2.0.9 does not support concurrent writers on the same table. The cluster itself runs 2.0.9.

I can use concurrent CQLSSTableWriter objects for a single CF in one JVM instance. However, when I try to use two CQLSSTableWriter objects for two CFs (one writer per CF) in one JVM instance, I receive this error:

Exception in thread "Thread-2" java.lang.IllegalArgumentException: unconfigured columnfamily <the second column family>
at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:460)
at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.using(CQLSSTableWriter.java:391)
at CsvLoader.generateSSTables(CsvLoader.java:60)
at MultiThreadedCsvLoader$LoaderThread.run(MultiThreadedCsvLoader.java:93)
Caused by: org.apache.cassandra.exceptions.InvalidRequestException: unconfigured columnfamily avping_v2_file_sha2_id_idx
at org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily(ThriftValidation.java:115)
at org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:730)
at org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:724)
at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:437)
at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:449)
... 3 more

In short, the code I am running is:

CQLSSTableWriter writer1 = CQLSSTableWriter.builder().inDirectory("keyspace/cf_1").forTable(<cf_1 create statement>).using(<cf_1 insert statement>).build();
CQLSSTableWriter writer2 = CQLSSTableWriter.builder().inDirectory("keyspace/cf_2").forTable(<cf_2 create statement>).using(<cf_2 insert statement>).build();

The error occurs during the second call to using(). The program is multithreaded, but I restricted it to a single thread for debugging.

Are multiple CQLSSTableWriters for multiple CFs in one JVM instance currently supported? Am I using the API correctly?

The reason I am writing to multiple CFs is that I need to build the main table and also one or more index tables. sstableloader seems to be the recommended method for bulk loading. If CQLSSTableWriter doesn't support my use case, are there other decent ways to approach this problem, such as loading the main table first and then using a CQL client to iterate over the rows in the main CF and insert into the index? Or should I switch entirely to CQL BATCH?
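
To make the "main table plus index table" idea concrete, here is a minimal sketch that derives both rows from each source record in a single pass, so that each list could later be handed to its own writer or CQL client. It uses only the JDK. The class name `RowFanOut` and the column layout (`id`, `file_sha2`) are hypothetical, loosely guessed from the index name in the stack trace, and nothing here is part of the Cassandra API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the rows for a main table and for a
// manually maintained index table from the same source record.
final class RowFanOut {
    // Rows for the main table, assumed layout: (id, file_sha2).
    final List<Object[]> mainRows = new ArrayList<>();
    // Rows for the index table, assumed layout: (file_sha2, id).
    final List<Object[]> indexRows = new ArrayList<>();

    // Records one source row in both layouts.
    void accept(long id, String fileSha2) {
        mainRows.add(new Object[] { id, fileSha2 });
        indexRows.add(new Object[] { fileSha2, id });
    }
}
```

In a real loader the two lists would be replaced by whatever consumes the rows, for example one writer per table.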

The first test data set is tens of TB. The data is either in gzipped text files or in a Postgres database.
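
For the gzipped text files, a loader would typically stream the input line by line rather than decompress it to disk first. The following is a minimal JDK-only sketch of that step. The class name `GzipLineSource` is hypothetical, UTF-8 encoding is an assumption, and the `sink` callback stands in for whatever parses a line and passes it on to a writer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: streams a gzip-compressed text source line by line.
final class GzipLineSource {
    // Passes every decompressed line to sink and returns the number of lines read.
    static long forEachLine(InputStream gzipped, Consumer<String> sink) throws IOException {
        long count = 0;
        // Assumes the text is UTF-8; the reader is closed when the loop ends.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(gzipped), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line);
                count++;
            }
        }
        return count;
    }
}
```

Because only one line is held in memory at a time, the same loop works regardless of file size.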

1 answer

#1


Between building writer1 and writer2, you can insert the following to clear the keyspace definition:

import org.apache.cassandra.config.KSMetaData;
import org.apache.cassandra.config.Schema;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;
...
CQLSSTableWriter writer1 = CQLSSTableWriter.builder().inDirectory("keyspace/cf_1").forTable(<cf_1 create statement>).using(<cf_1 insert statement>).build();
// ... do your work with writer1 ...

// Remove the keyspace definition left behind by writer1 before building writer2
KSMetaData ksm = Schema.instance.getKSMetaData("keyspace");
Schema.instance.clearKeyspaceDefinition(ksm);

CQLSSTableWriter writer2 = CQLSSTableWriter.builder().inDirectory("keyspace/cf_2").forTable(<cf_2 create statement>).using(<cf_2 insert statement>).build();
// ... do your work with writer2 ...

It worked for me. Hope it helps.
