Considering storing serialized Java objects as JSON in Cassandra. What are the drawbacks?

Time: 2021-07-05 04:50:42

I am using Cassandra 1.2.2. I find it very easy to use Jackson to map my objects between Java and JSON for storage in the database, and I am tempted to do this for all of my data. Is this a good idea? What are the disadvantages for my application? My first guess is more processing overhead, but is the juice worth the squeeze? Are there any other disadvantages I need to know about?

1 answer

#1


13  

One disadvantage is that to modify the data you have to read the original, deserialize it, make your change, serialize it, and write out the whole object. In Cassandra, writes are much more efficient than reads, so it is beneficial to avoid reads before writes where possible.

The alternative is to use separate columns for each field in your JSON. You can use composite columns for multi-dimensional data.

So if you had the data:

{
  "name": "fred",
  "address": "some town",
  "age": 42
}

and you wanted to change the address, then with these stored as separate Cassandra columns you'd just insert a column called address. With the JSON serialized into a single value you'd have to do much more work: read, deserialize, modify, serialize, and write back. This doesn't apply if your data is write-once.
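
The difference between the two update paths can be sketched with a toy in-memory model. This is only an illustration, not Cassandra or Jackson code: `CountingStore`, the `rowKey:field` cell naming, and the `k=v;k=v` stand-in for JSON are all assumptions made to keep the example dependency-free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpdateSketch {
    /** Hypothetical in-memory store that counts reads and writes. */
    static class CountingStore {
        final Map<String, String> cells = new LinkedHashMap<>();
        int reads = 0;
        int writes = 0;

        String read(String key) { reads++; return cells.get(key); }
        void write(String key, String value) { writes++; cells.put(key, value); }
    }

    // Blob layout: the whole object lives under one key, so changing one
    // field means read + deserialize + modify + serialize + write.
    static void updateAddressInBlob(CountingStore store, String rowKey, String newAddress) {
        Map<String, String> fields = parse(store.read(rowKey));
        fields.put("address", newAddress);
        store.write(rowKey, serialize(fields));
    }

    // Column layout: each field has its own cell, so changing one field
    // is a single write with no prior read.
    static void updateAddressInColumn(CountingStore store, String rowKey, String newAddress) {
        store.write(rowKey + ":address", newAddress);
    }

    // Stand-in for JSON deserialization (assumed "k=v;k=v" format).
    static Map<String, String> parse(String blob) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : blob.split(";")) {
            String[] kv = pair.split("=", 2);
            fields.put(kv[0], kv[1]);
        }
        return fields;
    }

    // Stand-in for JSON serialization (same assumed format).
    static String serialize(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) sb.append(';');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CountingStore blobStore = new CountingStore();
        blobStore.cells.put("fred", "name=fred;address=some town;age=42");
        updateAddressInBlob(blobStore, "fred", "new town");
        System.out.println("blob:   reads=" + blobStore.reads + " writes=" + blobStore.writes);

        CountingStore columnStore = new CountingStore();
        updateAddressInColumn(columnStore, "fred", "new town");
        System.out.println("column: reads=" + columnStore.reads + " writes=" + columnStore.writes);
    }
}
```

In this model the blob update costs one read plus one write of the full object, while the column update is a single small write.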

Even if your data is write-once, if you only want to read one field you can read just that column when the fields are stored separately, rather than reading the whole object and deserializing it. This only applies if you want to read parts of your data.
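
As a rough illustration of the partial-read point, the sketch below compares how many bytes must be fetched to get a single field under each layout. It is a toy comparison only: the method names are hypothetical, and real Cassandra reads carry additional per-column and per-row overhead that is ignored here.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartialReadSketch {
    // Blob layout: the whole serialized object must be fetched (and then
    // deserialized) even if only one field is needed.
    static int bytesFetchedFromBlob(String blob) {
        return blob.getBytes(StandardCharsets.UTF_8).length;
    }

    // Column layout: only the requested field's value is fetched.
    static int bytesFetchedFromColumn(Map<String, String> columns, String field) {
        return columns.get(field).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String blob = "{\"name\":\"fred\",\"address\":\"some town\",\"age\":42}";

        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", "fred");
        columns.put("address", "some town");
        columns.put("age", "42");

        System.out.println("blob bytes fetched for age:   " + bytesFetchedFromBlob(blob));
        System.out.println("column bytes fetched for age: " + bytesFetchedFromColumn(columns, "age"));
    }
}
```

The gap widens as objects grow, since the blob cost scales with the whole object while the column cost scales only with the field being read.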

In conclusion, there could be significant performance advantages to using separate columns if you have to update your data or if you only want to read parts of it at a time.
