Does the number of tables in a Redshift database affect performance?

Date: 2020-12-13 23:05:29

Apart from the total size of my data, does it make a difference whether I keep the data in a large number of tables or consolidate it into one large table? Do tiny tables with almost no data affect performance?

1 answer

#1


As with any database, a query against a table with more data runs slower because there is more data to process. This is especially true of Amazon Redshift because it does not use indexes. Instead, it scans the data, although a SORTKEY can speed up WHERE clauses by letting Redshift skip blocks whose value range cannot match the filter.
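To see why sorting helps a filter, here is a toy model in Python, not Redshift itself. It assumes each block of rows carries min/max metadata for the filter column, and that a block can be skipped when its range does not overlap the WHERE condition. The block size of 100 rows and the function names are made up for illustration (real Redshift blocks are 1 MB).

```python
import random

BLOCK_SIZE = 100  # rows per block in this toy model (hypothetical value)


def build_zone_map(values, block_size=BLOCK_SIZE):
    """Split values into fixed-size blocks and record each block's (min, max)."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(b), max(b)) for b in blocks]


def blocks_to_scan(zone_map, low, high):
    """Count blocks whose [min, max] range overlaps the filter low <= x <= high."""
    return sum(1 for lo, hi in zone_map if hi >= low and lo <= high)


values = list(range(10_000))             # stand-in for dates or timestamps
sorted_map = build_zone_map(values)      # rows stored in sort-key order

random.seed(0)
shuffled = values[:]
random.shuffle(shuffled)
unsorted_map = build_zone_map(shuffled)  # same rows, no useful ordering

# Same filter against both layouts: WHERE x BETWEEN 9000 AND 9099
print("sorted:  ", blocks_to_scan(sorted_map, 9_000, 9_099), "of", len(sorted_map))
print("unsorted:", blocks_to_scan(unsorted_map, 9_000, 9_099), "of", len(unsorted_map))
```

With the sorted layout the filter touches a single block. With the shuffled layout nearly every block's range overlaps the filter, so almost all of them have to be read.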

Some Amazon Redshift users keep data in separate tables, such as one table per month. They then create a VIEW that combines the 12 monthly tables. This way, a query can run against only the current month's table (which is faster) or across all 12 months of data through the view.
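A minimal sketch of the monthly-tables-plus-view pattern follows. The answer does not say how the view is built, so this assumes the usual approach of stacking the tables with UNION ALL. The table names (sales_2020_01 and so on), the columns, and the helper union_all_view_sql are hypothetical, and SQLite stands in for Redshift so the example runs anywhere.

```python
import sqlite3

# Hypothetical monthly table names; the original answer does not name any.
MONTHS = [f"sales_2020_{m:02d}" for m in range(1, 13)]


def union_all_view_sql(view_name, tables):
    """Build a CREATE VIEW statement that stacks the tables with UNION ALL."""
    selects = "\nUNION ALL\n".join(f"SELECT * FROM {t}" for t in tables)
    return f"CREATE VIEW {view_name} AS\n{selects}"


# SQLite is only a stand-in for Redshift in this sketch.
conn = sqlite3.connect(":memory:")
for i, table in enumerate(MONTHS, start=1):
    conn.execute(f"CREATE TABLE {table} (sale_date TEXT, amount INTEGER)")
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (f"2020-{i:02d}-15", i * 10))

conn.execute(union_all_view_sql("sales_2020", MONTHS))

# Query only the current month's table (one small table is read)...
december_total = conn.execute("SELECT SUM(amount) FROM sales_2020_12").fetchone()[0]
# ...or the whole year through the view.
year_total = conn.execute("SELECT SUM(amount) FROM sales_2020").fetchone()[0]
print(december_total, year_total)  # prints "120 780"
```

The generated statement is plain SQL, so the same text could be adapted for Redshift, where the monthly tables would typically also declare a SORTKEY on the date column.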

A table, big or small, will not affect a query that does not involve it.
