Most efficient way to count relationships

For example, suppose you have a users table and a friends mapping table between users. Should you add an extra column to users called 'friends-count' and update it every time a user adds a friend, or should you just run a count query every time you need the count? Which one is more efficient?
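To make the two options concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the column name `friends_count` (an underscore replaces the hyphen so the name needs no quoting), and the helper functions are assumptions made for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        friends_count INTEGER NOT NULL DEFAULT 0  -- option A: stored counter
    );
    CREATE TABLE friends (
        user_id INTEGER NOT NULL REFERENCES users(id),
        friend_id INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (user_id, friend_id)
    );
""")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy")],
)
conn.commit()


def add_friend(conn, user_id, friend_id):
    """Insert the friendship and bump the stored counter in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
            (user_id, friend_id),
        )
        conn.execute(
            "UPDATE users SET friends_count = friends_count + 1 WHERE id = ?",
            (user_id,),
        )


def stored_count(conn, user_id):
    """Option A: read the extra column."""
    row = conn.execute(
        "SELECT friends_count FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


def live_count(conn, user_id):
    """Option B: count the mapping rows every time."""
    row = conn.execute(
        "SELECT COUNT(*) FROM friends WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


add_friend(conn, 1, 2)
add_friend(conn, 1, 3)
```

Both `stored_count(conn, 1)` and `live_count(conn, 1)` return the same number here only because every write goes through `add_friend`; any code path that inserts into `friends` directly would leave the column stale.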

2 solutions

#1

Why would you keep two versions of the truth? What would you do if they got out of sync, and how would you even detect that?
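One possible way to answer the "how would you detect that" part is a periodic reconciliation query that compares the stored counter with the real row count. The sketch below assumes a hypothetical `users.friends_count` column and `friends` mapping table, and deliberately seeds one wrong counter to show the check firing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        friends_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE friends (
        user_id INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, friend_id)
    );
""")
# bob's stored counter (5) is wrong on purpose: he has one friend row.
conn.executemany(
    "INSERT INTO users (id, name, friends_count) VALUES (?, ?, ?)",
    [(1, "ann", 2), (2, "bob", 5), (3, "cy", 0)],
)
conn.executemany(
    "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 1)],
)
conn.commit()

DRIFT_SQL = """
    SELECT u.id, u.friends_count, COUNT(f.friend_id) AS actual
    FROM users AS u
    LEFT JOIN friends AS f ON f.user_id = u.id
    GROUP BY u.id, u.friends_count
    HAVING u.friends_count <> COUNT(f.friend_id)
    ORDER BY u.id
"""


def find_drift(conn):
    """Return (user_id, stored, actual) for every counter that is out of sync."""
    return conn.execute(DRIFT_SQL).fetchall()


def repair(conn):
    """Recompute every stored counter from the mapping table."""
    with conn:
        conn.execute("""
            UPDATE users
            SET friends_count = (
                SELECT COUNT(*) FROM friends WHERE friends.user_id = users.id
            )
        """)


drift_before = find_drift(conn)
repair(conn)
drift_after = find_drift(conn)
```

Note that the check itself is a full count over the mapping table, which is exactly the maintenance cost this answer is pointing at.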

The most efficient way is to count each time. The support and validation effort needed to maintain a cache is likely to outweigh any computational benefit: CPU cycles are cheap compared to developer effort.

If counting ends up being a bottleneck in your app, then look at caching; at that point the caching has a tangible benefit.

#2

It depends. Contrary to what joocer said, counting every time is not always the more efficient option.

Counting every time may work when your system holds relationship data for a modest number of people and is queried at a rate that lets each query return before the next one arrives (the exact limits depend on how efficient your architecture is). For instance, if counting friendships takes 1 second and you receive 10 count requests per second, your system will collapse shortly.
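The arithmetic behind that example can be sketched as follows. The function names and the single-worker default are assumptions for illustration (the answer does not say how many queries run in parallel); the model simply compares the arrival rate with the rate at which queries can complete.

```python
def offered_load(arrivals_per_second, service_seconds, workers=1):
    """Fraction of capacity demanded; anything above 1.0 cannot be sustained."""
    return arrivals_per_second * service_seconds / workers


def backlog_after(seconds, arrivals_per_second, service_seconds, workers=1):
    """Requests still waiting after `seconds`, assuming steady arrivals."""
    capacity_per_second = workers / service_seconds
    growth_per_second = max(0.0, arrivals_per_second - capacity_per_second)
    return growth_per_second * seconds


# The figures from the answer: 1 second per count, 10 requests per second.
load = offered_load(10, 1.0)
waiting_after_minute = backlog_after(60, 10, 1.0)
```

With one query at a time, the load is ten times what the system can serve, and the queue grows by nine requests every second. Ten parallel workers would bring the load to exactly 1.0, which leaves no headroom at all.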

Running a count for each request on a large, frequently queried system such as Facebook is not really efficient.

Concurrency on the counter field can be handled through many established techniques (in the middle tier, in the front end, or in the database itself, depending on your architecture) without much extra work for the system, and that overhead does not depend much on the database size.
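As one illustration of handling this in the database, the sketch below contrasts a read-then-write update with a single atomic `UPDATE`. The interleaving of two clients is simulated by hand on one SQLite connection so the outcome is deterministic; the table and helper names are assumptions, and this is only one of the techniques the answer alludes to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "friends_count INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])
conn.commit()


def read_count(user_id):
    row = conn.execute(
        "SELECT friends_count FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


def write_count(user_id, value):
    with conn:
        conn.execute(
            "UPDATE users SET friends_count = ? WHERE id = ?", (value, user_id)
        )


def atomic_increment(user_id):
    """Let the database compute the new value inside one statement."""
    with conn:
        conn.execute(
            "UPDATE users SET friends_count = friends_count + 1 WHERE id = ?",
            (user_id,),
        )


# Unsafe pattern: clients A and B both read before either one writes.
seen_by_a = read_count(1)
seen_by_b = read_count(1)
write_count(1, seen_by_a + 1)
write_count(1, seen_by_b + 1)  # overwrites A's update
unsafe_result = read_count(1)

# Safe pattern: each client issues a single increment statement.
atomic_increment(2)
atomic_increment(2)
safe_result = read_count(2)
```

The read-then-write path loses one of the two increments, while the single-statement path keeps both, because the database applies each `UPDATE` to the current stored value rather than to a value the client read earlier.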

If you give more details about your architecture, you can get a better answer for your specific case.

Caching is another way of saying you store the count separately, much like an extra column (for example a database materialized view, which I would tend to suggest if your RDBMS supports it). Depending on the implementation, an external cache usually tends to be less efficient than storing the extra information directly in the RDBMS in an optimal way.
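SQLite, used here only because it ships with Python, has no materialized views, so the sketch below stands in for one with a summary table kept current by triggers. The table name `friend_counts`, the trigger names, and the helper are hypothetical; on an RDBMS with real materialized views the database would maintain or refresh the summary for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (
        user_id INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, friend_id)
    );

    -- Summary table playing the role of the materialized view.
    CREATE TABLE friend_counts (
        user_id INTEGER PRIMARY KEY,
        n INTEGER NOT NULL
    );

    CREATE TRIGGER friends_after_insert AFTER INSERT ON friends
    BEGIN
        INSERT OR IGNORE INTO friend_counts (user_id, n) VALUES (NEW.user_id, 0);
        UPDATE friend_counts SET n = n + 1 WHERE user_id = NEW.user_id;
    END;

    CREATE TRIGGER friends_after_delete AFTER DELETE ON friends
    BEGIN
        UPDATE friend_counts SET n = n - 1 WHERE user_id = OLD.user_id;
    END;
""")


def friend_count(user_id):
    """Read the precomputed count; users with no summary row have zero friends."""
    row = conn.execute(
        "SELECT n FROM friend_counts WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else 0


with conn:
    conn.executemany(
        "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
        [(1, 2), (1, 3), (2, 1)],
    )
    conn.execute("DELETE FROM friends WHERE user_id = 1 AND friend_id = 3")
```

Because the triggers run inside the same transaction as the write to `friends`, application code cannot forget to update the count, which addresses part of the "two versions of the truth" concern raised in the first answer.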
