Should I create a separate SQL Server database for each user?

Time: 2023-01-17 18:10:56

I am working on an ASP.NET MVC web application; the back end is SQL Server 2012.

This application will provide billing, accounting and inventory management. Users will create an account by signing up, just like http://www.quickbooks.in. Each user will create some master records and various transactions. There is no limit; a user can create an unlimited number of records in the database.

I want to keep database performance stable under heavy data load. I am maintaining proper indexes and primary keys, but there will be a heavy load on the database per user.

So, should I create a separate database for each user, or should I maintain one database, add a UserID column to each table, and partition on UserID?

I am not an expert in SQL Server, so please give suggestions with clear specifics.

Please let me know if any information is missing.

2 Answers

#1


1  

A database per user is what you use when customers need to be able to pack up and leave, taking the actual database with them. Think of a self-hosted WordPress website. It also makes sense if there is an extreme risk in one user accidentally seeing another user's data, so that it is safer to rely on the server's security model than on remembering to add the UserId filter to every query. I can't imagine a scenario like that, but who knows; if privacy laws allowed for jail time, I would rather have the data partitioned by security rules than by carefully written WHERE clauses.
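To make the shared-database option concrete, here is a minimal sketch of keeping the UserID filter in one place instead of repeating it in every query. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and the Invoices table, its columns, and the helper names are hypothetical, not taken from the question.

```python
import sqlite3

def make_shared_db():
    """Create one in-memory database shared by all users (stand-in for SQL Server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Invoices ("
        " InvoiceID INTEGER PRIMARY KEY,"
        " UserID INTEGER NOT NULL,"
        " Amount REAL NOT NULL)"
    )
    # Index led by UserID so per-user lookups do not scan other users' rows.
    conn.execute("CREATE INDEX IX_Invoices_UserID ON Invoices (UserID, InvoiceID)")
    return conn

def add_invoice(conn, user_id, amount):
    """Insert one invoice row owned by user_id."""
    conn.execute(
        "INSERT INTO Invoices (UserID, Amount) VALUES (?, ?)", (user_id, amount)
    )

def invoices_for_user(conn, user_id):
    """The only read path: the UserID filter lives here, not in each caller."""
    rows = conn.execute(
        "SELECT InvoiceID, Amount FROM Invoices WHERE UserID = ? ORDER BY InvoiceID",
        (user_id,),
    )
    return rows.fetchall()

conn = make_shared_db()
add_invoice(conn, 1, 100.0)
add_invoice(conn, 2, 250.0)
add_invoice(conn, 1, 75.5)
print(invoices_for_user(conn, 1))  # [(1, 100.0), (3, 75.5)]
```

The isolation here is only as strong as the discipline of routing every query through such a helper, which is exactly the risk described above.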

If you did go database-per-user, creating a new user would be roughly 10x more effort. INSERT, UPDATE and so on stay the same from version to version, but the syntax for creating databases and users, granting permissions and so on can change between SQL Server versions, so those provisioning scripts may break on an upgrade.

Also, this will multiply your migration headaches by the number of users. Say you have 5000 users and you need to add some new columns, change a column's data type, update a trigger, and so on. Instead of running that change script once, you need to run it 5000 times.
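A rough sketch of what "run it once per database" looks like in practice is below. It again assumes Python's sqlite3 as a stand-in for SQL Server, uses 50 in-memory databases instead of 5000 to keep it quick, and the Invoices table, the Currency column, and the function names are made up for illustration.

```python
import sqlite3

# Hypothetical change script: add one column to an existing table.
CHANGE_SCRIPT = "ALTER TABLE Invoices ADD COLUMN Currency TEXT DEFAULT 'INR'"

def make_user_db():
    """One in-memory database per user, each with the same starting schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Invoices (InvoiceID INTEGER PRIMARY KEY, Amount REAL)")
    return conn

def migrate_all(databases, script):
    """Apply the same change script to every database; collect failures by user."""
    failed = {}
    for user_id, conn in databases.items():
        try:
            conn.execute(script)
            conn.commit()
        except sqlite3.Error as exc:
            # Keep going so one broken database does not block the rest.
            failed[user_id] = str(exc)
    return failed

def column_names(conn, table):
    """Return the column names of a table (second field of PRAGMA table_info)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

user_dbs = {user_id: make_user_db() for user_id in range(1, 51)}
failures = migrate_all(user_dbs, CHANGE_SCRIPT)
print(len(user_dbs), "databases migrated,", len(failures), "failures")
```

Collecting failures per user rather than stopping at the first error is a design choice: with thousands of databases, some will probably be in an unexpected state, and you need a list of which ones to revisit.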

Per-user databases also probably waste disk space. Each of those databases will have its own transaction log, sitting idle and taking up at least the minimum log space.

As for load, if your 5000 users are collectively doing 1 billion inserts, updates and so on per day, my intuition tells me it will be faster on one database, unless there is some sort of contention issue (everyone reading and writing the same pages of the same table at the same time). Each database consumes machine resources (probably threads and memory) for housekeeping, so these extra databases can't be free.

Anyhow, the best thing to do is to prototype both architectures, use a random data generator to simulate load, and see how they perform.
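As a starting point, such a test harness could be shaped roughly like the sketch below. It is only an illustration: it uses Python's sqlite3 with in-memory databases as a stand-in, so the timings it prints say nothing about SQL Server. The Txn table, the workload sizes, and the function names are all assumptions; a real test would point the same harness at actual SQL Server instances with realistic data volumes.

```python
import random
import sqlite3
import time

SCHEMA_SHARED = (
    "CREATE TABLE Txn (TxnID INTEGER PRIMARY KEY, UserID INTEGER NOT NULL, Amount REAL)"
)
SCHEMA_PER_USER = "CREATE TABLE Txn (TxnID INTEGER PRIMARY KEY, Amount REAL)"

def random_workload(n_users, n_rows, seed=42):
    """Generate (user_id, amount) pairs; the fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [
        (rng.randint(1, n_users), round(rng.uniform(1, 1000), 2))
        for _ in range(n_rows)
    ]

def run_shared(workload):
    """Architecture A: one database, every row tagged with UserID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA_SHARED)
    conn.execute("CREATE INDEX IX_Txn_UserID ON Txn (UserID)")
    start = time.perf_counter()
    for user_id, amount in workload:
        conn.execute("INSERT INTO Txn (UserID, Amount) VALUES (?, ?)", (user_id, amount))
    conn.commit()
    elapsed = time.perf_counter() - start
    total = conn.execute("SELECT COUNT(*) FROM Txn").fetchone()[0]
    return total, elapsed

def run_per_user(workload, n_users):
    """Architecture B: one database per user, no UserID column needed."""
    dbs = {}
    for user_id in range(1, n_users + 1):
        dbs[user_id] = sqlite3.connect(":memory:")
        dbs[user_id].execute(SCHEMA_PER_USER)
    start = time.perf_counter()
    for user_id, amount in workload:
        dbs[user_id].execute("INSERT INTO Txn (Amount) VALUES (?)", (amount,))
    for conn in dbs.values():
        conn.commit()
    elapsed = time.perf_counter() - start
    total = sum(c.execute("SELECT COUNT(*) FROM Txn").fetchone()[0] for c in dbs.values())
    return total, elapsed

workload = random_workload(n_users=20, n_rows=5000)
shared_total, shared_secs = run_shared(workload)
per_user_total, per_user_secs = run_per_user(workload, n_users=20)
print(f"shared:   {shared_total} rows in {shared_secs:.4f}s")
print(f"per-user: {per_user_total} rows in {per_user_secs:.4f}s")
```

Feeding both architectures the identical, seeded workload is what makes the comparison meaningful; reads, updates, and concurrent sessions would need to be added before drawing any conclusion.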

#2


1  

It's not an easy answer to give.

First, there is logical design to consider. Then you have integrity, security, management and performance (in that order).

A database is a self-contained logical unit of data. Ideally, you should be able to take a database, move it to another instance, change the connection strings if necessary, and be running again. All constraints are database-level: no foreign key can reference an object outside the database. So, try thinking in these terms first.

How would you reliably prevent one user from messing up another user's data? Keep in mind that it's just a matter of time before someone opens an Excel sheet and fires up queries against the database, bypassing your application. Row-level security in SQL Server is something you don't want to deal with.

Multiple databases mean that all management tasks should be scripted and executed on every database. Yes, there is some overhead to it, but once you set it up it's just a matter of monitoring. If a database goes suspect, a single customer is down, not all of them. You can even run different versions for different customers if each customer has its own database. Additionally, if you roll out an upgrade, you can do it per customer, so the impact will be much smaller.
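The per-customer upgrade idea can be sketched as follows, under the assumption that each database records its own schema version. The example uses Python's sqlite3 and SQLite's `PRAGMA user_version` as a stand-in; on SQL Server you would more likely keep the version in a small table of your own. The customer names, the Items table, and the upgrade steps are hypothetical.

```python
import sqlite3

# Hypothetical upgrade steps, keyed by the schema version they produce.
UPGRADES = {
    1: "CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Name TEXT)",
    2: "ALTER TABLE Items ADD COLUMN Quantity INTEGER DEFAULT 0",
}

def schema_version(conn):
    """Read the version stored inside the database itself."""
    return conn.execute("PRAGMA user_version").fetchone()[0]

def upgrade_to(conn, target):
    """Apply each pending step in order and record the new version."""
    for version in range(schema_version(conn) + 1, target + 1):
        conn.execute(UPGRADES[version])
        conn.execute(f"PRAGMA user_version = {int(version)}")
    conn.commit()

customers = {name: sqlite3.connect(":memory:") for name in ("acme", "globex", "initech")}
for conn in customers.values():
    upgrade_to(conn, 1)

# Roll version 2 out to a single customer first; the others stay on version 1.
upgrade_to(customers["acme"], 2)
print({name: schema_version(conn) for name, conn in customers.items()})
# {'acme': 2, 'globex': 1, 'initech': 1}
```

Because each database knows its own version, a failed or postponed upgrade affects only that customer, and the application can check the version before using newer features.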

Performance is the least relevant factor here. Of course, it really depends on how many customers and how much data, but proper indexing will solve these issues. Scale-out is much easier with multiple databases.

BTW, partitioning, as you mentioned it, is never a performance booster; it's simply a management feature that allows faster loading and evicting of data from a table.

I'd probably put each customer in a separate database, but in the end the decision is yours. Hope this helps.
