What is the best primary key strategy for an online/offline multi-client mobile application with SQLite and Azure SQL database as the central store?

Time: 2021-08-25 16:30:47

What primary key strategy would be best to use for a relational database model given the following?

  • tens of thousands of users
  • multiple clients per user (phone, tablet, desktop)
  • millions of rows per table (continually growing)

Azure SQL will be the central data store, exposed via a Web API. The clients will include a web application and a number of native apps (iOS, Android, Mac, Windows 8, etc.). The web application will require an “always on” connection and will not have a local data store; instead it will retrieve and update data via the API (think CRUD over a RESTful API).

All other clients (phone, tablet, desktop) will have a local db (SQLite). On first use of this type of client the user must authenticate and sync. Once authenticated and synced, these clients can operate in an offline mode (creating, deleting and updating records in the local SQLite db). These changes will eventually sync with the Azure backend.

The distributed nature of the databases leaves us with a primary key problem, which is the reason for asking this question.

Here is what we have considered thus far:

GUID

Each client creates its own keys. On sync, there is a very small chance of a duplicate key, but we would need to account for it by writing functionality into each client to update all relationships with a new key. GUIDs are big (16 bytes each), and when multiple foreign keys per table are considered, storage may become an issue over time. Likely the biggest problem is the random nature of GUIDs, which means that they cannot (or should not) be used as the clustered index due to fragmentation. This means we would need to create a clustered index (perhaps arbitrary) for each table.
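To make the GUID option concrete, here is a minimal sketch of a client generating its own keys offline in SQLite. The `project` and `task` tables and the `new_key` helper are hypothetical names chosen for illustration, not part of our actual schema. It also shows the storage point: a GUID takes 16 bytes as a blob and 36 characters in its usual text form, and that cost repeats in every foreign key column.

```python
import sqlite3
import uuid


def new_key() -> bytes:
    """Return a random (version 4) GUID as 16 raw bytes."""
    return uuid.uuid4().bytes


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables; every key column is a 16-byte blob.
conn.execute("CREATE TABLE project (id BLOB PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE task (
           id BLOB PRIMARY KEY,
           project_id BLOB NOT NULL REFERENCES project(id),
           title TEXT NOT NULL)"""
)

# The client creates parent and child rows without contacting the server.
project_id = new_key()
conn.execute("INSERT INTO project VALUES (?, ?)", (project_id, "Offline notes"))
task_id = new_key()
conn.execute("INSERT INTO task VALUES (?, ?, ?)", (task_id, project_id, "Write draft"))

# Size of one key in each representation.
guid_blob_size = len(project_id)
guid_text_size = len(str(uuid.UUID(bytes=project_id)))
```

Because the keys are random, consecutive inserts land at unrelated positions in any index ordered by the key, which is the fragmentation concern described above.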

Identity

Each client creates its own primary keys. On sync, these keys are replaced with server-generated keys. This adds complexity to the syncing process and forces each client to “fix” its keys, including all foreign keys on related tables.
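The key "fixing" step could look roughly like the sketch below. It rests on two assumptions that are not part of the description above: unsynced rows get negative temporary ids so they cannot collide with server-generated ones, and foreign keys are declared with `ON UPDATE CASCADE` so SQLite rewrites child rows when a parent key changes. The tables, the `apply_server_ids` helper, and the id values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE task (
           id INTEGER PRIMARY KEY,
           project_id INTEGER NOT NULL
               REFERENCES project(id) ON UPDATE CASCADE,
           title TEXT NOT NULL)"""
)

# Rows created offline carry negative temporary ids (an assumption of this sketch).
conn.execute("INSERT INTO project VALUES (-1, 'Offline notes')")
conn.execute("INSERT INTO task VALUES (-1, -1, 'Write draft')")


def apply_server_ids(conn, table, id_map):
    """Replace temporary local ids with the ids the server returned on sync.

    `table` must come from trusted code, since it is interpolated into SQL.
    """
    with conn:  # commit all replacements together, or roll back on error
        for local_id, server_id in id_map.items():
            conn.execute(
                f"UPDATE {table} SET id = ? WHERE id = ?", (server_id, local_id)
            )


# Hypothetical mappings returned by the sync endpoint.
apply_server_ids(conn, "project", {-1: 501})
apply_server_ids(conn, "task", {-1: 9001})

rows = conn.execute("SELECT id, project_id FROM task").fetchall()
```

After both calls the task row holds the server id for itself and, through the cascade, for its parent. Every native client would need its own equivalent of this logic, which is the added complexity mentioned above.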

Composite

Each client is assigned a client id on first sync. This client id is used in conjunction with a local auto-incrementing id as a composite primary key for each table. This composite key will be unique, so there should be no conflicts on sync, but it does mean that most tables will require a composite primary key. Performance and query complexity are the concerns here.
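As an illustration of the query complexity, here is a small sketch of the composite layout in SQLite. The table and column names, the `CLIENT_ID` value, and the `next_local_id` helper are hypothetical. Note how every foreign key needs two columns and every join needs two conditions.

```python
import sqlite3

CLIENT_ID = 42  # hypothetical id handed out by the server on first sync

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """CREATE TABLE project (
           client_id INTEGER NOT NULL,
           local_id  INTEGER NOT NULL,
           name      TEXT NOT NULL,
           PRIMARY KEY (client_id, local_id))"""
)
conn.execute(
    """CREATE TABLE task (
           client_id INTEGER NOT NULL,
           local_id  INTEGER NOT NULL,
           project_client_id INTEGER NOT NULL,
           project_local_id  INTEGER NOT NULL,
           title TEXT NOT NULL,
           PRIMARY KEY (client_id, local_id),
           FOREIGN KEY (project_client_id, project_local_id)
               REFERENCES project (client_id, local_id))"""
)


def next_local_id(conn, table):
    """Next local counter value for this client.

    Simplification: MAX + 1 would reuse an id if the highest row were
    deleted; a real client would more likely keep a persistent counter.
    `table` must come from trusted code, since it is interpolated into SQL.
    """
    row = conn.execute(
        f"SELECT COALESCE(MAX(local_id), 0) + 1 FROM {table} WHERE client_id = ?",
        (CLIENT_ID,),
    ).fetchone()
    return row[0]


project_local = next_local_id(conn, "project")
conn.execute(
    "INSERT INTO project VALUES (?, ?, ?)",
    (CLIENT_ID, project_local, "Offline notes"),
)
task_local = next_local_id(conn, "task")
conn.execute(
    "INSERT INTO task VALUES (?, ?, ?, ?, ?)",
    (CLIENT_ID, task_local, CLIENT_ID, project_local, "Write draft"),
)

# Every join has to match on both halves of the key.
joined = conn.execute(
    """SELECT p.name, t.title
       FROM task AS t
       JOIN project AS p
         ON p.client_id = t.project_client_id
        AND p.local_id  = t.project_local_id"""
).fetchall()
```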

HiLo (Merged Composite)

Like the composite approach, each client is assigned a client id (int32) on the first sync. The client id is merged with a unique local id (int32) into a single column to make an application-wide unique id (int64). This should result in no conflicts during sync. While there is more order to these keys than to GUIDs, since the ids generated by each client are sequential, there will be thousands of unique client ids, so do we still run the risk of fragmentation on our clustered index?
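One way the merge could work, sketched under the assumption that the client id goes in the high 32 bits and the local id in the low 32 bits (the description above does not fix a bit layout). `make_key` and `split_key` are hypothetical helper names. The client id is limited to 31 bits here so the result stays non-negative in a signed 64-bit column.

```python
def make_key(client_id: int, local_id: int) -> int:
    """Pack a client id (high bits) and a local id (low bits) into one int64."""
    if not 0 <= client_id < 2**31:
        raise ValueError("client_id must fit in 31 bits")
    if not 0 <= local_id < 2**32:
        raise ValueError("local_id must fit in 32 bits")
    return (client_id << 32) | local_id


def split_key(key: int) -> tuple[int, int]:
    """Recover (client_id, local_id) from a packed key."""
    return key >> 32, key & 0xFFFFFFFF


# Keys arrive at the server interleaved across clients...
arrival_order = [make_key(2, 1), make_key(1, 2), make_key(1, 1), make_key(2, 2)]

# ...but an index ordered by the key groups them by client.
index_order = [split_key(k) for k in sorted(arrival_order)]
```

With this layout the sorted order is grouped by client, so each client appends to its own range of the index rather than to the end of the table. New rows are therefore inserted at as many distinct points as there are active clients, which is the source of the fragmentation question.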

Are we overlooking something? Are there any other approaches worth investigating? A discussion of the pros and cons of each approach would be quite helpful.

1 solution

#1



The key (pun intended) thing to remember is simply to have a unique key for each object you keep in the persistent store. How you store that object is up to you and depends on how you access that key. Each of the strategies you listed has its own reasons for working the way it does, but in the end each one stores a key for a given object in the db so that all of its attributes can change while the database keeps the same object reference.
