Best approach for incrementally updating application data

I have been working for a couple of years on an application that is updated from a back-end database. The key point is that everything is cached on the client, so it never requires a network connection to operate, but whenever it does have a connection it picks up the latest updates. Every application update ships with the latest version of the database, and I want the client to download only the minimum amount of data when the database has changed.

I currently use a table with a timestamp to check for updates. It looks something like this.

ID - Name - Description - Severity - LastUpdated
0 - test.exe - KnownVirus - Critical - 2009-09-11 13:38
1 - test2.exe - Firewall - None - 2009-09-12 14:38

This approach was fine for what I previously needed, but I want to extend more of the application's functionality to use this kind of dynamic approach. All the data is currently stored as XML, but I do not want to store complete XML files in the database; I want to transmit only the changed data.

So how would you go about storing dynamic content (text/xml/json/xaml) in a database in a fairly simple way, with the client downloading only new updates? I was thinking of having logic that can handle XML inserted directly:

ID - Data - Revision
15 - XXX - 15

XXX would be something like <Content><File>Test.dll</File><Description>New DLL to load.</Description></Content> and would be inserted into the cache, but this would obviously be complicated, as I would need to load the revisions in sequence.

Another approach that has been mentioned is to base it on something similar to source control: store the version in the root of the file and calculate the delta to work out the minimal amount of data that needs to be sent to the client.
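
As a rough illustration of the delta idea (a sketch only, not a claim about how any source control tool does it): Python's standard `difflib.SequenceMatcher` can describe a new version as ranges copied from the old version plus the lines that are actually new. The names `make_delta` and `apply_delta` are made up for this example, and the delta is only valid when applied to exactly the base version it was computed from, which is why the version would have to be stored with the file.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Describe new_lines as copy ranges from old_lines plus inserted lines."""
    delta = []
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The client already has these lines; send only the range.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only these lines have to travel over the network.
            delta.append(("insert", new_lines[j1:j2]))
        # "delete" needs no entry: the old lines are simply not copied.
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new version on the client from the old version and a delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(old_lines[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result
```

With the `<Content>` example above, adding a `<Description>` line to a file the client already has would produce a delta containing two copy ranges and that one new line.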

Has anyone got suggestions on how to approach this without risking data corruption? I would also like to add features that allow me to revert possibly bad revisions and replace them with new, working ones.

4 Solutions

#1

It really depends on the tools you are using and the architecture you already have. Is there already a server with some logic and a data access layer?

Dynamic approaches might get complicated, slow and limit the number of solutions. Why do you need a dynamic structure? Would it be feasible to just add data by using a name-value pair approach in a relational database? Static and uniform data structures are much easier to handle.

Before going into detail, you should consider the different scenarios.

Items can be added

Items can be changed

Items can be removed (I assume)

Adding is not a big problem. The client needs to remember the last revision number it got from the server, and you write a query that gets everything added since then.

Changing is basically the same, but you need to take care that items can be identified reliably. You need an immutable surrogate key, which the ID you already have seems to be. (GUIDs may be useful here.)

Removing is tricky. You need to either flag items as deleted instead of actually removing them, or keep a list of removed IDs together with the revision number in which each was removed.
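
The three cases can be sketched on the server side roughly like this. It is a minimal illustration using Python's built-in `sqlite3` module and the "flag as deleted" variant; the table layout and the function names (`upsert_item`, `delete_item`, `changes_since`) are assumptions made for the example, not something from your existing schema.

```python
import sqlite3


def create_schema(conn):
    conn.execute("""
        CREATE TABLE items (
            id       INTEGER PRIMARY KEY,        -- immutable surrogate key
            data     TEXT NOT NULL,
            revision INTEGER NOT NULL,           -- revision in which the row last changed
            deleted  INTEGER NOT NULL DEFAULT 0  -- 1 = flagged as removed
        )""")


def next_revision(conn):
    row = conn.execute("SELECT COALESCE(MAX(revision), 0) + 1 FROM items").fetchone()
    return row[0]


def upsert_item(conn, item_id, data):
    """Add a new item or change an existing one; both get a new revision."""
    rev = next_revision(conn)
    conn.execute(
        "INSERT OR REPLACE INTO items (id, data, revision, deleted) VALUES (?, ?, ?, 0)",
        (item_id, data, rev))
    return rev


def delete_item(conn, item_id):
    """Flag the item as deleted instead of removing the row."""
    rev = next_revision(conn)
    conn.execute("UPDATE items SET deleted = 1, revision = ? WHERE id = ?", (rev, item_id))
    return rev


def changes_since(conn, client_revision):
    """Everything the client has not seen yet: adds, changes and deletions."""
    return conn.execute(
        "SELECT id, data, revision, deleted FROM items WHERE revision > ? ORDER BY revision",
        (client_revision,)).fetchall()
```

A client that last synchronized at revision 1 would call `changes_since(conn, 1)` and receive only the rows touched afterwards, including the flagged deletions.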

Storing the data in the client: Consider using a relational database like SQLite in the client. (It doesn't need installation; it just stores its data in a file. Firefox, for instance, stores quite a lot in SQLite databases.) If you use the same database on the server, you can probably reuse some code. It is also transaction-based, which helps to keep the data consistent (roll back in case of an error during synchronization).

XML - if you really need it - can be stored just as a string in the database.

When using an abstraction layer or ORM that supports SQLite (e.g. NHibernate), you may be able to reuse some code even when the server uses a different database. Note that the learning curve for such an ORM might be rather steep. If you have never used anything like it, it could be too much.

You don't need to force reuse of code in the client and server.

Synchronization itself shouldn't be very complicated. You have a revision number in the client and a latest revision in the server. The client fetches all new, changed, and deleted items since its revision and applies them to the local store. Update the local revision number. Commit. Done.

I would never apply only part of a revision, because then you can't really know what changed since the last synchronization. Because you do differential updates, it is essential that the client is always in a well-defined state.
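
To make the "all or nothing" point concrete, here is a minimal client-side sketch with Python's `sqlite3` module. The `sync_state` table and the function names are assumptions for illustration. The changes and the new revision number are written in one transaction, so a failure halfway leaves the client exactly at its previous revision.

```python
import sqlite3


def init_client(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sync_state ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), revision INTEGER NOT NULL)")
    conn.execute("INSERT OR IGNORE INTO sync_state (id, revision) VALUES (1, 0)")
    conn.commit()


def local_revision(conn):
    return conn.execute("SELECT revision FROM sync_state WHERE id = 1").fetchone()[0]


def apply_changes(conn, changes, server_revision):
    """Apply (id, data, deleted) tuples and the new revision atomically.

    Returns True on success; on a database error everything is rolled back
    and False is returned, so the local revision stays unchanged.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            for item_id, data, deleted in changes:
                if deleted:
                    conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
                else:
                    conn.execute(
                        "INSERT OR REPLACE INTO items (id, data) VALUES (?, ?)",
                        (item_id, data))
            conn.execute("UPDATE sync_state SET revision = ? WHERE id = 1", (server_revision,))
    except sqlite3.Error:
        return False
    return True
```

If `apply_changes` returns False, the client can simply retry later with the same `local_revision`, because nothing from the failed batch was kept.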

#2

I would go with a solution using Sync Framework.

Quote from Microsoft:

Microsoft Sync Framework is a comprehensive synchronization platform enabling collaboration and offline for applications, services and devices. Developers can build synchronization ecosystems that integrate any application, any data from any store using any protocol over any network. Sync Framework features technologies and tools that enable roaming, sharing, and taking data offline.

A key aspect of Sync Framework is the ability to create custom providers. Providers enable any data sources to participate in the Sync Framework synchronization process, allowing peer-to-peer synchronization to occur.

#3

I have just built an application pretty much exactly as you described. I built it on top of the Microsoft Sync Framework that DjSol mentioned.

I use a C# front-end application with a SqlCe database, and SQL Server 2005 at the other end.

The following articles were extremely useful for me:

Tutorial: Synchronizing SQL Server and SQL Server Compact

Walkthrough: Creating a Sync service

Step by step N-tier configuration of Sync services for ADO.NET 2.0

How to Sync schema changed database using sync framework?

#4

You don't say what your back-end database is, but if it's SQL Server you can use SqlCE (SQL Server Compact Edition) as the client DB and then use RDA or merge replication to update the client DB as desired. This will handle all your requirements; there is no need to reinvent the wheel for such a common requirement.
