Azure SQL 数据同步最佳做法Best practices for Azure SQL Data Sync

适用于:是Azure SQL 数据库 APPLIES TO: yesAzure SQL Database

本文介绍了针对 Azure SQL 数据同步的最佳做法。This article describes best practices for Azure SQL Data Sync.

有关 SQL 数据同步的概述,请参阅使用 Azure SQL 数据同步跨多个云和本地数据库同步数据For an overview of SQL Data Sync, see Sync data across multiple cloud and on-premises databases with Azure SQL Data Sync.

重要

目前,Azure SQL 数据同步不支持 Azure SQL 托管实例。Azure SQL Data Sync does not support Azure SQL Managed Instance at this time.

安全性和可靠性Security and reliability

客户端代理Client agent

  • 使用具有网络服务访问权限的最小特权用户帐户安装客户端代理。Install the client agent by using the least privileged user account that has network service access.
  • 在不是 SQL Server 计算机的计算机上安装客户端代理。Install the client agent on a computer that isn't the SQL Server computer.
  • 请勿向多个代理注册本地数据库。Don't register an on-premises database with more than one agent.
    • 即使为不同的同步组同步不同表时也应避免这样做。Avoid this even if you are syncing different tables for different sync groups.
    • 如果向多个客户端代理注册本地数据库,则会在删除其中一个同步组时产生问题。Registering an on-premises database with multiple client agents poses challenges when you delete one of the sync groups.

具有所需最小特权的数据库帐户Database accounts with least required privileges

  • 对于同步设置。For sync setup. 创建/更改表、更改数据库、创建过程、选择/更改架构、创建用户定义的类型。Create/Alter Table; Alter Database; Create Procedure; Select/ Alter Schema; Create User-Defined Type.

  • 对于正在进行的同步。在选择用于同步的表上以及同步元数据和跟踪表上为“选择/插入/更新/删除”权限,在服务创建的存储过程上为“执行”权限,在用户定义的表类型上为“执行”权限。For ongoing sync. Select/ Insert/ Update/ Delete on tables that are selected for syncing, and on sync metadata and tracking tables; Execute permission on stored procedures created by the service; Execute permission on user-defined table types.

  • 对于取消预配。For deprovisioning. 在同步的表部分为“更改”权限,在同步元数据表上为“选择/删除”权限,在同步跟踪表、存储的过程和用户定义类型上为“控制”权限。Alter on tables part of sync; Select/ Delete on sync metadata tables; Control on sync tracking tables, stored procedures, and user-defined types.

Azure SQL 数据库仅支持单组凭据。Azure SQL Database supports only a single set of credentials. 若要在此约束内完成这些任务,请考虑使用以下选项:To accomplish these tasks within this constraint, consider the following options:

  • 针对不同阶段更改凭据(例如 credential1 用于安装,credential2 用于正在运行)。Change the credentials for different phases (for example, credentials1 for setup and credentials2 for ongoing).
  • 更改凭据的权限(即,在设置同步后更改权限)。Change the permission of the credentials (that is, change the permission after sync is set up).

设置Setup

数据库考虑因素和约束Database considerations and constraints

数据库大小Database size

创建新的数据库时,请设置最大大小,以使其始终大于你部署的数据库。When you create a new database, set the maximum size so that it's always larger than the database you deploy. 如果未将最大大小设置为大于部署的数据库,则同步会失败。If you don't set the maximum size to larger than the deployed database, sync fails. 尽管 SQL 数据同步不提供自动增长,但可以在创建数据库后运行 ALTER DATABASE 命令来增加数据库的大小。Although SQL Data Sync doesn't offer automatic growth, you can run the ALTER DATABASE command to increase the size of the database after it has been created. 请确保不超出数据库大小限制。Ensure that you stay within the database size limits.

重要

SQL 数据同步会为每个数据库存储额外的元数据。SQL Data Sync stores additional metadata with each database. 请确保在计算所需的空间时考虑此元数据。Ensure that you account for this metadata when you calculate space needed. 增加的开销量与表宽度(例如,窄表会需要更多的开销)和流量大小有关。The amount of added overhead is related to the width of the tables (for example, narrow tables require more overhead) and the amount of traffic.

表考虑因素和约束Table considerations and constraints

选择表Selecting tables

无需在同步组中包括数据库中的所有表。You don't have to include all the tables that are in a database in a sync group. 同步组中所包括的表会影响效率和成本。The tables that you include in a sync group affect efficiency and costs. 仅当业务需要时才在同步组中包括表及其依赖的表。Include tables, and the tables they are dependent on, in a sync group only if business needs require it.

主键Primary keys

同步组中的每个表均必须具有主键。Each table in a sync group must have a primary key. SQL 数据同步服务无法同步不具有主键的表。SQL Data Sync can't sync a table that doesn't have a primary key.

在使用 SQL 数据同步投入生产之前,请测试初始和正在进行的同步性能。Before using SQL Data Sync in production, test initial and ongoing sync performance.

空表提供最佳性能Empty tables provide the best performance

空表在初始化时提供最佳性能。Empty tables provide the best performance at initialization time. 如果目标表为空表,则数据同步会使用批量插入来加载数据。If the target table is empty, Data Sync uses bulk insert to load the data. 否则,数据同步会逐行进行比较和插入以检查是否存在冲突。Otherwise, Data Sync does a row-by-row comparison and insertion to check for conflicts. 但是,如果不考虑性能,则可以在已包含数据的表之间设置同步。If performance is not a concern, however, you can set up sync between tables that already contain data.

预配目标数据库Provisioning destination databases

SQL 数据同步提供了基本的数据库自动预配。SQL Data Sync provides basic database autoprovisioning.

本部分讨论 SQL 数据同步预配的限制。This section discusses the limitations of provisioning in SQL Data Sync.

自动预配限制Autoprovisioning limitations

SQL 数据同步自动预配的限制如下:SQL Data Sync has the following limitations for autoprovisioning:

  • 在目标表中仅选择已创建的列。Select only the columns that are created in the destination table. 在目标表中,不会对不属于同步组一部分的任何列进行预配。Any columns that aren't part of the sync group aren't provisioned in the destination tables.
  • 仅为所选列创建索引。Indexes are created only for selected columns. 如果源表索引包含不是同步组一部分的列,则不会在目标表中预配这些索引。If the source table index has columns that aren't part of the sync group, those indexes aren't provisioned in the destination tables.
  • 不会预配 XML 类型列上的索引。Indexes on XML type columns aren't provisioned.
  • 不会预配 CHECK 约束。CHECK constraints aren't provisioned.
  • 不会预配源表上的现有触发器。Existing triggers on the source tables aren't provisioned.
  • 不会在目标数据库上创建视图和存储的过程。Views and stored procedures aren't created on the destination database.
  • 对外键约束的 ON UPDATE CASCADE 和 ON DELETE CASCADE 操作不会在目标表中重新创建。ON UPDATE CASCADE and ON DELETE CASCADE actions on foreign key constraints aren't recreated in the destination tables.
  • 如果具有精度大于 28 的十进制或数值列,则 SQL 数据同步在同步期间可能出现转换溢出问题。建议将十进制或数值列的精度限制为 28 或更小。If you have decimal or numeric columns with a precision greater than 28, SQL Data Sync may encounter a conversion overflow issue during sync. We recommend that you limit the precision of decimal or numeric columns to 28 or less.

建议Recommendations

  • 仅在尝试使用该服务时使用 SQL 数据同步自动预配功能。Use the SQL Data Sync autoprovisioning capability only when you are trying out the service.
  • 对于生产环境,应预配数据库架构。For production, provision the database schema.

在哪里定位中心数据库Where to locate the hub database

企业到云方案Enterprise-to-cloud scenario

为了最大程度的减小延迟,请使中心数据库位于同步组数据库流量最密集的位置。To minimize latency, keep the hub database close to the greatest concentration of the sync group's database traffic.

云到云方案Cloud-to-cloud scenario

  • 如果同步组中的所有数据库位于一个数据中心,则中心数据库应位于同一个数据中心。When all the databases in a sync group are in one datacenter, the hub should be located in the same datacenter. 此配置会减少延迟并降低在数据中心之间的数据传输成本。This configuration reduces latency and the cost of data transfer between datacenters.
  • 如果同步组中的数据库位于多个数据中心,则中心数据库应位于与大部分数据库和数据库流量相同的数据中心。When the databases in a sync group are in multiple datacenters, the hub should be located in the same datacenter as the majority of the databases and database traffic.

混合方案Mixed scenarios

将前面的准则应用于复杂的同步组配置,例如,应用于企业到云和云到云方案的组合。Apply the preceding guidelines to complex sync group configurations, such as those that are a mix of enterprise-to-cloud and cloud-to-cloud scenarios.

同步Sync

避免缓慢且昂贵的初始同步Avoid slow and costly initial sync

在本部分,我们将讨论同步组的初始同步。In this section, we discuss the initial sync of a sync group. 了解如何帮助防止初始同步运行时间过长而产生不必要的昂贵开销。Learn how to help prevent an initial sync from taking longer and being more costly than necessary.

初始同步工作原理How initial sync works

创建同步组时,请先仅处理一个数据库中的数据。When you create a sync group, start with data in only one database. 如果数据在多个数据库中,SQL 数据同步会将每一行都视为需要解决的冲突。If you have data in multiple databases, SQL Data Sync treats each row as a conflict that needs to be resolved. 此冲突解决会导致初始同步运行缓慢。This conflict resolution causes the initial sync to go slowly. 如果数据在多个数据库中,则初始同步可能需要几天到几个月的时间,具体要取决于数据库大小。If you have data in multiple databases, initial sync might take between several days and several months, depending on the database size.

如果数据库在不同的数据中心,每行都必须在不同的数据中心之间穿行。If the databases are in different datacenters, each row must travel between the different datacenters. 这也增加了初始同步的成本。This increases the cost of an initial sync.

建议Recommendation

如果可能,请先仅处理同步组中一个数据库的数据。If possible, start with data in only one of the sync group's databases.

进行设计以避免同步循环Design to avoid sync loops

在同步组中存在循环引用时将出现同步循环。A sync loop occurs when there are circular references within a sync group. 在此情况下,一个数据库中的每个更改都将通过同步组中的数据库进行循环无休止地复制。In that scenario, each change in one database is endlessly and circularly replicated through the databases in the sync group.

请确保避免同步循环,因为它们会导致性能降低并可能显著增加成本。Ensure that you avoid sync loops, because they cause performance degradation and might significantly increase costs.

无法传播的更改Changes that fail to propagate

更改无法传播的原因Reasons that changes fail to propagate

更改可能由于以下原因之一而无法传播:Changes might fail to propagate for one of the following reasons:

  • 架构/数据类型不兼容。Schema/datatype incompatibility.
  • 在不可为 NULL 的列中插入 NULL。Inserting null in non-nullable columns.
  • 违反外键约束。Violating foreign key constraints.

如果更改无法传播,会发生什么情况?What happens when changes fail to propagate?

  • 同步组显示它处于警告状态。Sync group shows that it's in a Warning state.
  • 详细信息显示在门户 UI 日志查看器中。Details are listed in the portal UI log viewer.
  • 如果问题在 45 天未解决,数据库将过时。If the issue is not resolved for 45 days, the database becomes out of date.

备注

这些更改即不再传播。These changes never propagate. 在此情况下恢复的唯一方法是重新创建同步组。The only way to recover in this scenario is to re-create the sync group.

建议Recommendation

通过门户和日志界面定期监控同步组和数据库运行状况。Monitor the sync group and database health regularly through the portal and log interface.

维护Maintenance

避免过时的数据库和同步组Avoid out-of-date databases and sync groups

同步组或同步组中的数据库可能会过时。A sync group or a database in a sync group can become out of date. 如果同步组的状态为“过时”,则会停止运行。When a sync group's status is Out-of-date, it stops functioning. 如果数据库的状态为“过时”,数据可能会丢失。When a database's status is Out-of-date, data might be lost. 最好避免这种情况,而不是尝试基于其还原。It's best to avoid this scenario instead of trying to recover from it.

避免过时的数据库Avoid out-of-date databases

如果数据库离线长达 45 天或更多,其状态将设置为“过时”。A database's status is set to Out-of-date when it has been offline for 45 days or more. 请确保没有任何数据库离线长达 45 天或更多,以避免数据库出现“过时”状态。To avoid an Out-of-date status on a database, ensure that none of the databases are offline for 45 days or more.

避免过时的同步组Avoid out-of-date sync groups

如果同步组中的任何更改在 45 天或更长时间内无法传播到同步组中的其他数据库,则同步组的状态将设置为“过时”。A sync group's status is set to Out-of-date when any change in the sync group fails to propagate to the rest of the sync group for 45 days or more. 请定期检查同步组的历史记录日志,以避免同步组出现“过时”状态。To avoid an Out-of-date status on a sync group, regularly check the sync group's history log. 确保所有冲突已解决,且更改在同步组数据库之间成功传播。Ensure that all conflicts are resolved, and that changes are successfully propagated throughout the sync group databases.

同步组可能会由于下列原因之一无法应用更改:A sync group might fail to apply a change for one of these reasons:

  • 表之间的架构不兼容。Schema incompatibility between tables.
  • 表之间的数据不兼容。Data incompatibility between tables.
  • 将具有 NULL 值的行插入不允许 NULL 值的列中。Inserting a row with a null value in a column that doesn't allow null values.
  • 使用违反外键约束的值更新行。Updating a row with a value that violates a foreign key constraint.

若要避免同步组过时:To prevent out-of-date sync groups:

  • 更新架构以允许失败的行中包含的值。Update the schema to allow the values that are contained in the failed rows.
  • 更新外键值以包括失败的行中包含的值。Update the foreign key values to include the values that are contained in the failed rows.
  • 更新失败的行中的数据值,以使其与目标数据库中的架构或外键兼容。Update the data values in the failed row so they are compatible with the schema or foreign keys in the target database.

避免取消预配问题Avoid deprovisioning issues

在某些情况下,向客户端代理取消注册数据库可能会导致同步失败。In some circumstances, unregistering a database with a client agent might cause sync to fail.

方案Scenario

  1. 使用 SQL 数据库实例和 SQL Server 数据库创建与本地代理 1 关联的同步组 A。Sync group A was created by using a SQL Database instance and a SQL Server database, which is associated with local agent 1.
  2. 向本地代理 2(此代理不与任何同步组关联)注册同一本地数据库。The same on-premises database is registered with local agent 2 (this agent is not associated with any sync group).
  3. 从本地代理 2 取消注册本地数据库将会删除本地数据库的同步组 A 的跟踪和元表。Unregistering the on-premises database from local agent 2 removes the tracking and meta tables for sync group A for the on-premises database.
  4. 同步组 A 操作失败,并显示以下错误:“当前操作无法完成,因为尚未预配进行同步的数据库,或者你没有访问同步配置表的权限。”Sync group A operations fail, with this error: "The current operation could not be completed because the database is not provisioned for sync or you do not have permissions to the sync configuration tables."

解决方案Solution

若要避免此情况,请不要向多个代理注册数据库。To avoid this scenario, don't register a database with more than one agent.

若要从此情况恢复:To recover from this scenario:

  1. 从数据库所属的每个同步组删除此数据库。Remove the database from each sync group that it belongs to.
  2. 将数据库重新添加回从中删除的每个同步组。Add the database back into each sync group that you removed it from.
  3. 部署每个受影响的同步组(此操作预配了数据库)。Deploy each affected sync group (this action provisions the database).

修改同步组Modifying a sync group

请勿尝试先从同步组中删除数据库,然后在不先部署其中一个更改的情况下编辑同步组。Don't attempt to remove a database from a sync group and then edit the sync group without first deploying one of the changes.

而是首先从同步组中删除数据库。Instead, first remove a database from a sync group. 然后,部署更改并等待取消预配完成。Then, deploy the change and wait for deprovisioning to finish. 取消预配完成后,可以编辑同步组并部署更改。When deprovisioning is finished, you can edit the sync group and deploy the changes.

如果你尝试先删除数据库,然后编辑同步组而不先部署其中一个更改,则一个或另一个操作会失败。If you attempt to remove a database and then edit a sync group without first deploying one of the changes, one or the other operation fails. 门户界面可能会出现不一致状态。The portal interface might become inconsistent. 如果出现此情况,请刷新页面以还原正确的状态。If this happens, refresh the page to restore the correct state.

避免架构刷新超时Avoid schema refresh timeout

在要同步复杂架构的情况下,如果同步元数据数据库的 SKU 较低(例如为“基本”),则可能会在架构刷新过程中遇到“操作超时”错误。If you have a complex schema to sync, you may encounter an "operation timeout" during a schema refresh if the sync metadata database has a lower SKU (example: basic).

解决方案Solution

若要缓解此问题,请将同步元数据数据库纵向扩展到更高的 SKU,例如 S3。To mitigate this issue, please scale up your sync metadata database to have a higher SKU, such as S3.

后续步骤Next steps

有关 SQL 数据同步的详细信息,请参阅:For more information about SQL Data Sync, see:

有关 SQL 数据库的详细信息,请参阅:For more information about SQL Database, see: