Streaming ingestion and schema changes

Background

Cluster nodes cache the schema of databases that receive data via streaming ingestion. This caching optimizes performance and the utilization of cluster resources, but it can cause propagation delays when the schema changes.

Examples of schema changes are:

  • Creating and deleting databases and tables
  • Adding, removing, retyping, or renaming the columns of a table
  • Adding or removing pre-created ingestion mappings
  • Adding, removing, or altering policies

If schema changes and streaming ingestion flows are uncoordinated, some streaming ingestion requests may fail. The failures can include schema-related errors, or the insertion of incomplete or distorted data into the table. When implementing a custom ingestion application, it is highly recommended to handle schema-related failures by retrying for a limited time, or by rerouting data from the failed requests via queued ingestion methods.
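The retry-then-reroute handling described above might be sketched as follows. This is a minimal illustration, not a client library API: `SchemaRelatedError`, `stream_ingest`, and `queued_ingest` are hypothetical names standing in for whatever error type and ingestion calls your application actually uses, and the retry window and delay are arbitrary example values.

```python
import time


class SchemaRelatedError(Exception):
    """Hypothetical marker for a schema-related streaming ingestion failure."""


def ingest_with_fallback(payload, stream_ingest, queued_ingest,
                         retry_window_s=30.0, delay_s=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Try streaming ingestion for a limited time, then reroute to queued ingestion.

    stream_ingest and queued_ingest are caller-supplied callables (assumptions
    for this sketch) that each take the payload and raise on failure.
    Returns "streamed" or "queued" to show which path accepted the data.
    """
    deadline = clock() + retry_window_s
    while True:
        try:
            stream_ingest(payload)
            return "streamed"
        except SchemaRelatedError:
            # Stop retrying once another delay would pass the deadline.
            if clock() + delay_s > deadline:
                break
            sleep(delay_s)
    # The retry window is exhausted: reroute the same data via queued ingestion.
    queued_ingest(payload)
    return "queued"
```

Bounding the retries by elapsed time rather than by attempt count matches the "limited time" wording above. Errors that are not schema-related are deliberately left to propagate to the caller.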

Clearing the schema cache

You can reduce the effects of propagation delay by explicitly clearing the schema cache on the cluster nodes. Clear the schema cache using one of the "Clear schema cache for streaming ingestion" management commands. If the streaming ingestion flow and the schema changes are coordinated, you can completely eliminate failures and their associated data distortion.

Coordinated flow example:

  1. Suspend streaming ingestion.
  2. Wait until all outstanding streaming ingestion requests are complete.
  3. Make the schema changes.
  4. Issue one or several .clear cache streaming ingestion schema commands.
    • Repeat until the command succeeds and all rows in its output indicate success.
  5. Resume streaming ingestion.
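The steps above can be sketched as a small orchestration routine. Everything here is an assumption for illustration: the callables (`suspend`, `wait_for_outstanding`, `apply_schema_change`, `run_command`, `resume`) are hypothetical hooks into your own application, the command text `.clear database cache streamingingestion schema` is an assumed spelling of one of the clear-cache commands, and the output is assumed to be a list of per-node rows with a `Status` value of `Succeeded` or `Failed`.

```python
import time

# Assumed command text; check the management command reference for exact syntax.
CLEAR_DB_CACHE = ".clear database cache streamingingestion schema"


def clear_cache_until_success(run_command, command=CLEAR_DB_CACHE,
                              max_attempts=5, delay_s=2.0, sleep=time.sleep):
    """Reissue the clear-cache command until every output row reports success.

    run_command is a hypothetical callable that executes a management command
    and returns a list of dict rows. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        rows = run_command(command)
        if rows and all(row.get("Status") == "Succeeded" for row in rows):
            return attempt
        if attempt < max_attempts:
            sleep(delay_s)
    raise RuntimeError("schema cache was not cleared on all nodes")


def coordinated_schema_change(suspend, wait_for_outstanding, apply_schema_change,
                              run_command, resume, sleep=time.sleep):
    """Run the five coordinated steps in order."""
    suspend()                     # 1. stop sending streaming requests
    wait_for_outstanding()        # 2. let in-flight requests finish
    apply_schema_change()         # 3. make the schema changes
    attempts = clear_cache_until_success(run_command, sleep=sleep)  # 4.
    resume()                      # 5. only reached if every node succeeded
    return attempts
```

In this sketch, ingestion intentionally stays suspended if the cache cannot be cleared on every node, because resuming against a stale cache would reintroduce the failures the flow is meant to avoid. The attempt limit also keeps the command from being issued indefinitely.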

Note

Frequent use of the clear cache streaming ingestion schema commands may have an adverse effect on the performance of streaming ingestion.