Apache Cassandra features supported by Azure Cosmos DB Cassandra API

Azure Cosmos DB is Azure's multi-region distributed, multi-model database service. You can communicate with the Azure Cosmos DB Cassandra API through open-source Cassandra client drivers that comply with the Cassandra Query Language (CQL) v4 wire protocol.

By using the Azure Cosmos DB Cassandra API, you get the benefits of the Apache Cassandra APIs together with the enterprise capabilities that Azure Cosmos DB provides. These capabilities include multiple-region distribution, automatic scale-out partitioning, availability and latency guarantees, encryption at rest, backups, and more.

Cassandra protocol

The Azure Cosmos DB Cassandra API is compatible with CQL version v4. The supported CQL commands, tools, limitations, and exceptions are listed below. Any client driver that understands this protocol should be able to connect to the Azure Cosmos DB Cassandra API.

Cassandra driver

The following versions of Cassandra drivers are supported by Azure Cosmos DB Cassandra API:

CQL data types

Azure Cosmos DB Cassandra API supports the following CQL data types:

  • ascii
  • bigint
  • blob
  • boolean
  • counter
  • date
  • decimal
  • double
  • float
  • frozen
  • inet
  • int
  • list
  • set
  • smallint
  • text
  • time
  • timestamp
  • timeuuid
  • tinyint
  • tuple
  • uuid
  • varchar
  • varint
  • tuples
  • udts
  • map

CQL functions

Azure Cosmos DB Cassandra API supports the following CQL functions:

  • Token
  • Aggregate functions
    • min, max, avg, count
  • Blob conversion functions
    • typeAsBlob(value)
    • blobAsType(value)
  • UUID and timeuuid functions
    • dateOf()
    • now()
    • minTimeuuid()
    • unixTimestampOf()
    • toDate(timeuuid)
    • toTimestamp(timeuuid)
    • toUnixTimestamp(timeuuid)
    • toDate(timestamp)
    • toUnixTimestamp(timestamp)
    • toTimestamp(date)
    • toUnixTimestamp(date)

Cassandra API limits

Azure Cosmos DB Cassandra API places no limit on the size of data stored in a table. Hundreds of terabytes or petabytes of data can be stored, as long as partition key limits are honored. Similarly, there is no limit on the number of columns in an entity or row equivalent. However, the total size of an entity should not exceed 2 MB. As in all other Azure Cosmos DB APIs, the data per partition key cannot exceed 20 GB.
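
The two size limits above can be checked on the client before a write is attempted. The following is a minimal Python sketch, not part of the service or of any driver. The function name `check_limits`, the byte constants (which assume binary units, so 1 MB is taken as 1024 × 1024 bytes), and the idea that the caller supplies its own size estimates are all assumptions made for illustration.

```python
from typing import List

# Assumed byte values for the documented limits (binary units are an assumption).
MAX_ENTITY_BYTES = 2 * 1024 * 1024      # 2 MB per entity (row)
MAX_PARTITION_BYTES = 20 * 1024 ** 3    # 20 GB per partition key


def check_limits(entity_bytes: int, partition_bytes: int) -> List[str]:
    """Return the limit violations a pending write would cause.

    entity_bytes: estimated size of the entity about to be written.
    partition_bytes: estimated data already stored under the same partition key.
    Both estimates come from the caller; this sketch does not measure them.
    """
    problems = []
    if entity_bytes > MAX_ENTITY_BYTES:
        problems.append("entity exceeds 2 MB")
    if partition_bytes + entity_bytes > MAX_PARTITION_BYTES:
        problems.append("partition key would exceed 20 GB")
    return problems
```

An empty list means the estimated write stays within both limits. The service remains the authority on actual sizes, so a check like this is only an early warning.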

Tools

Azure Cosmos DB Cassandra API is a managed service platform. It does not require any management overhead or utilities such as the garbage collector, Java Virtual Machine (JVM), or nodetool to manage the cluster. It supports tools such as cqlsh that use binary CQL v4 compatibility.

  • The Azure portal's Data Explorer, metrics, log diagnostics, PowerShell, and CLI are other supported mechanisms to manage the account.

Hosted CQL shell (preview)

You can connect to the Cassandra API in Azure Cosmos DB by using the CQLSH installed on a local machine. It comes with Apache Cassandra 3.1.1 and works out of the box once the environment variables are set. The following sections include instructions to install, configure, and connect to the Cassandra API in Azure Cosmos DB, on Windows or Linux, using CQLSH.

Windows:

If you use Windows, we recommend that you enable the Windows Subsystem for Linux. You can then follow the Linux commands below.

Unix/Linux/Mac:

# Refresh the package index, then install default-jre and default-jdk:
sudo apt-get update
sudo apt install default-jre
sudo apt install default-jdk

# Import the Baltimore CyberTrust root certificate:
curl https://cacert.omniroot.com/bc2025.crt > bc2025.crt
keytool -importcert -alias bc2025ca -file bc2025.crt

# Install the Cassandra libraries in order to get CQLSH:
echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -
sudo apt-get update
sudo apt-get install cassandra

# Export the SSL variables:
export SSL_VERSION=TLSv1_2
export SSL_VALIDATE=false

# Connect to Azure Cosmos DB API for Cassandra:
cqlsh <YOUR_ACCOUNT_NAME>.cassandra.cosmos.azure.cn 10350 -u <YOUR_ACCOUNT_NAME> -p <YOUR_ACCOUNT_PASSWORD> --ssl
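
For a programmatic client, the same connection settings (host name pattern, port 10350, TLS 1.2) can be expressed with Python's standard `ssl` module. This is a hedged sketch only: the helper names `cassandra_endpoint` and `tls_context` are hypothetical, no Cassandra driver is involved, and the sketch sets TLS 1.2 as a minimum version rather than pinning it exactly as `SSL_VERSION=TLSv1_2` does. The `validate` flag mirrors the `SSL_VALIDATE` variable but defaults to keeping certificate validation on.

```python
import ssl
from typing import Tuple

COSMOS_CASSANDRA_PORT = 10350  # port used in the cqlsh command above


def cassandra_endpoint(account_name: str) -> Tuple[str, int]:
    """Build the (host, port) pair from the account name (hypothetical helper)."""
    return (f"{account_name}.cassandra.cosmos.azure.cn", COSMOS_CASSANDRA_PORT)


def tls_context(validate: bool = True) -> ssl.SSLContext:
    """Create a client TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if not validate:
        # Equivalent in spirit to SSL_VALIDATE=false: skip certificate checks.
        # check_hostname must be turned off before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

A driver that accepts an `ssl.SSLContext` could be given the result of `tls_context()` together with the endpoint from `cassandra_endpoint("<YOUR_ACCOUNT_NAME>")`.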

CQL commands

Azure Cosmos DB supports the following database commands on Cassandra API accounts.

  • CREATE KEYSPACE (the replication settings for this command are ignored)
  • CREATE TABLE
  • CREATE INDEX (without specifying an index name; full frozen indexes are not yet supported)
  • ALLOW FILTERING
  • ALTER TABLE
  • USE
  • INSERT
  • SELECT
  • UPDATE
  • BATCH (only unlogged commands are supported)
  • DELETE

All CRUD operations executed through a CQL v4 compatible SDK return extra information about errors and the request units consumed. Handle the DELETE and UPDATE commands with resource governance in mind, to make the most efficient use of the provisioned throughput.

  • Note: The gc_grace_seconds value must be zero if specified.

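    // Requires: using System; using System.Text;
    var tableInsertStatement = table.Insert(sampleEntity);
    var insertResult = await tableInsertStatement.ExecuteAsync();

    // The incoming custom payload maps each key to a raw byte array.
    // It may be null when the server returns no payload.
    var incomingPayload = insertResult.Info.IncomingPayload;
    if (incomingPayload != null)
    {
        foreach (string key in incomingPayload.Keys)
        {
            byte[] valueInBytes = incomingPayload[key];
            string value = Encoding.UTF8.GetString(valueInBytes);
            Console.WriteLine($"CustomPayload:  {key}: {value}");
        }
    }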
    

Consistency mapping

Azure Cosmos DB Cassandra API provides a choice of consistency levels for read operations. For details, see the consistency mapping article.

Permission and role management

Azure Cosmos DB supports role-based access control (RBAC) for provisioning, rotating keys, viewing metrics, and the read-write and read-only passwords/keys that can be obtained through the Azure portal. Azure Cosmos DB does not support roles for CRUD activities.

Keyspace and Table options

The options for region name, class, replication_factor, and datacenter in the "Create Keyspace" command are currently ignored. The system uses the underlying Azure Cosmos DB multiple-region distribution replication method to add the regions. If you need the data to be present across regions, you can enable it at the account level with PowerShell, CLI, or the portal. To learn more, see the how to add regions article. The durable_writes option can't be disabled because Azure Cosmos DB ensures every write is durable. In every region, Azure Cosmos DB replicates the data across a replica set made up of four replicas, and this replica set configuration can't be modified.

All the options are ignored when creating the table, except gc_grace_seconds, which should be set to zero. The keyspace and table have an extra option named "cosmosdb_provisioned_throughput" with a minimum value of 400 RU/s. Keyspace throughput allows sharing throughput across multiple tables, which is useful when not all tables use their provisioned throughput. The "Alter Table" command allows changing the provisioned throughput across the regions.

CREATE KEYSPACE sampleks WITH REPLICATION = { 'class' : 'SimpleStrategy' } AND cosmosdb_provisioned_throughput=2000;

CREATE TABLE sampleks.t1(user_id int PRIMARY KEY, lastname text) WITH cosmosdb_provisioned_throughput=2000;

ALTER TABLE sampleks.t1 WITH cosmosdb_provisioned_throughput=10000;
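
When throughput changes are scripted, the documented minimum of 400 RU/s can be enforced before the statement is sent. The sketch below is an illustration only: `alter_table_throughput` is a hypothetical helper that merely builds the CQL text, and the identifier check (letters, digits, and underscores, starting with a letter) is a conservative assumption rather than the full CQL identifier grammar.

```python
import re

MIN_THROUGHPUT_RU = 400  # documented minimum for cosmosdb_provisioned_throughput

# Conservative identifier pattern (an assumption, stricter than CQL itself).
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def alter_table_throughput(keyspace: str, table: str, ru_per_second: int) -> str:
    """Build an ALTER TABLE statement that sets the provisioned throughput."""
    for name in (keyspace, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if ru_per_second < MIN_THROUGHPUT_RU:
        raise ValueError(
            f"throughput must be at least {MIN_THROUGHPUT_RU} RU/s, got {ru_per_second}"
        )
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH cosmosdb_provisioned_throughput={ru_per_second};"
    )
```

Calling `alter_table_throughput("sampleks", "t1", 10000)` produces a statement of the same shape as the ALTER TABLE example above, while a value below 400 raises a `ValueError` on the client instead of reaching the service.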

Usage of Cassandra retry connection policy

Azure Cosmos DB is a resource-governed system. This means you can perform a certain number of operations in a given second, based on the request units those operations consume. If an application exceeds that limit in a given second, requests are rate-limited and exceptions are thrown. The Cassandra API in Azure Cosmos DB translates these exceptions to overloaded errors on the Cassandra native protocol. To ensure that your application can intercept and retry requests when it is rate-limited, the Spark and Java extensions are provided. See also the Java code samples for version 3 and version 4 DataStax drivers for connecting to the Cassandra API in Azure Cosmos DB. If you use other SDKs to access the Cassandra API in Azure Cosmos DB, create a connection policy to retry on these exceptions.
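
For SDKs without a ready-made extension, the retry behavior described above could look roughly like the following Python sketch. It is not the Spark or Java extension and does not use any driver API: `OverloadedError` is a hypothetical stand-in for whatever exception a given driver raises on an overloaded error, and the exponential backoff with jitter is one common retry strategy chosen here as an assumption, not a documented requirement.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OverloadedError(Exception):
    """Hypothetical stand-in for a driver's 'overloaded' (rate-limited) error."""


def execute_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying with exponential backoff when it is rate-limited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OverloadedError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the waiting can be replaced in tests. In an application, `operation` would wrap a single statement execution, and only the rate-limiting error would be retried while other exceptions propagate unchanged.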

Next steps