Apache Cassandra features supported by Azure Cosmos DB Cassandra API

Azure Cosmos DB is Azure's multi-region distributed, multi-model database service. You can communicate with the Azure Cosmos DB Cassandra API through open-source Cassandra client drivers that comply with the CQL Binary Protocol v4 wire protocol.

By using the Azure Cosmos DB Cassandra API, you get the benefits of the Apache Cassandra APIs as well as the enterprise capabilities that Azure Cosmos DB provides. The enterprise capabilities include multiple-region distribution, automatic scale-out partitioning, availability and latency guarantees, encryption at rest, backups, and more.

Cassandra protocol

The Azure Cosmos DB Cassandra API is compatible with the Cassandra Query Language (CQL) v3.11 API (backward-compatible with version 2.x). The supported CQL commands, tools, limitations, and exceptions are listed below. Any client driver that understands these protocols should be able to connect to Azure Cosmos DB Cassandra API.

Cassandra driver

The following versions of Cassandra drivers are supported by Azure Cosmos DB Cassandra API:

CQL data types

Azure Cosmos DB Cassandra API supports the following CQL data types:

| Data type | Supported |
| --- | --- |
| ascii | Yes |
| bigint | Yes |
| blob | Yes |
| boolean | Yes |
| counter | Yes |
| date | Yes |
| decimal | Yes |
| double | Yes |
| float | Yes |
| frozen | Yes |
| inet | Yes |
| int | Yes |
| list | Yes |
| set | Yes |
| smallint | Yes |
| text | Yes |
| time | Yes |
| timestamp | Yes |
| timeuuid | Yes |
| tinyint | Yes |
| tuple | Yes |
| uuid | Yes |
| varchar | Yes |
| varint | Yes |
| tuples | Yes |
| udts | Yes |
| map | Yes |

CQL functions

Azure Cosmos DB Cassandra API supports the following CQL functions:

| Function | Supported |
| --- | --- |
| token * | Yes |
| ttl | Yes |
| writetime | Yes |
| cast | No |

* Cassandra API supports token as a projection/selector, and only allows token(pk) on the left-hand side of a WHERE clause. For example, WHERE token(pk) > 1024 is supported, but WHERE token(pk) > token(100) is not supported.

Aggregate functions:

| Function | Supported |
| --- | --- |
| min | Yes |
| max | Yes |
| avg | Yes |
| count | Yes |

Blob conversion functions:

| Function | Supported |
| --- | --- |
| typeAsBlob(value) | Yes |
| blobAsType(value) | Yes |

UUID and timeuuid functions:

| Function | Supported |
| --- | --- |
| dateOf() | Yes |
| now() | Yes |
| minTimeuuid() | Yes |
| unixTimestampOf() | Yes |
| toDate(timeuuid) | Yes |
| toTimestamp(timeuuid) | Yes |
| toUnixTimestamp(timeuuid) | Yes |
| toDate(timestamp) | Yes |
| toUnixTimestamp(timestamp) | Yes |
| toTimestamp(date) | Yes |
| toUnixTimestamp(date) | Yes |

CQL commands

Azure Cosmos DB supports the following database commands on Cassandra API accounts.

| Command | Supported |
| --- | --- |
| ALLOW FILTERING | Yes |
| ALTER KEYSPACE | N/A (PaaS service, replication managed internally) |
| ALTER MATERIALIZED VIEW | No |
| ALTER ROLE | No |
| ALTER TABLE | Yes |
| ALTER TYPE | No |
| ALTER USER | No |
| BATCH | Yes (unlogged batch only) |
| COMPACT STORAGE | N/A (PaaS service) |
| CREATE AGGREGATE | No |
| CREATE CUSTOM INDEX (SASI) | No |
| CREATE INDEX | Yes (without specifying an index name; indexes on clustering keys or on full FROZEN collections are not supported) |
| CREATE FUNCTION | No |
| CREATE KEYSPACE (replication settings ignored) | Yes |
| CREATE MATERIALIZED VIEW | No |
| CREATE TABLE | Yes |
| CREATE TRIGGER | No |
| CREATE TYPE | Yes |
| CREATE ROLE | No |
| CREATE USER (deprecated in native Apache Cassandra) | No |
| DELETE | Yes |
| DELETE (lightweight transactions with IF CONDITION) | Yes |
| DROP AGGREGATE | No |
| DROP FUNCTION | No |
| DROP INDEX | Yes |
| DROP KEYSPACE | Yes |
| DROP MATERIALIZED VIEW | No |
| DROP ROLE | No |
| DROP TABLE | Yes |
| DROP TRIGGER | No |
| DROP TYPE | Yes |
| DROP USER (deprecated in native Apache Cassandra) | No |
| GRANT | No |
| INSERT | Yes |
| INSERT (lightweight transactions with IF CONDITION) | Yes |
| LIST PERMISSIONS | No |
| LIST ROLES | No |
| LIST USERS (deprecated in native Apache Cassandra) | No |
| REVOKE | No |
| SELECT | Yes |
| SELECT (lightweight transactions with IF CONDITION) | No |
| UPDATE | Yes |
| UPDATE (lightweight transactions with IF CONDITION) | No |
| TRUNCATE | No |
| USE | Yes |

JSON support

| Command | Supported |
| --- | --- |
| SELECT JSON | Yes |
| INSERT JSON | Yes |
| fromJson() | No |
| toJson() | No |

Cassandra API limits

Azure Cosmos DB Cassandra API does not have any limits on the size of data stored in a table. Hundreds of terabytes or petabytes of data can be stored as long as the partition key limits are honored. Similarly, an entity (the equivalent of a row) has no limit on the number of columns. However, the total size of an entity should not exceed 2 MB. As in all other APIs, the data per partition key cannot exceed 20 GB.
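
The two limits above can be checked roughly on the client before a write. The following is a minimal sketch, not part of any Azure SDK: the function names are hypothetical, binary units (1 MB = 1024 * 1024 bytes) are an assumption, and the service may account for entity size differently than this simple sum of column names and values.

```python
from typing import Mapping

# Limits quoted in the text above. Treating MB/GB as binary units is an assumption.
MAX_ENTITY_BYTES = 2 * 1024 * 1024
MAX_PARTITION_BYTES = 20 * 1024 ** 3


def estimate_entity_bytes(column_values: Mapping[str, bytes]) -> int:
    """Rough client-side estimate: UTF-8 column names plus raw value bytes."""
    return sum(len(name.encode("utf-8")) + len(value)
               for name, value in column_values.items())


def fits_entity_limit(column_values: Mapping[str, bytes]) -> bool:
    """Return True if the estimated entity size stays within the 2 MB limit."""
    return estimate_entity_bytes(column_values) <= MAX_ENTITY_BYTES


def fits_partition_limit(current_partition_bytes: int, new_entity_bytes: int) -> bool:
    """Return True if adding the entity keeps the partition key within 20 GB."""
    return current_partition_bytes + new_entity_bytes <= MAX_PARTITION_BYTES
```

A check like this only gives an early warning; the service remains the authority on whether a write is accepted.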

Tools

Azure Cosmos DB Cassandra API is a managed service platform. It does not require any management overhead or utilities such as Garbage Collector, Java Virtual Machine (JVM), and nodetool to manage the cluster. It supports tools such as cqlsh that utilize Binary CQLv4 compatibility.

  • Azure portal's data explorer, metrics, log diagnostics, PowerShell, and CLI are other supported mechanisms to manage the account.

Hosted CQL shell (preview)

You can connect to the Cassandra API in Azure Cosmos DB by using the CQLSH installed on a local machine. It comes with Apache Cassandra 3.11 and works out of the box once the environment variables are set. The following sections include the instructions to install, configure, and connect to Cassandra API in Azure Cosmos DB, on Windows or Linux, using CQLSH.

Note

Connections to Azure Cosmos DB Cassandra API will not work with DataStax Enterprise (DSE) versions of CQLSH. Make sure you use only the open-source Apache Cassandra versions of CQLSH when connecting to Cassandra API.

Windows:

If you use Windows, we recommend that you enable the Windows Subsystem for Linux. You can then follow the Linux commands below.

Unix/Linux/Mac:

# Refresh the package index, then install default-jre and default-jdk
sudo apt-get update
sudo apt install default-jre
sudo apt install default-jdk

# Import the Baltimore CyberTrust root certificate:
curl https://cacert.omniroot.com/bc2025.crt > bc2025.crt
keytool -importcert -alias bc2025ca -file bc2025.crt

# Install the Cassandra libraries in order to get CQLSH:
echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -
sudo apt-get update
sudo apt-get install cassandra

# Export the SSL variables:
export SSL_VERSION=TLSv1_2
export SSL_VALIDATE=false

# Connect to Azure Cosmos DB API for Cassandra:
cqlsh <YOUR_ACCOUNT_NAME>.cassandra.cosmos.azure.cn 10350 -u <YOUR_ACCOUNT_NAME> -p <YOUR_ACCOUNT_PASSWORD> --ssl
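
The settings used in the cqlsh command above (host name pattern, port 10350, account name as user name, TLS 1.2) can also be prepared for a client driver. The sketch below uses only the Python standard library; `CassandraEndpoint`, `endpoint_for`, and `tls12_context` are hypothetical helper names, and passing the result to a particular driver is left as an assumption in the comments. Unlike `SSL_VALIDATE=false`, this sketch keeps certificate validation switched on.

```python
import ssl
from dataclasses import dataclass

COSMOS_CASSANDRA_PORT = 10350  # port taken from the cqlsh command above


@dataclass(frozen=True)
class CassandraEndpoint:
    host: str
    port: int
    username: str


def endpoint_for(account_name: str,
                 suffix: str = "cassandra.cosmos.azure.cn") -> CassandraEndpoint:
    """Build the contact point the same way the cqlsh command line does."""
    return CassandraEndpoint(
        host=f"{account_name}.{suffix}",
        port=COSMOS_CASSANDRA_PORT,
        username=account_name,
    )


def tls12_context() -> ssl.SSLContext:
    """Create a validating TLS context that refuses anything below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# Assumption: a CQL v4 driver that accepts an ssl.SSLContext would receive
# endpoint.host, endpoint.port, endpoint.username, the account password,
# and tls12_context() when the session is created.
```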

All CRUD operations that are executed through a CQL v4 compatible SDK return extra information about errors and the request units consumed. Handle the DELETE and UPDATE commands with resource governance in mind, to ensure the most efficient use of the provisioned throughput.

  • Note: the gc_grace_seconds value must be zero if specified.

    var tableInsertStatement = table.Insert(sampleEntity);
    var insertResult = await tableInsertStatement.ExecuteAsync();

    // IncomingPayload maps each custom payload key to its raw bytes.
    foreach (string key in insertResult.Info.IncomingPayload.Keys)
    {
        byte[] valueInBytes = insertResult.Info.IncomingPayload[key];
        // Decoded as UTF-8 text, as in the original sample.
        string value = Encoding.UTF8.GetString(valueInBytes);
        Console.WriteLine($"CustomPayload:  {key}: {value}");
    }
    

Consistency mapping

Azure Cosmos DB Cassandra API provides a choice of consistency for read operations. The mapping is described in the consistency mapping article.

Permission and role management

Azure Cosmos DB supports role-based access control (RBAC) for provisioning, rotating keys, viewing metrics, and the read-write and read-only passwords/keys that can be obtained through the Azure portal. Azure Cosmos DB does not support roles for CRUD activities.

Keyspace and table options

The options for region name, class, replication_factor, and datacenter in the "Create Keyspace" command are currently ignored. The system uses the underlying Azure Cosmos DB multiple-region distribution replication method to add the regions. If you need the cross-region presence of data, you can enable it at the account level with PowerShell, CLI, or the portal. To learn more, see the how to add regions article. Durable_writes can't be disabled because Azure Cosmos DB ensures every write is durable. In every region, Azure Cosmos DB replicates the data across a replica set that is made up of four replicas, and this replica set configuration can't be modified.

All the options are ignored when creating the table, except gc_grace_seconds, which should be set to zero. The keyspace and table have an extra option named "cosmosdb_provisioned_throughput" with a minimum value of 400 RU/s. Keyspace throughput allows sharing throughput across multiple tables, which is useful in scenarios where not all tables fully utilize the provisioned throughput. The "Alter Table" command allows changing the provisioned throughput across the regions.

CREATE KEYSPACE sampleks WITH REPLICATION = { 'class' : 'SimpleStrategy' } AND cosmosdb_provisioned_throughput=2000;

CREATE TABLE sampleks.t1(user_id int PRIMARY KEY, lastname text) WITH cosmosdb_provisioned_throughput=2000;

ALTER TABLE sampleks.t1 WITH cosmosdb_provisioned_throughput=10000;
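
When throughput is changed from application code, the 400 RU/s minimum can be enforced before the statement is sent. The following is a small sketch under stated assumptions: `alter_table_throughput` is a hypothetical helper, not a driver API, and it only accepts simple unquoted identifiers so that the statement can be assembled safely as a string.

```python
import re

MIN_THROUGHPUT_RU = 400  # minimum value stated in the text above
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def alter_table_throughput(keyspace: str, table: str, request_units: int) -> str:
    """Return an ALTER TABLE statement that sets cosmosdb_provisioned_throughput."""
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple CQL identifier: {name!r}")
    if request_units < MIN_THROUGHPUT_RU:
        raise ValueError(
            f"throughput must be at least {MIN_THROUGHPUT_RU} RU/s, got {request_units}"
        )
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH cosmosdb_provisioned_throughput={request_units};"
    )
```

The returned string can then be executed with cqlsh or any CQL v4 compatible driver.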

Usage of Cassandra retry connection policy

Azure Cosmos DB is a resource-governed system. This means you can perform a certain number of operations in a given second, based on the request units those operations consume. If an application exceeds that limit in a given second, requests are rate-limited and exceptions are thrown. The Cassandra API in Azure Cosmos DB translates these exceptions to overloaded errors on the Cassandra native protocol. To ensure that your application can intercept and retry requests when it is rate-limited, the Spark and Java extensions are provided. See also the Java code samples for version 3 and version 4 DataStax drivers when connecting to Cassandra API in Azure Cosmos DB. If you use other SDKs to access Cassandra API in Azure Cosmos DB, create a connection policy to retry on these exceptions.
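
For SDKs without a ready-made extension, a retry wrapper is one way to handle rate limiting. The sketch below shows exponential backoff with jitter, which is a common approach and not necessarily what the Spark or Java extensions do. `OverloadedError` is a hypothetical stand-in for whatever exception your driver raises for overloaded errors, and `execute_with_retry` is an illustrative name.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OverloadedError(Exception):
    """Hypothetical stand-in for the driver's overloaded-error exception."""


def execute_with_retry(operation: Callable[[], T],
                       max_attempts: int = 5,
                       base_delay: float = 0.1,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on OverloadedError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OverloadedError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter spreads out competing clients.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
    raise AssertionError("unreachable")
```

A call such as `execute_with_retry(lambda: session.execute(statement))` would wrap a single request, where `session` and `statement` are whatever objects your driver provides. Capping the number of attempts keeps a persistently under-provisioned workload from retrying forever.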

Next steps