Azure Cosmos DB Gremlin graph support

Azure Cosmos DB supports Apache TinkerPop's graph traversal language, known as Gremlin. You can use the Gremlin language to create graph entities (vertices and edges), modify properties within those entities, perform queries and traversals, and delete entities.

This article provides a quick walkthrough of Gremlin and lists the Gremlin features that are supported by the Gremlin API.
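As a minimal sketch of the operations named above (create, modify, query, traverse, delete), the following Python snippet collects one Gremlin traversal per operation as a plain string. The ids, labels, and property names (`thomas`, `mary`, `person`, `knows`, `firstName`, `age`) are hypothetical, and the sketch assumes a graph where no partition key property has to be supplied when adding a vertex.

```python
# One Gremlin traversal per operation, written as plain strings.
# All ids, labels, and property names are hypothetical examples.
queries = {
    # Create two vertices with user-supplied string ids.
    "create_vertex": "g.addV('person').property('id', 'thomas').property('firstName', 'Thomas')",
    "create_other_vertex": "g.addV('person').property('id', 'mary').property('firstName', 'Mary')",
    # Create an edge between the two vertices.
    "create_edge": "g.V('thomas').addE('knows').to(g.V('mary'))",
    # Modify (add or overwrite) a property on an existing vertex.
    "update_property": "g.V('thomas').property('age', 45)",
    # Query: filter vertices by label and property value.
    "query": "g.V().hasLabel('person').has('age', gt(40))",
    # Traverse: follow outgoing 'knows' edges to adjacent vertices.
    "traverse": "g.V('thomas').out('knows').hasLabel('person')",
    # Delete a vertex.
    "delete": "g.V('thomas').drop()",
}

for name, query in queries.items():
    print(f"{name}: {query}")
```

The strings can be submitted through any of the compatible drivers or pasted into the Gremlin console.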

Compatible client libraries

The following table shows popular Gremlin drivers that you can use against Azure Cosmos DB:

Download | Source | Getting Started | Supported connector version
.NET | Gremlin.NET on GitHub | Create Graph using .NET |
Java | Gremlin JavaDoc | Create Graph using Java | 3.2.0+
Node.js | Gremlin-JavaScript on GitHub | Create Graph using Node.js | 3.3.4+
Python | Gremlin-Python on GitHub | Create Graph using Python |
PHP | Gremlin-PHP on GitHub | Create Graph using PHP |
Gremlin console | TinkerPop docs | Create Graph using Gremlin Console | 3.2.0+
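Whichever driver you pick, it needs a WebSocket endpoint, a username, and a password. The sketch below builds those three values with the standard library only. It assumes the convention used in the Azure quickstarts, where the username has the form `/dbs/<database>/colls/<graph>` and the password is the account key. The host name, database name, graph name, and key shown are placeholders, and `build_connection` is a hypothetical helper, not part of any driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GremlinConnection:
    """Connection settings a Gremlin driver typically asks for."""
    url: str
    username: str
    password: str


def build_connection(host: str, database: str, graph: str, key: str,
                     port: int = 443) -> GremlinConnection:
    """Assemble connection settings (hypothetical helper).

    Assumption: the username follows the /dbs/<database>/colls/<graph>
    convention and the endpoint is a secure WebSocket URL.
    """
    return GremlinConnection(
        url=f"wss://{host}:{port}/",
        username=f"/dbs/{database}/colls/{graph}",
        password=key,
    )


# Placeholder values; replace them with the values for your own account.
settings = build_connection("<your-gremlin-host>", "sample-db", "sample-graph", "<account-key>")
print(settings.url)
print(settings.username)
```

Keeping the settings in one immutable object makes it easy to pass the same values to a different driver later.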

Supported Graph Objects

TinkerPop is a standard that covers a wide range of graph technologies. Therefore, it has standard terminology to describe what features are provided by a graph provider. Azure Cosmos DB provides a persistent, high-concurrency, writeable graph database that can be partitioned across multiple servers or clusters.

The following table lists the TinkerPop features that are implemented by Azure Cosmos DB:

Category | Azure Cosmos DB implementation | Notes
Graph features | Provides Persistence and ConcurrentAccess. Designed to support Transactions. | Computer methods can be implemented via the Spark connector.
Variable features | Supports Boolean, Integer, Byte, Double, Float, Long, String | Supports primitive types, is compatible with complex types via data model
Vertex features | Supports RemoveVertices, MetaProperties, AddVertices, MultiProperties, StringIds, UserSuppliedIds, AddProperty, RemoveProperty | Supports creating, modifying, and deleting vertices
Vertex property features | StringIds, UserSuppliedIds, AddProperty, RemoveProperty, BooleanValues, ByteValues, DoubleValues, FloatValues, IntegerValues, LongValues, StringValues | Supports creating, modifying, and deleting vertex properties
Edge features | AddEdges, RemoveEdges, StringIds, UserSuppliedIds, AddProperty, RemoveProperty | Supports creating, modifying, and deleting edges
Edge property features | Properties, BooleanValues, ByteValues, DoubleValues, FloatValues, IntegerValues, LongValues, StringValues | Supports creating, modifying, and deleting edge properties

Gremlin wire format

Azure Cosmos DB uses the JSON format, which is the format it currently supports, when returning results from Gremlin operations. For example, the following snippet shows a JSON representation of a vertex returned to the client from Azure Cosmos DB:

    {
      "id": "a7111ba7-0ea1-43c9-b6b2-efc5e3aea4c0",
      "label": "person",
      "type": "vertex",
      "outE": {
        "knows": [
          {
            "id": "3ee53a60-c561-4c5e-9a9f-9c7924bc9aef",
            "inV": "04779300-1c8e-489d-9493-50fd1325a658"
          },
          {
            "id": "21984248-ee9e-43a8-a7f6-30642bc14609",
            "inV": "a8e3e741-2ef7-4c01-b7c8-199f8e43e3bc"
          }
        ]
      },
      "properties": {
        "firstName": [
          { "value": "Thomas" }
        ],
        "lastName": [
          { "value": "Andersen" }
        ],
        "age": [
          { "value": 45 }
        ]
      }
    }

The properties used by the JSON format for vertices are described below:

Property | Description
id | The ID for the vertex. Must be unique (in combination with the value of _partition if applicable). If no value is provided, a GUID is supplied automatically.
label | The label of the vertex. This property is used to describe the entity type.
type | Used to distinguish vertices from non-graph documents.
properties | Bag of user-defined properties associated with the vertex. Each property can have multiple values.
_partition | The partition key of the vertex. Used for graph partitioning.
outE | This property contains a list of out edges from a vertex. Storing the adjacency information with the vertex allows for fast execution of traversals. Edges are grouped based on their labels.

The edge contains the following information to help with navigation to other parts of the graph.

Property | Description
id | The ID for the edge. Must be unique (in combination with the value of _partition if applicable).
label | The label of the edge. This property is optional, and used to describe the relationship type.
inV | This property contains a list of in vertices for an edge. Storing the adjacency information with the edge allows for fast execution of traversals. Vertices are grouped based on their labels.
properties | Bag of user-defined properties associated with the edge. Each property can have multiple values.

Each property can store multiple values within an array.

Property | Description
value | The value of the property
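To make the layout concrete, here is a small sketch that reads a vertex document in the shape described above using only Python's `json` module. The function name `summarize_vertex` and the output keys are illustrative choices, and the sample document is a shortened, hypothetical vertex with the same fields.

```python
import json

# A shortened vertex document in the JSON shape described above (hypothetical data).
vertex_json = """
{
  "id": "a7111ba7-0ea1-43c9-b6b2-efc5e3aea4c0",
  "label": "person",
  "type": "vertex",
  "outE": {
    "knows": [
      {"id": "3ee53a60-c561-4c5e-9a9f-9c7924bc9aef",
       "inV": "04779300-1c8e-489d-9493-50fd1325a658"}
    ]
  },
  "properties": {
    "firstName": [{"value": "Thomas"}],
    "age": [{"value": 45}]
  }
}
"""


def summarize_vertex(doc: dict) -> dict:
    """Flatten a vertex document into ids, property values, and neighbors."""
    # Each property maps to an array of {"value": ...} entries.
    props = {
        name: [entry["value"] for entry in entries]
        for name, entries in doc.get("properties", {}).items()
    }
    # Out edges are grouped by edge label; inV holds the target vertex id.
    neighbors = {
        label: [edge["inV"] for edge in edges]
        for label, edges in doc.get("outE", {}).items()
    }
    return {"id": doc["id"], "label": doc["label"],
            "properties": props, "out_neighbors": neighbors}


summary = summarize_vertex(json.loads(vertex_json))
print(summary["properties"])     # {'firstName': ['Thomas'], 'age': [45]}
print(summary["out_neighbors"])  # {'knows': ['04779300-1c8e-489d-9493-50fd1325a658']}
```

Property values are kept as lists because each property can hold multiple values.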

Gremlin steps

Now let's look at the Gremlin steps supported by Azure Cosmos DB. For a complete reference on Gremlin, see the TinkerPop reference.

Step | Description | TinkerPop 3.2 Documentation
addE | Adds an edge between two vertices | addE step
addV | Adds a vertex to the graph | addV step
and | Ensures that all the traversals return a value | and step
as | A step modulator to assign a variable to the output of a step | as step
by | A step modulator used with group and order | by step
coalesce | Returns the first traversal that returns a result | coalesce step
constant | Returns a constant value. Used with coalesce | constant step
count | Returns the count from the traversal | count step
dedup | Returns the values with the duplicates removed | dedup step
drop | Drops the values (vertex/edge) | drop step
executionProfile | Creates a description of all operations generated by the executed Gremlin step | executionProfile step
fold | Acts as a barrier that computes the aggregate of results | fold step
group | Groups the values based on the labels specified | group step
has | Used to filter properties, vertices, and edges. Supports hasLabel, hasId, hasNot, and has variants. | has step
inject | Injects values into a stream | inject step
is | Used to perform a filter using a boolean expression | is step
limit | Used to limit the number of items in the traversal | limit step
local | Wraps a section of a traversal locally, similar to a subquery | local step
not | Used to produce the negation of a filter | not step
optional | Returns the result of the specified traversal if it yields a result; otherwise returns the calling element | optional step
or | Ensures at least one of the traversals returns a value | or step
order | Returns results in the specified sort order | order step
path | Returns the full path of the traversal | path step
project | Projects the properties as a Map | project step
properties | Returns the properties for the specified labels | properties step
range | Filters to the specified range of values | range step
repeat | Repeats the step for the specified number of times. Used for looping | repeat step
sample | Used to sample results from the traversal | sample step
select | Used to project results from the traversal | select step
store | Used for non-blocking aggregates from the traversal | store step
TextP.startingWith(string) | String filtering function. Used as a predicate for the has() step to match a property that starts with a given string | TextP predicates
TextP.endingWith(string) | String filtering function. Used as a predicate for the has() step to match a property that ends with a given string | TextP predicates
TextP.containing(string) | String filtering function. Used as a predicate for the has() step to match a property that contains a given string | TextP predicates
TextP.notStartingWith(string) | String filtering function. Used as a predicate for the has() step to match a property that doesn't start with a given string | TextP predicates
TextP.notEndingWith(string) | String filtering function. Used as a predicate for the has() step to match a property that doesn't end with a given string | TextP predicates
TextP.notContaining(string) | String filtering function. Used as a predicate for the has() step to match a property that doesn't contain a given string | TextP predicates
tree | Aggregates paths from a vertex into a tree | tree step
unfold | Unrolls an iterator as a step | unfold step
union | Merges results from multiple traversals | union step
V | Includes the steps necessary for traversals between vertices and edges: V, E, out, in, both, outE, inE, bothE, outV, inV, bothV, and otherV | vertex steps
where | Used to filter results from the traversal. Supports eq, neq, lt, lte, gt, gte, and between operators | where step
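The following sketch shows how a few of the steps in the table combine into complete traversals, again as plain Gremlin strings built in Python. `text_filter` is a hypothetical helper that renders a `has()` step with one of the `TextP` predicates listed above. The vertex ids, labels, and property names are illustrative, and the `until` modulator used with `repeat` is standard TinkerPop syntax rather than something listed in the table.

```python
# TextP predicate names taken from the table above.
TEXT_PREDICATES = {
    "startingWith", "endingWith", "containing",
    "notStartingWith", "notEndingWith", "notContaining",
}


def text_filter(prop: str, predicate: str, text: str) -> str:
    """Render has('<prop>', TextP.<predicate>('<text>')) (hypothetical helper)."""
    if predicate not in TEXT_PREDICATES:
        raise ValueError(f"unsupported TextP predicate: {predicate}")
    # Escape backslashes and single quotes so the literal stays well formed.
    # The property name is assumed to be a trusted identifier and is not escaped.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"has('{prop}', TextP.{predicate}('{escaped}'))"


people = "g.V().hasLabel('person')"
examples = {
    # has + TextP: people whose first name starts with "Th".
    "starts_with": f"{people}.{text_filter('firstName', 'startingWith', 'Th')}",
    # fold + coalesce + unfold + addV: return the vertex, or create it if missing.
    "get_or_create": "g.V('thomas').fold().coalesce(unfold(), addV('person').property('id', 'thomas'))",
    # repeat + path: walk 'knows' edges until a given vertex is reached.
    "path_to": "g.V('thomas').repeat(out('knows')).until(has('id', 'robin')).path()",
    # project + by + count: one map per person with a name and a friend count.
    "projection": f"{people}.project('name', 'friendCount').by('firstName').by(out('knows').count())",
}

for name, query in examples.items():
    print(f"{name}: {query}")
```

Validating the predicate name up front keeps typos from reaching the server as malformed queries.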

The write-optimized engine provided by Azure Cosmos DB automatically indexes all properties within vertices and edges by default. Therefore, queries with filters, range queries, sorting, or aggregates on any property are processed from the index and served efficiently. For more information on how indexing works in Azure Cosmos DB, see our paper on schema-agnostic indexing.
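As an illustration of the four query shapes mentioned above, the sketch below lists one hypothetical traversal each for a filter, a range query, a sort, and an aggregate. The `with_profile` helper is an assumption of this example: it appends the `executionProfile()` step from the steps table so you can inspect the operations a query generates. The label and property names are placeholders, and `decr` is the descending sort order token from TinkerPop 3.2.

```python
# One hypothetical traversal per query shape that can be served from the index.
indexed_queries = {
    "filter": "g.V().hasLabel('person').has('firstName', 'Thomas')",
    "range": "g.V().hasLabel('person').has('age', between(30, 50))",
    "sort": "g.V().hasLabel('person').order().by('age', decr)",
    "aggregate": "g.V().hasLabel('person').has('age', gt(40)).count()",
}


def with_profile(query: str) -> str:
    """Append executionProfile() to a traversal string (hypothetical helper)."""
    return query.rstrip() + ".executionProfile()"


for shape, query in indexed_queries.items():
    print(f"{shape}: {with_profile(query)}")
```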

Next steps