Migrate your application from Amazon DynamoDB to Azure Cosmos DB

Azure Cosmos DB is a scalable, multi-region distributed, fully managed database. It provides guaranteed low-latency access to your data. To learn more about Azure Cosmos DB, see the overview article. This article describes how to migrate your .NET application from DynamoDB to Azure Cosmos DB with minimal code changes.

Conceptual differences

The following are the key conceptual differences between Azure Cosmos DB and DynamoDB:

DynamoDB | Azure Cosmos DB
Not applicable | Database
Table | Collection
Item | Document
Attribute | Field
Secondary index | Secondary index
Primary key - partition key | Partition key
Primary key - sort key | Not required
Stream | Change feed
Write capacity unit | Request unit (flexible, can be used for reads or writes)
Read capacity unit | Request unit (flexible, can be used for reads or writes)
Global tables | Not required. You can select the region directly while provisioning the Azure Cosmos account (you can change the region later).

Structural differences

Azure Cosmos DB has a simpler JSON structure than DynamoDB. The following example shows the differences.

DynamoDB:

The following object shows how a table is defined in DynamoDB (key schema, attribute definitions, and provisioned throughput):

{
  TableName: "Music",
  KeySchema: [
    {
      AttributeName: "Artist",
      KeyType: "HASH" // Partition key
    },
    {
      AttributeName: "SongTitle",
      KeyType: "RANGE" // Sort key
    }
  ],
  AttributeDefinitions: [
    {
      AttributeName: "Artist",
      AttributeType: "S"
    },
    {
      AttributeName: "SongTitle",
      AttributeType: "S"
    }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 1,
    WriteCapacityUnits: 1
  }
}

Azure Cosmos DB:

The following JSON object represents the data format in Azure Cosmos DB:

{
"Artist": "",
"SongTitle": "",
"AlbumTitle": "",
"Year": 9999,
"Price": 0.0,
"Genre": "",
"Tags": ""
}

Migrate your data

There are various options available to migrate your data to Azure Cosmos DB. To learn more, see the Options to migrate your on-premises or cloud data to Azure Cosmos DB article.

Migrate your code

This article focuses on migrating an application's code to Azure Cosmos DB, which is a critical aspect of database migration. To help reduce the learning curve, the following sections include side-by-side comparisons of Amazon DynamoDB code and the equivalent Azure Cosmos DB code.

To download the source code, clone the following repo:

git clone https://github.com/Azure-Samples/DynamoDB-to-CosmosDB

Prerequisites

  • .NET Framework 4.7.2
  • Visual Studio 2019
  • Access to an Azure Cosmos DB SQL API account
  • Local installation of Amazon DynamoDB
  • Java 8
  • The downloadable version of Amazon DynamoDB running at port 8000 (you can change the port and configure the code accordingly)

Set up your code

Add the following NuGet package to your project:

Install-Package Microsoft.Azure.Cosmos 

Establish connection

DynamoDB:

In Amazon DynamoDB, the following code is used to connect:

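    AmazonDynamoDBConfig addbConfig = new AmazonDynamoDBConfig();
    addbConfig.ServiceURL = "endpoint";
    try { aws_dynamodbclient = new AmazonDynamoDBClient(addbConfig); }
    // The original snippet omitted the catch block; report the failure here.
    catch (Exception ex) { Console.WriteLine("Failed to create the DynamoDB client: " + ex.Message); }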

Azure Cosmos DB:

To connect to Azure Cosmos DB, update your code to:

client_cosmosDB = new CosmosClient("your connectionstring from the Azure portal");

Optimize the connection in Azure Cosmos DB

With Azure Cosmos DB, you can use the following options to optimize your connection:

  • ConnectionMode - Use direct connection mode to connect to the data nodes in the Azure Cosmos DB service. Use gateway mode only to initialize and cache the logical addresses and to refresh them on updates. See the connectivity modes article for more details.

  • ApplicationRegion - This option sets the preferred geo-replicated region that is used to interact with Azure Cosmos DB. To learn more, see the multiple-region distribution article.

  • ConsistencyLevel - This option overrides the default consistency level. To learn more, see the Consistency levels article.

  • AllowBulkExecution - Set this property to true to execute bulk operations. To learn more, see the Bulk import article.

    client_cosmosDB = new CosmosClient(" Your connection string ",new CosmosClientOptions()
    { 
      ConnectionMode=ConnectionMode.Direct,
      ApplicationRegion=Regions.ChinaEast2,
      ConsistencyLevel=ConsistencyLevel.Session,
      AllowBulkExecution=true  
    });
    

Provision the container

DynamoDB:

To store data in Amazon DynamoDB, you need to create the table first. In this process, you define the schema, key type, and attributes as shown in the following code:

// movies_key_schema
public static List<KeySchemaElement> movies_key_schema
  = new List<KeySchemaElement>
{
  new KeySchemaElement
  {
    AttributeName = partition_key_name,
    KeyType = "HASH"
  },
  new KeySchemaElement
  {
    AttributeName = sort_key_name,
    KeyType = "RANGE"
  }
};

// key names for the Movies table
public const string partition_key_name = "year";
public const string sort_key_name      = "title";
public const int readUnits = 1, writeUnits = 1;

// movie_items_attributes
public static List<AttributeDefinition> movie_items_attributes
  = new List<AttributeDefinition>
{
  new AttributeDefinition
  {
    AttributeName = partition_key_name,
    AttributeType = "N"
  },
  new AttributeDefinition
  {
    AttributeName = sort_key_name,
    AttributeType = "S"
  }
};

CreateTableRequest  request;
CreateTableResponse response;

// Build the 'CreateTableRequest' structure for the new table
request = new CreateTableRequest
{
  TableName             = table_name,
  AttributeDefinitions  = table_attributes,
  KeySchema             = table_key_schema,
  // Provisioned-throughput settings are always required,
  // although the local test version of DynamoDB ignores them.
  ProvisionedThroughput = new ProvisionedThroughput( readUnits, writeUnits )
};

Azure Cosmos DB:

In Amazon DynamoDB, you need to provision the read capacity units and write capacity units. In Azure Cosmos DB, you instead specify the throughput as Request Units (RU/s), which can be used dynamically for any operation. The data is organized as database --> container --> item. You can specify the throughput at the database level, at the container level, or both.

To create a database:

cosmosDatabase = await client_cosmosDB.CreateDatabaseIfNotExistsAsync(movies_table_name);

To create the container:

await cosmosDatabase.CreateContainerIfNotExistsAsync(
    new ContainerProperties() { PartitionKeyPath = "/" + partitionKey, Id = new_collection_name },
    provisionedThroughput);
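
The two throughput scopes described above can be sketched as follows. This is a minimal illustration, assuming the Microsoft.Azure.Cosmos v3 SDK; the class name ThroughputSketch, the names "MoviesDB" and "Movies", the 400 RU/s values, and the COSMOS_CONNECTION_STRING environment variable are all hypothetical and not part of the sample repository.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class ThroughputSketch
{
    public static async Task Main()
    {
        // Assumption: the connection string is supplied through an environment variable.
        string connectionString = Environment.GetEnvironmentVariable("COSMOS_CONNECTION_STRING");

        using (CosmosClient client = new CosmosClient(connectionString))
        {
            // Database-level throughput: RU/s shared by the containers in this database.
            Database database = await client.CreateDatabaseIfNotExistsAsync("MoviesDB", 400);

            // Container-level throughput: RU/s dedicated to this one container.
            ContainerProperties properties = new ContainerProperties("Movies", "/year");
            Container container = await database.CreateContainerIfNotExistsAsync(properties, 400);

            Console.WriteLine("Container ready: " + container.Id);
        }
    }
}
```

Passing a throughput value at only one of the two levels is also possible; which scope fits best depends on how evenly the workload is spread across containers.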

Load the data

DynamoDB:

The following code shows how to load the data in Amazon DynamoDB. moviesArray holds a list of JSON documents; iterate through it and load each JSON document into Amazon DynamoDB:

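int n = moviesArray.Count;
for (int i = 0, j = 99; i < n; i++)
{
  try
  {
    string itemJson = moviesArray[i].ToString();
    Document doc = Document.FromJson(itemJson);
    Task putItem = moviesTable.PutItemAsync(doc);
    // Print a progress marker every 100 items.
    if (i >= j)
    {
      j++;
      Console.Write("{0,5:#,##0}, ", j);
      if (j % 1000 == 0)
        Console.Write("\n ");
      j += 99;
    }
    await putItem;
  }
  catch (Exception ex)
  {
    // The original snippet was truncated here; report the failure and stop loading.
    Console.WriteLine("\n     ERROR: Could not write the movie record #{0:#,##0}, because:\n       {1}", i, ex.Message);
    break;
  }
}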

Azure Cosmos DB:

In Azure Cosmos DB, you can opt to stream and write with moviesContainer.CreateItemStreamAsync(). In this sample, however, the JSON is deserialized into the MovieModel type to demonstrate working with typed objects. The code issues the writes concurrently, which takes advantage of Azure Cosmos DB's distributed architecture and speeds up loading:

List<Task> concurrentTasks = new List<Task>();
for (int i = 0, j = 99; i < n; i++)
{
  try
  {
      MovieModel doc= JsonConvert.DeserializeObject<MovieModel>(moviesArray[i].ToString());
      doc.Id = Guid.NewGuid().ToString();
      concurrentTasks.Add(moviesContainer.CreateItemAsync(doc,new PartitionKey(doc.Year)));
      if (i >= j) // print a progress marker every 100 items, as in the DynamoDB loop
      {
          j++;
          Console.Write("{0,5:#,##0}, ", j);
          if (j % 1000 == 0)
              Console.Write("\n               ");
          j += 99;
      }

  }
  catch (Exception ex)
  {
      Console.WriteLine("\n     ERROR: Could not write the movie record #{0:#,##0}, because:\n       {1}",
                          i, ex.Message);
      operationFailed = true;
      break;
  }
}
await Task.WhenAll(concurrentTasks);

Create a document

DynamoDB:

Writing a new document in Amazon DynamoDB isn't type safe. The following example uses newItem as the document type:

Task<Document> writeNew = moviesTable.PutItemAsync(newItem, token);
await writeNew;

Azure Cosmos DB:

Azure Cosmos DB provides type safety through the data model. This example uses a data model named 'MovieModel':

public class MovieModel
{
    [JsonProperty("id")]
    public string Id { get; set; }
    [JsonProperty("title")]
    public string Title{ get; set; }
    [JsonProperty("year")]
    public int Year { get; set; }
    public MovieModel(string title, int year)
    {
        this.Title = title;
        this.Year = year;
    }
    public MovieModel()
    {

    }
    [JsonProperty("info")]
    public   MovieInfo MovieInfo { get; set; }

    internal string PrintInfo()
    {
        if (this.MovieInfo != null)
            // {0} (Id) is passed but not printed; {4} prints MovieInfo.ToString().
            return string.Format(
                "\nMovie with title:{1}\n Year: {2}, Actors: {3}\n Directors:{4}\n Rating:{5}\n",
                this.Id, this.Title, this.Year,
                String.Join(",", this.MovieInfo.Actors), this.MovieInfo, this.MovieInfo.Rating);
        else
            return string.Format("\nMovie with title:{0}\n Year: {1}\n", this.Title, this.Year);
    }
}

In Azure Cosmos DB, newItem becomes a MovieModel:

MovieModel movieModel = new MovieModel()
{
    Id = Guid.NewGuid().ToString(),
    Title = "The Big New Movie",
    Year = 2018,
    MovieInfo = new MovieInfo() { Plot = "Nothing happens at all.", Rating = 0 }
};
var writeNew = moviesContainer.CreateItemAsync(movieModel, new Microsoft.Azure.Cosmos.PartitionKey(movieModel.Year));
await writeNew;

Read a document

DynamoDB:

To read in Amazon DynamoDB, you need to define primitives:

// Create Primitives for the HASH and RANGE portions of the primary key
Primitive hash = new Primitive(year.ToString(), true);
Primitive range = new Primitive(title, false);

Task<Document> readMovie = moviesTable.GetItemAsync(hash, range, token);
movie_record = await readMovie;

Azure Cosmos DB:

With Azure Cosmos DB, however, the query reads naturally (LINQ):

IQueryable<MovieModel> movieQuery = moviesContainer.GetItemLinqQueryable<MovieModel>(true)
                        .Where(f => f.Year == year && f.Title == title);
// The query is executed synchronously here (allowSynchronousQueryExecution is true), but it can also be executed asynchronously via ToFeedIterator()
    foreach (MovieModel movie in movieQuery)
    {
      movie_record_cosmosdb = movie;
    }
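
The same read can be run asynchronously. The following is a minimal sketch, assuming the Microsoft.Azure.Cosmos v3 SDK and the MovieModel class defined earlier in this article; the class and method names AsyncLinqSketch and ReadMovieAsync are hypothetical.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Linq;

public static class AsyncLinqSketch
{
    // Returns the first movie matching the year and title, or null when none is found.
    // MovieModel is the data model class defined earlier in this article.
    public static async Task<MovieModel> ReadMovieAsync(Container moviesContainer, int year, string title)
    {
        // No 'true' argument here, so synchronous query execution stays disabled.
        IQueryable<MovieModel> movieQuery = moviesContainer
            .GetItemLinqQueryable<MovieModel>()
            .Where(f => f.Year == year && f.Title == title);

        // ToFeedIterator() turns the LINQ query into an asynchronous, paged iterator.
        FeedIterator<MovieModel> iterator = movieQuery.ToFeedIterator();
        while (iterator.HasMoreResults)
        {
            FeedResponse<MovieModel> page = await iterator.ReadNextAsync();
            foreach (MovieModel movie in page)
            {
                return movie;
            }
        }
        return null;
    }
}
```

A call such as `await AsyncLinqSketch.ReadMovieAsync(moviesContainer, 2018, "The Big New Movie")` would stand in for the synchronous foreach loop.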

The query in the preceding example:

  • is type safe
  • provides a natural query option.

Update an item

DynamoDB:

To update an item in Amazon DynamoDB:

updateResponse = await client.UpdateItemAsync( updateRequest );

Azure Cosmos DB:

In Azure Cosmos DB, an update is treated as an upsert operation, which means the document is inserted if it doesn't already exist:

await moviesContainer.UpsertItemAsync<MovieModel>(updatedMovieModel);
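
If inserting a missing document is not the desired behavior, one possible alternative is ReplaceItemAsync, which fails when the item doesn't exist. The following is a minimal sketch, assuming the Microsoft.Azure.Cosmos v3 SDK and the MovieModel class from this article; the names UpdateSketch and TryReplaceMovieAsync are hypothetical.

```csharp
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class UpdateSketch
{
    // Updates an existing movie; returns false instead of inserting when it is missing.
    public static async Task<bool> TryReplaceMovieAsync(Container moviesContainer, MovieModel updatedMovieModel)
    {
        try
        {
            await moviesContainer.ReplaceItemAsync<MovieModel>(
                updatedMovieModel,
                updatedMovieModel.Id,
                new PartitionKey(updatedMovieModel.Year));
            return true;
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            // Unlike UpsertItemAsync, ReplaceItemAsync does not create a missing item.
            return false;
        }
    }
}
```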

Delete a document

DynamoDB:

To delete an item in Amazon DynamoDB, you again need to fall back on primitives:

Primitive hash = new Primitive(year.ToString(), true);
Primitive range = new Primitive(title, false);
DeleteItemOperationConfig deleteConfig = new DeleteItemOperationConfig();
deleteConfig.ConditionalExpression = condition;
deleteConfig.ReturnValues = ReturnValues.AllOldAttributes;

Task<Document> delItem = table.DeleteItemAsync(hash, range, deleteConfig);
deletedItem = await delItem;

Azure Cosmos DB:

In Azure Cosmos DB, you can retrieve the matching documents and delete them asynchronously:

var result= ReadingMovieItem_async_List_CosmosDB("select * from c where c.info.rating>7 AND c.year=2018 AND c.title='The Big New Movie'");
while (result.HasMoreResults)
{
  var resultModel = await result.ReadNextAsync();
  foreach (var movie in resultModel.ToList<MovieModel>())
  {
    await moviesContainer.DeleteItemAsync<MovieModel>(movie.Id, new PartitionKey(movie.Year));
  }
}

Query documents

DynamoDB:

In Amazon DynamoDB, API functions are required to query the data:

QueryOperationConfig config = new QueryOperationConfig( );
  config.Filter = new QueryFilter( );
  config.Filter.AddCondition( "year", QueryOperator.Equal, new DynamoDBEntry[ ] { 1992 } );
  config.Filter.AddCondition( "title", QueryOperator.Between, new DynamoDBEntry[ ] { "B", "Hzz" } );
  config.AttributesToGet = new List<string> { "year", "title", "info" };
  config.Select = SelectValues.SpecificAttributes;
  search = moviesTable.Query( config ); 

Azure Cosmos DB:

In Azure Cosmos DB, you can do projection and filtering inside a simple SQL query:

var result = moviesContainer.GetItemQueryIterator<MovieModel>(
  "select c.year, c.title, c.info from c where c.year=1992 AND c.title BETWEEN 'B' AND 'Hzz'");

For range operations, such as 'between', you need to do a scan in Amazon DynamoDB:

ScanRequest sRequest = new ScanRequest
{
  TableName = "Movies",
  ExpressionAttributeNames = new Dictionary<string, string>
  {
    { "#yr", "year" }
  },
  ExpressionAttributeValues = new Dictionary<string, AttributeValue>
  {
      { ":y_a", new AttributeValue { N = "1960" } },
      { ":y_z", new AttributeValue { N = "1969" } },
  },
  FilterExpression = "#yr between :y_a and :y_z",
  ProjectionExpression = "#yr, title, info.actors[0], info.directors, info.running_time_secs"
};

ClientScanning_async( sRequest ).Wait( );

In Azure Cosmos DB, you can use a SQL query in a single-line statement:

var result = moviesContainer.GetItemQueryIterator<MovieModel>(
  "select c.title, c.info.actors[0], c.info.directors, c.info.running_time_secs from c where c.year BETWEEN 1960 AND 1969");

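The range query above can also be written with parameters instead of inline literals, and its results read page by page. The following is a minimal sketch, assuming the Microsoft.Azure.Cosmos v3 SDK and the MovieModel class from this article; the names RangeQuerySketch, BuildYearRangeQuery, QueryMoviesAsync, @startYear, and @endYear are hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class RangeQuerySketch
{
    // Builds the range query with parameters instead of inline literals.
    public static QueryDefinition BuildYearRangeQuery(int startYear, int endYear)
    {
        return new QueryDefinition("SELECT * FROM c WHERE c.year BETWEEN @startYear AND @endYear")
            .WithParameter("@startYear", startYear)
            .WithParameter("@endYear", endYear);
    }

    // Drains every page of the iterator into a list.
    public static async Task<List<MovieModel>> QueryMoviesAsync(Container moviesContainer, int startYear, int endYear)
    {
        List<MovieModel> movies = new List<MovieModel>();
        FeedIterator<MovieModel> iterator =
            moviesContainer.GetItemQueryIterator<MovieModel>(BuildYearRangeQuery(startYear, endYear));
        while (iterator.HasMoreResults)
        {
            FeedResponse<MovieModel> page = await iterator.ReadNextAsync();
            movies.AddRange(page);
        }
        return movies;
    }
}
```

Calling `await RangeQuerySketch.QueryMoviesAsync(moviesContainer, 1960, 1969)` would cover the same range as the single-line statement.
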
Delete a container

DynamoDB:

To delete the table in Amazon DynamoDB, you can specify:

client.DeleteTableAsync( tableName );

Azure Cosmos DB:

To delete the collection in Azure Cosmos DB, you can specify:

await moviesContainer.DeleteContainerAsync();

Then delete the database too, if needed:

await cosmosDatabase.DeleteAsync();

As you can see, Azure Cosmos DB supports natural queries (SQL), and its operations are asynchronous and much simpler. You can migrate complex code to Azure Cosmos DB, and the code becomes simpler after the migration.

Next steps