Delta Lake and Delta Engine guide
Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs. Delta Lake on Azure Databricks allows you to configure Delta Lake based on your workload patterns.
Azure Databricks also includes Delta Engine, which provides optimized layouts and indexes for fast interactive queries.
This guide covers Delta Lake and Delta Engine on Azure Databricks.
- Introduction
- Delta Lake quickstart
- Introductory notebooks
- Ingest data into Delta Lake
- Table batch reads and writes
- Table streaming reads and writes
- Table deletes, updates, and merges
- Table utility commands
- Constraints
- Table versioning
- API reference
- Concurrency control
- Migration guide
- Best practices
- Frequently asked questions (FAQ)
- Resources
- Delta Engine