Get started with Document Intelligence
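Important
- Azure Cognitive Services Form Recognizer is now Azure AI Document Intelligence.
- Some platforms are still awaiting the renaming update.
- All mentions of Form Recognizer or Document Intelligence in our documentation refer to the same Azure service.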
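Azure AI Document Intelligence / Form Recognizer is a cloud-based Azure AI service that uses machine learning to extract key-value pairs, text, tables, and key data from your documents.
You can integrate document processing models into your workflows and applications by using a programming language SDK or by calling the REST API.
For this quickstart, we recommend that you use the free service while you're learning the technology. Remember that the number of free pages is limited to 500 per month.
To learn more about the API features and development options, visit our Overview page.
Client library | SDK reference | REST API reference | Package | Samples | Supported REST API version
In this quickstart, you use the layout model and the prebuilt invoice model to analyze and extract data and values from forms and documents.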
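An Azure subscription - create a trial subscription.
The current version of the Visual Studio IDE.
An Azure AI services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or Azure AI multi-service resource in the Azure portal to get your key and endpoint.
You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
If you plan to access multiple Azure AI services under a single endpoint/key, create an Azure AI services resource. For Form Recognizer access only, create a Form Recognizer resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select "Go to resource". You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You paste your key and endpoint into the code later in the quickstart.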
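Start Visual Studio.
On the start page, choose "Create a new project".
On the "Create a new project" page, enter "console" in the search box. Choose the "Console Application" template, then choose "Next".
In the "Configure your new project" dialog window, enter form_recognizer_quickstart in the Project name box. Then choose "Next".
In the "Additional information" dialog window, select ".NET 8.0 (Long-term support)", and then select "Create".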
Right-click your form_recognizer_quickstart project and select "Manage NuGet Packages...".
Select the Browse tab and enter Azure.AI.FormRecognizer. From the dropdown menu, select version 4.1.0. To follow the version 4.0.0 samples instead, select version 4.0.0.
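To interact with the Form Recognizer service, you need to create an instance of the DocumentAnalysisClient class. To do so, you create an AzureKeyCredential with your key from the Azure portal, and then a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.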
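Note
- Starting with .NET 6, new projects that use the console template generate a new program style that differs from previous versions.
- The new output uses recent C# features that simplify the code you need to write.
- When you use the newer version, you only need to write the body of the Main method. You don't need to include top-level statements, global using directives, or implicit using directives.
- For more information, see New C# templates generate top-level statements.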
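Open the Program.cs file.
Delete the pre-existing code, including the line Console.WriteLine("Hello World!"), and select one of the following code samples to copy and paste into your application's Program.cs file: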
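Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.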
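Extract text, selection marks, text styles, table structures, and bounding region coordinates from documents.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- We've added the file URI value to the Uri fileUri variable at the top of the script.
- To extract the layout from a given file at a URI, use the AnalyzeDocumentFromUriAsync method and pass prebuilt-layout as the model ID. The returned value is an AnalyzeResult object containing data from the submitted document.
Add the following code sample to the Program.cs file. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance: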
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding box is:");
Console.WriteLine($" Upper left => X: {line.BoundingPolygon[0].X}, Y= {line.BoundingPolygon[0].Y}");
Console.WriteLine($" Upper right => X: {line.BoundingPolygon[1].X}, Y= {line.BoundingPolygon[1].Y}");
Console.WriteLine($" Lower right => X: {line.BoundingPolygon[2].X}, Y= {line.BoundingPolygon[2].Y}");
Console.WriteLine($" Lower left => X: {line.BoundingPolygon[3].X}, Y= {line.BoundingPolygon[3].Y}");
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding box is:");
Console.WriteLine($" Upper left => X: {selectionMark.BoundingPolygon[0].X}, Y= {selectionMark.BoundingPolygon[0].Y}");
Console.WriteLine($" Upper right => X: {selectionMark.BoundingPolygon[1].X}, Y= {selectionMark.BoundingPolygon[1].Y}");
Console.WriteLine($" Lower right => X: {selectionMark.BoundingPolygon[2].X}, Y= {selectionMark.BoundingPolygon[2].Y}");
Console.WriteLine($" Lower left => X: {selectionMark.BoundingPolygon[3].X}, Y= {selectionMark.BoundingPolygon[3].Y}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
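Run your application
After you add a code sample to your application, choose the green Start button next to form_recognizer_quickstart to build and run your program, or press F5.
Here's a snippet of the expected output: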
Document Page 1 has 69 line(s), 425 word(s), and 15 selection mark(s).
Line 0 has content: 'UNITED STATES'.
Its bounding box is:
Upper left => X: 3.4915, Y= 0.6828
Upper right => X: 5.0116, Y= 0.6828
Lower right => X: 5.0116, Y= 0.8265
Lower left => X: 3.4915, Y= 0.8265
Line 1 has content: 'SECURITIES AND EXCHANGE COMMISSION'.
Its bounding box is:
Upper left => X: 2.1937, Y= 0.9061
Upper right => X: 6.297, Y= 0.9061
Lower right => X: 6.297, Y= 1.0498
Lower left => X: 2.1937, Y= 1.0498
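To view the entire output, visit the Azure samples repository on GitHub to view the layout model output.
Add the following code sample to the Program.cs file. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance: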
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < line.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
}
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
}
}
}
Console.WriteLine("Paragraphs:");
foreach (DocumentParagraph paragraph in result.Paragraphs)
{
Console.WriteLine($" Paragraph content: {paragraph.Content}");
if (paragraph.Role != null)
{
Console.WriteLine($" Role: {paragraph.Role}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
Extract the layout of a document from a file stream
To extract the layout from a given file at a file stream, use the AnalyzeDocumentAsync method and pass prebuilt-layout as the model ID. The returned value is an AnalyzeResult object containing data about the submitted document. The snippet reuses the client instance created in the previous sample.
string filePath = "<filePath>";
using var stream = new FileStream(filePath, FileMode.Open);
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-layout", stream);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < line.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
}
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
}
}
}
Console.WriteLine("Paragraphs:");
foreach (DocumentParagraph paragraph in result.Paragraphs)
{
Console.WriteLine($" Paragraph content: {paragraph.Content}");
if (paragraph.Role != null)
{
Console.WriteLine($" Role: {paragraph.Role}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
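Run your application
After you add a code sample to your application, choose the green Start button next to form_recognizer_quickstart to build and run your program, or press F5.
Analyze and extract common fields from specific document types using a prebuilt model. In this example, we analyze an invoice using the prebuilt invoice model.
Tip
You aren't limited to invoices: there are several prebuilt models to choose from, each of which has its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. See model data extraction.
Add the following code sample to the Program.cs file. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance: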
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample invoice document
Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
Console.WriteLine($"Document {i}:");
AnalyzedDocument document = result.Documents[i];
if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
{
if (vendorNameField.FieldType == DocumentFieldType.String)
{
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
{
if (customerNameField.FieldType == DocumentFieldType.String)
{
string customerName = customerNameField.Value.AsString();
Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
{
if (itemsField.FieldType == DocumentFieldType.List)
{
foreach (DocumentField itemField in itemsField.Value.AsList())
{
Console.WriteLine("Item:");
if (itemField.FieldType == DocumentFieldType.Dictionary)
{
IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();
if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
{
if (itemDescriptionField.FieldType == DocumentFieldType.String)
{
string itemDescription = itemDescriptionField.Value.AsString();
Console.WriteLine($" Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
}
}
if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
{
if (itemAmountField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();
Console.WriteLine($" Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
}
}
}
}
}
}
if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
{
if (subTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue subTotal = subTotalField.Value.AsCurrency();
Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
}
}
if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
{
if (totalTaxField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
}
}
if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
{
if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
}
}
}
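Run your application
After you add a code sample to your application, choose the green Start button next to form_recognizer_quickstart to build and run your program, or press F5.
Here's a snippet of the expected output: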
Document 0:
Vendor Name: 'CONTOSO LTD.', with confidence 0.962
Customer Name: 'MICROSOFT CORPORATION', with confidence 0.951
Item:
Description: 'Test for 23 fields', with confidence 0.899
Amount: '100', with confidence 0.902
Sub Total: '100', with confidence 0.979
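To view the entire output, visit the Azure samples repository on GitHub to view the prebuilt model output.
Add the following code sample to the Program.cs file. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance: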
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample invoice document
Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
Console.WriteLine($"Document {i}:");
AnalyzedDocument document = result.Documents[i];
if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
{
if (vendorNameField.FieldType == DocumentFieldType.String)
{
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
{
if (customerNameField.FieldType == DocumentFieldType.String)
{
string customerName = customerNameField.Value.AsString();
Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
{
if (itemsField.FieldType == DocumentFieldType.List)
{
foreach (DocumentField itemField in itemsField.Value.AsList())
{
Console.WriteLine("Item:");
if (itemField.FieldType == DocumentFieldType.Dictionary)
{
IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();
if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
{
if (itemDescriptionField.FieldType == DocumentFieldType.String)
{
string itemDescription = itemDescriptionField.Value.AsString();
Console.WriteLine($" Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
}
}
if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
{
if (itemAmountField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();
Console.WriteLine($" Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
}
}
}
}
}
}
if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
{
if (subTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue subTotal = subTotalField.Value.AsCurrency();
Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
}
}
if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
{
if (totalTaxField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
}
}
if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
{
if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
}
}
}
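Run your application
After you add a code sample to your application, choose the green Start button next to form_recognizer_quickstart to build and run your program, or press F5.
Client library | SDK reference | REST API reference | Package (Maven) | Samples | Supported REST API version
In this quickstart, you use the layout model and the prebuilt invoice model to analyze and extract data and values from forms and documents.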
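An Azure subscription - create a trial subscription.
The latest version of Visual Studio Code or your preferred IDE. See Java in Visual Studio Code.
Tip
- Visual Studio Code offers a "Coding Pack for Java" for Windows and macOS. The coding pack is a bundle of VS Code, the Java Development Kit (JDK), and a collection of extensions suggested by Microsoft. The coding pack can also be used to fix an existing development environment.
- If you're using VS Code and the Coding Pack for Java, install the Gradle for Java extension.
If you aren't using Visual Studio Code, make sure you have the following installed in your development environment:
A Java Development Kit (JDK), version 8 or later. For more information, see Microsoft Build of OpenJDK.
Gradle, version 6.8 or later.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
If you plan to access multiple Azure AI services under a single endpoint/key, create an Azure AI services resource. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select "Go to resource". You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in the quickstart.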
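In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app called form-recognize-app, and navigate to it.
mkdir form-recognize-app && cd form-recognize-app
mkdir form-recognize-app; cd form-recognize-app
Run the gradle init command from your working directory. This command creates essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.
gradle init --type basic
When prompted to choose a DSL, select Kotlin.
Accept the default project name (form-recognize-app) by selecting Return or Enter.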
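This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.
Open the project's build.gradle.kts file in your IDE. Copy and paste the following code to include the client library as an implementation statement, along with the required plugins and settings.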
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation("com.azure:azure-ai-formrecognizer:4.1.0")
}
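To use version 4.0.0 of the client library instead, open the project's build.gradle.kts file in your IDE and use the following contents, which include the client library as an implementation statement along with the required plugins and settings.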
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation("com.azure:azure-ai-formrecognizer:4.0.0")
}
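To interact with the Document Intelligence service, you need to create an instance of the DocumentAnalysisClient class. To do so, you create an AzureKeyCredential with your key from the Azure portal, and then a DocumentAnalysisClient instance with the AzureKeyCredential and your Document Intelligence endpoint.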
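From the form-recognize-app directory, run the following command:
mkdir -p src/main/java
This creates the src/main/java directory structure. The FormRecognizer.java file referenced in the following steps goes in that directory.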
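Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.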
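One way to keep the key out of source code during local development is to read it from the environment. The following is a minimal sketch, not part of the official sample: the variable names FORM_RECOGNIZER_KEY and FORM_RECOGNIZER_ENDPOINT and the CredentialSettings class are assumptions chosen for illustration, and the sketch uses only the Java standard library.

```java
import java.util.Map;

// Sketch: resolve the key and endpoint from environment variables instead of
// hardcoding them. The class and variable names are hypothetical.
final class CredentialSettings {
    static final String KEY_VAR = "FORM_RECOGNIZER_KEY";           // assumed name
    static final String ENDPOINT_VAR = "FORM_RECOGNIZER_ENDPOINT"; // assumed name

    // Returns the trimmed value of a required setting, or fails fast if it is missing or empty.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            Map<String, String> env = System.getenv();
            String endpoint = require(env, ENDPOINT_VAR);
            String key = require(env, KEY_VAR);
            // Pass these values to the client builder in place of string literals.
            System.out.println("Endpoint: " + endpoint + " (key length " + key.length() + ")");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The values returned by require can then replace the endpoint and key string literals in the samples that follow. For production, a secret store such as Azure Key Vault remains the recommended option.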
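Extract text, selection marks, text styles, table structures, and bounding region coordinates from documents.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- To analyze a given file at a URI, use the beginAnalyzeDocumentFromUrl method and pass prebuilt-layout as the model ID. The returned value is an AnalyzeResult object containing data about the submitted document.
- We've added the file URI value to the documentUrl variable in the main method.
Add the following code sample to the FormRecognizer.java file. Make sure you update the key and endpoint variables with values from your Azure portal Document Intelligence instance: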
import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
String modelId = "prebuilt-layout";
SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutResultPoller =
client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();
// pages
analyzeLayoutResult.getPages().forEach(documentPage -> {
System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
documentPage.getWidth(),
documentPage.getHeight(),
documentPage.getUnit());
// lines
documentPage.getLines().forEach(documentLine ->
System.out.printf("Line %s is within a bounding polygon %s.%n",
documentLine.getContent(),
documentLine.getBoundingPolygon().toString()));
// words
documentPage.getWords().forEach(documentWord ->
System.out.printf("Word '%s' has a confidence score of %.2f%n",
documentWord.getContent(),
documentWord.getConfidence()));
// selection marks
documentPage.getSelectionMarks().forEach(documentSelectionMark ->
System.out.printf("Selection mark is %s and is within a bounding polygon %s with confidence %.2f.%n",
documentSelectionMark.getState().toString(),
documentSelectionMark.getBoundingPolygon().toString(),
documentSelectionMark.getConfidence()));
});
// tables
List < DocumentTable > tables = analyzeLayoutResult.getTables();
for (int i = 0; i < tables.size(); i++) {
DocumentTable documentTable = tables.get(i);
System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
documentTable.getColumnCount());
documentTable.getCells().forEach(documentTableCell -> {
System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
});
System.out.println();
}
}
// Utility function to get the bounding polygon coordinates
private static String getBoundingCoordinates(List < Point > boundingPolygon) {
return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
point.getY())).collect(Collectors.joining(", "));
}
}
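Build and run your application
After you add a code sample to your application, navigate back to your main project directory, form-recognize-app.
Build your application with the build command:
gradle build
Run your application with the run command:
gradle run
Here's a snippet of the expected output: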
Table 0 has 5 rows and 3 columns.
Cell 'Title of each class', has row index 0 and column index 0.
Cell 'Trading Symbol', has row index 0 and column index 1.
Cell 'Name of exchange on which registered', has row index 0 and column index 2.
Cell 'Common stock, $0.00000625 par value per share', has row index 1 and column index 0.
Cell 'MSFT', has row index 1 and column index 1.
Cell 'NASDAQ', has row index 1 and column index 2.
Cell '2.125% Notes due 2021', has row index 2 and column index 0.
Cell 'MSFT', has row index 2 and column index 1.
Cell 'NASDAQ', has row index 2 and column index 2.
Cell '3.125% Notes due 2028', has row index 3 and column index 0.
Cell 'MSFT', has row index 3 and column index 1.
Cell 'NASDAQ', has row index 3 and column index 2.
Cell '2.625% Notes due 2033', has row index 4 and column index 0.
Cell 'MSFT', has row index 4 and column index 1.
Cell 'NASDAQ', has row index 4 and column index 2.
To view the entire output, visit the Azure samples repository on GitHub to view the layout model output.
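The table output above lists one cell per line, each with a row index and a column index. If a two-dimensional view is easier to work with, the cells can be placed into a grid. The following is a rough sketch under stated assumptions: Cell is a hypothetical stand-in for the row index, column index, and content that the sample prints (it is not an SDK type), and row or column spans are ignored.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: rebuild a row-by-column grid from a flat list of table cells.
final class TableGridSketch {

    // Hypothetical stand-in for the values printed for each table cell.
    static final class Cell {
        final int row;
        final int column;
        final String content;

        Cell(int row, int column, String content) {
            this.row = row;
            this.column = column;
            this.content = content;
        }
    }

    // Places each cell at grid[row][column]; positions with no cell stay empty strings.
    static String[][] toGrid(int rowCount, int columnCount, List<Cell> cells) {
        String[][] grid = new String[rowCount][columnCount];
        for (String[] row : grid) {
            Arrays.fill(row, "");
        }
        for (Cell cell : cells) {
            grid[cell.row][cell.column] = cell.content;
        }
        return grid;
    }

    public static void main(String[] args) {
        // A few cells taken from the expected output above.
        List<Cell> cells = Arrays.asList(
            new Cell(0, 0, "Title of each class"),
            new Cell(0, 1, "Trading Symbol"),
            new Cell(1, 0, "Common stock, $0.00000625 par value per share"),
            new Cell(1, 1, "MSFT"));
        for (String[] row : toGrid(2, 2, cells)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In the real sample, the row count, column count, and cell values would come from getRowCount, getColumnCount, and getCells on each DocumentTable.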
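Add the following code sample to the FormRecognizer.java file. Make sure you update the key and endpoint variables with values from your Azure portal Document Intelligence instance: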
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentTable;
import com.azure.ai.formrecognizer.documentanalysis.models.Point;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.util.List;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
String modelId = "prebuilt-layout";
SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutPoller =
client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
AnalyzeResult analyzeLayoutResult = analyzeLayoutPoller.getFinalResult();
// pages
analyzeLayoutResult.getPages().forEach(documentPage -> {
System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
documentPage.getWidth(),
documentPage.getHeight(),
documentPage.getUnit());
// lines
documentPage.getLines().forEach(documentLine ->
System.out.printf("Line '%s' is within a bounding polygon %s.%n",
documentLine.getContent(),
getBoundingCoordinates(documentLine.getBoundingPolygon())));
// words
documentPage.getWords().forEach(documentWord ->
System.out.printf("Word '%s' has a confidence score of %.2f.%n",
documentWord.getContent(),
documentWord.getConfidence()));
// selection marks
documentPage.getSelectionMarks().forEach(documentSelectionMark ->
System.out.printf("Selection mark is '%s' and is within a bounding polygon %s with confidence %.2f.%n",
documentSelectionMark.getSelectionMarkState().toString(),
getBoundingCoordinates(documentSelectionMark.getBoundingPolygon()),
documentSelectionMark.getConfidence()));
});
// tables
List < DocumentTable > tables = analyzeLayoutResult.getTables();
for (int i = 0; i < tables.size(); i++) {
DocumentTable documentTable = tables.get(i);
System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
documentTable.getColumnCount());
documentTable.getCells().forEach(documentTableCell -> {
System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
});
System.out.println();
}
// styles
analyzeLayoutResult.getStyles().forEach(documentStyle ->
System.out.printf("Document is handwritten %s.%n", documentStyle.isHandwritten()));
}
/**
* Utility function to get the bounding polygon coordinates.
*/
private static String getBoundingCoordinates(List < Point > boundingPolygon) {
return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
point.getY())).collect(Collectors.joining(", "));
}
}
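The getBoundingCoordinates helper above formats every point of a bounding polygon. When only an axis-aligned rectangle is needed, for example to crop or highlight a region, the polygon can be reduced to its minimum and maximum coordinates. This is an illustrative sketch, not part of the official sample: points are plain {x, y} arrays standing in for the SDK's Point type, and the coordinates are those of the first line in the C# expected output earlier in this article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Sketch only: reduce a bounding polygon to an axis-aligned bounding box.
final class BoundingBoxSketch {

    // Returns {minX, minY, maxX, maxY} for a polygon given as {x, y} pairs.
    static double[] boundingBox(List<double[]> polygon) {
        if (polygon.isEmpty()) {
            throw new IllegalArgumentException("polygon has no points");
        }
        double minX = Double.POSITIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY;
        double maxY = Double.NEGATIVE_INFINITY;
        for (double[] point : polygon) {
            minX = Math.min(minX, point[0]);
            minY = Math.min(minY, point[1]);
            maxX = Math.max(maxX, point[0]);
            maxY = Math.max(maxY, point[1]);
        }
        return new double[] {minX, minY, maxX, maxY};
    }

    public static void main(String[] args) {
        // Upper left, upper right, lower right, lower left.
        List<double[]> polygon = Arrays.asList(
            new double[] {3.4915, 0.6828},
            new double[] {5.0116, 0.6828},
            new double[] {5.0116, 0.8265},
            new double[] {3.4915, 0.8265});
        double[] box = boundingBox(polygon);
        System.out.println(String.format(Locale.ROOT,
            "x: %.4f to %.4f, y: %.4f to %.4f", box[0], box[2], box[1], box[3]));
    }
}
```

Taking the minimum and maximum over all points, rather than assuming four corners in a fixed order, keeps the sketch valid for rotated text and for polygons with a different number of points.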
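Build and run your application
After you add a code sample to your application, navigate back to your main project directory, form-recognize-app.
Build your application with the build command:
gradle build
Run your application with the run command:
gradle run
Analyze and extract common fields from specific document types using a prebuilt model. In this example, we analyze an invoice using the prebuilt invoice model.
Tip
You aren't limited to invoices: there are several prebuilt models to choose from, each of which has its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. See model data extraction.
Add the following code sample to the FormRecognizer.java file. Make sure you update the key and endpoint variables with values from your Azure portal Document Intelligence instance: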
import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(final String[] args) throws IOException {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String modelId = "prebuilt-invoice";
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);
AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();
for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
System.out.printf("----------- Analyzing invoice %d -----------%n", i);
DocumentField vendorNameField = invoiceFields.get("VendorName");
if (vendorNameField != null) {
if (DocumentFieldType.STRING == vendorNameField.getType()) {
String merchantName = vendorNameField.getValueAsString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n",
merchantName, vendorNameField.getConfidence());
}
}
DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
if (vendorAddressField != null) {
if (DocumentFieldType.STRING == vendorAddressField.getType()) {
String merchantAddress = vendorAddressField.getValueAsString();
System.out.printf("Vendor address: %s, confidence: %.2f%n",
merchantAddress, vendorAddressField.getConfidence());
}
}
DocumentField customerNameField = invoiceFields.get("CustomerName");
if (customerNameField != null) {
if (DocumentFieldType.STRING == customerNameField.getType()) {
String customerName = customerNameField.getValueAsString();
System.out.printf("Customer Name: %s, confidence: %.2f%n",
customerName, customerNameField.getConfidence());
}
}
DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
String customerAddr = customerAddressRecipientField.getValueAsString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
customerAddr, customerAddressRecipientField.getConfidence());
}
}
DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
if (invoiceIdField != null) {
if (DocumentFieldType.STRING == invoiceIdField.getType()) {
String invoiceId = invoiceIdField.getValueAsString();
System.out.printf("Invoice ID: %s, confidence: %.2f%n",
invoiceId, invoiceIdField.getConfidence());
}
}
DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
if (invoiceDateField != null) {
if (DocumentFieldType.DATE == invoiceDateField.getType()) {
LocalDate invoiceDate = invoiceDateField.getValueAsDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n",
invoiceDate, invoiceDateField.getConfidence());
}
}
DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
if (invoiceTotalField != null) {
if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
Double invoiceTotal = invoiceTotalField.getValueAsDouble();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
invoiceTotal, invoiceTotalField.getConfidence());
}
}
DocumentField invoiceItemsField = invoiceFields.get("Items");
if (invoiceItemsField != null) {
System.out.printf("Invoice Items: %n");
if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
invoiceItems.stream()
.filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
.map(documentField -> documentField.getValueAsMap())
.forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {
// See a full list of fields found on an invoice here:
// https://aka.ms/formrecognizer/invoicefields
if ("Description".equals(key)) {
if (DocumentFieldType.STRING == documentField.getType()) {
String name = documentField.getValueAsString();
System.out.printf("Description: %s, confidence: %.2f%n",
name, documentField.getConfidence());
}
}
if ("Quantity".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double quantity = documentField.getValueAsDouble();
System.out.printf("Quantity: %f, confidence: %.2f%n",
quantity, documentField.getConfidence());
}
}
if ("UnitPrice".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double unitPrice = documentField.getValueAsDouble();
System.out.printf("Unit Price: %f, confidence: %.2f%n",
unitPrice, documentField.getConfidence());
}
}
if ("ProductCode".equals(key)) {
// The product code is text, so read it as a string rather than a double.
if (DocumentFieldType.STRING == documentField.getType()) {
String productCode = documentField.getValueAsString();
System.out.printf("Product Code: %s, confidence: %.2f%n",
productCode, documentField.getConfidence());
}
}
}));
}
}
}
}
}
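Build and run the application
After you add the code sample to your application, navigate back to your main project directory, doc-intel-app.
Build your application with the build command: gradle build
Run your application with the run command: gradle run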
Here's a snippet of the expected output:
----------- Analyzing invoice 0 -----------
Analyzed document has doc type invoice with confidence : 1.00
Vendor Name: CONTOSO LTD., confidence: 0.92
Vendor address: 123 456th St New York, NY, 10001, confidence: 0.91
Customer Name: MICROSOFT CORPORATION, confidence: 0.84
Customer Address Recipient: Microsoft Corp, confidence: 0.92
Invoice ID: INV-100, confidence: 0.97
Invoice Date: 2019-11-15, confidence: 0.97
To view the entire output, visit the Azure samples repository on GitHub and see the prebuilt invoice model output.
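Add the following code sample to the FormRecognizer.java file. Make sure you update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal: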
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzedDocument;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentField;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentFieldType;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String modelId = "prebuilt-invoice";
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);
AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();
for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
System.out.printf("----------- Analyzing invoice %d -----------%n", i);
DocumentField vendorNameField = invoiceFields.get("VendorName");
if (vendorNameField != null) {
if (DocumentFieldType.STRING == vendorNameField.getType()) {
String merchantName = vendorNameField.getValueAsString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n",
merchantName, vendorNameField.getConfidence());
}
}
DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
if (vendorAddressField != null) {
if (DocumentFieldType.STRING == vendorAddressField.getType()) {
String merchantAddress = vendorAddressField.getValueAsString();
System.out.printf("Vendor address: %s, confidence: %.2f%n",
merchantAddress, vendorAddressField.getConfidence());
}
}
DocumentField customerNameField = invoiceFields.get("CustomerName");
if (customerNameField != null) {
if (DocumentFieldType.STRING == customerNameField.getType()) {
String customerName = customerNameField.getValueAsString();
System.out.printf("Customer Name: %s, confidence: %.2f%n",
customerName, customerNameField.getConfidence());
}
}
DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
String customerAddr = customerAddressRecipientField.getValueAsString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
customerAddr, customerAddressRecipientField.getConfidence());
}
}
DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
if (invoiceIdField != null) {
if (DocumentFieldType.STRING == invoiceIdField.getType()) {
String invoiceId = invoiceIdField.getValueAsString();
System.out.printf("Invoice ID: %s, confidence: %.2f%n",
invoiceId, invoiceIdField.getConfidence());
}
}
DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
if (invoiceDateField != null) {
if (DocumentFieldType.DATE == invoiceDateField.getType()) {
LocalDate invoiceDate = invoiceDateField.getValueAsDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n",
invoiceDate, invoiceDateField.getConfidence());
}
}
DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
if (invoiceTotalField != null) {
if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
Double invoiceTotal = invoiceTotalField.getValueAsDouble();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
invoiceTotal, invoiceTotalField.getConfidence());
}
}
DocumentField invoiceItemsField = invoiceFields.get("Items");
if (invoiceItemsField != null) {
System.out.printf("Invoice Items: %n");
if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
invoiceItems.stream()
.filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
.map(documentField -> documentField.getValueAsMap())
.forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {
// See a full list of fields found on an invoice here:
// https://aka.ms/formrecognizer/invoicefields
if ("Description".equals(key)) {
if (DocumentFieldType.STRING == documentField.getType()) {
String name = documentField.getValueAsString();
System.out.printf("Description: %s, confidence: %.2f%n",
name, documentField.getConfidence());
}
}
if ("Quantity".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double quantity = documentField.getValueAsDouble();
System.out.printf("Quantity: %f, confidence: %.2f%n",
quantity, documentField.getConfidence());
}
}
if ("UnitPrice".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double unitPrice = documentField.getValueAsDouble();
System.out.printf("Unit Price: %f, confidence: %.2f%n",
unitPrice, documentField.getConfidence());
}
}
if ("ProductCode".equals(key)) {
// The product code is text, so read it as a string rather than a double.
if (DocumentFieldType.STRING == documentField.getType()) {
String productCode = documentField.getValueAsString();
System.out.printf("Product Code: %s, confidence: %.2f%n",
productCode, documentField.getConfidence());
}
}
}));
}
}
}
}
}
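Build and run the application
After you add the code sample to your application, navigate back to your main project directory, doc-intel-app.
Build your application with the build command: gradle build
Run your application with the run command: gradle run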
Client library | SDK reference | REST API reference | Package (npm) | Samples | Supported REST API versions
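In this quickstart, you use the following features to analyze and extract data and values from forms and documents:
An Azure subscription - create a trial subscription.
The latest version of Visual Studio Code or your preferred IDE. For more information, see Node.js in Visual Studio Code.
The latest LTS version of Node.js.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
If you plan to access multiple Azure AI services through a single endpoint/key, create an Azure AI services resource. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select "Go to resource". You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart: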
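Create a new Node.js Express application: in a console window (such as cmd, PowerShell, or Bash), create a new directory for your app named doc-intel-app and navigate to it.
mkdir doc-intel-app && cd doc-intel-app
Run the npm init command to initialize the application and scaffold your project.
npm init
Specify your project's attributes by using the prompts presented in the terminal.
- The most important attributes are the name, version number, and entry point.
- We recommend keeping index.js as the entry point name. The description, test command, GitHub repository, keywords, author, and license information are optional attributes; you can skip them for this project.
- Accept the suggestions in parentheses by selecting Return or Enter.
- After you complete the prompts, a package.json file is created in your doc-intel-app directory.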
Install the ai-form-recognizer client library and azure/identity npm packages:
npm i @azure/ai-form-recognizer@5.0.0 @azure/identity
- Your app's package.json file is updated with the dependencies.
Install the ai-form-recognizer client library and azure/identity npm packages:
npm i @azure/ai-form-recognizer@4.0.0 @azure/identity
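Create a file named index.js in the application directory.
Tip
- You can create a new file by using PowerShell.
- Open a PowerShell window in your project directory by holding down the Shift key and right-clicking the folder.
- Type the following command: New-Item index.js.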
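To interact with the Document Intelligence service, you need to create an instance of the DocumentAnalysisClient class. To do so, create an AzureKeyCredential with your key from the Azure portal, then create a DocumentAnalysisClient instance with that AzureKeyCredential and your Document Intelligence endpoint.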
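Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.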
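Extract text, selection marks, text styles, table structures, and bounding region coordinates from documents.
- For this example, you need a document file from a URI. You can use our sample document for this quickstart.
- We added the file URL value to the formUrl variable near the top of the file.
- To analyze a given file from a URL, use the beginAnalyzeDocumentFromUrl method and pass in prebuilt-layout as the model ID.
Add the following code sample to the index.js file. Make sure you update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal: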
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-layout", formUrl);
const {
pages,
tables
} = await poller.pollUntilDone();
if (pages.length <= 0) {
console.log("No pages were extracted from the document.");
} else {
console.log("Pages:");
for (const page of pages) {
console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
console.log(` ${page.width}x${page.height}, angle: ${page.angle}`);
console.log(` ${page.lines.length} lines, ${page.words.length} words`);
}
}
if (tables.length <= 0) {
console.log("No tables were extracted from the document.");
} else {
console.log("Tables:");
for (const table of tables) {
console.log(
`- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
);
}
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
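Run your application
After you add the code sample to your application, run your program:
Navigate to the folder that contains your Document Intelligence application (doc-intel-app).
Type the following command in your terminal:
node index.js
Here's a snippet of the expected output: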
Pages:
- Page 1 (unit: inch)
8.5x11, angle: 0
69 lines, 425 words
Tables:
- Extracted table: 3 columns, 5 rows (15 cells)
To view the entire output, visit the Azure samples repository on GitHub and see the layout model output.
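In this example, we analyze an invoice using the prebuilt invoice model.
Tip
You aren't limited to invoices: there are several prebuilt models to choose from, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. See model data extraction.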
const {
AzureKeyCredential,
DocumentAnalysisClient
} = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-invoice", invoiceUrl);
const {
pages,
tables
} = await poller.pollUntilDone();
if (pages.length <= 0) {
console.log("No pages were extracted from the document.");
} else {
console.log("Pages:");
for (const page of pages) {
console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
console.log(` ${page.width}x${page.height}, angle: ${page.angle}`);
console.log(` ${page.lines.length} lines, ${page.words.length} words`);
if (page.lines && page.lines.length > 0) {
console.log(" Lines:");
for (const line of page.lines) {
console.log(` - "${line.content}"`);
// The words of the line can also be iterated independently. The words are computed based on their
// corresponding spans.
for (const word of line.words()) {
console.log(` - "${word.content}"`);
}
}
}
}
}
if (tables.length <= 0) {
console.log("No tables were extracted from the document.");
} else {
console.log("Tables:");
for (const table of tables) {
console.log(
`- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
);
}
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
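Run your application
After you add the code sample to your application, run your program:
Navigate to the folder that contains your Document Intelligence application (doc-intel-app).
Type the following command in your terminal:
node index.js
Here's a snippet of the expected output: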
Vendor Name: CONTOSO LTD.
Customer Name: MICROSOFT CORPORATION
Invoice Date: 2019-11-15T00:00:00.000Z
Due Date: 2019-12-15T00:00:00.000Z
Items:
- <no product code>
Description: Test for 23 fields
Quantity: 1
Date: undefined
Unit: undefined
Unit Price: 1
Tax: undefined
Amount: 100
To view the entire output, visit the Azure samples repository on GitHub and see the prebuilt invoice model output.
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
// The input is a URL string, so use the URL variant of the analyze call.
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-invoice", invoiceUrl);
const {
documents: [document],
} = await poller.pollUntilDone();
if (document) {
const {
vendorName,
customerName,
invoiceDate,
dueDate,
items,
subTotal,
previousUnpaidBalance,
totalTax,
amountDue,
} = document.fields;
// The invoice model has many fields. For details, see the invoice model field extraction documentation.
console.log("Vendor Name:", vendorName && vendorName.value);
console.log("Customer Name:", customerName && customerName.value);
console.log("Invoice Date:", invoiceDate && invoiceDate.value);
console.log("Due Date:", dueDate && dueDate.value);
console.log("Items:");
for (const item of (items && items.values) || []) {
const { productCode, description, quantity, date, unit, unitPrice, tax, amount } =
item.properties;
console.log("-", (productCode && productCode.value) || "<no product code>");
console.log(" Description:", description && description.value);
console.log(" Quantity:", quantity && quantity.value);
console.log(" Date:", date && date.value);
console.log(" Unit:", unit && unit.value);
console.log(" Unit Price:", unitPrice && unitPrice.value);
console.log(" Tax:", tax && tax.value);
console.log(" Amount:", amount && amount.value);
}
console.log("Subtotal:", subTotal && subTotal.value);
console.log("Previous Unpaid Balance:", previousUnpaidBalance && previousUnpaidBalance.value);
console.log("Tax:", totalTax && totalTax.value);
console.log("Amount Due:", amountDue && amountDue.value);
} else {
throw new Error("Expected at least one invoice in the result.");
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
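Run your application
After you add the code sample to your application, run your program:
Navigate to the folder that contains your Document Intelligence application (doc-intel-app).
Type the following command in your terminal:
node index.js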
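In this quickstart, you use the following features to analyze and extract data from forms and documents:
An Azure subscription - create a trial subscription.
Python.
- Your Python installation should include pip. You can check whether pip is installed by running pip --version on the command line. Get pip by installing the latest version of Python.
The latest version of Visual Studio Code or your preferred IDE. For more information, see Getting Started with Python in Visual Studio Code.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
If you plan to access multiple Azure AI services through a single endpoint/key, create an Azure AI services resource. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select "Go to resource". You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart: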
Open a terminal window in your local environment and install the Azure Document Intelligence client library for Python with pip:
pip install azure-ai-formrecognizer==3.3.0
pip install azure-ai-formrecognizer==3.2.0b6
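After installing, it can be useful to confirm which version of the package pip actually placed in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and not part of the SDK.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version found in the current environment, or None when missing.
    print(installed_version("azure-ai-formrecognizer"))
```

If the printed value is None, the package was probably installed into a different Python environment than the one running the script.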
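To interact with the Document Intelligence service, you need to create an instance of the DocumentAnalysisClient class. To do so, create an AzureKeyCredential with your key from the Azure portal, then create a DocumentAnalysisClient instance with that AzureKeyCredential and your Document Intelligence endpoint.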
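Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.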
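One simple way to keep the key out of source code while experimenting is to read it from environment variables. The sketch below is only an illustration: the function name load_credentials and the variable names DOCUMENT_INTELLIGENCE_ENDPOINT and DOCUMENT_INTELLIGENCE_KEY are assumptions made for this example, not names required by the service or the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Read the endpoint and key from environment variables instead of source code."""
    # Both variable names are assumptions made for this sketch; choose your own.
    names = ("DOCUMENT_INTELLIGENCE_ENDPOINT", "DOCUMENT_INTELLIGENCE_KEY")
    values = [env.get(name) for name in names]
    missing = [name for name, value in zip(names, values) if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    endpoint, key = values
    return endpoint, key
```

The returned pair can then be assigned to the endpoint and key variables used in the samples in place of the hard-coded placeholders. For production, a secret store such as Azure Key Vault remains the recommended option.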
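Extract text, selection marks, text styles, table structures, and bounding region coordinates from documents.
- For this example, you need a document file from a URI. You can use our sample document for this quickstart.
- We added the file URL value to the formUrl variable in the analyze_layout function.
To analyze a given file at a URL, use the begin_analyze_document_from_url method and pass in prebuilt-layout as the model ID. The returned value is a result object that contains data about the submitted document.
Add the following code sample to your form_recognizer_quickstart.py application. Make sure you update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal: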
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-layout", formUrl
    )
    result = poller.result()

    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing layout from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            words = line.get_words()
            print(
                "...Line # {} has word count {} and text '{}' within bounding box '{}'".format(
                    line_idx,
                    len(words),
                    line.content,
                    format_polygon(line.polygon),
                )
            )

            for word in words:
                print(
                    "......Word '{}' has a confidence of {}".format(
                        word.content, word.confidence
                    )
                )

        for selection_mark in page.selection_marks:
            print(
                "...Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_polygon(selection_mark.polygon),
                    selection_mark.confidence,
                )
            )

    for table_idx, table in enumerate(result.tables):
        print(
            "Table # {} has {} rows and {} columns".format(
                table_idx, table.row_count, table.column_count
            )
        )
        for region in table.bounding_regions:
            print(
                "Table # {} location on page: {} is {}".format(
                    table_idx,
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )
        for cell in table.cells:
            print(
                "...Cell[{}][{}] has content '{}'".format(
                    cell.row_index,
                    cell.column_index,
                    cell.content,
                )
            )
            for region in cell.bounding_regions:
                print(
                    "...content on page {} is within bounding box '{}'".format(
                        region.page_number,
                        format_polygon(region.polygon),
                    )
                )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()
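Run the application
After you add the code sample to your application, build and run your program:
Navigate to the folder that contains your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py
Here's a snippet of the expected output: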
----Analyzing layout from page #1----
Page has width: 8.5 and height: 11.0, measured with unit: inch
...Line # 0 has word count 2 and text 'UNITED STATES' within bounding box '[3.4915, 0.6828], [5.0116, 0.6828], [5.0116, 0.8265], [3.4915, 0.8265]'
......Word 'UNITED' has a confidence of 1.0
......Word 'STATES' has a confidence of 1.0
...Line # 1 has word count 4 and text 'SECURITIES AND EXCHANGE COMMISSION' within bounding box '[2.1937, 0.9061], [6.297, 0.9061], [6.297, 1.0498], [2.1937, 1.0498]'
......Word 'SECURITIES' has a confidence of 1.0
......Word 'AND' has a confidence of 1.0
......Word 'EXCHANGE' has a confidence of 1.0
......Word 'COMMISSION' has a confidence of 1.0
...Line # 2 has word count 3 and text 'Washington, D.C. 20549' within bounding box '[3.4629, 1.1179], [5.031, 1.1179], [5.031, 1.2483], [3.4629, 1.2483]'
......Word 'Washington,' has a confidence of 1.0
......Word 'D.C.' has a confidence of 1.0
To view the entire output, visit the Azure samples repository on GitHub and see the layout model output.
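The layout sample prints table cells one at a time. When you want to work with a table as rows and columns, the cell indexes can be used to rebuild a grid. The following is a minimal sketch, not part of the SDK: table_to_grid is a hypothetical helper, and sample_table is a stand-in object that only mimics the attributes the sample reads (row_count, column_count, and cells with row_index, column_index, and content).

```python
from types import SimpleNamespace


def table_to_grid(table):
    """Arrange a table's cells into a list of rows, using each cell's indexes."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in object with made-up values, for illustration only; it is not an SDK type.
sample_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Price"),
        SimpleNamespace(row_index=1, column_index=0, content="Widget"),
        SimpleNamespace(row_index=1, column_index=1, content="4.00"),
    ],
)

for row in table_to_grid(sample_table):
    print(" | ".join(row))
```

In the quickstart, the same helper could be called on each entry of result.tables. Positions that no cell reports stay as empty strings in this sketch.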
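Add the following code sample to your form_recognizer_quickstart.py application. Make sure you update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal: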
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


# helper used by analyze_layout to print polygon points as "[x, y]" pairs
def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-layout", formUrl
    )
    result = poller.result()

    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing layout from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            words = line.get_words()
            print(
                "...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
                    line_idx,
                    len(words),
                    line.content,
                    format_polygon(line.polygon),
                )
            )

            for word in words:
                print(
                    "......Word '{}' has a confidence of {}".format(
                        word.content, word.confidence
                    )
                )

        for selection_mark in page.selection_marks:
            print(
                "...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_polygon(selection_mark.polygon),
                    selection_mark.confidence,
                )
            )

    for table_idx, table in enumerate(result.tables):
        print(
            "Table # {} has {} rows and {} columns".format(
                table_idx, table.row_count, table.column_count
            )
        )
        for region in table.bounding_regions:
            print(
                "Table # {} location on page: {} is {}".format(
                    table_idx,
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )
        for cell in table.cells:
            print(
                "...Cell[{}][{}] has content '{}'".format(
                    cell.row_index,
                    cell.column_index,
                    cell.content,
                )
            )
            for region in cell.bounding_regions:
                print(
                    "...content on page {} is within bounding polygon '{}'".format(
                        region.page_number,
                        format_polygon(region.polygon),
                    )
                )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()
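Run the application
After you add the code sample to your application, build and run your program:
Navigate to the folder that contains your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py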
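Use a prebuilt model to analyze and extract common fields from specific document types. In this example, we analyze an invoice using the prebuilt invoice model.
Tip
You aren't limited to invoices: there are several prebuilt models to choose from, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. See model data extraction.
To analyze a given file at a URI, use the begin_analyze_document_from_url method and pass in prebuilt-invoice as the model ID. The returned value is a result object that contains data about the submitted document.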
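Each extracted field carries a value and a confidence score, and the quickstart sample prints them one field at a time. As a side note, the same lookups can be folded into a small helper that keeps only fields above a chosen confidence. This is a sketch under stated assumptions: confident_fields is a hypothetical helper, the 0.8 threshold is arbitrary, and sample_fields is a stand-in dictionary with made-up values that mimics only the value and confidence attributes.

```python
from types import SimpleNamespace


def confident_fields(fields, names, min_confidence=0.8):
    """Return {name: value} for requested fields that exist and meet the threshold."""
    selected = {}
    for name in names:
        field = fields.get(name)
        # Skip fields the model did not return or did not score.
        if field is None or field.confidence is None:
            continue
        if field.confidence >= min_confidence:
            selected[name] = field.value
    return selected


# Stand-in data with made-up values, for illustration only; these are not SDK objects.
sample_fields = {
    "VendorName": SimpleNamespace(value="CONTOSO LTD.", confidence=0.92),
    "CustomerName": SimpleNamespace(value="MICROSOFT CORPORATION", confidence=0.84),
    "InvoiceId": SimpleNamespace(value="INV-100", confidence=0.5),
}

print(confident_fields(sample_fields, ["VendorName", "CustomerName", "InvoiceId", "DueDate"]))
```

With a real result, the first argument would be invoice.fields for each entry in the returned documents. Fields below the threshold are dropped silently here; an application might prefer to route them to manual review instead.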
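Add the following code sample to your form_recognizer_quickstart.py application. Make sure you update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal: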
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def format_bounding_region(bounding_regions):
    if not bounding_regions:
        return "N/A"
    return ", ".join(
        "Page #{}: {}".format(region.page_number, format_polygon(region.polygon))
        for region in bounding_regions
    )


def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_invoice():
    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-invoice", invoiceUrl
    )
    invoices = poller.result()

    for idx, invoice in enumerate(invoices.documents):
        print("--------Recognizing invoice #{}--------".format(idx + 1))
        vendor_name = invoice.fields.get("VendorName")
        if vendor_name:
            print(
                "Vendor Name: {} has confidence: {}".format(
                    vendor_name.value, vendor_name.confidence
                )
            )
        vendor_address = invoice.fields.get("VendorAddress")
        if vendor_address:
            print(
                "Vendor Address: {} has confidence: {}".format(
                    vendor_address.value, vendor_address.confidence
                )
            )
        vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
        if vendor_address_recipient:
            print(
                "Vendor Address Recipient: {} has confidence: {}".format(
                    vendor_address_recipient.value, vendor_address_recipient.confidence
                )
            )
        customer_name = invoice.fields.get("CustomerName")
        if customer_name:
            print(
                "Customer Name: {} has confidence: {}".format(
                    customer_name.value, customer_name.confidence
                )
            )
        customer_id = invoice.fields.get("CustomerId")
        if customer_id:
            print(
                "Customer Id: {} has confidence: {}".format(
                    customer_id.value, customer_id.confidence
                )
            )
        customer_address = invoice.fields.get("CustomerAddress")
        if customer_address:
            print(
                "Customer Address: {} has confidence: {}".format(
                    customer_address.value, customer_address.confidence
                )
            )
        customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
        if customer_address_recipient:
            print(
                "Customer Address Recipient: {} has confidence: {}".format(
                    customer_address_recipient.value,
                    customer_address_recipient.confidence,
                )
            )
        invoice_id = invoice.fields.get("InvoiceId")
        if invoice_id:
            print(
                "Invoice Id: {} has confidence: {}".format(
                    invoice_id.value, invoice_id.confidence
                )
            )
        invoice_date = invoice.fields.get("InvoiceDate")
        if invoice_date:
            print(
                "Invoice Date: {} has confidence: {}".format(
                    invoice_date.value, invoice_date.confidence
                )
            )
        invoice_total = invoice.fields.get("InvoiceTotal")
        if invoice_total:
            print(
                "Invoice Total: {} has confidence: {}".format(
                    invoice_total.value, invoice_total.confidence
                )
            )
        due_date = invoice.fields.get("DueDate")
        if due_date:
            print(
                "Due Date: {} has confidence: {}".format(
                    due_date.value, due_date.confidence
                )
            )
        purchase_order = invoice.fields.get("PurchaseOrder")
        if purchase_order:
            print(
                "Purchase Order: {} has confidence: {}".format(
                    purchase_order.value, purchase_order.confidence
                )
            )
        billing_address = invoice.fields.get("BillingAddress")
        if billing_address:
            print(
                "Billing Address: {} has confidence: {}".format(
                    billing_address.value, billing_address.confidence
                )
            )
        billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
        if billing_address_recipient:
            print(
                "Billing Address Recipient: {} has confidence: {}".format(
                    billing_address_recipient.value,
                    billing_address_recipient.confidence,
                )
            )
        shipping_address = invoice.fields.get("ShippingAddress")
        if shipping_address:
            print(
                "Shipping Address: {} has confidence: {}".format(
                    shipping_address.value, shipping_address.confidence
                )
            )
        shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
        if shipping_address_recipient:
            print(
                "Shipping Address Recipient: {} has confidence: {}".format(
                    shipping_address_recipient.value,
                    shipping_address_recipient.confidence,
                )
            )
        print("Invoice items:")
        # guard against invoices where no line items were extracted
        items = invoice.fields.get("Items")
        for item_idx, item in enumerate(items.value if items else []):
            print("...Item #{}".format(item_idx + 1))
            item_description = item.value.get("Description")
            if item_description:
                print(
                    "......Description: {} has confidence: {}".format(
                        item_description.value, item_description.confidence
                    )
                )
            item_quantity = item.value.get("Quantity")
            if item_quantity:
                print(
                    "......Quantity: {} has confidence: {}".format(
                        item_quantity.value, item_quantity.confidence
                    )
                )
            unit = item.value.get("Unit")
            if unit:
                print(
                    "......Unit: {} has confidence: {}".format(
                        unit.value, unit.confidence
                    )
                )
            unit_price = item.value.get("UnitPrice")
            if unit_price:
                print(
                    "......Unit Price: {} has confidence: {}".format(
                        unit_price.value, unit_price.confidence
                    )
                )
            product_code = item.value.get("ProductCode")
            if product_code:
                print(
                    "......Product Code: {} has confidence: {}".format(
                        product_code.value, product_code.confidence
                    )
                )
            item_date = item.value.get("Date")
            if item_date:
                print(
                    "......Date: {} has confidence: {}".format(
                        item_date.value, item_date.confidence
                    )
                )
            tax = item.value.get("Tax")
            if tax:
                print(
                    "......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
                )
            amount = item.value.get("Amount")
            if amount:
                print(
                    "......Amount: {} has confidence: {}".format(
                        amount.value, amount.confidence
                    )
                )
        subtotal = invoice.fields.get("SubTotal")
        if subtotal:
            print(
                "Subtotal: {} has confidence: {}".format(
                    subtotal.value, subtotal.confidence
                )
            )
        total_tax = invoice.fields.get("TotalTax")
        if total_tax:
            print(
                "Total Tax: {} has confidence: {}".format(
                    total_tax.value, total_tax.confidence
                )
            )
        previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
        if previous_unpaid_balance:
            print(
                "Previous Unpaid Balance: {} has confidence: {}".format(
                    previous_unpaid_balance.value, previous_unpaid_balance.confidence
                )
            )
        amount_due = invoice.fields.get("AmountDue")
        if amount_due:
            print(
                "Amount Due: {} has confidence: {}".format(
                    amount_due.value, amount_due.confidence
                )
            )
        service_start_date = invoice.fields.get("ServiceStartDate")
        if service_start_date:
            print(
                "Service Start Date: {} has confidence: {}".format(
                    service_start_date.value, service_start_date.confidence
                )
            )
        service_end_date = invoice.fields.get("ServiceEndDate")
        if service_end_date:
            print(
                "Service End Date: {} has confidence: {}".format(
                    service_end_date.value, service_end_date.confidence
                )
            )
        service_address = invoice.fields.get("ServiceAddress")
        if service_address:
            print(
                "Service Address: {} has confidence: {}".format(
                    service_address.value, service_address.confidence
                )
            )
        service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
        if service_address_recipient:
            print(
                "Service Address Recipient: {} has confidence: {}".format(
                    service_address_recipient.value,
                    service_address_recipient.confidence,
                )
            )
        remittance_address = invoice.fields.get("RemittanceAddress")
        if remittance_address:
            print(
                "Remittance Address: {} has confidence: {}".format(
                    remittance_address.value, remittance_address.confidence
                )
            )
        remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
        if remittance_address_recipient:
            print(
                "Remittance Address Recipient: {} has confidence: {}".format(
                    remittance_address_recipient.value,
                    remittance_address_recipient.confidence,
                )
            )
        print("----------------------------------------")

if __name__ == "__main__":
analyze_invoice()
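Run the application
After you add a code sample to your application, build and run your program:
Navigate to the folder that contains your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py
Here's a snippet of the expected output: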
--------Recognizing invoice #1--------
Vendor Name: CONTOSO LTD. has confidence: 0.919
Vendor Address: 123 456th St New York, NY, 10001 has confidence: 0.907
Vendor Address Recipient: Contoso Headquarters has confidence: 0.919
Customer Name: MICROSOFT CORPORATION has confidence: 0.84
Customer Id: CID-12345 has confidence: 0.956
Customer Address: 123 Other St, Redmond WA, 98052 has confidence: 0.909
Customer Address Recipient: Microsoft Corp has confidence: 0.917
Invoice Id: INV-100 has confidence: 0.972
Invoice Date: 2019-11-15 has confidence: 0.971
Invoice Total: CurrencyValue(amount=110.0, symbol=$) has confidence: 0.97
Due Date: 2019-12-15 has confidence: 0.973
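To view the entire output, visit the Azure samples repository on GitHub and see the prebuilt invoice model output.
Add the following code sample to your form_recognizer_quickstart.py application. Make sure you update the key and endpoint variables with values from your Form Recognizer instance in the Azure portal: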
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-layout", formUrl
    )
    result = poller.result()
    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )
    for page in result.pages:
        print("----Analyzing layout from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )
        for line_idx, line in enumerate(page.lines):
            words = line.get_words()
            print(
                "...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
                    line_idx,
                    len(words),
                    line.content,
                    format_polygon(line.polygon),
                )
            )
            for word in words:
                print(
                    "......Word '{}' has a confidence of {}".format(
                        word.content, word.confidence
                    )
                )
        for selection_mark in page.selection_marks:
            print(
                "...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_polygon(selection_mark.polygon),
                    selection_mark.confidence,
                )
            )
    for table_idx, table in enumerate(result.tables):
        print(
            "Table # {} has {} rows and {} columns".format(
                table_idx, table.row_count, table.column_count
            )
        )
        for region in table.bounding_regions:
            print(
                "Table # {} location on page: {} is {}".format(
                    table_idx,
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )
        for cell in table.cells:
            print(
                "...Cell[{}][{}] has content '{}'".format(
                    cell.row_index,
                    cell.column_index,
                    cell.content,
                )
            )
            for region in cell.bounding_regions:
                print(
                    "...content on page {} is within bounding polygon '{}'".format(
                        region.page_number,
                        format_polygon(region.polygon),
                    )
                )
    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()
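Run the application
After you add a code sample to your application, build and run your program:
Navigate to the folder that contains your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py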
| Document Intelligence REST API | Supported Azure SDKs |
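In this quickstart, learn how to use the Document Intelligence REST API to analyze and extract data and values from documents:
Azure subscription - Create a trial subscription.
The curl command-line tool installed.
PowerShell version 7.* (or a similar command-line application):
To check your PowerShell version, type the following command for your operating system:
- Windows:
Get-Host | Select-Object Version
- macOS or Linux:
$PSVersionTable
A Document Intelligence (single-service) or Azure AI services (multi-service) resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart: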
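A POST request is used to analyze documents with a prebuilt or custom model. A GET request is used to retrieve the result of a document analysis call. The modelId is used with the POST operation and the resultId is used with the GET operation.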
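As a rough illustration of how the two requests relate, the following Python sketch builds both URLs from their parts. The route shapes are copied from the cURL commands in this quickstart; the helper names and the example endpoint are hypothetical and not part of any SDK.

```python
def build_analyze_url(endpoint, model_id, api_version):
    """Return the URL for the POST request that starts an analysis."""
    base = endpoint.rstrip("/")  # tolerate an endpoint with a trailing slash
    return (
        f"{base}/formrecognizer/documentModels/{model_id}:analyze"
        f"?api-version={api_version}"
    )


def build_result_url(endpoint, model_id, result_id, api_version):
    """Return the URL for the GET request that retrieves the analysis result."""
    base = endpoint.rstrip("/")
    return (
        f"{base}/formrecognizer/documentModels/{model_id}"
        f"/analyzeResults/{result_id}?api-version={api_version}"
    )


# Hypothetical endpoint and result ID, used only to show the URL shapes.
example_endpoint = "https://example.cognitiveservices.azure.com/"
print(build_analyze_url(example_endpoint, "prebuilt-invoice", "2023-07-31"))
print(build_result_url(example_endpoint, "prebuilt-invoice", "my-result-id", "2023-07-31"))
```

The model ID appears in both URLs, while the result ID is only known after the POST request returns.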
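Before you run the cURL command, make the following changes to the POST request:
- Replace {endpoint} with the endpoint value from your Document Intelligence instance in the Azure portal.
- Replace {key} with the key value from your Document Intelligence instance in the Azure portal.
- Using the following table as a reference, replace {modelID} and {your-document-url} with your desired values.
- You need a document file at a URL. For this quickstart, you can use the sample forms provided in the following table for each feature: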
Sample documents
Feature | {modelID} | {your-document-url} |
---|---|---|
General Document | prebuilt-document | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf |
Read | prebuilt-read | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png |
Layout | prebuilt-layout | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png |
Health insurance card | prebuilt-healthInsuranceCard.us | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/insurance-card.png |
W-2 | prebuilt-tax.us.w2 | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/w2.png |
Invoice | prebuilt-invoice | https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf |
Receipt | prebuilt-receipt | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/receipt.png |
ID document | prebuilt-idDocument | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/identity_documents.png |
Business card | prebuilt-businessCard | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/de5e0d8982ab754823c54de47a47e8e499351523/curl/form-recognizer/rest-api/business_card.jpg |
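Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.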
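One simple way to keep the key out of source files while you experiment is to read it from environment variables. The following Python sketch assumes two variable names, DOCUMENTINTELLIGENCE_ENDPOINT and DOCUMENTINTELLIGENCE_KEY, which are hypothetical and chosen only for illustration. It isn't a substitute for a managed secret store such as Azure Key Vault.

```python
import os

# Hypothetical variable names; use whatever convention fits your environment.
ENDPOINT_VAR = "DOCUMENTINTELLIGENCE_ENDPOINT"
KEY_VAR = "DOCUMENTINTELLIGENCE_KEY"


def load_credentials(environ=None):
    """Return (endpoint, key) read from the environment, or raise if either is unset."""
    env = os.environ if environ is None else environ
    missing = [name for name in (ENDPOINT_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[ENDPOINT_VAR], env[KEY_VAR]
```

A script would call load_credentials() once at startup and pass the two values to its requests, so the key never appears in the file you share or commit.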
API version 2023-07-31:
curl -v -i -X POST "{endpoint}/formrecognizer/documentModels/{modelID}:analyze?api-version=2023-07-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
API version 2022-08-31:
curl -v -i -X POST "{endpoint}/formrecognizer/documentModels/{modelID}:analyze?api-version=2022-08-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
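You receive a 202 (Success) response that includes a read-only Operation-Location header. The value of this header contains a resultID that you can query to get the status of the asynchronous operation and retrieve the results, using a GET request with the same resource subscription key: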
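If you script this step, the result ID can be pulled out of the Operation-Location header value. The following Python sketch assumes the header value has the same shape as the GET route used in this quickstart (a path ending in /analyzeResults/{resultID}). The function name and the sample header value are hypothetical.

```python
from urllib.parse import urlparse


def extract_result_id(operation_location):
    """Return the last path segment of an Operation-Location URL.

    Raises ValueError when the path does not end in analyzeResults/<id>.
    """
    path = urlparse(operation_location).path  # the query string is dropped here
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2 or segments[-2] != "analyzeResults":
        raise ValueError("unexpected Operation-Location value: " + operation_location)
    return segments[-1]


# Made-up header value, used only to demonstrate the parsing.
sample_header = (
    "https://example.cognitiveservices.azure.com/formrecognizer/documentModels/"
    "prebuilt-invoice/analyzeResults/11111111-2222-3333-4444-555555555555"
    "?api-version=2023-07-31"
)
print(extract_result_id(sample_header))
```

Because the header already holds a complete URL, a script can also send the GET request to that URL directly instead of rebuilding it.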
After you call the Analyze document API, call the Get analyze result API to get the status of the operation and the extracted data. Before you run the command, make these changes:
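- Replace {resultID} with the value from the Operation-Location header of the POST response.
- Replace {key} with the key value from your Document Intelligence instance in the Azure portal.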
API version 2023-07-31:
curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelID}/analyzeResults/{resultID}?api-version=2023-07-31" -H "Ocp-Apim-Subscription-Key: {key}"
API version 2022-08-31:
curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelID}/analyzeResults/{resultID}?api-version=2022-08-31" -H "Ocp-Apim-Subscription-Key: {key}"
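You receive a 200 (Success) response with JSON output. The first field, "status", indicates the status of the operation. If the operation isn't complete, the value of "status" is "running" or "notStarted", and you should call the API again, either manually or through a script. We recommend an interval of one second or more between calls.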
{
"status": "succeeded",
"createdDateTime": "2023-08-25T19:31:37Z",
"lastUpdatedDateTime": "2023-08-25T19:31:43Z",
"analyzeResult": {
"apiVersion": "2023-07-31",
"modelId": "prebuilt-invoice",
"stringIndexType": "textElements"...
..."pages": [
{
"pageNumber": 1,
"angle": 0,
"width": 8.5,
"height": 11,
"unit": "inch",
"words": [
{
"content": "CONTOSO",
"boundingBox": [
0.5911,
0.6857,
1.7451,
0.6857,
1.7451,
0.8664,
0.5911,
0.8664
],
"confidence": 1,
"span": {
"offset": 0,
"length": 7
}
}],
}]
}
}
{
"status": "succeeded",
"createdDateTime": "2022-09-25T19:31:37Z",
"lastUpdatedDateTime": "2022-09-25T19:31:43Z",
"analyzeResult": {
"apiVersion": "2022-08-31",
"modelId": "prebuilt-invoice",
"stringIndexType": "textElements"...
..."pages": [
{
"pageNumber": 1,
"angle": 0,
"width": 8.5,
"height": 11,
"unit": "inch",
"words": [
{
"content": "CONTOSO",
"boundingBox": [
0.5911,
0.6857,
1.7451,
0.6857,
1.7451,
0.8664,
0.5911,
0.8664
],
"confidence": 1,
"span": {
"offset": 0,
"length": 7
}
}],
}]
}
}
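The retry guidance given before the sample responses (call again while "status" is "running" or "notStarted", waiting at least one second between calls) could be scripted along the following lines. This is a minimal Python sketch: fetch_status is a hypothetical caller-supplied function that performs the GET request and returns the parsed JSON body, and the attempt limit is an assumption added so the loop can't run forever.

```python
import time

# Status values that mean the analysis hasn't finished yet.
PENDING_STATUSES = {"running", "notStarted"}


def poll_until_done(fetch_status, interval_seconds=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until the reported status is no longer pending."""
    for _ in range(max_attempts):
        body = fetch_status()
        if body.get("status") not in PENDING_STATUSES:
            return body  # finished: the caller inspects the final status
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"analysis still pending after {max_attempts} attempts")


# Demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {}},
])
final = poll_until_done(lambda: next(canned), sleep=lambda seconds: None)
print(final["status"])  # prints "succeeded"
```

The loop returns on any status other than the two pending values, so the caller should still check whether the final status is "succeeded" before reading "analyzeResult".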
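Prebuilt models extract predefined sets of document fields. See Model data extraction for the extracted field names, types, descriptions, and examples.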
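To make the shape of the JSON response more concrete, the following Python sketch walks a response shaped like the sample output in this section and lists each word with its confidence and corner points. It assumes the keys shown in that sample ("analyzeResult", "pages", "words", "boundingBox"); key names can differ between API versions, and the helper names are hypothetical.

```python
def bounding_box_points(flat):
    """Pair a flat [x1, y1, x2, y2, ...] list into (x, y) tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("bounding box needs an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def iter_words(response):
    """Yield (page_number, content, confidence, points) for every word."""
    for page in response.get("analyzeResult", {}).get("pages", []):
        for word in page.get("words", []):
            yield (
                page.get("pageNumber"),
                word.get("content"),
                word.get("confidence"),
                bounding_box_points(word.get("boundingBox", [])),
            )


# Trimmed copy of the sample response from this section.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "pages": [
            {
                "pageNumber": 1,
                "unit": "inch",
                "words": [
                    {
                        "content": "CONTOSO",
                        "boundingBox": [0.5911, 0.6857, 1.7451, 0.6857,
                                        1.7451, 0.8664, 0.5911, 0.8664],
                        "confidence": 1,
                    }
                ],
            }
        ],
    },
}

for page_number, content, confidence, points in iter_words(sample):
    print(page_number, content, confidence, points)
```

Using .get with defaults keeps the walk from failing on responses that omit a page's word list.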
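This content applies to: v2.1
Get started with Azure AI Document Intelligence using the programming language of your choice or the REST API. Document Intelligence is a cloud-based Azure AI service that uses machine learning to extract key-value pairs, text, and tables from your documents. We recommend that you use the free service while you're learning the technology. Remember that the number of free pages is limited to 500 per month.
To learn more about Document Intelligence features and development options, visit our Overview page.
In this quickstart, you use the following APIs to extract structured data from forms and documents:
Azure subscription - Create a trial subscription.
The current version of the Visual Studio IDE.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart: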
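Start Visual Studio 2019.
On the start page, choose Create a new project.
On the Create a new project page, enter console in the search box. Choose the Console Application template, then choose Next.
In the Configure your new project dialog window, enter formRecognizer_quickstart in the Project name box. Then choose Next.
In the Additional information dialog window, select .NET 5.0 (Current), and then select Create.
Right-click your formRecognizer_quickstart project and select Manage NuGet Packages... .
Select the Browse tab and type Azure.AI.FormRecognizer.
Select version 3.1.1 from the dropdown menu and select Install.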
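To interact with the Document Intelligence service, you need to create an instance of the FormRecognizerClient class. To do so, you create an AzureKeyCredential with your key and a FormRecognizerClient instance with the AzureKeyCredential and your Document Intelligence endpoint.
Note
- Starting with .NET 6, new projects that use the console template generate a new program style that differs from previous versions.
- The new output uses recent C# features that simplify the code you need to write.
- When you use the newer version, you only need to write the body of the Main method. You don't need to include top-level statements, global using directives, or implicit using directives.
- For more information, see New C# templates generate top-level statements.
Open the Program.cs file.
Include the following using directives: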
using Azure;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;
using System.Threading.Tasks;
- Set your endpoint and key variables and create your AzureKeyCredential and FormRecognizerClient instances:
private static readonly string endpoint = "your-form-recognizer-endpoint";
private static readonly string key = "your-api-key";
private static readonly AzureKeyCredential credential = new AzureKeyCredential(key);
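Delete the line Console.WriteLine("Hello World!"); and add one of the Try it code samples to the Program.cs file. Select a code sample to copy and paste into your application's Main method:
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see the Azure AI services security article.
Extract text, selection marks, text styles, and table structures, along with their bounding region coordinates, from documents.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- We added the file URI value to the formUrl variable.
- To extract the layout from a given file at a URI, use the StartRecognizeContentFromUri method.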
FormRecognizerClient recognizerClient = AuthenticateClient();
Task recognizeContent = RecognizeContent(recognizerClient);
Task.WaitAll(recognizeContent);
private static FormRecognizerClient AuthenticateClient()
{
var credential = new AzureKeyCredential(key);
var client = new FormRecognizerClient(new Uri(endpoint), credential);
return client;
}
private static async Task RecognizeContent(FormRecognizerClient recognizerClient)
{
string formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
FormPageCollection formPages = await recognizerClient
.StartRecognizeContentFromUri(new Uri(formUrl))
.WaitForCompletionAsync();
foreach (FormPage page in formPages)
{
Console.WriteLine($"Form Page {page.PageNumber} has {page.Lines.Count} lines.");
for (int i = 0; i < page.Lines.Count; i++)
{
FormLine line = page.Lines[i];
Console.WriteLine($" Line {i} has {line.Words.Count} word{(line.Words.Count > 1 ? "s" : "")}, and text: '{line.Text}'.");
}
for (int i = 0; i < page.Tables.Count; i++)
{
FormTable table = page.Tables[i];
Console.WriteLine($"Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (FormTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) contains text: '{cell.Text}'.");
}
}
}
}
}
}
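This sample demonstrates how to analyze data from certain types of common documents with pretrained models, using an invoice as an example.
Invoices aren't the only option: several prebuilt models are available, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. Here are the prebuilt models currently supported by the Document Intelligence service: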
FormRecognizerClient recognizerClient = AuthenticateClient();
Task analyzeinvoice = AnalyzeInvoice(recognizerClient, invoiceUrl);
Task.WaitAll(analyzeinvoice);
private static FormRecognizerClient AuthenticateClient() {
var credential = new AzureKeyCredential(key);
var client = new FormRecognizerClient(new Uri(endpoint), credential);
return client;
}
static string invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
private static async Task AnalyzeInvoice(FormRecognizerClient recognizerClient, string invoiceUrl) {
var options = new RecognizeInvoicesOptions() {
Locale = "en-US"
};
RecognizedFormCollection invoices = await recognizerClient.StartRecognizeInvoicesFromUriAsync(new Uri(invoiceUrl), options).WaitForCompletionAsync();
RecognizedForm invoice = invoices[0];
FormField invoiceIdField;
if (invoice.Fields.TryGetValue("InvoiceId", out invoiceIdField)) {
if (invoiceIdField.Value.ValueType == FieldValueType.String) {
string invoiceId = invoiceIdField.Value.AsString();
Console.WriteLine($" Invoice Id: '{invoiceId}', with confidence {invoiceIdField.Confidence}");
}
}
FormField invoiceDateField;
if (invoice.Fields.TryGetValue("InvoiceDate", out invoiceDateField)) {
if (invoiceDateField.Value.ValueType == FieldValueType.Date) {
DateTime invoiceDate = invoiceDateField.Value.AsDate();
Console.WriteLine($" Invoice Date: '{invoiceDate}', with confidence {invoiceDateField.Confidence}");
}
}
FormField dueDateField;
if (invoice.Fields.TryGetValue("DueDate", out dueDateField)) {
if (dueDateField.Value.ValueType == FieldValueType.Date) {
DateTime dueDate = dueDateField.Value.AsDate();
Console.WriteLine($" Due Date: '{dueDate}', with confidence {dueDateField.Confidence}");
}
}
FormField vendorNameField;
if (invoice.Fields.TryGetValue("VendorName", out vendorNameField)) {
if (vendorNameField.Value.ValueType == FieldValueType.String) {
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($" Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
FormField vendorAddressField;
if (invoice.Fields.TryGetValue("VendorAddress", out vendorAddressField)) {
if (vendorAddressField.Value.ValueType == FieldValueType.String) {
string vendorAddress = vendorAddressField.Value.AsString();
Console.WriteLine($" Vendor Address: '{vendorAddress}', with confidence {vendorAddressField.Confidence}");
}
}
FormField customerNameField;
if (invoice.Fields.TryGetValue("CustomerName", out customerNameField)) {
if (customerNameField.Value.ValueType == FieldValueType.String) {
string customerName = customerNameField.Value.AsString();
Console.WriteLine($" Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
FormField customerAddressField;
if (invoice.Fields.TryGetValue("CustomerAddress", out customerAddressField)) {
if (customerAddressField.Value.ValueType == FieldValueType.String) {
string customerAddress = customerAddressField.Value.AsString();
Console.WriteLine($" Customer Address: '{customerAddress}', with confidence {customerAddressField.Confidence}");
}
}
FormField customerAddressRecipientField;
if (invoice.Fields.TryGetValue("CustomerAddressRecipient", out customerAddressRecipientField)) {
if (customerAddressRecipientField.Value.ValueType == FieldValueType.String) {
string customerAddressRecipient = customerAddressRecipientField.Value.AsString();
Console.WriteLine($" Customer address recipient: '{customerAddressRecipient}', with confidence {customerAddressRecipientField.Confidence}");
}
}
FormField invoiceTotalField;
if (invoice.Fields.TryGetValue("InvoiceTotal", out invoiceTotalField)) {
if (invoiceTotalField.Value.ValueType == FieldValueType.Float) {
float invoiceTotal = invoiceTotalField.Value.AsFloat();
Console.WriteLine($" Invoice Total: '{invoiceTotal}', with confidence {invoiceTotalField.Confidence}");
}
}
}
}
}
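Choose the green Start button next to formRecognizer_quickstart to build and run your program, or press F5.
In this quickstart, you use the following APIs to extract structured data from forms and documents:
Azure subscription - Create a trial subscription.
A Java Development Kit (JDK), version 8 or later. For more information, see supported Java versions and update schedule.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart:
In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app called form-recognizer-app, and navigate to it.
mkdir form-recognizer-app && cd form-recognizer-app
Run the gradle init command from your working directory. This command creates essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.
gradle init --type basic
When prompted to choose a DSL, select Kotlin.
Accept the default project name (form-recognizer-app).
This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.
In your project's build.gradle.kts file, include the client library as an implementation statement, along with the required plugins and settings.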
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "3.1.1")
}
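From your working directory, run the following command:
mkdir -p src/main/java
This command creates the src/main/java directory structure.
Navigate to the java directory and create a file named FormRecognizer.java. Open it in your preferred editor or IDE and add the following package declaration and import statements: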
import com.azure.ai.formrecognizer.*;
import com.azure.ai.formrecognizer.models.*;
import java.util.concurrent.atomic.AtomicReference;
import java.util.List;
import java.util.Map;
import java.time.LocalDate;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.rest.PagedIterable;
import com.azure.core.util.Context;
import com.azure.core.util.polling.SyncPoller;
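Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.
Extract text, selection marks, text styles, and table structures, along with their bounding region coordinates, from documents.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- To analyze a given file at a URI, use the beginRecognizeContentFromUrl method.
- We added the file URI value to the formUrl variable in the main method.
Update your application's FormRecognizer class with the following code (be sure to update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal):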
static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
public static void main(String[] args) {
FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder()
.credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();
String formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
System.out.println("Get form content...");
GetContent(recognizerClient, formUrl);
}
private static void GetContent(FormRecognizerClient recognizerClient, String invoiceUri) {
String analyzeFilePath = invoiceUri;
SyncPoller<FormRecognizerOperationResult, List<FormPage>> recognizeContentPoller = recognizerClient
.beginRecognizeContentFromUrl(analyzeFilePath);
List<FormPage> contentResult = recognizeContentPoller.getFinalResult();
contentResult.forEach(formPage -> {
// Table information
System.out.println("----Recognizing content ----");
System.out.printf("Has width: %f and height: %f, measured with unit: %s.%n", formPage.getWidth(),
formPage.getHeight(), formPage.getUnit());
formPage.getTables().forEach(formTable -> {
System.out.printf("Table has %d rows and %d columns.%n", formTable.getRowCount(),
formTable.getColumnCount());
formTable.getCells().forEach(formTableCell -> {
System.out.printf("Cell has text %s.%n", formTableCell.getText());
});
System.out.println();
});
});
}
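This sample demonstrates how to analyze data from certain types of common documents with pretrained models, using an invoice as an example.
Invoices aren't the only option: several prebuilt models are available, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. Here are the prebuilt models currently supported by the Document Intelligence service:
Update your application's FormRecognizer class with the following code (be sure to update the key and endpoint variables with values from your Document Intelligence instance in the Azure portal):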
static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
public static void main(String[] args) {
FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder().credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
System.out.println("Analyze invoice...");
AnalyzeInvoice(recognizerClient, invoiceUrl);
}
private static void AnalyzeInvoice(FormRecognizerClient recognizerClient, String invoiceUrl) {
SyncPoller<FormRecognizerOperationResult, List<RecognizedForm>> recognizeInvoicesPoller = recognizerClient.beginRecognizeInvoicesFromUrl(invoiceUrl);
List<RecognizedForm> recognizedInvoices = recognizeInvoicesPoller.getFinalResult();
for (int i = 0; i < recognizedInvoices.size(); i++) {
RecognizedForm recognizedInvoice = recognizedInvoices.get(i);
Map<String, FormField> recognizedFields = recognizedInvoice.getFields();
System.out.printf("----------- Recognized invoice info for page %d -----------%n", i);
FormField vendorNameField = recognizedFields.get("VendorName");
if (vendorNameField != null) {
if (FieldValueType.STRING == vendorNameField.getValue().getValueType()) {
String merchantName = vendorNameField.getValue().asString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n", merchantName, vendorNameField.getConfidence());
}
}
FormField vendorAddressField = recognizedFields.get("VendorAddress");
if (vendorAddressField != null) {
if (FieldValueType.STRING == vendorAddressField.getValue().getValueType()) {
String merchantAddress = vendorAddressField.getValue().asString();
System.out.printf("Vendor address: %s, confidence: %.2f%n", merchantAddress, vendorAddressField.getConfidence());
}
}
FormField customerNameField = recognizedFields.get("CustomerName");
if (customerNameField != null) {
if (FieldValueType.STRING == customerNameField.getValue().getValueType()) {
String merchantAddress = customerNameField.getValue().asString();
System.out.printf("Customer Name: %s, confidence: %.2f%n", merchantAddress, customerNameField.getConfidence());
}
}
FormField customerAddressRecipientField = recognizedFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (FieldValueType.STRING == customerAddressRecipientField.getValue().getValueType()) {
String customerAddr = customerAddressRecipientField.getValue().asString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n", customerAddr, customerAddressRecipientField.getConfidence());
}
}
FormField invoiceIdField = recognizedFields.get("InvoiceId");
if (invoiceIdField != null) {
if (FieldValueType.STRING == invoiceIdField.getValue().getValueType()) {
String invoiceId = invoiceIdField.getValue().asString();
System.out.printf("Invoice Id: %s, confidence: %.2f%n", invoiceId, invoiceIdField.getConfidence());
}
}
FormField invoiceDateField = recognizedFields.get("InvoiceDate");
if (invoiceDateField != null) {
if (FieldValueType.DATE == invoiceDateField.getValue().getValueType()) {
LocalDate invoiceDate = invoiceDateField.getValue().asDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n", invoiceDate, invoiceDateField.getConfidence());
}
}
FormField invoiceTotalField = recognizedFields.get("InvoiceTotal");
if (invoiceTotalField != null) {
if (FieldValueType.FLOAT == invoiceTotalField.getValue().getValueType()) {
Float invoiceTotal = invoiceTotalField.getValue().asFloat();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n", invoiceTotal, invoiceTotalField.getConfidence());
}
}
}
}
Navigate back to your main project directory, form-recognizer-app.
- Build your application with the build command:
gradle build
- Run your application with the run command:
gradle run
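In this quickstart, you use the following APIs to extract structured data from forms and documents:
Azure subscription - Create a trial subscription.
The latest version of Visual Studio Code or your preferred IDE.
The latest LTS version of Node.js.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart: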
Create a new Node.js application. In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.
mkdir form-recognizer-app && cd form-recognizer-app
Run the npm init command to create a node application with a package.json file.
npm init
Install the ai-form-recognizer client library npm package:
npm install @azure/ai-form-recognizer
Your app's package.json file is updated with the dependencies.
Create a file named index.js, open it, and import the following libraries:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
Create variables for your resource's Azure endpoint and key:
const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
At this point, your JavaScript application should contain the following lines of code:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
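Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- We added the file URI value to the formUrl variable near the top of the file.
- To analyze a given file at a URI, use the beginRecognizeContentFromUrl method.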
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function recognizeContent() {
const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginRecognizeContentFromUrl(formUrl);
const pages = await poller.pollUntilDone();
if (!pages || pages.length === 0) {
throw new Error("Expecting non-empty list of pages!");
}
for (const page of pages) {
console.log(
`Page ${page.pageNumber}: width ${page.width} and height ${page.height} with unit ${page.unit}`
);
for (const table of page.tables) {
for (const cell of table.cells) {
console.log(`cell [${cell.rowIndex},${cell.columnIndex}] has text ${cell.text}`);
}
}
}
}
recognizeContent().catch((err) => {
console.error("The sample encountered an error:", err);
});
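This sample demonstrates how to analyze data from certain types of common documents with pretrained models, using an invoice as an example. See the prebuilt concept page for a full list of invoice fields.
Invoices aren't the only option: several prebuilt models are available, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. Here are the prebuilt models currently supported by the Document Intelligence service: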
const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function recognizeInvoices() {
const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginRecognizeInvoicesFromUrl(invoiceUrl);
const [invoice] = await poller.pollUntilDone();
if (invoice === undefined) {
throw new Error("Failed to extract data from at least one invoice.");
}
/**
* This is a helper function for printing a simple field with an elemental type.
*/
function fieldToString(field) {
const {
name,
valueType,
value,
confidence
} = field;
return `${name} (${valueType}): '${value}' with confidence ${confidence}`;
}
console.log("Invoice fields:");
/**
* Invoices contain a lot of optional fields, but they are all of elemental types
* such as strings, numbers, and dates, so we will just enumerate them all.
*/
for (const [name, field] of Object.entries(invoice.fields)) {
if (field.valueType !== "array" && field.valueType !== "object") {
console.log(`- ${name} ${fieldToString(field)}`);
}
}
// Invoices also support nested line items, so we can iterate over them.
let idx = 0;
console.log("- Items:");
const items = invoice.fields["Items"]?.value;
for (const item of items ?? []) {
const value = item.value;
// Each item has several subfields that are nested within the item. We'll
// map over this list of the subfields and filter out any fields that
// weren't found. Not all fields will be returned every time, only those
// that the service identified for the particular document in question.
const subFields = [
"Description",
"Quantity",
"Unit",
"UnitPrice",
"ProductCode",
"Date",
"Tax",
"Amount"
]
.map((fieldName) => value[fieldName])
.filter((field) => field !== undefined);
console.log(
[
` - Item #${idx++}`,
// Now we will convert those fields into strings to display
...subFields.map((field) => ` - ${fieldToString(field)}`)
].join("\n")
);
}
}
recognizeInvoices().catch((err) => {
console.error("The sample encountered an error:", err);
});
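In this quickstart, you use the following APIs to extract structured data from forms and documents:
Azure subscription - Create a trial subscription.
- Your Python installation should include pip. You can check whether pip is installed by running pip --version on the command line. Get pip by installing the latest version of Python.
An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. You paste your key and endpoint into the code later in this quickstart:
Open a terminal window in your local environment and install the Azure AI Document Intelligence client library for Python with pip:
pip install azure-ai-formrecognizer
Create a new Python application called form_recognizer_quickstart.py in your preferred editor or IDE. Then import the following libraries: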
import os
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential
endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"
At this point, your Python application should contain the following lines of code:
import os
from azure.core.exceptions import ResourceNotFoundError
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential
endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"
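Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.
- For this example, you need a document file at a URI. You can use our sample document for this quickstart.
- We added the file URI value to the formUrl variable near the top of the file.
- To analyze a given file at a URI, use the begin_recognize_content_from_url method.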
def format_bounding_box(bounding_box):
    if not bounding_box:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])


def recognize_content():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
    form_recognizer_client = FormRecognizerClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    poller = form_recognizer_client.begin_recognize_content_from_url(formUrl)
    form_pages = poller.result()
    for idx, content in enumerate(form_pages):
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                content.width, content.height, content.unit
            )
        )
        for table_idx, table in enumerate(content.tables):
            print(
                "Table # {} has {} rows and {} columns".format(
                    table_idx, table.row_count, table.column_count
                )
            )
            print(
                "Table # {} location on page: {}".format(
                    table_idx, format_bounding_box(table.bounding_box)
                )
            )
            for cell in table.cells:
                print(
                    "...Cell[{}][{}] has text '{}' within bounding box '{}'".format(
                        cell.row_index,
                        cell.column_index,
                        cell.text,
                        format_bounding_box(cell.bounding_box),
                    )
                )
        for line_idx, line in enumerate(content.lines):
            print(
                "Line # {} has word count '{}' and text '{}' within bounding box '{}'".format(
                    line_idx,
                    len(line.words),
                    line.text,
                    format_bounding_box(line.bounding_box),
                )
            )
            if line.appearance:
                if (
                    line.appearance.style_name == "handwriting"
                    and line.appearance.style_confidence > 0.8
                ):
                    print(
                        "Text line '{}' is handwritten and might be a signature.".format(
                            line.text
                        )
                    )
            for word in line.words:
                print(
                    "...Word '{}' has a confidence of {}".format(
                        word.text, word.confidence
                    )
                )
        for selection_mark in content.selection_marks:
            print(
                "Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_bounding_box(selection_mark.bounding_box),
                    selection_mark.confidence,
                )
            )
        print("----------------------------------------")


if __name__ == "__main__":
    recognize_content()
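This example shows how to analyze data from certain types of common documents with pretrained models, using an invoice as an example. See the prebuilt concept page for a full list of invoice fields.
Invoices are not the only option: several prebuilt models are available, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed.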
def recognize_invoice():
    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"
    form_recognizer_client = FormRecognizerClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    # Start the long-running operation and wait for it to finish
    poller = form_recognizer_client.begin_recognize_invoices_from_url(
        invoiceUrl, locale="en-US"
    )
    invoices = poller.result()
    for idx, invoice in enumerate(invoices):
        vendor_name = invoice.fields.get("VendorName")
        if vendor_name:
            print(
                "Vendor Name: {} has confidence: {}".format(
                    vendor_name.value, vendor_name.confidence
                )
            )
        vendor_address = invoice.fields.get("VendorAddress")
        if vendor_address:
            print(
                "Vendor Address: {} has confidence: {}".format(
                    vendor_address.value, vendor_address.confidence
                )
            )
        vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
        if vendor_address_recipient:
            print(
                "Vendor Address Recipient: {} has confidence: {}".format(
                    vendor_address_recipient.value, vendor_address_recipient.confidence
                )
            )
        customer_name = invoice.fields.get("CustomerName")
        if customer_name:
            print(
                "Customer Name: {} has confidence: {}".format(
                    customer_name.value, customer_name.confidence
                )
            )
        customer_id = invoice.fields.get("CustomerId")
        if customer_id:
            print(
                "Customer Id: {} has confidence: {}".format(
                    customer_id.value, customer_id.confidence
                )
            )
        customer_address = invoice.fields.get("CustomerAddress")
        if customer_address:
            print(
                "Customer Address: {} has confidence: {}".format(
                    customer_address.value, customer_address.confidence
                )
            )
        customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
        if customer_address_recipient:
            print(
                "Customer Address Recipient: {} has confidence: {}".format(
                    customer_address_recipient.value,
                    customer_address_recipient.confidence,
                )
            )
        invoice_id = invoice.fields.get("InvoiceId")
        if invoice_id:
            print(
                "Invoice Id: {} has confidence: {}".format(
                    invoice_id.value, invoice_id.confidence
                )
            )
        invoice_date = invoice.fields.get("InvoiceDate")
        if invoice_date:
            print(
                "Invoice Date: {} has confidence: {}".format(
                    invoice_date.value, invoice_date.confidence
                )
            )
        invoice_total = invoice.fields.get("InvoiceTotal")
        if invoice_total:
            print(
                "Invoice Total: {} has confidence: {}".format(
                    invoice_total.value, invoice_total.confidence
                )
            )
        due_date = invoice.fields.get("DueDate")
        if due_date:
            print(
                "Due Date: {} has confidence: {}".format(
                    due_date.value, due_date.confidence
                )
            )
        purchase_order = invoice.fields.get("PurchaseOrder")
        if purchase_order:
            print(
                "Purchase Order: {} has confidence: {}".format(
                    purchase_order.value, purchase_order.confidence
                )
            )
        billing_address = invoice.fields.get("BillingAddress")
        if billing_address:
            print(
                "Billing Address: {} has confidence: {}".format(
                    billing_address.value, billing_address.confidence
                )
            )
        billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
        if billing_address_recipient:
            print(
                "Billing Address Recipient: {} has confidence: {}".format(
                    billing_address_recipient.value,
                    billing_address_recipient.confidence,
                )
            )
        shipping_address = invoice.fields.get("ShippingAddress")
        if shipping_address:
            print(
                "Shipping Address: {} has confidence: {}".format(
                    shipping_address.value, shipping_address.confidence
                )
            )
        shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
        if shipping_address_recipient:
            print(
                "Shipping Address Recipient: {} has confidence: {}".format(
                    shipping_address_recipient.value,
                    shipping_address_recipient.confidence,
                )
            )
        # Each entry in "Items" is a dictionary of line-item fields
        print("Invoice items:")
        for idx, item in enumerate(invoice.fields.get("Items").value):
            item_description = item.value.get("Description")
            if item_description:
                print(
                    "......Description: {} has confidence: {}".format(
                        item_description.value, item_description.confidence
                    )
                )
            item_quantity = item.value.get("Quantity")
            if item_quantity:
                print(
                    "......Quantity: {} has confidence: {}".format(
                        item_quantity.value, item_quantity.confidence
                    )
                )
            unit = item.value.get("Unit")
            if unit:
                print(
                    "......Unit: {} has confidence: {}".format(
                        unit.value, unit.confidence
                    )
                )
            unit_price = item.value.get("UnitPrice")
            if unit_price:
                print(
                    "......Unit Price: {} has confidence: {}".format(
                        unit_price.value, unit_price.confidence
                    )
                )
            product_code = item.value.get("ProductCode")
            if product_code:
                print(
                    "......Product Code: {} has confidence: {}".format(
                        product_code.value, product_code.confidence
                    )
                )
            item_date = item.value.get("Date")
            if item_date:
                print(
                    "......Date: {} has confidence: {}".format(
                        item_date.value, item_date.confidence
                    )
                )
            tax = item.value.get("Tax")
            if tax:
                print(
                    "......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
                )
            amount = item.value.get("Amount")
            if amount:
                print(
                    "......Amount: {} has confidence: {}".format(
                        amount.value, amount.confidence
                    )
                )
        subtotal = invoice.fields.get("SubTotal")
        if subtotal:
            print(
                "Subtotal: {} has confidence: {}".format(
                    subtotal.value, subtotal.confidence
                )
            )
        total_tax = invoice.fields.get("TotalTax")
        if total_tax:
            print(
                "Total Tax: {} has confidence: {}".format(
                    total_tax.value, total_tax.confidence
                )
            )
        previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
        if previous_unpaid_balance:
            print(
                "Previous Unpaid Balance: {} has confidence: {}".format(
                    previous_unpaid_balance.value, previous_unpaid_balance.confidence
                )
            )
        amount_due = invoice.fields.get("AmountDue")
        if amount_due:
            print(
                "Amount Due: {} has confidence: {}".format(
                    amount_due.value, amount_due.confidence
                )
            )
        service_start_date = invoice.fields.get("ServiceStartDate")
        if service_start_date:
            print(
                "Service Start Date: {} has confidence: {}".format(
                    service_start_date.value, service_start_date.confidence
                )
            )
        service_end_date = invoice.fields.get("ServiceEndDate")
        if service_end_date:
            print(
                "Service End Date: {} has confidence: {}".format(
                    service_end_date.value, service_end_date.confidence
                )
            )
        service_address = invoice.fields.get("ServiceAddress")
        if service_address:
            print(
                "Service Address: {} has confidence: {}".format(
                    service_address.value, service_address.confidence
                )
            )
        service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
        if service_address_recipient:
            print(
                "Service Address Recipient: {} has confidence: {}".format(
                    service_address_recipient.value,
                    service_address_recipient.confidence,
                )
            )
        remittance_address = invoice.fields.get("RemittanceAddress")
        if remittance_address:
            print(
                "Remittance Address: {} has confidence: {}".format(
                    remittance_address.value, remittance_address.confidence
                )
            )
        remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
        if remittance_address_recipient:
            print(
                "Remittance Address Recipient: {} has confidence: {}".format(
                    remittance_address_recipient.value,
                    remittance_address_recipient.confidence,
                )
            )


if __name__ == "__main__":
    recognize_invoice()
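The invoice sample repeats the same get-check-print pattern for every field. As an alternative, the pattern can be folded into a loop over field names. This is a rough sketch rather than the official sample: describe_fields is a hypothetical helper, the SimpleNamespace objects stand in for the field objects returned by the service, and the output labels use the raw field names instead of the spaced labels in the sample.

```python
from types import SimpleNamespace


def describe_fields(fields, names):
    """Return one line per requested field that is present in `fields`."""
    lines = []
    for name in names:
        field = fields.get(name)
        if field:  # same presence check as the invoice sample
            lines.append(
                "{}: {} has confidence: {}".format(name, field.value, field.confidence)
            )
    return lines


# Stand-in data so the sketch runs without calling the service.
sample_fields = {
    "VendorName": SimpleNamespace(value="Contoso", confidence=0.97),
    "InvoiceTotal": SimpleNamespace(value=110.0, confidence=0.95),
}
for line in describe_fields(
    sample_fields, ["VendorName", "CustomerName", "InvoiceTotal"]
):
    print(line)
```

With a real result, passing invoice.fields as the first argument should work the same way, because the sample reads each field with .get() and then uses its value and confidence attributes.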
Navigate to the folder that contains your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py
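| Document Intelligence REST API | Azure REST API reference |
In this quickstart, you use the Document Intelligence REST API to extract structured data from forms and documents.
Azure subscription - Create a trial subscription.
cURL installed.
PowerShell version 6.0 or later, or a similar command-line application.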
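An Azure AI services or Document Intelligence resource. Once you have your Azure subscription, create a single-service or multi-service Document Intelligence resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Note that you need a single-service resource if you intend to use Microsoft Entra authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Document Intelligence API. Later in this quickstart, you paste your key and endpoint into the code.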
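Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials, such as Azure Key Vault. For more information, see Azure AI services security.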
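- For this example, you need a document file at a URI. You can use the sample document for this quickstart.
- Replace {endpoint} with the endpoint that you obtained with your Document Intelligence subscription.
- Replace {key} with the key you copied from the previous step.
- Replace {your-document-url} with the sample document URL:
https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf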
curl -v -i -X POST "https://{endpoint}/formrecognizer/v2.1/layout/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
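You receive a 202 (Accepted) response that includes an Operation-Location header. The value of this header contains a result ID that you can use to query the status of the asynchronous operation and get the results:
https://cognitiveservice/formrecognizer/v2.1/layout/analyzeResults/{resultId}
In the following example, the string after analyzeResults/ in the URL is the result ID.
https://cognitiveservice/formrecognizer/v2.1/layout/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb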
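For readers who prefer a script to cURL, the submit step and the result ID extraction can be sketched with the Python standard library. This is an illustrative sketch, not an official client: build_layout_request and result_id_from_operation_location are hypothetical helper names, and the endpoint argument is assumed to include the https:// scheme. The request is only built here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlparse


def build_layout_request(endpoint, key, document_url):
    """Build (but do not send) the v2.1 Analyze Layout POST request."""
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def result_id_from_operation_location(operation_location):
    """Return the path segment that follows 'analyzeResults/'."""
    segments = urlparse(operation_location).path.rstrip("/").split("/")
    if len(segments) < 2 or segments[-2] != "analyzeResults":
        raise ValueError("not an analyzeResults URL: " + operation_location)
    return segments[-1]


# The example Operation-Location value from this quickstart.
location = (
    "https://cognitiveservice/formrecognizer/v2.1/layout/analyzeResults/"
    "54f0b076-4e38-43e5-81bd-b85b8835fdfb"
)
print(result_id_from_operation_location(location))
```

To actually submit the document, pass the request to urllib.request.urlopen and read the Operation-Location value from the response headers.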
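After you call the Analyze Layout API, call the Get Analyze Layout Result API to get the status of the operation and the extracted data. Before you run the command, make these changes:
- Replace {endpoint} with the endpoint that you obtained with your Document Intelligence subscription.
- Replace {key} with the key you copied from the previous step.
- Replace {resultId} with the result ID from the previous step.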
curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/layout/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"
You receive a 200 (OK) response with JSON content.
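Because the analysis runs asynchronously, the GET request may need to be repeated until the operation finishes. A minimal polling sketch follows, assuming the JSON body carries a status field with the values notStarted, running, succeeded, or failed. The wait_for_result helper and its fetch_status callable are hypothetical; the callable is expected to perform the GET request and return the parsed JSON body.

```python
import time


def wait_for_result(fetch_status, interval_seconds=1.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until the operation finishes; return the final payload.

    fetch_status is a caller-supplied callable (hypothetical in this sketch)
    that performs the GET request and returns the parsed JSON body as a dict.
    """
    for _ in range(max_attempts):
        payload = fetch_status()
        status = payload.get("status")
        if status == "succeeded":
            return payload
        if status == "failed":
            raise RuntimeError("analysis failed")
        sleep(interval_seconds)  # still notStarted or running; try again later
    raise TimeoutError("analysis did not finish in time")


# Simulated responses so the sketch runs without calling the service.
responses = iter(
    [
        {"status": "notStarted"},
        {"status": "running"},
        {"status": "succeeded", "analyzeResult": {}},
    ]
)
final = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

Injecting the sleep function keeps the loop testable; in a real script the default time.sleep is used and the interval can be tuned to stay within the service's rate limits.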
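See the following invoice image and its corresponding JSON output.
- The "readResults" node contains every line of text with its respective bounding box placement on the page.
- The selectionMarks node shows every selection mark (checkbox, radio button) and whether its status is selected or unselected.
- The "pageResults" section includes the extracted tables. For each table, the text, row and column index, row and column span, bounding box, and more are extracted.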
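The nodes described above can be walked with a few loops. The sketch below assumes the v2.1 response shape, in which readResults and pageResults sit under a top-level analyzeResult object and table cells carry rowIndex, columnIndex, and text. The summarize_layout helper is hypothetical and the sample payload is hand-written for illustration, not real service output.

```python
def summarize_layout(result):
    """Collect line texts, selection-mark states, and table cells from a layout result."""
    analyze = result.get("analyzeResult", {})
    lines, marks, cells = [], [], []
    for page in analyze.get("readResults", []):
        lines.extend(line["text"] for line in page.get("lines", []))
        marks.extend(mark["state"] for mark in page.get("selectionMarks", []))
    for page in analyze.get("pageResults", []):
        for table in page.get("tables", []):
            for cell in table.get("cells", []):
                cells.append((cell["rowIndex"], cell["columnIndex"], cell["text"]))
    return lines, marks, cells


# Hand-written payload in the assumed shape; not real service output.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "lines": [{"text": "Contoso"}],
                "selectionMarks": [{"state": "selected"}],
            }
        ],
        "pageResults": [
            {
                "page": 1,
                "tables": [
                    {
                        "rows": 1,
                        "columns": 2,
                        "cells": [
                            {"rowIndex": 0, "columnIndex": 0, "text": "Item"},
                            {"rowIndex": 0, "columnIndex": 1, "text": "Price"},
                        ],
                    }
                ],
            }
        ],
    },
}
lines, marks, cells = summarize_layout(sample_result)
print(lines, marks, cells)
```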
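- For this example, you analyze an invoice document with a prebuilt model. You can use the sample invoice document for this quickstart.
Invoices are not the only option: several prebuilt models are available, each with its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed.
Before you run the command, make these changes:
- Replace {endpoint} with the endpoint that you obtained with your Document Intelligence subscription.
- Replace {key} with the key you copied from the previous step.
- Replace {your-document-url} with the sample invoice URL: https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf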
curl -v -i -X POST "https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
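You receive a 202 (Accepted) response that includes an Operation-Location header. The value of this header contains a result ID that you can use to query the status of the asynchronous operation and get the results:
https://cognitiveservice/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/{resultId}
In the following example, the string after analyzeResults/ in the URL is the result ID:
https://cognitiveservice/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb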
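After you call the Analyze Invoice API, call the Get Analyze Invoice Result API to get the status of the operation and the extracted data. Before you run the command, make these changes:
- Replace {endpoint} with the endpoint of your Document Intelligence resource. You can find it on the Overview tab of the resource.
- Replace {resultId} with the result ID from the previous step.
- Replace {key} with your key.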
curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"
You receive a 200 (OK) response with JSON output.
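- The "readResults" field contains every line of text that was extracted from the invoice.
- The "pageResults" field includes the tables and selection marks extracted from the invoice.
- The "documentResults" field contains key/value information for the most relevant parts of the invoice.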
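The key/value information in "documentResults" can be flattened into a simple lookup table. The sketch below assumes the v2.1 shape, in which each entry of documentResults has a fields object whose values carry text and confidence. The invoice_fields helper is hypothetical and the sample payload is hand-written for illustration, not real service output.

```python
def invoice_fields(result):
    """Map each extracted field name to its (text, confidence) pair."""
    summary = {}
    for document in result.get("analyzeResult", {}).get("documentResults", []):
        for name, field in document.get("fields", {}).items():
            if field is not None:  # skip fields the model returned as null
                summary[name] = (field.get("text"), field.get("confidence"))
    return summary


# Hand-written payload in the assumed shape; not real service output.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {
                "docType": "prebuilt:invoice",
                "fields": {
                    "VendorName": {
                        "type": "string",
                        "valueString": "Contoso",
                        "text": "Contoso",
                        "confidence": 0.97,
                    },
                    "CustomerName": None,
                },
            }
        ]
    },
}
for name, (text, confidence) in invoice_fields(sample_result).items():
    print("{}: {} (confidence {})".format(name, text, confidence))
```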
See the sample invoice document.
See the full sample output on GitHub.
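That's it, well done!
For an enhanced experience and advanced model quality, try Document Intelligence Studio.
The Studio supports any model trained with v2.1 labeled data.
The changelogs provide detailed information for migrating from v3.1 to v4.0.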