How to: Sentiment Analysis and Opinion Mining

The Text Analytics API's Sentiment Analysis feature provides two ways to detect positive and negative sentiment. If you send a Sentiment Analysis request, the API returns sentiment labels (such as "negative", "neutral", and "positive") and confidence scores at the sentence and document level. You can also send Opinion Mining requests using the Sentiment Analysis endpoint, which provides granular information about the opinions related to aspects (such as the attributes of products or services) in text.

The AI models used by the API are provided by the service; you only have to send content for analysis.

Sentiment Analysis versions and features

Feature | Sentiment Analysis v3 | Sentiment Analysis v3.1 (Preview)
Methods for single and batch requests | X | X
Sentiment Analysis scores and labeling | X | X
Linux-based Docker container | X | -
Opinion Mining | - | X

Sentiment Analysis

Sentiment Analysis in version 3.x applies sentiment labels to text. The labels are returned at the sentence and document level, each with a confidence score.

The labels are positive, negative, and neutral. At the document level, the mixed sentiment label can also be returned. The sentiment of the document is determined as follows:

Sentence sentiment | Returned document label
At least one positive sentence is in the document. The rest of the sentences are neutral. | positive
At least one negative sentence is in the document. The rest of the sentences are neutral. | negative
At least one negative sentence and at least one positive sentence are in the document. | mixed
All sentences in the document are neutral. | neutral
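The rules in the table can be sketched as a small function. This is only an illustration of the table, not the service's implementation: the API computes the document label itself, and the function name `document_label` is hypothetical.

```python
from typing import Iterable


def document_label(sentence_labels: Iterable[str]) -> str:
    """Derive a document label from sentence labels, following the table above.

    Hypothetical helper for illustration; the service returns the
    document label on its own.
    """
    labels = set(sentence_labels)
    has_positive = "positive" in labels
    has_negative = "negative" in labels

    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    # Only neutral sentences remain (assumption: an empty input is
    # also treated as neutral, which the table does not cover).
    return "neutral"
```

For example, `document_label(["positive", "neutral"])` yields `"positive"`, while a document with one positive and one negative sentence yields `"mixed"`.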

Confidence scores range from 0 to 1. Scores closer to 1 indicate a higher confidence in the label's classification, while lower scores indicate lower confidence. For each document or each sentence, the predicted scores associated with the labels (positive, negative, and neutral) add up to 1.
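A minimal sketch of reading a set of confidence scores: it checks that the three scores add up to roughly 1 and returns the highest-scoring label. The function name `top_label` and the rounding tolerance of 0.02 are assumptions made for this example, not part of the API.

```python
import math
from typing import Dict, Tuple


def top_label(confidence_scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest confidence.

    Hypothetical helper. The tolerance below is an assumption to allow
    for rounding in the scores; the article only says they add up to 1.
    """
    total = sum(confidence_scores.values())
    if not math.isclose(total, 1.0, abs_tol=0.02):
        raise ValueError(f"scores add up to {total}, expected about 1")

    label = max(confidence_scores, key=confidence_scores.get)
    return label, confidence_scores[label]
```

Note that the highest-scoring label is not necessarily the document label the service returns, because a document can also be labeled mixed.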

Opinion Mining

Opinion Mining is a feature of Sentiment Analysis, starting in the preview of version 3.1. Also known as Aspect-based Sentiment Analysis in Natural Language Processing (NLP), this feature provides more granular information about the opinions related to aspects (such as the attributes of products or services) in text.

For example, if a customer leaves feedback about a hotel such as "The room was great, but the staff was unfriendly.", Opinion Mining locates the aspects in the text (the room and the staff) along with their associated opinions and sentiments. Sentiment Analysis alone might report only a negative sentiment.

Figure: Opinion Mining example.

To get Opinion Mining in your results, you must include the opinionMining=true flag in a request for sentiment analysis. The Opinion Mining results are included in the sentiment analysis response. Opinion Mining is an extension of Sentiment Analysis and is included in your current pricing tier.

Sending a REST API request

Preparation

Sentiment analysis produces a higher-quality result when you give it smaller amounts of text to work on. This is the opposite of key phrase extraction, which performs better on larger blocks of text. To get the best results from both operations, consider restructuring the inputs accordingly.

Each JSON document must contain an ID, text, and language. Sentiment Analysis supports a wide range of languages, with more in preview. For more information, see Supported languages.

Document size must be under 5,120 characters per document. For the maximum number of documents permitted in a collection, see the data limits article under Concepts. The collection is submitted in the body of the request.
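As a rough sketch, the documents collection could be assembled and checked against the size limit as follows. The function name `build_documents` is hypothetical, and measuring size with Python's `len()` (which counts Unicode code points) is an assumption; the service may count characters differently.

```python
from typing import Dict, Iterable, List

MAX_CHARS = 5120  # per-document limit stated above


def build_documents(texts: Iterable[str], language: str = "en") -> Dict[str, List[dict]]:
    """Build the request body: a collection of ID, text, and language.

    Hypothetical helper; IDs are assigned sequentially starting at "1".
    """
    documents = []
    for index, text in enumerate(texts, start=1):
        # Assumption: len() approximates the service's character count.
        if len(text) >= MAX_CHARS:
            raise ValueError(f"document {index} has {len(text)} characters")
        documents.append({"language": language, "id": str(index), "text": text})
    return {"documents": documents}
```

Rejecting oversized text before sending avoids a round trip that would fail on the service side; splitting long text into smaller documents also fits the guidance that sentiment analysis works better on smaller inputs.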

Structure the request

Create a POST request. You can use Postman or the API testing console in the following reference links to quickly structure and send one.

Request endpoints

Set the HTTPS endpoint for sentiment analysis by using either a Text Analytics resource on Azure or an instantiated Text Analytics container. You must include the correct URL for the version you want to use. For example:

Note

You can find the key and endpoint for your Text Analytics resource on the Azure portal. They are located on the resource's Quick start page, under Resource Management.

Sentiment Analysis

https://<your-custom-subdomain>.cognitiveservices.azure.cn/text/analytics/v3.1-preview.3/sentiment

Opinion Mining

To get Opinion Mining results, you must include the opinionMining=true parameter. For example:

https://<your-custom-subdomain>.cognitiveservices.azure.cn/text/analytics/v3.1-preview.3/sentiment?opinionMining=true

This parameter is set to false by default.
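The two endpoint URLs above differ only in the query string, so a small helper can build either one. This is a sketch: the function name `sentiment_url` and its parameters are hypothetical, and the default version string is simply the one shown in the examples above.

```python
from urllib.parse import urlencode


def sentiment_url(subdomain: str,
                  version: str = "v3.1-preview.3",
                  opinion_mining: bool = False) -> str:
    """Build the /sentiment endpoint URL shown in the examples above.

    Hypothetical helper; `subdomain` is your custom subdomain.
    """
    base = (
        f"https://{subdomain}.cognitiveservices.azure.cn"
        f"/text/analytics/{version}/sentiment"
    )
    if opinion_mining:
        # opinionMining defaults to false, so it is added only when requested.
        return base + "?" + urlencode({"opinionMining": "true"})
    return base
```

Calling `sentiment_url("contoso", opinion_mining=True)` produces the Opinion Mining form of the URL, where `contoso` stands in for a real subdomain.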

Set a request header to include your Text Analytics API key. In the request body, provide the JSON documents collection you prepared for this analysis.

Example request for Sentiment Analysis and Opinion Mining

The following is an example of content you might submit for sentiment analysis. The request format is the same for both v3.0 and v3.1-preview.

{
  "documents": [
    {
      "language": "en",
      "id": "1",
      "text": "The restaurant had great food and our waiter was friendly."
    }
  ]
}
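A minimal sketch of preparing such a request with Python's standard library is shown below. The header name `Ocp-Apim-Subscription-Key` is not stated in this article; it is assumed here as the header that carries the API key. The function name `build_sentiment_request` is hypothetical.

```python
import json
import urllib.request


def build_sentiment_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the /sentiment endpoint.

    Assumption: the API key is passed in the Ocp-Apim-Subscription-Key
    header; this article does not name the header.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.load`. Building the request separately from sending it makes the payload and headers easy to inspect before any network call.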

Post the request

Analysis is performed upon receipt of the request. For information on the size and number of requests you can send per minute and second, see the data limits section in the overview.

The Text Analytics API is stateless. No data is stored in your account, and results are returned immediately in the response.

View the results

Output is returned immediately. You can stream the results to an application that accepts JSON or save the output to a file on the local system. Then, import the output into an application that you can use to sort, search, and manipulate the data. Due to multilingual and emoji support, the response may contain text offsets. For more information, see how to process offsets.
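One way to prepare the output for a spreadsheet or reporting tool is to flatten the document-level results into CSV. The sketch below assumes the field names used in the example response in this article (`id`, `sentiment`, `confidenceScores`); the function name `response_to_csv` is hypothetical.

```python
import csv
import io


def response_to_csv(response: dict) -> str:
    """Flatten document-level sentiment results into CSV text.

    Hypothetical helper; field names follow the example response
    in this article.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "sentiment", "positive", "neutral", "negative"])
    for doc in response.get("documents", []):
        scores = doc["confidenceScores"]
        writer.writerow([
            doc["id"],
            doc["sentiment"],
            scores["positive"],
            scores["neutral"],
            scores["negative"],
        ])
    return buffer.getvalue()
```

The returned string can be written to a `.csv` file and opened in an application that sorts and filters tabular data.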

Sentiment Analysis and Opinion Mining example response

Important

The following is a JSON example for using Opinion Mining with Sentiment Analysis, offered in v3.1 of the API. If you don't request Opinion Mining, the API response is the same as the version 3.0 response.

Sentiment Analysis v3.1 can return response objects for both Sentiment Analysis and Opinion Mining.

Sentiment analysis returns a sentiment label and confidence score for the entire document, and for each sentence within it. Scores closer to 1 indicate a higher confidence in the label's classification, while lower scores indicate lower confidence. A document can have multiple sentences, and the confidence scores within each document or sentence add up to 1.

Opinion Mining locates aspects in the text, and their associated opinions and sentiments. In the response below, the sentence "The restaurant had great food and our waiter was friendly." has two aspects: food and waiter. Each aspect's relations property contains a ref value with the URI reference to the associated documents, sentences, and opinions objects.

{
    "documents": [
        {
            "id": "1",
            "sentiment": "positive",
            "confidenceScores": {
                "positive": 1.0,
                "neutral": 0.0,
                "negative": 0.0
            },
            "sentences": [
                {
                    "sentiment": "positive",
                    "confidenceScores": {
                        "positive": 1.0,
                        "neutral": 0.0,
                        "negative": 0.0
                    },
                    "offset": 0,
                    "length": 58,
                    "text": "The restaurant had great food and our waiter was friendly.",
                    "aspects": [
                        {
                            "sentiment": "positive",
                            "confidenceScores": {
                                "positive": 1.0,
                                "negative": 0.0
                            },
                            "offset": 25,
                            "length": 4,
                            "text": "food",
                            "relations": [
                                {
                                    "relationType": "opinion",
                                    "ref": "#/documents/0/sentences/0/opinions/0"
                                }
                            ]
                        },
                        {
                            "sentiment": "positive",
                            "confidenceScores": {
                                "positive": 1.0,
                                "negative": 0.0
                            },
                            "offset": 38,
                            "length": 6,
                            "text": "waiter",
                            "relations": [
                                {
                                    "relationType": "opinion",
                                    "ref": "#/documents/0/sentences/0/opinions/1"
                                }
                            ]
                        }
                    ],
                    "opinions": [
                        {
                            "sentiment": "positive",
                            "confidenceScores": {
                                "positive": 1.0,
                                "negative": 0.0
                            },
                            "offset": 19,
                            "length": 5,
                            "text": "great",
                            "isNegated": false
                        },
                        {
                            "sentiment": "positive",
                            "confidenceScores": {
                                "positive": 1.0,
                                "negative": 0.0
                            },
                            "offset": 49,
                            "length": 8,
                            "text": "friendly",
                            "isNegated": false
                        }
                    ]
                }
            ],
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2020-04-01"
}
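To pair each aspect with its opinion, the ref values can be followed through the response. The sketch below treats each ref as a path of keys and array positions starting at the response root, which matches the example above; both function names are hypothetical.

```python
from typing import List, Tuple


def resolve_ref(response: dict, ref: str):
    """Follow a ref such as "#/documents/0/sentences/0/opinions/0".

    Assumption: numeric path segments are positions in the enclosing
    array, as in the example response above.
    """
    node = response
    for part in ref.lstrip("#/").split("/"):
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def aspect_opinions(response: dict) -> List[Tuple[str, str, str]]:
    """Return (aspect text, opinion text, opinion sentiment) triples."""
    pairs = []
    for doc in response["documents"]:
        for sentence in doc["sentences"]:
            # "aspects" is absent when Opinion Mining was not requested.
            for aspect in sentence.get("aspects", []):
                for relation in aspect["relations"]:
                    if relation["relationType"] == "opinion":
                        opinion = resolve_ref(response, relation["ref"])
                        pairs.append(
                            (aspect["text"], opinion["text"], opinion["sentiment"])
                        )
    return pairs
```

Applied to the example response, this links the aspect food to the opinion great and the aspect waiter to the opinion friendly.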

Summary

In this article, you learned concepts and workflow for sentiment analysis using the Text Analytics API. In summary:

  • Sentiment Analysis and Opinion Mining are available for select languages.
  • JSON documents in the request body include an ID, text, and language code.
  • The POST request is sent to a /sentiment endpoint, using a personalized access key and an endpoint that's valid for your subscription.
  • Use opinionMining=true in Sentiment Analysis requests to get Opinion Mining results.
  • Response output, which consists of a sentiment score for each document ID, can be streamed to any app that accepts JSON, such as Excel and Power BI.

See also