How to: Use Text Analytics for health (preview)

Important

Text Analytics for health is a preview capability provided “AS IS” and “WITH ALL FAULTS.” As such, Text Analytics for health (preview) should not be implemented or deployed in any production use. Text Analytics for health is not intended or made available for use as a medical device, clinical support, diagnostic tool, or other technology intended to be used in the diagnosis, cure, mitigation, treatment, or prevention of disease or other conditions, and no license or right is granted by Microsoft to use this capability for such purposes. This capability is not designed or intended to be implemented or deployed as a substitute for professional medical advice or healthcare opinion, diagnosis, treatment, or the clinical judgment of a healthcare professional, and should not be used as such. The customer is solely responsible for any use of Text Analytics for health. Microsoft does not warrant that Text Analytics for health or any materials provided in connection with the capability will be sufficient for any medical purposes or otherwise meet the health or medical requirements of any person.

Text Analytics for health is a containerized service that extracts and labels relevant medical information from unstructured texts such as doctor's notes, discharge summaries, clinical documents, and electronic health records.

Features

The Text Analytics for health container currently performs Named Entity Recognition (NER), relation extraction, entity negation, and entity linking for English-language text in your own development environment that meets your specific security and data governance requirements.

Named Entity Recognition detects words and phrases mentioned in unstructured text that can be associated with one or more semantic types, such as diagnosis, medication name, symptom/sign, or age.

Health NER

Supported languages

Text Analytics for health only supports English-language documents.

Request access to the container registry

Fill out and submit the Cognitive Services containers request form to request access to the container. Currently, you will not be billed for Text Analytics for health usage.

The form requests information about you, your company, and the user scenario for which you'll use the container. After you've submitted the form, the Azure Cognitive Services team reviews it to ensure that you meet the criteria for access to the private container registry.

Important

In the form, you must use an email address that's associated with either a Microsoft Account (MSA) or an Azure Active Directory (Azure AD) account.

If your request is approved, you'll receive an email with instructions that describe how to obtain your credentials and access the private container registry.

Use the Docker CLI to authenticate with the private container registry

You can authenticate with the private container registry for Cognitive Services Containers in any of several ways, but the recommended method from the command line is to use the Docker CLI.

Use the docker login command, as shown in the following example, to log in to containerpreview.azurecr.cn, the private container registry for Cognitive Services Containers. Replace <username> with the user name and <password> with the password that's provided in the credentials you received from the Azure Cognitive Services team.

docker login containerpreview.azurecr.cn -u <username> -p <password>

If you've secured your credentials in a text file, you can pipe the contents of that text file to the docker login command by using the cat command, as shown in the following example. Replace <passwordFile> with the path and name of the text file that contains the password and <username> with the user name that's provided in your credentials.

cat <passwordFile> | docker login containerpreview.azurecr.cn -u <username> --password-stdin
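
The same login can also be scripted. The following is a minimal Python sketch, assuming Docker is installed and available on the PATH. The helper names build_login_command and docker_login are hypothetical and are not part of any Docker tooling. The password is read from a file and passed on standard input, so it never appears in the argument list.

```python
import subprocess
from pathlib import Path

REGISTRY = "containerpreview.azurecr.cn"  # registry host named in this article


def build_login_command(username: str, registry: str = REGISTRY) -> list[str]:
    """Return the docker login argument list; the password is never included."""
    return ["docker", "login", registry, "-u", username, "--password-stdin"]


def docker_login(username: str, password_file: str, registry: str = REGISTRY) -> None:
    """Log in by sending the contents of password_file to docker login on stdin."""
    password = Path(password_file).read_text().strip()
    subprocess.run(
        build_login_command(username, registry),
        input=password,
        text=True,
        check=True,  # raise CalledProcessError if the login fails
    )
```

Calling docker_login("<username>", "<passwordFile>") would then behave like the cat pipeline above.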

Install the container

There are multiple ways you can install and run the container.

  • Use the Azure portal to create a Text Analytics resource, and use Docker to get your container.
  • Use the following PowerShell and Azure CLI scripts to automate resource deployment and container configuration.

Install the container using Azure Web App for Containers

Azure Web App for Containers is an Azure resource dedicated to running containers in the cloud. It provides out-of-the-box capabilities such as autoscaling, support for Docker containers and Docker Compose, HTTPS support, and more.

Note

When you use Azure Web App, you automatically get a domain in the form of <appservice_name>.chinacloudsites.cn.

Run this PowerShell script using the Azure CLI to create a Web App for Containers, using your subscription and the container image over HTTPS. Wait for the script to complete (approximately 20 minutes) before submitting the first request.

$subscription_name = ""                    # The name of the subscription you want your resource to be created on.
$resource_group_name = ""                  # The name of the resource group you want the AppServicePlan
                                           #    and AppService to be attached to.
$resources_location = ""                   # This is the location you wish the AppServicePlan to be deployed to.
                                           #    You can use the "az account list-locations -o table" command to
                                           #    get the list of available locations and location code names.
$appservice_plan_name = ""                 # This is the AppServicePlan name you wish to have.
$appservice_name = ""                      # This is the AppService resource name you wish to have.
$TEXT_ANALYTICS_RESOURCE_API_KEY = ""      # This should be taken from the Text Analytics resource.
$TEXT_ANALYTICS_RESOURCE_API_ENDPOINT = "" # This should be taken from the Text Analytics resource.
$DOCKER_REGISTRY_SERVER_PASSWORD = ""      # This will be provided separately.
$DOCKER_REGISTRY_SERVER_USERNAME = ""      # This will be provided separately.
$DOCKER_IMAGE_NAME = "containerpreview.azurecr.io/microsoft/cognitive-services-healthcare:latest"

az login
az account set -s $subscription_name
az appservice plan create -n $appservice_plan_name -g $resource_group_name --is-linux -l $resources_location --sku P3V2
az webapp create -g $resource_group_name -p $appservice_plan_name -n $appservice_name -i $DOCKER_IMAGE_NAME -s $DOCKER_REGISTRY_SERVER_USERNAME -w $DOCKER_REGISTRY_SERVER_PASSWORD
az webapp config appsettings set -g $resource_group_name -n $appservice_name --settings Eula=accept Billing=$TEXT_ANALYTICS_RESOURCE_API_ENDPOINT ApiKey=$TEXT_ANALYTICS_RESOURCE_API_KEY

# Once deployment is complete, the resource should be available at: https://<appservice_name>.chinacloudsites.cn

Install the container using Azure Container Instance

You can also use an Azure Container Instance (ACI) to make deployment easier. ACI is a resource that allows you to run Docker containers on-demand in a managed, serverless Azure environment.

See How to use Azure Container Instances for steps on deploying an ACI resource using the Azure portal. You can also use the following PowerShell script with the Azure CLI, which creates an ACI on your subscription using the container image. Wait for the script to complete (approximately 20 minutes) before submitting the first request.

Note

Azure Container Instances don't include HTTPS support for the built-in domains. If you need HTTPS, you will need to manually configure it, including creating a certificate and registering a domain. You can find instructions to do this with NGINX below.

$subscription_name = ""                    # The name of the subscription you want your resource to be created on.
$resource_group_name = ""                  # The name of the resource group you want the container instance
                                           # to be attached to.
$resources_location = ""                   # This is the location you wish the container instance to be deployed to.
                                           # You can use the "az account list-locations -o table" command to
                                           # get the list of available locations and location code names.
$azure_container_instance_name = ""        # This is the AzureContainerInstance name you wish to have.
$TEXT_ANALYTICS_RESOURCE_API_KEY = ""      # This should be taken from the Text Analytics resource.
$TEXT_ANALYTICS_RESOURCE_API_ENDPOINT = "" # This should be taken from the Text Analytics resource.
$DOCKER_REGISTRY_SERVER_PASSWORD = ""      # This will be provided separately.
$DOCKER_REGISTRY_SERVER_USERNAME = ""      # This will be provided separately.
$DNS_LABEL = ""                            # This is the DNS label name you wish your ACI to have.
$DOCKER_REGISTRY_LOGIN_SERVER = "containerpreview.azurecr.io"
$DOCKER_IMAGE_NAME = "containerpreview.azurecr.io/microsoft/cognitive-services-healthcare:latest"

az login
az account set -s $subscription_name
az container create --resource-group $resource_group_name --name $azure_container_instance_name --image $DOCKER_IMAGE_NAME --cpu 5 --memory 12 --registry-login-server $DOCKER_REGISTRY_LOGIN_SERVER --registry-username $DOCKER_REGISTRY_SERVER_USERNAME --registry-password $DOCKER_REGISTRY_SERVER_PASSWORD --port 5000 --dns-name-label $DNS_LABEL --environment-variables Eula=accept Billing=$TEXT_ANALYTICS_RESOURCE_API_ENDPOINT ApiKey=$TEXT_ANALYTICS_RESOURCE_API_KEY

# Once deployment is complete, the resource should be available at: http://<unique_dns_label>.<resource_group_region>.azurecontainer.io:5000
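
The address pattern in the closing script comment can be assembled from the DNS label and the region. The following Python sketch only formats that pattern; the function name aci_endpoint and the sample values are hypothetical.

```python
def aci_endpoint(dns_label: str, region: str, port: int = 5000) -> str:
    """Build the base URL of the container from the pattern in the script comment."""
    return f"http://{dns_label}.{region}.azurecontainer.io:{port}"


# Hypothetical values for illustration; substitute your own DNS label and region.
print(aci_endpoint("my-health-container", "westus2"))
```

The default port of 5000 matches the --port value passed to az container create in the script.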

Secure ACI connectivity

By default, no security is provided when you use ACI with the container API. This is because containers typically run as part of a pod, which is protected from the outside by a network bridge. You can, however, modify a container with a front-facing component, keeping the container endpoint private. The following examples use NGINX as an ingress gateway to support HTTPS/SSL and client-certificate authentication.

Note

NGINX is an open-source, high-performance HTTP server and proxy. An NGINX container can be used to terminate a TLS connection for a single container. More complex NGINX ingress-based TLS termination solutions are also possible.

Set up NGINX as an ingress gateway

NGINX uses configuration files to enable features at runtime. To enable TLS termination for another service, you must specify an SSL certificate to terminate the TLS connection and proxy_pass to specify an address for the service. A sample is provided below.

Note

ssl_certificate expects a path to be specified within the NGINX container's local filesystem. The address specified for proxy_pass must be available from within the NGINX container's network.

The NGINX container loads all of the *.conf files that are mounted under /etc/nginx/conf.d/ into the HTTP configuration path.

server {
  listen              80;
  return 301 https://$host$request_uri;
}
server {
  listen              443 ssl;
  # replace with .crt and .key paths
  ssl_certificate     /cert/Local.crt;
  ssl_certificate_key /cert/Local.key;

  location / {
    proxy_pass http://cognitive-service:5000;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP  $remote_addr;
  }
}

Example Docker compose file

The following example shows a Docker Compose file that deploys the NGINX and Text Analytics for health containers:

version: "3.7"
services:
  cognitive-service:
    image: {IMAGE_ID}
    ports:
      - 5000:5000
    environment:
      - eula=accept
      - billing={ENDPOINT_URI}
      - apikey={API_KEY}
      - Logging:Disk:Format=json
    volumes:
        # replace with path to logs folder
      - <path-to-logs-folder>:/output
  nginx:
    image: nginx
    ports:
      - 443:443
    volumes:
        # replace with paths for certs and conf folders
      - <path-to-certs-folder>:/cert
      - <path-to-conf-folder>:/etc/nginx/conf.d/

To start the containers defined in this Docker Compose file, run the following command from a console in the directory that contains the file:

docker-compose up

For more information, see NGINX's documentation on NGINX SSL Termination.

Example API request

The container provides REST-based query prediction endpoint APIs.

Use the example cURL request below to submit a query to the container you have deployed, replacing the serverURL variable with the appropriate value.

curl -X POST 'http://<serverURL>:5000/text/analytics/v3.0-preview.1/domains/health' --header 'Content-Type: application/json' --header 'accept: application/json' --data-binary @example.json

The following JSON is an example of a JSON file attached to the Text Analytics for health API request's POST body:

example.json

{
  "documents": [
    {
      "language": "en",
      "id": "1",
      "text": "Patient reported itchy sores after swimming in the lake."
    },
    {
      "language": "en",
      "id": "2",
      "text": "Prescribed 50mg benadryl, taken twice daily."
    }
  ]
}
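
The cURL request and the example.json body can also be produced from code. The following is a minimal Python sketch that uses only the standard library. The route and body layout are taken from the examples above; the function names build_health_request and analyze are hypothetical, and the sketch assumes the container is reachable at the URL you pass in.

```python
import json
import urllib.request

HEALTH_PATH = "/text/analytics/v3.0-preview.1/domains/health"  # route from the cURL example


def build_health_request(server_url: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST request whose body follows the example.json layout."""
    documents = [
        {"language": "en", "id": str(i), "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        server_url.rstrip("/") + HEALTH_PATH,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def analyze(server_url: str, texts: list[str]) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_health_request(server_url, texts)) as response:
        return json.load(response)
```

For example, analyze("http://<serverURL>:5000", ["Prescribed 50mg benadryl, taken twice daily."]) would submit one document. Document IDs are assigned sequentially starting at "1", which is a choice made for this sketch.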

API response body

The following JSON is an example of the Text Analytics for health API response body:

{
    "documents": [
        {
            "id": "1",
            "entities": [
                {
                    "id": "0",
                    "offset": 17,
                    "length": 11,
                    "text": "itchy sores",
                    "type": "SYMPTOM_OR_SIGN",
                    "score": 0.97,
                    "isNegated": false
                }
            ]
        },
        {
            "id": "2",
            "entities": [
                {
                    "id": "0",
                    "offset": 11,
                    "length": 4,
                    "text": "50mg",
                    "type": "DOSAGE",
                    "score": 1.0,
                    "isNegated": false
                },
                {
                    "id": "1",
                    "offset": 16,
                    "length": 8,
                    "text": "benadryl",
                    "type": "MEDICATION_NAME",
                    "score": 0.99,
                    "isNegated": false,
                    "links": [
                        {
                            "dataSource": "UMLS",
                            "id": "C0700899"
                        },
                        {
                            "dataSource": "CHV",
                            "id": "0000044903"
                        },
                        {
                            "dataSource": "MMSL",
                            "id": "899"
                        },
                        {
                            "dataSource": "MSH",
                            "id": "D004155"
                        },
                        {
                            "dataSource": "NCI",
                            "id": "C300"
                        },
                        {
                            "dataSource": "NCI_DTP",
                            "id": "NSC0033299"
                        },
                        {
                            "dataSource": "PDQ",
                            "id": "CDR0000039163"
                        },
                        {
                            "dataSource": "PSY",
                            "id": "05760"
                        },
                        {
                            "dataSource": "RXNORM",
                            "id": "203457"
                        }
                    ]
                },
                {
                    "id": "2",
                    "offset": 32,
                    "length": 11,
                    "text": "twice daily",
                    "type": "FREQUENCY",
                    "score": 1.0,
                    "isNegated": false
                }
            ],
            "relations": [
                {
                    "relationType": "DOSAGE_OF_MEDICATION",
                    "score": 1.0,
                    "entities": [
                        {
                            "id": "0",
                            "role": "ATTRIBUTE"
                        },
                        {
                            "id": "1",
                            "role": "ENTITY"
                        }
                    ]
                },
                {
                    "relationType": "FREQUENCY_OF_MEDICATION",
                    "score": 1.0,
                    "entities": [
                        {
                            "id": "1",
                            "role": "ENTITY"
                        },
                        {
                            "id": "2",
                            "role": "ATTRIBUTE"
                        }
                    ]
                }
            ]
        }
    ],
    "errors": [],
    "modelVersion": "2020-05-08"
}
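
In the response, each relation refers to entities by their id within the same document. The following Python sketch shows one way to resolve those ids back to entity text. It assumes, as in the sample response, that every relation contains exactly one member with the ENTITY role and one with the ATTRIBUTE role. The function name describe_relations is hypothetical, and the sample data is a trimmed copy of document "2" from the response above.

```python
def describe_relations(document: dict) -> list[tuple[str, str, str]]:
    """Return (relationType, entity text, attribute text) for one document."""
    text_by_id = {entity["id"]: entity["text"] for entity in document.get("entities", [])}
    described = []
    for relation in document.get("relations", []):
        # Map each role in the relation to the text of the entity it points at.
        role_to_text = {
            member["role"]: text_by_id[member["id"]] for member in relation["entities"]
        }
        described.append(
            (relation["relationType"], role_to_text["ENTITY"], role_to_text["ATTRIBUTE"])
        )
    return described


# Trimmed copy of document "2" from the sample response.
sample_document = {
    "id": "2",
    "entities": [
        {"id": "0", "text": "50mg", "type": "DOSAGE"},
        {"id": "1", "text": "benadryl", "type": "MEDICATION_NAME"},
        {"id": "2", "text": "twice daily", "type": "FREQUENCY"},
    ],
    "relations": [
        {"relationType": "DOSAGE_OF_MEDICATION",
         "entities": [{"id": "0", "role": "ATTRIBUTE"}, {"id": "1", "role": "ENTITY"}]},
        {"relationType": "FREQUENCY_OF_MEDICATION",
         "entities": [{"id": "1", "role": "ENTITY"}, {"id": "2", "role": "ATTRIBUTE"}]},
    ],
}

for relation_type, entity, attribute in describe_relations(sample_document):
    print(f"{relation_type}: {entity} <- {attribute}")
```

Documents without a relations array, such as document "1" in the sample, yield an empty list.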

Note

With negation detection, in some cases a single negation term may address several terms at once. The negation of a recognized entity is represented in the JSON output by the boolean value of the isNegated flag:

{
  "id": "2",
  "offset": 90,
  "length": 10,
  "text": "chest pain",
  "type": "SYMPTOM_OR_SIGN",
  "score": 0.9972,
  "isNegated": true,
  "links": [
    {
      "dataSource": "UMLS",
      "id": "C0008031"
    },
    {
      "dataSource": "CHV",
      "id": "0000023593"
    },
    ...
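
Because isNegated is a plain boolean on each entity, a consumer can separate affirmed findings from negated ones with a simple filter. The following Python sketch illustrates this. The function name split_by_negation is hypothetical, and the sample list is invented for illustration: "chest pain" mirrors the fragment above, while "cough" is not from any real response.

```python
def split_by_negation(entities: list[dict]) -> tuple[list[str], list[str]]:
    """Return (affirmed, negated) entity texts based on the isNegated flag."""
    affirmed = [e["text"] for e in entities if not e.get("isNegated", False)]
    negated = [e["text"] for e in entities if e.get("isNegated", False)]
    return affirmed, negated


# Hypothetical entities for illustration only.
entities = [
    {"text": "chest pain", "type": "SYMPTOM_OR_SIGN", "isNegated": True},
    {"text": "cough", "type": "SYMPTOM_OR_SIGN", "isNegated": False},
]
affirmed, negated = split_by_negation(entities)
print("affirmed:", affirmed)
print("negated:", negated)
```

Entities that lack the flag are treated as affirmed here, which is an assumption of this sketch rather than documented service behavior.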

See also