Quickstart: Detect Personally Identifiable Information (PII)

Note

This quickstart only covers PII detection in documents. To learn more about detecting PII in conversations, see How to detect and redact PII in conversations.

Reference documentation | More samples | Package (NuGet) | Library source code

Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for .NET. In the following example, you create a C# application that can identify recognized sensitive information in text.

Tip

You can use Language Studio to try PII detection in documents without needing to write code.

Prerequisites

Setting up

Create environment variables

Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.

To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.

  • To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
  • To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint

Note

If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.

After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.

Create a new .NET Core application

Using the Visual Studio IDE, create a new .NET Core console app. This creates a "Hello World" project with a single C# source file: program.cs.

Install the client library by right-clicking on the solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens select Browse and search for Azure.AI.TextAnalytics. Select version 5.2.0, and then Install. You can also use the Package Manager Console.

Code example

Copy the following code into your program.cs file and run the code.

using Azure;
using System;
using Azure.AI.TextAnalytics;

namespace Example
{
    class Program
    {
        // This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
        static string languageKey = Environment.GetEnvironmentVariable("LANGUAGE_KEY");
        static string languageEndpoint = Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT");

        private static readonly AzureKeyCredential credentials = new AzureKeyCredential(languageKey);
        private static readonly Uri endpoint = new Uri(languageEndpoint);

        // Example method for detecting sensitive information (PII) from text 
        static void RecognizePIIExample(TextAnalyticsClient client)
        {
            string document = "Call our office at 312-555-1234, or send an email to support@contoso.com.";
        
            PiiEntityCollection entities = client.RecognizePiiEntities(document).Value;
        
            Console.WriteLine($"Redacted Text: {entities.RedactedText}");
            if (entities.Count > 0)
            {
                Console.WriteLine($"Recognized {entities.Count} PII entit{(entities.Count > 1 ? "ies" : "y")}:");
                foreach (PiiEntity entity in entities)
                {
                    Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
                }
            }
            else
            {
                Console.WriteLine("No entities were found.");
            }
        }

        static void Main(string[] args)
        {
            var client = new TextAnalyticsClient(endpoint, credentials);
            RecognizePIIExample(client);

            Console.Write("Press any key to exit.");
            Console.ReadKey();
        }

    }
}

Output

Redacted Text: Call our office at ************, or send an email to *******************.
Recognized 2 PII entities:
Text: 312-555-1234, Category: PhoneNumber, SubCategory: , Confidence score: 0.8
Text: support@contoso.com, Category: Email, SubCategory: , Confidence score: 0.8

Reference documentation | More samples | Package (Maven) | Library source code

Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Java. In the following example, you create a Java application that can identify recognized sensitive information in text.

Tip

You can use Language Studio to try PII detection in documents without needing to write code.

Prerequisites

Setting up

Create environment variables

Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.

To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.

  • To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
  • To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint

Note

If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.

After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.

Add the client library

Create a Maven project in your preferred IDE or development environment. Then add the following dependency to your project's pom.xml file. You can find the implementation syntax for other build tools online.

<dependencies>
     <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-ai-textanalytics</artifactId>
        <version>5.2.0</version>
    </dependency>
</dependencies>

Code example

Create a Java file named Example.java. Open the file and copy the below code. Then run the code.

import com.azure.core.credential.AzureKeyCredential;
import com.azure.ai.textanalytics.models.*;
import com.azure.ai.textanalytics.TextAnalyticsClientBuilder;
import com.azure.ai.textanalytics.TextAnalyticsClient;

public class Example {

    // This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
    private static String languageKey = System.getenv("LANGUAGE_KEY");
    private static String languageEndpoint = System.getenv("LANGUAGE_ENDPOINT");

    public static void main(String[] args) {
        TextAnalyticsClient client = authenticateClient(languageKey, languageEndpoint);
        recognizePiiEntitiesExample(client);
    }
    // Method to authenticate the client object with your key and endpoint
    static TextAnalyticsClient authenticateClient(String key, String endpoint) {
        return new TextAnalyticsClientBuilder()
                .credential(new AzureKeyCredential(key))
                .endpoint(endpoint)
                .buildClient();
    }

    // Example method for detecting sensitive information (PII) from text 
    static void recognizePiiEntitiesExample(TextAnalyticsClient client)
    {
        // The text that need be analyzed.
        String document = "My SSN is 859-98-0987";
        PiiEntityCollection piiEntityCollection = client.recognizePiiEntities(document);
        System.out.printf("Redacted Text: %s%n", piiEntityCollection.getRedactedText());
        piiEntityCollection.forEach(entity -> System.out.printf(
            "Recognized Personally Identifiable Information entity: %s, entity category: %s, entity subcategory: %s,"
                + " confidence score: %f.%n",
            entity.getText(), entity.getCategory(), entity.getSubcategory(), entity.getConfidenceScore()));
    }
}

Output

Redacted Text: My SSN is ***********
Recognized Personally Identifiable Information entity: 859-98-0987, entity category: USSocialSecurityNumber, entity subcategory: null, confidence score: 0.650000.

Reference documentation | More samples | Package (npm) | Library source code

Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Node.js. In the following example, you create a JavaScript application that can identify recognized sensitive information in text.

Prerequisites

Setting up

Create environment variables

Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.

To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.

  • To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
  • To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint

Note

If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.

After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp 

cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Install the client library

Install the npm package:

npm install @azure/ai-text-analytics

Code example

Open the file and copy the below code. Then run the code.

"use strict";

const { TextAnalyticsClient, AzureKeyCredential } = require("@azure/ai-text-analytics");

// This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
const key = process.env.LANGUAGE_KEY;
const endpoint = process.env.LANGUAGE_ENDPOINT;

//an example document for pii recognition
const documents = [ "The employee's phone number is (555) 555-5555." ];

async function main() {
    console.log(`PII recognition sample`);
  
    const client = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));
  
    const documents = ["My phone number is 555-555-5555"];
  
    const [result] = await client.analyze("PiiEntityRecognition", documents, "en");
  
    if (!result.error) {
      console.log(`Redacted text: "${result.redactedText}"`);
      console.log("Pii Entities: ");
      for (const entity of result.entities) {
        console.log(`\t- "${entity.text}" of type ${entity.category}`);
      }
    }
}

main().catch((err) => {
console.error("The sample encountered an error:", err);
});

Output

PII recognition sample
Redacted text: "My phone number is ************"
Pii Entities:
        - "555-555-5555" of type PhoneNumber

Reference documentation | More samples | Package (PyPi) | Library source code

Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Python. In the following example, you'll create a Python application that can identify recognized sensitive information in text.

Prerequisites

Setting up

Create environment variables

Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.

To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.

  • To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
  • To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint

Note

If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.

After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.

Install the client library

After installing Python, you can install the client library with:

pip install azure-ai-textanalytics==5.2.0

Code example

Create a new Python file and copy the below code. Then run the code.

# This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
language_key = os.environ.get('LANGUAGE_KEY')
language_endpoint = os.environ.get('LANGUAGE_ENDPOINT')

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Authenticate the client using your key and endpoint 
def authenticate_client():
    ta_credential = AzureKeyCredential(language_key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=language_endpoint, 
            credential=ta_credential)
    return text_analytics_client

client = authenticate_client()

# Example method for detecting sensitive information (PII) from text 
def pii_recognition_example(client):
    documents = [
        "The employee's SSN is 859-98-0987.",
        "The employee's phone number is 555-555-5555."
    ]
    response = client.recognize_pii_entities(documents, language="en")
    result = [doc for doc in response if not doc.is_error]
    for doc in result:
        print("Redacted Text: {}".format(doc.redacted_text))
        for entity in doc.entities:
            print("Entity: {}".format(entity.text))
            print("\tCategory: {}".format(entity.category))
            print("\tConfidence Score: {}".format(entity.confidence_score))
            print("\tOffset: {}".format(entity.offset))
            print("\tLength: {}".format(entity.length))
pii_recognition_example(client)

Output

Redacted Text: The ********'s SSN is ***********.
Entity: employee
        Category: PersonType
        Confidence Score: 0.97
        Offset: 4
        Length: 8
Entity: 859-98-0987
        Category: USSocialSecurityNumber
        Confidence Score: 0.65
        Offset: 22
        Length: 11
Redacted Text: The ********'s phone number is ************.
Entity: employee
        Category: PersonType
        Confidence Score: 0.96
        Offset: 4
        Length: 8
Entity: 555-555-5555
        Category: PhoneNumber
        Confidence Score: 0.8
        Offset: 31
        Length: 12

Reference documentation

Use this quickstart to send Personally Identifiable Information (PII) detection requests using the REST API. In the following example, you will use cURL to identify recognized sensitive information in text.

Prerequisites

Setting up

Create environment variables

Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.

To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.

  • To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
  • To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint

Note

If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.

After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.

Create a JSON file with the example request body

In a code editor, create a new file named test_pii_payload.json and copy the following JSON example. This example request will be sent to the API in the next step.

{
    "kind": "PiiEntityRecognition",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "language": "en",
                "text": "Call our office at 312-555-1234, or send an email to support@contoso.com"
            }
        ]
    }
}
'

Save test_pii_payload.json somewhere on your computer. For example, your desktop.

Send a personally identifying information (PII) detection API request

Use the following commands to send the API request using the program you're using. Copy the command into your terminal, and run it.

parameter Description
-X POST <endpoint> Specifies your endpoint for accessing the API.
-H Content-Type: application/json The content type for sending JSON data.
-H "Ocp-Apim-Subscription-Key:<key> Specifies the key for accessing the API.
-d <documents> The JSON containing the documents you want to send.

Replace C:\Users\<myaccount>\Desktop\test_pii_payload.json with the location of the example JSON request file you created in the previous step.

Command prompt

curl -X POST "%LANGUAGE_ENDPOINT%/language/:analyze-text?api-version=2022-05-01" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: %LANGUAGE_KEY%" ^
-d "@C:\Users\<myaccount>\Desktop\test_pii_payload.json"

PowerShell

curl.exe -X POST $env:LANGUAGE_ENDPOINT/language/:analyze-text?api-version=2022-05-01 `
-H "Content-Type: application/json" `
-H "Ocp-Apim-Subscription-Key: $env:LANGUAGE_KEY" `
-d "@C:\Users\<myaccount>\Desktop\test_pii_payload.json"

JSON response

{
	"kind": "PiiEntityRecognitionResults",
	"results": {
		"documents": [{
			"redactedText": "Call our office at ************, or send an email to *******************",
			"id": "1",
			"entities": [{
				"text": "312-555-1234",
				"category": "PhoneNumber",
				"offset": 19,
				"length": 12,
				"confidenceScore": 0.8
			}, {
				"text": "support@contoso.com",
				"category": "Email",
				"offset": 53,
				"length": 19,
				"confidenceScore": 0.8
			}],
			"warnings": []
		}],
		"errors": [],
		"modelVersion": "2021-01-15"
	}
}

Clean up resources

If you want to clean up and remove an Azure AI services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps