Tutorial: Configure Apache HBase policies in HDInsight with Enterprise Security Package

Learn how to configure Apache Ranger policies for Enterprise Security Package (ESP) Apache HBase clusters. ESP clusters are connected to a domain allowing users to authenticate with domain credentials. In this tutorial, you create two Ranger policies to restrict access to different column-families in an HBase table.

In this tutorial, you learn how to:

  • Create domain users.
  • Create Ranger policies.
  • Create tables in an HBase cluster.
  • Test Ranger policies.

Before you begin

Connect to Apache Ranger Admin UI

  1. From a browser, connect to the Ranger Admin user interface (UI) by using the URL https://<ClusterName>.azurehdinsight.cn/Ranger/. Remember to change <ClusterName> to the name of your HBase cluster.

    Note

    Ranger credentials aren't the same as Hadoop cluster credentials. To prevent browsers from using cached Hadoop credentials, use a new InPrivate browser window to connect to the Ranger Admin UI.

  2. Sign in by using your Microsoft Entra admin credentials. The Microsoft Entra admin credentials aren't the same as HDInsight cluster credentials or Linux HDInsight node Secure Shell (SSH) credentials.

Create domain users

To learn how to create the sales_user1 and marketing_user1 domain users, see Create an HDInsight cluster with Enterprise Security Package. In a production scenario, domain users come from your Active Directory tenant.

Create HBase tables and import sample data

You can use SSH to connect to HBase clusters and then use Apache HBase Shell to create HBase tables, insert data, and query data. For more information, see Use SSH with HDInsight.

Use the HBase shell

  1. From SSH, run the following HBase command:

    hbase shell
    
  2. Create an HBase table Customers with two column-families: Name and Contact.

    create 'Customers', 'Name', 'Contact'
    list
    
  3. Insert some data:

    put 'Customers','1001','Name:First','Alice'
    put 'Customers','1001','Name:Last','Johnson'
    put 'Customers','1001','Contact:Phone','333-333-3333'
    put 'Customers','1001','Contact:Address','313 133rd Place'
    put 'Customers','1001','Contact:City','Redmond'
    put 'Customers','1001','Contact:State','WA'
    put 'Customers','1001','Contact:ZipCode','98052'
    put 'Customers','1002','Name:First','Robert'
    put 'Customers','1002','Name:Last','Stevens'
    put 'Customers','1002','Contact:Phone','777-777-7777'
    put 'Customers','1002','Contact:Address','717 177th Ave'
    put 'Customers','1002','Contact:City','Bellevue'
    put 'Customers','1002','Contact:State','WA'
    put 'Customers','1002','Contact:ZipCode','98008'
    
  4. View the contents of the table:

    scan 'Customers'
    

    Screenshot that shows the HDInsight Hadoop HBase shell output.

Create Ranger policies

Create a Ranger policy for sales_user1 and marketing_user1.

  1. Open the Ranger Admin UI. Under HBase, select <ClusterName>_hbase.

    Screenshot that shows the HDInsight Apache Ranger Admin UI.

  2. The List of Policies screen shows all Ranger policies created for this cluster. One preconfigured policy might be listed. Select Add New Policy.

    Screenshot that shows the Apache Ranger HBase policies list.

  3. On the Create Policy screen, enter the following values:

    Setting Suggested value
    Policy Name sales_customers_name_contact
    HBase Table Customers
    HBase Column-family Name, Contact
    HBase Column *
    Select Group
    Select User sales_user1
    Permissions Read

    The following wildcards can be included in the topic name:

    • * indicates zero or more occurrences of characters.
    • ? indicates single character.

    Screenshot that shows the Apache Ranger policy Create sales.

    Note

    Wait a few moments for Ranger to sync with Microsoft Entra ID if a domain user isn't automatically populated for Select User.

  4. Select Add to save the policy.

  5. Select Add New Policy and then enter the following values:

    Setting Suggested value
    Policy Name marketing_customers_contact
    HBase Table Customers
    HBase Column-family Contact
    HBase Column *
    Select Group
    Select User marketing_user1
    Permissions Read

    Screenshot that shows the Apache Ranger policy Create marketing.

  6. Select Add to save the policy.

Test the Ranger policies

Based on the Ranger policies configured, sales_user1 can view all the data for the columns in both the Name and Contact column families. The marketing_user1 can view data only in the Contact column family.

Access data as sales_user1

  1. Open a new SSH connection to the cluster. Use the following command to sign in to the cluster:

    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.cn
    
  2. Use the kinit command to change to the context of the desired user:

    kinit sales_user1
    
  3. Open the HBase shell and scan the table Customers:

    hbase shell
    scan `Customers`
    
  4. Notice that the sales user can view all columns of the Customers table. The user can see the two columns in the Name column-family and the five columns in the Contact column-family.

    ROW                                COLUMN+CELL
     1001                              column=Contact:Address, timestamp=1548894873820, value=313 133rd Place
     1001                              column=Contact:City, timestamp=1548895061523, value=Redmond
     1001                              column=Contact:Phone, timestamp=1548894871759, value=333-333-3333
     1001                              column=Contact:State, timestamp=1548895061613, value=WA
     1001                              column=Contact:ZipCode, timestamp=1548895063111, value=98052
     1001                              column=Name:First, timestamp=1548894871561, value=Alice
     1001                              column=Name:Last, timestamp=1548894871707, value=Johnson
     1002                              column=Contact:Address, timestamp=1548894899174, value=717 177th Ave
     1002                              column=Contact:City, timestamp=1548895103129, value=Bellevue
     1002                              column=Contact:Phone, timestamp=1548894897524, value=777-777-7777
     1002                              column=Contact:State, timestamp=1548895103231, value=WA
     1002                              column=Contact:ZipCode, timestamp=1548895104804, value=98008
     1002                              column=Name:First, timestamp=1548894897419, value=Robert
     1002                              column=Name:Last, timestamp=1548894897487, value=Stevens
    2 row(s) in 0.1000 seconds
    

Access data as marketing_user1

  1. Open a new SSH connection to the cluster. Use the following command to sign in as marketing_user1:

    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.cn
    
  2. Use the kinit command to change to the context of our desired user:

    kinit marketing_user1
    
  3. Open the HBase shell and scan the table Customers:

    hbase shell
    scan `Customers`
    
  4. Notice that the marketing user can only view the five columns of the Contact column-family.

    ROW                                COLUMN+CELL
     1001                              column=Contact:Address, timestamp=1548894873820, value=313 133rd Place
     1001                              column=Contact:City, timestamp=1548895061523, value=Redmond
     1001                              column=Contact:Phone, timestamp=1548894871759, value=333-333-3333
     1001                              column=Contact:State, timestamp=1548895061613, value=WA
     1001                              column=Contact:ZipCode, timestamp=1548895063111, value=98052
     1002                              column=Contact:Address, timestamp=1548894899174, value=717 177th Ave
     1002                              column=Contact:City, timestamp=1548895103129, value=Bellevue
     1002                              column=Contact:Phone, timestamp=1548894897524, value=777-777-7777
     1002                              column=Contact:State, timestamp=1548895103231, value=WA
     1002                              column=Contact:ZipCode, timestamp=1548895104804, value=98008
    2 row(s) in 0.0730 seconds
    
  5. View the audit access events from the Ranger UI.

    Screenshot that shows the HDInsight Ranger UI Policy Audit.

Clean up resources

If you aren't going to continue to use this application, delete the HBase cluster that you created:

  1. Sign in to the Azure portal.
  2. In the Search box at the top, enter HDInsight.
  3. Under Services, select HDInsight clusters.
  4. In the list of HDInsight clusters that appears, select the ... next to the cluster that you created for this tutorial.
  5. Select Delete > Yes.

Next steps