Connect to Labelbox

Labelbox is a training data platform used to create training data from images, video, audio, text, and tiled imagery. Using Labelbox, AI teams can customize a workflow to operate, manage and improve data labeling, data cataloging, and model debugging in a single, unified platform. Labelbox is designed to help AI teams build and operate production-grade machine learning systems.

You can connect Azure Databricks clusters that run Databricks Runtime for Machine Learning to Labelbox.

Connect to Labelbox manually

The steps in this section describe how to connect Labelbox to an Azure Databricks cluster.

Requirements

You must have an available cluster running Databricks Runtime for Machine Learning. To check this for an existing cluster, look for ML in the Runtime column when you display the cluster in your workspace. If you do not have an available Databricks Runtime ML cluster, create a cluster and, for Databricks Runtime Version, choose a version from the ML list.
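
The same check can be scripted if you already have a cluster's runtime version key as a string. The following is a minimal sketch, not an official API: the helper name is_ml_runtime is hypothetical, and it assumes that ML runtime version keys contain an "ml" token between hyphens (for example, 13.3.x-cpu-ml-scala2.12), which is how such keys commonly appear in cluster definitions.

```python
def is_ml_runtime(spark_version: str) -> bool:
    """Return True if a runtime version key looks like a Databricks Runtime ML image.

    Assumption: ML runtime keys carry an "ml" token between hyphens,
    for example "13.3.x-cpu-ml-scala2.12". This helper is hypothetical
    and only inspects the string; it does not contact a workspace.
    """
    tokens = spark_version.lower().split("-")
    return "ml" in tokens


# Example version keys (illustrative values).
print(is_ml_runtime("13.3.x-cpu-ml-scala2.12"))  # True
print(is_ml_runtime("13.3.x-scala2.12"))         # False
```

The Runtime column in the workspace remains the authoritative place to confirm the runtime type.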

Steps to connect

To connect to Labelbox manually, do the following:

  1. Go to the Labelbox page to Sign Up for a new Labelbox account or to Log In to your existing Labelbox account.
  2. If you do not already have one, create a Labelbox API key for your Labelbox account. Copy the API key and save it in a secure location. The key will eventually be hidden from view, and you will need it later.
  3. Check for a Labelbox starter notebook in your workspace:
    1. In the sidebar, click Workspace > Shared.
    2. If a folder named labelbox_demo does not already exist, create it:
      i. Click the down arrow next to Shared.
      ii. Click Create > Folder.
      iii. Enter labelbox_demo.
      iv. Click Create Folder.
    3. Click the labelbox_demo folder. If a starter notebook named labelbox_databricks_example.ipynb does not exist in the folder, import it:
      i. Click the down arrow next to labelbox_demo.
      ii. Click Import.
      iii. Click URL.
      iv. Enter https://github.com/Labelbox/labelbox-python/blob/develop/examples/integrations/databricks/labelbox_databricks_example.ipynb and click Import.
  4. Continue to set up the ML cluster and Labelbox starter notebook.
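
Step 2 asks you to keep the API key in a secure location because you need it later. One way to avoid pasting the key directly into a notebook, sketched here under stated assumptions, is to read it from an environment variable. The variable name LABELBOX_API_KEY and the helper read_labelbox_api_key are hypothetical and are not defined by Labelbox or Azure Databricks; a secret scope is another option.

```python
import os


def read_labelbox_api_key(env_var: str = "LABELBOX_API_KEY") -> str:
    """Read the Labelbox API key from an environment variable.

    Assumption: the key was stored in the variable named by `env_var`
    (a hypothetical name chosen for this example) before the notebook runs.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(
            f"No Labelbox API key found. Set the {env_var} environment variable."
        )
    return key
```

With the labelbox package installed, the returned value could then be passed to the Labelbox SDK, for example labelbox.Client(api_key=read_labelbox_api_key()).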

Set up the ML cluster and Labelbox starter notebook

  1. Check that the required Labelbox libraries are installed in your ML cluster:
    1. In the sidebar, click Compute.
    2. Click your ML cluster. Use the Filter box to find it, if necessary.
    3. Click the Libraries tab.
    4. If the labelbox package is not listed, install it:
      i. Click Install New.
      ii. Click PyPI.
      iii. For Package, enter labelbox.
      iv. Click Install.
    5. If the labelspark package is not listed, install it:
      i. Click Install New.
      ii. Click PyPI.
      iii. For Package, enter labelspark.
      iv. Click Install.

  2. Attach your ML cluster to the starter notebook:
    1. In the sidebar, click Workspace > Shared > labelbox_demo > labelbox_databricks_example.ipynb.
    2. Attach your ML cluster to the notebook.
  3. Browse through the notebook to learn how to automate Labelbox.
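
After the cluster is attached, the library check from step 1 can also be repeated from a notebook cell. The following is a minimal sketch that uses only the Python standard library; the helper name missing_packages is hypothetical, and the sketch assumes that the labelbox and labelspark packages are imported under those same names.

```python
from importlib.util import find_spec

# Packages the starter notebook relies on, per the steps above.
REQUIRED_PACKAGES = ("labelbox", "labelspark")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Not installed on this cluster:", ", ".join(missing))
else:
    print("labelbox and labelspark are available.")
```

If the cell reports a missing package, install it from the cluster's Libraries tab as described in step 1.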

Additional resources