Test Databricks notebooks
This page briefly describes some techniques that are useful when testing code directly in Databricks notebooks. You can use these methods separately or together.
For a detailed walkthrough of how to set up and organize functions and unit tests in Databricks notebooks, see Unit testing for notebooks.
Many unit testing libraries work directly within the notebook. For example, you can use the built-in Python unittest package to test notebook code.
def reverse(s):
    return s[::-1]
import unittest
class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')
r = unittest.main(argv=[''], verbosity=2, exit=False)
assert r.result.wasSuccessful(), 'Test failed; see logs above'
Test failures appear in the output area of the cell.
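If you want to run only one test class rather than every test case defined in the notebook, one option is to build the suite explicitly. The following is a minimal sketch using the standard unittest loader and runner. The class name TestReverse and the extra empty-string test are illustrative and are not part of the example above.

```python
import unittest


def reverse(s):
    return s[::-1]


class TestReverse(unittest.TestCase):  # illustrative class name
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')

    def test_reverse_empty(self):
        self.assertEqual(reverse(''), '')


# Load only the tests defined in TestReverse.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestReverse)
result = unittest.TextTestRunner(verbosity=2).run(suite)

# Fail the cell if any test failed, so the failure is not overlooked.
assert result.wasSuccessful(), 'Test failed; see logs above'
```

Because the runner returns a result object, the final assertion can raise an error in the cell when a test fails, in the same way as the earlier example.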
Use Databricks widgets to select notebook mode
You can use widgets to distinguish test invocations from normal invocations in a single notebook. The following code adds a Mode dropdown and runs either the test path or the normal path, depending on the selected value:
dbutils.widgets.dropdown("Mode", "Test", ["Test", "Normal"])
def reverse(s):
    return s[::-1]
if dbutils.widgets.get('Mode') == 'Test':
    assert reverse('abc') == 'cba'
    print('Tests passed')
else:
    print(reverse('desrever'))
The first line generates the Mode dropdown menu.
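One way to keep the mode switch easy to check is to move the branching into a function that takes the mode as an argument. The sketch below assumes a hypothetical function named run. In a notebook, the argument would come from dbutils.widgets.get('Mode'); here it is passed directly so that the code also runs outside Databricks.

```python
def reverse(s):
    return s[::-1]


def run(mode):
    """Hypothetical helper: run the notebook logic for 'Test' or 'Normal' mode."""
    if mode == 'Test':
        assert reverse('abc') == 'cba'
        return 'Tests passed'
    return reverse('desrever')


# In a Databricks notebook, the mode would come from the widget:
#     print(run(dbutils.widgets.get('Mode')))
# The mode is passed directly here so the sketch does not depend on dbutils.
print(run('Test'))    # prints: Tests passed
print(run('Normal'))  # prints: reversed
```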
Hide test code and results
To hide test code and results, select Hide Code or Hide Result from the cell actions menu. Errors are displayed even if results are hidden.
Schedule tests to run automatically
To run tests periodically and automatically, you can use scheduled notebooks. You can configure the job to send notification emails to an email address that you specify.
Separate test code from the notebook
You can keep your test code separate from your notebook using either %run or Databricks Git folders. When you use %run, test code is included in a separate notebook that you call from another notebook. When you use Databricks Git folders, you can keep test code in non-notebook source code files.
This section shows some examples of using %run and Databricks Git folders to separate your test code from the notebook.
Use %run
You can use %run to run a notebook from another notebook. For more information about using %run, see Use %run to import a notebook.
The following example assumes that the notebooks shared-code-notebook and shared-code-notebook-test are in the same workspace folder.
shared-code-notebook:
def reverse(s):
    return s[::-1]
shared-code-notebook-test:
In one cell:
%run ./shared-code-notebook
In a subsequent cell:
import unittest
class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')
r = unittest.main(argv=[''], verbosity=2, exit=False)
assert r.result.wasSuccessful(), 'Test failed; see logs above'
Use Databricks Git folders
For code stored in a Databricks Git folder, you can call the test and run it directly from a notebook.
You can also use the web terminal to run tests in source code files just as you would on your local machine.
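As a rough sketch of running tests that live in source code files, the standard unittest loader can discover test files in a folder and run them from a notebook cell. The file names reverse_helpers.py and test_reverse_helpers.py and the function run_tests_in are hypothetical. A temporary directory stands in for the Git folder so that the sketch is self-contained; in a workspace you would pass the path to your own folder instead.

```python
import pathlib
import tempfile
import textwrap
import unittest


def run_tests_in(folder):
    """Hypothetical helper: discover and run every test_*.py file under folder."""
    suite = unittest.defaultTestLoader.discover(start_dir=folder, pattern='test_*.py')
    return unittest.TextTestRunner(verbosity=2).run(suite)


# A temporary directory stands in for a Git folder (an assumption of this sketch).
with tempfile.TemporaryDirectory() as folder:
    root = pathlib.Path(folder)

    # Source file holding the code under test.
    (root / 'reverse_helpers.py').write_text(
        'def reverse(s):\n    return s[::-1]\n'
    )

    # Separate source file holding the tests.
    (root / 'test_reverse_helpers.py').write_text(textwrap.dedent('''\
        import unittest
        from reverse_helpers import reverse

        class TestHelpers(unittest.TestCase):
            def test_reverse(self):
                self.assertEqual(reverse('abc'), 'cba')
    '''))

    result = run_tests_in(folder)

# Fail the cell if any discovered test failed.
assert result.wasSuccessful(), 'Test failed; see logs above'
```

Discovery adds the start folder to the import path, which is why the test file can import reverse_helpers directly in this sketch.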
Set up a CI/CD-style workflow
For notebooks in a Databricks Git folder, you can set up a CI/CD-style workflow by configuring notebook tests to run for each commit. See Databricks GitHub Actions.