LZO compressed file
Due to licensing restrictions, the LZO compression codec is not available by default on Azure Databricks clusters. To read an LZO compressed file, you must use an init script to install the codec on your cluster at launch time.
Notebook example: Init LZO compressed files
The following notebook:
- Builds the LZO codec.
- Creates an init script that:
  - Installs the LZO compression libraries and the lzop command, and copies the LZO codec to the proper class path.
  - Configures Spark to use the LZO compression codec.
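The notebook itself is not reproduced here, so the following is only a minimal sketch of what such an init script might contain. It renders the script text in Python. The JAR path, the `apt-get` package names (`lzop`, `liblzo2-dev`), the `/databricks/jars/` target directory, and the helper name `render_init_script` are assumptions for illustration, as is the choice of Spark properties, which follow the class names used by the open-source hadoop-lzo project.

```python
# Sketch only: the paths and package names below are assumptions, not values
# taken from the original notebook.

# Hypothetical location of the codec JAR produced by the build step.
CODEC_JAR = "/dbfs/databricks/lzo/hadoop-lzo.jar"

# Spark properties that register the LZO codecs (hadoop-lzo class names).
# How the init script applies them on the cluster is not shown here.
LZO_SPARK_CONF = {
    "spark.hadoop.io.compression.codecs": ",".join([
        "org.apache.hadoop.io.compress.DefaultCodec",
        "org.apache.hadoop.io.compress.GzipCodec",
        "org.apache.hadoop.io.compress.BZip2Codec",
        "com.hadoop.compression.lzo.LzoCodec",
        "com.hadoop.compression.lzo.LzopCodec",
    ]),
    "spark.hadoop.io.compression.codec.lzo.class":
        "com.hadoop.compression.lzo.LzoCodec",
}


def render_init_script(codec_jar: str = CODEC_JAR) -> str:
    """Return the text of a bash init script that installs LZO support."""
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Install the LZO compression libraries and the lzop command.",
        "apt-get update",
        "apt-get install -y lzop liblzo2-dev",
        "# Copy the LZO codec to the class path (assumed directory).",
        f"cp {codec_jar} /databricks/jars/",
    ]
    return "\n".join(lines) + "\n"
```

The rendered text would then be saved to a location the cluster can read at launch and registered as a cluster init script. The properties in `LZO_SPARK_CONF` could alternatively be supplied through the cluster's Spark configuration.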
Notebook example: Read LZO compressed files
The following notebook reads LZO compressed files using the codec installed by the init script:
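As a rough illustration of the read step, the sketch below uses PySpark's `newAPIHadoopFile` with the `LzoTextInputFormat` class from the hadoop-lzo project. It assumes the codec JAR is already on the class path and that the cluster was launched with the init script. The helper name `read_lzo_text` and the example path are hypothetical, and the original notebook may read the files differently.

```python
def read_lzo_text(spark, path):
    """Return an RDD of text lines read from LZO-compressed files at `path`.

    `spark` is an active SparkSession, such as the one predefined in a
    Databricks notebook.
    """
    pairs = spark.sparkContext.newAPIHadoopFile(
        path,
        "com.hadoop.mapreduce.LzoTextInputFormat",  # input format from hadoop-lzo
        "org.apache.hadoop.io.LongWritable",        # key: byte offset of the line
        "org.apache.hadoop.io.Text",                # value: the line contents
    )
    # Drop the byte-offset keys and keep only the line text.
    return pairs.map(lambda kv: kv[1])


# Example usage in a notebook (hypothetical path):
# lines = read_lzo_text(spark, "/mnt/data/example.lzo")
# print(lines.take(5))
```

If the codec is not installed, a call like this would be expected to fail with a class-not-found error for the input format, which is a quick way to check whether the init script ran.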