How to overwrite log4j configurations on Azure Databricks clusters

There is no standard way to overwrite log4j configurations on clusters with custom configurations. You must overwrite the configuration files using init scripts.

The current configurations are stored in two log4j.properties files:

  • On the driver:

    %sh
    cat /home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties
    
  • On the worker:

    %sh
    cat /home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties
    

To set class-specific logging on the driver or on workers, use the following script:

#!/bin/bash
echo "Executing on Driver: $DB_IS_DRIVER"
# Pick the driver or executor log4j.properties file, depending on which node
# type this init script is running on.
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties"
fi
echo "Adjusting log4j.properties here: ${LOG4J_PATH}"
# Append the custom property to the selected file.
echo "log4j.<custom-prop>=<value>" >> ${LOG4J_PATH}

Replace <custom-prop> with the property name, and <value> with the property value.
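As a concrete illustration (a hypothetical example; the class name and level below are placeholders chosen for this sketch, not values from this article), a class-specific entry in the log4j 1.x properties syntax looks like this:

# Hypothetical example: raise the logging level of one Spark class to DEBUG.
# Substitute your own fully qualified class name and level.
echo "log4j.logger.org.apache.spark.scheduler.DAGScheduler=DEBUG" >> ${LOG4J_PATH}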

Upload the script to DBFS and attach it to a cluster as an init script using the cluster configuration UI.
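For example, one way to upload the script is with the Databricks CLI (a sketch that assumes the CLI is installed and configured; the destination path is an assumption, so use whatever DBFS location you reference in the cluster's init script settings):

# Sketch: copy the init script to DBFS with the Databricks CLI.
# dbfs:/databricks/scripts/set-log4j.sh is an assumed destination path.
databricks fs cp set-log4j.sh dbfs:/databricks/scripts/set-log4j.sh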

You can also set log4j.properties for the driver in the same way.
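For instance, a minimal sketch of a driver-only variant (assuming the same DB_IS_DRIVER environment variable and driver-side path shown above) could exit early on worker nodes:

#!/bin/bash
# Sketch: adjust log4j.properties on the driver only; worker nodes exit immediately.
if [[ $DB_IS_DRIVER != "TRUE" ]]; then
  exit 0
fi
LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
echo "Adjusting log4j.properties here: ${LOG4J_PATH}"
echo "log4j.<custom-prop>=<value>" >> ${LOG4J_PATH}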

See Cluster Node Initialization Scripts for more information.