When Spark scripts are executed on the DAS server, several Spark log files are created in the <DAS_HOME>/work directory. If you run Spark scripts frequently, you might run into a disk space issue because a large number of log files accumulates.
To limit the disk space used by Spark log files to a predefined size, add the following configuration to <DAS_HOME>/repository/conf/analytics/spark/spark-defaults.conf:
spark.executor.logs.rolling.strategy size
spark.executor.logs.rolling.maxSize 10000000
spark.executor.logs.rolling.maxRetainedFiles 10
spark.executor.logs.rolling.maxSize sets the maximum size (in bytes) a single executor log file can reach before it is rolled over to a new file.
spark.executor.logs.rolling.maxRetainedFiles sets the maximum number of rolled log files kept in the log directory at any given time; older log files are deleted as new ones are generated.
spark.executor.logs.rolling.strategy selects how rolling is triggered. With the configuration above, log files are rolled based on size, so the log directory of each executor is capped at roughly 100 MB (10,000,000 bytes x 10 retained files).
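If you prefer to roll logs based on time rather than size, Spark also supports a time-based strategy. The following is a minimal sketch using the standard executor log rolling properties; the hourly interval is only an illustration, and valid values include daily, hourly, minutely, or an interval in seconds.
spark.executor.logs.rolling.strategy time
spark.executor.logs.rolling.time.interval hourly
spark.executor.logs.rolling.maxRetainedFiles 10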
For other related configuration options, see the Spark documentation.
If you want to change the default log directory location, change the spark.worker.dir configuration in <DAS_HOME>/repository/conf/analytics/spark/spark-defaults.conf as shown below.
spark.worker.dir /home/ubuntu/sparkout
Make sure to restart the server for the above configuration to take effect.