How to redirect entire output of spark-submit to a file
spark-submit prints most of its output to STDERR.
To redirect the entire output to one file, you can use:
spark-submit something.py > results.txt 2>&1
Or
spark-submit something.py &> results.txt
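If you want your script's results separate from Spark's own logging, you can also split the two streams; a minimal sketch, where results.txt and spark.log are just example file names:

# Spark's logging goes to STDERR, the script's print output to STDOUT
spark-submit something.py > results.txt 2> spark.log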
Output results of spark-submit
You can also locate your spark-defaults.conf (or copy spark-defaults.conf.template to spark-defaults.conf).
Create a logging directory (e.g. /tmp/spark-events/).
Add these two lines to it:
spark.eventLog.enabled true
spark.eventLog.dir file:///tmp/spark-events/
Then run sbin/start-history-server.sh.
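Put together, the setup looks roughly like this; a sketch assuming the commands are run from the Spark installation directory, so the conf/ and sbin/ paths resolve correctly:

# copy the template if spark-defaults.conf does not exist yet
cp conf/spark-defaults.conf.template conf/spark-defaults.conf

# create the event log directory
mkdir -p /tmp/spark-events

# enable event logging in spark-defaults.conf
echo "spark.eventLog.enabled true" >> conf/spark-defaults.conf
echo "spark.eventLog.dir file:///tmp/spark-events/" >> conf/spark-defaults.conf

# start the History Server (Web UI on http://localhost:18080/)
sbin/start-history-server.sh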
With this in place, every job launched via spark-submit logs to the event directory, and its overview remains available in the History Server Web UI (http://localhost:18080/) even after the Spark job has finished.
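The same settings can also be passed per job instead of editing spark-defaults.conf; a sketch, assuming the /tmp/spark-events/ directory already exists:

spark-submit \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=file:///tmp/spark-events/ \
  something.py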
More info: https://spark.apache.org/docs/latest/monitoring.html
PS: On macOS via Homebrew, all of this lives under /usr/local/Cellar/apache-spark/[version]/libexec/.
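If you are not sure where Homebrew put that directory, brew can resolve it for you; a sketch, assuming the apache-spark formula is installed:

# jump into the Spark installation directory managed by Homebrew
cd "$(brew --prefix apache-spark)/libexec"
sbin/start-history-server.sh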