A Networker's Log File: July 2016

Thursday, July 7, 2016

Unstuck Spark/Zeppelin Jobs on Amazon EMR

Apache Zeppelin + Apache Spark is a perfect match. Basically, you can do the following in one console:

Data Ingestion
Data Discovery
Data Analytics
Data Visualization & Collaboration

As Zeppelin is still under incubation, its error handling is not yet rock solid. I have often seen Spark jobs get stuck for a long time. Restarting the Spark interpreter usually does the trick. Sometimes, however, that simple fix does not work, and the only remedy is to restart the Zeppelin daemon. On the Amazon EMR console, run the following:

/usr/lib/zeppelin/bin/zeppelin-daemon.sh stop
/usr/lib/zeppelin/bin/zeppelin-daemon.sh start

If you wish to run the scripts under the zeppelin account, which has a nologin shell, run the following instead:

sudo su -s /bin/bash -c '/usr/lib/zeppelin/bin/zeppelin-daemon.sh stop' zeppelin
sudo su -s /bin/bash -c '/usr/lib/zeppelin/bin/zeppelin-daemon.sh start' zeppelin
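
The stop/start pair above can be wrapped in a small helper so that a restart is a single call. This is only a sketch: restart_daemon is a hypothetical name, not something shipped with Zeppelin, and it assumes the daemon script accepts the stop and start arguments used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart a daemon by calling its control script
# with "stop" and then "start".
# Usage: restart_daemon <path-to-daemon-script>
restart_daemon() {
  local daemon="$1"
  # Stop first; ignore a failure here in case the daemon is already down.
  "$daemon" stop || true
  # Start again; the function returns the exit status of the start command.
  "$daemon" start
}
```

On the EMR node it would be called as restart_daemon /usr/lib/zeppelin/bin/zeppelin-daemon.sh, from a shell that has permission to run the script.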

If you encounter the Java connection error "java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method)", it's probably because Zeppelin starts the Spark interpreter in a different process.

Edit /etc/spark/conf/spark-defaults.conf
Comment out the following line and restart Zeppelin:

#spark.driver.memory              5g
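
If you would rather script the edit than open an editor, something like the following might do. It is a minimal sketch: comment_out_property is a hypothetical helper name, and it assumes the file uses the whitespace-separated "key value" layout shown above (a key=value line without spaces would not be matched). It keeps a backup copy next to the original with a .bak suffix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out one property in a Spark-style
# "key value" properties file, keeping a .bak copy of the original.
# Usage: comment_out_property <key> <file>
comment_out_property() {
  local key="$1" file="$2"
  # Back up the file, then rewrite it from the backup.
  cp -p "$file" "$file.bak" || return 1
  # Prefix '#' to lines whose first whitespace-separated field is exactly
  # the key; lines that are already commented out are left unchanged.
  awk -v key="$key" '$1 == key { print "#" $0; next } { print }' \
    "$file.bak" > "$file"
}
```

From a root shell on the node, the call would be comment_out_property spark.driver.memory /etc/spark/conf/spark-defaults.conf, followed by the Zeppelin restart.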

Reference: http://stackoverflow.com/questions/32735645/hello-world-in-zeppelin-failed