
Connect To Remote Python Kernel From Python Code

I have been using Papermill to execute my Python notebooks periodically. To execute a compute-intensive notebook, I need to connect to a remote kernel running in my EMR cluster. In c

Solution 1:

Hacky approach - set up a shell script to do the following:

  1. Create a Python environment on your EMR master node using the hadoop user
  2. Install sparkmagic in that environment and configure all kernels as described in sparkmagic's README.md
  3. Copy your notebook to the master node, or use it directly from its S3 location
  4. Run it with papermill:

    papermill s3://path/to/notebook/input.ipynb s3://path/to/notebook/output.ipynb -p param=1

Steps 1 and 2 are one-time requirements if your cluster's master node is the same every time.
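
Since the question is about doing this from Python code, the same run can also be driven through papermill's Python API instead of the CLI. A minimal sketch, assuming steps 1 and 2 are already done on the master node; the kernel name is a placeholder for whichever sparkmagic kernel you configured:

    import papermill as pm

    # Execute the notebook from Python instead of the papermill CLI.
    # The S3 paths mirror the command above.
    pm.execute_notebook(
        "s3://path/to/notebook/input.ipynb",
        "s3://path/to/notebook/output.ipynb",
        parameters={"param": 1},          # injected into the notebook's parameters cell
        kernel_name="pysparkkernel",      # assumption: name of the sparkmagic PySpark kernel
    )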

A slightly better approach:

  1. Set up a remote kernel in your Jupyter installation itself: REMOTE KERNEL
  2. Execute with papermill as a normal notebook, selecting this remote kernel
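
For the second approach, a hedged sketch of how the papermill call might look from Python: list the kernel specs registered with your local Jupyter to find the remote kernel's name, then pass it as kernel_name. The spec name "remote_emr_pyspark" is a placeholder for whatever name the remote kernel was registered under.

    import papermill as pm
    from jupyter_client.kernelspec import KernelSpecManager

    # Show the kernel specs Jupyter knows about; the remote kernel registered
    # in step 1 should appear in this mapping of name -> spec directory.
    print(KernelSpecManager().find_kernel_specs())

    # Run the notebook on the remote kernel by name.
    pm.execute_notebook(
        "input.ipynb",
        "output.ipynb",
        parameters={"param": 1},
        kernel_name="remote_emr_pyspark",  # placeholder: actual registered kernel name
    )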

I am using both approaches for different use cases and they seem to work fine for now.
