Do you start an AWS cluster every time you want to open your Zeppelin notebook?

Gauravkumar
2 min read · Aug 4, 2021

Let me start by saying that this is a very common practice in teams that use AWS clusters and Zeppelin notebooks. In my personal experience, whenever you want to look at your script, you first have to start an EMR cluster, and only then are you able to view it.

How does it affect your company?

An EMR cluster on AWS costs the company money for every hour it runs, so starting one just to read a script is an unnecessary expense that a few simple techniques can avoid.

There are many ways to do this, but I will discuss two of the easiest ways to open your Zeppelin notebook, which is a .json file, without having to start the EMR cluster every time.

First approach: using a Jupyter notebook. Follow the steps and it is done in seconds. First, install the zeppelin2nb module in your Anaconda environment, then import it. That's all. I am attaching a screenshot; follow it and the task becomes easy.

Jupyter notebook to open your code written in Zeppelin notebook
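If you only want to read the code, the note file can also be inspected directly with plain Python. The sketch below is not the zeppelin2nb API; it is a hand-written example that assumes the usual Zeppelin note layout, where the .json file has a top-level "paragraphs" list and each paragraph keeps its source under "text". The function names and the file name note.json are hypothetical.

```python
import json
from pathlib import Path


def read_zeppelin_paragraphs(note_path):
    """Return the source text of each non-empty paragraph in a Zeppelin note."""
    note = json.loads(Path(note_path).read_text(encoding="utf-8"))
    # Assumption: cells live in a top-level "paragraphs" list and each
    # paragraph stores its source code under the "text" key.
    paragraphs = note.get("paragraphs", [])
    # Paragraphs without any text (empty cells) are skipped.
    return [p["text"] for p in paragraphs if p.get("text")]


def print_note(note_path):
    """Print every paragraph of the note under a small numbered header."""
    for index, text in enumerate(read_zeppelin_paragraphs(note_path), start=1):
        print(f"--- paragraph {index} ---")
        print(text)
```

Calling print_note("note.json") in a Jupyter cell or a plain Python shell lists the code of each paragraph in order. Like the approaches in this post, it only shows the script; it does not run it.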

Second approach: using Docker. Follow the steps and it is done within a minute.

  1. Install Docker Desktop: https://hub.docker.com/editions/community/docker-ce-desktop-windows
  2. Install the Linux kernel (the manual installation guide covers the kernel download in step 4).
  3. Restart Docker Desktop.
  4. Check the Docker version by running this command in cmd: docker version
  5. Download and install Apache Zeppelin using Docker by running this command: docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.9.0
  6. Wait until you see the statement in the command output.
  7. Open http://localhost:8080/ in your browser.
  8. You will see Zeppelin:

Zeppelin Notebook
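Instead of watching the console for the statement, you can also let a small script tell you when the page is ready. This is only a sketch under assumptions: it assumes the container from the docker run step publishes Zeppelin on http://localhost:8080/, and the function names, the number of attempts, and the delay are hypothetical choices, not anything Zeppelin or Docker provides.

```python
import time
import urllib.request


def is_reachable(url):
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError is a subclass of OSError, so this covers
        # refused connections, timeouts, and HTTP error responses.
        return False


def wait_for_zeppelin(url="http://localhost:8080/", attempts=30, delay=2.0,
                      probe=is_reachable):
    """Poll url until probe(url) succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Running wait_for_zeppelin() right after the docker run command returns True once the page answers, and False if nothing responds after all attempts. The probe parameter exists only so the check can be swapped out, for example in a test.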

Keep in mind that these approaches only let you open your scripts and view or show them without starting an EMR cluster. They will not run a script on a dataset; there are many other ways to do that.
