Jupyter Notebook - Data Science from a Practical Perspective

Jupyter Notebook is an Integrated Development Environment (IDE, basically an application that supports writing code) for multiple languages, including Python. It is seeing growing use in the data science community thanks to its interactivity, flexibility, and presentation capabilities. In this post, I will introduce you to the basic usage of Jupyter. We will look at more advanced features as we move further into data analytics.

Before we start, please make sure you have set up your Python environment. If not, you can follow my installation guide. If you are ready, let us first verify your Jupyter installation. From a CMD/terminal window, run the command jupyter notebook. If you see output similar to mine below

and also get redirected to localhost:8888/tree in your Internet browser which looks like below

then congratulations! You have Jupyter up and running. Remember, you must start Jupyter with the console command jupyter notebook. The page served at your localhost is where you write code for your analysis. Next, let us look at the Jupyter user interface.
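If the jupyter notebook command is not recognized, it can help to check from Python whether the notebook package is visible to your interpreter. The snippet below is a minimal sketch, not part of the installation guide: the helper name is_installed is my own, and it assumes you run it with the same Python environment in which you installed Jupyter.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the named top-level module can be found by this Python."""
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running, then check for the Jupyter Notebook package.
print("Python executable:", sys.executable)
print("notebook package found:", is_installed("notebook"))
```

If the second line prints False, Jupyter is probably installed in a different environment than the one you are running, or not installed at all.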

Basic Jupyter Notebook User Interface

Jupyter home

When you first start Jupyter, you are redirected to localhost:8888/tree, which lists the contents of your home folder (if you start Jupyter notebook without changing directory in the console window). You can click on a folder to move into it. I will change to the Python folder that I created earlier in my home directory. Notice that the website address and the path shown in Jupyter change when you move to a different folder. Then, to create a new notebook, a file that contains the code and outputs of your analysis, click on New (top right corner) and select Python (your version may vary from mine).

Notebook interface

Jupyter will then move you to the new notebook, where you will compose and run code. The user interface of a new notebook in Jupyter looks like below

There are a lot of things we can do here. First, you can click on the Untitled text next to the Jupyter icon to change the name of the notebook. This will change the file’s name in the folder too.

Cell

Next, focus on the text box with In [ ] on its left side. This is called a cell, and it is where you write your code.

Click on the input area to select the cell and switch it to editing mode. You can start typing code now. Let us start with a simple statement: print('hello my analyst buddy!'). Then click on the Run button. You will see the following happen:
1. The output of the statement “hello my analyst buddy!” shows up right below the cell. In general, all outputs of any cells appear directly below them.
2. The number 1 appears in In [1]. This number is the order in which the cell was executed in the current session. The cell that you just ran displays 1 because it is the first cell executed in your session. If you select the cell and run it again, you will see the number change to 2, 3, etc. By the way, yes, you can rerun a cell multiple times. This feature will be useful in a lot of use cases later on.
3. Jupyter will create a new cell and move the highlight there. It is very convenient!
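To get a feel for rerunning cells, here is a small sketch you could try. It assumes you split the code into two cells as marked in the comments; the variable name run_count is just an illustration. Because all cells in a notebook share one Python session, a variable created in one cell is still there when you run another cell, or the same cell again.

```python
# Cell A: run this once to create the counter.
run_count = 0

# Cell B: put these lines in a second cell and run it repeatedly.
# The counter keeps its value between runs because the session is shared.
run_count += 1
print(f"hello my analyst buddy! (run number {run_count})")
```

Each time you rerun Cell B, the run number in the output goes up by one. Note that the number in In [ ] counts executions of every cell in the session, so it will not necessarily match run_count.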

Saving your notebook

Finally, you can save your notebook with the save button below the File menu. All code and outputs will be kept.

Note that while it appears as a website, Jupyter notebook runs locally on your computer. Any folders and files you create from Jupyter are stored on your system. For example, I can find the hello jupyter notebook in the Python folder in my home directory (because I created the Python folder there).
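If you want to see this for yourself, the sketch below lists the notebooks saved in a folder and counts the cells in each one. A saved notebook is a .ipynb file containing JSON, with its cells stored under a "cells" key. The function name summarize_notebooks is hypothetical, and the folder path assumes a Python folder in your home directory like mine; adjust it to wherever you created your notebook.

```python
import json
from pathlib import Path


def summarize_notebooks(folder):
    """Return {notebook file name: number of cells} for .ipynb files in folder."""
    folder = Path(folder)
    summary = {}
    if not folder.is_dir():
        return summary  # nothing to report if the folder does not exist
    for path in sorted(folder.glob("*.ipynb")):
        # A saved notebook is a JSON document; its cells live under "cells".
        with path.open(encoding="utf-8") as f:
            notebook = json.load(f)
        summary[path.name] = len(notebook.get("cells", []))
    return summary


# Assumption: a folder named "Python" in the home directory, as in this post.
print(summarize_notebooks(Path.home() / "Python"))
```

You can also open a .ipynb file in a plain text editor to see that your code and its outputs are both stored there.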

Bye for now

Jupyter notebook has much more to offer than what we have just discussed. We will eventually discover the advanced features as we move deeper into data analytics. For now, this is enough for you to start working with Python, which is what we will do next. So, I will see you in the next post!