So please be a bit patient, because before diving into data analytics, we still have something to do. As you may have guessed, that is getting used to the Python programming language first. As a data analyst, you do have to write code. Therefore, understanding concepts like Python variables, conditions, loops, and functions is essential. Over the next few posts, I will explain each of them, and then we can finally start working with data! First, let's talk about math operations in Python. You can get the complete notebook for this post here.
Mathematical operations
Among the simplest kinds of code you can write in Python are mathematical operations. Just type any math expression into a Jupyter cell and run it, and the result appears immediately. As you can see in the examples below, the basic operators include + for addition, - for subtraction, * for multiplication, / for division, and ** for raising to a power.
10 + 5
15
15 - 8
7
8 * 9
72
90 / 10
9.0
4**3
64
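Notice that 90 / 10 shows 9.0 rather than 9. In Python 3, the / operator always returns a decimal number (a float), even when the division is exact. The short sketch below checks this with the built-in type() function, which this post does not otherwise cover; the variable names are only for illustration.

```python
# Division with / always gives a float in Python 3, even for exact results.
quotient = 90 / 10
print(quotient)        # 9.0
print(type(quotient))  # <class 'float'>

# Raising a whole number to a whole-number power keeps the result whole.
power = 4 ** 3
print(power)           # 64
print(type(power))     # <class 'int'>
```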
You can combine operations into a longer expression and use parentheses just like in normal math. By the way, the spaces between numbers and operators are not required. However, you should organize your code so that it looks clear and is easy to read.
10 * (5+7) / 6
20.0
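To see why the parentheses matter, here is a small sketch that evaluates similar expressions with and without them. As in normal math, Python applies ** first, then * and /, then + and -, unless parentheses say otherwise. The variable names are made up for this illustration.

```python
# With parentheses, the addition 5 + 7 happens first.
with_parens = 10 * (5 + 7) / 6
print(with_parens)               # 20.0

# Without parentheses, 10 * 5 and 7 / 6 are computed first, then added.
without_parens = 10 * 5 + 7 / 6
print(round(without_parens, 2))  # 51.17

# Powers are evaluated before multiplication.
print(2 * 3 ** 2)                # 18
print((2 * 3) ** 2)              # 36
```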
Python variables
Python variables are symbolic names that can be assigned values and reused later. Values are assigned with the = operator. The variable goes on the left side of =, and the value to assign goes on the right side. After assignment, the variable carries the value, and Python uses the assigned value whenever it encounters the variable. The example below does the following:
1. Assign the value 1 to the variable named x
2. Assign the value 2 to the variable named y
3. Calculate the sum x + y, which is equivalent to 1 + 2, so the result 3 shows below the cell.
x = 1
y = 2
x + y
3
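One point that often surprises newcomers is what happens when a variable is assigned a second time. The sketch below, which adds a hypothetical variable named total to the example above, suggests that a variable simply holds its most recent value, and that a stored result does not update itself afterwards.

```python
x = 1
y = 2
total = x + y   # Python substitutes the current values: 1 + 2
print(total)    # 3

x = 10          # assigning again replaces the old value of x
print(x + y)    # 12
print(total)    # 3, because total stored the result, not the formula
```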
You can name variables pretty much anything, as long as 1) the name starts with a letter or an underscore “_”; and 2) the name contains only letters, numbers, and _. A good practice is to make variable names as descriptive as possible. So, instead of things like a, b, or x, y, give your variables names that reflect what they are. For example, you can have numeric variables such as num1, num2, text variables such as string1, string2, or lists such as list_of_nums, list_of_strings, etc. Now, you should play around with variable assignments and calculations a bit to get used to the concepts. For example, try changing the variable names, the number values, and the math operator in the code below and rerun it. You can also create new variables to add to the expression. Remember to change the variable names in both the assignments and the operations to avoid unexpected errors.
num1 = 15
num2 = 20
num1 * num2
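If you are unsure whether a name follows the naming rules described earlier, one way to check is the string method isidentifier(), which is not covered elsewhere in this post. The sketch below is only a rough test: it uses a loop and a dictionary, which later posts explain, and the candidate names are made up. Note also that reserved words such as for pass this check even though they cannot be used as variable names.

```python
# Candidate names: the first three follow the rules, the last three do not.
candidate_names = ["num1", "_hidden", "list_of_nums", "1st_num", "my-var", "my var"]

results = {}
for name in candidate_names:
    # isidentifier() returns True when the text is shaped like a valid name.
    results[name] = name.isidentifier()
    print(name, results[name])
```

The first three names print True. The last three print False because they start with a digit, contain a hyphen, or contain a space.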
Python variables in data analytics
We use variables to store almost everything we need during an analytical process, for example, a dataset, a model, predicted labels, etc. While we will not write code for any analysis at this moment, I can still give you a small example of how variables are used in data analytics. The code below performs a regression task on a dataset called “diabetes” from the scikit-learn library and uses the following variables:
– data stores the diabetes dataset from the scikit-learn library
– features is assigned the feature part of the diabetes data
– targets gets the target part of the diabetes data
– linear_model represents a regression model
– predicted_targets stores the predictions made by linear_model
By the way, any line starting with the # sign is called a comment. Comments are parts of your code that Python ignores when running. They are useful for inline notes or explanations of the code. Writing good comments is also essential, especially if you work in a team.
# load necessary libraries
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
# load data
data = load_diabetes()
# extract features and targets
features = data['data']
targets = data['target']
# create model
linear_model = LinearRegression()
# train model
linear_model.fit(features, targets)
# make predictions
predicted_targets = linear_model.predict(features)
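To see how comments behave on their own, apart from the regression example, here is a minimal sketch. It assumes nothing beyond plain Python, and the variable name total is made up for illustration. A comment can take a whole line or follow code on the same line, and a line of code turned into a comment never runs.

```python
# A full-line comment: Python skips this line entirely.
total = 10 + 5  # an inline comment: everything after the # is ignored

# The next line is "commented out", so it does not change total.
# total = total * 100

print(total)    # 15
```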
Conclusion
This post gives you a first, brief look at Python math expressions and variables. These two components are the most basic yet among the most important in any programming language. While math expressions are straightforward, the concept of Python variables may take some time to get used to if you are new to programming. So, please take your time and practice with variables. Next, we will move on to if-else in Python, and I will see you there!