Data Science from a Practical Perspective

Preprocessing

Data Pipeline

Posted by tienlinhle32, March 20, 2023. Posted in Data Science, Preprocessing. 2 Comments
The pipeline we learned about previously is very useful, but it only performs a fixed sequence of transformations on the input. However, more often than not, we want to apply…

Processing Pipeline

Posted by tienlinhle32, March 18, 2023. Posted in Data Science, Preprocessing. 1 Comment
At this point, we have gone through quite a few preprocessing methods for different data issues: handling outliers, scaling, imputation, encoding, and so on. So, I think it is now…

Train-test Split

Posted by tienlinhle32, March 15, 2023. Posted in Data Science, Preprocessing. 1 Comment
Predictive analysis is a major branch of data analytics in which we apply knowledge learned from historical data to new data. However, there is one potential issue in this…

Encode Categorical Data

Posted by tienlinhle32, March 14, 2023. Posted in Data Science, Preprocessing. 1 Comment
Categories are a big part of tabular data. You will see them more often than not; they are simply unavoidable. However, many analytical models cannot handle categorical…

Scale Numeric Data

Posted by tienlinhle32, March 10, 2023. Posted in Data Science, Preprocessing. 1 Comment
Very often, we have numeric columns with very different scales in the same data set. For example, a data set may have people's income in the range…

Handle Missing Data

Posted by tienlinhle32, March 6, 2023. Posted in Data Science, Preprocessing. 2 Comments
Missing values are prevalent in analytics. They are fields in your data without a valid value, and they must be addressed. Otherwise, most analytical models would omit data that has…

Handling Outliers

Posted by tienlinhle32, March 1, 2023. Posted in Data Science, Preprocessing. 2 Comments
Now that we have a good idea of what can be done during an exploratory analysis, it is time to move on to data preprocessing! So,…

Handle Skewed Data

Posted by tienlinhle32, February 25, 2023. Posted in Data Science, Preprocessing
A while ago, we discussed the distributions of numeric columns. Depending on the type of analysis, a symmetrical distribution is sometimes preferred over a skewed one. So, in this post, I…

Copyright 2025, Data Science from a Practical Perspective. All rights reserved.